Design & Implementation of a PDF to Excel Conversion Tool (P2X)

Penny, Latoyia Devonne
The conversion of portable document structures into an editable format is formally described. Converting paper-based documents to electronic form is a necessity in both the public and private sectors, but the resulting electronic form may not be editable, and several applications need documents in editable or plain-text form. In this thesis we address this problem with the design and implementation of a conversion tool, P2X. The tool was developed to automatically convert batches of PDF tabular data to an editable spreadsheet format using a novel approach. We show that significant improvements in the quality of data conversion can be achieved at negligible cost and with minimal complexity.