Extract Tables from PDF on Mac

Overview

This guide shows how to extract table content from a PDF document and save it to a JSON file.

Standard table and non-standard table

Tables generally fall into two categories, standard and non-standard, defined as follows:

Standard table: The outer border and inner lines of the table are complete and clear, so no table lines need to be added manually to divide the content.

Non-standard table: The border or inner lines are missing or unclear, so table lines must be added manually to separate the content.

Notice

Non-standard tables in the original PDF document cannot be extracted unless the OCR option is enabled.
Enabling the OCR or AI layout analysis option is recommended: it improves table extraction accuracy and adds support for recognizing non-standard tables.

Sample

The following complete sample illustrates table extraction:

```objective-c
// Path of the source PDF file.
NSString *pdfPath = @"...";
// Path of the JSON file to write.
NSString *outputPath = @"...";

// Create the table extraction options with their default values.
CPDFConvertJsonTableOptions *options = [[CPDFConvertJsonTableOptions alloc] init];

// Open the PDF (password is nil for an unencrypted document) and run the conversion.
CPDFConverterJsonTable *converter = [[CPDFConverterJsonTable alloc] initWithURL:[NSURL fileURLWithPath:pdfPath] password:nil];
[converter convertToFilePath:outputPath pageIndexs:nil options:options];
```
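After the conversion call returns, it can be useful to confirm that the output is readable before processing it further. The following is a minimal sketch, assuming `outputPath` points to a single JSON file as in the sample above. `LoadExtractedTables` is a hypothetical helper built only on Foundation; it is not a ComPDFKit API, and it makes no assumption about the structure of the extracted JSON.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper, not part of ComPDFKit.
// Reads the file at outputPath and parses it as JSON.
// Returns the top-level JSON object (an NSDictionary or NSArray),
// or nil if the file cannot be read or is not valid JSON.
static id LoadExtractedTables(NSString *outputPath, NSError **error) {
    NSData *data = [NSData dataWithContentsOfFile:outputPath
                                          options:0
                                            error:error];
    if (data == nil) {
        return nil;
    }
    return [NSJSONSerialization JSONObjectWithData:data
                                           options:0
                                             error:error];
}

int main(int argc, const char *argv[]) {
    @autoreleasepool {
        if (argc < 2) {
            NSLog(@"Usage: %s <path-to-output.json>", argv[0]);
            return 1;
        }
        NSString *outputPath = [NSString stringWithUTF8String:argv[1]];

        NSError *error = nil;
        id tables = LoadExtractedTables(outputPath, &error);
        if (tables == nil) {
            NSLog(@"Could not read extracted tables: %@", error);
            return 1;
        }
        // Only the top-level type is reported; the schema is not assumed here.
        NSLog(@"Top-level JSON type: %@", [tables class]);
    }
    return 0;
}
```

The check is kept separate from the conversion itself so that a missing or malformed output file shows up as an explicit error rather than as a failure later in the pipeline.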
