Extract Tables from PDFs
Overview
This guide describes how to extract table content from a PDF document.
Standard table and non-standard table
Tables generally fall into two categories, standard and non-standard, defined as follows:
- Standard table: The table border and inner lines are complete and clear, so no table lines need to be added manually to divide the table content.
- Non-standard table: The table border or inner lines are missing or unclear, so table lines must be added manually to separate the table content.
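The distinction above can be sketched as a small decision rule. The snippet below is only an illustration of the definitions: `TableLines`, `TableCategory`, and `classify` are hypothetical names introduced here and are not part of the SDK.

```kotlin
// Hypothetical model, not part of the SDK: records how completely a table's lines are drawn.
data class TableLines(
    val borderComplete: Boolean,     // outer border is fully drawn and clear
    val innerLinesComplete: Boolean  // row and column separators are fully drawn and clear
)

enum class TableCategory { STANDARD, NON_STANDARD }

// A table counts as standard only when both the border and the inner lines are complete.
fun classify(lines: TableLines): TableCategory =
    if (lines.borderComplete && lines.innerLinesComplete) TableCategory.STANDARD
    else TableCategory.NON_STANDARD

fun main() {
    // Fully ruled table
    println(classify(TableLines(borderComplete = true, innerLinesComplete = true)))   // prints STANDARD
    // Border present, but inner separators missing
    println(classify(TableLines(borderComplete = true, innerLinesComplete = false)))  // prints NON_STANDARD
}
```

Under this reading, a table with any missing or unclear line falls into the non-standard category and needs manually added lines before its content can be separated.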
Sample
The following sample extracts table content from a PDF document and converts it to JSON.
kotlin
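// Create a table-to-JSON converter for the source PDF.
val cPDFConvert = CPDFConverterTableToJson(context, uri, "")
// Use the default conversion options.
val params = CPDFConvertTableToJsonOptions()
// Run the conversion; the three callbacks are passed as named arguments.
val result: ConvertError = cPDFConvert.convert(
    outputDir, outputFilenameNoSuffix, params, pageArrays,
    onHandle = onHandleCal,
    onProgress = onProgressCal,
    onPost = onPostCal
)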