
Convert PDF to TXT

Overview

When you need the text content of a PDF file for data analysis, text mining, information retrieval, or similar tasks, you can use CompDF Convert SDK to extract the text from the PDF into a TXT file.

Note

  • The current version of the SDK ignores rotated text in PDFs.

Sample

This sample demonstrates how to convert PDFs to TXT files.

java
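        // rootDir, input_file, output_file, and password are assumed to be defined elsewhere in the sample.
        CPDFConvert cpdfConvertTxt = new CPDFConvertTxt();
        CPDFConvertTxtOptions convertTxtOptions = new CPDFConvertTxtOptions();
        String inputPath = rootDir + input_file + "word.pdf";
        // Build the list of pages to convert from the document's page count.
        List<Integer> pageCounts = getPageCounts(cpdfConvertTxt.getPageCount(inputPath, password));
        // Convert the listed pages; the trailing lambda is a callback that receives `page` and is left empty here.
        ConvertResult convert = cpdfConvertTxt.convert(inputPath, rootDir + output_file, "", convertTxtOptions, pageCounts, password, page -> {
        });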
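
The sample calls a `getPageCounts` helper that is not shown on this page. As a minimal sketch, it could expand the page count returned by `getPageCount` into a list of page numbers. The class name `PageListSketch`, the method body, and the 1-based numbering are all assumptions made for illustration, not part of the SDK. Check the SDK reference for the page indexing it actually expects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the SDK: expands a page count into a list of page numbers.
class PageListSketch {

    // Assumption: pages are numbered from 1. Adjust the loop if the SDK expects 0-based indices.
    static List<Integer> getPageCounts(int pageCount) {
        List<Integer> pages = new ArrayList<>();
        for (int page = 1; page <= pageCount; page++) {
            pages.add(page);
        }
        return pages;
    }

    public static void main(String[] args) {
        // A three-page document yields the page numbers 1, 2, and 3.
        System.out.println(getPageCounts(3)); // prints [1, 2, 3]
    }
}
```

Passing a shorter list, for example only the first few page numbers, would presumably limit the conversion to those pages, though this page does not confirm that behavior.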