
How to Convert PDF to Excel Using Java - Free PDF Conversion API

By ComPDFKit | Thu. 18 Jan. 2024

For security reasons, many documents, such as invoices, industry reports, and tax forms, are usually saved in PDF format. However, if you need to extract data from these documents for more in-depth analysis and visualization, it's often necessary to convert them from PDF to Excel. In this post, you’ll learn how to convert PDF to XLSX in your Java application using ComPDFKit’s PDF to Excel API and API libraries for Java. With our API, you can convert up to 1000 PDF files per month for free. All you need to do is create a free account to get access to your API key.

 

[Image: Java PDF to Excel Conversion REST API - ComPDFKit]

 

ComPDFKit PDF converter API allows you to effortlessly convert PDF from/to XLSX, DOCX, PPTX, HTML, PNG, TXT, CSV, RTF, etc. Document conversion is just one of our 30+ PDF API tools. You can combine our conversion tool with other tools to create complex document processing workflows. Here are some of our other tools:

 

Merge, split, insert, extract, delete specific PDF pages

 

OCR, watermark, or compress PDFs

 

Compare documents (including content comparison and overlay comparison)

 

To simplify development, the ComPDFKit API Libraries let developers integrate the API services quickly, so you don’t need to build requests from scratch. The libraries support multiple languages, including Java, PHP, Python, C# .NET, Swift, Node.js, C/C++, Kotlin, Ruby, and Go.

 

Request Workflow

 

The processing workflow of the ComPDFKit API is very simple. It consists of four basic request instructions: create a task, upload a file, execute a task, and download a result file. Through these four requests, you can select the corresponding PDF tool to process your file and obtain the download link of the result file.
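The request sequence can be sketched as an ordered list of endpoints. The paths are the ones that appear in the request examples later in this article; the enum, its constant names, and the class name are illustrative only and not part of any ComPDFKit library. Authentication happens before these requests, and the download link is assumed to be read from the task information response.

```java
import java.util.ArrayList;
import java.util.List;

class WorkflowOverview {
    // Illustrative only: these names are not part of the ComPDFKit API.
    enum Step {
        CREATE_TASK("GET", "/v1/task/pdf/xlsx"),
        UPLOAD_FILE("POST", "/v1/file/upload"),
        EXECUTE_TASK("GET", "/v1/execute/start"),
        GET_TASK_INFO("GET", "/v1/task/taskInfo");

        private final String method;
        private final String path;

        Step(String method, String path) {
            this.method = method;
            this.path = path;
        }

        // Renders the step as "<HTTP method> <full URL>".
        String describe() {
            return method + " " + BASE_URL + path;
        }
    }

    static final String BASE_URL = "https://api-server.compdf.com/server";

    // Returns one description per step, in the order the requests are sent.
    static List<String> describeAll() {
        List<String> lines = new ArrayList<>();
        for (Step step : Step.values()) {
            lines.add(step.describe());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeAll().forEach(System.out::println);
    }
}
```

Keeping the order in one place makes it harder to call a later step, such as executing the task, before the file has been uploaded.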

 

[Image: ComPDFKit API request workflow]

 

Convert PDF to Excel with ComPDFKit API in Java

 

The ComPDFKit API uses HTTP requests for communication, allowing seamless access and integration across different programming languages like Java. 

 

When using this API for programmatic PDF to XLSX conversion, you have a range of parameters at your disposal. You can customize the data extraction process, choosing to include only tables, only text, or all content. Additionally, you can specify how worksheets are created—whether for each table, each page, or for the entire file. Further options include deciding whether to incorporate annotations and images, and whether to activate OCR and enable AI for table recognition.
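As a rough sketch, these options travel to the API as a small JSON object of string values. The helper below assembles such an object from a map. The keys contentOptions and worksheetOptions and the values "2" and "1" are taken from the upload request in this article; what each value means is not specified here, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class ExcelParameterJson {
    // Serializes option names and values into a flat JSON object of strings.
    static String toJson(Map<String, String> options) {
        StringJoiner json = new StringJoiner(", ", "{", "}");
        for (Map.Entry<String, String> entry : options.entrySet()) {
            json.add("\"" + escape(entry.getKey()) + "\": \"" + escape(entry.getValue()) + "\"");
        }
        return json.toString();
    }

    // Escapes backslashes and double quotes so the output stays valid JSON.
    private static String escape(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("contentOptions", "2");
        options.put("worksheetOptions", "1");
        System.out.println(toJson(options)); // prints {"contentOptions": "2", "worksheetOptions": "1"}
    }
}
```

Building the string from a map avoids hand-escaping quotes inside a Java string literal, which is easy to get wrong as more options are added.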

 

The steps to access the PDF to Excel API tool and convert a PDF are as follows:

 

Step 1 — Creating a Free Account on ComPDFKit

 

Go to our website, where you’ll see the page below, prompting you to create your free account.

 

[Image: Sign up for ComPDFKit API]

 

Once you’ve created your account, you’ll be welcomed by the page below, which shows an overview of your plan details.

 

[Image: Dashboard of ComPDFKit API]

 

As you can see on the dashboard, you can process 1000 documents per month for free, and you’ll be able to access all our PDF API tools.

 

Step 2 — Obtaining the API Key for Authentication

 

After you’ve verified your email, you can get your API key from the dashboard. In the menu on the left, click API Keys. You’ll see the following page, which is an overview of your keys:

 

[Image: Get API Key for authentication]

 

Now replace {{public_key}} and {{secret_key}} in the request below with the publicKey and secretKey values you get from the console. The authentication response returns the accessToken used in the following steps.

 

import java.io.*;
import okhttp3.*;
public class Main {
  public static void main(String[] args) throws IOException {
    OkHttpClient client = new OkHttpClient().newBuilder()
      .build();
    // The credentials are sent as a JSON document, so the body is declared as JSON.
    MediaType mediaType = MediaType.parse("application/json");
    RequestBody body = RequestBody.create(mediaType, "{\n    \"publicKey\": \"{{public_key}}\",\n    \"secretKey\": \"{{secret_key}}\"\n}");
    Request request = new Request.Builder()
      .url("https://api-server.compdf.com/server/v1/oauth/token")
      .method("POST", body)
      .build();
    // Print the response, which carries the accessToken, and close it when done.
    try (Response response = client.newCall(request).execute()) {
      System.out.println(response.body().string());
    }
  }
}
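The token request above returns a JSON body that contains the accessToken. A minimal sketch of pulling that value out with the standard library follows. The sample response is hypothetical and the real payload may be shaped differently; the only assumption the code relies on is a string field named accessToken. In a real project a JSON library would be the more robust choice.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AccessTokenExtractor {
    // Matches "accessToken": "<value>" anywhere in the response text.
    private static final Pattern TOKEN =
        Pattern.compile("\"accessToken\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the token if the field is present, otherwise an empty Optional.
    static Optional<String> extract(String responseBody) {
        Matcher matcher = TOKEN.matcher(responseBody);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical response body, for illustration only.
        String sample = "{\"data\":{\"accessToken\":\"abc123\",\"tokenType\":\"bearer\"}}";
        System.out.println(extract(sample).orElse("<no token found>")); // prints abc123
    }
}
```

The extracted value is what replaces {{accessToken}} in the Authorization header of the later requests.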

 

Step 3 — Creating Task

 

Replace {{accessToken}} with the access token obtained in the previous step, and {{language}} with the language in which error information should be displayed. The response data contains the taskId.

 

import java.io.*;
import okhttp3.*;
public class Main {
  public static void main(String[] args) throws IOException {
    OkHttpClient client = new OkHttpClient().newBuilder()
      .build();
    // OkHttp rejects a GET request that carries a body, so none is attached.
    Request request = new Request.Builder()
      .url("https://api-server.compdf.com/server/v1/task/pdf/xlsx?language={{language}}")
      .get()
      .addHeader("Authorization", "Bearer {{accessToken}}")
      .build();
    // Print the response, which carries the taskId, and close it when done.
    try (Response response = client.newCall(request).execute()) {
      System.out.println(response.body().string());
    }
  }
}
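The {{language}} and {{taskId}} placeholders in these URLs have to be substituted before the request is sent. One way to do that safely, sketched below, is to build the query string from a map and URL-encode each value. The class and method names are hypothetical, and the parameter values shown are placeholders rather than documented API values.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class TaskUrlBuilder {
    // Appends URL-encoded query parameters to a base URL, in map order.
    static String withQuery(String baseUrl, Map<String, String> params) {
        if (params.isEmpty()) {
            return baseUrl;
        }
        StringJoiner query = new StringJoiner("&", "?", "");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return baseUrl + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("taskId", "your-task-id"); // placeholder value
        params.put("language", "1");          // placeholder value
        System.out.println(withQuery(
            "https://api-server.compdf.com/server/v1/execute/start", params));
    }
}
```

The Charset overload of URLEncoder.encode requires Java 10 or later; on Java 8 the overload that takes the encoding name as a string can be used instead.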

 

Step 4 — Uploading Files

 

Set the file to the PDF you want to convert, and replace {{taskId}} with the taskId obtained in the previous step, {{language}} with the language in which error information should be displayed, and {{accessToken}} with the access token obtained in Step 2.

 

import java.io.*;
import okhttp3.*;
public class Main {
  public static void main(String[] args) throws IOException {
    OkHttpClient client = new OkHttpClient().newBuilder()
      .build();
    // Multipart form: the PDF itself plus the task ID and conversion parameters.
    RequestBody body = new MultipartBody.Builder().setType(MultipartBody.FORM)
      .addFormDataPart("file","{{file}}",
                       RequestBody.create(MediaType.parse("application/octet-stream"),
                                          new File(""))) // set the path to your PDF here
      .addFormDataPart("taskId","{{taskId}}")
      .addFormDataPart("language","{{language}}")
      .addFormDataPart("password","")
      .addFormDataPart("parameter","{  \"contentOptions\": \"2\",  \"worksheetOptions\": \"1\"}")
      .build();
    Request request = new Request.Builder()
      .url("https://api-server.compdf.com/server/v1/file/upload")
      .method("POST", body)
      .addHeader("Authorization", "Bearer {{accessToken}}")
      .build();
    // Print the upload response and close it when done.
    try (Response response = client.newCall(request).execute()) {
      System.out.println(response.body().string());
    }
  }
}

 

Step 5 — Processing Files

 

Replace {{taskId}} with the taskId obtained in Step 3, {{accessToken}} with the access token obtained in Step 2, and {{language}} with the language in which error information should be displayed.

 

import java.io.*;
import okhttp3.*;
public class Main {
  public static void main(String[] args) throws IOException {
    OkHttpClient client = new OkHttpClient().newBuilder()
      .build();
    // OkHttp rejects a GET request that carries a body, so none is attached.
    Request request = new Request.Builder()
      .url("https://api-server.compdf.com/server/v1/execute/start?taskId={{taskId}}&language={{language}}")
      .get()
      .addHeader("Authorization", "Bearer {{accessToken}}")
      .build();
    // Print the execution response and close it when done.
    try (Response response = client.newCall(request).execute()) {
      System.out.println(response.body().string());
    }
  }
}

 

Step 6 — Getting Task Information

 

Replace {{taskId}} with the taskId obtained in Step 3, and {{accessToken}} with the access token obtained in Step 2.

 

import java.io.*;
import okhttp3.*;
public class Main {
  public static void main(String[] args) throws IOException {
    OkHttpClient client = new OkHttpClient().newBuilder()
      .build();
    // OkHttp rejects a GET request that carries a body, so none is attached.
    Request request = new Request.Builder()
      .url("https://api-server.compdf.com/server/v1/task/taskInfo?taskId={{taskId}}")
      .get()
      .addHeader("Authorization", "Bearer {{accessToken}}")
      .build();
    // Print the task information and close the response when done.
    try (Response response = client.newCall(request).execute()) {
      System.out.println(response.body().string());
    }
  }
}
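A conversion may not be finished the first time the task information is requested, so the query above is typically repeated until the task reports completion. The sketch below shows one simple polling loop with a fixed delay and an attempt limit. The status supplier stands in for the task information request, and the status strings are hypothetical; check the actual response for the real values.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

class TaskPoller {
    // Calls the supplier until it returns doneStatus or maxAttempts is reached.
    // Returns the last status seen, so the caller can tell success from timeout.
    static String pollUntilDone(Supplier<String> statusSupplier, String doneStatus,
                                int maxAttempts, long delayMillis) {
        String status = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            status = statusSupplier.get();
            if (doneStatus.equals(status)) {
                return status;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop polling.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Polling interrupted", e);
                }
            }
        }
        return status;
    }

    public static void main(String[] args) {
        // Hypothetical status sequence standing in for repeated taskInfo requests.
        Iterator<String> fake = Arrays.asList("TaskProcessing", "TaskProcessing", "TaskFinish").iterator();
        String last = pollUntilDone(fake::next, "TaskFinish", 10, 100L);
        System.out.println("Last status: " + last); // prints Last status: TaskFinish
    }
}
```

In a real integration the supplier would send the task information request and read the status field from the response.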

 

Convert PDF to Excel with API Libraries for Java

 

The ComPDFKit API Libraries for Java follow the REST standard, offering a comprehensive set of PDF features such as viewing, editing, annotating, signing, OCR, compressing, and converting PDFs. Developers typically find it more efficient and user-friendly to call the library than to craft the API requests manually.

 

To kick off PDF to Excel conversion using API libraries in Java, ensure you meet the requirements and complete the installation process. 

 

Requirements:

 

Programming Environment: Java JDK 1.8 or higher.

 

Dependencies: Maven.

 

Installation:

 

Add the following dependency to your pom.xml:

 

[Image: Maven dependency for installing the ComPDFKit API Library in Java]

 

PDF To Excel Example

 

With the PDF to Excel tool, you can convert your PDF file into an Excel file. The following example shows how to upload a test PDF file and convert it into an Excel file (.xlsx) using Java.

 

// Create a client
CPDFClient client = new CPDFClient(publicKey,secretKey);

// Create a PDF to Excel task
CPDFCreateTaskResult result = client.createTask(CPDFConversionEnum.PDF_TO_EXCEL);

// Get a task id
String taskId = result.getTaskId();

// File handling parameter settings
CPDFToExcelParameter fileParameter = new CPDFToExcelParameter();
fileParameter.setIsContainAnnot("1");
fileParameter.setIsContainImg("1");
fileParameter.setContentOptions("2");
fileParameter.setWorksheetOptions("1");

// Upload files
client.uploadFile(new File("test.pdf"), taskId, fileParameter);

// Execute task
client.executeTask(taskId);

// Query TaskInfo
CPDFTaskInfoResult taskInfo = client.getTaskInfo(taskId);
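The example above stops at querying the task information. Once a download link for the result file is available, saving it locally needs only the standard library. The following is a minimal sketch; how the link is read from the task information is not shown in this article, so the download URL is left as a parameter, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class ResultDownloader {
    // Copies the stream to target, replacing any existing file; returns bytes written.
    static long saveTo(InputStream in, Path target) {
        try {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens the given download link and saves its content to target.
    static long download(String downloadUrl, Path target) {
        try (InputStream in = URI.create(downloadUrl).toURL().openStream()) {
            return saveTo(in, target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Offline demonstration: in-memory bytes stand in for the HTTP response.
        Path target = Paths.get(System.getProperty("java.io.tmpdir"), "converted-demo.xlsx");
        long written = saveTo(new ByteArrayInputStream(new byte[] {1, 2, 3}), target);
        System.out.println("Wrote " + written + " bytes to " + target);
    }
}
```

With a real link, the call would be download(link, Paths.get("result.xlsx")), where link is the URL taken from the task information.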

 

Free Online PDF to Excel Converter

 

How to convert PDF to Excel online for free? Please try our online PDF to XLSX converter to create an Excel file from a PDF document. This converter is developed using the ComPDFKit PDF Converter API. You can also use our free online PDF tools to convert PDF from/to Office and other formats such as HTML, PNG, TXT, CSV, and RTF. There are no apps to download, and all the tools are 100% free!

 

Conclusion

 

In this article, you’ve learned how to convert PDF data into Excel sheets in Java by programmatically uploading a PDF file, running the conversion task, and retrieving the converted Excel file, and how to convert any PDF to Excel for free using an online PDF to XLSX converter.

 

You can integrate all these PDF functionalities into your applications or systems. With the same API token, you can also perform other operations, such as splitting or merging PDFs, adding watermarks, using OCR and AI table recognition, and more. To get started with a free trial, sign up here.