New Features
1. Added version 2 of PDF to JSON, with 11 new recognition tags:
・Text: Ordinary text type object, including text content.
・Image: Image type object, including the path of the image.
・Table and UnstdTable: Table type objects, including the content and structure of the table.
・Catalogue: Catalogue type object, including the content of the catalogue.
・List and UnorderedList: List type objects, including the content of the list.
・Formula: Formula type object, including the content of the formula.
・Header: Header type object, including the content of the header.
・Footer: Footer type object, including the content of the footer.
・PageNumber: Page number type object, including the content of the page number.
・FigureTitle: Figure title type object, including the content of the figure title.
・FigureCaption: Figure caption type object, including the content of the figure caption.
2. Added version 2 of PDF to TXT, which supports retaining text styles.
3. DocumentAI layout analysis adds 11 new recognition tags; refer to the new PDF to JSON tags above.
4. DocumentAI supports running on a GPU.
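The recognition tags above lend themselves to simple post-processing of the JSON output. The following is a minimal sketch only: it assumes each recognized object is a dictionary that carries its tag under a "type" key, which is a hypothetical layout and not the documented output schema. The sample data is invented for illustration.

```python
from collections import defaultdict

# The recognition tags named in the release notes above.
KNOWN_TAGS = {
    "Text", "Image", "Table", "UnstdTable", "Catalogue", "List",
    "UnorderedList", "Formula", "Header", "Footer", "PageNumber",
    "FigureTitle", "FigureCaption",
}


def group_by_tag(objects, tag_key="type"):
    """Group recognized objects by their tag.

    ASSUMPTION: each object is a dict carrying its tag under `tag_key`;
    the real PDF to JSON output may be laid out differently.
    """
    groups = defaultdict(list)
    for obj in objects:
        tag = obj.get(tag_key)
        # Unknown or missing tags are kept under "Unknown" rather than dropped.
        groups[tag if tag in KNOWN_TAGS else "Unknown"].append(obj)
    return dict(groups)


# Hypothetical sample data, not real API output.
sample = [
    {"type": "Header", "content": "Annual Report"},
    {"type": "Text", "content": "Revenue grew this year."},
    {"type": "Image", "path": "images/chart.png"},
    {"type": "Text", "content": "Costs were flat."},
    {"type": "Watermark", "content": "DRAFT"},
]

grouped = group_by_tag(sample)
# Collect only the body text, skipping headers, images, and unknown objects.
body_text = [obj["content"] for obj in grouped.get("Text", [])]
```

Grouping by tag first makes it easy to drop headers, footers, and page numbers before passing the body text to downstream processing.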
New Features
・PDF and image conversion: Added support for setting the OCR recognition language.
・Added an interface for extracting content from images (JPG, JPEG, BMP, PNG) to JSON (requires OCR).
Issues Addressed
・Fixed a problem where the source file's resources were held and not released when converting scanned PDF files into editable PDFs.
・Fixed a crash on documents containing certain special hyperlink types.
New Features
・PDF and image conversion: Added support for setting the OCR recognition language.
Issues Addressed
・Reduced the misrecognition rate of forms.
・Improved handling of the image layer loss problem.
・Fixed a rare blank table overwriting problem.
・Fixed an issue where an error message was reported when a password was not set when uploading a file with a password in some conversion functions.
New Features
・Added cURL sample code, using PDF to Word as an example. You can view the code on the official website.
New Features
・Added scheduled task processing: Tasks that have been waiting for more than 15 minutes are resent to the queue.
Issues Addressed
・Fixed an issue where a task with the status "TaskProcessing" in the queue was marked as "TaskFinish" when it was processed for the second time.
・Fixed an issue where, in the dashboard, the assets of users who had not called the API were not refreshed.
・Fixed the LibreOffice multi-container startup failure caused by mounting a data volume in the /tmp directory.
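The scheduled task processing in this release (resending tasks that have waited more than 15 minutes) can be illustrated roughly as follows. This is a sketch under stated assumptions, not the service's implementation: the task fields "id", "status", and "enqueued_at" and the status name "TaskWaiting" are hypothetical, and only the 15-minute threshold comes from the note above.

```python
from collections import deque
from datetime import datetime, timedelta

WAIT_LIMIT = timedelta(minutes=15)  # threshold stated in the release note


def requeue_stale_tasks(tasks, queue, now):
    """Resend tasks that have waited longer than WAIT_LIMIT.

    ASSUMPTION: each task is a dict with hypothetical "id", "status",
    and "enqueued_at" fields; "TaskWaiting" is an invented status name.
    """
    resent = []
    for task in tasks:
        waited = now - task["enqueued_at"]
        if task["status"] == "TaskWaiting" and waited > WAIT_LIMIT:
            task["enqueued_at"] = now  # restart the waiting clock
            queue.append(task["id"])
            resent.append(task["id"])
    return resent


# Hypothetical tasks: one stale, one fresh, one already being processed.
now = datetime(2024, 1, 1, 12, 0)
tasks = [
    {"id": "a", "status": "TaskWaiting", "enqueued_at": now - timedelta(minutes=20)},
    {"id": "b", "status": "TaskWaiting", "enqueued_at": now - timedelta(minutes=5)},
    {"id": "c", "status": "TaskProcessing", "enqueued_at": now - timedelta(minutes=30)},
]
queue = deque()
resent_ids = requeue_stale_tasks(tasks, queue, now)
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the check deterministic and easy to run from a periodic scheduler.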
New Features
・Added support for converting scanned PDF files to searchable PDF files.
・Added support for converting images to Word, Excel, PPT, HTML, TXT, CSV, and RTF.
・Added support for table recognition in the flowing text layout when converting PDF to Word.
Issues Addressed
・Improved the conversion quality when converting PDF to Excel with the entire PDF document content output in one worksheet.
・Fixed the issue of conversion failure when the incoming file path contains the '%' symbol.
・Updated the error codes of the underlying library when converting files.
New Features
・Added support for extracting the information from images in PDFs and saving structured data as JSON files.
・Added support for automatically releasing the memory of the container via asynchronous API requests.
・Added support for recognizing table regions with AI when converting PDF, improving the accuracy of table recognition.
・Added support for accurate file conversion with OCR, improving the conversion quality of scanned PDF files and reducing image loss when converting files.
・Added support for recognizing images with OCR when converting PDF to Excel, and improved image rendering in the Excel table.
・Added support for recognizing and extracting highlight, underline, squiggly, and strikeout annotations in PDF files, and keeping these annotations after converting the files to Word, Excel, PPT, and HTML.
・Added support for converting PDF to Word or PPT with hyperlinks and keeping the links working properly.
・Added support for extracting text and text coordinates from PDF files, and outputting structured data as JSON files.
・Added support for extracting tables from PDF files and outputting structured data as JSON files.
・Optimized the way to call the C/C++ interface of the underlying library, with better operation consistency, more flexible error handling, and dynamic parameter passing, as well as higher performance and efficiency.
・Upgraded the text detection model and the text recognition model to improve OCR accuracy and precision.
・Improved line detection in the table recognition function for low-resolution images.
・Optimized the table recognition logic: if standard table recognition fails, the non-standard table algorithm is used to try again, improving the table recognition success rate.
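The fallback order described for table recognition (standard algorithm first, non-standard algorithm on failure) can be sketched as below. This only illustrates the control flow: the two recognizer functions are hypothetical stand-ins, and the convention that a recognizer returns None on failure is an assumption, not the library's interface.

```python
def recognize_table(image, standard, non_standard):
    """Try the standard recognizer, then fall back to the non-standard one.

    ASSUMPTION: both recognizers are callables that return a table
    (any non-None value) on success and None on failure.
    """
    table = standard(image)
    if table is not None:
        return table, "standard"
    table = non_standard(image)
    if table is not None:
        return table, "non-standard"
    return None, "failed"


# Hypothetical stand-ins for the real recognition algorithms.
def ruled_only(image):
    return [["a", "b"]] if image.get("has_ruling_lines") else None


def whitespace_based(image):
    return [["a", "b"]] if image.get("has_columns") else None


# An image without ruling lines fails the first pass and succeeds on the second.
result = recognize_table(
    {"has_ruling_lines": False, "has_columns": True},
    ruled_only,
    whitespace_based,
)
```

Returning the name of the algorithm that succeeded alongside the table makes it possible to track how often the fallback path is taken.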
New Features
・Added support for comparing PDF documents, including overlay comparison and content comparison.
Issues Addressed
・Optimized the "front" parameter of the image watermark, so you can choose whether to place the image watermark on top.
・Optimized PDF to HTML conversion to produce smaller files.
・Improved the slow response when clicking cancel while converting PDF to RTF.
・Fixed a crash caused by setting negative DPI values when converting PDF to JPG.
・Fixed issues when converting PDF to HTML where, in some cases, specific outlines could not be jumped to and comments were lost.
・Fixed the issue where getting font names failed on Linux.
Issues Addressed
・Optimized the API interface languages: English and Simplified Chinese are supported.
New Features
・Added support for image processing, including edge detection, intelligent automatic image correction, ISO noise correction, automatic skew correction, automatic document orientation detection, etc. to improve image quality.
・Added support for recognizing the table regions and extracting the table structure and data information completely.
・Added support for layout analysis to detect and analyze images and forms, and process them separately.
・Added support for trim correction of PNG image files.
・Added support for automatically detecting and recognizing stamps in contract documents or common bills, and outputting the text content, stamp location information, and the number of stamps.
New Features
・Added support for Retain Flowing Text when converting PDF to Word.
・Added support for setting DPI when converting PDF to JPG, such as setting DPI 300.
・Added support for customizing optimization options.
New Features
・Added support for using OCR to convert scanned PDFs or image-based PDFs into editable and searchable PDFs.
New Features
・Comprehensively upgraded server architecture to increase server performance, and improve server operating speed, stability, carrying capacity, and security.
New Features
・Added support for inserting a blank page, or selecting another PDF to insert into the existing document.
・Added support for deleting one or more pages from a PDF file.
・Added support for rotating the chosen pages by 90 or 180 degrees.
・Added support for extracting pages or page ranges from documents and saving them as a new PDF document.
・Added support for creating and deleting text or image (.jpg) watermarks on PDF files. The watermark's scale, opacity, rotation angle, and other properties can be set.
New Features
・Added support for converting PDFs to HTML (.html).
・Added support for converting PDFs to CSV (.zip or .csv).
・Added support for converting PDFs to RTF (.rtf).
・Added support for converting HTML (.html) to PDFs.
・Added support for converting RTF (.rtf) to PDFs.
・Added support for converting CSV (.csv) to PDFs.
・Added support for combining and merging two documents or a list of documents into one PDF document.
・Added support for splitting PDF files by pages and saving them in individual PDFs or splitting a particular collection of pages from a PDF file.
ComPDFKit API is an HTTP API that provides you with a simple document-in, document-out workflow. PDF conversion, document editing, document AI, document comparison, and other advanced PDF functionalities are available.
ComPDFKit API V1.0.0 supports the functionalities as follows:
・Support converting PDFs to Word (.docx).
・Support converting PDFs to Excel (.xlsx).
・Support converting PDFs to PowerPoint (.pptx).
・Support converting PDFs to Image (.png or .jpg).
・Support converting PDFs to TXT (.txt).
・Support converting Word (.doc or .docx) to PDFs.
・Support converting Excel (.xls or .xlsx) to PDFs.
・Support converting PPT (.ppt or .pptx) to PDFs.
・Support converting Image (.png) to PDFs.
Useful Links:
・Home Page: https://www.compdf.com
・API Page: https://api.compdf.com/
・Contact Sales: https://api.compdf.com/contact-us
・Technical Issues Feedback: https://www.compdf.com/support