Convert PDF to HTML

Overview

ComPDFKit Conversion SDK can convert PDF files to HTML files while preserving the layout and formatting of the original document, so that users can view the document in a web browser.

Notice

When converting PDF to HTML format, ComPDFKit Conversion SDK provides the following four options to create HTML files:

| Options | Description |
| --- | --- |
| PageAndNavigationPaneOptions.SinglePage | Convert the entire PDF file into a single HTML file, where all PDF pages are connected in sequence according to page number, displayed on the same HTML page. |
| PageAndNavigationPaneOptions.SinglePageNavigationByBookmarks | Convert the PDF file into a single HTML file with an outline for navigation at the beginning of the HTML page. Still, all PDF pages are connected in sequence according to page number, displayed on the same HTML page. |
| PageAndNavigationPaneOptions.MultiplePages | Convert the PDF file into multiple HTML files. Each HTML file corresponds to a PDF page, and users can navigate to the next HTML file via a link at the bottom of the HTML page. |
| PageAndNavigationPaneOptions.MultiplePagesNavigationByBookmarks | Convert the PDF file into multiple HTML files. Each HTML file corresponds to a PDF page, and users can navigate to the next HTML file via a link at the bottom of the HTML page. The links of all the HTML files are presented in an outline HTML file for navigation. |
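
To make the difference between the four options concrete, the sketch below estimates how many HTML files each one would produce for a PDF with a given page count. It is an illustration derived from the descriptions above, not SDK code: `LayoutOption` is a local stand-in that only mirrors the SDK's option names, `ExpectedHtmlFileCount` is a hypothetical helper, and the count for the last option assumes the outline is exactly one additional HTML file.

```csharp
using System;

// Local stand-in that mirrors the option names in the table above.
// It is NOT the SDK's PageAndNavigationPaneOptions type.
public enum LayoutOption
{
    SinglePage,
    SinglePageNavigationByBookmarks,
    MultiplePages,
    MultiplePagesNavigationByBookmarks
}

public static class HtmlLayoutSketch
{
    // Hypothetical helper: estimates the number of HTML files produced
    // for a PDF with pageCount pages, based on the option descriptions.
    public static int ExpectedHtmlFileCount(LayoutOption option, int pageCount)
    {
        if (pageCount < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(pageCount), "Page count must be at least 1.");
        }

        switch (option)
        {
            case LayoutOption.SinglePage:
            case LayoutOption.SinglePageNavigationByBookmarks:
                // All pages end up in one HTML file.
                return 1;
            case LayoutOption.MultiplePages:
                // One HTML file per PDF page.
                return pageCount;
            case LayoutOption.MultiplePagesNavigationByBookmarks:
                // One HTML file per page, plus one outline file (assumption).
                return pageCount + 1;
            default:
                throw new ArgumentOutOfRangeException(nameof(option));
        }
    }
}
```

For example, `HtmlLayoutSketch.ExpectedHtmlFileCount(LayoutOption.MultiplePages, 12)` returns 12, while either single-page option returns 1 regardless of page count.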

Sample

This sample demonstrates how to convert a PDF file to an HTML file.

c#
// Replace the "***" placeholders with your own paths and file name.
string inputFilePath = "***";
string outputFolderPath = "***";
string outputFileName = "***";

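// Create an HTML converter for the input PDF.
// The `as` cast yields null if the factory returns a different converter type.
CPDFConverterHTML converter = CPDFConvertFactroy.CreateConverter(CPDFConvertType.CPDFConvertTypeHtml, inputFilePath) as CPDFConverterHTML;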

// Configure the HTML output: a single page with bookmark navigation,
// no OCR, and with annotations and images included.
CPDFConvertHTMLOptions htmlOptions = new CPDFConvertHTMLOptions();
htmlOptions.PageAndNavigationPaneOpts = PageAndNavigationPaneOptions.SinglePageNavigationByBookmarks;
htmlOptions.IsAllowOCR = false;
htmlOptions.IsContainAnnotations = true;
htmlOptions.IsContainImages = true;

// Select every page for conversion; page numbers in the array start at 1.
int pageCount = converter.GetPagesCount();
int[] pageArray = new int[pageCount];
for (int i = 0; i < pageArray.Length; i++)
{
    pageArray[i] = i + 1;
}

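// `error` is passed by reference and holds the conversion result after the call.
// `getPorgress` is a progress callback that must be defined elsewhere in your code.
ConvertError error = ConvertError.ERR_UNKNOWN;
converter.Convert(outputFolderPath, ref outputFileName, htmlOptions, pageArray, ref error, getPorgress);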
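
The sample selects every page by filling `pageArray` with the page numbers 1 through `pageCount`. If only part of a document needs converting, the same kind of array could hold a subset of page numbers instead. The following is a minimal sketch of a helper that builds such an array. `PageRange.Build` is a hypothetical name and not part of the SDK, and the sketch assumes the converter accepts any ascending list of 1-based page numbers.

```csharp
using System;

public static class PageRange
{
    // Hypothetical helper: returns the 1-based page numbers first..last (inclusive).
    // pageCount is the total number of pages in the document and is used
    // only to validate the requested range.
    public static int[] Build(int first, int last, int pageCount)
    {
        if (first < 1 || last > pageCount || first > last)
        {
            throw new ArgumentOutOfRangeException(
                nameof(first),
                $"Range {first}-{last} is not within 1-{pageCount}.");
        }

        int[] pages = new int[last - first + 1];
        for (int i = 0; i < pages.Length; i++)
        {
            pages[i] = first + i;
        }
        return pages;
    }
}
```

With this helper, `PageRange.Build(2, 5, pageCount)` produces the array `{ 2, 3, 4, 5 }`, which could be passed in place of `pageArray` in the sample above.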