Getting Started Guide

This guide will walk you through the steps to start ComPDFKit On-premises. It will also show you how to use it to process documents.

Requirements

ComPDFKit On-premises for Linux can run on multiple platforms. It supports the following operating systems:

  • Ubuntu, Fedora, Debian, or CentOS. It also supports Ubuntu and Debian derivatives such as Kubuntu or Xubuntu. Currently, it only supports 64-bit Intel (x86_64) processors.

Regardless of the operating system, you need at least 4 GB of RAM.
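
The requirements above can be checked from a terminal before anything is installed. The following is a minimal sketch, assuming a Linux host where uname and /proc/meminfo are available. The helper names is_supported_arch and kb_to_mib are made up for this example and are not part of ComPDFKit.

```shell
#!/bin/sh
# Pre-flight check for the requirements listed above (Linux only).

# Succeeds only for the 64-bit Intel architecture named in this guide.
is_supported_arch() {
    [ "$1" = "x86_64" ]
}

# Converts a kB figure (as reported in /proc/meminfo) to whole MiB.
kb_to_mib() {
    echo $(( $1 / 1024 ))
}

arch="$(uname -m)"
if is_supported_arch "$arch"; then
    echo "Architecture OK: $arch"
else
    echo "Unsupported architecture: $arch (x86_64 expected)"
fi

# MemTotal is slightly below the installed RAM because the kernel reserves some.
if [ -r /proc/meminfo ]; then
    mem_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
    if [ -n "$mem_kb" ]; then
        echo "Total memory: $(kb_to_mib "$mem_kb") MiB (this guide asks for at least 4 GB)"
    fi
fi
```

The script only reports what it finds and never aborts, so you can compare the printed memory figure against the 4 GB requirement yourself.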

Install Docker

ComPDFKit On-premises for Linux is distributed in the form of a Docker container. To run it on your computer, you need to install the Docker runtime environment for your operating system.

Please follow the instructions on the official Docker website to install and start Docker Engine.

After installing Docker, install Docker Compose by following the installation instructions in the Docker documentation.
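
To confirm that both tools are available after installation, a quick check such as the following may help. It is a sketch only: have_cmd is a hypothetical helper name, and the script assumes the standalone docker-compose command used throughout this guide (newer Docker installations may provide Compose as "docker compose" instead).

```shell
#!/bin/sh
# Reports whether the tools this guide relies on are on the PATH.

# Succeeds if the named command can be found.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

for tool in docker docker-compose; do
    if have_cmd "$tool"; then
        # Print the first line of the tool's version banner.
        echo "$tool: found ($("$tool" --version 2>/dev/null | head -n 1))"
    else
        echo "$tool: not found"
    fi
done
```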

Start ComPDFKit On-premises

ComPDFKit On-premises provides a simple web page for rendering and displaying PDF files, on which you can also process PDF files directly. Once the service is running, you can access it at localhost:7000/index.html.

ComPDFKit On-premises for Linux uses a MySQL database for data storage. Therefore, you need to configure an available MySQL database and import the "compdfkit.sql" file into it.

Register a Docker Hub account, then reference the ComPDFKit On-premises for Linux image as compdfkit/compdfkit:tag. To pull the latest ComPDFKit On-premises for Linux image, run the following command:

shell
docker pull compdfkit/compdfkit:3.0.0

Make sure that docker-compose is installed on your system. If it is not, see "Overview of installing Docker Compose | Docker Docs" to learn how to install it on your operating system.

First, you need to create a docker-compose.yml file at a location of your choice. You can use vim docker-compose.yml to edit it. Copy the following content into the docker-compose.yml file:

yaml
version: '3.3'
services:
  compdfkit_processor:
    restart: always
    image: compdfkit/compdfkit:3.0.0
    container_name: compdfkit_processor
    # If you do not use GPU, you can omit this part.
    deploy:
      resources:
        reservations:
          devices:
          - driver: "nvidia"
            count: "all"
            capabilities: [ "gpu" ]
    ports:
      - 7000:7000
    environment:
      LICENSE: your LICENSE_KEY
      DB_URL: dbmysql:3306/compdfkit
      DB_USERNAME: root
      DB_PASSWORD: mypassword
      # Temporary storage space for file processing.
      TMP_PATH: /tmp/compdfkit
      # Error message language setting, supports Chinese and English (default is en).
      LANGUAGE: en
      USEGPU: 'true'
      GPU_ID: "0"
    depends_on:
      - dbmysql
  dbmysql:
    image: mysql:8.0.27
    restart: always
    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci --skip-character-set-client-handshake --skip-name-resolve --max_allowed_packet=500M --max_connections=1000 --default-authentication-plugin=mysql_native_password
    container_name: dbmysql
    ports:
      - 3306:3306
    environment:
      MYSQL_ROOT_PASSWORD: mypassword
      MYSQL_DATABASE: compdfkit
    volumes:
      - ./data:/var/lib/mysql
      - ./compdfkit.sql:/docker-entrypoint-initdb.d/compdfkit.sql

Next, copy the "compdfkit.sql" file into the same directory as the docker-compose.yml file.

Then you need to execute the following command:

sh
docker-compose up

When you see the following content, it means the project has started successfully:

shell
Running ComPDFKit Processor version 3.0.0 port(s) 7000 (http)
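
Instead of watching the log for the line above, you can poll the web page mentioned earlier until it responds. The sketch below assumes curl is installed. wait_for_url is a hypothetical helper, and the attempt count and delay are arbitrary values chosen for illustration.

```shell
#!/bin/sh
# Polls a URL until it answers or the attempts run out.
# Usage: wait_for_url URL ATTEMPTS DELAY_SECONDS
wait_for_url() {
    url="$1"; attempts="$2"; delay="$3"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s: quiet, -f: treat HTTP errors as failure, -o /dev/null: discard the body
        if curl -sf -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example call (commented out so that sourcing this file has no side effects):
# wait_for_url http://localhost:7000/index.html 30 2 && echo "ComPDFKit On-premises is up"
```

The function returns a non-zero status when the service never answers, so it can also gate later steps in a deployment script.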

API Configuration

Required:

  • LICENSE - This is the license key used to activate ComPDFKit On-premises. If not specified or incorrect, ComPDFKit On-premises will not be able to start.

  • DB_URL - This is the database connection address, composed of <host>:<port>/<database name>. If not specified or incorrect, ComPDFKit On-premises will not be able to start.

  • DB_USERNAME - This is the username for the database connection. If not specified or incorrect, ComPDFKit On-premises will not be able to start.

  • DB_PASSWORD - This is the password for the database connection. If not specified or incorrect, ComPDFKit On-premises will not be able to start.

Optional:

  • SERVER_PORT - ComPDFKit On-premises listening port. Default is 7000.

  • TMP_PATH - Temporary storage space for loading temporary files. Default is /tmp/compdfkit.

  • LANGUAGE - Language of the interface error descriptions, either zh_cn or en. Default is en.

  • CONVERT_TIMEOUT - File processing timeout in minutes. Default is 15.

  • USEGPU - Whether to enable GPU to execute DocumentAI related functions. Default is false.

  • GPU_ID - Select the GPU ID to use. Default is 0.
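
As a small illustration of the DB_URL format described above, the following sketch assembles the value from its three parts. db_url is a hypothetical helper, and the sample values are the ones used in the docker-compose.yml of this guide.

```shell
#!/bin/sh
# Builds a DB_URL value in the <host>:<port>/<database name> form described above.
# Usage: db_url HOST PORT DATABASE
db_url() {
    printf '%s:%s/%s\n' "$1" "$2" "$3"
}

# Values taken from the docker-compose.yml in this guide.
DB_URL="$(db_url dbmysql 3306 compdfkit)"
echo "DB_URL=$DB_URL"
```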

Automated Container Memory Release Solution

ComPDFKit's automated container memory release solution provides you with basic load balancing capabilities and automated container monitoring.

Queue Service:

We use RabbitMQ as our asynchronous message queue solution. RabbitMQ's high performance and reliability enable us to achieve efficient message delivery, improving the responsiveness and reliability of our system.

Container Monitoring:

We provide the ComPDFKit Server service, which automatically monitors the memory usage of the containers after startup. If a container's memory usage exceeds the maximum value you defined in docker-compose.yml and the container has no files being processed, the service automatically frees the container's memory.
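
The rule described above can be pictured as a simple two-part condition. The sketch below is purely illustrative and is not ComPDFKit Server's actual code: should_release is a hypothetical function, and the usage figures are made up. Only the 2048 MB limit corresponds to the MEMORY_USAGE_LIMIT value in the example compose file of this guide.

```shell
#!/bin/sh
# Illustrative decision rule only, not ComPDFKit Server's implementation.
# Succeeds when usage is above the limit AND no files are being processed.
# Usage: should_release USED_MB LIMIT_MB FILES_IN_PROGRESS
should_release() {
    [ "$1" -gt "$2" ] && [ "$3" -eq 0 ]
}

# Over the limit and idle.
if should_release 2300 2048 0; then echo "release memory"; else echo "leave running"; fi
# Over the limit but still processing two files.
if should_release 2300 2048 2; then echo "release memory"; else echo "leave running"; fi
# Under the limit.
if should_release 1500 2048 0; then echo "release memory"; else echo "leave running"; fi
```

Only the first call prints "release memory", because the other two fail one of the two conditions.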

Example

1. Open the Docker daemon port

For container monitoring, ComPDFKit Server needs to read container data from Docker and control the containers, so you need to open the Docker daemon port.

  • Open the Docker configuration file

    Server: Ubuntu 18.04

    sudo vim /lib/systemd/system/docker.service

    Server: Ubuntu 20.04

    sudo vim /usr/lib/systemd/system/docker.service

  • Change setting

    Find the following ExecStart line in the configuration file:

    ExecStart=/usr/bin/dockerd -H unix:///var/run/docker.sock

    Append -H tcp://0.0.0.0:12375 to it, so that the line reads:

    ExecStart=/usr/bin/dockerd -H unix:///var/run/docker.sock -H tcp://0.0.0.0:12375

  • Restart Docker

    systemctl daemon-reload && systemctl restart docker
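
After the restart, you may want to confirm that the daemon answers on the TCP port. The sketch below assumes curl is installed and queries the /version endpoint of the Docker Engine API. docker_api_version is a hypothetical helper name, and the port is the 12375 configured above.

```shell
#!/bin/sh
# Queries the Docker Engine API /version endpoint over TCP and prints the JSON reply.
# Usage: docker_api_version HOST PORT
docker_api_version() {
    curl -sf --max-time 5 "http://$1:$2/version"
}

if docker_api_version 127.0.0.1 12375; then
    echo
    echo "Docker daemon is reachable on tcp://127.0.0.1:12375"
else
    echo "Docker daemon is not reachable on tcp://127.0.0.1:12375"
fi
```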

2. Start

ComPDFKit Server and ComPDFKit Async use a MySQL database to store data, so you need to configure an available MySQL database and import "compdfkit.sql" into your database.

First, register a Docker Hub account and use compdfkit/compdfkit-xxx:tag to reference the ComPDFKit images. To pull the latest ComPDFKit images, run the following commands:

docker pull compdfkit/compdfkit-server:1.3.0
docker pull compdfkit/compdfkit-async:

Second, create a docker-compose.yml file at a location of your choice. You can use vim docker-compose.yml to edit it. Then copy the following content into the docker-compose.yml file.

Note: Make sure that docker-compose is installed on your system first. If it has not been installed yet, see "Overview of installing Docker Compose | Docker Docs" to learn how to install docker-compose on the system you are using.

yml
version: '3.3'
services:
  compdfkit_async1:
    restart: always
    image: compdfkit/compdfkit-async:
    container_name: compdfkit_async1
    ports:
      - 17001:7000
    environment:
      LICENSE: your LICENSE_KEY
      DB_URL: dbmysql:3306/compdfkit
      DB_USERNAME: root
      DB_PASSWORD: mypassword
      # Temporary storage space for file processing.
      TMP_PATH: /tmp/compdfkit
      # Error prompt language setting supports Chinese and English (default English).
      LANGUAGE: zh_cn
      MQ_HOST: rabbitmq
      MQ_PORT: 5672
      MQ_USERNAME: admin
      MQ_PASSWORD: admin
      # compdfkit_server container access address
      COMPDFKIT_SERVER_ADDRESS: http://127.0.0.1:17000
    volumes:
      - /tmp/compdfkit:/tmp/compdfkit
    depends_on:
      - dbmysql
  compdfkit_async2:
    restart: always
    image: compdfkit/compdfkit-async:
    container_name: compdfkit_async2
    ports:
      - 17002:7000
    environment:
      LICENSE: your LICENSE_KEY
      DB_URL: dbmysql:3306/compdfkit
      DB_USERNAME: root
      DB_PASSWORD: mypassword
      # Temporary storage space for file processing.
      TMP_PATH: /tmp/compdfkit
      # Error prompt language setting supports Chinese and English (default English).
      LANGUAGE: zh_cn
      MQ_HOST: rabbitmq
      MQ_PORT: 5672
      MQ_USERNAME: admin
      MQ_PASSWORD: admin
      # compdfkit_server container access address
      COMPDFKIT_SERVER_ADDRESS: http://127.0.0.1:17000
    volumes:
      - /tmp/compdfkit:/tmp/compdfkit
    depends_on:
      - dbmysql  
  compdfkit_server:
    restart: always
    image: compdfkit/compdfkit-server:1.3.0
    container_name: compdfkit_server
    ports:
      - 17000:17000
    environment:
      SERVER_PORT: 17000
      DOCKER_HOST: tcp://127.0.0.1:12375
      DOCKER_VERSION: 1.41
      CONTAINER_LIST: compdfkit_async1,compdfkit_async2
      MEMORY_USAGE_RATE_LIMIT: 30
      MEMORY_USAGE_LIMIT: 2048
      DB_URL: 127.0.0.1:3307/compdfkit
      DB_USERNAME: root
      DB_PASSWORD: mypassword
      MQ_HOST: 127.0.0.1
      MQ_PORT: 5672
      MQ_USERNAME: admin
      MQ_PASSWORD: admin
      TMP_PATH: /tmp/compdfkit
      LANGUAGE: en
    volumes:
      - /tmp/compdfkit:/tmp/compdfkit
    network_mode: "host"
    depends_on:
      - dbmysql
  dbmysql:
    image: mysql:8.0.27
    restart: always
    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci --skip-character-set-client-handshake --skip-name-resolve --max_allowed_packet=500M --max_connections=1000 --default-authentication-plugin=mysql_native_password
    container_name: dbmysql
    ports:
      - 3307:3306
    environment:
      MYSQL_ROOT_PASSWORD: mypassword
      MYSQL_DATABASE: compdfkit
    volumes:
      - ./compdfkit.sql:/docker-entrypoint-initdb.d/compdfkit.sql
  rabbitmq:
    image: rabbitmq:3.10.0-rc.3-management
    container_name: convert-rabbitmq
    restart: always
    ports:
      - '5672:5672'
      - '15672:15672'
    environment:
      - RABBITMQ_DEFAULT_USER=admin
      - RABBITMQ_DEFAULT_PASS=admin
    volumes:
      - ./rabbitmq_data:/var/lib/rabbitmq

Finally, copy the "compdfkit.sql" file into the same directory as docker-compose.yml. Then execute the following command:

sh
docker-compose up -d

When you see the following content, it means the project has been successfully launched:

shell
Creating dbmysql ... done
Creating convert-rabbitmq ... done
Creating compdfkit_async1 ... done
Creating compdfkit_async2 ... done
Creating compdfkit_server ... done
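
Because docker-compose up -d runs in the background, it can be useful to verify afterwards that every container is actually running. The following sketch compares the output of docker ps with the container names from the compose file above. missing_containers is a hypothetical helper written for this example.

```shell
#!/bin/sh
# Reads running container names on stdin (one per line) and prints every
# expected name, given as arguments, that is not among them.
missing_containers() {
    running="$(cat)"
    for name in "$@"; do
        # -x: match the whole line, -F: fixed string, -q: no output
        printf '%s\n' "$running" | grep -qxF -- "$name" || echo "$name"
    done
}

# Container names come from the docker-compose.yml above.
docker ps --format '{{.Names}}' 2>/dev/null |
    missing_containers dbmysql convert-rabbitmq compdfkit_async1 compdfkit_async2 compdfkit_server
```

An empty result means all five containers are up. Any name that is printed is a container that docker ps does not list as running.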

Must be configured:

  • LICENSE - This is the license key used to activate ComPDFKit Async. If not specified or incorrect, ComPDFKit Async will fail to start.

  • DB_URL - Database connection address, consisting of <host>:<port>/<database name>. If not specified or incorrect, ComPDFKit Server or Async will fail to start.

  • DB_USERNAME - Username for the database connection. If not specified or incorrect, ComPDFKit Server or Async will fail to start.

  • DB_PASSWORD - Password for the database connection. If not specified or incorrect, ComPDFKit Server or Async will fail to start.

  • DOCKER_VERSION - Docker daemon API version, which you can view with the command docker version --format '{{.Server.APIVersion}}'. If this value is missing or incorrect, ComPDFKit Server will not be able to monitor the corresponding container in Docker.

  • DOCKER_HOST - Docker daemon API address. If the Docker daemon port is not opened or is incorrect, ComPDFKit Server will not be able to monitor the corresponding container in Docker.

  • CONTAINER_LIST - A list of ComPDFKit Conversion container names that you need to monitor. If not specified or the container name is incorrect, ComPDFKit Server will not automatically monitor the container.

  • MEMORY_USAGE_RATE_LIMIT - The maximum memory usage rate you can accept for ComPDFKit Conversion. If usage exceeds this value, ComPDFKit Server will release the container's memory.

  • MEMORY_USAGE_LIMIT - The maximum memory usage, in MB, that you can accept for ComPDFKit Conversion. If usage exceeds this value, ComPDFKit Server will release the container's memory.

  • MQ_HOST - The address of the queue, default is 127.0.0.1. If not set, ComPDFKit Server or Async will fail to start.

  • MQ_PORT - The port of the queue, default is 5672. If not set, ComPDFKit Server or Async will fail to start.

  • MQ_USERNAME - Queue username, default is admin. If not set, ComPDFKit Server or Async will fail to start.

  • MQ_PASSWORD - Password for the queue, default is admin. If not set, ComPDFKit Server or Async will fail to start.

  • COMPDFKIT_SERVER_ADDRESS - The request address of the ComPDFKit Server service. If not set or set incorrectly, ComPDFKit Conversion will not start properly.
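
To fill in DOCKER_VERSION, you can read the API version from the running daemon and check that it looks plausible before copying it into docker-compose.yml. This is a sketch under the assumption that the docker CLI is installed and the daemon is running. is_api_version is a hypothetical helper that performs only a rough format check.

```shell
#!/bin/sh
# Succeeds if the argument looks like a Docker API version such as 1.41.
is_api_version() {
    case "$1" in
        ""|*[!0-9.]*) return 1 ;;   # empty, or contains something other than digits and dots
        [0-9]*.[0-9]*) return 0 ;;  # starts with a digit and has a dot followed by a digit
        *) return 1 ;;
    esac
}

api_version="$(docker version --format '{{.Server.APIVersion}}' 2>/dev/null)"
if is_api_version "$api_version"; then
    echo "DOCKER_VERSION: $api_version"
else
    echo "Could not read the API version; is the Docker daemon running?"
fi
```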

Optional configuration:

  • SERVER_PORT - ComPDFKit On-premises listening port, default 7000.
  • TMP_PATH - Temporary storage space for loading temporary files, default /tmp/compdfkit.
  • LANGUAGE - Language of the interface error descriptions, either zh_cn or en, default en.
  • CONVERT_TIMEOUT - File processing timeout (in minutes), default 15.

Install curl

You interact with the processor through its HTTP API: you send requests containing files and commands, and receive result files. To make API calls, you need curl. Most desktop Linux distributions come bundled with it. To check whether it is already installed, run curl --version in the terminal. If you get an error, install curl using your distribution's package manager.

Ubuntu/Debian:

shell
apt-get update && apt-get install -y curl
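
For the other distributions listed under the requirements, the package manager differs. The sketch below prints a suggested install command based on the ID field of /etc/os-release. The Fedora and CentOS commands are not taken from this guide; they are the usual dnf and yum invocations. curl_install_cmd is a hypothetical helper.

```shell
#!/bin/sh
# Maps an os-release ID to a curl install command for that distribution.
curl_install_cmd() {
    case "$1" in
        ubuntu|debian) echo "apt-get update && apt-get install -y curl" ;;
        fedora)        echo "dnf install -y curl" ;;
        centos)        echo "yum install -y curl" ;;
        *)             return 1 ;;
    esac
}

# /etc/os-release defines ID (for example "ubuntu"); read it in a subshell
# so that its variables do not leak into this script.
if [ -r /etc/os-release ]; then
    distro_id="$(. /etc/os-release && echo "$ID")"
    curl_install_cmd "$distro_id" || echo "No suggestion for distribution: $distro_id"
fi
```

The script only prints the command. Run the printed command yourself with root privileges.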

Example Feature: PDF Conversion

Everything is now set up, and you can begin using ComPDFKit On-premises for PDF document processing. This section uses the PDF to Word conversion function as an example. For API calls for other functions, please refer to "ComPDFKit_Processor_API_Reference_v1.2.0.pdf".

  • Choose the file you want to process (for example, document.pdf) and place it in the directory from which you will run the command.

  • Run the following command:

shell
curl -f -X POST http://localhost:7000/file/handle \
-H "Content-Type: multipart/form-data" \
-F file=@"document.pdf" \
-F executeType="pdf/docx" \
-F password="file open password" \
> result.docx

Open result.docx in a Word viewer to see the converted Word document.
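
The same request can be wrapped in a small script to convert several files in one go. The sketch below reuses the endpoint and the executeType value from the command above. It omits the password field on the assumption that the input files are not password protected, and it writes the output with curl's -o option instead of shell redirection. output_name and convert_pdf are hypothetical helper names.

```shell
#!/bin/sh
# Derives an output file name: report.pdf + docx -> report.docx
output_name() {
    printf '%s.%s\n' "${1%.*}" "$2"
}

# Sends one PDF to the endpoint used in this guide and saves the result.
# Usage: convert_pdf INPUT.pdf
convert_pdf() {
    out="$(output_name "$1" docx)"
    # -f makes curl return a non-zero status on HTTP errors;
    # -o writes the response body to the output file.
    if curl -sf -X POST http://localhost:7000/file/handle \
        -F file=@"$1" \
        -F executeType="pdf/docx" \
        -o "$out"; then
        echo "converted: $1 -> $out"
    else
        echo "failed: $1" >&2
        return 1
    fi
}

# Convert every PDF in the current directory.
for pdf in ./*.pdf; do
    [ -e "$pdf" ] || continue   # the glob did not match anything
    convert_pdf "$pdf"
done
```

Each input keeps its base name, so document.pdf becomes document.docx in the same directory.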