Getting Started Guide
This guide will walk you through the steps to start ComPDFKit On-premises. It will also show you how to use it to process documents.
Requirements
ComPDFKit On-premises for Linux can run on multiple platforms. It supports the following operating systems:
- Ubuntu, Fedora, Debian, or CentOS. It also supports Ubuntu and Debian derivatives such as Kubuntu or Xubuntu. Currently, it only supports 64-bit Intel (x86_64) processors.
Regardless of the operating system you are using, you will need at least 4GB of RAM.
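Before installing anything, you may want to confirm that the host matches these requirements. The following is a minimal sketch for a Linux host that exposes /proc/meminfo; the function names are illustrative and are not part of ComPDFKit.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; the names are illustrative only.

# Succeeds only for the 64-bit Intel architecture named in the requirements.
is_supported_arch() {
    [ "$1" = "x86_64" ]
}

# Succeeds when a memory size in kB is at least 4 GiB (4 * 1024 * 1024 kB).
has_enough_ram() {
    [ "$1" -ge 4194304 ]
}

# Prints the MemTotal value (in kB) from /proc/meminfo on a Linux host.
read_mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

# Usage on the target machine:
#   is_supported_arch "$(uname -m)" || echo "unsupported CPU architecture"
#   has_enough_ram "$(read_mem_total_kb)" || echo "less than 4 GB of RAM detected"
```

MemTotal reports usable memory, which is slightly lower than the installed amount, so a near miss on the RAM check is better treated as a warning than as a hard failure.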
Install Docker
ComPDFKit On-premises for Linux is distributed in the form of a Docker container. To run it on your computer, you need to install the Docker runtime environment for your operating system.
Please follow the instructions on the Docker official website to install and start Docker Engine.
After installing Docker, install Docker Compose by following the installation instructions on the Docker official website.
Start ComPDFKit On-premises
ComPDFKit On-premises also provides a simple web page, primarily used to render and display PDF files, where you can process PDF files directly. You can access it at localhost:7000/index.html.
ComPDFKit On-premises for Linux uses a MySQL database for data storage. Therefore, you need to configure an available MySQL database and import the "compdfkit.sql" file into it.
Register on Docker Hub, and reference the ComPDFKit On-premises for Linux image as compdfkit/compdfkit:tag. To pull the ComPDFKit On-premises for Linux image, run the following command:
docker pull compdfkit/compdfkit:3.0.0
You need to ensure that docker-compose is already installed on your system. To install docker-compose, please visit "Overview of installing Docker Compose | Docker Docs" to learn how to install it on your operating system.
First, create a docker-compose.yml file at a location of your choice. You can use vim docker-compose.yml to edit it. Copy the following content into the docker-compose.yml file:
version: '3.3'
services:
  compdfkit_processor:
    restart: always
    image: compdfkit/compdfkit:3.0.0
    container_name: compdfkit_processor
    # If you do not use GPU, you can omit this part.
    deploy:
      resources:
        reservations:
          devices:
            - driver: "nvidia"
              count: "all"
              capabilities: [ "gpu" ]
    ports:
      - 7000:7000
    environment:
      LICENSE: your LICENSE_KEY
      DB_URL: dbmysql:3306/compdfkit
      DB_USERNAME: root
      DB_PASSWORD: mypassword
      # Temporary storage space for file processing.
      TMP_PATH: /tmp/compdfkit
      # Error message language setting, supports Chinese and English (default is en).
      LANGUAGE: en
      USEGPU: 'true'
      GPU_ID: "0"
    depends_on:
      - dbmysql
  dbmysql:
    image: mysql:8.0.27
    restart: always
    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci --skip-character-set-client-handshake --skip-name-resolve --max_allowed_packet=500M --max_connections=1000 --default-authentication-plugin=mysql_native_password
    container_name: dbmysql
    ports:
      - 3306:3306
    environment:
      MYSQL_ROOT_PASSWORD: mypassword
      MYSQL_DATABASE: compdfkit
    volumes:
      - ./data:/var/lib/mysql
      - ./compdfkit.sql:/docker-entrypoint-initdb.d/compdfkit.sql
Next, copy the "compdfkit.sql" file to the same directory as the docker-compose.yml file.
Then you need to execute the following command:
docker-compose up
When you see the following content, it means the project has started successfully:
Running ComPDFKit Processor version 3.0.0 port(s) 7000 (http)
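If you start the service from a script rather than watching the log, you can poll the web page mentioned earlier until it responds. The sketch below assumes curl is installed and that the page is served at localhost:7000/index.html as described above; wait_for_service is a hypothetical helper name.

```shell
#!/bin/sh
# Polls a URL until it answers or the attempts run out.
# Arguments: URL, maximum number of attempts, seconds to wait between attempts.
wait_for_service() {
    url=$1
    attempts=$2
    delay=$3
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f: fail on HTTP errors, -s: no progress output, -o /dev/null: discard the body.
        if curl -fs -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage (from a second terminal while docker-compose up is running):
#   wait_for_service http://localhost:7000/index.html 30 2 && echo "ready"
```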
API Configuration
Required:
LICENSE
- The license key used to activate ComPDFKit On-premises. If not specified or incorrect, ComPDFKit On-premises will not be able to start.
DB_URL
- The database connection address, in the form <host>:<port>/<database name>. If not specified or incorrect, ComPDFKit On-premises will not be able to start.
DB_USERNAME
- The username for the database connection. If not specified or incorrect, ComPDFKit On-premises will not be able to start.
DB_PASSWORD
- The password for the database connection. If not specified or incorrect, ComPDFKit On-premises will not be able to start.
Optional:
SERVER_PORT
- ComPDFKit On-premises listening port. Default is 7000.
TMP_PATH
- Temporary storage space for loading temporary files. Default is /tmp/compdfkit.
LANGUAGE
- Language of the interface error descriptions: zh_cn or en. Default is en.
CONVERT_TIMEOUT
- File processing timeout in minutes. Default is 15.
USEGPU
- Whether to enable the GPU to execute DocumentAI related functions. Default is false.
GPU_ID
- The ID of the GPU to use. Default is 0.
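Because a malformed DB_URL prevents startup, it can help to check the value before launching the container. The following sketch checks only the <host>:<port>/<database name> shape described above; parse_db_url is a hypothetical helper, and how ComPDFKit itself parses the value is not documented here.

```shell
#!/bin/sh
# Hypothetical helper: checks that a DB_URL value has the documented
# <host>:<port>/<database name> shape and prints its three parts.
parse_db_url() {
    case $1 in
        *:*/*) ;;                  # needs a colon followed later by a slash
        *) return 1 ;;
    esac
    host=${1%%:*}                  # text before the first colon
    rest=${1#*:}                   # text after the first colon
    port=${rest%%/*}               # text before the first slash
    database=${rest#*/}            # text after the first slash
    [ -n "$host" ] && [ -n "$database" ] || return 1
    case $port in
        ''|*[!0-9]*) return 1 ;;   # the port must consist of digits only
    esac
    printf '%s %s %s\n' "$host" "$port" "$database"
}

# Usage:
#   parse_db_url "dbmysql:3306/compdfkit" || echo "DB_URL is malformed"
```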
Automated Container Memory Release Solution
ComPDFKit's automated container memory release solution provides you with basic load balancing capabilities and automated container monitoring.
Queue Service:
We use RabbitMQ as our asynchronous message queue solution. RabbitMQ's high performance and reliability enable us to achieve efficient message delivery, improving the responsiveness and reliability of our system.
Container Monitoring:
We provide the ComPDFKit Server service, which automatically monitors the memory usage of the containers after startup. If a container's memory usage exceeds the maximum value you defined in docker-compose.yml, the service automatically frees that container's memory, provided the container has no files being processed.
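The release rule described above can be read as a two-part condition: the usage is over the limit, and the container is idle. The sketch below only illustrates that condition; it is not the ComPDFKit Server implementation, and should_release_memory and its arguments are assumptions made for the example.

```shell
#!/bin/sh
# Illustrative decision rule only; the real ComPDFKit Server logic is not
# shown in this guide.
# Arguments: current usage in MB, limit in MB, number of files being processed.
should_release_memory() {
    usage_mb=$1
    limit_mb=$2
    active_files=$3
    # Release only when the limit is exceeded AND the container is idle.
    [ "$usage_mb" -gt "$limit_mb" ] && [ "$active_files" -eq 0 ]
}

# Usage:
#   should_release_memory 3000 2048 0 && echo "release"
```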
Example
1. Open the Docker daemon port
Container monitoring needs to read container data from Docker and act on the containers, so you need to open the Docker daemon port.
Open the Docker configuration file
Server: Ubuntu 18.04
sudo vim /lib/systemd/system/docker.service
Server: Ubuntu 20.04
sudo vim /usr/lib/systemd/system/docker.service
Change setting
Find the following line in the configuration file:
ExecStart=/usr/bin/dockerd -H unix:///var/run/docker.sock
Append -H tcp://0.0.0.0:12375 to it. After the change, the line reads:
ExecStart=/usr/bin/dockerd -H unix:///var/run/docker.sock -H tcp://0.0.0.0:12375
Restart Docker
systemctl daemon-reload && systemctl restart docker
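After the restart, you can confirm that the daemon answers on the new TCP port by asking it for its API version. This is a small sketch using the docker CLI's -H option; daemon_api_version is a hypothetical helper name, and the address assumes you are running the check on the Docker host itself.

```shell
#!/bin/sh
# Prints the API version reported by the Docker daemon at the given address.
# -H points the docker CLI at a specific daemon instead of the local socket.
daemon_api_version() {
    docker -H "$1" version --format '{{.Server.APIVersion}}'
}

# Usage:
#   daemon_api_version tcp://127.0.0.1:12375
```

If the command prints a version number, the port is open; the printed value is the Docker daemon API version that the DOCKER_VERSION setting described below refers to.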
2. Start
ComPDFKit Server and ComPDFKit Async use a MySQL database to store data, so you need to configure an available MySQL database and import "compdfkit.sql" into your database.
First, register on Docker Hub and use compdfkit/compdfkit-xxx:tag to reference the ComPDFKit images. To pull the ComPDFKit images, run the following commands:
docker pull compdfkit/compdfkit-server:1.3.0
docker pull compdfkit/compdfkit-async:
Second, create a docker-compose.yml file at a location of your choice and edit it with vim docker-compose.yml. Then copy the following content into the docker-compose.yml file.
Note: Make sure that docker-compose is installed on your system first. If it is not, see "Overview of installing Docker Compose | Docker Docs" to learn how to install docker-compose on the system you are using.
version: '3.3'
services:
  compdfkit_async1:
    restart: always
    image: compdfkit/compdfkit-async:
    container_name: compdfkit_async1
    ports:
      - 17001:7000
    environment:
      LICENSE: your LICENSE_KEY
      DB_URL: dbmysql:3306/compdfkit
      DB_USERNAME: root
      DB_PASSWORD: mypassword
      # Temporary storage space for file processing.
      TMP_PATH: /tmp/compdfkit
      # Error prompt language setting supports Chinese and English (default English).
      LANGUAGE: zh_cn
      MQ_HOST: rabbitmq
      MQ_PORT: 5672
      MQ_USERNAME: admin
      MQ_PASSWORD: admin
      # compdfkit_server container access address
      COMPDFKIT_SERVER_ADDRESS: http://127.0.0.1:17000
    volumes:
      - /tmp/compdfkit:/tmp/compdfkit
    depends_on:
      - dbmysql
  compdfkit_async2:
    restart: always
    image: compdfkit/compdfkit-async:
    container_name: compdfkit_async2
    ports:
      - 17002:7000
    environment:
      LICENSE: your LICENSE_KEY
      DB_URL: dbmysql:3306/compdfkit
      DB_USERNAME: root
      DB_PASSWORD: mypassword
      # Temporary storage space for file processing.
      TMP_PATH: /tmp/compdfkit
      # Error prompt language setting supports Chinese and English (default English).
      LANGUAGE: zh_cn
      MQ_HOST: rabbitmq
      MQ_PORT: 5672
      MQ_USERNAME: admin
      MQ_PASSWORD: admin
      # compdfkit_server container access address
      COMPDFKIT_SERVER_ADDRESS: http://127.0.0.1:17000
    volumes:
      - /tmp/compdfkit:/tmp/compdfkit
    depends_on:
      - dbmysql
  compdfkit_server:
    restart: always
    image: compdfkit/compdfkit-server:1.3.0
    container_name: compdfkit_server
    ports:
      - 17000:17000
    environment:
      SERVER_PORT: 17000
      DOCKER_HOST: tcp://127.0.0.1:12375
      DOCKER_VERSION: 1.41
      CONTAINER_LIST: compdfkit_async1,compdfkit_async2
      MEMORY_USAGE_RATE_LIMIT: 30
      MEMORY_USAGE_LIMIT: 2048
      DB_URL: 127.0.0.1:3307/compdfkit
      DB_USERNAME: root
      DB_PASSWORD: mypassword
      MQ_HOST: 127.0.0.1
      MQ_PORT: 5672
      MQ_USERNAME: admin
      MQ_PASSWORD: admin
      TMP_PATH: /tmp/compdfkit
      LANGUAGE: en
    volumes:
      - /tmp/compdfkit:/tmp/compdfkit
    network_mode: "host"
    depends_on:
      - dbmysql
  dbmysql:
    image: mysql:8.0.27
    restart: always
    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci --skip-character-set-client-handshake --skip-name-resolve --max_allowed_packet=500M --max_connections=1000 --default-authentication-plugin=mysql_native_password
    container_name: dbmysql
    ports:
      - 3307:3306
    environment:
      MYSQL_ROOT_PASSWORD: mypassword
      MYSQL_DATABASE: compdfkit
    volumes:
      - ./compdfkit.sql:/docker-entrypoint-initdb.d/compdfkit.sql
  rabbitmq:
    image: rabbitmq:3.10.0-rc.3-management
    container_name: convert-rabbitmq
    restart: always
    ports:
      - '5672:5672'
      - '15672:15672'
    environment:
      - RABBITMQ_DEFAULT_USER=admin
      - RABBITMQ_DEFAULT_PASS=admin
    volumes:
      - ./rabbitmq_data:/var/lib/rabbitmq
Finally, copy the "compdfkit.sql" file to the same directory as docker-compose.yml. Then execute the following command:
docker-compose up -d
When you see the following content, it means the project has been successfully launched:
Creating dbmysql ... done
Creating convert-rabbitmq ... done
Creating compdfkit_async1 ... done
Creating compdfkit_async2 ... done
Creating compdfkit_server ... done
Must be configured:
LICENSE
- The license key used to activate ComPDFKit Async. If not specified or incorrect, ComPDFKit Async will fail to start.
DB_URL
- Database connection address, in the form <host>:<port>/<database name>. If not specified or incorrect, ComPDFKit Server or Async will fail to start.
DB_USERNAME
- Database connection username. If not specified or incorrect, ComPDFKit Server or Async will fail to start.
DB_PASSWORD
- Database connection password. If not specified or incorrect, ComPDFKit Server or Async will fail to start.
DOCKER_VERSION
- Docker daemon API version, which you can view with the command docker version --format '{{.Server.APIVersion}}'. If the Docker daemon port is not opened or is incorrect, ComPDFKit Server will not be able to monitor the corresponding container in Docker.
DOCKER_HOST
- Docker daemon API address. If the Docker daemon port is not opened or is incorrect, ComPDFKit Server will not be able to monitor the corresponding container in Docker.
CONTAINER_LIST
- A comma-separated list of the ComPDFKit Conversion container names that you need to monitor. If not specified or a container name is incorrect, ComPDFKit Server will not automatically monitor the container.
MEMORY_USAGE_RATE_LIMIT
- The maximum memory usage rate you can accept for ComPDFKit Conversion. If it is exceeded, ComPDFKit Server will release its memory.
MEMORY_USAGE_LIMIT
- The maximum memory usage, in MB, that you can accept for ComPDFKit Conversion. If it is exceeded, ComPDFKit Server will release its memory.
MQ_HOST
- The address of the queue, default is 127.0.0.1. If not set, ComPDFKit Server or Async will fail to start.
MQ_PORT
- The port of the queue, default is 5672. If not set, ComPDFKit Server or Async will fail to start.
MQ_USERNAME
- Username for the queue, default is admin. If not set, ComPDFKit Server or Async will fail to start.
MQ_PASSWORD
- Password for the queue, default is admin. If not set, ComPDFKit Server or Async will fail to start.
COMPDFKIT_SERVER_ADDRESS
- The request address of the ComPDFKit Server service. If not set or set incorrectly, ComPDFKit Conversion will not start properly.
Optional configuration:
SERVER_PORT
- ComPDFKit On-premises listening port, default 7000.
TMP_PATH
- Temporary storage space for loading temporary files, default /tmp/compdfkit.
LANGUAGE
- Language of the interface error descriptions: zh_cn or en, default en.
CONVERT_TIMEOUT
- File processing timeout (in minutes), default 15.
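Since a wrong name in CONTAINER_LIST means a container is silently left unmonitored, you may want to compare the list against the containers that are actually running. The sketch below assumes the docker CLI is available on the host; missing_containers is a hypothetical helper and not part of ComPDFKit.

```shell
#!/bin/sh
# Prints every name from a comma-separated list that is not among the
# running containers. Returns non-zero if at least one name is missing.
missing_containers() {
    # One running container name per line.
    running=$(docker ps --format '{{.Names}}')
    status=0
    old_ifs=$IFS
    IFS=,                      # split the argument on commas
    for name in $1; do
        # -x: match the whole line, -F: treat the name as a fixed string.
        if ! printf '%s\n' "$running" | grep -qxF "$name"; then
            printf '%s\n' "$name"
            status=1
        fi
    done
    IFS=$old_ifs
    return "$status"
}

# Usage:
#   missing_containers "compdfkit_async1,compdfkit_async2" || echo "check CONTAINER_LIST"
```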
Install curl
Interaction with the processor happens through its HTTP API: you send requests containing files and commands, and receive result files. To make API calls, you need to have curl installed. Most desktop Linux distributions come bundled with curl. You can check whether it is already installed by running the command curl --version in the terminal. If you get an error, install curl using the package manager of your distribution.
Ubuntu/Debian:
apt-get update && apt-get install -y curl
Example Feature: PDF Conversion
Everything is now set up, and you can begin using ComPDFKit On-premises for PDF document processing. The following example uses the PDF to Word conversion function. For the API calls of other functions, please refer to “ComPDFKit_Processor_API_Reference_v1.2.0.pdf”.
Select the file you want to process and move it to your desired location (e.g., document.pdf).
Run the following command:
curl -f -X POST http://localhost:7000/file/handle \
  -H "Content-Type: multipart/form-data" \
  -F file=@"document.pdf" \
  -F executeType="pdf/docx" \
  -F password="file open password" \
  > result.docx
Open the file result.docx in a Word viewer; you should see the converted Word document.
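If you convert files regularly, the request shown above can be wrapped in a small function. The sketch below reuses the same endpoint and form fields; convert_pdf_to_docx and its argument order are assumptions made for this example. It uses curl's -o option instead of shell redirection and lets curl generate the multipart Content-Type header itself.

```shell
#!/bin/sh
# Hypothetical wrapper around the conversion request shown above.
# Arguments: input PDF, output DOCX, file open password, optional base URL.
convert_pdf_to_docx() {
    input=$1
    output=$2
    password=$3
    base_url=${4:-http://localhost:7000}
    if [ ! -f "$input" ]; then
        echo "input file not found: $input" >&2
        return 1
    fi
    # -f makes curl exit non-zero on HTTP errors; -o writes the response body to a file.
    curl -f -X POST "$base_url/file/handle" \
        -F "file=@$input" \
        -F "executeType=pdf/docx" \
        -F "password=$password" \
        -o "$output"
}

# Usage:
#   convert_pdf_to_docx document.pdf result.docx "file open password" || echo "conversion failed"
```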