Software and hardware requirements
This page describes the system requirements for installing the Document.One (D1) system. The requirements depend on the deployment use case:
- Setup 1: D1 system components installed on a single node
- Setup 2: D1 and Conversion Service on separate nodes
- Setup 3: D1 and Conversion Service on separate clustered nodes
The deployment use cases are described in detail in D1 deployment use cases.
Hardware requirements
Node server requirements
The requirements below apply to each node. In a clustered setup, multiply them by the number of nodes. A node can be either a virtual machine or a bare-metal server. Keep in mind that the underlying operating system also needs a certain amount of memory.
You must have write access to the installation server.
- D1 server size and OS requirements
- Conversion Service server size and OS requirements
Deployment use case | CPU | Memory | Network | Operating system |
---|---|---|---|---|
Setup 1* | 4 CPU | 12 GB | GBit or better | Linux - Kernel 3.10 or newer |
Setup 2 | 4 CPU | 12 GB | GBit or better | Linux - Kernel 3.10 or newer |
Setup 3 | 8 CPU | 32 GB | GBit or better | Linux - Kernel 3.10 or newer |
*D1 and Conversion Service are installed on a single node.
Deployment use case | CPU | Memory | Network | Operating system |
---|---|---|---|---|
Setup 1* | --- | --- | --- | --- |
Setup 2 | 4 CPU | 12 GB | GBit or better | Linux - Kernel 3.10 or newer |
Setup 3 | 8 CPU | 32 GB | GBit or better | Linux - Kernel 3.10 or newer |
*D1 and Conversion Service are installed on a single node. See the requirements in the first table.
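The kernel and CPU minimums from the tables above can be checked before installation. The following is a minimal sketch, not part of the D1 tooling: the function names and the `setup1`/`setup2`/`setup3` keys are hypothetical, and only the thresholds come from the tables.

```python
import os
import platform
import re

# Minimum values taken from the tables above (per node).
MIN_KERNEL = (3, 10)
MIN_CPUS = {"setup1": 4, "setup2": 4, "setup3": 8}  # hypothetical keys


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_ok(release: str) -> bool:
    """True if the kernel release is 3.10 or newer."""
    return parse_kernel_release(release) >= MIN_KERNEL


def cpus_ok(setup: str, cpu_count: int) -> bool:
    """True if the node has at least the CPU count listed for the setup."""
    return cpu_count >= MIN_CPUS[setup]


if __name__ == "__main__":
    print("kernel ok:", kernel_ok(platform.release()))
    print("cpus ok for setup2:", cpus_ok("setup2", os.cpu_count() or 0))
```

Memory is deliberately left out of the sketch because the Python standard library has no portable way to read total RAM.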
Load balancer requirements
There are no special requirements for the load balancer: any balancing algorithm and any session handling (sticky or not) can be used. Because some requests, such as the conversion health check, are synchronous, a session timeout of at least 10 minutes is recommended.
Mount points requirements
Since the size of documents cannot be predicted, it is recommended to use a file system that can grow dynamically (e.g. XFS or ZFS). The following tables describe the mount point requirements for D1 and Conversion Service:
- D1 mount points
- Conversion Service mount points
Mount point | Deployment parameter | Size | Description | Dynamic sizing | Backup | Type |
---|---|---|---|---|---|---|
D1 installation directory | installationPath | 5GB | Stores D1 installation files | No | Priority 4 | Local |
Log files | logFilesDir | GB size depends on the log configuration | Stores log files | Yes | Priority 3 | Local |
Temp files | tempDir | GB size depends on the load | Stores temporary files | Yes | Priority 4 | Local |
Content files per repository | Configurable via UI during creation of the repository | GB size depends on number of documents | Stores content files | Yes | Priority 1 | Shared between D1 nodes |
Fulltext index | ELASTIC_SERVICE_DATA_PATH | GB size depends on number of documents | Stores elastic search fulltext index | Yes | Priority 1 | Shared between D1 nodes |
JVM non-heap memory must be at least 4 GB
Mount point | Deployment parameter | Size | Description | Dynamic sizing | Backup | Type |
---|---|---|---|---|---|---|
Conversion Service installation directory | installationPath | 5GB | Stores Conversion Service installation files | No | Priority 4 | Local |
Log files | logFilesDir | GB size depends on the log configuration | Stores log files | Yes | Priority 3 | Local |
Temp files | tempDir | GB size depends on the load | Stores temporary files | Yes | Priority 4 | Local |
Conversion job files | CONV_STORAGE_FOLDER | Depends on the number of conversion requests | Stores content files shared between Conversion Service nodes. Used only if CONV_STORAGE_TYPE=fs | Yes | Priority 4 | Shared between Conversion Service nodes |
Software requirements
Operating system settings
File descriptor
The following table describes the required limit for the number of concurrently open file descriptors:
Deployment | Value |
---|---|
D1 | 500000 |
Conversion Service | 100000 |
Messaging | 10000 |
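The limit that applies to a process can be compared with the values above. The following is a minimal sketch using Python's standard `resource` module (Unix only). The function name and the dictionary are hypothetical, and the sketch assumes the check runs as the same user, and with the same limits, as the service itself.

```python
import resource

# Required open-file limits from the table above.
REQUIRED_OPEN_FILES = {
    "D1": 500_000,
    "Conversion Service": 100_000,
    "Messaging": 10_000,
}


def fd_limit_ok(component: str, soft_limit: int) -> bool:
    """True if the soft open-file limit satisfies the component's requirement."""
    if soft_limit == resource.RLIM_INFINITY:
        return True  # unlimited always satisfies the requirement
    return soft_limit >= REQUIRED_OPEN_FILES[component]


if __name__ == "__main__":
    # RLIMIT_NOFILE is the per-process limit on open file descriptors.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft={soft} hard={hard}")
    for name in REQUIRED_OPEN_FILES:
        print(f"{name}: {'ok' if fd_limit_ok(name, soft) else 'too low'}")
```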
Random generator
Servers typically have little entropy available for random number generation, so the haveged service is required to replenish it.
Deployment | Service version |
---|---|
D1 | 1.9.2+ |
Conversion Service | 1.9.2+ |
Messaging | 1.9.2+ |
Command line tools
The following table describes the tools that need to be installed on the corresponding server nodes:
Deployment | Link | Version | Description |
---|---|---|---|
D1 | curl | 7.29+ | Is required by the health check script. |
Conversion Service | curl | 7.29+ | Is required by the health check script. |
Conversion Service | tesseract | 4.0.0+ | Is required for extracting text from images. If you are sure that you will not need Optical Character Recognition, you can skip installing Tesseract. |
Conversion Service | wkhtmltopdf | 0.12.5+ | Is required for converting HTML files to PDF. |
Messaging | curl | 7.29+ | Is required by the health check script. |
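The minimum versions above can be verified by parsing each tool's `--version` output. The following is a minimal sketch: the helper names are hypothetical, and it assumes the first dotted number in the output is the tool version, which holds for typical output such as `curl 7.81.0 (...)` or `wkhtmltopdf 0.12.6 (with patched qt)`.

```python
import re
import shutil
import subprocess

# Minimum versions from the table above.
MIN_VERSIONS = {"curl": (7, 29), "tesseract": (4, 0, 0), "wkhtmltopdf": (0, 12, 5)}


def extract_version(output: str) -> tuple[int, ...]:
    """Return the first dotted version number found in the text."""
    match = re.search(r"(\d+(?:\.\d+)+)", output)
    if match is None:
        raise ValueError("no version number found")
    return tuple(int(part) for part in match.group(1).split("."))


def version_ok(tool: str, output: str) -> bool:
    """Compare the version in `output` with the minimum for `tool`."""
    found, required = extract_version(output), MIN_VERSIONS[tool]
    width = max(len(found), len(required))
    # Pad with zeros so that e.g. 4.0 and 4.0.0 compare as equal.
    pad = lambda v: v + (0,) * (width - len(v))
    return pad(found) >= pad(required)


if __name__ == "__main__":
    for tool in MIN_VERSIONS:
        if shutil.which(tool) is None:
            print(f"{tool}: not installed")
            continue
        result = subprocess.run([tool, "--version"], capture_output=True,
                                text=True, timeout=10)
        # Some tools print their version to stderr, so check both streams.
        text = result.stdout + result.stderr
        print(f"{tool}: {'ok' if version_ok(tool, text) else 'too old'}")
```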
Java requirements
You must install a JVM on all nodes that are part of your D1 system deployment.
The Java version must be OpenJDK 17 or Oracle JDK 17.
Java generally supports heap sizes above 32 GB for a single JVM. However, keep in mind that above 32 GB the JVM switches to 64-bit references, which themselves consume more memory. If you decide to cross the 32 GB boundary, the heap must therefore be increased considerably to get a similar amount of usable memory. In practice, this means that if you go above 32 GB, you need to go above 40 GB. See Java performance for more details.
To avoid heap resizing while the servers are running, which leads to performance issues, -Xmx and -Xms should be equal.
Based on the memory sizes specified for the setups above, make sure that enough memory is left for the operating system so that it does not need to swap because of memory limitations. The rest should then be assigned to D1, Conversion Service and Messaging.
The following table describes the JVM memory requirements, which are the same for D1 and Conversion Service:
Setting | Deployment parameter | Value | Description |
---|---|---|---|
-Xmx | maxHeapSize | 4 GB - 31 GB | Maximum heap size. |
-Xms | initialHeapSize | 4 GB - 31 GB | Initial (minimum) heap size. It should be equal to the maximum heap size. |
JVM non-heap memory must be at least 4 GB.
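The heap rules above (equal -Xms and -Xmx, a value between 4 GB and 31 GB, and at least 4 GB of non-heap memory) can be expressed as a small helper. The following is a minimal sketch with hypothetical function names. It only illustrates the resulting JVM flags and does not show how the `maxHeapSize` and `initialHeapSize` deployment parameters are actually set.

```python
MIN_HEAP_GB = 4
MAX_HEAP_GB = 31
MIN_NON_HEAP_GB = 4


def jvm_heap_flags(heap_gb: int) -> list[str]:
    """Return -Xms/-Xmx flags with identical values to avoid heap resizing."""
    if not MIN_HEAP_GB <= heap_gb <= MAX_HEAP_GB:
        raise ValueError(
            f"heap must be between {MIN_HEAP_GB} and {MAX_HEAP_GB} GB, got {heap_gb}"
        )
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


def min_jvm_memory_gb(heap_gb: int) -> int:
    """Smallest memory budget for one JVM: heap plus the minimum non-heap share."""
    jvm_heap_flags(heap_gb)  # reuse the range validation
    return heap_gb + MIN_NON_HEAP_GB


if __name__ == "__main__":
    print(" ".join(jvm_heap_flags(8)))
    print("minimum JVM memory (GB):", min_jvm_memory_gb(8))
```

The operating system's own memory needs come on top of the value returned by `min_jvm_memory_gb`.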
Firewall settings
On each D1 and Conversion Service node, the configured ports need to be opened:
- For external communication (HTTPS, for security reasons), which provides the API to users, developers, and other systems, the ports must be reachable:
  - On the load balancers.
  - On the nodes themselves if there is no load balancer.
- For internal communication:
  - In a clustered setup, ports need to be opened for messaging.
  - If automatic service discovery is used (it is optional), multicast must be allowed.
Tomcat runtime requirements
The runtime of D1 and Conversion Service is Apache Tomcat.
The following table describes the runtime requirements, which are the same for D1 and Conversion Service:
Setting | Deployment parameter | Value | Default | Description |
---|---|---|---|---|
HTTP port | httpPort | Any available port | 8080 | Expose to the outside. This communication is not encrypted. |
HTTPS port | httpsPort | Any available port | 8443 | Expose to the outside |
AJP port | ajpPort | Any available port | 8009 | Only internal usage - do not expose to the outside |
Server port | serverPort | Any available port | 8005 | Only internal usage - do not expose to the outside |
Maximum connections | maxConnections | A positive number or no limit (recommended) | -1 (no limit) | Maximum number of connections |
Maximum threads | maxThreads | ≥1000 | 4000 | Maximum number of request worker threads |
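Since each port can be any available port, it helps to confirm that the chosen ports are free on the node before installation. The following is a minimal sketch that tries to bind each default port on the loopback interface. The function name is hypothetical, and a successful bind only shows that nothing is listening at that moment.

```python
import socket

# Default ports from the table above, keyed by deployment parameter.
DEFAULT_PORTS = {
    "httpPort": 8080,
    "httpsPort": 8443,
    "ajpPort": 8009,
    "serverPort": 8005,
}


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # typically "address already in use"
        return True


if __name__ == "__main__":
    for name, port in DEFAULT_PORTS.items():
        state = "free" if port_is_free(port) else "in use"
        print(f"{name} ({port}): {state}")
```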
Memory and CPU scaling recommendations
As a rule of thumb, the memory in GB should be twice the number of CPUs, subject to the minimum values. The heap size should be equal to the non-heap memory size.
Vertical scaling
Memory (GB) = 2 * number of CPUs or 8 GB (minimum requirements) - whichever is greater.
- 2 CPUs (minimum) = 8 GB of memory (4 GB heap, 4 GB non-heap as per minimal requirements)
- 4 CPUs = 8 GB of memory (4 GB heap, 4 GB non-heap)
- 8 CPUs = 16 GB of memory (8 GB heap, 8 GB non-heap)
- 16 CPUs = 32 GB of memory (16 GB heap, 16 GB non-heap)
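The vertical scaling rule above can be written as a short function. The following is a minimal sketch with a hypothetical function name. It reproduces the listed examples and splits the memory evenly between heap and non-heap.

```python
MIN_CPUS = 2
MIN_MEMORY_GB = 8


def recommended_memory_gb(cpus: int) -> tuple[int, int, int]:
    """Return (total, heap, non_heap) in GB for the given number of CPUs."""
    if cpus < MIN_CPUS:
        raise ValueError(f"at least {MIN_CPUS} CPUs are required, got {cpus}")
    # Memory is twice the CPU count, but never below the 8 GB minimum.
    total = max(2 * cpus, MIN_MEMORY_GB)
    heap = total // 2
    return total, heap, total - heap


if __name__ == "__main__":
    for cpus in (2, 4, 8, 16):
        total, heap, non_heap = recommended_memory_gb(cpus)
        print(f"{cpus} CPUs = {total} GB ({heap} GB heap, {non_heap} GB non-heap)")
```

The sketch does not cap the heap, so for more than 31 CPUs the result exceeds the 31 GB upper bound given for -Xmx in the Java requirements and needs to be adjusted manually.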
Horizontal scaling
You must increase the number of Conversion Service nodes based on your expected/observed load increase.
The following table shows resource consumption levels per job type. If you know the distribution of job types in your organization, it should help you to estimate the load.
Job type | Expected resource consumption |
---|---|
Convert to image | Heavy |
Prepare for WebReader | Heavy |
Convert to PDF | Average |
Merge documents | Average |
Merge templates | Average |
Apply watermark | Light |
Apply template | Light |
Extract text | Light without OCR, otherwise heavy |
Review the D1 database requirements before you proceed with D1 system installation. See D1 database requirements.