HugeGraph Distributed (pd-store) Version Deployment Guide-EN


Note: You can refer to the [distributed cluster] configuration in the Docker-Compose repository for a similar setup.

Prerequisites

  • Linux, macOS (Windows environment has not been tested yet)
  • Java version ≥ 11
  • Maven version ≥ 3.5.0
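
A quick sanity check of the toolchain before building (both are standard commands):

java -version   # should report 11 or newer
mvn -version    # should report 3.5.0 or newer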

Build and Start from Source

  1. Clone the master branch of the source code
git clone https://github.com/apache/hugegraph.git
  2. Build in the project root directory
cd hugegraph
mvn clean install -DskipTests=true

If the build is successful, the build artifacts for PD, Store, and Server modules will be stored in:

  • PD: hugegraph-pd/apache-hugegraph-pd-incubating-1.5.0
  • Store: hugegraph-store/apache-hugegraph-store-incubating-1.5.0
  • Server: hugegraph-server/apache-hugegraph-server-incubating-1.5.0
.
├── hugegraph-pd
│  ├── apache-hugegraph-pd-incubating-1.5.0
│  └── apache-hugegraph-pd-incubating-1.5.0.tar.gz
├── hugegraph-server
│  ├── apache-hugegraph-server-incubating-1.5.0
│  └── apache-hugegraph-server-incubating-1.5.0.tar.gz
└── hugegraph-store
   ├── apache-hugegraph-store-incubating-1.5.0
   └── apache-hugegraph-store-incubating-1.5.0.tar.gz

These paths will be used as the working directories when running PD, Store, and Server.

  3. Start PD
cd hugegraph-pd/apache-hugegraph-pd-incubating-1.5.0
./bin/start-hugegraph-pd.sh

If the startup is successful, you can find the following log in hugegraph-pd/apache-hugegraph-pd-incubating-1.5.0/logs/hugegraph-pd-stdout.log:

2024-04-08 15:15:45 [main] [INFO] o.a.h.p.b.HugePDServer - Started HugePDServer in 3.879 seconds (JVM running for 5.149)
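
The same check from the shell (run in the PD working directory):

grep "Started HugePDServer" logs/hugegraph-pd-stdout.log
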
  4. Start Store
cd hugegraph-store/apache-hugegraph-store-incubating-1.5.0
./bin/start-hugegraph-store.sh

If the startup is successful, you can find the following log in hugegraph-store/apache-hugegraph-store-incubating-1.5.0/logs/hugegraph-store-server.log:

2024-04-08 15:16:29 [main] [INFO] o.a.h.s.n.StoreNodeApplication - Started StoreNodeApplication in 4.794 seconds (JVM running for 6.21)
  5. Start Server
cd hugegraph-server/apache-hugegraph-server-incubating-1.5.0
./bin/start-hugegraph.sh -p true

Passing -p true imports the example graph, as described in:

https://hugegraph.apache.org/docs/quickstart/hugegraph-server/#517-create-an-example-graph-when-startup

Simple verification:

> curl http://localhost:8080/graphs
{"graphs":["hugegraph"]}
  6. Stop Server, Store, and PD

Execute the following commands in order, each in its corresponding working directory:

./bin/stop-hugegraph.sh
./bin/stop-hugegraph-store.sh
./bin/stop-hugegraph-pd.sh
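
The same shutdown order as a small convenience script (a sketch, assuming the three 1.5.0 directories sit side by side under the project root):

#!/usr/bin/env bash
# Stop Server first, then Store, then PD, each from its own working directory
set -e
(cd hugegraph-server/apache-hugegraph-server-incubating-1.5.0 && ./bin/stop-hugegraph.sh)
(cd hugegraph-store/apache-hugegraph-store-incubating-1.5.0 && ./bin/stop-hugegraph-store.sh)
(cd hugegraph-pd/apache-hugegraph-pd-incubating-1.5.0 && ./bin/stop-hugegraph-pd.sh)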

Start Using Downloaded tar Packages

Download the PD, Store, and Server tar packages from [this page] and extract them. After extraction, you will get the following folders:

  • apache-hugegraph-pd-incubating-1.5.0
  • apache-hugegraph-store-incubating-1.5.0
  • apache-hugegraph-server-incubating-1.5.0

The subsequent process is the same as described above.

Multi-Node Configuration Reference

  • 3 PD nodes
    • Raft ports: 8610, 8611, 8612
    • RPC ports: 8686, 8687, 8688
    • REST ports: 8620, 8621, 8622
  • 3 Store nodes
    • Raft ports: 8510, 8511, 8512
    • RPC ports: 8500, 8501, 8502
    • REST ports: 8520, 8521, 8522
  • 3 Server nodes (disable auth + use distributed scheduler)
    • REST ports: 8081, 8082, 8083
    • RPC ports: 8091, 8092, 8093
    • Gremlin ports: 8181, 8182, 8183
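
When running all nine processes on one machine, it helps to confirm that none of these ports are already taken before starting (a sketch using lsof; on Linux, ss -ltn works as well):

for p in 8610 8611 8612 8686 8687 8688 8620 8621 8622 \
         8510 8511 8512 8500 8501 8502 8520 8521 8522 \
         8081 8082 8083 8091 8092 8093 8181 8182 8183; do
  lsof -i :"$p" >/dev/null 2>&1 && echo "port $p is already in use"
done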

Server Configuration

hugegraph.properties

backend=hstore
serializer=binary
task.scheduler_type=distributed

# PD service addresses, multiple addresses separated by commas ⚠️ Use the PD RPC ports
pd.peers=127.0.0.1:8686,127.0.0.1:8687,127.0.0.1:8688

rest-server.properties

Each server has its own configuration. For the three nodes:

# server-1
restserver.url=http://127.0.0.1:8081
gremlinserver.url=http://127.0.0.1:8181

rpc.server_host=127.0.0.1
rpc.server_port=8091

server.id=server-1
server.role=master

# server-2
restserver.url=http://127.0.0.1:8082
gremlinserver.url=http://127.0.0.1:8182

rpc.server_host=127.0.0.1
rpc.server_port=8092

server.id=server-2
server.role=worker

# server-3
restserver.url=http://127.0.0.1:8083
gremlinserver.url=http://127.0.0.1:8183

rpc.server_host=127.0.0.1
rpc.server_port=8093

server.id=server-3
server.role=worker
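
Since the three files differ only in the ports, server.id, and server.role, one way to stamp them out is a small shell loop (a sketch; the rest-server-$i.properties output names are illustrative):

for i in 1 2 3; do
  role=worker; [ "$i" = 1 ] && role=master
  cat > rest-server-$i.properties <<EOF
restserver.url=http://127.0.0.1:808$i
gremlinserver.url=http://127.0.0.1:818$i

rpc.server_host=127.0.0.1
rpc.server_port=809$i

server.id=server-$i
server.role=$role
EOF
done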

gremlin-server.yaml

# server-1
host: 127.0.0.1
port: 8181

# server-2
host: 127.0.0.1
port: 8182

# server-3
host: 127.0.0.1
port: 8183

[Download multi-server.zip] for complete Server node configuration files.

PD Configuration

Modify application.yml for each PD node. Example:

spring:
  application:
    name: hugegraph-pd

management:
  metrics:
    export:
      prometheus:
        enabled: true
  endpoints:
    web:
      exposure:
        include: "*"

logging:
  config: 'file:./conf/log4j2.xml'
license:
  verify-path: ./conf/verify-license.json
  license-path: ./conf/hugegraph.license
grpc:
  # cluster mode
  port: 8686 # ⚠️ For local testing, configure a different port for each PD node.
  host: 127.0.0.1

server:
  # REST server port
  port: 8620 # ⚠️ For local testing, configure a different port for each PD node.

pd:
  # data path
  data-path: ./pd_data # ⚠️ For local testing, configure a different path for each PD node.
  # Patrol interval: periodically check the number of partitions in each store and automatically balance the partition count.
  patrol-interval: 1800
  # Minimum number of live stores; with fewer, the cluster cannot serve requests.
  initial-store-count: 1
  # Initial store list (grpc IP:grpc port); stores in this list are activated automatically.
  initial-store-list: 127.0.0.1:8500,127.0.0.1:8501,127.0.0.1:8502

raft:
  # cluster mode
  address: 127.0.0.1:8610 # ⚠️ For local testing, configure a different port for each PD node.
  peers-list: 127.0.0.1:8610,127.0.0.1:8611,127.0.0.1:8612

store:
  # Store offline timeout: if a store is unreachable for longer than this, it is considered permanently unavailable and its replicas are reallocated to other machines. Unit: seconds.
  max-down-time: 172800
  # Enable storage of store monitoring data
  monitor_data_enabled: true
  # Monitoring data interval: minute (default), hour, second
  # default: 1 min * 1 day = 1440
  monitor_data_interval: 1 minute
  # Monitoring data retention period: 1 day; units: day, month, year
  monitor_data_retention: 1 day
  initial-store-count: 1

partition:
  # Default number of replicas per partition
  default-shard-count: 1
  # Default maximum number of replicas per machine, initial number of partitions = store-max-shard-count * store-number / default-shard-count
  store-max-shard-count: 12
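
With the example values above and the 3-store layout used in this guide: initial number of partitions = store-max-shard-count * store-number / default-shard-count = 12 * 3 / 1 = 36.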

[Download multi-pd.zip](https://github.com/hugegraph/hugegraph/releases/download/pd-store-tmp/multi-pd.zip) for full PD node configuration files.

Store Configuration

Modify application.yml for each Store node. Example:

pdserver:
  # PD service addresses, multiple addresses separated by commas ⚠️ Use the PD RPC ports.
  address: 127.0.0.1:8686,127.0.0.1:8687,127.0.0.1:8688

management:
  metrics:
    export:
      prometheus:
        enabled: true
  endpoints:
    web:
      exposure:
        include: "*"

grpc:
  host: 127.0.0.1
  port: 8500 # ⚠️ For local testing, configure a different port for each Store node.
  netty-server:
    max-inbound-message-size: 1000MB
raft:
  disruptorBufferSize: 1024
  address: 127.0.0.1:8510 # ⚠️ For local testing, configure a different port for each Store node.
  max-log-file-size: 600000000000
  snapshotInterval: 1800
server:
  port: 8520 # ⚠️ For local testing, configure a different port for each Store node.

app:
  # Storage path: supports multiple paths, separated by commas.
  data-path: ./storage # ⚠️ For local testing, configure a different path for each Store node.
  #raft-path: ./storage

spring:
  application:
    name: store-node-grpc-server
  profiles:
    active: default
    include: pd

logging:
  config: 'file:./conf/log4j2.xml'
  level:
    root: info

[Download multi-store.zip] for full Store node configuration files.
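
To bring up the full 3-PD / 3-Store / 3-Server layout locally, one workable approach is to keep three extracted copies of each package, each with its conf edited per the sections above, and start them PD first, then Store, then Server, matching the single-node steps (a sketch; the pd-N/store-N/server-N directory names are illustrative):

# Start order matters: all PDs, then all Stores, then all Servers
for i in 1 2 3; do (cd pd-$i && ./bin/start-hugegraph-pd.sh); done
for i in 1 2 3; do (cd store-$i && ./bin/start-hugegraph-store.sh); done
for i in 1 2 3; do (cd server-$i && ./bin/start-hugegraph.sh); done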