# Scales with Your Team
Purple8 Graph is designed to grow with you — from a laptop to a production cluster — without any migration, re-architecting, or API changes. You write the same Cypher queries whether you're on a single machine or a 12-node fault-tolerant deployment across three data centres.
## Start on a single machine
The default GraphEngine runs in-process. No Docker required, no separate server to manage. Install, import, query.
```python
from purple8_graph import GraphEngine

engine = GraphEngine("./data")

engine.add_node("Document", {
    "title": "Q4 Report",
    "body": "Revenue grew 34% year-on-year...",
})

results = engine.query("""
    CALL db.vector.search('Document', $vec, 5)
    YIELD node, score
    RETURN node.title, score
""", vec=my_embedding)
```

This is production-ready. Many teams run Purple8 this way for months before they need to scale out. The embedded storage handles millions of nodes and billions of edges on a single NVMe drive.
## Scale to a multi-shard cluster
When your dataset grows beyond a single machine — or you need higher write throughput — promote to ShardedGraphEngine. The API is identical.
```python
from purple8_graph.distributed import ShardedGraphEngine, HashPartitioner

engine = ShardedGraphEngine(
    shard_configs=[
        {"path": "./data/shard-0"},
        {"path": "./data/shard-1"},
        {"path": "./data/shard-2"},
    ],
    partitioner=HashPartitioner(),
)

# Same Cypher — the engine federates across shards automatically
results = engine.query("""
    CALL db.vector.search('Document', $vec, 10)
    YIELD node, score
    MATCH (node)-[:AUTHORED_BY]->(author:Person)
    RETURN node.title, author.name, score
    ORDER BY score DESC LIMIT 5
""", vec=my_embedding)
```

Queries are automatically federated: Purple8 issues them in parallel across all shards and merges the results. You don't rewrite a single line of application code.
## Add fault tolerance with Raft
For high availability, wrap shards in Raft consensus groups. Each shard gets 3 or 5 replicas; the cluster survives the loss of any minority of nodes.
```python
from purple8_graph.distributed import RaftShardedEngine, HashPartitioner

engine = RaftShardedEngine(
    shard_groups=[
        {
            "replicas": [
                {"host": "node-0a.internal", "port": 9010, "path": "/data/s0"},
                {"host": "node-0b.internal", "port": 9010, "path": "/data/s0"},
                {"host": "node-0c.internal", "port": 9010, "path": "/data/s0"},
            ]
        },
        {
            "replicas": [
                {"host": "node-1a.internal", "port": 9010, "path": "/data/s1"},
                {"host": "node-1b.internal", "port": 9010, "path": "/data/s1"},
                {"host": "node-1c.internal", "port": 9010, "path": "/data/s1"},
            ]
        },
    ],
    partitioner=HashPartitioner(),
    election_timeout_ms=300,
)
```

Leader election completes within `election_timeout_ms` of a failure. Writes require a quorum commit. Reads are served from the leader by default — use `read_consistency="eventual"` for replica reads.
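The quorum arithmetic is worth spelling out. A minimal sketch in plain Python (these helper functions are for illustration, not part of the Purple8 API):

```python
def quorum(replicas: int) -> int:
    """Smallest majority: a write commits once this many replicas acknowledge."""
    return replicas // 2 + 1

def tolerated_failures(replicas: int) -> int:
    """How many replicas can fail while the group can still reach quorum."""
    return replicas - quorum(replicas)

for n in (3, 5):
    print(f"{n} replicas: quorum={quorum(n)}, survives {tolerated_failures(n)} failure(s)")
# 3 replicas: quorum=2, survives 1 failure(s)
# 5 replicas: quorum=3, survives 2 failure(s)
```

This is why replica counts are odd: going from 3 to 4 replicas raises the quorum to 3 without tolerating any additional failures.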
## Spread across data centres
Assign replicas to different availability zones or regions using the `zone` tag. The Raft leader-election algorithm is zone-aware: it avoids electing a leader in a zone that already holds a majority of replicas, so a full AZ outage doesn't take down your cluster.
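A rough model of that eligibility rule, in plain Python (illustrative only: the real election runs inside Raft, and `preferred_leader_candidates` is a hypothetical name, not a Purple8 function):

```python
from collections import Counter

def preferred_leader_candidates(replicas: list[dict]) -> list[dict]:
    """Prefer candidates whose zone does NOT hold a majority of the group's
    replicas, so losing one zone cannot take out both the leader and a
    write quorum at the same time."""
    zone_counts = Counter(r["zone"] for r in replicas)
    majority = len(replicas) // 2 + 1
    safe = [r for r in replicas if zone_counts[r["zone"]] < majority]
    return safe or replicas  # fall back if every candidate's zone holds a majority

replicas = [
    {"host": "a", "zone": "us-east-1a"},
    {"host": "b", "zone": "us-east-1a"},
    {"host": "c", "zone": "eu-west-1a"},
]
print([r["host"] for r in preferred_leader_candidates(replicas)])
# → ['c']
```

In this 3-replica example, `us-east-1a` holds two of three replicas (a majority), so the election prefers the `eu-west-1a` replica as leader.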
```python
engine = RaftShardedEngine(
    shard_groups=[
        {
            "replicas": [
                {"host": "us-east-1a.db.internal", "port": 9010, "path": "/data/s0", "zone": "us-east-1a"},
                {"host": "us-east-1b.db.internal", "port": 9010, "path": "/data/s0", "zone": "us-east-1b"},
                {"host": "eu-west-1a.db.internal", "port": 9010, "path": "/data/s0", "zone": "eu-west-1a"},
            ]
        },
    ],
    partitioner=HashPartitioner(),
    election_timeout_ms=500,  # slightly higher for cross-region latency
)
```

## Scaling path at a glance
| Stage | Setup | Machines | Fault tolerant? |
|---|---|---|---|
| Development | `GraphEngine` | 1 | — |
| Production (single machine) | `GraphEngine` | 1 | WAL durability |
| Scale-out | `ShardedGraphEngine` | 2–N | No (single replica per shard) |
| High availability | `RaftShardedEngine` | 3+ per shard | Yes (quorum) |
| Multi-region | `RaftShardedEngine` + `zone` | 3+ per shard | Yes (AZ-aware election) |
## What stays the same at every stage
- Cypher API — your queries don't change
- Python SDK interface — `engine.query(...)`, `engine.add_node(...)`, etc.
- REST & GraphQL endpoints — same paths, same auth
- Encryption — KMS config carries through to every shard and replica
- Journey Engine — AI workflows continue firing on graph changes across the cluster
No migration scripts. No re-indexing. No schema changes. Just add nodes to your cluster config and you're done.
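To make the "no application changes" point concrete, here is a sketch using hypothetical stand-in classes (not the real Purple8 engines): because every engine exposes the same `query` signature, application code never branches on which deployment it is talking to.

```python
class EmbeddedStub:
    """Stand-in for an embedded engine: answers queries in-process."""
    def query(self, cypher: str, **params):
        return [("embedded", cypher.strip(), params)]

class ClusterStub:
    """Stand-in for a sharded engine: same signature, different backend."""
    def query(self, cypher: str, **params):
        return [("cluster", cypher.strip(), params)]

def top_documents(engine, vec):
    # Application code is identical regardless of which engine is injected.
    return engine.query("RETURN node.title, score", vec=vec)

for engine in (EmbeddedStub(), ClusterStub()):
    print(top_documents(engine, vec=[0.1, 0.2])[0][0])
# embedded
# cluster
```

Swapping the deployment means swapping the constructor call, nothing downstream of it.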
## Which tier includes clustering?
| Feature | Desktop Free | Desktop Pro | Pro Cloud | Cloud Plus | Self-Hosted Server | Enterprise |
|---|---|---|---|---|---|---|
| Single-node | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| `ShardedGraphEngine` | — | — | — | ✅ | ✅ | ✅ |
| `RaftShardedEngine` (HA) | — | — | — | — | ✅ | ✅ |
| Multi-region / AZ-aware | — | — | — | — | — | ✅ |
## Further reading
- Sharding & Clustering reference guide — partitioning strategies, Docker Compose config, consistency model
- System Requirements — hardware sizing per tier