Scales with Your Team

Purple8 Graph is designed to grow with you — from a laptop to a production cluster — without any migration, re-architecting, or API changes. You write the same Cypher queries whether you're on a single machine or a 12-node fault-tolerant deployment across three data centres.

Start on a single machine

The default GraphEngine runs in-process. No Docker required, no separate server to manage. Install, import, query.

```python
from purple8_graph import GraphEngine

engine = GraphEngine("./data")

engine.add_node("Document", {
    "title": "Q4 Report",
    "body": "Revenue grew 34% year-on-year...",
})

results = engine.query("""
    CALL db.vector.search('Document', $vec, 5)
    YIELD node, score
    RETURN node.title, score
""", vec=my_embedding)
```

This is production-ready. Many teams run Purple8 this way for months before they need to scale out. The embedded storage handles millions of nodes and billions of edges on a single NVMe drive.
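For intuition, the vector search call above ranks documents by similarity to the query embedding. A brute-force sketch of that ranking in plain Python (not Purple8's actual index; the helper names here are illustrative) looks like this:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def vector_search(docs, query_vec, k):
    # docs: list of (title, embedding) pairs.
    # Score every document, return the k best, highest score first.
    scored = [(title, cosine(vec, query_vec)) for title, vec in docs]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]
```

A real engine replaces the linear scan with an approximate-nearest-neighbour index, but the `YIELD node, score` contract is the same shape: items paired with similarity scores, best first.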


Scale to a multi-shard cluster

When your dataset grows beyond a single machine — or you need higher write throughput — promote to ShardedGraphEngine. The API is identical.

```python
from purple8_graph.distributed import ShardedGraphEngine, HashPartitioner

engine = ShardedGraphEngine(
    shard_configs=[
        {"path": "./data/shard-0"},
        {"path": "./data/shard-1"},
        {"path": "./data/shard-2"},
    ],
    partitioner=HashPartitioner(),
)

# Same Cypher — the engine federates across shards automatically
results = engine.query("""
    CALL db.vector.search('Document', $vec, 10)
    YIELD node, score
    MATCH (node)-[:AUTHORED_BY]->(author:Person)
    RETURN node.title, author.name, score
    ORDER BY score DESC LIMIT 5
""", vec=my_embedding)
```

Queries are automatically federated: Purple8 issues them in parallel across all shards and merges the results. You don't rewrite a single line of application code.
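Conceptually, a hash partitioner plus federated query is a scatter-gather: route each key to a stable shard on write, then at query time ask every shard for its local top results and merge them into a global top-k. A minimal sketch of those two pieces (illustrative only, not Purple8's internal implementation):

```python
import hashlib
import heapq

def shard_for(key: str, num_shards: int) -> int:
    # Stable hash so the same key always routes to the same shard.
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def federated_top_k(per_shard_results, k):
    # Each shard returns its own (score, item) list sorted descending;
    # merge the sorted streams and keep the global top-k.
    merged = heapq.merge(*per_shard_results, key=lambda r: r[0], reverse=True)
    return [item for _, item in list(merged)[:k]]
```

Because each shard's result list is already sorted, the merge is cheap: the coordinator only has to interleave pre-ranked streams rather than re-sort everything.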


Add fault tolerance with Raft

For high availability, wrap shards in Raft consensus groups. Each shard gets 3 or 5 replicas; the cluster survives the loss of any minority of nodes.

```python
from purple8_graph.distributed import RaftShardedEngine, HashPartitioner

engine = RaftShardedEngine(
    shard_groups=[
        {
            "replicas": [
                {"host": "node-0a.internal", "port": 9010, "path": "/data/s0"},
                {"host": "node-0b.internal", "port": 9010, "path": "/data/s0"},
                {"host": "node-0c.internal", "port": 9010, "path": "/data/s0"},
            ]
        },
        {
            "replicas": [
                {"host": "node-1a.internal", "port": 9010, "path": "/data/s1"},
                {"host": "node-1b.internal", "port": 9010, "path": "/data/s1"},
                {"host": "node-1c.internal", "port": 9010, "path": "/data/s1"},
            ]
        },
    ],
    partitioner=HashPartitioner(),
    election_timeout_ms=300,
)
```

Leader election completes within election_timeout_ms of a failure. Writes require a quorum commit. Reads are served from the leader by default — use read_consistency="eventual" for replica reads.
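The "survives any minority" guarantee follows directly from quorum arithmetic: a write commits once a majority of replicas acknowledge it, so the group tolerates the loss of everything short of a majority. A quick sketch of the math:

```python
def quorum(replicas: int) -> int:
    # Raft commits a write once a strict majority of replicas have it.
    return replicas // 2 + 1

def survivable_failures(replicas: int) -> int:
    # The group stays available while a quorum is still reachable.
    return replicas - quorum(replicas)
```

This is why replica counts of 3 and 5 are the usual choices: 3 replicas tolerate 1 failure, 5 tolerate 2, while an even count (say 4) pays an extra node without raising the failure budget.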


Spread across data centres

Assign replicas to different availability zones or regions using the zone tag. The Raft leader-election algorithm is zone-aware: it avoids electing a leader in a zone that already holds a majority of replicas, so a full AZ outage doesn't take down your cluster.

```python
engine = RaftShardedEngine(
    shard_groups=[
        {
            "replicas": [
                {"host": "us-east-1a.db.internal", "port": 9010, "path": "/data/s0", "zone": "us-east-1a"},
                {"host": "us-east-1b.db.internal", "port": 9010, "path": "/data/s0", "zone": "us-east-1b"},
                {"host": "eu-west-1a.db.internal", "port": 9010, "path": "/data/s0", "zone": "eu-west-1a"},
            ]
        },
    ],
    partitioner=HashPartitioner(),
    election_timeout_ms=500,   # slightly higher for cross-region latency
)
```

Scaling path at a glance

| Stage | Setup | Nodes | Fault tolerant? |
| --- | --- | --- | --- |
| Development | GraphEngine | 1 | |
| Production (single machine) | GraphEngine | 1 | WAL durability |
| Scale-out | ShardedGraphEngine | 2–N | No (single replica per shard) |
| High availability | RaftShardedEngine | 3+ per shard | Yes (quorum) |
| Multi-region | RaftShardedEngine + zone | 3+ per shard | Yes (AZ-aware election) |

What stays the same at every stage

  • Cypher API — your queries don't change
  • Python SDK interface — engine.query(...), engine.add_node(...), etc.
  • REST & GraphQL endpoints — same paths, same auth
  • Encryption — KMS config carries through to every shard and replica
  • Journey Engine — AI workflows continue firing on graph changes across the cluster

No migration scripts. No re-indexing. No schema changes. Just add nodes to your cluster config and you're done.


Which tier includes clustering?

| Feature | Desktop Free | Desktop Pro | Pro Cloud | Cloud Plus | Self-Hosted Server | Enterprise |
| --- | --- | --- | --- | --- | --- | --- |
| Single-node | | | | | | |
| ShardedGraphEngine | | | | | | |
| RaftShardedEngine (HA) | | | | | | |
| Multi-region / AZ-aware | | | | | | |

Purple8 Graph is proprietary software. All rights reserved.