Scales with Your Team

Purple8 Graph is designed to grow with you — from a laptop to a production cluster — without any migration, re-architecting, or API changes. You write the same Cypher queries whether you're on a single machine or a 12-node fault-tolerant deployment across three data centres.

Start on a single machine

The default GraphEngine runs in-process. No Docker required, no separate server to manage. Install, import, query.

```python
from purple8_graph import GraphEngine

engine = GraphEngine("./data")

engine.add_node("Document", {
    "title": "Q4 Report",
    "body": "Revenue grew 34% year-on-year...",
})

results = engine.query("""
    CALL db.vector.search('Document', $vec, 5)
    YIELD node, score
    RETURN node.title, score
""", vec=my_embedding)
```

This is production-ready. Many teams run Purple8 this way for months before they need to scale out. The embedded storage handles millions of nodes and billions of edges on a single NVMe drive.
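For intuition, the vector search call above ranks documents by similarity to the query embedding. A brute-force sketch of that ranking in plain Python (not Purple8's actual index; the helper names here are illustrative) looks like this:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def vector_search(docs, query_vec, k):
    # docs: list of (title, embedding) pairs.
    # Score every document, return the k best, highest score first.
    scored = [(title, cosine(vec, query_vec)) for title, vec in docs]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]
```

A real engine replaces the linear scan with an approximate-nearest-neighbour index, but the `YIELD node, score` contract is the same shape: items paired with similarity scores, best first.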


Scale to a multi-shard cluster

When your dataset grows beyond a single machine — or you need higher write throughput — promote to ShardedGraphEngine. The API is identical.

```python
from purple8_graph.distributed import ShardedGraphEngine, HashPartitioner

engine = ShardedGraphEngine(
    shard_configs=[
        {"path": "./data/shard-0"},
        {"path": "./data/shard-1"},
        {"path": "./data/shard-2"},
    ],
    partitioner=HashPartitioner(),
)

# Same Cypher — the engine federates across shards automatically
results = engine.query("""
    CALL db.vector.search('Document', $vec, 10)
    YIELD node, score
    MATCH (node)-[:AUTHORED_BY]->(author:Person)
    RETURN node.title, author.name, score
    ORDER BY score DESC LIMIT 5
""", vec=my_embedding)
```

Queries are automatically federated: Purple8 issues them in parallel across all shards and merges the results. You don't rewrite a single line of application code.
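Conceptually, a hash partitioner plus federated query is a scatter-gather: route each key to a stable shard on write, then at query time ask every shard for its local top results and merge them into a global top-k. A minimal sketch of those two pieces (illustrative only, not Purple8's internal implementation):

```python
import hashlib
import heapq

def shard_for(key: str, num_shards: int) -> int:
    # Stable hash so the same key always routes to the same shard.
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def federated_top_k(per_shard_results, k):
    # Each shard returns its own (score, item) list sorted descending;
    # merge the sorted streams and keep the global top-k.
    merged = heapq.merge(*per_shard_results, key=lambda r: r[0], reverse=True)
    return [item for _, item in list(merged)[:k]]
```

Because each shard's result list is already sorted, the merge is cheap: the coordinator only has to interleave pre-ranked streams rather than re-sort everything.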


Add fault tolerance with Raft

For high availability, wrap shards in Raft consensus groups. Each shard gets 3 or 5 replicas; the cluster survives the loss of any minority of nodes.

```python
from purple8_graph.distributed import RaftShardedEngine, HashPartitioner

engine = RaftShardedEngine(
    shard_groups=[
        {
            "replicas": [
                {"host": "node-0a.internal", "port": 9010, "path": "/data/s0"},
                {"host": "node-0b.internal", "port": 9010, "path": "/data/s0"},
                {"host": "node-0c.internal", "port": 9010, "path": "/data/s0"},
            ]
        },
        {
            "replicas": [
                {"host": "node-1a.internal", "port": 9010, "path": "/data/s1"},
                {"host": "node-1b.internal", "port": 9010, "path": "/data/s1"},
                {"host": "node-1c.internal", "port": 9010, "path": "/data/s1"},
            ]
        },
    ],
    partitioner=HashPartitioner(),
    election_timeout_ms=300,
)
```

Leader election completes within election_timeout_ms of a failure. Writes require a quorum commit. Reads are served from the leader by default — use read_consistency="eventual" for replica reads.
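The "survives any minority" guarantee follows directly from quorum arithmetic: a write commits once a majority of replicas acknowledge it, so the group tolerates the loss of everything short of a majority. A quick sketch of the math:

```python
def quorum(replicas: int) -> int:
    # Raft commits a write once a strict majority of replicas have it.
    return replicas // 2 + 1

def survivable_failures(replicas: int) -> int:
    # The group stays available while a quorum is still reachable.
    return replicas - quorum(replicas)
```

This is why replica counts of 3 and 5 are the usual choices: 3 replicas tolerate 1 failure, 5 tolerate 2, while an even count (say 4) pays an extra node without raising the failure budget.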


Spread across data centres

Assign replicas to different availability zones or regions using the zone tag. The Raft leader-election algorithm is zone-aware: it avoids electing a leader in a zone that already holds a majority of replicas, so a full AZ outage doesn't take down your cluster.

```python
engine = RaftShardedEngine(
    shard_groups=[
        {
            "replicas": [
                {"host": "us-east-1a.db.internal", "port": 9010, "path": "/data/s0", "zone": "us-east-1a"},
                {"host": "us-east-1b.db.internal", "port": 9010, "path": "/data/s0", "zone": "us-east-1b"},
                {"host": "eu-west-1a.db.internal", "port": 9010, "path": "/data/s0", "zone": "eu-west-1a"},
            ]
        },
    ],
    partitioner=HashPartitioner(),
    election_timeout_ms=500,   # slightly higher for cross-region latency
)
```

Scaling path at a glance

| Stage | Setup | Nodes | Fault tolerant? |
| --- | --- | --- | --- |
| Development | GraphEngine | 1 | |
| Production (single machine) | GraphEngine | 1 | WAL durability |
| Scale-out | ShardedGraphEngine | 2–N | No (single replica per shard) |
| High availability | RaftShardedEngine | 3+ per shard | Yes (quorum) |
| Multi-region | RaftShardedEngine + zone | 3+ per shard | Yes (AZ-aware election) |

What stays the same at every stage

  • Cypher API — your queries don't change
  • Python SDK interface — engine.query(...), engine.add_node(...), etc.
  • REST & GraphQL endpoints — same paths, same auth
  • Encryption — KMS config carries through to every shard and replica
  • Journey Engine — AI workflows continue firing on graph changes across the cluster

No migration scripts. No re-indexing. No schema changes. Just add nodes to your cluster config and you're done.


Which tier includes clustering?

| Feature | Desktop Free | Desktop Pro | Pro Cloud | Cloud Plus | Self-Hosted Server | Enterprise |
| --- | --- | --- | --- | --- | --- | --- |
| Single-node | | | | | | |
| ShardedGraphEngine | | | | | | |
| RaftShardedEngine (HA) | | | | | | |
| Multi-region / AZ-aware | | | | | | |

Purple8 Graph is proprietary software. All rights reserved.