Real-time Augmented AI with Structured Data

Most AI retrieval systems are built for unstructured content: PDFs, notes, chat history. But much of the most valuable data in an organisation is structured: CRM records, ERP transactions, contracts, tickets, financial accounts, HR systems.

Purple8 Graph bridges that gap. It lets you model structured data as a connected graph and query it with hybrid search — so your AI gets relational, up-to-date context with a single round-trip, not a pipeline of JOINs and vector calls.

The core idea

Traditional RAG:

User question → embed → vector search → top-k chunks → LLM

Graph-augmented AI with structured data:

User question → embed → Cypher (vector + relationships + filters) → structured context row → LLM

The difference: the LLM sees not just what a record says, but who owns it, what it's connected to, and what state it's in right now.

Step 1 — Model your structured data as a graph

Any structured record becomes a node. Foreign keys become edges.

python

from purple8_graph import GraphEngine

engine = GraphEngine("./data")

# CRM records
engine.add_node("Account",     {"id": "acct-001", "name": "Acme Corp",   "arr": 240000, "tier": "enterprise", "industry": "manufacturing"})
engine.add_node("Contact",     {"id": "ctt-001",  "name": "Jane Smith",  "role": "CFO", "email": "jane@acme.com"})
engine.add_node("Opportunity", {"id": "opp-001",  "name": "Renewal Q3", "value": 120000, "stage": "negotiation", "close_date": "2026-09-30"})
engine.add_node("SupportTicket", {"id": "tkt-001", "subject": "API timeout errors", "priority": "high", "status": "open"})
engine.add_node("Contract",    {"id": "ctr-001",  "start": "2025-10-01", "renewal_date": "2026-09-30", "auto_renew": False})

# Relationships make the links between records explicit, so one query can traverse them without JOINs
engine.add_edge("opp-001", "acct-001", "BELONGS_TO")
engine.add_edge("acct-001", "ctt-001", "PRIMARY_CONTACT")
engine.add_edge("acct-001", "tkt-001", "HAS_TICKET")
engine.add_edge("acct-001", "ctr-001", "HAS_CONTRACT")
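
When the source is a relational export, the node and edge calls above can be generated rather than hand-written. The sketch below is one possible way to do that in plain Python; the `tables` layout, the `foreign_keys` map, and the `rows_to_graph` helper are illustrative assumptions, not part of Purple8 Graph.

```python
from typing import Any

# Hypothetical relational rows, keyed by table name (illustrative data only).
tables: dict[str, list[dict[str, Any]]] = {
    "Account": [{"id": "acct-001", "name": "Acme Corp"}],
    "Opportunity": [
        {"id": "opp-001", "name": "Renewal Q3", "account_id": "acct-001"},
    ],
}

# Hypothetical foreign-key map: (table, column) -> edge type.
foreign_keys = {("Opportunity", "account_id"): "BELONGS_TO"}


def rows_to_graph(tables, foreign_keys):
    """Split rows into (label, properties) nodes and (source, target, type) edges."""
    nodes, edges = [], []
    for label, rows in tables.items():
        for row in rows:
            props = {}
            for column, value in row.items():
                edge_type = foreign_keys.get((label, column))
                if edge_type is None:
                    # Ordinary column: keep it as a node property.
                    props[column] = value
                elif value is not None:
                    # Foreign-key column: turn it into an edge instead.
                    edges.append((row["id"], value, edge_type))
            nodes.append((label, props))
    return nodes, edges


nodes, edges = rows_to_graph(tables, foreign_keys)
```

Each `(label, props)` pair could then be passed to `engine.add_node` and each edge tuple to `engine.add_edge`, in the same argument order as the listing above.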

No schema required

You don't need to define a schema before ingesting. Purple8 infers it from the data. Add new record types — a new ERP object, a new CRM field — without any DDL or migration.
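
To give a feel for what schema inference involves, here is a minimal sketch that derives property names and value types from the records themselves. It is not Purple8's implementation; `infer_schema` and the sample records are hypothetical.

```python
from collections import defaultdict


def infer_schema(records):
    """Collect the observed property names and Python type names per label."""
    observed = defaultdict(lambda: defaultdict(set))
    for label, props in records:
        for key, value in props.items():
            observed[label][key].add(type(value).__name__)
    # Convert to plain dicts with sorted type lists for stable output.
    return {
        label: {key: sorted(types) for key, types in fields.items()}
        for label, fields in observed.items()
    }


# Illustrative records: the second one introduces a new field without any migration.
records = [
    ("Account", {"id": "acct-001", "arr": 240000}),
    ("Account", {"id": "acct-002", "arr": 99000.5, "region": "EMEA"}),
]
schema = infer_schema(records)
```

A field that first appears in a later record (`region` here) simply shows up in the inferred schema, which is the behaviour the paragraph above describes.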

Step 2 — Retrieve structured context with one Cypher query

At query time, hybrid Cypher combines vector similarity with relationship traversal and property filters — in a single round-trip.

python

import openai

user_question = "What's the renewal risk on Acme Corp?"
query_vec = openai.embeddings.create(input=user_question, model="text-embedding-3-small").data[0].embedding

results = engine.query("""
    CALL db.vector.search('Opportunity', $vec, 5) YIELD node, score
    WHERE score > 0.70
    MATCH (node)-[:BELONGS_TO]->(acct:Account)
    MATCH (acct)-[:PRIMARY_CONTACT]->(contact:Contact)
    OPTIONAL MATCH (acct)-[:HAS_TICKET]->(t:SupportTicket {status: 'open'})
    OPTIONAL MATCH (acct)-[:HAS_CONTRACT]->(c:Contract)
    RETURN
        node.name          AS opportunity,
        node.stage         AS stage,
        node.value         AS value,
        node.close_date    AS close_date,
        acct.name          AS account,
        acct.arr           AS arr,
        acct.tier          AS tier,
        contact.name       AS owner,
        contact.email      AS owner_email,
        count(DISTINCT t)  AS open_tickets,
        c.renewal_date     AS renewal_date,
        c.auto_renew       AS auto_renew,
        score
    ORDER BY score DESC
    LIMIT 1
""", vec=query_vec)

One query. Zero application-side joins.
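
To make explicit what the hybrid query combines, the toy example below performs the same three steps (similarity ranking, relationship traversal, property filter) over in-memory dictionaries. It is only an illustration of the logic, not how Purple8 executes Cypher; the two-dimensional vectors and the `hybrid_lookup` helper are made up for the example.

```python
import math

# Toy graph: made-up 2-D embeddings stand in for real ones.
nodes = {
    "opp-001": {"label": "Opportunity", "name": "Renewal Q3", "vec": [0.9, 0.1]},
    "opp-002": {"label": "Opportunity", "name": "Upsell", "vec": [0.1, 0.9]},
    "acct-001": {"label": "Account", "name": "Acme Corp"},
    "tkt-001": {"label": "SupportTicket", "status": "open"},
    "tkt-002": {"label": "SupportTicket", "status": "closed"},
}
edges = [
    ("opp-001", "acct-001", "BELONGS_TO"),
    ("opp-002", "acct-001", "BELONGS_TO"),
    ("acct-001", "tkt-001", "HAS_TICKET"),
    ("acct-001", "tkt-002", "HAS_TICKET"),
]


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def neighbours(source, edge_type):
    """Follow outgoing edges of one type from a node."""
    return [t for s, t, e in edges if s == source and e == edge_type]


def hybrid_lookup(query_vec, min_score=0.70):
    # 1. Vector step: score every Opportunity and keep those above the threshold.
    scored = [
        (cosine(query_vec, node["vec"]), node_id)
        for node_id, node in nodes.items()
        if node["label"] == "Opportunity"
    ]
    scored = [(score, node_id) for score, node_id in scored if score > min_score]
    if not scored:
        return None
    score, opp_id = max(scored)
    # 2. Traversal step: hop from the opportunity to its account.
    acct_id = neighbours(opp_id, "BELONGS_TO")[0]
    # 3. Filter step: count only tickets whose status is 'open'.
    open_tickets = sum(
        1 for t in neighbours(acct_id, "HAS_TICKET") if nodes[t]["status"] == "open"
    )
    return {
        "opportunity": nodes[opp_id]["name"],
        "account": nodes[acct_id]["name"],
        "open_tickets": open_tickets,
        "score": score,
    }


row = hybrid_lookup([1.0, 0.0])
```

In an application, each of these steps would typically be a separate call to a different system; the Cypher query above expresses all three in one statement.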

Step 3 — Inject structured context into the LLM

python

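# Guard against an empty result set before indexing into it
if not results:
    raise LookupError("No opportunity matched the question above the score threshold")
row = results[0]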

system_prompt = "You are a revenue intelligence assistant. Answer based only on the structured context provided."

user_prompt = f"""
Structured context:
- Opportunity: {row['opportunity']} (stage: {row['stage']}, value: ${row['value']:,}, closes: {row['close_date']})
- Account: {row['account']} ({row['tier']} tier, ARR: ${row['arr']:,})
- Primary contact: {row['owner']} <{row['owner_email']}>
- Open support tickets: {row['open_tickets']}
- Contract renewal: {row['renewal_date']} (auto-renew: {'yes' if row['auto_renew'] else 'no'})

Question: {user_question}
"""

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user",   "content": user_prompt},
    ]
)
print(response.choices[0].message.content)
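
Because the query uses OPTIONAL MATCH, some fields may come back empty, and a format spec such as `:,` raises a TypeError on `None`. A small helper along these lines can keep prompt construction robust. It assumes result rows behave like dictionaries and that missing values arrive as `None`; both are assumptions about the client, and `build_context` is a hypothetical name.

```python
def fmt_money(value):
    """Format a number as dollars, or 'unknown' when the value is missing."""
    return f"${value:,}" if value is not None else "unknown"


def build_context(row):
    """Render a result row as prompt lines, tolerating missing optional fields."""
    auto_renew = row.get("auto_renew")
    if auto_renew is None:
        auto_renew_text = "unknown"
    else:
        auto_renew_text = "yes" if auto_renew else "no"
    lines = [
        f"- Opportunity: {row['opportunity']} "
        f"(stage: {row['stage']}, value: {fmt_money(row.get('value'))})",
        f"- Open support tickets: {row.get('open_tickets', 0)}",
        f"- Contract renewal: {row.get('renewal_date') or 'unknown'} "
        f"(auto-renew: {auto_renew_text})",
    ]
    return "\n".join(lines)


# Example row with no matching contract (illustrative values).
context = build_context({
    "opportunity": "Renewal Q3",
    "stage": "negotiation",
    "value": 120000,
    "open_tickets": 1,
    "renewal_date": None,
    "auto_renew": None,
})
```

Stating "unknown" explicitly also tells the model that the data is missing, rather than leaving it to guess.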

Step 4 — Keep context fresh with real-time graph updates

The graph mirrors your source systems. Update records as they change there, and the next AI query automatically reads the latest state.

python

# A deal just moved stage — update the graph
engine.update_node("opp-001", {"stage": "closed-won", "close_date": "2026-07-15"})

# A new ticket opened
engine.add_node("SupportTicket", {"id": "tkt-002", "subject": "Data export failure", "priority": "critical", "status": "open"})
engine.add_edge("acct-001", "tkt-002", "HAS_TICKET")

No cache invalidation. No pipeline re-run. The next query reads the updated graph.
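
In practice these updates usually come from change events emitted by the source system. The sketch below shows one way to translate such events into graph calls. The event shape and `apply_change` are assumptions, and `RecordingEngine` is a stand-in that only records calls, mirroring the `add_node`, `add_edge`, and `update_node` signatures used above so the example runs without Purple8 installed.

```python
class RecordingEngine:
    """Stand-in for a graph engine: records calls instead of storing data."""

    def __init__(self):
        self.calls = []

    def add_node(self, label, props):
        self.calls.append(("add_node", label, props))

    def add_edge(self, source, target, edge_type):
        self.calls.append(("add_edge", source, target, edge_type))

    def update_node(self, node_id, props):
        self.calls.append(("update_node", node_id, props))


def apply_change(engine, event):
    """Translate one change event (hypothetical shape) into graph calls."""
    if event["op"] == "insert":
        engine.add_node(event["label"], event["record"])
        for source, target, edge_type in event.get("edges", []):
            engine.add_edge(source, target, edge_type)
    elif event["op"] == "update":
        engine.update_node(event["id"], event["changes"])
    else:
        raise ValueError(f"Unsupported operation: {event['op']}")


# Illustrative events matching the two changes shown above.
events = [
    {"op": "update", "id": "opp-001", "changes": {"stage": "closed-won"}},
    {
        "op": "insert",
        "label": "SupportTicket",
        "record": {"id": "tkt-002", "status": "open"},
        "edges": [("acct-001", "tkt-002", "HAS_TICKET")],
    },
]

engine = RecordingEngine()
for event in events:
    apply_change(engine, event)
```

Swapping `RecordingEngine` for a real engine instance should require no change to `apply_change`, provided the method signatures match.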

Step 5 — Trigger AI workflows automatically on graph changes

Use the Journey Engine to fire AI analysis whenever a structured record reaches a certain state — without polling.

python

from purple8_graph import JourneyEngine, JourneyAIAdvisor

je = JourneyEngine(engine)

je.define_journey("renewal-risk-check", stages=[
    {
        "name":      "negotiation",
        "condition": "node.stage = 'negotiation' AND date(node.close_date) < date() + duration('P30D')",
        "advisor":   JourneyAIAdvisor(
            prompt_template="""
            The opportunity '{name}' is in negotiation and closes in under 30 days.
            Open tickets: {open_tickets}. Auto-renew: {auto_renew}.
            Recommend 3 specific actions to reduce churn risk.
            """,
            output_field="risk_actions",
        ),
        "sla_hours": 1,
    },
    {
        "name":      "human-review",
        # Assumes risk_score is written to the node elsewhere; the advisor above only sets risk_actions
        "condition": "node.risk_score > 0.7",
        "human_in_the_loop": True,
        "assignee_query": "MATCH (u:User {role: 'csm'}) RETURN u LIMIT 1",
    },
])

When any Opportunity node transitions to negotiation and its close date is within 30 days, Purple8 automatically runs the AI advisor, writes the risk_actions back to the node, and (if the risk score is high) routes it to a human CSM for review — all with a full audit trail.
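
For readers who want to sanity-check the first stage's condition, the same rule can be expressed in plain Python. This is only a sketch of the logic, assuming `close_date` is stored as an ISO 8601 string as in Step 1; `needs_risk_check` is a hypothetical helper, not a Journey Engine API.

```python
from datetime import date, timedelta


def needs_risk_check(opportunity, today=None, window_days=30):
    """True when the deal is in negotiation and closes before today + window."""
    today = today or date.today()
    close_date = date.fromisoformat(opportunity["close_date"])
    in_negotiation = opportunity["stage"] == "negotiation"
    return in_negotiation and close_date < today + timedelta(days=window_days)


opp = {"stage": "negotiation", "close_date": "2026-09-30"}

inside_window = needs_risk_check(opp, today=date(2026, 9, 10))
outside_window = needs_risk_check(opp, today=date(2026, 7, 1))
closed_deal = needs_risk_check(
    {"stage": "closed-won", "close_date": "2026-09-30"}, today=date(2026, 9, 10)
)
```

Like the condition string it mirrors, this check also matches close dates that are already in the past; add a lower bound if overdue deals should be handled separately.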

→ Journey Engine guide

Why this beats SQL + vector search

Capability	SQL JOIN + vector	Purple8 Graph
Round-trips per query	2–4	1
Arbitrary relationship depth	❌	✅
Schema migrations for new objects	Required	Not required
Real-time updates in context	Cache invalidation	Direct node update
AI workflow triggers on data change	External job/event	Built-in Journey Engine
Hybrid keyword + semantic + graph	❌	✅

More use cases

See the Use Cases page for working Cypher examples across:

CRM deal intelligence — renewal risk, churn signals
Financial risk traversal — counterparty exposure chains
HR org-chart AI — reporting structure, headcount queries
Supply chain disruption alerts — real-time supplier risk
Regulatory compliance Q&A — policy-to-data-asset mapping
