Real-time Augmented AI with Structured Data
Most AI retrieval systems are built for unstructured content: PDFs, notes, chat history. But much of an organisation's most valuable data is structured: CRM records, ERP transactions, contracts, tickets, financial accounts, HR systems.
Purple8 Graph bridges that gap. It lets you model structured data as a connected graph and query it with hybrid search — so your AI gets relational, up-to-date context with a single round-trip, not a pipeline of JOINs and vector calls.
The core idea
Traditional RAG:
User question → embed → vector search → top-k chunks → LLM
Graph-augmented AI with structured data:
User question → embed → Cypher (vector + relationships + filters) → structured context row → LLM
The difference: the LLM sees not just what a record says, but who owns it, what it's connected to, and what state it's in right now.
Step 1 — Model your structured data as a graph
Any structured record becomes a node. Foreign keys become edges.
from purple8_graph import GraphEngine
engine = GraphEngine("./data")
# CRM records
engine.add_node("Account", {"id": "acct-001", "name": "Acme Corp", "arr": 240000, "tier": "enterprise", "industry": "manufacturing"})
engine.add_node("Contact", {"id": "ctt-001", "name": "Jane Smith", "role": "CFO", "email": "jane@acme.com"})
engine.add_node("Opportunity", {"id": "opp-001", "name": "Renewal Q3", "value": 120000, "stage": "negotiation", "close_date": "2026-09-30"})
engine.add_node("SupportTicket", {"id": "tkt-001", "subject": "API timeout errors", "priority": "high", "status": "open"})
engine.add_node("Contract", {"id": "ctr-001", "start": "2025-10-01", "renewal_date": "2026-09-30", "auto_renew": False})
# Relationships encode structure that SQL needs a chain of JOINs to traverse
engine.add_edge("opp-001", "acct-001", "BELONGS_TO")
engine.add_edge("acct-001", "ctt-001", "PRIMARY_CONTACT")
engine.add_edge("acct-001", "tkt-001", "HAS_TICKET")
engine.add_edge("acct-001", "ctr-001", "HAS_CONTRACT")
No schema required
You don't need to define a schema before ingesting. Purple8 infers it from the data. Add new record types — a new ERP object, a new CRM field — without any DDL or migration.
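The mapping described in Step 1 (records become nodes, foreign keys become edges) can be sketched as a small loader. This is a minimal illustration in plain Python, not part of Purple8: the `rows_to_graph` helper and the `foreign_keys` map are hypothetical names, and the sketch assumes every row carries an `id` column.

```python
def rows_to_graph(tables, foreign_keys):
    """Turn relational rows into node and edge tuples.

    tables:       {label: [row_dict, ...]}, each row assumed to have an "id"
    foreign_keys: {(label, fk_column): edge_type}  (hypothetical mapping)
    """
    nodes, edges = [], []
    for label, rows in tables.items():
        # Foreign-key columns declared for this table, in a stable order
        fk_cols = [col for (tbl, col) in foreign_keys if tbl == label]
        for row in rows:
            # Everything except the foreign keys stays on the node as a property
            props = {k: v for k, v in row.items() if k not in fk_cols}
            nodes.append((label, props))
            # Each non-null foreign key becomes an edge: row -> referenced record
            for col in fk_cols:
                if row.get(col) is not None:
                    edges.append((row["id"], row[col], foreign_keys[(label, col)]))
    return nodes, edges


tables = {
    "Account": [{"id": "acct-001", "name": "Acme Corp"}],
    "Opportunity": [
        {"id": "opp-001", "name": "Renewal Q3", "account_id": "acct-001"},
    ],
}
foreign_keys = {("Opportunity", "account_id"): "BELONGS_TO"}

nodes, edges = rows_to_graph(tables, foreign_keys)
```

Each node tuple could then be passed to `engine.add_node(label, props)` and each edge tuple to `engine.add_edge(src, dst, edge_type)`, following the call shapes shown in Step 1. No per-table declaration is needed beyond the foreign-key map.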
Step 2 — Retrieve structured context with one Cypher query
At query time, hybrid Cypher combines vector similarity with relationship traversal and property filters — in a single round-trip.
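To make those three ingredients concrete, here is a toy in-memory sketch of the same idea: score by similarity, keep matches above a threshold, then follow relationships and apply a property filter. It only illustrates the semantics and says nothing about how Purple8 executes a query; `cosine`, `hybrid_lookup`, and the two-dimensional vectors are assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_lookup(query_vec, opportunities, edges, tickets, min_score=0.70):
    """Pick the best-matching opportunity, then gather context via edges."""
    # 1. Vector similarity plus score threshold
    scored = [(cosine(query_vec, o["vec"]), o) for o in opportunities]
    scored = [(s, o) for s, o in scored if s > min_score]
    if not scored:
        return None
    score, opp = max(scored, key=lambda pair: pair[0])

    # 2. Relationship traversal: opportunity -BELONGS_TO-> account
    acct_id = next(
        dst for src, dst, kind in edges
        if src == opp["id"] and kind == "BELONGS_TO"
    )

    # 3. Traversal plus property filter: account -HAS_TICKET-> open tickets
    ticket_ids = [
        dst for src, dst, kind in edges
        if src == acct_id and kind == "HAS_TICKET"
    ]
    open_tickets = sum(1 for t in ticket_ids if tickets[t]["status"] == "open")

    return {
        "opportunity": opp["name"],
        "account": acct_id,
        "open_tickets": open_tickets,
        "score": score,
    }


opportunities = [
    {"id": "opp-001", "name": "Renewal Q3", "vec": [1.0, 0.0]},
    {"id": "opp-002", "name": "Upsell", "vec": [0.0, 1.0]},
]
edges = [
    ("opp-001", "acct-001", "BELONGS_TO"),
    ("opp-002", "acct-002", "BELONGS_TO"),
    ("acct-001", "tkt-001", "HAS_TICKET"),
    ("acct-001", "tkt-002", "HAS_TICKET"),
]
tickets = {"tkt-001": {"status": "open"}, "tkt-002": {"status": "closed"}}

row = hybrid_lookup([0.9, 0.1], opportunities, edges, tickets)
```

In the sketch the three steps are separate passes over Python lists; the hybrid Cypher query expresses the same three steps in one statement.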
import openai
user_question = "What's the renewal risk on Acme Corp?"
query_vec = openai.embeddings.create(input=user_question, model="text-embedding-3-small").data[0].embedding
results = engine.query("""
CALL db.vector.search('Opportunity', $vec, 5) YIELD node, score
WHERE score > 0.70
MATCH (node)-[:BELONGS_TO]->(acct:Account)
MATCH (acct)-[:PRIMARY_CONTACT]->(contact:Contact)
OPTIONAL MATCH (acct)-[:HAS_TICKET]->(t:SupportTicket {status: 'open'})
OPTIONAL MATCH (acct)-[:HAS_CONTRACT]->(c:Contract)
RETURN
node.name AS opportunity,
node.stage AS stage,
node.value AS value,
node.close_date AS close_date,
acct.name AS account,
acct.arr AS arr,
acct.tier AS tier,
contact.name AS owner,
contact.email AS owner_email,
count(t) AS open_tickets,
c.renewal_date AS renewal_date,
c.auto_renew AS auto_renew,
score
ORDER BY score DESC
LIMIT 1
""", vec=query_vec)
One query. Zero application-side joins.
Step 3 — Inject structured context into the LLM
row = results[0]
system_prompt = "You are a revenue intelligence assistant. Answer based only on the structured context provided."
user_prompt = f"""
Structured context:
- Opportunity: {row['opportunity']} (stage: {row['stage']}, value: ${row['value']:,}, closes: {row['close_date']})
- Account: {row['account']} ({row['tier']} tier, ARR: ${row['arr']:,})
- Primary contact: {row['owner']} <{row['owner_email']}>
- Open support tickets: {row['open_tickets']}
- Contract renewal: {row['renewal_date']} (auto-renew: {'yes' if row['auto_renew'] else 'no'})
Question: {user_question}
"""
response = openai.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt},
]
)
print(response.choices[0].message.content)
Step 4 — Keep context fresh with real-time graph updates
Treat the graph as the source of truth for AI context. Update records as they change in your source system, and the next AI query automatically gets the latest state.
# A deal just moved stage — update the graph
engine.update_node("opp-001", {"stage": "closed-won", "close_date": "2026-07-15"})
# A new ticket opened
engine.add_node("SupportTicket", {"id": "tkt-002", "subject": "Data export failure", "priority": "critical", "status": "open"})
engine.add_edge("acct-001", "tkt-002", "HAS_TICKET")
No cache invalidation. No pipeline re-run. The next vector search returns the updated context.
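If your source system emits change events, the updates above can be driven by a small dispatcher. The sketch below is one possible shape, not a Purple8 feature: the event format (`op`, `id`, `changes`, `label`, `record`, `edges`) is hypothetical, and `RecordingEngine` is a stand-in that mimics the `add_node`, `add_edge`, and `update_node` call shapes used in this guide so the example runs without a database.

```python
class RecordingEngine:
    """Stand-in for GraphEngine: records calls instead of writing to storage."""

    def __init__(self):
        self.calls = []

    def add_node(self, label, props):
        self.calls.append(("add_node", label, props))

    def add_edge(self, src, dst, kind):
        self.calls.append(("add_edge", src, dst, kind))

    def update_node(self, node_id, props):
        self.calls.append(("update_node", node_id, props))


def apply_change(engine, event):
    """Translate one (hypothetical) change event into graph calls."""
    op = event["op"]
    if op == "update":
        engine.update_node(event["id"], event["changes"])
    elif op == "insert":
        engine.add_node(event["label"], event["record"])
        # Link the new record to existing nodes, if the event says how
        for src, dst, kind in event.get("edges", []):
            engine.add_edge(src, dst, kind)
    else:
        raise ValueError(f"unsupported op: {op!r}")


events = [
    {"op": "update", "id": "opp-001", "changes": {"stage": "closed-won"}},
    {
        "op": "insert",
        "label": "SupportTicket",
        "record": {"id": "tkt-002", "status": "open"},
        "edges": [("acct-001", "tkt-002", "HAS_TICKET")],
    },
]

engine = RecordingEngine()
for event in events:
    apply_change(engine, event)
```

Swapping `RecordingEngine` for the real engine would apply each event as it arrives, which is what keeps the next query's context current.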
Step 5 — Trigger AI workflows automatically on graph changes
Use the Journey Engine to fire AI analysis whenever a structured record reaches a certain state — without polling.
from purple8_graph import JourneyEngine, JourneyAIAdvisor
je = JourneyEngine(engine)
je.define_journey("renewal-risk-check", stages=[
{
"name": "negotiation",
"condition": "node.stage = 'negotiation' AND date(node.close_date) < date() + duration('P30D')",
"advisor": JourneyAIAdvisor(
prompt_template="""
The opportunity '{name}' is in negotiation and closes in under 30 days.
Open tickets: {open_tickets}. Auto-renew: {auto_renew}.
Recommend 3 specific actions to reduce churn risk.
""",
output_field="risk_actions",
),
"sla_hours": 1,
},
{
"name": "human-review",
"condition": "node.risk_score > 0.7",
"human_in_the_loop": True,
"assignee_query": "MATCH (u:User {role: 'csm'}) RETURN u LIMIT 1",
},
])
When any Opportunity node transitions to negotiation and its close date is within 30 days, Purple8 automatically runs the AI advisor, writes the risk_actions back to the node, and (if the risk score is high) routes it to a human CSM for review, all with a full audit trail.
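The first stage's trigger condition is easy to reason about in plain Python. The sketch below mirrors the rule (stage is negotiation and the close date falls before today plus 30 days) so you can check it against sample records; `needs_risk_check` is a hypothetical helper, and it assumes close dates are stored as ISO `YYYY-MM-DD` strings, as in Step 1.

```python
from datetime import date, timedelta


def needs_risk_check(stage, close_date, today, window_days=30):
    """True when a deal is in negotiation and closes before today + window."""
    # Parse the stored ISO string so the comparison is date-to-date
    closes = date.fromisoformat(close_date)
    return stage == "negotiation" and closes < today + timedelta(days=window_days)


# Closes 20 days out while in negotiation: the check fires
soon = needs_risk_check("negotiation", "2026-09-30", today=date(2026, 9, 10))

# Closes about two months out: outside the 30-day window
later = needs_risk_check("negotiation", "2026-09-30", today=date(2026, 8, 1))

# Already closed-won: the stage no longer matches
won = needs_risk_check("closed-won", "2026-09-30", today=date(2026, 9, 10))
```

As in the journey condition, a close date that has already passed also satisfies the rule, since only an upper bound is checked.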
Why this beats SQL + vector search
| | SQL JOIN + vector | Purple8 Graph |
|---|---|---|
| Round-trips per query | 2–4 | 1 |
| Arbitrary relationship depth | Only via recursive CTEs | ✅ |
| Schema migrations for new objects | Required | Not required |
| Real-time updates in context | Cache invalidation | Direct node update |
| AI workflow triggers on data change | External job/event | Built-in Journey Engine |
| Hybrid keyword + semantic + graph | ❌ | ✅ |
More use cases
See the Use Cases page for working Cypher examples across:
- CRM deal intelligence — renewal risk, churn signals
- Financial risk traversal — counterparty exposure chains
- HR org-chart AI — reporting structure, headcount queries
- Supply chain disruption alerts — real-time supplier risk
- Regulatory compliance Q&A — policy-to-data-asset mapping
Further reading
- Hybrid Search guide — BM25 + vector + graph in one query
- Journey Engine guide — trigger AI workflows on graph changes
- Schema & Data Model — ingest structured data without upfront DDL
- LLM / RAG Grounding use case — unstructured variant