Your First Full Query
A worked example using real sentence embeddings and a realistic knowledge graph.
What we'll build
We'll build a small research knowledge graph of documents, authors, and topics, then run an entity disambiguation query: one of the harder RAG queries, and the one where Purple8's graph context matters most.
The problem: You search for "transformer architecture" and two relevant documents come back, both by an "Alice Chen." Which Alice Chen? A graph database can tell them apart by their neighbourhood.
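To make "neighbourhood" concrete, here is a minimal plain-Python sketch with no Purple8 calls. Two author nodes share a name but are reached through documents on different topics. The ids mirror the graph built later on this page; the `topics_of` helper is hypothetical and exists only for illustration.

```python
# (source, target, relationship) triples; ids match the graph built later on this page.
edges = [
    ("d1", "alice_ml", "AUTHORED_BY"),
    ("d3", "alice_bio", "AUTHORED_BY"),
    ("d1", "ml_topic", "BELONGS_TO"),
    ("d3", "bio_topic", "BELONGS_TO"),
]

def topics_of(author_id):
    """Hypothetical helper: topics of the documents linked to an author node."""
    # Step 1: documents that point at this author.
    docs = {src for src, dst, rel in edges if dst == author_id and rel == "AUTHORED_BY"}
    # Step 2: topics those documents belong to.
    return {dst for src, dst, rel in edges if src in docs and rel == "BELONGS_TO"}

print(sorted(topics_of("alice_ml")))   # ['ml_topic']
print(sorted(topics_of("alice_bio")))  # ['bio_topic']
```

The two nodes carry the same `name` property, yet their neighbourhoods do not overlap, and that difference is what the disambiguation query relies on.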
Setup
pip install purple8-graph sentence-transformers
Build the graph
from purple8_graph import GraphEngine
from sentence_transformers import SentenceTransformer
import numpy as np
engine = GraphEngine("./research_graph")
model = SentenceTransformer("all-MiniLM-L6-v2") # 384-dim
# ── Documents ──────────────────────────────────────────────────────────────
docs = [
    ("d1", "Attention mechanisms in neural machine translation", "ML"),
    ("d2", "Self-attention for document classification", "ML"),
    ("d3", "Protein folding with graph neural networks", "Biology"),
    ("d4", "AlphaFold and structure prediction", "Biology"),
]
for doc_id, title, topic in docs:
    embedding = model.encode(title).tolist()
    engine.add_node(doc_id, labels=["Document"], properties={
        "title": title,
        "topic": topic,
        "embedding": embedding,
    })
# ── Authors — two people named "Alice Chen" ────────────────────────────────
engine.add_node("alice_ml", labels=["Person"], properties={"name": "Alice Chen", "field": "ML"})
engine.add_node("alice_bio", labels=["Person"], properties={"name": "Alice Chen", "field": "Biology"})
engine.add_node("bob", labels=["Person"], properties={"name": "Bob Smith", "field": "ML"})
# ── Topics ─────────────────────────────────────────────────────────────────
engine.add_node("ml_topic", labels=["Topic"], properties={"name": "Machine Learning"})
engine.add_node("bio_topic", labels=["Topic"], properties={"name": "Biology"})
# ── Edges ──────────────────────────────────────────────────────────────────
engine.add_edge("d1", "alice_ml", "AUTHORED_BY")
engine.add_edge("d2", "alice_ml", "AUTHORED_BY")
engine.add_edge("d2", "bob", "AUTHORED_BY")
engine.add_edge("d3", "alice_bio", "AUTHORED_BY")
engine.add_edge("d4", "alice_bio", "AUTHORED_BY")
engine.add_edge("d1", "ml_topic", "BELONGS_TO")
engine.add_edge("d2", "ml_topic", "BELONGS_TO")
engine.add_edge("d3", "bio_topic", "BELONGS_TO")
engine.add_edge("d4", "bio_topic", "BELONGS_TO")
# ── Vector index ───────────────────────────────────────────────────────────
engine.create_vector_index("Document", "embedding", dim=384)
print("Graph built.")
Entity disambiguation query
The key query: find documents about "attention" — then use graph context to identify which Alice Chen authored each match.
query = "self-attention transformer"
query_vec = model.encode(query).tolist()
results = engine.execute_cypher("""
CALL db.vector.search('Document', $vec, 5) YIELD node, score
WHERE score > 0.5
MATCH (node)-[:AUTHORED_BY]->(author:Person)
MATCH (node)-[:BELONGS_TO]->(topic:Topic)
RETURN
node.title AS title,
author.name AS author,
author.field AS field,
topic.name AS topic,
score
ORDER BY score DESC
""", {"vec": query_vec})
for r in results:
    print(f" [{r['score']:.3f}] {r['title']}")
    print(f" by {r['author']} ({r['field']}) — {r['topic']}")
Output:
[0.921] Self-attention for document classification
by Alice Chen (ML) — Machine Learning
[0.884] Attention mechanisms in neural machine translation
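by Alice Chen (ML) — Machine Learning
Both returned documents link to alice_ml, not alice_bio. The graph disambiguated the author with no extra query, client-side logic, or post-processing. (Because d2 also has an AUTHORED_BY edge to bob, standard Cypher matching would additionally yield a Bob Smith row for that document.)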
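As a rough mental model of what the query does, the sketch below mirrors its stages in plain Python: score documents by similarity, keep the top k above a threshold, then follow AUTHORED_BY and BELONGS_TO edges. It is not Purple8's implementation. The 2-dimensional vectors are made up for illustration (the real embeddings are 384-dimensional), and treating the score as cosine similarity is an assumption.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy 2-d "embeddings" (made up; not real model output).
doc_vecs = {"d1": [0.9, 0.1], "d2": [1.0, 0.0], "d3": [0.1, 0.9], "d4": [0.0, 1.0]}
# Edges from the tutorial graph, stored as adjacency lists.
authored_by = {"d1": ["alice_ml"], "d2": ["alice_ml", "bob"],
               "d3": ["alice_bio"], "d4": ["alice_bio"]}
belongs_to = {"d1": ["ml_topic"], "d2": ["ml_topic"],
              "d3": ["bio_topic"], "d4": ["bio_topic"]}

def search(query_vec, k=5, min_score=0.5):
    # Stage 1: vector search, best k documents by similarity.
    scored = sorted(((cosine(query_vec, vec), doc) for doc, vec in doc_vecs.items()),
                    reverse=True)[:k]
    rows = []
    for score, doc in scored:
        # Stage 2: the equivalent of WHERE score > 0.5.
        if score <= min_score:
            continue
        # Stage 3: one row per (author, topic) pair reachable from the document.
        for author in authored_by[doc]:
            for topic in belongs_to[doc]:
                rows.append((doc, author, topic, round(score, 3)))
    return rows

rows = search([1.0, 0.0])
for row in rows:
    print(row)
```

With this toy data, the biology documents fall below the threshold, so alice_bio never appears in the rows. Each surviving document yields one row per author, so d2 appears once for alice_ml and once for bob.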
What a vector-only system returns
A pure vector search (no graph) returns the same two documents, each labelled "Alice Chen", with no way to tell whether they share an author or which Alice Chen you meant. Disambiguation works only because the graph context is traversed in the same query.
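The contrast can be sketched in plain Python. The first list imitates vector-only hits, where the author is a plain string copied onto each record; the second keeps author node ids, as a graph does. The record layout is an assumption made for illustration, not the format of any particular vector store.

```python
from collections import defaultdict

# Vector-only style: the author is just a string on each hit.
flat_hits = [
    {"doc": "d1", "author": "Alice Chen"},
    {"doc": "d3", "author": "Alice Chen"},
    {"doc": "d4", "author": "Alice Chen"},
]

# Graph style: each hit points at an author node id.
graph_hits = [
    {"doc": "d1", "author_id": "alice_ml"},
    {"doc": "d3", "author_id": "alice_bio"},
    {"doc": "d4", "author_id": "alice_bio"},
]

def group(hits, key):
    """Group document ids by the value stored under `key`."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit[key]].append(hit["doc"])
    return dict(groups)

by_name = group(flat_hits, "author")
by_node = group(graph_hits, "author_id")

print(by_name)  # {'Alice Chen': ['d1', 'd3', 'd4']}
print(by_node)  # {'alice_ml': ['d1'], 'alice_bio': ['d3', 'd4']}
```

Grouping by name merges two different people into one bucket, while grouping by node id keeps them apart.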
Next steps
- Hybrid Search guide — all the query patterns in depth
- Schema guide — when and how to add schema validation
- Journey engine — add AI-driven workflow automation