@Leomrlin
Contributor

We're excited to introduce initial support for context-aware memory operations in Apache GeaFlow (incubating) through the integration of two key retrieval operators: Lucene-powered keyword search and embedding-based semantic search. This enhancement lays the foundational layer for building dynamic, AI-driven graph memory systems — enabling real-time, hybrid querying over structured graph data and unstructured semantic intent.

✅ Key Features Implemented

  • KeywordVector + Lucene Indexing: Enables fast, full-text retrieval of entities using BM25-style keyword matching. Ideal for surfacing exact or near-exact matches from entity attributes (e.g., names, emails, titles).
  • EmbeddingVector + Vector Index Store: Supports semantic search via high-dimensional embeddings. Queries are encoded using a configured embedding model and matched against pre-indexed node representations.
  • Hybrid VectorSearch Interface: Combines multiple vector types (keyword, embedding, traversal hints) into a single search context, paving the way for multimodal retrieval.
  • End-to-End Query Pipeline: From query ingestion → hybrid indexing → graph retrieval → context verbalization, demonstrated with LDBC-scale data.
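To make the keyword and embedding paths above concrete, here is a minimal, self-contained sketch of how BM25-style scoring and cosine similarity over embeddings could be fused into one ranking. It is written in Python for brevity and is not the GeaFlow API: `bm25_scores`, `cosine`, `rank`, and `rrf` are hypothetical helper names, the entities and 2-d vectors are invented toy data, and reciprocal rank fusion is an assumed fusion strategy, since the PR description does not say how the two result sets are combined.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with a BM25 formula."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def rank(scores):
    """Return item indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


def rrf(rankings, k=60):
    """Reciprocal rank fusion (assumed strategy): sum 1 / (k + rank) per item."""
    fused = Counter()
    for ranking in rankings:
        for pos, idx in enumerate(ranking):
            fused[idx] += 1.0 / (k + pos + 1)
    return sorted(fused, key=lambda i: -fused[i])


# Toy data (invented for illustration, not taken from the LDBC dataset).
entities = ["alice chen engineer", "alice moreau painter", "bob chen engineer"]
entity_vecs = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]]
query_text, query_vec = "alice chen", [1.0, 0.0]

keyword_scores = bm25_scores(query_text, entities)
semantic_scores = [cosine(query_vec, v) for v in entity_vecs]
fused = rrf([rank(keyword_scores), rank(semantic_scores)])
print([entities[i] for i in fused])
```

Rank-based fusion avoids having to normalize BM25 scores and cosine similarities onto a common scale; a weighted sum of normalized scores would be an equally reasonable choice.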

🧪 Validated Use Cases

Our GraphMemoryTest suite demonstrates:

  • Resolving ambiguous queries like "Chaim Azriel" into multiple candidate persons using keyword + embedding fusion.
  • Traversing relationships (e.g., Comment_hasCreator_Person) in follow-up rounds via contextual refinement.
  • Iterative context expansion across multiple search cycles — mimicking agent memory evolution.
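The multi-round behaviour described above can be sketched as repeated one-hop expansion of a context subgraph. This is only an illustration under stated assumptions, not the code exercised by GraphMemoryTest: `ContextGraph` and `expand` are hypothetical names, the node ids are invented, and `Comment_replyOf_Post` is an assumed edge label used alongside the `Comment_hasCreator_Person` label mentioned in this PR.

```python
from collections import defaultdict


class ContextGraph:
    """Tiny in-memory labeled graph (hypothetical stand-in for a graph store)."""

    def __init__(self):
        self.edges = defaultdict(list)  # node -> [(label, neighbor)]

    def add_edge(self, src, label, dst):
        # Store both directions so a round can walk from a Person to their Comments.
        self.edges[src].append((label, dst))
        self.edges[dst].append((label, src))


def expand(graph, context, frontier, labels):
    """Grow the context by one hop from the frontier along the allowed labels."""
    next_frontier = set()
    for node in frontier:
        for label, neighbor in graph.edges[node]:
            if label in labels and neighbor not in context:
                next_frontier.add(neighbor)
    return context | next_frontier, next_frontier


graph = ContextGraph()
graph.add_edge("comment:10", "Comment_hasCreator_Person", "person:1")
graph.add_edge("comment:11", "Comment_hasCreator_Person", "person:1")
graph.add_edge("comment:10", "Comment_replyOf_Post", "post:100")  # assumed label

# Round 0: an initial search resolved the query to a single person.
context = {"person:1"}
frontier = {"person:1"}

# Round 1: a follow-up question pulls in that person's comments.
context, frontier = expand(graph, context, frontier, {"Comment_hasCreator_Person"})
round1 = set(context)

# Round 2: a further refinement follows the comments to the posts they reply to.
context, frontier = expand(graph, context, frontier, {"Comment_replyOf_Post"})
print(sorted(context))
```

Keeping the frontier separate from the accumulated context means each round only explores nodes discovered in the previous round, so the context grows incrementally instead of being recomputed.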

🔮 Why This Matters

This work represents the first step toward Graphiti-inspired, relationship-aware AI memory within GeaFlow:

Instead of treating context as static text, we model it as a dynamic, evolving subgraph, enriched by both semantic similarity and topological structure.

By leveraging GeaFlow’s native streaming graph engine, we aim to go beyond batch RAG — supporting incremental updates, temporal reasoning, and multi-hop inference at low latency.


Next Steps:
We propose incubating this as the GeaFlow Memory Engine, with upcoming support for:

  • Graph traversal-guided re-ranking
  • Agent session management with episodic memory
  • Integration with LLM agents for autonomous reasoning

This PR sets the stage: from graph analytics to graph-native AI memory.

Let’s build the future of contextual intelligence — on streaming graphs. 🚀

@yaozhongq yaozhongq requested a review from cbqiao December 29, 2025 07:52
@Appointat Appointat left a comment

test

@Appointat Appointat left a comment

Thanks for your PR. Left some comments.

@Leomrlin Leomrlin changed the title from "feat(dsl): Adding Lucene & Embedding-Based Search Operators to Apache GeaFlow (incubating) for Lightweight Context Memory" to "feat(ai): Adding Lucene & Embedding-Based Search Operators to Apache GeaFlow (incubating) for Lightweight Context Memory" on Jan 6, 2026
Contributor

@cbqiao cbqiao left a comment

LGTM

@DukeWangYu
Contributor

LGTM

Contributor

@kitalkuyo-gita kitalkuyo-gita left a comment

LGTM

@Appointat Appointat left a comment

LGTM

@Appointat Appointat left a comment

LGTM

@Appointat Appointat left a comment

LGTM

@Appointat Appointat left a comment

LGTM

@cbqiao cbqiao merged commit ee4a0c5 into apache:master Jan 21, 2026
1 check passed

5 participants