feat(ai): Adding Lucene & Embedding-Based Search Operators to Apache GeaFlow (incubating) for Lightweight Context Memory #716
Appointat
left a comment
test
Appointat
left a comment
Thanks for your PR. Left some comments.
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/EmbeddingService.java
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/EmbeddingService.java
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/EmbeddingService.java
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/ModelInfo.java
Outdated
geaflow-ai/src/main/java/org/apache/geaflow/ai/common/model/Response.java
geaflow-ai/src/main/java/org/apache/geaflow/ai/index/vector/TraversalVector.java
Outdated
geaflow-ai/src/main/java/org/apache/geaflow/ai/index/EmbeddingIndexStore.java
Outdated
geaflow-ai/src/main/java/org/apache/geaflow/ai/graph/io/CsvFileReader.java
geaflow-ai/src/main/java/org/apache/geaflow/ai/operator/SessionOperator.java
geaflow-ai/src/main/java/org/apache/geaflow/ai/operator/EmbeddingOperator.java
Outdated
geaflow-ai/src/main/java/org/apache/geaflow/ai/session/SessionManagement.java
geaflow-ai/src/main/java/org/apache/geaflow/ai/operator/SearchUtils.java
Outdated
geaflow-ai/src/main/java/org/apache/geaflow/ai/operator/SessionOperator.java
cbqiao
left a comment
LGTM
kitalkuyo-gita
left a comment
LGTM
Appointat
left a comment
LGTM
Appointat
left a comment
LGTM
Appointat
left a comment
LGTM
Appointat
left a comment
LGTM
We're excited to introduce initial support for context-aware memory operations in Apache GeaFlow (incubating) through the integration of two key retrieval operators: Lucene-powered keyword search and embedding-based semantic search. This enhancement lays the foundational layer for building dynamic, AI-driven graph memory systems — enabling real-time, hybrid querying over structured graph data and unstructured semantic intent.
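As a rough illustration of the hybrid retrieval idea described above, the sketch below blends a keyword score with an embedding similarity score. It is not the code in this PR: the class `HybridSearchSketch`, its methods, the toy term-overlap score (a crude stand-in for Lucene's BM25 ranking), the hand-written two-dimensional embeddings, and the `alpha` weight are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: none of these names come from the GeaFlow code base.
final class HybridSearchSketch {

    // Fraction of query terms found in the text.
    // This is a simplified stand-in for a Lucene BM25 score, not BM25 itself.
    static double keywordScore(String query, String text) {
        Set<String> queryTerms = tokenize(query);
        Set<String> textTerms = tokenize(text);
        if (queryTerms.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (String term : queryTerms) {
            if (textTerms.contains(term)) {
                hits++;
            }
        }
        return (double) hits / queryTerms.size();
    }

    // Lower-cases the input and splits it on runs of non-word characters.
    static Set<String> tokenize(String s) {
        Set<String> terms = new HashSet<>();
        for (String t : s.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Cosine similarity between two embeddings of equal dimension.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // a zero vector has no direction to compare
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Linear blend of the two scores; alpha = 1 means keyword only.
    static double hybridScore(double keyword, double semantic, double alpha) {
        return alpha * keyword + (1.0 - alpha) * semantic;
    }

    public static void main(String[] args) {
        String query = "alice email";
        double[] queryEmbedding = {1.0, 0.0}; // assumed output of an embedding model
        String[] texts = {"Alice Smith alice@example.com", "Bob Jones engineer"};
        double[][] embeddings = {{0.8, 0.6}, {0.0, 1.0}}; // assumed pre-indexed vectors
        for (int i = 0; i < texts.length; i++) {
            double score = hybridScore(
                keywordScore(query, texts[i]),
                cosine(queryEmbedding, embeddings[i]),
                0.5);
            System.out.printf("%s -> %.3f%n", texts[i], score);
        }
    }
}
```

A linear blend is only one way to combine the two signals; rank-based fusion is a common alternative when the raw scores are not on comparable scales.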
✅ Key Features Implemented
- KeywordVector + Lucene Indexing: Enables fast, full-text retrieval of entities using BM25-style keyword matching. Ideal for surfacing exact or near-exact matches from entity attributes (e.g., names, emails, titles).
- EmbeddingVector + Vector Index Store: Supports semantic search via high-dimensional embeddings. Queries are encoded using a configured embedding model and matched against pre-indexed node representations.
- VectorSearch Interface: Combines multiple vector types (keyword, embedding, traversal hints) into a single search context, paving the way for multimodal retrieval.
🧪 Validated Use Cases
Our GraphMemoryTest suite demonstrates:
- [...] Comment_hasCreator_Person) in follow-up rounds via contextual refinement.
🔮 Why This Matters
This work represents the first step toward Graphiti-inspired, relationship-aware AI memory within GeaFlow:
By leveraging GeaFlow’s native streaming graph engine, we aim to go beyond batch RAG — supporting incremental updates, temporal reasoning, and multi-hop inference at low latency.
Next Steps:
We propose incubating this as the GeaFlow Memory Engine, with upcoming support for:
This PR sets the stage: from graph analytics to graph-native AI memory.
Let’s build the future of contextual intelligence — on streaming graphs. 🚀