Commit 6715468

Feature/conceptual search (#123)
* boilerplate api
* refactored project base
* updated gitignore to include intugle project base folder
* column graph built, ruff linting
* integrated retriever
* settings networkx configs added
* added table graph kg initialization
* agent integration testing
* needs testing
* added notebook compatibility for create graphs, implemented retriever for data product builder
* added langgraph for dev
* added force recreate for qdrant
* added project id
* removed domain, removed cost tracking
* corrected tool outputs
* data product planner APIs
* storing plan in memory
* add etl model mapping
* added query cleaning
* add conceptual search api to data product
* removed unused import in dp
* conceptual search test cases and quickstart
* added tavily key conditional handling, added set project base function
* conditional fix in prompt
* added docs and notebook
* upgraded version to 1.1.0, updated notebook docs
1 parent 29f963e commit 6715468

52 files changed

Lines changed: 3086 additions & 73 deletions


README.md

Lines changed: 3 additions & 1 deletion

@@ -42,6 +42,8 @@ Intugle’s GenAI-powered open-source Python library builds a semantic data model
 * **Semantic Data Model -** Transform raw, fragmented datasets into an intelligent semantic graph that captures entities, relationships, and context — the foundation for connected intelligence.
 * **Business Glossary & Semantic Search:** Auto-generate a business glossary and enable search that understands meaning, not just keywords — making data more accessible across technical and business users.
 * **Data Products -** Instantly generate SQL and reusable data products enriched with context, eliminating manual pipelines and accelerating data-to-insight.
+* **Conceptual Search -** Generate data product plans from natural language queries, bridging the gap between business questions and executable data product definitions. Learn more in the [documentation](https://intugle.github.io/data-tools/docs/core-concepts/data-product/conceptual-search).
+

 ## Getting Started

@@ -107,7 +109,7 @@ For a detailed, hands-on introduction to the project, please see our quickstart
 | **Native Snowflake with Cortex Analyst [ Tech Manufacturing ]** | [`quickstart_native_snowflake.ipynb`](notebooks/quickstart_native_snowflake.ipynb) | [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Intugle/data-tools/blob/main/notebooks/quickstart_native_snowflake.ipynb) |
 | **Native Databricks with AI/BI Genie [ Tech Manufacturing ]** | [`quickstart_native_databricks.ipynb`](notebooks/quickstart_native_databricks.ipynb) | [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Intugle/data-tools/blob/main/notebooks/quickstart_native_databricks.ipynb) |
 | **Streamlit App** | [`quickstart_streamlit.ipynb`](notebooks/quickstart_streamlit.ipynb) | [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Intugle/data-tools/blob/main/notebooks/quickstart_streamlit.ipynb) |
-
+| **Conceptual Search** | [`quickstart_conceptual_search.ipynb`](notebooks/quickstart_conceptual_search.ipynb) | [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Intugle/data-tools/blob/main/notebooks/quickstart_conceptual_search.ipynb) |

 These datasets will take you through the following steps:

 * **Generate Semantic Model** → The unified layer that transforms fragmented datasets, creating the foundation for connected intelligence.
docsite/docs/core-concepts/data-product/conceptual-search.md

Lines changed: 119 additions & 0 deletions

@@ -0,0 +1,119 @@
---
sidebar_position: 8
title: Conceptual Search
---

# Conceptual Search

:::info Experimental Feature
Conceptual Search is an experimental feature. The API and functionality may change in future releases.
:::

Conceptual Search is an AI-powered feature that allows you to generate a data product plan from a natural language query. It bridges the gap between a high-level business question and a concrete, executable data product definition.

## Overview

At its core, Conceptual Search uses a two-stage process orchestrated by AI agents and knowledge graphs:

1. **Knowledge Graphs**: The system builds knowledge graphs for both database tables and columns. Nodes represent tables/columns, and edges connect conceptually related items based on semantic similarity and shared concepts extracted by an LLM.
2. **Graph-Based Retrievers**: When you search, the system uses a hybrid approach of vector search and graph traversal to find relevant tables and columns, even if they are not direct keyword matches.
3. **AI Agents**: The process is managed by two LangChain/LangGraph agents: a `DataProductPlannerAgent` and a `DataProductBuilderAgent`.
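The hybrid retrieval described in point 2 can be sketched in miniature. This is an illustrative toy, not Intugle's implementation: the embeddings are hand-made three-dimensional vectors, the concept graph is a plain dict, and the expansion is a single hop, whereas the real retriever runs vector search in Qdrant over LLM-generated embeddings and traverses the built knowledge graph.

```python
import math

# Toy column embeddings (the real system stores LLM embeddings in Qdrant).
embeddings = {
    "customers.customer_id": [1.0, 0.0, 0.1],
    "orders.customer_id":    [0.9, 0.1, 0.1],
    "orders.amount":         [0.1, 1.0, 0.0],
    "orders.order_date":     [0.0, 0.2, 1.0],
}

# Toy concept graph: edges link conceptually related columns.
graph = {
    "customers.customer_id": {"orders.customer_id"},
    "orders.customer_id": {"customers.customer_id", "orders.amount"},
    "orders.amount": {"orders.customer_id"},
    "orders.order_date": set(),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def hybrid_retrieve(query_vec, top_k=1):
    """Vector search for seed nodes, then one-hop graph expansion."""
    ranked = sorted(embeddings, key=lambda n: cosine(query_vec, embeddings[n]), reverse=True)
    seeds = ranked[:top_k]
    expanded = set(seeds)
    for node in seeds:
        expanded |= graph[node]  # pull in conceptual neighbours
    return seeds, sorted(expanded)

seeds, results = hybrid_retrieve([0.15, 0.95, 0.05])  # query resembling "purchase amount"
print(seeds)    # best vector match
print(results)  # match plus graph neighbours
```

Note how `orders.customer_id` is a weak vector match for the query yet still appears in the results: it is pulled in through a concept edge, which is exactly why graph expansion finds columns that keyword or pure vector search would miss.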
## The Two-Stage Workflow

### Stage 1: Planning

The goal of this stage is to convert a vague user request (e.g., "customer churn metrics") into a structured `DataProductPlan`, which is a well-defined list of dimensions and measures.

1. **Input**: A natural language query.
2. **Agent's Task**: The `DataProductPlannerAgent` uses its tools to find relevant database tables and existing data products.
3. **Output**: The agent produces a `DataProductPlan` object, which can be reviewed and modified by the user.

:::tip User Validation is Key
The `DataProductPlan` generated by the AI is a starting point. It is crucial to review and validate this plan to ensure it aligns with your business requirements before proceeding to the building stage.
:::

### Stage 2: Building

This stage takes the abstract `DataProductPlan` and maps each attribute to a specific, physical database column, defining its logic (e.g., aggregation for measures).

1. **Input**: The `DataProductPlan` from Stage 1.
2. **Agent's Task**: The `DataProductBuilderAgent` iterates through each attribute in the plan, using the graph-based column retriever to find the most relevant physical column.
3. **Output**: The collected mappings are assembled into a final `ETLModel`, which is a complete, machine-readable definition of the data product, ready to be used to generate a SQL query.
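The building loop above can be sketched as follows. Everything here is a stand-in for illustration: the attribute dicts, the candidate scores, and `build_etl_mappings` are hypothetical shapes, and the real `DataProductBuilderAgent` uses an LLM together with the graph retriever rather than a plain best-score pick.

```python
# Sketch of the Stage 2 loop: each plan attribute is mapped to the
# best-scoring physical column, and measures get an aggregation.
plan_attributes = [
    {"name": "customer name", "classification": "Dimension"},
    {"name": "total purchase amount", "classification": "Measure"},
]

# Candidate columns a (hypothetical) column retriever returned, with scores.
candidates = {
    "customer name": [("customers.name", 0.92), ("customers.id", 0.40)],
    "total purchase amount": [("orders.amount", 0.88), ("orders.tax", 0.35)],
}

def build_etl_mappings(attributes, candidates):
    mappings = []
    for attr in attributes:
        # Pick the highest-scoring physical column for this attribute.
        column, _score = max(candidates[attr["name"]], key=lambda c: c[1])
        mapping = {"attribute": attr["name"], "column": column}
        if attr["classification"] == "Measure":
            mapping["aggregation"] = "sum"  # measures need an aggregation
        mappings.append(mapping)
    return mappings

etl_mappings = build_etl_mappings(plan_attributes, candidates)
print(etl_mappings)
```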
## Usage Example

```python
from intugle import DataProduct

dp = DataProduct()

# 1. Generate a plan from a natural language query
plan = await dp.plan(query="top 10 customers by their total purchase amount")

# 2. Review and modify the plan
print("Original Plan:")
plan.display()

plan.rename_attribute("total purchase amount", "Total Spend")
plan.disable_attribute("customer address")  # Assuming this was in the plan

print("\nModified Plan:")
plan.display()

# 3. Create the ETL model from the modified plan
etl_model = await dp.create_etl_model_from_plan(plan)

# 4. Build the data product
result_dataset = dp.build(etl=etl_model)

# 5. Access the results
print(result_dataset.to_df())
```
## Modifying the Data Product Plan

The `DataProductPlan` object is not just a static output; it's an interactive object that you can modify to refine the AI's suggestions. This allows you to correct any misunderstandings or add your own domain knowledge to the plan.

Here are the available methods to modify the plan:

| Method | Description | Example |
| --- | --- | --- |
| `rename_attribute(old, new)` | Renames an existing attribute. | `plan.rename_attribute('Customer ID', 'Client Identifier')` |
| `set_attribute_description(name, desc)` | Updates the description of an attribute. | `plan.set_attribute_description('Client Identifier', 'The unique ID for each client')` |
| `set_attribute_classification(name, class)` | Changes the classification to 'Dimension' or 'Measure'. | `plan.set_attribute_classification('Total Sales', 'Measure')` |
| `disable_attribute(name)` | Deactivates an attribute so it won't be included in the final data product. | `plan.disable_attribute('Customer Address')` |
| `enable_attribute(name)` | Reactivates a previously disabled attribute. | `plan.enable_attribute('Customer Address')` |
| `to_df()` | Returns the final plan as a pandas DataFrame with only active attributes. | `final_plan_df = plan.to_df()` |
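To make the rename/disable/enable semantics concrete, here is a toy stand-in class. This is not Intugle's `DataProductPlan`, only an illustration of how the methods in the table behave; `active_attributes()` plays the role of `to_df()` without requiring pandas.

```python
class ToyPlan:
    """Illustrative stand-in for DataProductPlan (not the real class)."""

    def __init__(self, attributes):
        # each attribute: name -> {"classification": ..., "active": True}
        self.attributes = {name: {"classification": cls, "active": True}
                           for name, cls in attributes}

    def rename_attribute(self, old, new):
        self.attributes[new] = self.attributes.pop(old)

    def set_attribute_classification(self, name, classification):
        self.attributes[name]["classification"] = classification

    def disable_attribute(self, name):
        self.attributes[name]["active"] = False

    def enable_attribute(self, name):
        self.attributes[name]["active"] = True

    def active_attributes(self):
        # analogous to to_df(): only active attributes survive
        return [a for a, meta in self.attributes.items() if meta["active"]]

plan = ToyPlan([("customer id", "Dimension"),
                ("total purchase amount", "Measure"),
                ("customer address", "Dimension")])
plan.rename_attribute("total purchase amount", "Total Spend")
plan.disable_attribute("customer address")
print(plan.active_attributes())  # ['customer id', 'Total Spend']
```

Disabling is non-destructive: the attribute stays in the plan and a later `enable_attribute` call brings it back into the final output.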
## Qdrant Server Requirement

Conceptual Search utilizes [Qdrant](https://qdrant.tech/) as its vector database for efficient retrieval of relevant tables and columns. Therefore, a running Qdrant instance is required.

You can easily set up a Qdrant server using Docker:

```bash
docker run -d -p 6333:6333 -p 6334:6334 \
  -v qdrant_storage:/qdrant/storage:z \
  --name qdrant qdrant/qdrant
```

After starting the Qdrant server, you need to configure its URL and API key (if authorization is used) in your environment variables:

```bash
export QDRANT_URL="http://localhost:6333"
export QDRANT_API_KEY="your-qdrant-api-key"  # if authorization is used
```
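Before building graphs you may want to confirm the instance is reachable. A minimal stdlib check is sketched below; it assumes Qdrant's `/healthz` liveness endpoint and the `QDRANT_URL` variable shown above, and returns `False` rather than raising when no server answers.

```python
import os
import urllib.error
import urllib.request

def qdrant_is_up(url=None, timeout=2.0):
    """Return True if a Qdrant server answers on its HTTP port."""
    base = url or os.environ.get("QDRANT_URL", "http://localhost:6333")
    try:
        # /healthz is Qdrant's liveness endpoint (assumed here).
        with urllib.request.urlopen(base.rstrip("/") + "/healthz", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

print(qdrant_is_up())
```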
## Enhancing Performance with Tavily Web Search

For better performance and more contextually aware data product plans, it is recommended to use the Tavily web search tool. This allows the planning agent to research industry best practices and common metrics related to your query.

To enable this feature, get an API key from [Tavily](https://tavily.com/) and set it as an environment variable:

```bash
export TAVILY_API_KEY="your-tavily-api-key"
```
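The commit notes that the Tavily key is handled conditionally, so the planner simply runs without web search when no key is present. A sketch of that pattern follows; the function and the tool placeholder are hypothetical, not Intugle's API.

```python
import os

def maybe_enable_web_search(tools):
    """Append a web-search tool only when a Tavily key is configured (pattern sketch)."""
    if os.environ.get("TAVILY_API_KEY"):
        tools.append("tavily_web_search")  # stand-in for the real tool object
    return tools

os.environ.pop("TAVILY_API_KEY", None)
print(maybe_enable_web_search([]))  # [] when no key is set

os.environ["TAVILY_API_KEY"] = "your-tavily-api-key"
print(maybe_enable_web_search([]))  # ['tavily_web_search']
```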

docsite/docs/core-concepts/data-product/index.md

Lines changed: 1 addition & 0 deletions

@@ -93,3 +93,4 @@ The `DataProduct` class provides a powerful way to query your connected data with
 * **[Aggregations](./aggregations.md)**: Understand how to perform grouping and aggregation functions like `COUNT` and `SUM`.
 * **[Joins](./joins.md)**: Discover how the builder automatically handles joins between tables.
 * **[Advanced Examples](./advanced-examples.md)**: See how to combine these concepts to build complex data products.
+* **[Conceptual Search](./conceptual-search.md)**: Learn how to generate data products from natural language queries.

docsite/docs/core-concepts/semantic-intelligence/semantic-search.md

Lines changed: 1 addition & 1 deletion

@@ -131,7 +131,7 @@ from intugle.semantic_search import SemanticSearch

 # This assumes your project's .yml files are in the default location.
 # You can also specify the path to your models directory:
-# search_client = SemanticSearch(project_base="/path/to/your/models")
+# search_client = SemanticSearch(models_dir_path="/path/to/your/models")
 search_client = SemanticSearch()

 # 1. Initialize the search index.

docsite/docs/examples.md

Lines changed: 1 addition & 0 deletions

@@ -13,6 +13,7 @@ For a detailed, hands-on introduction to the project, please see our quickstart
 | **Tech Manufacturing** | [`quickstart_tech_manufacturing.ipynb`](https://github.com/Intugle/data-tools/blob/main/notebooks/quickstart_tech_manufacturing.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Intugle/data-tools/blob/main/notebooks/quickstart_tech_manufacturing.ipynb) |
 | **FMCG** | [`quickstart_fmcg.ipynb`](https://github.com/Intugle/data-tools/blob/main/notebooks/quickstart_fmcg.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Intugle/data-tools/blob/main/notebooks/quickstart_fmcg.ipynb) |
 | **Sports Media** | [`quickstart_sports_media.ipynb`](https://github.com/Intugle/data-tools/blob/main/notebooks/quickstart_sports_media.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Intugle/data-tools/blob/main/notebooks/quickstart_sports_media.ipynb) |
+| **Conceptual Search** | [`quickstart_conceptual_search.ipynb`](https://github.com/Intugle/data-tools/blob/main/notebooks/quickstart_conceptual_search.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Intugle/data-tools/blob/main/notebooks/quickstart_conceptual_search.ipynb) |
 | **Databricks Unity Catalog [Health Care]** | [`quickstart_healthcare_databricks.ipynb`](https://github.com/Intugle/data-tools/blob/main/notebooks/quickstart_healthcare_databricks.ipynb) | Databricks Notebook Only |
 | **Snowflake Horizon Catalog [ FMCG ]** | [`quickstart_fmcg_snowflake.ipynb`](https://github.com/Intugle/data-tools/blob/main/notebooks/quickstart_fmcg_snowflake.ipynb) | Snowflake Notebook Only |
 | **Native Snowflake with Cortex Analyst [ Tech Manufacturing ]** | [`quickstart_native_snowflake.ipynb`](https://github.com/Intugle/data-tools/blob/main/notebooks/quickstart_native_snowflake.ipynb) | [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Intugle/data-tools/blob/main/notebooks/quickstart_native_snowflake.ipynb) |
