Merged
5 changes: 4 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,8 +1,11 @@
## Changelog

### v1.5.1 (February 23, 2026)
### v1.5.1 (March 30, 2026)
- Fix challenge learner.
- Update requirements.
- Update documentation website.
- Add `rag` parameter to `LearnerPipeline` and document it with examples.
- Minor bug fixes in LLM-Augmenter.

### v1.5.0 (February 5, 2026)
- Fix challenge learners
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -134,7 +134,9 @@ print(metrics)

Other available learners:
- [LLM-Based Learner](https://ontolearner.readthedocs.io/learners/llm.html)
- [Retriever-Based Learner](https://ontolearner.readthedocs.io/learners/retrieval.html)
- [RAG-Based Learner](https://ontolearner.readthedocs.io/learners/rag.html)
- [LLMs4OL Challenge Learners](https://ontolearner.readthedocs.io/learners/llms4ol.html)

---

6 changes: 2 additions & 4 deletions docs/source/index.rst
Original file line number Diff line number Diff line change
@@ -1,5 +1,3 @@


.. raw:: html

<div align="center">
@@ -109,8 +107,8 @@ Working with OntoLearner is straightforward:
random_state=42
)

# Initialize a multi-component learning pipeline (retriever + LLM)
# This configuration enables a Retrieval-Augmented Generation (RAG) setup
# RAG can be configured either by passing both IDs (shown here),
# or by passing a prebuilt `rag=` learner object.
pipeline = LearnerPipeline(
retriever_id='sentence-transformers/all-MiniLM-L6-v2',
llm_id='Qwen/Qwen2.5-0.5B-Instruct',
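The comment in the snippet above names two equivalent ways to configure RAG. A minimal, library-free sketch of that distinction follows; the class names here are illustrative stand-ins, not the real OntoLearner API:

```python
# Illustrative stand-ins only: these minimal classes mimic the two RAG
# configuration styles described above. They are NOT the OntoLearner API.

class Retriever:
    def __init__(self, model_id):
        self.model_id = model_id

class LLM:
    def __init__(self, model_id):
        self.model_id = model_id

class RAG:
    """A prebuilt RAG learner composed of a retriever and an LLM."""
    def __init__(self, retriever, llm):
        self.retriever = retriever
        self.llm = llm

class Pipeline:
    """Accepts either component IDs (auto-composes RAG) or a prebuilt RAG."""
    def __init__(self, retriever_id=None, llm_id=None, rag=None):
        if rag is not None:
            self.rag = rag  # style 2: prebuilt learner object
        elif retriever_id and llm_id:
            self.rag = RAG(Retriever(retriever_id), LLM(llm_id))  # style 1
        else:
            raise ValueError("provide both IDs or a prebuilt rag learner")

# Style 1: pass both component IDs.
p1 = Pipeline(retriever_id="all-MiniLM-L6-v2", llm_id="Qwen2.5-0.5B-Instruct")
# Style 2: pass a prebuilt RAG object.
p2 = Pipeline(rag=RAG(Retriever("all-MiniLM-L6-v2"), LLM("Qwen2.5-0.5B-Instruct")))

# Both styles end up with an equivalent composed learner.
assert p1.rag.llm.model_id == p2.rag.llm.model_id
```

Either style yields the same composed learner; style 2 is useful when the components need custom construction before being handed to the pipeline.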
4 changes: 2 additions & 2 deletions docs/source/learners/llm.rst
Original file line number Diff line number Diff line change
@@ -93,7 +93,7 @@ You will see the evaluation results.

Pipeline Usage
-----------------------
The OntoLearner package also offers a streamlined ``LearnerPipeline`` class that simplifies the entire process of initializing, training, predicting, and evaluating a RAG setup into a single call. This is particularly useful for rapid experimentation and deployment.
The OntoLearner package also offers a streamlined ``LearnerPipeline`` class that simplifies initialization, training, prediction, and evaluation into a single call. In this section, we run the pipeline in **LLM-only** mode by setting ``llm_id`` only.

.. code-block:: python

@@ -113,7 +113,7 @@

# Set up the learner pipeline using a lightweight instruction-tuned LLM
pipeline = LearnerPipeline(
llm_id='Qwen/Qwen2.5-0.5B-Instruct', # Small-scale LLM for reasoning over term-type assignments
llm_id='Qwen/Qwen2.5-0.5B-Instruct', # LLM-only mode
hf_token='...', # Hugging Face access token for loading gated models
batch_size=32 # Batch size for parallel inference (if applicable)
)
22 changes: 15 additions & 7 deletions docs/source/learners/rag.rst
Original file line number Diff line number Diff line change
@@ -25,8 +25,8 @@ We start by importing necessary components from the ontolearner package, loading
AgrO, # Example agricultural ontology
train_test_split, # Helper function for data splitting
LabelMapper, # Maps ontology labels to/from textual representations
StandardizedPrompting # Standard prompting strategy across tasks
evaluation_report
StandardizedPrompting, # Standard prompting strategy across tasks
evaluation_report,
)

# Load the AgrO ontology (an agricultural domain ontology)
@@ -99,16 +99,24 @@ To build a RAG model, you first initialize its constituent parts: an LLM learner

Pipeline Usage
---------------------
Similar to LLM and Retrieval learner, RAG Learner is also callable via streamlined ``LearnerPipeline`` class that simplifies the entire learning process.
Similar to the LLM and Retrieval learners, RAG is also callable via the ``LearnerPipeline`` class. You can run RAG in two equivalent ways:

You initialize the ``LearnerPipeline`` by directly providing the ``retriever_id``, ``llm_id``, and other parameters such as ``hf_token``, ``batch_size``, and ``top_k`` (the number of top retrievals to include in RAG prompting). You then call the ``pipeline`` instance with your ``train_data`` and ``test_data``, set ``evaluate=True`` to compute metrics, and specify the ``task`` (e.g., ``'term-typing'``).
1. Provide both ``retriever_id`` and ``llm_id`` (pipeline auto-composes an ``AutoRAGLearner``).
2. Provide a prebuilt ``rag`` learner object for custom configurations.

.. code-block:: python

# Import core modules from the OntoLearner library
from ontolearner import LearnerPipeline, AgrO, train_test_split
from ontolearner import (
LearnerPipeline,
AutoLLMLearner,
AutoRetrieverLearner,
AutoRAGLearner,
LabelMapper,
StandardizedPrompting,
AgrO,
train_test_split,
)

# Load the AgrO ontology, which contains concepts related to wines, their properties, and categories
ontology = AgrO()
ontology.load() # Load entities, types, and structured term annotations from the ontology
ontological_data = ontology.extract()
11 changes: 9 additions & 2 deletions docs/source/learners/retrieval.rst
Original file line number Diff line number Diff line change
@@ -81,7 +81,7 @@ When working with large contexts, the retriever model may encounter memory issue
Pipeline Usage
-----------------------

Similar to LLM learner, Retrieval Learner is also callable via streamlined ``LearnerPipeline`` class that simplifies the entire process learning.
Similar to the LLM learner, the Retrieval learner is also callable via the streamlined ``LearnerPipeline`` class. In this section we use **retriever-only** mode by providing ``retriever_id`` only.

.. code-block:: python

@@ -100,7 +100,7 @@ Similar to LLM learner, Retrieval Learner is also callable via streamlined ``Lea
)

# Initialize the learning pipeline using a dense retriever
# This configuration uses sentence embeddings to match similar relational contexts
# This is retriever-only mode (no LLM component)
pipeline = LearnerPipeline(
retriever_id='sentence-transformers/all-MiniLM-L6-v2', # Hugging Face model ID for retrieval
batch_size=10, # Number of samples to process per batch (if batching is enabled internally)
@@ -125,6 +125,10 @@ Similar to LLM learner, Retrieval Learner is also callable via streamlined ``Lea
# Print the full output dictionary (includes predictions)
print(outputs)

.. note::

For RAG with ``LearnerPipeline``, see the `RAG learner guide <https://ontolearner.readthedocs.io/learners/rag.html>`_.

.. hint::
See `Learning Tasks <https://ontolearner.readthedocs.io/learning_tasks/llms4ol.html>`_ for possible tasks within Learners.

@@ -372,6 +376,9 @@ Here the ``LLMAugmentedRetrieverLearner`` is the high-level wrapper that orchest
augments = {"config": llm_augmenter_generator.get_config()}
augments[task] = llm_augmenter_generator.augment(ontological_data, task=task)

base_retriever = LLMAugmentedRetriever()
learner = LLMAugmentedRetrieverLearner(base_retriever=base_retriever)

learner.set_augmenter(augments)
learner.load(model_id="Qwen/Qwen3-Embedding-8B")

6 changes: 6 additions & 0 deletions docs/source/package_reference/pipeline.rst
Original file line number Diff line number Diff line change
@@ -1,6 +1,12 @@
Learner Pipeline
====================

``LearnerPipeline`` supports three modes:

- retriever-only (set ``retriever_id``)
- LLM-only (set ``llm_id``)
- RAG (set both ``retriever_id`` and ``llm_id``, or provide a prebuilt ``rag`` learner)
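The three modes above are selected from the constructor arguments. The following is a hypothetical sketch of that selection logic, written stand-alone for illustration; the real ``LearnerPipeline`` internals may differ:

```python
# Hypothetical sketch of the mode selection described above;
# the actual LearnerPipeline implementation may differ.

def resolve_mode(retriever_id=None, llm_id=None, rag=None):
    """Map constructor arguments to one of the three documented modes."""
    if rag is not None:
        return "rag"              # prebuilt learner object
    if retriever_id and llm_id:
        return "rag"              # auto-composed from both IDs
    if retriever_id:
        return "retriever-only"
    if llm_id:
        return "llm-only"
    raise ValueError("set retriever_id, llm_id, or rag")

assert resolve_mode(retriever_id="all-MiniLM-L6-v2") == "retriever-only"
assert resolve_mode(llm_id="Qwen2.5-0.5B-Instruct") == "llm-only"
assert resolve_mode(retriever_id="r", llm_id="l") == "rag"
```

In other words, a prebuilt ``rag`` object takes precedence, both IDs together compose a RAG setup, and a single ID selects the corresponding single-component mode.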

LearnerPipeline
---------------------
.. autoclass:: ontolearner._learner.LearnerPipeline
34 changes: 33 additions & 1 deletion docs/source/quickstart.rst
Original file line number Diff line number Diff line change
@@ -137,7 +137,11 @@ To align with the machine learning workflow, once the ontology is loaded, and ontolog
)


Once the data is split into training and testing sets, you can apply learning models to the ontology learning tasks. OntoLearner supports multiple modeling approaches, including retrieval-based methods, Large Language Model (LLM)-based techniques, and Retrieval-Augmented Generation (RAG) strategies. The ``LearnerPipeline`` within OntoLearner is designed for ease of use, abstracting away the complexities of loading models and preparing datasets or data loaders. You can configure the pipeline with your choice of LLMs, retrievers, or RAG components.
Once the data is split into training and testing sets, you can apply learning models to the ontology learning tasks. OntoLearner supports multiple modeling approaches, including retrieval-based methods, Large Language Model (LLM)-based techniques, and Retrieval-Augmented Generation (RAG) strategies. The ``LearnerPipeline`` supports all three modes:

- Retriever-only: set ``retriever_id``
- LLM-only: set ``llm_id``
- RAG: set both ``retriever_id`` and ``llm_id`` to auto-compose an ``AutoRAGLearner``, or pass a prebuilt ``rag`` learner.

In the example below, we configure a RAG-based learner by specifying the Qwen LLM (`Qwen/Qwen2.5-0.5B-Instruct <https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct>`_) and a retriever based on a sentence-transformer model (`all-MiniLM-L6-v2 <https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2>`_):

@@ -165,6 +169,34 @@ In the example below, we configure a RAG-based learner by specifying the Qwen LL
- ``llm_id``: The instruction-following language model used to generate candidate outputs.
- ``top_k``: Number of retrieved examples passed to the LLM (used in RAG setup).
- ``hf_token``: Required for loading gated models from Hugging Face.
- ``rag``: Optional prebuilt ``AutoRAGLearner`` (or compatible) object for custom RAG setups.

If you already created a RAG learner object, you can pass it directly:

.. code-block:: python

from ontolearner import (
LearnerPipeline,
AutoLLMLearner,
AutoRetrieverLearner,
AutoRAGLearner,
LabelMapper,
StandardizedPrompting,
)

retriever = AutoRetrieverLearner(top_k=3)
llm = AutoLLMLearner(
prompting=StandardizedPrompting,
label_mapper=LabelMapper(),
token='<YOUR_HF_TOKEN>'
)
rag = AutoRAGLearner(retriever=retriever, llm=llm)

pipeline = LearnerPipeline(
rag=rag,
retriever_id='sentence-transformers/all-MiniLM-L6-v2',
llm_id='Qwen/Qwen2.5-0.5B-Instruct'
)

Once configured, the pipeline is executed on the training and test data:

6 changes: 3 additions & 3 deletions examples/llm_learner_alexbek_rag_term_typing.py
Original file line number Diff line number Diff line change
@@ -27,11 +27,11 @@
output_dir="./results/",
)

# Build the pipeline and pass raw structured objects end-to-end.
# We place the RAG learner in the llm slot and set llm_id accordingly.
# Build the pipeline and pass the dedicated RAG learner explicitly.
pipe = LearnerPipeline(
llm=rag_learner,
rag=rag_learner,
llm_id="Qwen/Qwen2.5-0.5B-Instruct",
retriever_id="sentence-transformers/all-MiniLM-L6-v2",
ontologizer_data=True,
)
