
Conversation

@major
Contributor

@major major commented Jan 15, 2026

Description

Follow-up improvements to the rlsapi v1 /infer endpoint:

  • Add RateLimitError (429) and APIStatusError (500) handling
  • Add llm_calls_failures_total metrics tracking
  • Document skip_rag field as reserved for future use
  • Include system info (OS, version, arch) in LLM instructions
  • Standardize MOCK_AUTH test constant to use project-standard pattern

Type of change

  • Bug fix
  • Optimization
  • Documentation Update
  • Unit tests improvement

Tools used to create PR

  • Assisted-by: Claude
  • Generated by: N/A

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

Unit tests added for _build_instructions. All existing tests pass.

Summary by CodeRabbit

  • New Features

    • Inference requests can include system context and explicit system instructions.
  • Bug Fixes

    • Enhanced handling and HTTP responses for rate-limit and server-error cases.
    • Improved metrics for LLM call failures and related error paths.
  • Chores

    • Updated field documentation to mark a RAG-related option as reserved for future use.
  • Tests

    • Expanded tests and fixtures to cover system-instruction building and error scenarios.

✏️ Tip: You can customize this high-level summary in your review settings.

@openshift-ci

openshift-ci bot commented Jan 15, 2026

Hi @major. Thanks for your PR.

I'm waiting for a lightspeed-core member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@coderabbitai
Contributor

coderabbitai bot commented Jan 15, 2026

Walkthrough

Adds system-context instruction construction to the RLSAPI v1 inference flow, updates retrieve_simple_response to accept instructions, propagates those instructions into LLM calls, and extends error handling and metrics for rate limits and API status errors (429 and 500 responses).

Changes

Cohort / File(s) Summary
Endpoint Logic & Error Handling
src/app/endpoints/rlsapi_v1.py
Added _build_instructions(systeminfo) to compose system-aware instructions; changed retrieve_simple_response(question: str) → retrieve_simple_response(question: str, instructions: str) and pass instructions through the infer flow; added imports for APIStatusError, RateLimitError, InternalServerErrorResponse, QuotaExceededResponse, and RlsapiV1SystemInfo; added handling and metrics increments for rate limits (429) and API status errors (500).
Request Model Docs
src/models/rlsapi/requests.py
Updated RlsapiV1InferRequest.skip_rag docstring to mark RAG as not yet implemented (documentation only).
Tests & Mocks
tests/unit/app/endpoints/test_rlsapi_v1.py
Exported _build_instructions for testing; added mock_auth_resolvers fixture and centralized auth mocks; added parametric tests for _build_instructions (full/partial/empty system info); updated tests to pass explicit system prompt and adapt to signature changes; updated global mock auth values.
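
The walkthrough above describes the new helper only at a high level; a minimal sketch of what it might look like follows. The field names (os, version, arch), the import paths, and the exact prompt formatting are assumptions, not the PR's actual code.

from constants import DEFAULT_SYSTEM_PROMPT  # assumed import path
from models.rlsapi.requests import RlsapiV1SystemInfo  # assumed import path


def _build_instructions(systeminfo: RlsapiV1SystemInfo | None) -> str:
    """Compose LLM instructions from the base prompt plus optional system context."""
    parts: list[str] = []
    if systeminfo is not None:
        # Include only the fields that are actually populated.
        if systeminfo.os:
            parts.append(f"Operating system: {systeminfo.os}")
        if systeminfo.version:
            parts.append(f"Version: {systeminfo.version}")
        if systeminfo.arch:
            parts.append(f"Architecture: {systeminfo.arch}")
    if not parts:
        # No usable system context: fall back to the base prompt.
        return DEFAULT_SYSTEM_PROMPT
    return DEFAULT_SYSTEM_PROMPT + "\n\nSystem context:\n" + "\n".join(parts)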

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant InferEndpoint
    participant BuildInstructions
    participant RetrieveSimpleResponse
    participant LlamaStackClient
    participant Metrics

    Client->>InferEndpoint: POST /infer (question, systeminfo)
    InferEndpoint->>BuildInstructions: _build_instructions(systeminfo)
    alt systeminfo has context
        BuildInstructions->>BuildInstructions: combine DEFAULT_SYSTEM_PROMPT + OS/Version/Arch
    else empty/partial
        BuildInstructions->>BuildInstructions: return DEFAULT_SYSTEM_PROMPT (or partial)
    end
    BuildInstructions-->>InferEndpoint: instructions
    InferEndpoint->>RetrieveSimpleResponse: retrieve_simple_response(question, instructions)
    RetrieveSimpleResponse->>LlamaStackClient: call LLM with instructions
    alt Success
        LlamaStackClient-->>RetrieveSimpleResponse: response
        RetrieveSimpleResponse-->>InferEndpoint: result
        InferEndpoint->>Metrics: increment llm_calls_total
        InferEndpoint-->>Client: 200 OK (response)
    else RateLimitError (429)
        LlamaStackClient-->>RetrieveSimpleResponse: RateLimitError
        RetrieveSimpleResponse-->>InferEndpoint: error
        InferEndpoint->>Metrics: increment llm_calls_failures_total
        InferEndpoint-->>Client: 429 QuotaExceededResponse
    else APIStatusError (500)
        LlamaStackClient-->>RetrieveSimpleResponse: APIStatusError
        RetrieveSimpleResponse-->>InferEndpoint: error
        InferEndpoint->>Metrics: increment llm_calls_failures_total
        InferEndpoint-->>Client: 500 InternalServerErrorResponse.generic()
    end
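
The 429 and 500 branches in the diagram correspond to exception handlers roughly like the sketch below. The response classes, metric name, and HTTPException construction are taken from this review; the wrapper function name, the 500-path log message, and the import locations are assumptions, and the pre-existing APIConnectionError → 503 handler is omitted.

from fastapi import HTTPException
from llama_stack_client import APIStatusError, RateLimitError
from models.responses import InternalServerErrorResponse, QuotaExceededResponse  # assumed path
import metrics  # assumed module name


async def _guarded_infer_call(question: str, instructions: str, request_id: str) -> str:
    """Sketch of the guarded LLM call inside the infer endpoint."""
    try:
        return await retrieve_simple_response(question, instructions)
    except RateLimitError as e:
        # Expected operational condition: log without a stack trace.
        metrics.llm_calls_failures_total.inc()
        logger.error("Rate limit exceeded for request %s: %s", request_id, e)
        response = QuotaExceededResponse(response="The quota has been exceeded", cause=str(e))
        raise HTTPException(**response.model_dump()) from e
    except APIStatusError as e:
        # RateLimitError is a subclass of APIStatusError, so it must be caught first.
        metrics.llm_calls_failures_total.inc()
        logger.exception("API status error for request %s", request_id)  # hypothetical message
        raise HTTPException(**InternalServerErrorResponse.generic().model_dump()) from e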

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Suggested labels

ok-to-test

Suggested reviewers

  • tisnik
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check ✅ Passed: Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check ✅ Passed: The title accurately summarizes the main changes: adding error handling (RateLimitError, APIStatusError), metrics (llm_calls_failures_total), and system context integration into the rlsapi v1 endpoint.
  • Docstring Coverage ✅ Passed: Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e32d7c and 48a7c8f.

📒 Files selected for processing (3)
  • src/app/endpoints/rlsapi_v1.py
  • src/models/rlsapi/requests.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends
Use Llama Stack imports: from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = logging.getLogger(__name__) pattern for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types, using typing_extensions.Self for model validators
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional parameters
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use ABC for abstract base classes with @abstractmethod decorators
Complete type annotations for all class attributes
Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with sections: Args, Returns, Raises, Attributes
Run uv run make format to auto-format code with black and ruff before completion
Run uv run make verify to run all linters (black, pyl...

Files:

  • src/models/rlsapi/requests.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
  • src/app/endpoints/rlsapi_v1.py
src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Use @field_validator and @model_validator for custom validation in Pydantic models
Extend BaseModel for data Pydantic models
Use @model_validator and @field_validator for Pydantic model validation

Files:

  • src/models/rlsapi/requests.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest for all unit and integration tests, not unittest
Use pytest-mock for AsyncMock objects in tests
Use MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") pattern for authentication mocks in tests

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
tests/unit/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/unit/**/*.py: Unit tests require 60% code coverage
Write unit tests covering new functionality before completion

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/rlsapi_v1.py
🧠 Learnings (6)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

  • src/models/rlsapi/requests.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/**/*.py : Use `MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token")` pattern for authentication mocks in tests

Applied to files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/**/*.py : Use `pytest-mock` for AsyncMock objects in tests

Applied to files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Handle `APIConnectionError` from Llama Stack in error handling

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Use FastAPI dependencies: `from fastapi import APIRouter, HTTPException, Request, status, Depends`

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to src/app/**/*.py : Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
🧬 Code graph analysis (1)
src/app/endpoints/rlsapi_v1.py (2)
src/models/responses.py (6)
  • InternalServerErrorResponse (1787-1945)
  • QuotaExceededResponse (1675-1784)
  • ServiceUnavailableResponse (1948-1979)
  • UnauthorizedResponse (1302-1383)
  • UnprocessableEntityResponse (1627-1672)
  • generic (1842-1853)
src/models/rlsapi/requests.py (2)
  • RlsapiV1InferRequest (123-200)
  • RlsapiV1SystemInfo (42-66)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-pr
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: E2E: library mode / ci
🔇 Additional comments (12)
src/models/rlsapi/requests.py (1)

129-129: Documentation consistency addressed.

The skip_rag attribute docstring (line 129) and Field description (line 155) are now aligned, both stating "Reserved for future use. RAG retrieval is not yet implemented." This resolves the previous inconsistency.

Also applies to: 153-157
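
For reference, the aligned field would look roughly like this; the exact Field arguments in RlsapiV1InferRequest beyond the description text and the False default are assumptions:

skip_rag: bool = Field(
    default=False,
    description="Reserved for future use. RAG retrieval is not yet implemented.",
)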

src/app/endpoints/rlsapi_v1.py (6)

12-12: LGTM!

The new imports for APIStatusError and RateLimitError from llama_stack_client are correctly added to support the new exception handling.


46-48: LGTM!

The 429 and 500 response codes are correctly added to the OpenAPI schema, aligning with the new exception handlers for RateLimitError and APIStatusError.


52-78: LGTM!

The _build_instructions function is well-implemented:

  • Correctly filters empty system info fields
  • Returns the base prompt when no context is available
  • Docstring follows Google conventions with Args/Returns sections
  • Type annotations are complete

114-142: LGTM!

The signature change to accept instructions is correctly implemented. The docstring is updated appropriately, and the parameter is passed through to the LLM client.


182-189: LGTM!

The instruction building and passing flow is correctly integrated into the endpoint. System info is extracted from the request context and transformed into LLM instructions.


200-211: Past review concern addressed, exception ordering verified.

The RateLimitError handler no longer attempts to access the non-existent model attribute—it now uses a generic QuotaExceededResponse directly. The exception ordering is correct: RateLimitError (subclass) is caught before APIStatusError (parent class).

One consideration: logger.exception at line 209 will log the full stack trace, which is appropriate for unexpected API errors where debugging context is valuable.

tests/unit/app/endpoints/test_rlsapi_v1.py (5)

34-34: LGTM!

MOCK_AUTH now follows the coding guidelines pattern exactly: ("mock_user_id", "mock_username", False, "mock_token"). Based on learnings and coding guidelines.


91-95: LGTM!

Good refactoring to centralize authorization mocking into a reusable fixture. This reduces duplication and aligns with pytest best practices.


109-144: LGTM!

Comprehensive parametrized tests for _build_instructions covering full, partial, and empty system info scenarios. The test structure is clean with clear positive and negative assertions.


202-205: LGTM!

Tests correctly updated to pass the instructions parameter to retrieve_simple_response, matching the new signature.
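
For illustration, inside an async test a call under the new signature looks like this (the question string is hypothetical; constants.DEFAULT_SYSTEM_PROMPT is what the tests reportedly pass):

result = await retrieve_simple_response("How do I check disk usage?", constants.DEFAULT_SYSTEM_PROMPT)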


231-246: LGTM!

Endpoint tests are correctly refactored to use the mock_auth_resolvers fixture, maintaining test functionality while reducing boilerplate.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@major major force-pushed the fix/rlsapi-cleanup branch from 42bf733 to 233ccfa on January 15, 2026 at 16:44
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/app/endpoints/rlsapi_v1.py`:
- Around line 200-210: Remove the ineffective getattr call on RateLimitError and
always return the generic QuotaExceededResponse: stop trying to read a
non-existent model attribute from the exception (remove used_model = getattr(e,
"model", "") and the if used_model: branch) and construct response =
QuotaExceededResponse(response="The quota has been exceeded", cause=str(e))
before raising HTTPException(**response.model_dump()) from e; keep the
metrics.llm_calls_failures_total.inc() and logger.error("Rate limit exceeded for
request %s: %s", request_id, e) as-is.

In `@src/models/rlsapi/requests.py`:
- Around line 153-157: Update the docstring for the skip_rag attribute to match
the Field description: replace the existing "Whether to skip RAG retrieval
(default False)" docstring with "Reserved for future use. RAG retrieval is not
yet implemented." so the docstring and the Field(... description=...) are
consistent for the skip_rag field in the requests model.
🧹 Nitpick comments (2)
tests/unit/app/endpoints/test_rlsapi_v1.py (1)

73-97: Consider adding fixtures for RateLimitError and APIStatusError.

The endpoint now handles RateLimitError (429) and APIStatusError (500) in addition to APIConnectionError (503), but there are no unit test fixtures or tests covering these error paths. Consider adding test coverage for completeness.

Example fixtures for missing error scenarios
@pytest.fixture(name="mock_rate_limit_error")
def mock_rate_limit_error_fixture(mocker: MockerFixture) -> None:
    """Mock responses.create() to raise RateLimitError."""
    from llama_stack_client import RateLimitError
    mock_error = RateLimitError("Rate limit exceeded")
    mock_error.model = "test-model"
    _setup_responses_mock(
        mocker,
        mocker.AsyncMock(side_effect=mock_error),
    )


@pytest.fixture(name="mock_api_status_error")
def mock_api_status_error_fixture(mocker: MockerFixture) -> None:
    """Mock responses.create() to raise APIStatusError."""
    from llama_stack_client import APIStatusError
    _setup_responses_mock(
        mocker,
        mocker.AsyncMock(side_effect=APIStatusError("API error", response=mocker.Mock(), body=None)),
    )
tests/integration/endpoints/test_rlsapi_v1_integration.py (1)

56-84: Consider extracting shared test helpers.

The _create_mock_response_output function is identical to the one in tests/unit/app/endpoints/test_rlsapi_v1.py, and _setup_responses_mock is similar. Consider extracting these to a shared test utility module (e.g., tests/utils/mock_helpers.py) to reduce duplication.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1a12376 and 42bf733.

📒 Files selected for processing (4)
  • src/app/endpoints/rlsapi_v1.py
  • src/models/rlsapi/requests.py
  • tests/integration/endpoints/test_rlsapi_v1_integration.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧰 Additional context used
📓 Path-based instructions (6)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends
Use Llama Stack imports: from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = logging.getLogger(__name__) pattern for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types, using typing_extensions.Self for model validators
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional parameters
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use ABC for abstract base classes with @abstractmethod decorators
Complete type annotations for all class attributes
Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with sections: Args, Returns, Raises, Attributes
Run uv run make format to auto-format code with black and ruff before completion
Run uv run make verify to run all linters (black, pyl...

Files:

  • src/models/rlsapi/requests.py
  • src/app/endpoints/rlsapi_v1.py
  • tests/integration/endpoints/test_rlsapi_v1_integration.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Use @field_validator and @model_validator for custom validation in Pydantic models
Extend BaseModel for data Pydantic models
Use @model_validator and @field_validator for Pydantic model validation

Files:

  • src/models/rlsapi/requests.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/rlsapi_v1.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest for all unit and integration tests, not unittest
Use pytest-mock for AsyncMock objects in tests
Use MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") pattern for authentication mocks in tests

Files:

  • tests/integration/endpoints/test_rlsapi_v1_integration.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
tests/integration/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Integration tests require 10% code coverage

Files:

  • tests/integration/endpoints/test_rlsapi_v1_integration.py
tests/unit/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/unit/**/*.py: Unit tests require 60% code coverage
Write unit tests covering new functionality before completion

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧠 Learnings (5)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

  • src/models/rlsapi/requests.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Use FastAPI dependencies: `from fastapi import APIRouter, HTTPException, Request, status, Depends`

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Handle `APIConnectionError` from Llama Stack in error handling

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
  • tests/integration/endpoints/test_rlsapi_v1_integration.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Use Llama Stack imports: `from llama_stack_client import AsyncLlamaStackClient`

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to src/app/**/*.py : Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling

Applied to files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧬 Code graph analysis (2)
src/app/endpoints/rlsapi_v1.py (4)
src/models/responses.py (6)
  • InternalServerErrorResponse (1787-1945)
  • QuotaExceededResponse (1675-1784)
  • ServiceUnavailableResponse (1948-1979)
  • UnauthorizedResponse (1302-1383)
  • model (1736-1749)
  • generic (1842-1853)
src/models/rlsapi/requests.py (2)
  • RlsapiV1InferRequest (123-200)
  • RlsapiV1SystemInfo (42-66)
src/models/rlsapi/responses.py (2)
  • RlsapiV1InferData (10-27)
  • RlsapiV1InferResponse (30-54)
src/utils/responses.py (1)
  • extract_text_from_response_output_item (6-56)
tests/integration/endpoints/test_rlsapi_v1_integration.py (2)
tests/unit/app/endpoints/test_rlsapi_v1.py (2)
  • _create_mock_response_output (64-70)
  • _setup_responses_mock (37-50)
src/client.py (1)
  • get_client (101-115)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: build-pr
  • GitHub Check: E2E: server mode / ci
🔇 Additional comments (10)
tests/unit/app/endpoints/test_rlsapi_v1.py (4)

37-51: LGTM!

The _setup_responses_mock helper cleanly encapsulates the mock setup for the Responses API, reducing duplication across fixtures. The mock chain correctly mirrors the AsyncLlamaStackClientHolder().get_client().responses.create() call pattern.


103-127: LGTM!

The _build_instructions tests provide good coverage of the three main scenarios: full system info, partial system info, and empty system info. The assertions correctly verify both the presence of expected content and the absence of fields that should be excluded.


179-207: LGTM!

The retrieve_simple_response tests are properly updated to pass the instructions parameter and provide good coverage of success, empty output, and API connection error scenarios.


213-295: LGTM!

The infer_endpoint tests provide solid coverage of the main flows including minimal requests, full context, unique request ID generation, connection error handling, and fallback behavior. The request validation tests are comprehensive.

src/app/endpoints/rlsapi_v1.py (3)

52-78: LGTM!

The _build_instructions function is well-implemented with clear logic. It correctly handles the three cases: full system info, partial system info, and empty system info. The docstring follows Google conventions as per coding guidelines.


114-148: LGTM!

The retrieve_simple_response function is properly updated to use the Responses API. The text extraction pattern using extract_text_from_response_output_item is consistent with the utility function implementation, and the store=False parameter is appropriate for stateless queries.
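
A rough sketch of that call pattern follows. The client holder, the extraction helper, and store=False come from elsewhere in this review; the import paths, the model placeholder, and the output shape are assumptions.

from client import AsyncLlamaStackClientHolder  # assumed import path
from utils.responses import extract_text_from_response_output_item  # assumed import path

MODEL_ID = "example/model"  # hypothetical placeholder for however the endpoint selects its model


async def retrieve_simple_response(question: str, instructions: str) -> str:
    """Query the LLM via the Responses API and return the extracted text."""
    client = AsyncLlamaStackClientHolder().get_client()
    response = await client.responses.create(
        model=MODEL_ID,
        input=question,
        instructions=instructions,
        store=False,  # stateless query; nothing persisted server-side
    )
    # Pull plain text out of the first output item (shape assumed here).
    return extract_text_from_response_output_item(response.output[0])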


211-215: LGTM!

Using logger.exception for APIStatusError is the right choice as it captures the full stack trace for unexpected API errors, while logger.error for RateLimitError is appropriate since it's an expected operational condition.

tests/integration/endpoints/test_rlsapi_v1_integration.py (3)

87-91: LGTM!

The fixture properly sets up the mock and returns the mock client, enabling tests to inspect call arguments when needed.


262-302: LGTM!

The test_rlsapi_v1_infer_input_source_combination test properly verifies that all input sources (question, stdin, attachments, terminal) are combined and passed to the LLM. The use of call_args.kwargs["input"] to inspect the actual input content is a good approach.
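
A minimal illustration of that inspection pattern (the mock variable name and the asserted fragments are hypothetical):

create_mock = mock_client.responses.create
create_mock.assert_awaited_once()
sent_input = create_mock.call_args.kwargs["input"]
assert "question text" in sent_input
assert "stdin contents" in sent_input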


191-222: LGTM!

The connection error test properly verifies the 503 response and error message. The manual mock setup is appropriate for this specific error scenario.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@major major force-pushed the fix/rlsapi-cleanup branch from 233ccfa to e7c9d1a on January 15, 2026 at 17:14
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tests/unit/app/endpoints/test_rlsapi_v1.py`:
- Line 34: Update the MOCK_AUTH test constant to follow the project's standard
pattern: replace the tuple ("test_user_id", "test_user", True, "test_token")
with ("mock_user_id", "mock_username", False, "mock_token"); modify the symbol
MOCK_AUTH in the test_rlsapi_v1.py file so tests use the standardized mock
values and non-admin flag.
🧹 Nitpick comments (2)
tests/unit/app/endpoints/test_rlsapi_v1.py (2)

64-70: Consider aligning mock structure with actual Responses API output.

The mock sets content = text as a string, but per the relevant code snippet for extract_text_from_response_output_item, content can be a string or list of fragments. The current mock works because the helper handles strings directly, but adding a comment clarifying this mock represents the simplified string-content case would help maintainability.


287-300: Missing test coverage for RateLimitError and APIStatusError handlers.

The endpoint now handles RateLimitError (returning 429) and APIStatusError (returning 500), but there are no unit tests for these code paths. Consider adding tests similar to test_infer_api_connection_error_returns_503:

📝 Suggested test additions
@pytest.fixture(name="mock_rate_limit_error")
def mock_rate_limit_error_fixture(mocker: MockerFixture) -> None:
    """Mock responses.create() to raise RateLimitError."""
    from llama_stack_client import RateLimitError
    mock_response = mocker.Mock()
    mock_response.status_code = 429
    _setup_responses_mock(
        mocker,
        mocker.AsyncMock(side_effect=RateLimitError(
            message="Rate limit exceeded",
            response=mock_response,
            body=None,
        )),
    )


@pytest.fixture(name="mock_api_status_error")
def mock_api_status_error_fixture(mocker: MockerFixture) -> None:
    """Mock responses.create() to raise APIStatusError."""
    from llama_stack_client import APIStatusError
    mock_response = mocker.Mock()
    mock_response.status_code = 500
    _setup_responses_mock(
        mocker,
        mocker.AsyncMock(side_effect=APIStatusError(
            message="Internal server error",
            response=mock_response,
            body=None,
        )),
    )


@pytest.mark.asyncio
async def test_infer_rate_limit_error_returns_429(
    mock_configuration: AppConfig,
    mock_rate_limit_error: None,
    mock_auth_resolvers: None,
) -> None:
    """Test /infer endpoint returns 429 when rate limit is exceeded."""
    request = RlsapiV1InferRequest(question="Test question")

    with pytest.raises(HTTPException) as exc_info:
        await infer_endpoint(infer_request=request, auth=MOCK_AUTH)

    assert exc_info.value.status_code == status.HTTP_429_TOO_MANY_REQUESTS


@pytest.mark.asyncio
async def test_infer_api_status_error_returns_500(
    mock_configuration: AppConfig,
    mock_api_status_error: None,
    mock_auth_resolvers: None,
) -> None:
    """Test /infer endpoint returns 500 on API status errors."""
    request = RlsapiV1InferRequest(question="Test question")

    with pytest.raises(HTTPException) as exc_info:
        await infer_endpoint(infer_request=request, auth=MOCK_AUTH)

    assert exc_info.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 233ccfa and e7c9d1a.

📒 Files selected for processing (3)
  • src/app/endpoints/rlsapi_v1.py
  • src/models/rlsapi/requests.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/models/rlsapi/requests.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends
Use Llama Stack imports: from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = logging.getLogger(__name__) pattern for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types, using typing_extensions.Self for model validators
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional parameters
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use ABC for abstract base classes with @abstractmethod decorators
Complete type annotations for all class attributes
Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with sections: Args, Returns, Raises, Attributes
Run uv run make format to auto-format code with black and ruff before completion
Run uv run make verify to run all linters (black, pyl...

Files:

  • src/app/endpoints/rlsapi_v1.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/rlsapi_v1.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest for all unit and integration tests, not unittest
Use pytest-mock for AsyncMock objects in tests
Use MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") pattern for authentication mocks in tests

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
tests/unit/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/unit/**/*.py: Unit tests require 60% code coverage
Write unit tests covering new functionality before completion

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧠 Learnings (5)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Handle `APIConnectionError` from Llama Stack in error handling

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Use FastAPI dependencies: `from fastapi import APIRouter, HTTPException, Request, status, Depends`

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Use Llama Stack imports: `from llama_stack_client import AsyncLlamaStackClient`

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to src/app/**/*.py : Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling

Applied to files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/**/*.py : Use `MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token")` pattern for authentication mocks in tests

Applied to files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧬 Code graph analysis (2)
src/app/endpoints/rlsapi_v1.py (2)
src/models/rlsapi/requests.py (2)
  • RlsapiV1InferRequest (123-200)
  • RlsapiV1SystemInfo (42-66)
src/utils/responses.py (1)
  • extract_text_from_response_output_item (6-56)
tests/unit/app/endpoints/test_rlsapi_v1.py (4)
src/app/endpoints/rlsapi_v1.py (1)
  • _build_instructions (52-78)
tests/unit/utils/auth_helpers.py (1)
  • mock_authorization_resolvers (8-26)
src/models/rlsapi/requests.py (2)
  • RlsapiV1SystemInfo (42-66)
  • RlsapiV1InferRequest (123-200)
src/utils/suid.py (1)
  • check_suid (19-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-pr
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: E2E: library mode / ci
🔇 Additional comments (9)
src/app/endpoints/rlsapi_v1.py (4)

10-33: LGTM! Imports are well-organized and follow coding guidelines.

The imports correctly use FastAPI dependencies, Llama Stack client types including APIConnectionError, APIStatusError, and RateLimitError, and use absolute imports for internal modules. Based on learnings and coding guidelines.


52-78: LGTM! Clean helper function with proper docstring and type annotations.

The _build_instructions function correctly composes the system prompt with optional context. The implementation:

  • Has a proper Google-style docstring
  • Uses type annotations correctly
  • Returns early when no context parts exist
  • Joins parts with readable formatting

114-148: LGTM! Responses API integration looks correct.

The migration to the Responses API is clean:

  • Properly casts to OpenAIResponseObject
  • Uses extract_text_from_response_output_item helper for text extraction
  • Sets store=False for stateless operation

188-211: Error handling is well-structured; previous review comment about RateLimitError.model has been addressed.

The error handling correctly:

  • Increments llm_calls_failures_total metric for each failure type
  • Uses appropriate log levels (error for rate limit, exception for API errors)
  • Maps to correct HTTP status codes (429, 500, 503)
  • Chains exceptions properly with from e

The previous review comment about the ineffective getattr(e, "model", "") call has been resolved — the code now directly uses QuotaExceededResponse without attempting to access a non-existent model attribute.

tests/unit/app/endpoints/test_rlsapi_v1.py (5)

37-51: LGTM! Well-structured mock setup helper.

The _setup_responses_mock helper centralizes Responses API mocking, reducing duplication across tests. The mock chain (client holder → client → responses → create) correctly mirrors the production code path.


91-94: LGTM! Good fixture encapsulation.

Wrapping mock_authorization_resolvers in a pytest fixture improves test readability and follows the project's fixture-based approach for dependency injection.


109-144: LGTM! Comprehensive parametrized tests for _build_instructions.

The test covers:

  • Full system info (all fields populated)
  • Partial system info (only OS)
  • Empty system info (fallback to base prompt)

The assertions check both expected inclusions and exclusions, providing good coverage.
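
A hedged sketch of the parametrization shape described here; the concrete values and whether RlsapiV1SystemInfo accepts partial construction or None are assumptions:

@pytest.mark.parametrize(
    ("systeminfo", "expected_contains", "expected_not_contains"),
    [
        (RlsapiV1SystemInfo(os="RHEL", version="9.4", arch="x86_64"), ["RHEL", "9.4", "x86_64"], []),
        (RlsapiV1SystemInfo(os="RHEL"), ["RHEL"], ["9.4", "x86_64"]),
        (None, [], ["RHEL", "9.4", "x86_64"]),
    ],
)
def test_build_instructions(
    systeminfo: RlsapiV1SystemInfo | None,
    expected_contains: list[str],
    expected_not_contains: list[str],
) -> None:
    """Verify that only populated system info fields appear in the built instructions."""
    result = _build_instructions(systeminfo)
    for fragment in expected_contains:
        assert fragment in result
    for fragment in expected_not_contains:
        assert fragment not in result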


197-226: LGTM! Tests updated to match new retrieve_simple_response signature.

Tests correctly pass instructions as the second argument, using constants.DEFAULT_SYSTEM_PROMPT to match production behavior.


231-246: LGTM! Endpoint test properly uses the new mock_auth_resolvers fixture.

The test correctly validates the response structure including request_id generation and validation via check_suid.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@major major force-pushed the fix/rlsapi-cleanup branch from e7c9d1a to b452672 on January 15, 2026 at 17:33
Contributor

@tisnik tisnik left a comment


LGTM

@major major force-pushed the fix/rlsapi-cleanup branch from b452672 to 0e32d7c on January 15, 2026 at 18:26
major added 4 commits January 15, 2026 12:29
Add exception handlers for RateLimitError (429) and APIStatusError (500)
to match the error handling patterns used by other endpoints like query.py
and streaming_query.py.

Previously, these exceptions would result in unhandled errors instead of
proper HTTP error responses.

Signed-off-by: Major Hayden <major@redhat.com>
Track LLM call failures in Prometheus metrics by incrementing
llm_calls_failures_total counter in each exception handler, matching
the pattern used by other endpoints.

Signed-off-by: Major Hayden <major@redhat.com>
The skip_rag field in RlsapiV1InferRequest is accepted but not yet
implemented. Update the field description to clarify this for API
consumers.

Signed-off-by: Major Hayden <major@redhat.com>
Enhance the LLM instructions with the user's RHEL system information
(OS, version, architecture) when available. This gives the LLM better
context about the environment the user is asking questions about,
enabling more relevant and accurate responses.

Signed-off-by: Major Hayden <major@redhat.com>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/unit/app/endpoints/test_rlsapi_v1.py (1)

37-66: Remove duplicate function definition causing pipeline failure.

The _setup_responses_mock function is defined twice (lines 37-50 and 53-66), causing the Pylint E0102 error. This appears to be a copy-paste mistake. Remove the duplicate definition to fix the pipeline.

🐛 Proposed fix
 def _setup_responses_mock(mocker: MockerFixture, create_behavior: Any) -> None:
     """Set up responses.create mock with custom behavior."""
     mock_responses = mocker.Mock()
     mock_responses.create = create_behavior

     mock_client = mocker.Mock()
     mock_client.responses = mock_responses

     mock_client_holder = mocker.Mock()
     mock_client_holder.get_client.return_value = mock_client
     mocker.patch(
         "app.endpoints.rlsapi_v1.AsyncLlamaStackClientHolder",
         return_value=mock_client_holder,
     )
-
-
-def _setup_responses_mock(mocker: MockerFixture, create_behavior: Any) -> None:
-    """Set up responses.create mock with custom behavior."""
-    mock_responses = mocker.Mock()
-    mock_responses.create = create_behavior
-
-    mock_client = mocker.Mock()
-    mock_client.responses = mock_responses
-
-    mock_client_holder = mocker.Mock()
-    mock_client_holder.get_client.return_value = mock_client
-    mocker.patch(
-        "app.endpoints.rlsapi_v1.AsyncLlamaStackClientHolder",
-        return_value=mock_client_holder,
-    )
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b452672 and 0e32d7c.

📒 Files selected for processing (3)
  • src/app/endpoints/rlsapi_v1.py
  • src/models/rlsapi/requests.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends
Use Llama Stack imports: from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = logging.getLogger(__name__) pattern for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types, using typing_extensions.Self for model validators
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional parameters
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use ABC for abstract base classes with @abstractmethod decorators
Complete type annotations for all class attributes
Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with sections: Args, Returns, Raises, Attributes
Run uv run make format to auto-format code with black and ruff before completion
Run uv run make verify to run all linters (black, pyl...

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
  • src/models/rlsapi/requests.py
  • src/app/endpoints/rlsapi_v1.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest for all unit and integration tests, not unittest
Use pytest-mock for AsyncMock objects in tests
Use MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") pattern for authentication mocks in tests

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
tests/unit/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/unit/**/*.py: Unit tests require 60% code coverage
Write unit tests covering new functionality before completion

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Use @field_validator and @model_validator for custom validation in Pydantic models
Extend BaseModel for data Pydantic models
Use @model_validator and @field_validator for Pydantic model validation

Files:

  • src/models/rlsapi/requests.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/rlsapi_v1.py
🧠 Learnings (6)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/**/*.py : Use `MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token")` pattern for authentication mocks in tests

Applied to files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/**/*.py : Use `pytest-mock` for AsyncMock objects in tests

Applied to files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

  • src/models/rlsapi/requests.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Handle `APIConnectionError` from Llama Stack in error handling

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Use FastAPI dependencies: `from fastapi import APIRouter, HTTPException, Request, status, Depends`

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to src/app/**/*.py : Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
🧬 Code graph analysis (2)
tests/unit/app/endpoints/test_rlsapi_v1.py (3)
src/app/endpoints/rlsapi_v1.py (1)
  • _build_instructions (52-78)
src/client.py (1)
  • get_client (101-115)
src/models/rlsapi/requests.py (1)
  • RlsapiV1SystemInfo (42-66)
src/app/endpoints/rlsapi_v1.py (2)
src/models/responses.py (6)
  • InternalServerErrorResponse (1787-1945)
  • QuotaExceededResponse (1675-1784)
  • ServiceUnavailableResponse (1948-1979)
  • UnauthorizedResponse (1302-1383)
  • UnprocessableEntityResponse (1627-1672)
  • generic (1842-1853)
src/models/rlsapi/requests.py (2)
  • RlsapiV1InferRequest (123-200)
  • RlsapiV1SystemInfo (42-66)
🪛 GitHub Actions: Python linter
tests/unit/app/endpoints/test_rlsapi_v1.py

[error] 53-53: Pylint: E0102 function already defined (function-redefined) at line 53; original definition at line 37. The step 'uv run pylint src tests' exited with code 2.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Pyright
  • GitHub Check: build-pr
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: E2E: library mode / ci
🔇 Additional comments (11)
src/models/rlsapi/requests.py (1)

129-157: Documentation updates are consistent and complete.

The skip_rag field now has aligned documentation in both the class docstring (line 129) and the Field description (line 155), both stating "Reserved for future use. RAG retrieval is not yet implemented." This properly addresses the previous review feedback about inconsistency.

src/app/endpoints/rlsapi_v1.py (5)

12-30: Imports are well-organized and necessary.

The new imports properly support the error handling and system context features. Using absolute imports as per coding guidelines.


52-78: Well-implemented helper function for building system context instructions.

The _build_instructions function properly handles partial system information by only including non-empty fields. The logic correctly falls back to the base prompt when no context is available. Good use of docstrings following Google style.


200-211: Error handling properly implemented.

The RateLimitError handler now correctly uses a generic QuotaExceededResponse without the ineffective getattr(e, "model", "") call, addressing the previous review feedback. The APIStatusError handler appropriately returns a generic 500 response. Both handlers correctly increment the failure metrics.


114-148: Function signature properly updated to accept instructions.

The retrieve_simple_response function correctly accepts and passes the instructions parameter to the LLM API. Docstring is updated appropriately.


183-189: Integration of system context into inference flow.

The endpoint correctly builds instructions from the request's system info and passes them to retrieve_simple_response. The flow is clean and follows the pattern established by the new helper function.

tests/unit/app/endpoints/test_rlsapi_v1.py (5)

34-34: MOCK_AUTH now follows the coding guidelines pattern.

The authentication mock tuple correctly uses the standard values: ("mock_user_id", "mock_username", False, "mock_token") as specified in the coding guidelines.


122-160: Good test coverage for _build_instructions.

The parametrized tests cover the key scenarios: full system info, partial system info (OS only), and empty system info. The test structure using expected_contains and expected_not_contains lists is clean and maintainable.


107-110: Clean fixture for authorization mocking.

The mock_auth_resolvers fixture properly wraps the mock_authorization_resolvers helper, making test code cleaner through fixture injection.


213-241: Tests properly updated for new function signature.

The retrieve_simple_response tests correctly pass constants.DEFAULT_SYSTEM_PROMPT as the instructions parameter, maintaining test coverage after the signature change.


247-329: Endpoint tests properly updated with auth resolver fixture.

The endpoint tests correctly use the mock_auth_resolvers fixture and provide good coverage for various scenarios including minimal requests, full context, unique request ID generation, error handling, and empty response fallback.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Use project-standard mock auth tuple pattern with non-admin flag.

Signed-off-by: Major Hayden <major@redhat.com>
@major major force-pushed the fix/rlsapi-cleanup branch from 0e32d7c to 48a7c8f on January 15, 2026 at 18:30
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/unit/app/endpoints/test_rlsapi_v1.py (1)

287-300: Missing test coverage for new error handlers.

The PR adds handlers for RateLimitError (429) and APIStatusError (500) in rlsapi_v1.py (lines 200-211), but there are no corresponding tests. Per coding guidelines, new functionality must have unit test coverage.

Add tests for both error handlers. Use the existing _setup_responses_mock helper:

Suggested test fixtures and functions
@pytest.fixture(name="mock_rate_limit_error")
def mock_rate_limit_error_fixture(mocker: MockerFixture) -> None:
    """Mock responses.create() to raise RateLimitError."""
    mock_response = httpx.Response(429, request=httpx.Request("POST", "http://test"))
    _setup_responses_mock(
        mocker,
        mocker.AsyncMock(side_effect=RateLimitError(
            "Rate limit exceeded", response=mock_response, body=None
        )),
    )

@pytest.fixture(name="mock_api_status_error")
def mock_api_status_error_fixture(mocker: MockerFixture) -> None:
    """Mock responses.create() to raise APIStatusError."""
    mock_response = httpx.Response(500, request=httpx.Request("POST", "http://test"))
    _setup_responses_mock(
        mocker,
        mocker.AsyncMock(side_effect=APIStatusError("API error", response=mock_response, body=None)),
    )

@pytest.mark.asyncio
async def test_infer_rate_limit_error_returns_429(
    mock_configuration: AppConfig,
    mock_rate_limit_error: None,
    mock_auth_resolvers: None,
) -> None:
    """Test /infer endpoint returns 429 when rate limited."""
    request = RlsapiV1InferRequest(question="Test question")
    with pytest.raises(HTTPException) as exc_info:
        await infer_endpoint(infer_request=request, auth=MOCK_AUTH)
    assert exc_info.value.status_code == status.HTTP_429_TOO_MANY_REQUESTS

@pytest.mark.asyncio
async def test_infer_api_status_error_returns_500(
    mock_configuration: AppConfig,
    mock_api_status_error: None,
    mock_auth_resolvers: None,
) -> None:
    """Test /infer endpoint returns 500 on API errors."""
    request = RlsapiV1InferRequest(question="Test question")
    with pytest.raises(HTTPException) as exc_info:
        await infer_endpoint(infer_request=request, auth=MOCK_AUTH)
    assert exc_info.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR

Also add imports: import httpx and update the llama_stack_client import to include RateLimitError, APIStatusError.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e32d7c and 48a7c8f.

📒 Files selected for processing (3)
  • src/app/endpoints/rlsapi_v1.py
  • src/models/rlsapi/requests.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends
Use Llama Stack imports: from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = logging.getLogger(__name__) pattern for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types, using typing_extensions.Self for model validators
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional parameters
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use ABC for abstract base classes with @abstractmethod decorators
Complete type annotations for all class attributes
Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with sections: Args, Returns, Raises, Attributes
Run uv run make format to auto-format code with black and ruff before completion
Run uv run make verify to run all linters (black, pyl...

Files:

  • src/models/rlsapi/requests.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
  • src/app/endpoints/rlsapi_v1.py
src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Use @field_validator and @model_validator for custom validation in Pydantic models
Extend BaseModel for data Pydantic models
Use @model_validator and @field_validator for Pydantic model validation

Files:

  • src/models/rlsapi/requests.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest for all unit and integration tests, not unittest
Use pytest-mock for AsyncMock objects in tests
Use MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") pattern for authentication mocks in tests

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
tests/unit/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/unit/**/*.py: Unit tests require 60% code coverage
Write unit tests covering new functionality before completion

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/rlsapi_v1.py
🧠 Learnings (6)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

  • src/models/rlsapi/requests.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/**/*.py : Use `MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token")` pattern for authentication mocks in tests

Applied to files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/**/*.py : Use `pytest-mock` for AsyncMock objects in tests

Applied to files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Handle `APIConnectionError` from Llama Stack in error handling

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Use FastAPI dependencies: `from fastapi import APIRouter, HTTPException, Request, status, Depends`

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to src/app/**/*.py : Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling

Applied to files:

  • src/app/endpoints/rlsapi_v1.py
🧬 Code graph analysis (1)
src/app/endpoints/rlsapi_v1.py (2)
src/models/responses.py (6)
  • InternalServerErrorResponse (1787-1945)
  • QuotaExceededResponse (1675-1784)
  • ServiceUnavailableResponse (1948-1979)
  • UnauthorizedResponse (1302-1383)
  • UnprocessableEntityResponse (1627-1672)
  • generic (1842-1853)
src/models/rlsapi/requests.py (2)
  • RlsapiV1InferRequest (123-200)
  • RlsapiV1SystemInfo (42-66)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-pr
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: E2E: library mode / ci
🔇 Additional comments (12)
src/models/rlsapi/requests.py (1)

129-129: Documentation consistency addressed.

The skip_rag attribute docstring (line 129) and Field description (line 155) are now aligned, both stating "Reserved for future use. RAG retrieval is not yet implemented." This resolves the previous inconsistency.

Also applies to: 153-157
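
The aligned definition presumably reads roughly as follows; only the description text is taken from this review, the surrounding model code is a sketch:

from pydantic import BaseModel, Field


class RlsapiV1InferRequest(BaseModel):
    """Request body for the rlsapi v1 /infer endpoint (excerpt, for illustration only)."""

    skip_rag: bool = Field(
        default=False,
        description="Reserved for future use. RAG retrieval is not yet implemented.",
    )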

src/app/endpoints/rlsapi_v1.py (6)

12-12: LGTM!

The new imports for APIStatusError and RateLimitError from llama_stack_client are correctly added to support the new exception handling.


46-48: LGTM!

The 429 and 500 response codes are correctly added to the OpenAPI schema, aligning with the new exception handlers for RateLimitError and APIStatusError.
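
That typically amounts to entries like the following in the route's responses map; the response model names come from this review, the descriptions and variable name are placeholders:

from typing import Any

from models.responses import InternalServerErrorResponse, QuotaExceededResponse

infer_responses: dict[int | str, dict[str, Any]] = {
    429: {"model": QuotaExceededResponse, "description": "Quota exceeded / rate limited"},
    500: {"model": InternalServerErrorResponse, "description": "Internal server error"},
}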


52-78: LGTM!

The _build_instructions function is well-implemented (see the sketch after this list):

  • Correctly filters empty system info fields
  • Returns the base prompt when no context is available
  • Docstring follows Google conventions with Args/Returns sections
  • Type annotations are complete
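
A minimal sketch with the properties listed above, assuming system info fields os/version/arch and a base prompt in constants.DEFAULT_SYSTEM_PROMPT; the PR's actual wording and formatting may differ:

import constants                                   # project constants module (assumed import path)
from models.rlsapi.requests import RlsapiV1SystemInfo


def _build_instructions(systeminfo: RlsapiV1SystemInfo | None) -> str:
    """Compose LLM instructions from the base prompt plus any provided system context."""
    base = constants.DEFAULT_SYSTEM_PROMPT
    if systeminfo is None:
        return base

    # Keep only the fields that were actually supplied.
    labelled = {
        "Operating system": systeminfo.os,
        "Version": systeminfo.version,
        "Architecture": systeminfo.arch,
    }
    context_lines = [f"{label}: {value}" for label, value in labelled.items() if value]
    if not context_lines:
        return base
    return base + "\n\nSystem context:\n" + "\n".join(context_lines)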

114-142: LGTM!

The signature change to accept instructions is correctly implemented. The docstring is updated appropriately, and the parameter is passed through to the LLM client.
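
A hypothetical sketch of the changed signature; the responses.create() keyword arguments and the output field are assumptions about the OpenAI-compatible responses API, not a quote of the PR:

from llama_stack_client import AsyncLlamaStackClient


async def retrieve_simple_response(question: str, instructions: str) -> str:
    """Query the LLM with the user question, applying the composed instructions."""
    client = AsyncLlamaStackClient()               # the PR presumably reuses a shared, configured client
    response = await client.responses.create(
        model="example-model-id",                  # placeholder model identifier
        input=question,
        instructions=instructions,
    )
    return response.output_text                    # response field name assumed for illustration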


182-189: LGTM!

The instruction building and passing flow is correctly integrated into the endpoint. System info is extracted from the request context and transformed into LLM instructions.


200-211: Past review concern addressed, exception ordering verified.

The RateLimitError handler no longer attempts to access the non-existent model attribute—it now uses a generic QuotaExceededResponse directly. The exception ordering is correct: RateLimitError (subclass) is caught before APIStatusError (parent class).

One consideration: logger.exception at line 209 will log the full stack trace, which is appropriate for unexpected API errors where debugging context is valuable.
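
A sketch of the ordering described above; the detail payloads are simplified placeholders where the PR uses QuotaExceededResponse and InternalServerErrorResponse.generic(), and the metrics module import path is assumed:

import logging

from fastapi import HTTPException, status
from llama_stack_client import APIStatusError, RateLimitError

import metrics   # assumed project module exposing the llm_calls_failures_total counter

logger = logging.getLogger(__name__)


async def _call_llm(question: str, instructions: str) -> str:
    """Wrap the LLM call with 429/500 error mapping (sketch, not the PR's exact code)."""
    try:
        return await retrieve_simple_response(question, instructions)
    except RateLimitError as exc:
        # Narrower handler first: RateLimitError is a subclass of APIStatusError.
        metrics.llm_calls_failures_total.inc()
        raise HTTPException(
            status_code=status.HTTP_429_TOO_MANY_REQUESTS,
            detail="LLM quota exceeded; retry later.",
        ) from exc
    except APIStatusError as exc:
        metrics.llm_calls_failures_total.inc()
        logger.exception("LLM call failed with an unexpected API status")
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail="LLM backend returned an error.",
        ) from exc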

tests/unit/app/endpoints/test_rlsapi_v1.py (5)

34-34: LGTM!

MOCK_AUTH now follows the coding guidelines pattern exactly: ("mock_user_id", "mock_username", False, "mock_token"). Based on learnings and coding guidelines.


91-95: LGTM!

Good refactoring to centralize authorization mocking into a reusable fixture. This reduces duplication and aligns with pytest best practices.


109-144: LGTM!

Comprehensive parametrized tests for _build_instructions covering full, partial, and empty system info scenarios. The test structure is clean with clear positive and negative assertions.


202-205: LGTM!

Tests correctly updated to pass the instructions parameter to retrieve_simple_response, matching the new signature.


231-246: LGTM!

Endpoint tests are correctly refactored to use the mock_auth_resolvers fixture, maintaining test functionality while reducing boilerplate.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@tisnik tisnik merged commit 1ee30aa into lightspeed-core:main Jan 15, 2026
18 of 21 checks passed