fix(rlsapi): add error handling, metrics, and system context #1006
Conversation
Hi @major. Thanks for your PR. I'm waiting for a lightspeed-core member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here.

Details: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Walkthrough
Adds system-context instruction construction to the RLSAPI v1 inference flow, updates retrieve_simple_response to accept instructions, propagates those instructions into LLM calls, and extends error handling and metrics for rate limits and API status errors (429 and 500 responses).

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
participant Client
participant InferEndpoint
participant BuildInstructions
participant RetrieveSimpleResponse
participant LlamaStackClient
participant Metrics
Client->>InferEndpoint: POST /infer (question, systeminfo)
InferEndpoint->>BuildInstructions: _build_instructions(systeminfo)
alt systeminfo has context
BuildInstructions->>BuildInstructions: combine DEFAULT_SYSTEM_PROMPT + OS/Version/Arch
else empty/partial
BuildInstructions->>BuildInstructions: return DEFAULT_SYSTEM_PROMPT (or partial)
end
BuildInstructions-->>InferEndpoint: instructions
InferEndpoint->>RetrieveSimpleResponse: retrieve_simple_response(question, instructions)
RetrieveSimpleResponse->>LlamaStackClient: call LLM with instructions
alt Success
LlamaStackClient-->>RetrieveSimpleResponse: response
RetrieveSimpleResponse-->>InferEndpoint: result
InferEndpoint->>Metrics: increment llm_calls_total
InferEndpoint-->>Client: 200 OK (response)
else RateLimitError (429)
LlamaStackClient-->>RetrieveSimpleResponse: RateLimitError
RetrieveSimpleResponse-->>InferEndpoint: error
InferEndpoint->>Metrics: increment llm_calls_failures_total
InferEndpoint-->>Client: 429 QuotaExceededResponse
else APIStatusError (500)
LlamaStackClient-->>RetrieveSimpleResponse: APIStatusError
RetrieveSimpleResponse-->>InferEndpoint: error
InferEndpoint->>Metrics: increment llm_calls_failures_total
InferEndpoint-->>Client: 500 InternalServerErrorResponse.generic()
end
```

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing touches
📜 Recent review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
🧰 Additional context used
📓 Path-based instructions (5), from the CodeRabbit inference engine (CLAUDE.md): **/*.py, src/models/**/*.py, tests/**/*.py, tests/unit/**/*.py, src/app/**/*.py
🧠 Learnings (6): 📚 one from 2026-01-12T10:58:40.230Z and five from 2026-01-11T16:30:41.784Z
🧬 Code graph analysis (1): src/app/endpoints/rlsapi_v1.py (2)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
🔇 Additional comments (12)
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Force-pushed from 42bf733 to 233ccfa.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/app/endpoints/rlsapi_v1.py`:
- Around lines 200-210: Remove the ineffective getattr call on RateLimitError and always return the generic QuotaExceededResponse: stop trying to read a non-existent model attribute from the exception (remove used_model = getattr(e, "model", "") and the if used_model: branch) and construct response = QuotaExceededResponse(response="The quota has been exceeded", cause=str(e)) before raising HTTPException(**response.model_dump()) from e; keep the metrics.llm_calls_failures_total.inc() and logger.error("Rate limit exceeded for request %s: %s", request_id, e) calls as-is. A sketch of the resulting handler appears after this list.
In `@src/models/rlsapi/requests.py`:
- Around line 153-157: Update the docstring for the skip_rag attribute to match
the Field description: replace the existing "Whether to skip RAG retrieval
(default False)" docstring with "Reserved for future use. RAG retrieval is not
yet implemented." so the docstring and the Field(... description=...) are
consistent for the skip_rag field in the requests model.
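For orientation, here is a minimal sketch of what the corrected handler described in the first item above might look like. The response classes, metric, and log messages are the ones named in this review; the function framing, import paths, and call plumbing are assumptions for illustration, not the endpoint's actual code.

```python
# Illustrative sketch only: the internal import paths below are assumed.
import logging

from fastapi import HTTPException
from llama_stack_client import APIStatusError, RateLimitError

import metrics  # assumed module exposing llm_calls_failures_total
from app.endpoints.rlsapi_v1 import retrieve_simple_response  # assumed import path
from models.responses import InternalServerErrorResponse, QuotaExceededResponse

logger = logging.getLogger(__name__)


async def call_llm_with_error_handling(question: str, instructions: str, request_id: str) -> str:
    """Call the LLM and translate client errors into HTTP errors (sketch)."""
    try:
        return await retrieve_simple_response(question, instructions)
    except RateLimitError as e:  # subclass of APIStatusError, so it must be caught first
        metrics.llm_calls_failures_total.inc()
        logger.error("Rate limit exceeded for request %s: %s", request_id, e)
        response = QuotaExceededResponse(response="The quota has been exceeded", cause=str(e))
        raise HTTPException(**response.model_dump()) from e
    except APIStatusError as e:  # unexpected upstream error, mapped to a generic 500
        metrics.llm_calls_failures_total.inc()
        logger.exception("API status error for request %s: %s", request_id, e)
        raise HTTPException(**InternalServerErrorResponse.generic().model_dump()) from e
```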
🧹 Nitpick comments (2)
tests/unit/app/endpoints/test_rlsapi_v1.py (1)
73-97: Consider adding fixtures for RateLimitError and APIStatusError. The endpoint now handles RateLimitError (429) and APIStatusError (500) in addition to APIConnectionError (503), but there are no unit test fixtures or tests covering these error paths. Consider adding test coverage for completeness.

Example fixtures for missing error scenarios:

```python
@pytest.fixture(name="mock_rate_limit_error")
def mock_rate_limit_error_fixture(mocker: MockerFixture) -> None:
    """Mock responses.create() to raise RateLimitError."""
    from llama_stack_client import RateLimitError

    mock_error = RateLimitError("Rate limit exceeded")
    mock_error.model = "test-model"
    _setup_responses_mock(
        mocker,
        mocker.AsyncMock(side_effect=mock_error),
    )


@pytest.fixture(name="mock_api_status_error")
def mock_api_status_error_fixture(mocker: MockerFixture) -> None:
    """Mock responses.create() to raise APIStatusError."""
    from llama_stack_client import APIStatusError

    _setup_responses_mock(
        mocker,
        mocker.AsyncMock(
            side_effect=APIStatusError("API error", response=mocker.Mock(), body=None)
        ),
    )
```

tests/integration/endpoints/test_rlsapi_v1_integration.py (1)
56-84: Consider extracting shared test helpers. The _create_mock_response_output function is identical to the one in tests/unit/app/endpoints/test_rlsapi_v1.py, and _setup_responses_mock is similar. Consider extracting these to a shared test utility module (e.g., tests/utils/mock_helpers.py) to reduce duplication.
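If the helpers were pulled out as suggested, a minimal sketch of such a module could look like the following. This is an illustration only, assuming the suggested tests/utils/mock_helpers.py location: the _setup_responses_mock body mirrors the version shown later in this thread, while _create_mock_response_output is reconstructed from its description (an output item whose content holds the text) and is therefore an assumption rather than the project's actual code.

```python
"""Shared mock helpers for Responses API tests (illustrative sketch only)."""

from typing import Any

from pytest_mock import MockerFixture


def _create_mock_response_output(mocker: MockerFixture, text: str) -> Any:
    """Build a minimal output item whose content is a plain string (simplified case)."""
    output_item = mocker.Mock()
    output_item.content = text  # the extraction helper also accepts a list of fragments
    return output_item


def _setup_responses_mock(mocker: MockerFixture, create_behavior: Any) -> None:
    """Patch the client holder so responses.create() uses the given behavior."""
    mock_responses = mocker.Mock()
    mock_responses.create = create_behavior

    mock_client = mocker.Mock()
    mock_client.responses = mock_responses

    mock_client_holder = mocker.Mock()
    mock_client_holder.get_client.return_value = mock_client
    mocker.patch(
        "app.endpoints.rlsapi_v1.AsyncLlamaStackClientHolder",
        return_value=mock_client_holder,
    )
```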
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/app/endpoints/rlsapi_v1.py
src/models/rlsapi/requests.py
tests/integration/endpoints/test_rlsapi_v1_integration.py
tests/unit/app/endpoints/test_rlsapi_v1.py
🧰 Additional context used
📓 Path-based instructions (6)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends
Use Llama Stack imports: from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = logging.getLogger(__name__) pattern for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types, using typing_extensions.Self for model validators
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional parameters
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use ABC for abstract base classes with @abstractmethod decorators
Complete type annotations for all class attributes
Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with sections: Args, Returns, Raises, Attributes
Run uv run make format to auto-format code with black and ruff before completion
Run uv run make verify to run all linters (black, pyl...
Files:
src/models/rlsapi/requests.py
src/app/endpoints/rlsapi_v1.py
tests/integration/endpoints/test_rlsapi_v1_integration.py
tests/unit/app/endpoints/test_rlsapi_v1.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py: Use @field_validator and @model_validator for custom validation in Pydantic models
Extend BaseModel for data Pydantic models
Use @model_validator and @field_validator for Pydantic model validation
Files:
src/models/rlsapi/requests.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use FastAPI HTTPException with appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/rlsapi_v1.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest for all unit and integration tests, not unittest
Use pytest-mock for AsyncMock objects in tests
Use MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") pattern for authentication mocks in tests
Files:
tests/integration/endpoints/test_rlsapi_v1_integration.py
tests/unit/app/endpoints/test_rlsapi_v1.py
tests/integration/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Integration tests require 10% code coverage
Files:
tests/integration/endpoints/test_rlsapi_v1_integration.py
tests/unit/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/unit/**/*.py: Unit tests require 60% code coverage
Write unit tests covering new functionality before completion
Files:
tests/unit/app/endpoints/test_rlsapi_v1.py
🧠 Learnings (5)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/rlsapi/requests.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Use FastAPI dependencies: `from fastapi import APIRouter, HTTPException, Request, status, Depends`
Applied to files:
src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Handle `APIConnectionError` from Llama Stack in error handling
Applied to files:
src/app/endpoints/rlsapi_v1.py
tests/integration/endpoints/test_rlsapi_v1_integration.py
tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Use Llama Stack imports: `from llama_stack_client import AsyncLlamaStackClient`
Applied to files:
src/app/endpoints/rlsapi_v1.py
tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to src/app/**/*.py : Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Applied to files:
tests/unit/app/endpoints/test_rlsapi_v1.py
🧬 Code graph analysis (2)
src/app/endpoints/rlsapi_v1.py (4)
src/models/responses.py (6)
InternalServerErrorResponse (1787-1945), QuotaExceededResponse (1675-1784), ServiceUnavailableResponse (1948-1979), UnauthorizedResponse (1302-1383), model (1736-1749), generic (1842-1853)
src/models/rlsapi/requests.py (2)
RlsapiV1InferRequest (123-200), RlsapiV1SystemInfo (42-66)
src/models/rlsapi/responses.py (2)
RlsapiV1InferData (10-27), RlsapiV1InferResponse (30-54)
src/utils/responses.py (1)
extract_text_from_response_output_item(6-56)
tests/integration/endpoints/test_rlsapi_v1_integration.py (2)
tests/unit/app/endpoints/test_rlsapi_v1.py (2)
_create_mock_response_output (64-70), _setup_responses_mock (37-50)
src/client.py (1)
get_client(101-115)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: E2E: library mode / ci
- GitHub Check: build-pr
- GitHub Check: E2E: server mode / ci
🔇 Additional comments (10)
tests/unit/app/endpoints/test_rlsapi_v1.py (4)
37-51: LGTM! The _setup_responses_mock helper cleanly encapsulates the mock setup for the Responses API, reducing duplication across fixtures. The mock chain correctly mirrors the AsyncLlamaStackClientHolder().get_client().responses.create() call pattern.

103-127: LGTM! The _build_instructions tests provide good coverage of the three main scenarios: full system info, partial system info, and empty system info. The assertions correctly verify both the presence of expected content and the absence of fields that should be excluded.

179-207: LGTM! The retrieve_simple_response tests are properly updated to pass the instructions parameter and provide good coverage of success, empty output, and API connection error scenarios.

213-295: LGTM! The infer_endpoint tests provide solid coverage of the main flows including minimal requests, full context, unique request ID generation, connection error handling, and fallback behavior. The request validation tests are comprehensive.

src/app/endpoints/rlsapi_v1.py (3)

52-78: LGTM! The _build_instructions function is well-implemented with clear logic. It correctly handles the three cases: full system info, partial system info, and empty system info. The docstring follows Google conventions as per coding guidelines.

114-148: LGTM! The retrieve_simple_response function is properly updated to use the Responses API. The text extraction pattern using extract_text_from_response_output_item is consistent with the utility function implementation, and the store=False parameter is appropriate for stateless queries.

211-215: LGTM! Using logger.exception for APIStatusError is the right choice as it captures the full stack trace for unexpected API errors, while logger.error for RateLimitError is appropriate since it's an expected operational condition.

tests/integration/endpoints/test_rlsapi_v1_integration.py (3)

87-91: LGTM! The fixture properly sets up the mock and returns the mock client, enabling tests to inspect call arguments when needed.

262-302: LGTM! The test_rlsapi_v1_infer_input_source_combination test properly verifies that all input sources (question, stdin, attachments, terminal) are combined and passed to the LLM. The use of call_args.kwargs["input"] to inspect the actual input content is a good approach.

191-222: LGTM! The connection error test properly verifies the 503 response and error message. The manual mock setup is appropriate for this specific error scenario.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Force-pushed from 233ccfa to e7c9d1a.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/unit/app/endpoints/test_rlsapi_v1.py`:
- Line 34: Update the MOCK_AUTH test constant to follow the project's standard
pattern: replace the tuple ("test_user_id", "test_user", True, "test_token")
with ("mock_user_id", "mock_username", False, "mock_token"); modify the symbol
MOCK_AUTH in the test_rlsapi_v1.py file so tests use the standardized mock
values and non-admin flag.
🧹 Nitpick comments (2)
tests/unit/app/endpoints/test_rlsapi_v1.py (2)
64-70: Consider aligning mock structure with actual Responses API output. The mock sets content = text as a string, but per the relevant code snippet for extract_text_from_response_output_item, content can be a string or a list of fragments. The current mock works because the helper handles strings directly, but adding a comment clarifying that this mock represents the simplified string-content case would help maintainability.
287-300: Missing test coverage for RateLimitError and APIStatusError handlers. The endpoint now handles RateLimitError (returning 429) and APIStatusError (returning 500), but there are no unit tests for these code paths. Consider adding tests similar to test_infer_api_connection_error_returns_503:

📝 Suggested test additions

```python
@pytest.fixture(name="mock_rate_limit_error")
def mock_rate_limit_error_fixture(mocker: MockerFixture) -> None:
    """Mock responses.create() to raise RateLimitError."""
    mock_response = mocker.Mock()
    mock_response.status_code = 429
    _setup_responses_mock(
        mocker,
        mocker.AsyncMock(side_effect=RateLimitError(
            message="Rate limit exceeded",
            response=mock_response,
            body=None,
        )),
    )


@pytest.fixture(name="mock_api_status_error")
def mock_api_status_error_fixture(mocker: MockerFixture) -> None:
    """Mock responses.create() to raise APIStatusError."""
    mock_response = mocker.Mock()
    mock_response.status_code = 500
    _setup_responses_mock(
        mocker,
        mocker.AsyncMock(side_effect=APIStatusError(
            message="Internal server error",
            response=mock_response,
            body=None,
        )),
    )


@pytest.mark.asyncio
async def test_infer_rate_limit_error_returns_429(
    mock_configuration: AppConfig,
    mock_rate_limit_error: None,
    mock_auth_resolvers: None,
) -> None:
    """Test /infer endpoint returns 429 when rate limit is exceeded."""
    request = RlsapiV1InferRequest(question="Test question")
    with pytest.raises(HTTPException) as exc_info:
        await infer_endpoint(infer_request=request, auth=MOCK_AUTH)
    assert exc_info.value.status_code == status.HTTP_429_TOO_MANY_REQUESTS


@pytest.mark.asyncio
async def test_infer_api_status_error_returns_500(
    mock_configuration: AppConfig,
    mock_api_status_error: None,
    mock_auth_resolvers: None,
) -> None:
    """Test /infer endpoint returns 500 on API status errors."""
    request = RlsapiV1InferRequest(question="Test question")
    with pytest.raises(HTTPException) as exc_info:
        await infer_endpoint(infer_request=request, auth=MOCK_AUTH)
    assert exc_info.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/app/endpoints/rlsapi_v1.py
src/models/rlsapi/requests.py
tests/unit/app/endpoints/test_rlsapi_v1.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/models/rlsapi/requests.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends
Use Llama Stack imports: from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = logging.getLogger(__name__) pattern for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types, using typing_extensions.Self for model validators
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional parameters
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use ABC for abstract base classes with @abstractmethod decorators
Complete type annotations for all class attributes
Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with sections: Args, Returns, Raises, Attributes
Run uv run make format to auto-format code with black and ruff before completion
Run uv run make verify to run all linters (black, pyl...
Files:
src/app/endpoints/rlsapi_v1.py
tests/unit/app/endpoints/test_rlsapi_v1.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use FastAPI HTTPException with appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/rlsapi_v1.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest for all unit and integration tests, not unittest
Use pytest-mock for AsyncMock objects in tests
Use MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") pattern for authentication mocks in tests
Files:
tests/unit/app/endpoints/test_rlsapi_v1.py
tests/unit/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/unit/**/*.py: Unit tests require 60% code coverage
Write unit tests covering new functionality before completion
Files:
tests/unit/app/endpoints/test_rlsapi_v1.py
🧠 Learnings (5)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Handle `APIConnectionError` from Llama Stack in error handling
Applied to files:
src/app/endpoints/rlsapi_v1.py
tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Use FastAPI dependencies: `from fastapi import APIRouter, HTTPException, Request, status, Depends`
Applied to files:
src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Use Llama Stack imports: `from llama_stack_client import AsyncLlamaStackClient`
Applied to files:
src/app/endpoints/rlsapi_v1.py
tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to src/app/**/*.py : Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Applied to files:
tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/**/*.py : Use `MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token")` pattern for authentication mocks in tests
Applied to files:
tests/unit/app/endpoints/test_rlsapi_v1.py
🧬 Code graph analysis (2)
src/app/endpoints/rlsapi_v1.py (2)
src/models/rlsapi/requests.py (2)
RlsapiV1InferRequest (123-200), RlsapiV1SystemInfo (42-66)
src/utils/responses.py (1)
extract_text_from_response_output_item(6-56)
tests/unit/app/endpoints/test_rlsapi_v1.py (4)
src/app/endpoints/rlsapi_v1.py (1)
_build_instructions (52-78)
tests/unit/utils/auth_helpers.py (1)
mock_authorization_resolvers (8-26)
src/models/rlsapi/requests.py (2)
RlsapiV1SystemInfo (42-66), RlsapiV1InferRequest (123-200)
src/utils/suid.py (1)
check_suid(19-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-pr
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E: library mode / ci
🔇 Additional comments (9)
src/app/endpoints/rlsapi_v1.py (4)
10-33: LGTM! Imports are well-organized and follow coding guidelines. The imports correctly use FastAPI dependencies, Llama Stack client types including APIConnectionError, APIStatusError, and RateLimitError, and use absolute imports for internal modules. Based on learnings and coding guidelines.

52-78: LGTM! Clean helper function with proper docstring and type annotations. The _build_instructions function correctly composes the system prompt with optional context. The implementation:
- Has a proper Google-style docstring
- Uses type annotations correctly
- Returns early when no context parts exist
- Joins parts with readable formatting
114-148: LGTM! Responses API integration looks correct. The migration to the Responses API is clean:
- Properly casts to OpenAIResponseObject
- Uses the extract_text_from_response_output_item helper for text extraction
- Sets store=False for stateless operation
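A hedged sketch of the call pattern those bullets describe: the store=False flag, the instructions parameter, and the extraction helper are named in the review, while the model selection, the iteration over response.output, and the import paths are assumptions (the cast to OpenAIResponseObject mentioned above is omitted to keep the sketch self-contained).

```python
# Illustrative sketch only; MODEL_ID is a hypothetical placeholder.
from client import AsyncLlamaStackClientHolder  # assumed import path
from utils.responses import extract_text_from_response_output_item

MODEL_ID = "example/model"  # hypothetical; real model selection is not shown in the review


async def retrieve_simple_response(question: str, instructions: str) -> str:
    """Send a single stateless question to the LLM and return the extracted text."""
    client = AsyncLlamaStackClientHolder().get_client()
    response = await client.responses.create(
        input=question,
        instructions=instructions,
        model=MODEL_ID,
        store=False,  # stateless query, nothing persisted server-side
    )
    texts = [extract_text_from_response_output_item(item) for item in response.output]
    return "\n".join(text for text in texts if text)
```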
188-211: Error handling is well-structured; the previous review comment about RateLimitError.model has been addressed. The error handling correctly:
- Increments the llm_calls_failures_total metric for each failure type
- Uses appropriate log levels (error for rate limit, exception for API errors)
- Maps to correct HTTP status codes (429, 500, 503)
- Chains exceptions properly with from e

The previous review comment about the ineffective getattr(e, "model", "") call has been resolved: the code now directly uses QuotaExceededResponse without attempting to access a non-existent model attribute.

tests/unit/app/endpoints/test_rlsapi_v1.py (5)
37-51: LGTM! Well-structured mock setup helper. The _setup_responses_mock helper centralizes Responses API mocking, reducing duplication across tests. The mock chain (client holder → client → responses → create) correctly mirrors the production code path.

91-94: LGTM! Good fixture encapsulation. Wrapping mock_authorization_resolvers in a pytest fixture improves test readability and follows the project's fixture-based approach for dependency injection.

109-144: LGTM! Comprehensive parametrized tests for _build_instructions. The test covers:
- Full system info (all fields populated)
- Partial system info (only OS)
- Empty system info (fallback to base prompt)
The assertions check both expected inclusions and exclusions, providing good coverage.
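A hedged sketch of that parametrized pattern, for illustration only: the expected_contains/expected_not_contains structure and the three scenarios come from the review, while the RlsapiV1SystemInfo field names (os, version, arch) and the exact expectation strings are assumptions inferred from the walkthrough.

```python
# Illustrative sketch; field names on RlsapiV1SystemInfo are assumed.
import pytest

import constants
from app.endpoints.rlsapi_v1 import _build_instructions
from models.rlsapi.requests import RlsapiV1SystemInfo


@pytest.mark.parametrize(
    ("system_info", "expected_contains", "expected_not_contains"),
    [
        (
            RlsapiV1SystemInfo(os="RHEL", version="9.4", arch="x86_64"),
            ["RHEL", "9.4", "x86_64"],
            [],
        ),
        (RlsapiV1SystemInfo(os="RHEL"), ["RHEL"], ["x86_64"]),
        (RlsapiV1SystemInfo(), [constants.DEFAULT_SYSTEM_PROMPT], ["x86_64"]),
    ],
)
def test_build_instructions(
    system_info: RlsapiV1SystemInfo,
    expected_contains: list[str],
    expected_not_contains: list[str],
) -> None:
    """Instructions include only the populated system-info fields."""
    instructions = _build_instructions(system_info)
    for text in expected_contains:
        assert text in instructions
    for text in expected_not_contains:
        assert text not in instructions
```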
197-226: LGTM! Tests updated to match the new retrieve_simple_response signature. Tests correctly pass instructions as the second argument, using constants.DEFAULT_SYSTEM_PROMPT to match production behavior.

231-246: LGTM! Endpoint test properly uses the new mock_auth_resolvers fixture. The test correctly validates the response structure including request_id generation and validation via check_suid.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Force-pushed from e7c9d1a to b452672.
tisnik left a comment:
LGTM
Force-pushed from b452672 to 0e32d7c.
Add exception handlers for RateLimitError (429) and APIStatusError (500) to match the error handling patterns used by other endpoints like query.py and streaming_query.py. Previously, these exceptions would result in unhandled errors instead of proper HTTP error responses. Signed-off-by: Major Hayden <major@redhat.com>
Track LLM call failures in Prometheus metrics by incrementing llm_calls_failures_total counter in each exception handler, matching the pattern used by other endpoints. Signed-off-by: Major Hayden <major@redhat.com>
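The llm_calls_total and llm_calls_failures_total counters referenced in this PR are Prometheus metrics; below is a minimal sketch of how such counters are typically declared with prometheus_client. The counter names come from this PR, but the label-free definitions and module layout are assumptions, and the project's real metrics module may differ.

```python
# Illustrative sketch only; the project's actual metrics module may define these differently.
from prometheus_client import Counter

llm_calls_total = Counter(
    "llm_calls_total",
    "Total number of LLM calls made by the service.",
)
llm_calls_failures_total = Counter(
    "llm_calls_failures_total",
    "Total number of LLM calls that failed.",
)

# Each exception handler increments the failure counter before raising the HTTP error:
# llm_calls_failures_total.inc()
```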
The skip_rag field in RlsapiV1InferRequest is accepted but not yet implemented. Update the field description to clarify this for API consumers. Signed-off-by: Major Hayden <major@redhat.com>
Enhance the LLM instructions with the user's RHEL system information (OS, version, architecture) when available. This gives the LLM better context about the environment the user is asking questions about, enabling more relevant and accurate responses. Signed-off-by: Major Hayden <major@redhat.com>
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/app/endpoints/test_rlsapi_v1.py (1)
37-66: Remove duplicate function definition causing pipeline failure. The _setup_responses_mock function is defined twice (lines 37-50 and 53-66), causing the Pylint E0102 error. This appears to be a copy-paste mistake. Remove the duplicate definition to fix the pipeline.

🐛 Proposed fix

```diff
 def _setup_responses_mock(mocker: MockerFixture, create_behavior: Any) -> None:
     """Set up responses.create mock with custom behavior."""
     mock_responses = mocker.Mock()
     mock_responses.create = create_behavior

     mock_client = mocker.Mock()
     mock_client.responses = mock_responses

     mock_client_holder = mocker.Mock()
     mock_client_holder.get_client.return_value = mock_client
     mocker.patch(
         "app.endpoints.rlsapi_v1.AsyncLlamaStackClientHolder",
         return_value=mock_client_holder,
     )
-
-
-def _setup_responses_mock(mocker: MockerFixture, create_behavior: Any) -> None:
-    """Set up responses.create mock with custom behavior."""
-    mock_responses = mocker.Mock()
-    mock_responses.create = create_behavior
-
-    mock_client = mocker.Mock()
-    mock_client.responses = mock_responses
-
-    mock_client_holder = mocker.Mock()
-    mock_client_holder.get_client.return_value = mock_client
-    mocker.patch(
-        "app.endpoints.rlsapi_v1.AsyncLlamaStackClientHolder",
-        return_value=mock_client_holder,
-    )
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/app/endpoints/rlsapi_v1.py
src/models/rlsapi/requests.py
tests/unit/app/endpoints/test_rlsapi_v1.py
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends
Use Llama Stack imports: from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = logging.getLogger(__name__) pattern for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types, using typing_extensions.Self for model validators
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional parameters
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use ABC for abstract base classes with @abstractmethod decorators
Complete type annotations for all class attributes
Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with sections: Args, Returns, Raises, Attributes
Run uv run make format to auto-format code with black and ruff before completion
Run uv run make verify to run all linters (black, pyl...
Files:
tests/unit/app/endpoints/test_rlsapi_v1.py
src/models/rlsapi/requests.py
src/app/endpoints/rlsapi_v1.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest for all unit and integration tests, not unittest
Use pytest-mock for AsyncMock objects in tests
Use MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") pattern for authentication mocks in tests
Files:
tests/unit/app/endpoints/test_rlsapi_v1.py
tests/unit/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/unit/**/*.py: Unit tests require 60% code coverage
Write unit tests covering new functionality before completion
Files:
tests/unit/app/endpoints/test_rlsapi_v1.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py: Use @field_validator and @model_validator for custom validation in Pydantic models
Extend BaseModel for data Pydantic models
Use @model_validator and @field_validator for Pydantic model validation
Files:
src/models/rlsapi/requests.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use FastAPI HTTPException with appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/rlsapi_v1.py
🧠 Learnings (6)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/**/*.py : Use `MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token")` pattern for authentication mocks in tests
Applied to files:
tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/**/*.py : Use `pytest-mock` for AsyncMock objects in tests
Applied to files:
tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/rlsapi/requests.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Handle `APIConnectionError` from Llama Stack in error handling
Applied to files:
src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Use FastAPI dependencies: `from fastapi import APIRouter, HTTPException, Request, status, Depends`
Applied to files:
src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to src/app/**/*.py : Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Applied to files:
src/app/endpoints/rlsapi_v1.py
🧬 Code graph analysis (2)
tests/unit/app/endpoints/test_rlsapi_v1.py (3)
src/app/endpoints/rlsapi_v1.py (1)
_build_instructions (52-78)
src/client.py (1)
get_client (101-115)
src/models/rlsapi/requests.py (1)
RlsapiV1SystemInfo (42-66)
src/app/endpoints/rlsapi_v1.py (2)
src/models/responses.py (6)
InternalServerErrorResponse (1787-1945), QuotaExceededResponse (1675-1784), ServiceUnavailableResponse (1948-1979), UnauthorizedResponse (1302-1383), UnprocessableEntityResponse (1627-1672), generic (1842-1853)
src/models/rlsapi/requests.py (2)
RlsapiV1InferRequest (123-200), RlsapiV1SystemInfo (42-66)
🪛 GitHub Actions: Python linter
tests/unit/app/endpoints/test_rlsapi_v1.py
[error] 53-53: Pylint: E0102 function already defined (function-redefined) at line 53; original definition at line 37. The step 'uv run pylint src tests' exited with code 2.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Pyright
- GitHub Check: build-pr
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E: library mode / ci
🔇 Additional comments (11)
src/models/rlsapi/requests.py (1)
129-157: Documentation updates are consistent and complete. The skip_rag field now has aligned documentation in both the class docstring (line 129) and the Field description (line 155), both stating "Reserved for future use. RAG retrieval is not yet implemented." This properly addresses the previous review feedback about inconsistency.

src/app/endpoints/rlsapi_v1.py (5)

12-30: Imports are well-organized and necessary. The new imports properly support the error handling and system context features. Using absolute imports as per coding guidelines.

52-78: Well-implemented helper function for building system context instructions. The _build_instructions function properly handles partial system information by only including non-empty fields. The logic correctly falls back to the base prompt when no context is available. Good use of docstrings following Google style.
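A minimal sketch of the behavior that comment describes, assuming RlsapiV1SystemInfo exposes os, version, and arch fields (the real field names and the exact prompt formatting are not shown in the review, so they are assumptions here):

```python
# Illustrative sketch; field names and formatting are assumed, not taken from the PR.
import constants
from models.rlsapi.requests import RlsapiV1SystemInfo


def _build_instructions(systeminfo: RlsapiV1SystemInfo | None) -> str:
    """Compose the system prompt, appending any populated system-info fields."""
    if systeminfo is None:
        return constants.DEFAULT_SYSTEM_PROMPT

    context_parts = [
        label + value
        for label, value in (
            ("Operating system: ", systeminfo.os),
            ("Version: ", systeminfo.version),
            ("Architecture: ", systeminfo.arch),
        )
        if value  # only include non-empty fields
    ]
    if not context_parts:
        return constants.DEFAULT_SYSTEM_PROMPT

    return constants.DEFAULT_SYSTEM_PROMPT + "\n\n" + "\n".join(context_parts)
```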
200-211: Error handling properly implemented. The RateLimitError handler now correctly uses a generic QuotaExceededResponse without the ineffective getattr(e, "model", "") call, addressing the previous review feedback. The APIStatusError handler appropriately returns a generic 500 response. Both handlers correctly increment the failure metrics.

114-148: Function signature properly updated to accept instructions. The retrieve_simple_response function correctly accepts and passes the instructions parameter to the LLM API. Docstring is updated appropriately.

183-189: Integration of system context into inference flow. The endpoint correctly builds instructions from the request's system info and passes them to retrieve_simple_response. The flow is clean and follows the pattern established by the new helper function.

tests/unit/app/endpoints/test_rlsapi_v1.py (5)

34-34: MOCK_AUTH now follows the coding guidelines pattern. The authentication mock tuple correctly uses the standard values: ("mock_user_id", "mock_username", False, "mock_token") as specified in the coding guidelines.

122-160: Good test coverage for _build_instructions. The parametrized tests cover the key scenarios: full system info, partial system info (OS only), and empty system info. The test structure using expected_contains and expected_not_contains lists is clean and maintainable.

107-110: Clean fixture for authorization mocking. The mock_auth_resolvers fixture properly wraps the mock_authorization_resolvers helper, making test code cleaner through fixture injection.

213-241: Tests properly updated for new function signature. The retrieve_simple_response tests correctly pass constants.DEFAULT_SYSTEM_PROMPT as the instructions parameter, maintaining test coverage after the signature change.

247-329: Endpoint tests properly updated with auth resolver fixture. The endpoint tests correctly use the mock_auth_resolvers fixture and provide good coverage for various scenarios including minimal requests, full context, unique request ID generation, error handling, and empty response fallback.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Use project-standard mock auth tuple pattern with non-admin flag. Signed-off-by: Major Hayden <major@redhat.com>
Force-pushed from 0e32d7c to 48a7c8f.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/app/endpoints/test_rlsapi_v1.py (1)
287-300: Missing test coverage for new error handlers. The PR adds handlers for RateLimitError (429) and APIStatusError (500) in rlsapi_v1.py (lines 200-211), but there are no corresponding tests. Per coding guidelines, new functionality must have unit test coverage. Add tests for both error handlers, using the existing _setup_responses_mock helper:

Suggested test fixtures and functions

```python
@pytest.fixture(name="mock_rate_limit_error")
def mock_rate_limit_error_fixture(mocker: MockerFixture) -> None:
    """Mock responses.create() to raise RateLimitError."""
    mock_response = httpx.Response(429, request=httpx.Request("POST", "http://test"))
    _setup_responses_mock(
        mocker,
        mocker.AsyncMock(side_effect=RateLimitError(
            "Rate limit exceeded", response=mock_response, body=None
        )),
    )


@pytest.fixture(name="mock_api_status_error")
def mock_api_status_error_fixture(mocker: MockerFixture) -> None:
    """Mock responses.create() to raise APIStatusError."""
    mock_response = httpx.Response(500, request=httpx.Request("POST", "http://test"))
    _setup_responses_mock(
        mocker,
        mocker.AsyncMock(
            side_effect=APIStatusError("API error", response=mock_response, body=None)
        ),
    )


@pytest.mark.asyncio
async def test_infer_rate_limit_error_returns_429(
    mock_configuration: AppConfig,
    mock_rate_limit_error: None,
    mock_auth_resolvers: None,
) -> None:
    """Test /infer endpoint returns 429 when rate limited."""
    request = RlsapiV1InferRequest(question="Test question")
    with pytest.raises(HTTPException) as exc_info:
        await infer_endpoint(infer_request=request, auth=MOCK_AUTH)
    assert exc_info.value.status_code == status.HTTP_429_TOO_MANY_REQUESTS


@pytest.mark.asyncio
async def test_infer_api_status_error_returns_500(
    mock_configuration: AppConfig,
    mock_api_status_error: None,
    mock_auth_resolvers: None,
) -> None:
    """Test /infer endpoint returns 500 on API errors."""
    request = RlsapiV1InferRequest(question="Test question")
    with pytest.raises(HTTPException) as exc_info:
        await infer_endpoint(infer_request=request, auth=MOCK_AUTH)
    assert exc_info.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
```

Also add imports: import httpx, and update the llama_stack_client import to include RateLimitError, APIStatusError.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/app/endpoints/rlsapi_v1.py
src/models/rlsapi/requests.py
tests/unit/app/endpoints/test_rlsapi_v1.py
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends
Use Llama Stack imports: from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = logging.getLogger(__name__) pattern for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types, using typing_extensions.Self for model validators
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional parameters
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use ABC for abstract base classes with @abstractmethod decorators
Complete type annotations for all class attributes
Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with sections: Args, Returns, Raises, Attributes
Run uv run make format to auto-format code with black and ruff before completion
Run uv run make verify to run all linters (black, pyl...
Files:
src/models/rlsapi/requests.py
tests/unit/app/endpoints/test_rlsapi_v1.py
src/app/endpoints/rlsapi_v1.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py: Use @field_validator and @model_validator for custom validation in Pydantic models
Extend BaseModel for data Pydantic models
Use @model_validator and @field_validator for Pydantic model validation
Files:
src/models/rlsapi/requests.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest for all unit and integration tests, not unittest
Use pytest-mock for AsyncMock objects in tests
Use MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") pattern for authentication mocks in tests
Files:
tests/unit/app/endpoints/test_rlsapi_v1.py
tests/unit/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/unit/**/*.py: Unit tests require 60% code coverage
Write unit tests covering new functionality before completion
Files:
tests/unit/app/endpoints/test_rlsapi_v1.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use FastAPI HTTPException with appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/rlsapi_v1.py
🧠 Learnings (6)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/rlsapi/requests.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/**/*.py : Use `MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token")` pattern for authentication mocks in tests
Applied to files:
tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/**/*.py : Use `pytest-mock` for AsyncMock objects in tests
Applied to files:
tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Handle `APIConnectionError` from Llama Stack in error handling
Applied to files:
src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to **/*.py : Use FastAPI dependencies: `from fastapi import APIRouter, HTTPException, Request, status, Depends`
Applied to files:
src/app/endpoints/rlsapi_v1.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to src/app/**/*.py : Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Applied to files:
src/app/endpoints/rlsapi_v1.py
🧬 Code graph analysis (1)
src/app/endpoints/rlsapi_v1.py (2)
src/models/responses.py (6)
InternalServerErrorResponse (1787-1945), QuotaExceededResponse (1675-1784), ServiceUnavailableResponse (1948-1979), UnauthorizedResponse (1302-1383), UnprocessableEntityResponse (1627-1672), generic (1842-1853)
src/models/rlsapi/requests.py (2)
RlsapiV1InferRequest (123-200), RlsapiV1SystemInfo (42-66)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-pr
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E: library mode / ci
🔇 Additional comments (12)
src/models/rlsapi/requests.py (1)
129-129: Documentation consistency addressed. The skip_rag attribute docstring (line 129) and Field description (line 155) are now aligned, both stating "Reserved for future use. RAG retrieval is not yet implemented." This resolves the previous inconsistency. Also applies to: 153-157
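For reference, the aligned field documentation described above amounts to something like the following trimmed Pydantic sketch; only the description text is quoted from the review, and the surrounding model is reduced to the single field for illustration.

```python
# Trimmed illustration; the real RlsapiV1InferRequest defines many more fields.
from pydantic import BaseModel, Field


class RlsapiV1InferRequest(BaseModel):
    """Request model excerpt showing the documented skip_rag field.

    Attributes:
        skip_rag: Reserved for future use. RAG retrieval is not yet implemented.
    """

    skip_rag: bool = Field(
        default=False,
        description="Reserved for future use. RAG retrieval is not yet implemented.",
    )
```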
src/app/endpoints/rlsapi_v1.py (6)
12-12: LGTM! The new imports for APIStatusError and RateLimitError from llama_stack_client are correctly added to support the new exception handling.

46-48: LGTM! The 429 and 500 response codes are correctly added to the OpenAPI schema, aligning with the new exception handlers for RateLimitError and APIStatusError.

52-78: LGTM! The _build_instructions function is well-implemented:
- Correctly filters empty system info fields
- Returns the base prompt when no context is available
- Docstring follows Google conventions with Args/Returns sections
- Type annotations are complete

114-142: LGTM! The signature change to accept instructions is correctly implemented. The docstring is updated appropriately, and the parameter is passed through to the LLM client.

182-189: LGTM! The instruction building and passing flow is correctly integrated into the endpoint. System info is extracted from the request context and transformed into LLM instructions.

200-211: Past review concern addressed, exception ordering verified. The RateLimitError handler no longer attempts to access the non-existent model attribute; it now uses a generic QuotaExceededResponse directly. The exception ordering is correct: RateLimitError (subclass) is caught before APIStatusError (parent class). One consideration: logger.exception at line 209 will log the full stack trace, which is appropriate for unexpected API errors where debugging context is valuable.

tests/unit/app/endpoints/test_rlsapi_v1.py (5)
34-34: LGTM! MOCK_AUTH now follows the coding guidelines pattern exactly: ("mock_user_id", "mock_username", False, "mock_token"). Based on learnings and coding guidelines.

91-95: LGTM! Good refactoring to centralize authorization mocking into a reusable fixture. This reduces duplication and aligns with pytest best practices.

109-144: LGTM! Comprehensive parametrized tests for _build_instructions covering full, partial, and empty system info scenarios. The test structure is clean with clear positive and negative assertions.

202-205: LGTM! Tests correctly updated to pass the instructions parameter to retrieve_simple_response, matching the new signature.

231-246: LGTM! Endpoint tests are correctly refactored to use the mock_auth_resolvers fixture, maintaining test functionality while reducing boilerplate.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Description
Follow-up improvements to the rlsapi v1 /infer endpoint:
- RateLimitError (429) and APIStatusError (500) handling
- llm_calls_failures_total metrics tracking
- skip_rag field documented as reserved for future use
- MOCK_AUTH test constant updated to the project-standard pattern

Type of change
Tools used to create PR
Related Tickets & Documents
Checklist before requesting a review
Testing
Unit tests added for _build_instructions. All existing tests pass.

Summary by CodeRabbit
New Features
Bug Fixes
Chores
Tests
✏️ Tip: You can customize this high-level summary in your review settings.