Summary
The proxy fails with a 500 error when generating embeddings via gemini_cli (e.g., gemini-embedding-001). Requests are incorrectly routed to the chat completion endpoint instead of an embedding endpoint.
Technical Details
- Routing Bug: In rotator_library/client.py, the _execute_with_retry method hardcodes provider_plugin.acompletion for custom providers, ignoring whether the original call was for embeddings.
- Missing Implementation: GeminiCliProvider in gemini_cli_provider.py does not implement aembedding.
- Endpoint Mismatch: Requests are sent to :streamGenerateContent, which returns 404/400 for embedding models, resulting in a 500 error for the client.
Steps to Reproduce
curl http://localhost:8000/v1/embeddings \
-H "Authorization: Bearer <token>" \
-d '{"input": "test", "model": "gemini_cli/gemini-embedding-001"}'
Suggested Fix
- Update client.py to check the api_call type before delegation.
- Implement aembedding in GeminiCliProvider using the Google :embedContent endpoint.
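A minimal sketch of the two fixes, written against a stand-in class rather than the real code. FakeProvider, select_provider_call, and the call_type string are hypothetical names for illustration; the actual signature of _execute_with_retry and the way it records the call type may differ. The request body follows the shape used by the public Gemini API's embedContent method; whether the gemini_cli backend accepts the same shape is an assumption.

```python
import asyncio
from typing import Any, Awaitable, Callable


class FakeProvider:
    """Hypothetical stand-in for a custom provider such as GeminiCliProvider."""

    async def acompletion(self, **kwargs: Any) -> dict:
        # Chat path: this is where every custom-provider call currently lands.
        return {"endpoint": ":streamGenerateContent", "model": kwargs["model"]}

    async def aembedding(self, **kwargs: Any) -> dict:
        # Embedding path: build an embedContent-style body (assumed shape).
        # Only a single string input is handled in this sketch.
        body = {"content": {"parts": [{"text": kwargs["input"]}]}}
        return {
            "endpoint": ":embedContent",
            "model": kwargs["model"],
            "body": body,
        }


def select_provider_call(
    provider: Any, call_type: str
) -> Callable[..., Awaitable[dict]]:
    """Pick the provider method matching the original call type.

    call_type is a hypothetical label ("embedding" or "completion").
    """
    if call_type == "embedding":
        handler = getattr(provider, "aembedding", None)
        if handler is None:
            # Fail loudly instead of silently falling back to chat completion.
            raise NotImplementedError(
                f"{type(provider).__name__} does not implement aembedding"
            )
        return handler
    return provider.acompletion


if __name__ == "__main__":
    call = select_provider_call(FakeProvider(), "embedding")
    print(asyncio.run(call(model="gemini-embedding-001", input="test")))
```

Raising NotImplementedError for providers without aembedding would also let the proxy return a clear client error rather than a generic 500.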