Refine gateway runtime and simplify tokenizer search #1

Merged
KSemenenko merged 13 commits into main from codex/tokenizer-fallback-defaults
Mar 8, 2026

@KSemenenko

Summary

  • refactor the gateway runtime and registry internals into smaller, clearer collaborators
  • add optional gateway initialization via service-provider warmup and hosted background warmup
  • simplify tokenizer search to a single built-in ChatGptO200kBase path and remove the unused public tokenizer selection API
  • update tokenizer evaluation tests and README guidance to match the current runtime behavior

Verification

  • dotnet restore ManagedCode.MCPGateway.slnx
  • dotnet build ManagedCode.MCPGateway.slnx -c Release --no-restore
  • dotnet test --solution ManagedCode.MCPGateway.slnx -c Release --no-build
  • dotnet build ManagedCode.MCPGateway.slnx -c Release --no-restore -p:RunAnalyzers=true
  • roslynator analyze src/ManagedCode.MCPGateway/ManagedCode.MCPGateway.csproj -p Configuration=Release --severity-level warning
  • roslynator analyze tests/ManagedCode.MCPGateway.Tests/ManagedCode.MCPGateway.Tests.csproj -p Configuration=Release --severity-level warning
  • cloc --include-lang=C# src tests

Copilot AI review requested due to automatic review settings March 7, 2026 22:10
Copilot AI left a comment

Pull request overview

Refactors ManagedCode.MCPGateway runtime/catalog internals into smaller components, adds optional gateway initialization (service-provider extension + hosted warmup), and simplifies tokenizer search to a single built-in ChatGPT O200kBase path while updating docs/tests accordingly.

Changes:

  • Introduces McpGatewayRuntime orchestration split across indexing/search/invocation/embeddings helpers, plus a separate McpGatewayRegistry.
  • Adds optional eager initialization via InitializeManagedCodeMcpGatewayAsync() and AddManagedCodeMcpGatewayIndexWarmup().
  • Updates search behavior/tests/docs for tokenizer-based ranking (and optional English query normalization via a keyed IChatClient).

Reviewed changes

Copilot reviewed 64 out of 77 changed files in this pull request and generated 3 comments.

File Description
tests/ManagedCode.MCPGateway.Tests/TestSupport/TestToolEmbeddingStore.cs Switches test embedding store to shared internal index helper.
tests/ManagedCode.MCPGateway.Tests/TestSupport/TestFunctionFactory.cs Adds shared AIFunction factory helper for tests.
tests/ManagedCode.MCPGateway.Tests/TestSupport/TestChatClient.cs Adds a test IChatClient for query-normalization scenarios.
tests/ManagedCode.MCPGateway.Tests/TestSupport/GatewayTestServiceProviderFactory.cs Adds optional keyed search-query chat client wiring for tests.
tests/ManagedCode.MCPGateway.Tests/Search/SearchTestEnums.cs Adds tokenizer-search test support + shared enums.
tests/ManagedCode.MCPGateway.Tests/Search/McpGatewayTokenizerSearchTests.cs Adds tokenizer strategy coverage and expectations.
tests/ManagedCode.MCPGateway.Tests/Search/McpGatewayTokenizerSearchEvaluationTests.cs Adds evaluation threshold test for tokenizer search.
tests/ManagedCode.MCPGateway.Tests/Search/McpGatewayTokenizerSearchEvaluationSupport.cs Adds evaluation metric helpers for tokenizer search.
tests/ManagedCode.MCPGateway.Tests/Search/McpGatewayTokenizerSearchEvaluationQueries.cs Adds evaluation query buckets.
tests/ManagedCode.MCPGateway.Tests/Search/McpGatewayTokenizerSearchEvaluationCatalog.cs Adds 50-tool evaluation catalog for tokenizer search.
tests/ManagedCode.MCPGateway.Tests/Search/McpGatewaySearchVectorTests.cs Splits out vector search tests from prior combined file.
tests/ManagedCode.MCPGateway.Tests/Search/McpGatewaySearchTests.cs Converts to partial and keeps shared tool config helpers.
tests/ManagedCode.MCPGateway.Tests/Search/McpGatewaySearchLexicalTests.cs Adds tokenizer/lexical fallback tests (incl. query normalization).
tests/ManagedCode.MCPGateway.Tests/Search/McpGatewaySearchEmbeddingStoreTests.cs Adds embedding-store reuse/regeneration tests.
tests/ManagedCode.MCPGateway.Tests/Search/McpGatewaySearchBuildTests.cs Adds registry/index build behavior + concurrency tests.
tests/ManagedCode.MCPGateway.Tests/Search/McpGatewayInitializationTests.cs Adds tests for service-provider init + hosted warmup.
tests/ManagedCode.MCPGateway.Tests/Search/McpGatewayInMemoryToolEmbeddingStoreTests.cs Refactors in-memory embedding store tests via helpers.
tests/ManagedCode.MCPGateway.Tests/MetaTools/McpGatewayMetaToolTests.cs Updates tests to use shared TestFunctionFactory.
tests/ManagedCode.MCPGateway.Tests/Invocation/McpGatewayInvocationTests.cs Converts to partial and splits invocation tests.
tests/ManagedCode.MCPGateway.Tests/Invocation/McpGatewayInvocationMcpTests.cs Adds MCP invocation tests in a dedicated file.
tests/ManagedCode.MCPGateway.Tests/Invocation/McpGatewayInvocationLocalTests.cs Adds local invocation tests in a dedicated file.
src/ManagedCode.MCPGateway/Registration/McpGatewayServiceProviderExtensions.cs Adds InitializeManagedCodeMcpGatewayAsync() extension.
src/ManagedCode.MCPGateway/Registration/McpGatewayServiceCollectionExtensions.cs Updates DI wiring and adds hosted warmup registration.
src/ManagedCode.MCPGateway/Properties/AssemblyInfo.cs Exposes internals to the test assembly.
src/ManagedCode.MCPGateway/Models/Search/McpGatewaySearchStrategy.cs Adds explicit search strategy enum.
src/ManagedCode.MCPGateway/Models/Search/McpGatewaySearchResult.cs Adds public search result record.
src/ManagedCode.MCPGateway/Models/Search/McpGatewaySearchRequest.cs Adds public search request record.
src/ManagedCode.MCPGateway/Models/Search/McpGatewaySearchQueryNormalization.cs Adds query-normalization configuration enum.
src/ManagedCode.MCPGateway/Models/Search/McpGatewaySearchMatch.cs Adds public search match record.
src/ManagedCode.MCPGateway/Models/Invocation/McpGatewayInvokeResult.cs Adds public invoke result record.
src/ManagedCode.MCPGateway/Models/Invocation/McpGatewayInvokeRequest.cs Adds public invoke request record.
src/ManagedCode.MCPGateway/Models/Embeddings/McpGatewayToolEmbeddingLookup.cs Adds embedding lookup record.
src/ManagedCode.MCPGateway/Models/Embeddings/McpGatewayToolEmbedding.cs Adds embedding record.
src/ManagedCode.MCPGateway/Models/Catalog/McpGatewayToolDescriptor.cs Adds tool descriptor record.
src/ManagedCode.MCPGateway/Models/Catalog/McpGatewaySourceKind.cs Adds source-kind enum.
src/ManagedCode.MCPGateway/Models/Catalog/McpGatewayIndexBuildResult.cs Adds index build result record.
src/ManagedCode.MCPGateway/Models/Catalog/McpGatewayDiagnostic.cs Adds diagnostic record.
src/ManagedCode.MCPGateway/McpGatewayToolSet.cs Extracts tool descriptions into constants.
src/ManagedCode.MCPGateway/ManagedCode.MCPGateway.csproj Adds hosting/tokenizer dependencies.
src/ManagedCode.MCPGateway/Internal/Warmup/McpGatewayIndexWarmupService.cs Adds hosted background index warmup service.
src/ManagedCode.MCPGateway/Internal/Serialization/McpGatewayJsonSerializer.cs Adds centralized JSON normalization helpers.
src/ManagedCode.MCPGateway/Internal/Runtime/Search/McpGatewaySearchTokenizerFactory.cs Adds single built-in tokenizer factory.
src/ManagedCode.MCPGateway/Internal/Runtime/Search/McpGatewayRuntime.Tokenization.cs Implements token/lexical term extraction and profiles.
src/ManagedCode.MCPGateway/Internal/Runtime/Search/McpGatewayRuntime.TokenSearchSegments.cs Builds token-search segments from descriptors/queries/context.
src/ManagedCode.MCPGateway/Internal/Runtime/Search/McpGatewayRuntime.Search.cs Implements search flow (vector + tokenizer/lexical ranking).
src/ManagedCode.MCPGateway/Internal/Runtime/Search/McpGatewayRuntime.QueryNormalization.cs Adds optional English query normalization via keyed IChatClient.
src/ManagedCode.MCPGateway/Internal/Runtime/Search/McpGatewayRuntime.Context.cs Adds context flattening for ranking input.
src/ManagedCode.MCPGateway/Internal/Runtime/Invocation/McpGatewayRuntime.InvocationResults.cs Adds MCP output extraction + JSON scalar normalization.
src/ManagedCode.MCPGateway/Internal/Runtime/Invocation/McpGatewayRuntime.Invocation.cs Implements invocation resolution/mapping and MCP/meta attachment.
src/ManagedCode.MCPGateway/Internal/Runtime/Embeddings/McpGatewayRuntime.Embeddings.cs Adds embedding generator/store resolution + fingerprinting.
src/ManagedCode.MCPGateway/Internal/Runtime/Core/McpGatewayRuntime.cs Introduces core runtime constants/state and DI-resolved options.
src/ManagedCode.MCPGateway/Internal/Runtime/Core/McpGatewayRuntime.Types.cs Adds internal runtime records/leasing helpers.
src/ManagedCode.MCPGateway/Internal/Runtime/Core/McpGatewayRuntime.Snapshot.cs Adds snapshot refresh logic based on registry versioning.
src/ManagedCode.MCPGateway/Internal/Runtime/Catalog/McpGatewayRuntime.Indexing.cs Implements single-flight index builds + embeddings persistence.
src/ManagedCode.MCPGateway/Internal/Runtime/Catalog/McpGatewayRuntime.Descriptors.cs Builds descriptors/documents from tools and schemas.
src/ManagedCode.MCPGateway/Internal/Embeddings/McpGatewayToolEmbeddingStoreIndex.cs Extracts embedding store index helper (clone, lookup, remove).
src/ManagedCode.MCPGateway/Internal/Catalog/Sources/McpGatewayToolSourceRegistrations.cs Refactors source registrations and MCP client caching/concurrency.
src/ManagedCode.MCPGateway/Internal/Catalog/Sources/McpGatewayRegistrationCollection.cs Adds thread-safe registration collection.
src/ManagedCode.MCPGateway/Internal/Catalog/Sources/McpGatewayClientFactory.cs Uses assembly version for MCP client info (no hardcoding).
src/ManagedCode.MCPGateway/Internal/Catalog/McpGatewayRegistry.cs Adds separate registry service implementing catalog source + disposal gate.
src/ManagedCode.MCPGateway/Internal/Catalog/McpGatewayOperationGate.cs Adds lifecycle gate for registry operations vs dispose.
src/ManagedCode.MCPGateway/Internal/Catalog/McpGatewayDefaults.cs Introduces default source id constant.
src/ManagedCode.MCPGateway/Internal/Catalog/McpGatewayCatalogSourceSnapshot.cs Adds snapshot record (version + registrations).
src/ManagedCode.MCPGateway/Internal/Catalog/IMcpGatewayCatalogSource.cs Adds internal catalog source abstraction.
src/ManagedCode.MCPGateway/Embeddings/McpGatewayInMemoryToolEmbeddingStore.cs Adds public in-memory embedding store using internal index helper.
src/ManagedCode.MCPGateway/Configuration/McpGatewayServiceKeys.cs Adds keyed service key for search-query chat client.
src/ManagedCode.MCPGateway/Configuration/McpGatewayOptions.cs Adds new options surface for strategy/normalization/defaults and registration helpers.
src/ManagedCode.MCPGateway/Abstractions/Embeddings/IMcpGatewayToolEmbeddingStore.cs Introduces public embedding store abstraction.
src/ManagedCode.MCPGateway/Abstractions/Catalog/IMcpGatewayRegistry.cs Introduces public registry abstraction for post-build mutations.
docs/Features/SearchQueryNormalizationAndRanking.md Documents normalization + tokenizer ranking design and verification.
docs/Architecture/Overview.md Documents new module boundaries and dependency rules.
README.md Updates usage docs for strategies, warmup, normalization, and defaults.
Directory.Packages.props Adds hosting/tokenizer package versions.
Directory.Build.props Bumps package version to 0.2.0.
AGENTS.md Updates repo workflow rules and documentation/testing guidance.
Comments suppressed due to low confidence (1)

src/ManagedCode.MCPGateway/Internal/Catalog/Sources/McpGatewayToolSourceRegistrations.cs:196

  • In GetClientAsync, createdTask is constructed before the CompareExchange succeeds. Under concurrent callers this can start multiple CreateClientAsync(...).AsTask() executions; tasks that lose the CAS still run and can create extra McpClient instances (and potentially leak/dispose incorrectly). Consider using a single-flight pattern that only starts CreateClientAsync after winning the CAS (e.g., store a TaskCompletionSource<McpClient>/lazy initializer, or create the task inside the successful CompareExchange branch).
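
One way to read the suggestion above, sketched with only BCL types: publish a `TaskCompletionSource<T>` placeholder with `Interlocked.CompareExchange`, and invoke the factory only in the caller that wins the exchange. `SingleFlightClient<T>` and its members are hypothetical names for illustration, not the types in `McpGatewayToolSourceRegistrations.cs`.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: the factory starts only after a caller wins the CAS.
public sealed class SingleFlightClient<T>
{
    private readonly Func<Task<T>> _factory;
    private TaskCompletionSource<T>? _slot;

    public SingleFlightClient(Func<Task<T>> factory) => _factory = factory;

    public Task<T> GetAsync()
    {
        var existing = Volatile.Read(ref _slot);
        if (existing is not null)
        {
            return existing.Task;
        }

        // Allocating the placeholder is cheap; no factory work has started yet.
        var candidate = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
        var winner = Interlocked.CompareExchange(ref _slot, candidate, null);
        if (winner is not null)
        {
            // Lost the race: reuse the winner's task without invoking the factory.
            return winner.Task;
        }

        // Won the race: only now start the factory.
        _ = RunAsync(candidate);
        return candidate.Task;
    }

    private async Task RunAsync(TaskCompletionSource<T> slot)
    {
        try
        {
            slot.SetResult(await _factory().ConfigureAwait(false));
        }
        catch (Exception ex)
        {
            // Clear the slot so a later caller can retry, then surface the failure.
            Interlocked.CompareExchange(ref _slot, null, slot);
            slot.SetException(ex);
        }
    }
}
```

Callers that lose the exchange discard only an unused placeholder, so no extra client is ever created or left undisposed.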

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8abca740bd

@KSemenenko

Addressed all current review comments in .

Changes included:

  • propagated cancellation into source loading, embedding generation, and embedding-store I/O while preserving single-flight behavior
  • made JSON context/schema serialization resilient to unsupported and cyclic payloads so search/invocation stay best-effort instead of failing hard
  • fixed MCP client single-flight creation so concurrent callers do not eagerly start duplicate client factories before winning the compare-exchange
  • added regression coverage for canceled index builds, cyclic context payloads, and retry-after-cancel behavior
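
A rough sketch of how cancellation and single-flight can coexist, assuming a simplified builder: the first caller's token flows into the build, later callers join the same task, and a canceled or faulted build is replaced on the next call. `SingleFlightBuilder` and its delegate are hypothetical stand-ins, not the runtime's indexing code.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for an index builder; none of these names come from the repository.
public sealed class SingleFlightBuilder
{
    private readonly Func<CancellationToken, Task<int>> _build;
    private readonly object _gate = new();
    private Task<int>? _inFlight;

    public SingleFlightBuilder(Func<CancellationToken, Task<int>> build) => _build = build;

    public Task<int> BuildAsync(CancellationToken cancellationToken = default)
    {
        lock (_gate)
        {
            // Reuse a build that is still running or has completed successfully.
            if (_inFlight is { IsCanceled: false, IsFaulted: false } running)
            {
                // Joiners share the build but can stop waiting on their own token.
                return running.WaitAsync(cancellationToken);
            }

            // No usable build: start a new one with this caller's token.
            _inFlight = RunAsync(cancellationToken);
            return _inFlight;
        }
    }

    private async Task<int> RunAsync(CancellationToken cancellationToken)
    {
        // Leave the lock before doing any real work.
        await Task.Yield();
        cancellationToken.ThrowIfCancellationRequested();

        // The token is handed to the build so every stage can observe it.
        return await _build(cancellationToken).ConfigureAwait(false);
    }
}
```

In this sketch the build is tied to the token of the caller that started it, so canceling that token also cancels the build for joiners. A production design might link tokens or reference-count callers instead.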

Verification:

  • Determining projects to restore...
    All projects are up-to-date for restore.
  • ManagedCode.MCPGateway -> /Users/ksemenenko/Developer/MCPGateway/src/ManagedCode.MCPGateway/bin/Release/net10.0/ManagedCode.MCPGateway.dll
    ManagedCode.MCPGateway.Tests -> /Users/ksemenenko/Developer/MCPGateway/tests/ManagedCode.MCPGateway.Tests/bin/Release/net10.0/ManagedCode.MCPGateway.Tests.dll

Build succeeded.
0 Warning(s)
0 Error(s)

Time Elapsed 00:00:00.44

  • Running tests from /Users/ksemenenko/Developer/MCPGateway/tests/ManagedCode.MCPGateway.Tests/bin/Release/net10.0/ManagedCode.MCPGateway.Tests.dll (net10.0|arm64)
    /Users/ksemenenko/Developer/MCPGateway/tests/ManagedCode.MCPGateway.Tests/bin/Release/net10.0/ManagedCode.MCPGateway.Tests.dll (net10.0|arm64) passed (973ms)

Test run summary: Passed!
total: 65
failed: 0
succeeded: 65
skipped: 0
duration: 1s 111ms

  • ManagedCode.MCPGateway -> /Users/ksemenenko/Developer/MCPGateway/src/ManagedCode.MCPGateway/bin/Release/net10.0/ManagedCode.MCPGateway.dll
    ManagedCode.MCPGateway.Tests -> /Users/ksemenenko/Developer/MCPGateway/tests/ManagedCode.MCPGateway.Tests/bin/Release/net10.0/ManagedCode.MCPGateway.Tests.dll

Build succeeded.
0 Warning(s)
0 Error(s)

Time Elapsed 00:00:00.40

  • Loading solution '/Users/ksemenenko/Developer/MCPGateway/ManagedCode.MCPGateway.slnx'... for src/tests
  • github.com/AlDanial/cloc v 2.08 T=0.07 s (1037.8 files/s, 95611.2 lines/s)

Language  files  blank  comment  code
C#           76   1023        6  5973
SUM:         76   1023        6  5973

Result: 65/65 tests passed, 0 warnings, 0 diagnostics.

@KSemenenko

Addressed all current review comments in b868f13.

Changes included:

  • propagated BuildIndexAsync cancellation into source loading, embedding generation, and embedding-store I/O while preserving single-flight behavior
  • made JSON context/schema serialization resilient to unsupported and cyclic payloads so search/invocation stay best-effort instead of failing hard
  • fixed MCP client single-flight creation so concurrent callers do not eagerly start duplicate client factories before winning the compare-exchange
  • added regression coverage for canceled index builds, cyclic context payloads, and retry-after-cancel behavior
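
For the serialization point, a best-effort helper could look roughly like the sketch below. It relies on `System.Text.Json` with `ReferenceHandler.IgnoreCycles` and treats unsupported payloads as "no context" instead of an error. `BestEffortJson.TrySerialize` is a hypothetical name, and the repository's serializer may handle these cases differently.

```csharp
#nullable enable
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical helper; the repository's serializer may work differently.
public static class BestEffortJson
{
    private static readonly JsonSerializerOptions Options = new()
    {
        // Write null where an object would otherwise reference itself.
        ReferenceHandler = ReferenceHandler.IgnoreCycles,
    };

    // Returns JSON text, or null when the payload cannot be serialized.
    public static string? TrySerialize(object? value)
    {
        if (value is null)
        {
            return null;
        }

        try
        {
            return JsonSerializer.Serialize(value, value.GetType(), Options);
        }
        catch (Exception ex) when (ex is JsonException or NotSupportedException)
        {
            // Unsupported payloads are dropped instead of failing the caller.
            return null;
        }
    }
}
```

Returning `null` lets search or invocation continue without the problematic context, which is the best-effort behavior described above.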

Verification:

  • dotnet restore ManagedCode.MCPGateway.slnx
  • dotnet build ManagedCode.MCPGateway.slnx -c Release --no-restore
  • dotnet test --solution ManagedCode.MCPGateway.slnx -c Release --no-build
  • dotnet build ManagedCode.MCPGateway.slnx -c Release --no-restore -p:RunAnalyzers=true
  • roslynator analyze for src/tests
  • cloc --include-lang=C# src tests

Result: 65/65 tests passed, 0 warnings, 0 diagnostics.

@KSemenenko KSemenenko merged commit 7d4b30a into main Mar 8, 2026
3 checks passed
@KSemenenko KSemenenko deleted the codex/tokenizer-fallback-defaults branch March 8, 2026 07:36