18 changes: 18 additions & 0 deletions .markdownlint.jsonc
@@ -0,0 +1,18 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

// Configuration for the markdownlint VS Code extension.
// See https://github.com/DavidAnson/markdownlint for rule documentation.
{
  // MD013 - Line length: enforce a maximum line length of 100 characters
  "MD013": {
    "line_length": 100,
    "tables": false
  },
  // MD024 - No duplicate headings: only flag duplicates among sibling headings
  // (allows the same heading text under different parents)
  "MD024": {
    "siblings_only": true
  }
}
12 changes: 11 additions & 1 deletion .vscode/mcp.json
@@ -59,6 +59,16 @@
"type": "http",
"url": "https://mcp.bluebird-ai.net/"
},
"bluebird-mcp-dsmaindev": {
"headers": {
"x-mcp-ec-branch": "master",
"x-mcp-ec-organization": "msdata",
"x-mcp-ec-project": "Database Systems",
"x-mcp-ec-repository": "DsMainDev"
},
"type": "http",
"url": "https://mcp.bluebird-ai.net/"
},
"github": {
"type": "http",
"url": "https://api.githubcopilot.com/mcp/"
@@ -72,4 +82,4 @@
"url": "https://learn.microsoft.com/api/mcp"
}
}
}
}
68 changes: 68 additions & 0 deletions plans/database_context/00-overview.md
@@ -0,0 +1,68 @@
# Database Context Preservation Across Internal Reconnections

## Issue

[dotnet/SqlClient#4108](https://github.com/dotnet/SqlClient/issues/4108) — `SqlConnection` doesn't
restore database in the new session if connection is lost.

When a user changes the active database via `USE [db]` through `SqlCommand.ExecuteNonQuery()`, and
the physical connection subsequently breaks and is transparently reconnected, the recovered session
may land on the **initial catalog** from the connection string instead of the database the user
switched to.

## Status

**Fix implemented and validated.** The root cause (Issue G) has been identified and fixed. See
[03-issues.md](03-issues.md) for the full issue list and [04-recommendations.md](04-recommendations.md)
for the fix details. Server-side analysis of the SQL Server engine's session recovery code confirms
the fix is correct — see [06-server-side-analysis.md](06-server-side-analysis.md).

### Root Cause

`CompleteLogin()` in `SqlConnectionInternal.cs` trusted the server's `ENV_CHANGE` response
unconditionally after session recovery. If the server did not properly restore the database context,
the client silently ended up on the wrong database.

### Server-Side Confirmation

Analysis of the SQL Server source code (`featureext.cpp`, `login.cpp`, `session.cpp`) confirms that
the server correctly implements session recovery for database context — the recovery database from
the ToBe chunk is treated as mandatory (`Source #0`) with no silent fallback. The root cause is
entirely **client-side**: `CompleteLogin()` did not verify the server's response matched the recovery
target.

### Fix

After session recovery completes in `CompleteLogin()`, the fix compares `CurrentDatabase` (set by
the server's `ENV_CHANGE`) against the database from `_recoverySessionData`. If they differ, a
`USE [database]` command is sent to the server to force alignment. This guarantees both client and
server are on the correct database after recovery, regardless of server behavior.
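
The check described above can be modeled in a few lines. This is a minimal Python sketch rather than the driver's C# code: `RecoveryData`, `quote_identifier`, and `verify_recovered_database` are hypothetical names, the exact (case-sensitive) comparison is an assumption, and so is the choice of recovery target (the switched-to database when one was recorded, otherwise the initial one).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RecoveryData:
    """Hypothetical stand-in for the saved session (_recoverySessionData)."""
    initial_database: str
    database: Optional[str] = None  # None models "never switched away"


def quote_identifier(name: str) -> str:
    # Bracket-quote a database name, doubling any closing bracket.
    return "[" + name.replace("]", "]]") + "]"


def verify_recovered_database(
    current_database: str,
    recovery: Optional[RecoveryData],
    send_command: Callable[[str], None],
) -> str:
    """Return the database the session is on once the check has run."""
    if recovery is None:
        # First login: there is no recovery target to verify against.
        return current_database
    # Assumption: prefer the switched-to database, else the baseline.
    target = recovery.database or recovery.initial_database
    if current_database != target:
        # The server reported a different database (or sent no ENV_CHANGE,
        # leaving the initial catalog in place), so force alignment.
        send_command(f"USE {quote_identifier(target)}")
        return target
    return current_database
```

When the server restores the database correctly, the comparison succeeds and no command is sent, which matches the "no overhead in the common case" property. In the real driver the `USE` response would update `CurrentDatabase` through `ENV_CHANGE`; the sketch simply returns the target.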

The fix is gated behind the `Switch.Microsoft.Data.SqlClient.VerifyRecoveredDatabaseContext`
AppContext switch (default: `true`). Manual tests set it to `false` to confirm the server-only path
fails without the fix.
Comment on lines +39 to +43
Copilot AI, Apr 2, 2026

This doc says `VerifyRecoveredDatabaseContext` defaults to `true`, but `LocalAppContextSwitches.VerifyRecoveredDatabaseContext` currently defaults to `false`. Please update this document (or the switch default) so the documented behavior matches the implementation.

Comment on lines +42 to +43
Copilot AI, Apr 6, 2026

This document states the `VerifyRecoveredDatabaseContext` switch default is `true`, but `LocalAppContextSwitches.VerifyRecoveredDatabaseContext` currently uses `defaultValue: false`. Please update the doc (or the implementation) so the documented default matches shipped behavior.

Suggested change
- AppContext switch (default: `true`). Manual tests set it to `false` to confirm the server-only path
- fails without the fix.
+ AppContext switch (default: `false`). Manual tests use `false` to confirm the server-only path
+ fails without the client-side fix.

### Key Properties of the Fix

- **Correct**: Both client and server are guaranteed to be on the same database after recovery
- **Safe**: Only executes during reconnection (`_recoverySessionData != null`), never on first login
- **Efficient**: No overhead when the server properly restores the database (the common case)
- **Defensive**: Handles both wrong-database and missing-ENV_CHANGE server behaviors

## Scope

This analysis covers every code path where an internal reconnection can occur and evaluates whether
the current database context (`CurrentDatabase`) is correctly maintained. The assumption is:

> Any internal reconnection within `SqlConnection` must maintain the current database context.

## Documents

| File | Contents |
| ---- | -------- |
| [01-architecture.md](01-architecture.md) | How database context is tracked and how session recovery works |
| [02-flows.md](02-flows.md) | Every reconnection flow, annotated with database context behaviour |
| [03-issues.md](03-issues.md) | Identified bugs and gaps, ranked by severity |
| [04-recommendations.md](04-recommendations.md) | Proposed fixes |
| [05-reconnection-and-retry-mechanisms.md](05-reconnection-and-retry-mechanisms.md) | All retry/reconnection mechanisms with public doc cross-references |
| [06-server-side-analysis.md](06-server-side-analysis.md) | SQL Server engine session recovery internals, including `ParseFeatureData`, `ParseSessionDataChunk`, `FDetermineSessionDb`, test coverage gaps |
140 changes: 140 additions & 0 deletions plans/database_context/01-architecture.md
@@ -0,0 +1,140 @@
# Architecture: Database Context Tracking and Session Recovery

## Key Data Structures

### `SqlConnectionInternal` fields

| Field | Type | Set during | Purpose |
| ----- | ---- | ---------- | ------- |
| `CurrentDatabase` | `string` | Login, ENV_CHANGE | The database the server considers active right now |
| `_originalDatabase` | `string` | Login, constructor | Reset target for pool recycling (`ResetConnection()` restores to this value) |
| `_currentSessionData` | `SessionData` | Constructor | Live session state, snapshotted before reconnection |
| `_recoverySessionData` | `SessionData` | Constructor (param) | Saved session from the broken connection, used to build the recovery login packet |
| `_fConnectionOpen` | `bool` | `CompleteLogin()` | Guards whether ENV_CHANGE updates `_originalDatabase` |
| `_sessionRecoveryAcknowledged` | `bool` | `OnFeatureExtAck()` | Whether the server supports session recovery |

### `SessionData` fields

| Field | Mutated by | Cleared by `Reset()` | Purpose |
| ----- | ---------- | -------------------- | ------- |
| `_initialDatabase` | `CompleteLogin()` (first login only) | No | Immutable baseline from the login the server confirmed |
| `_database` | `CurrentSessionData` getter | Yes (set to `null`) | Current database, written just-in-time before snapshot |
| `_initialLanguage` | `CompleteLogin()` | No | Immutable baseline language |
| `_language` | `CurrentSessionData` getter | Yes | Current language |
| `_initialCollation` | `CompleteLogin()` | No | Immutable baseline collation |
| `_collation` | `OnEnvChange()` | Yes | Current collation |
| `_delta[]` | `OnFeatureExtAck()`, `SQLSESSIONSTATE` token handler | Yes | Per-stateID session variable changes |
| `_initialState[]` | `OnFeatureExtAck()` (first login only) | No | Per-stateID session variable baselines |
| `_unrecoverableStatesCount` | `SQLSESSIONSTATE` token handler | Yes | Count of non-recoverable session states |

`Reset()` is called when `ENV_SPRESETCONNECTIONACK` arrives (server acknowledged
`sp_reset_connection`). It clears delta/current state but preserves the immutable baselines.
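
The split between baselines and current state in the table can be illustrated with a small model. This Python sketch is an assumption-laden stand-in, not the driver's `SessionData` class: the name `SessionDataModel` is hypothetical, and dictionaries keyed by state ID stand in for the `_delta[]` and `_initialState[]` arrays.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SessionDataModel:
    # Baselines: captured at first login and preserved by reset().
    initial_database: Optional[str] = None
    initial_language: Optional[str] = None
    initial_collation: Optional[str] = None
    initial_state: Dict[int, bytes] = field(default_factory=dict)
    # Current/delta state: cleared by reset().
    database: Optional[str] = None
    language: Optional[str] = None
    collation: Optional[str] = None
    delta: Dict[int, bytes] = field(default_factory=dict)
    unrecoverable_states_count: int = 0

    def reset(self) -> None:
        """Model of Reset(): drop current state, keep the baselines."""
        self.database = None
        self.language = None
        self.collation = None
        self.delta.clear()
        self.unrecoverable_states_count = 0
```

After `reset()`, only the fields marked "No" in the "Cleared by `Reset()`" column still hold values.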

## How `CurrentDatabase` is set

### During login

```text
Login()               → CurrentDatabase = server.ResolvedDatabaseName
                        (= ConnectionOptions.InitialCatalog)
Server login response → ENV_CHANGE(ENV_DATABASE) → CurrentDatabase = newValue
CompleteLogin()       → _currentSessionData._initialDatabase = CurrentDatabase
                        (only when _recoverySessionData == null, i.e. first login)
```

`Login()` (`SqlConnectionInternal.cs` line 2976) sets `CurrentDatabase` to `InitialCatalog`
immediately. The server then confirms (or overrides) the value via ENV_CHANGE before
`CompleteLogin()` captures it.

### During normal operation

```text
USE [MyDb] via SqlCommand → server response → ENV_CHANGE(ENV_DATABASE)
  OnEnvChange() → CurrentDatabase = "MyDb"
  _originalDatabase NOT updated (guarded by _fConnectionOpen)
```

`SqlConnectionInternal.cs` lines 1155–1164. After the connection is open, `_originalDatabase` is
frozen.

### During pool reset

```text
Deactivate() → ResetConnection()
  → _parser.PrepareResetConnection()    (sets TDS header flag for sp_reset_connection)
  → CurrentDatabase = _originalDatabase (resets to initial catalog immediately)
```

`SqlConnectionInternal.cs` lines 3895–3907.
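
The three phases above (login, normal operation, pool reset) can be tied together in one small model. This is a Python sketch under stated assumptions, not the driver's code: `ConnectionModel` and its method names are hypothetical, and it assumes that an `ENV_CHANGE` arriving before the connection is open also updates `_originalDatabase`, as the `_fConnectionOpen` guard suggests.

```python
class ConnectionModel:
    """Hypothetical model of CurrentDatabase / _originalDatabase tracking."""

    def __init__(self, initial_catalog: str) -> None:
        # Login() sets CurrentDatabase to the initial catalog immediately.
        self.current_database = initial_catalog
        self.original_database = initial_catalog
        self.connection_open = False  # models _fConnectionOpen

    def on_env_change_database(self, new_value: str) -> None:
        self.current_database = new_value
        if not self.connection_open:
            # Assumption: only login-time changes move the reset target.
            self.original_database = new_value

    def complete_login(self) -> None:
        # From here on, the reset target is frozen.
        self.connection_open = True

    def reset_connection(self) -> None:
        # Pool recycling restores the reset target immediately.
        self.current_database = self.original_database
```

For example, a connection opened against `master` that later runs `USE [MyDb]` reports `MyDb` as its current database, and returns to `master` once the pool resets it.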

### `CurrentSessionData` getter (just-in-time snapshot)

```csharp
internal SessionData CurrentSessionData
{
    get
    {
        if (_currentSessionData != null)
        {
            _currentSessionData._database = CurrentDatabase;
            _currentSessionData._language = _currentLanguage;
        }
        return _currentSessionData;
    }
}
```

`SqlConnectionInternal.cs` lines 530–537. `ValidateAndReconnect()` reads this getter right before
saving the recovery data for reconnection.

## Session Recovery Protocol

When `ConnectRetryCount > 0` (default: **1**), the driver negotiates `FEATUREEXT_SRECOVERY` with the
server during login. On reconnection, `WriteSessionRecoveryFeatureRequest()` encodes:

1. **Initial state**: `_initialDatabase`, `_initialCollation`, `_initialLanguage`, `_initialState[]`
2. **Current deltas**: `_database` (if different from `_initialDatabase`), `_language`,
   `_collation`, `_delta[]`

The server uses the initial state + deltas to rebuild the session. If
`_database != _initialDatabase`, the server switches to `_database` after login.

### `WriteSessionRecoveryFeatureRequest` — relevant excerpt

```text
TdsParser.cs line 8963:
  initialLength += ... _initialDatabase ...
TdsParser.cs line 8966:
  currentLength += ... (_initialDatabase == _database ? 0 : _database) ...
TdsParser.cs line 9017:
  WriteIdentifier(_database != _initialDatabase ? _database : null, ...)
```

When `_database` equals `_initialDatabase`, a zero-length identifier is written (meaning "no
change"). When they differ, the current database name is written and the server applies it.
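
The "write the database only when it changed" rule can be sketched as follows. This Python model is illustrative only: `encode_identifier` and `encode_current_database` are hypothetical names, and the byte layout (one length byte holding the UTF-16 code-unit count, followed by UTF-16LE text) is an assumption about the wire format, not taken from the excerpt above.

```python
from typing import Optional


def encode_identifier(value: Optional[str]) -> bytes:
    """Assumed layout: one length byte (UTF-16 code units) + UTF-16LE text."""
    if not value:
        # Zero-length identifier: "no change" relative to the baseline.
        return b"\x00"
    encoded = value.encode("utf-16-le")
    units = len(encoded) // 2
    if units > 255:
        raise ValueError("identifier too long for a one-byte length prefix")
    return bytes([units]) + encoded


def encode_current_database(initial_database: str,
                            database: Optional[str]) -> bytes:
    # Send the current database only when it differs from the baseline.
    changed = database if database != initial_database else None
    return encode_identifier(changed)
```

With a baseline of `master`, a current database of `master` yields the single byte `0x00`, while `MyDb` yields a length byte of 4 followed by the eight bytes of the UTF-16LE name.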

## Flow: How `ValidateAndReconnect` triggers recovery

```text
SqlCommand.RunExecuteNonQueryTds()
→ SqlConnection.ValidateAndReconnect()
    check _connectRetryCount > 0
    check _sessionRecoveryAcknowledged
    check !stateObj.ValidateSNIConnection()        ← physical connection broken?
    SessionData cData = tdsConn.CurrentSessionData ← snapshot (writes _database = CurrentDatabase)
    _recoverySessionData = cData                   ← save for new connection
    tdsConn.DoomThisConnection()
    Task.Run → ReconnectAsync()
      → ForceNewConnection = true
      → OpenAsync()
        → TryReplaceConnection()
          → SqlConnectionFactory.CreateConnection()
            → new SqlConnectionInternal(..., recoverySessionData)
                constructor: _originalDatabase = recoverySessionData._initialDatabase
                Login()
                  → CurrentDatabase = InitialCatalog
                  → login.database = CurrentDatabase
                  → TdsLogin(..., _recoverySessionData, ...)
                    → WriteSessionRecoveryFeatureRequest(recoverySessionData, ...)
                Server processes login + recovery → ENV_CHANGE(ENV_DATABASE) → CurrentDatabase updated
                CompleteLogin()
                  → _recoverySessionData = null
```
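
The three checks at the top of this flow decide whether a recovery snapshot is taken at all. The following Python sketch models only that gate; `Snapshot` and `snapshot_for_recovery` are hypothetical names, and the boolean `link_alive` stands in for the result of `stateObj.ValidateSNIConnection()`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    """Hypothetical stand-in for the SessionData saved for recovery."""
    initial_database: str
    database: str


def snapshot_for_recovery(
    connect_retry_count: int,
    recovery_acknowledged: bool,
    link_alive: bool,
    initial_database: str,
    current_database: str,
) -> Optional[Snapshot]:
    """Return a snapshot when a transparent reconnect should start."""
    if connect_retry_count <= 0:
        return None  # reconnection is disabled
    if not recovery_acknowledged:
        return None  # server did not acknowledge session recovery
    if link_alive:
        return None  # physical connection is fine, nothing to do
    # Just-in-time snapshot: the current database is written at this point.
    return Snapshot(initial_database, current_database)
```

A snapshot whose `database` differs from `initial_database` is the case that matters for this issue, because the new connection starts on the initial catalog and depends on recovery to move it.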