KeyError crash when running ~1000 queries on a single connection after any DDL/ROLLBACK (connection becomes unusable) #273

@colin-rogers-dbt

Description

Summary

When a connection executes a DDL statement (CREATE, DROP, ALTER) or triggers a ROLLBACK, the prepared-statement LRU cache enters a split-brain state. Once the cache subsequently fills back up to max_prepared_statements (default: 1000), the next query crashes with a KeyError inside the eviction logic. The crash also corrupts the connection so that the following rollback() call fails too, leaving the connection permanently broken.

This manifests in some dbt projects because, after dbt executes at least one DDL statement, users can run an arbitrary number of subsequent queries, all on the same connection.

Environment

  • redshift-connector version: 2.1.12
  • Python: 3.x
  • Cluster: Redshift (provisioned)

Steps to reproduce

Minimal inline repro:

```python
import redshift_connector

conn = redshift_connector.connect(host=..., database=..., user=..., password=...)
conn.autocommit = True
cur = conn.cursor()

# Step 1: DDL poisons the LRU tracker
cur.execute("CREATE TABLE IF NOT EXISTS repro_test (id INT)")

# Step 2: fill the prepared-statement cache to the limit (default 1000)
for i in range(1000):
    cur.execute(f"SELECT {i} AS n")
    cur.fetchall()

# Step 3: one more new query triggers eviction → KeyError
cur.execute("SELECT 'boom' AS x")  # ← crashes here
```

Expected: query succeeds.
Actual:
```
KeyError: ('CREATE TABLE IF NOT EXISTS repro_test (id INT)', ())
  File "redshift_connector/core.py", line 1849, in execute
    statements_to_close.append(cache["ps"][oldest_key]["statement_name_bin"])
```

Root cause

There are two parallel cache structures kept in sync for LRU eviction:

```
cache["ps"] – dict : key → prepared-statement metadata
cache["statement_dict"] – OrderedDict : key → None (LRU order)
```
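The divergence is easy to reproduce with a stand-alone model of these two structures (plain Python, no connection required; the keys and statement names below are illustrative, not the library's real values):

```python
from collections import OrderedDict

cache = {"ps": {}, "statement_dict": OrderedDict()}

# Three unique queries: both structures grow in lockstep.
for key in [("SELECT 1", ()), ("SELECT 2", ()), ("SELECT 3", ())]:
    cache["ps"][key] = {"statement_name_bin": b"stmt"}
    cache["statement_dict"][key] = None

# Simulate handle_COMMAND_COMPLETE on DDL/ROLLBACK: only ps is cleared.
cache["ps"].clear()

print(len(cache["ps"]), len(cache["statement_dict"]))  # → 0 3
```

After the simulated DDL, `statement_dict` still tracks three keys that no longer exist in `ps` — exactly the split-brain state described above.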

Bug 1 — handle_COMMAND_COMPLETE (core.py:2029–2034) clears ps on DDL/ROLLBACK but leaves statement_dict intact:

```python
if command in (b"ALTER", b"CREATE", b"DROP", b"ROLLBACK"):
    for ps in pcache["ps"].values():
        self.close_prepared_statement(ps["statement_name_bin"])
    pcache["ps"].clear()  # ← ps emptied
    # statement_dict is NOT cleared ← stale keys remain
```

Bug 2 — The consistency-repair check (core.py:1842–1844) only adds entries to statement_dict; it never removes stale ones:

```python
if len(cache["ps"]) != len(cache["statement_dict"]):
    for existing_key in cache["ps"]:
        cache["statement_dict"][existing_key] = None  # adds, never removes
```

Bug 3 (crash site) — Eviction blindly assumes the oldest statement_dict key exists in ps (core.py:1847–1850):

```python
if len(cache["ps"]) >= self.max_prepared_statements:
    oldest_key, _ = cache["statement_dict"].popitem(last=False)
    statements_to_close.append(cache["ps"][oldest_key]["statement_name_bin"])
    # ↑ KeyError if oldest_key is stale
```

Failure sequence

  1. Connection runs N unique queries → ps and statement_dict each have N entries
  2. A DDL or ROLLBACK executes → ps cleared to 0, statement_dict still has N stale keys
  3. N more unique queries run → ps refills to N; statement_dict now has 2N entries
  4. One more new query: eviction fires (len(ps) >= 1000), pops the oldest statement_dict entry (a stale pre-DDL key), crashes with KeyError
  5. The crashed query left a PARSE message partially sent; the subsequent rollback() fails with ProgrammingError: prepared statement "..." already exists, making the connection permanently unusable
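The whole sequence can be condensed into a self-contained simulation of the cache bookkeeping (with the limit lowered to 3 for brevity; the dict shapes mirror core.py but the helper and values are illustrative):

```python
from collections import OrderedDict

MAX_PS = 3  # stand-in for max_prepared_statements (real default: 1000)
ps = {}
statement_dict = OrderedDict()

def run_query(key):
    """Mimic the prepared-statement cache bookkeeping in execute()."""
    if key in ps:
        return
    # Bug 2: the repair check only ever adds entries
    if len(ps) != len(statement_dict):
        for existing_key in ps:
            statement_dict[existing_key] = None
    # Bug 3: eviction assumes the oldest tracked key is live
    if len(ps) >= MAX_PS:
        oldest_key, _ = statement_dict.popitem(last=False)
        ps[oldest_key]["statement_name_bin"]  # ← KeyError if stale
        del ps[oldest_key]
    ps[key] = {"statement_name_bin": b"stmt"}
    statement_dict[key] = None

run_query(("CREATE TABLE t (id INT)", ()))  # 1. DDL runs as a normal query
ps.clear()                                  # 2. Bug 1: only ps is cleared
for i in range(MAX_PS):                     # 3. refill ps to the limit
    run_query((f"SELECT {i}", ()))

stale = None
try:
    run_query(("SELECT 'boom'", ()))        # 4. eviction pops a stale key
except KeyError as e:
    stale = e.args[0]
print(stale)  # the pre-DDL key that was never removed from statement_dict
```

The key that surfaces in the KeyError is the long-gone DDL statement, matching the traceback shown earlier.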

Observed diagnostic output

```
ps size after DDL: 0 statement_dict size: 502 ← 502 stale entries remain
query 1000: ps=1000, sd=1502 ← divergence grows with each query
Step 4: KeyError: ('SELECT 1 AS query_num', ()) ← crash at eviction
```

Note: in a realistic dbt workflow the failure occurs at query ~999 rather than 1001, because DDL setup queries (DROP TABLE, CREATE TABLE) run before the main loop and pre-poison statement_dict with stale keys, reducing the effective headroom.

Proposed fix

The minimal fix is to also clear statement_dict wherever ps is cleared:

```python
# core.py ~line 2034
pcache["ps"].clear()
pcache["statement_dict"].clear()  # ← add this line
```
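Applied to the stand-alone model of the two structures, clearing both keeps the counts in lockstep (illustrative keys, not the library's real cache object):

```python
from collections import OrderedDict

pcache = {
    "ps": {("SELECT 1", ()): {"statement_name_bin": b"stmt_1"}},
    "statement_dict": OrderedDict([(("SELECT 1", ()), None)]),
}

# Fixed DDL/ROLLBACK handling: clear both structures together.
pcache["ps"].clear()
pcache["statement_dict"].clear()

assert len(pcache["ps"]) == len(pcache["statement_dict"]) == 0
```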

A more defensive fix also guards the eviction against stale keys:

```python
# core.py ~line 1847
if len(cache["ps"]) >= self.max_prepared_statements:
    while cache["statement_dict"]:
        oldest_key, _ = cache["statement_dict"].popitem(last=False)
        if oldest_key in cache["ps"]:
            statements_to_close.append(cache["ps"][oldest_key]["statement_name_bin"])
            del cache["ps"][oldest_key]
            break
        # else: stale key, discard and keep looking
```
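A quick exercise of that guarded loop against a cache pre-seeded with a stale key shows it skipping the dead entry and evicting the oldest live statement instead (stand-alone sketch; the key and statement-name values are illustrative):

```python
from collections import OrderedDict

max_prepared_statements = 2
cache = {
    "ps": {
        ("SELECT 1", ()): {"statement_name_bin": b"stmt_1"},
        ("SELECT 2", ()): {"statement_name_bin": b"stmt_2"},
    },
    "statement_dict": OrderedDict([
        (("CREATE TABLE t (id INT)", ()), None),  # stale pre-DDL key
        (("SELECT 1", ()), None),
        (("SELECT 2", ()), None),
    ]),
}
statements_to_close = []

if len(cache["ps"]) >= max_prepared_statements:
    while cache["statement_dict"]:
        oldest_key, _ = cache["statement_dict"].popitem(last=False)
        if oldest_key in cache["ps"]:
            statements_to_close.append(cache["ps"][oldest_key]["statement_name_bin"])
            del cache["ps"][oldest_key]
            break
        # stale key: discard and keep looking

print(statements_to_close)  # [b'stmt_1'] — the oldest *live* statement
```

The stale pre-DDL key is silently discarded rather than dereferenced, so no KeyError is possible even if a clear is ever missed again.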

Both fixes together fully resolve the issue.
