Commit cb19cfb

janisz, claude, and mtodor authored
add E2E testing framework (#26)
Signed-off-by: Tomasz Janiszewski <tomek@redhat.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Mladen Todorovic <mtodor@gmail.com>
1 parent f74e98e commit cb19cfb

24 files changed: +1041 -3 lines changed

.github/dependabot.yml

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+version: 2
+updates:
+  # Monitor root Go module
+  - package-ecosystem: "gomod"
+    directory: "/"
+    schedule:
+      interval: "daily"
+    commit-message:
+      prefix: "chore"
+      prefix-development: "chore"
+      include: "scope"
+
+  # Monitor e2e-tests tools Go module
+  - package-ecosystem: "gomod"
+    directory: "/e2e-tests/tools"
+    schedule:
+      interval: "daily"
+    commit-message:
+      prefix: "chore"
+      prefix-development: "chore"
+      include: "scope"

.github/workflows/test.yml

Lines changed: 16 additions & 1 deletion
@@ -25,11 +25,26 @@ jobs:
         uses: actions/setup-go@v5

       - name: Download dependencies
-        run: go mod download
+        run: find . -name go.mod -execdir go mod download \;
+
+      - name: Verify go.mod and go.sum are up to date
+        run: |
+          find . -name go.mod -execdir go mod tidy \;
+          if [ -n "$(git status --porcelain)" ]; then
+            echo "Error: go.mod or go.sum files are not up to date"
+            echo "Modified files:"
+            git status --porcelain
+            echo ""
+            echo "Please run 'go mod tidy' in all directories containing go.mod and commit the changes"
+            exit 1
+          fi

       - name: Run tests with coverage
         run: make test-coverage-and-junit

+      - name: Run E2E smoke test
+        run: make e2e-smoke-test
+
       - name: Upload test results to Codecov
         uses: codecov/test-results-action@v1
         with:

.gitignore

Lines changed: 6 additions & 0 deletions
@@ -16,3 +16,9 @@

 # Lint output
 /report.xml
+
+# E2E tests
+/e2e-tests/.env
+/e2e-tests/mcp-reports/
+/e2e-tests/bin/
+/e2e-tests/**/*-out.json

Makefile

Lines changed: 8 additions & 0 deletions
@@ -57,6 +57,14 @@ helm-lint: ## Run helm lint for Helm chart
 test: ## Run unit tests
 	$(GOTEST) -v ./...

+.PHONY: e2e-smoke-test
+e2e-smoke-test: ## Run E2E smoke test (build and verify mcpchecker)
+	@cd e2e-tests && ./scripts/smoke-test.sh
+
+.PHONY: e2e-test
+e2e-test: ## Run E2E tests
+	@cd e2e-tests && ./scripts/run-tests.sh
+
 .PHONY: test-coverage-and-junit
 test-coverage-and-junit: ## Run unit tests with coverage and junit output
 	go install github.com/jstemmer/go-junit-report/v2@v2.1.0

e2e-tests/README.md

Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
+# StackRox MCP E2E Testing
+
+End-to-end tests for the StackRox MCP server using [mcpchecker](https://github.com/mcpchecker/mcpchecker).
+
+## Quick Start
+
+### Smoke Test (No Agent Required)
+
+Validate the configuration and build without running any agents:
+
+```bash
+cd e2e-tests
+./scripts/smoke-test.sh
+```
+
+This is useful in CI and for quickly checking that everything compiles.
+
+## Prerequisites
+
+- Go 1.25+
+- Google Cloud Project with Vertex AI enabled (for Claude agent)
+- OpenAI API Key (for LLM judge)
+- StackRox API Token
+
+## Setup
+
+### 1. Build mcpchecker
+
+```bash
+cd e2e-tests
+./scripts/build-mcpchecker.sh
+```
+
+### 2. Configure Environment
+
+Create a `.env` file in `e2e-tests/`:
+
+```bash
+# Required: GCP Project for Vertex AI (Claude agent)
+ANTHROPIC_VERTEX_PROJECT_ID=<GCP Project ID>
+
+# Required: StackRox Central API Token
+STACKROX_MCP__CENTRAL__API_TOKEN=<StackRox API Token>
+
+# Required: OpenAI API Key (for LLM judge)
+OPENAI_API_KEY=<OpenAI API Key>
+
+# Optional: Vertex AI region (defaults to us-east5)
+CLOUD_ML_REGION=us-east5
+
+# Optional: Judge configuration (defaults to OpenAI)
+JUDGE_MODEL_NAME=gpt-5-nano
+```
+
+## Running Tests
+
+```bash
+./scripts/run-tests.sh
+```
+
+Results are saved to `mcpchecker/mcpchecker-stackrox-mcp-e2e-out.json`.
+
+### View Results
+
+```bash
+# Summary
+jq '.[] | {taskName, taskPassed}' mcpchecker/mcpchecker-stackrox-mcp-e2e-out.json
+
+# Tool calls
+jq '[.[] | .callHistory.ToolCalls[]? | {name: .request.Params.name, arguments: .request.Params.arguments}]' mcpchecker/mcpchecker-stackrox-mcp-e2e-out.json
+```
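
Note on the "View Results" step: for readers who prefer to post-process the results in Go rather than jq, here is a minimal sketch. It assumes the output file is a top-level JSON array (as the `.[]` filter suggests) and reads only the `taskName` and `taskPassed` fields named in the jq summary. The `taskResult` type, the `summarize` helper, and the sample data are hypothetical and not part of this commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// taskResult mirrors only the two fields the jq summary reads.
// The full output schema is not shown in this commit, so every
// other field is ignored. This type is hypothetical.
type taskResult struct {
	TaskName   string `json:"taskName"`
	TaskPassed bool   `json:"taskPassed"`
}

// summarize decodes a results array and returns (passed, total).
func summarize(data []byte) (int, int, error) {
	var results []taskResult
	if err := json.Unmarshal(data, &results); err != nil {
		return 0, 0, fmt.Errorf("decode results: %w", err)
	}
	passed := 0
	for _, r := range results {
		if r.TaskPassed {
			passed++
		}
	}
	return passed, len(results), nil
}

func main() {
	// Made-up sample; real data would be read from
	// mcpchecker/mcpchecker-stackrox-mcp-e2e-out.json.
	sample := []byte(`[
		{"taskName": "list-clusters", "taskPassed": true},
		{"taskName": "cve-nonexistent", "taskPassed": false}
	]`)
	passed, total, err := summarize(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d/%d tasks passed\n", passed, total)
}
```

To run it against a real results file, replace the sample with the bytes returned by `os.ReadFile`.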
+
+## Test Cases
+
+| Test | Description | Tool(s) |
+|------|-------------|---------|
+| `list-clusters` | List all clusters | `list_clusters` |
+| `cve-detected-workloads` | CVE detected in deployments | `get_deployments_for_cve` |
+| `cve-detected-clusters` | CVE detected in clusters | `get_clusters_with_orchestrator_cve` |
+| `cve-nonexistent` | Handle non-existent CVE | `get_clusters_with_orchestrator_cve` |
+| `cve-cluster-does-exist` | CVE with cluster filter (cluster exists) | `list_clusters`, `get_clusters_with_orchestrator_cve` |
+| `cve-cluster-does-not-exist` | CVE with cluster filter (cluster does not exist) | `list_clusters` |
+| `cve-clusters-general` | General CVE query | `get_clusters_with_orchestrator_cve` |
+| `cve-cluster-list` | CVE across clusters | `get_clusters_with_orchestrator_cve` |
+
+## Configuration
+
+- **`mcpchecker/eval.yaml`**: Main test configuration, agent settings, assertions
+- **`mcpchecker/mcp-config.yaml`**: MCP server configuration
+- **`mcpchecker/tasks/*.yaml`**: Individual test task definitions
+
+## How It Works
+
+mcpchecker uses a proxy architecture to intercept MCP tool calls:
+
+1. AI agent receives task prompt
+2. Agent calls MCP tool
+3. mcpchecker proxy intercepts and records the call
+4. Call forwarded to StackRox MCP server
+5. Server executes and returns result
+6. mcpchecker validates assertions and response quality
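
Note on the "How It Works" steps: the intercept-record-forward pattern in steps 3 to 5 can be sketched as a wrapper around a handler. This is an illustrative sketch only. `ToolCall`, `Handler`, and `Recorder` are hypothetical names and do not reflect mcpchecker's actual implementation or API.

```go
package main

import "fmt"

// ToolCall and Handler are hypothetical types used for illustration.
type ToolCall struct {
	Name      string
	Arguments map[string]string
}

// Handler executes a tool call and returns its result.
type Handler func(ToolCall) (string, error)

// Recorder wraps an upstream handler and keeps a history of every
// call it forwards, so assertions can be checked afterwards.
type Recorder struct {
	upstream Handler
	Calls    []ToolCall
}

// Handle records the call, then forwards it unchanged to the upstream.
func (r *Recorder) Handle(c ToolCall) (string, error) {
	r.Calls = append(r.Calls, c) // step 3: intercept and record
	return r.upstream(c)         // steps 4-5: forward, return the result
}

func main() {
	// Stand-in for the StackRox MCP server.
	server := func(c ToolCall) (string, error) {
		return "result of " + c.Name, nil
	}
	proxy := &Recorder{upstream: server}

	out, err := proxy.Handle(ToolCall{Name: "list_clusters"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out, len(proxy.Calls))
}
```

Because the wrapper forwards the call unchanged, the agent and the server need no knowledge of the recorder. The recorded history is what step 6 would validate.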
+
+## Troubleshooting
+
+**Tests fail - no tools called**
+- Verify StackRox Central is accessible
+- Check API token permissions
+
+**Build errors**
+```bash
+go mod tidy
+./scripts/build-mcpchecker.sh
+```
+
+## Further Reading
+
+- [mcpchecker Documentation](https://github.com/mcpchecker/mcpchecker)
+- [StackRox MCP Server](../README.md)

e2e-tests/mcpchecker/eval.yaml

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
+kind: Eval
+metadata:
+  name: "stackrox-mcp-e2e"
+config:
+  agent:
+    type: "builtin.claude-code"
+    model: "claude-sonnet-4-5"
+  llmJudge:
+    env:
+      baseUrlKey: JUDGE_BASE_URL
+      apiKeyKey: JUDGE_API_KEY
+      modelNameKey: JUDGE_MODEL_NAME
+  mcpConfigFile: mcp-config.yaml
+  taskSets:
+    # Assertion Fields Explained:
+    # - toolsUsed: List of tools that MUST be called at least once
+    # - minToolCalls: Minimum TOTAL number of tool calls across ALL tools (not per-tool)
+    # - maxToolCalls: Maximum TOTAL number of tool calls across ALL tools (prevents runaway tool usage)
+    #   Example: If maxToolCalls=3, the agent can make up to 3 tool calls total in the test,
+    #   regardless of which tools are called.
+
+    # Test 1: List clusters
+    - path: tasks/list-clusters.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "list_clusters"
+        minToolCalls: 1
+        maxToolCalls: 1
+
+    # Test 2: CVE detected in workloads
+    # Claude does comprehensive CVE checking (orchestrator, deployments, nodes)
+    - path: tasks/cve-detected-workloads.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "get_deployments_for_cve"
+            argumentsMatch:
+              cveName: "CVE-2021-31805"
+        minToolCalls: 1
+        maxToolCalls: 3
+
+    # Test 3: CVE detected in clusters - basic
+    - path: tasks/cve-detected-clusters.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "get_clusters_with_orchestrator_cve"
+            argumentsMatch:
+              cveName: "CVE-2016-1000031"
+        minToolCalls: 1
+        maxToolCalls: 3
+
+    # Test 4: Non-existent CVE
+    # Allows up to 3 calls because "Is CVE detected in my clusters?" can trigger a comprehensive check
+    # (orchestrator, deployments, nodes). The LLM cannot know beforehand whether the CVE exists.
+    - path: tasks/cve-nonexistent.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "get_clusters_with_orchestrator_cve"
+            argumentsMatch:
+              cveName: "CVE-2099-00001"
+        minToolCalls: 1
+        maxToolCalls: 3
+
+    # Test 5: CVE with specific cluster filter (does exist)
+    # Claude does comprehensive checking even for single cluster (orchestrator, deployments, nodes)
+    - path: tasks/cve-cluster-does-exist.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "list_clusters"
+          - server: stackrox-mcp
+            toolPattern: "get_clusters_with_orchestrator_cve"
+            argumentsMatch:
+              cveName: "CVE-2016-1000031"
+        minToolCalls: 2
+        maxToolCalls: 4
+
+    # Test 6: CVE with specific cluster filter (does not exist)
+    - path: tasks/cve-cluster-does-not-exist.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "list_clusters"
+        minToolCalls: 1
+        maxToolCalls: 2
+
+    # Test 7: CVE detected in clusters - general
+    - path: tasks/cve-clusters-general.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "get_clusters_with_orchestrator_cve"
+            argumentsMatch:
+              cveName: "CVE-2021-31805"
+        minToolCalls: 1
+        maxToolCalls: 5
+
+    # Test 8: CVE check with cluster list reference
+    - path: tasks/cve-cluster-list.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "get_clusters_with_orchestrator_cve"
+            argumentsMatch:
+              cveName: "CVE-2024-52577"
+        minToolCalls: 1
+        maxToolCalls: 5
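
Note on the assertion fields in `eval.yaml`: the semantics described in the file's comments (each `toolsUsed` entry must be called at least once, and `minToolCalls`/`maxToolCalls` bound the total across all tools) can be sketched as below. This is a simplified model, not mcpchecker's code. The `Assertions` type and `check` function are hypothetical, `argumentsMatch` and `server` are left out, and treating `toolPattern` as an anchored regular expression is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assertions is a hypothetical, simplified model of the eval.yaml fields.
type Assertions struct {
	ToolsUsed    []string // patterns that must each match at least one call
	MinToolCalls int      // lower bound on TOTAL calls across all tools
	MaxToolCalls int      // upper bound on TOTAL calls across all tools
}

// check returns nil when the recorded tool names satisfy the assertions.
func check(a Assertions, calls []string) error {
	if n := len(calls); n < a.MinToolCalls || n > a.MaxToolCalls {
		return fmt.Errorf("got %d tool calls, want between %d and %d",
			n, a.MinToolCalls, a.MaxToolCalls)
	}
	for _, pattern := range a.ToolsUsed {
		// Assumption: patterns are regular expressions matched against the full name.
		re, err := regexp.Compile("^(?:" + pattern + ")$")
		if err != nil {
			return fmt.Errorf("invalid pattern %q: %w", pattern, err)
		}
		matched := false
		for _, name := range calls {
			if re.MatchString(name) {
				matched = true
				break
			}
		}
		if !matched {
			return fmt.Errorf("required tool %q was never called", pattern)
		}
	}
	return nil
}

func main() {
	// Mirrors the shape of Test 5 above: two required tools, 2 to 4 calls in total.
	test5 := Assertions{
		ToolsUsed:    []string{"list_clusters", "get_clusters_with_orchestrator_cve"},
		MinToolCalls: 2,
		MaxToolCalls: 4,
	}
	calls := []string{"list_clusters", "get_clusters_with_orchestrator_cve"}
	fmt.Println(check(test5, calls) == nil)
}
```

The bounds apply to the total call count, so a run that calls `list_clusters` five times fails Test 5 even though both required tools could still be present.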

e2e-tests/mcpchecker/mcp-config.yaml

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+mcpServers:
+  stackrox-mcp:
+    command: go
+    args:
+      - run
+      - ../../cmd/stackrox-mcp/...
+      - --config
+      - ../stackrox-mcp-e2e-config.yaml
+    # API token loaded from parent shell environment (.env file)
+    # No env section = full environment inheritance
+    # Auto-approve all tools
+    enableAllTools: true

e2e-tests/mcpchecker/tasks/cve-cluster-does-exist.yaml

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+kind: Task
+metadata:
+  name: "cve-cluster-does-exist"
+  difficulty: easy
+steps:
+  prompt:
+    inline: "Is this CVE-2016-1000031 detected in cluster name staging-central-cluster?"
+  verify:
+    contains: "A response indicating whether CVE-2016-1000031 is detected in staging-central-cluster"

e2e-tests/mcpchecker/tasks/cve-cluster-does-not-exist.yaml

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+kind: Task
+metadata:
+  name: "cve-cluster-does-not-exist"
+  difficulty: easy
+steps:
+  prompt:
+    inline: "Is this CVE-2024-52577 detected in cluster name prod-hal-9000?"
+  verify:
+    contains: "A response indicating that CVE-2024-52577 is not detected in prod-hal-9000 cluster, or that the cluster does not exist"

e2e-tests/mcpchecker/tasks/cve-cluster-list.yaml

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+kind: Task
+metadata:
+  name: "cve-cluster-list"
+  difficulty: easy
+steps:
+  prompt:
+    inline: "Is this CVE-2024-52577 detected in any of my clusters?"
+  verify:
+    contains: "A response indicating whether CVE-2024-52577 is detected in clusters"
