6 changes: 6 additions & 0 deletions docs/ci-integration.md
@@ -79,6 +79,12 @@ url: https://docs.example.com
# options:
#   maxLinksToTest: 50
#   samplingStrategy: deterministic

# Optional: test specific pages (implies samplingStrategy: curated)
# pages:
#   - https://docs.example.com/quickstart
#   - url: https://docs.example.com/api/auth
#     tag: api-reference
```

### Config resolution
30 changes: 30 additions & 0 deletions docs/quick-start.md
@@ -95,6 +95,36 @@ You can combine this with `--checks` to run a single check against a single page
npx afdocs check https://docs.example.com/api/auth --sampling none --checks rendering-strategy
```

## Check a specific set of pages

If you want to check a handful of pages without running full discovery, pass them directly with `--urls`:

```bash
npx afdocs check https://docs.example.com --urls https://docs.example.com/quickstart,https://docs.example.com/api/auth
```

This skips page discovery and runs all checks against exactly those URLs. You can tag pages for grouped scoring by defining them in a config file:

```yaml
# agent-docs.config.yml
url: https://docs.example.com
pages:
  - url: https://docs.example.com/quickstart
    tag: getting-started
  - url: https://docs.example.com/tutorials/first-app
    tag: getting-started
  - url: https://docs.example.com/api/auth
    tag: api-reference
  - url: https://docs.example.com/api/webhooks
    tag: api-reference
```

```bash
npx afdocs check --format scorecard
```

The scorecard will include a Tag Scores section showing how each group of pages scores, with a per-check breakdown of what's passing and failing within each tag. The JSON output (`--format json --score`) includes full per-page detail for each tag. See [Config File Reference](/reference/config-file) for the full `pages` schema.

## Get consistent results between runs

By default, AFDocs randomly samples pages, so results can vary between runs. For reproducible results (useful when verifying a fix), use deterministic sampling:
13 changes: 9 additions & 4 deletions docs/reference/cli.md
@@ -77,15 +77,17 @@ Some checks depend on others. If you include a check without its dependency, the

### Sampling

| Flag | Default | Description |
| ----------------------- | -------- | ----------------------------------------------------------- |
| `--sampling <strategy>` | `random` | URL sampling strategy: `random`, `deterministic`, or `none` |
| `--max-links <n>` | `50` | Maximum number of pages to sample |
| Flag | Default | Description |
| ----------------------- | -------- | ---------------------------------------------------------------------------- |
| `--sampling <strategy>` | `random` | URL sampling strategy: `random`, `deterministic`, `curated`, or `none` |
| `--max-links <n>` | `50` | Maximum number of pages to sample |
| `--urls <urls>` | | Comma-separated page URLs for curated scoring (implies `--sampling curated`) |

**Sampling strategies:**

- **`random`**: Shuffle discovered URLs and take the first N. Fast and broad, but results vary between runs. Useful for spot-checking pages across a large corpus.
- **`deterministic`**: Sort discovered URLs alphabetically and pick an even spread. Produces the same sample on repeated runs as long as the URL set is stable. Useful for CI or when verifying a fix.
- **`curated`**: Test a specific set of pages listed in the config file's `pages` field or passed via `--urls`. Skips discovery entirely. Useful for ongoing monitoring of representative pages or focused evaluation of specific sections.
- **`none`**: Skip discovery entirely. Only check the URL you pass on the command line.
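
The four strategies above can be sketched as a single selection function. This is an illustration only, not the AFDocs implementation: `samplePages`, `SampleInput`, and the exact "even spread" formula are assumptions made for the example, and the real runner may pick pages differently.

```typescript
// Hypothetical sketch of the four sampling strategies described above.
type SamplingStrategy = 'random' | 'deterministic' | 'curated' | 'none';

interface SampleInput {
  strategy: SamplingStrategy;
  baseUrl: string; // the URL passed on the command line
  discovered: string[]; // URLs found by discovery (unused for curated/none)
  curated?: string[]; // URLs from `pages` or --urls
  maxLinks: number; // corresponds to --max-links
  random?: () => number; // injectable RNG, defaults to Math.random
}

function samplePages(input: SampleInput): string[] {
  const { strategy, baseUrl, discovered, curated = [], maxLinks } = input;
  const random = input.random ?? Math.random;

  switch (strategy) {
    case 'none':
      // Only the URL given on the command line is checked.
      return [baseUrl];
    case 'curated':
      // All listed pages are tested; maxLinks is deliberately ignored.
      return [...curated];
    case 'deterministic': {
      // Sort, then take evenly spaced entries (assumed spread formula).
      const sorted = [...discovered].sort();
      if (sorted.length <= maxLinks) return sorted;
      const step = sorted.length / maxLinks;
      return Array.from({ length: maxLinks }, (_, i) => sorted[Math.floor(i * step)]);
    }
    case 'random': {
      // Fisher-Yates shuffle, then take the first N.
      const shuffled = [...discovered];
      for (let i = shuffled.length - 1; i > 0; i--) {
        const j = Math.floor(random() * (i + 1));
        [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
      }
      return shuffled.slice(0, maxLinks);
    }
  }
}
```

Passing the random source as a parameter keeps the sketch testable; with a fixed URL set, the `deterministic` branch returns the same pages on every call.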

```bash
@@ -95,6 +97,9 @@ afdocs check https://docs.example.com --sampling deterministic
# Check a single page
afdocs check https://docs.example.com/api/auth --sampling none

# Test specific pages without a config file
afdocs check https://docs.example.com --urls https://docs.example.com/quickstart,https://docs.example.com/api/auth

# Sample fewer pages for a quicker run
afdocs check https://docs.example.com --max-links 10

35 changes: 34 additions & 1 deletion docs/reference/config-file.md
@@ -28,6 +28,12 @@ options:
  thresholds:
    pass: 50000
    fail: 100000

# Optional: test specific pages instead of discovering via llms.txt/sitemap
# pages:
#   - https://docs.example.com/quickstart
#   - url: https://docs.example.com/api/auth
#     tag: api-reference
```

## Fields
@@ -49,13 +55,40 @@ Override default runner options. All fields are optional:
| Field | Default | Description |
| ------------------ | -------- | ---------------------------------------------------- |
| `maxLinksToTest` | `50` | Maximum number of pages to sample |
| `samplingStrategy` | `random` | `random`, `deterministic`, or `none` |
| `samplingStrategy` | `random` | `random`, `deterministic`, `curated`, or `none` |
| `maxConcurrency` | `3` | Maximum concurrent HTTP requests |
| `requestDelay` | `200` | Delay between requests in milliseconds |
| `requestTimeout` | `30000` | Timeout for individual HTTP requests in milliseconds |
| `thresholds.pass` | `50000` | Page size pass threshold in characters |
| `thresholds.fail` | `100000` | Page size fail threshold in characters |

### `pages` (optional)

A list of specific page URLs to test. When `pages` is present and no `samplingStrategy` is explicitly set, the strategy defaults to `curated`, which skips discovery and tests exactly the listed pages.
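
The defaulting rule can be sketched as a small precedence function. It follows the order used by the `check` command in this change (`--urls` forces `curated`, then `--sampling`, then the config's `samplingStrategy`, then a default that depends on whether `pages` is present). The names `StrategyInputs` and `resolveSamplingStrategy` are illustrative assumptions, not part of the AFDocs API.

```typescript
// Hypothetical sketch of how the effective sampling strategy is chosen.
type SamplingStrategy = 'random' | 'deterministic' | 'curated' | 'none';

interface StrategyInputs {
  urlsFlag?: string[]; // parsed value of --urls, if given
  samplingFlag?: SamplingStrategy; // value of --sampling, if given
  configStrategy?: SamplingStrategy; // options.samplingStrategy from the config file
  configPages?: unknown[]; // the config file's `pages` list, if any
}

function resolveSamplingStrategy(inputs: StrategyInputs): SamplingStrategy {
  // --urls always forces curated sampling.
  if (inputs.urlsFlag && inputs.urlsFlag.length > 0) return 'curated';

  // An explicit strategy wins: CLI flag first, then the config file.
  const explicit = inputs.samplingFlag ?? inputs.configStrategy;
  if (explicit) return explicit;

  // No explicit strategy: default to curated when pages are listed, else random.
  return inputs.configPages && inputs.configPages.length > 0 ? 'curated' : 'random';
}
```

For example, a config with `pages` and `samplingStrategy: deterministic` resolves to `deterministic`, because an explicit strategy overrides the `curated` default.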

Each entry can be a plain URL string or an object with `url` and an optional `tag` for grouped scoring:

```yaml
url: https://docs.example.com

pages:
  # Plain URL strings
  - https://docs.example.com/quickstart
  - https://docs.example.com/install

  # Objects with tags for grouped scoring
  - url: https://docs.example.com/api/auth
    tag: api-reference
  - url: https://docs.example.com/api/users
    tag: api-reference
```

When pages have tags, the scorecard and JSON output include per-tag aggregate scores, making it easy to compare agent-friendliness across sections of your documentation.

Tags are optional, and tagged entries can be mixed with plain URL strings in the same list. Pages without tags are included in the overall score but don't appear in any tag group.

Note that `maxLinksToTest` does not apply to curated pages; all listed pages are tested.
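
As a rough illustration of the two entry shapes and of tag grouping, the sketch below normalizes a mixed `pages` list and collects URLs per tag. The type `PageEntry` and the helpers `normalizePages` and `groupByTag` are assumptions made for this example; the library's own types and validation may differ.

```typescript
// Hypothetical sketch: normalize mixed `pages` entries and group them by tag.
type PageEntry = string | { url: string; tag?: string };

interface NormalizedPage {
  url: string;
  tag?: string;
}

function normalizePages(entries: PageEntry[]): NormalizedPage[] {
  return entries.map((entry) => {
    if (typeof entry === 'string') return { url: entry };
    return entry.tag === undefined ? { url: entry.url } : { url: entry.url, tag: entry.tag };
  });
}

function groupByTag(pages: NormalizedPage[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const page of pages) {
    // Untagged pages count toward the overall score only, so they join no group.
    if (page.tag === undefined) continue;
    const urls = groups.get(page.tag) ?? [];
    urls.push(page.url);
    groups.set(page.tag, urls);
  }
  return groups;
}

const pages = normalizePages([
  'https://docs.example.com/quickstart',
  { url: 'https://docs.example.com/api/auth', tag: 'api-reference' },
  { url: 'https://docs.example.com/api/users', tag: 'api-reference' },
]);
const groups = groupByTag(pages);
```

Here `pages` holds all three entries, while `groups` contains a single `api-reference` group with two URLs; the untagged quickstart page is tested but belongs to no group.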

## Config resolution

The config loader searches for `agent-docs.config.yml` (or `.yaml`) starting from the current working directory and walking up the directory tree, similar to how ESLint and Prettier find their config files. This means the config works whether you're running the CLI from your project root or running a test file from a subdirectory.
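
The upward search can be illustrated with a pure function that lists the candidate paths in the order they would be tried. This is a sketch under stated assumptions: it handles absolute POSIX-style paths only, tries `.yml` before `.yaml` in each directory (the actual order is not stated here), and takes an `exists` callback instead of touching the filesystem. `candidateConfigPaths` and `findConfigPath` are hypothetical names, not the loader's API.

```typescript
// Hypothetical sketch of walking up the directory tree to find a config file.
const CONFIG_NAMES = ['agent-docs.config.yml', 'agent-docs.config.yaml'];

function candidateConfigPaths(startDir: string): string[] {
  const segments = startDir.split('/').filter((s) => s.length > 0);
  const candidates: string[] = [];
  // Start in startDir, then drop one trailing segment at a time until the root.
  for (let depth = segments.length; depth >= 0; depth--) {
    const dir = '/' + segments.slice(0, depth).join('/');
    for (const name of CONFIG_NAMES) {
      candidates.push(dir === '/' ? `/${name}` : `${dir}/${name}`);
    }
  }
  return candidates;
}

function findConfigPath(
  startDir: string,
  exists: (path: string) => boolean,
): string | undefined {
  // The first existing candidate wins, so the nearest config takes precedence.
  return candidateConfigPaths(startDir).find(exists);
}
```

In real use, `exists` would be backed by a filesystem check; injecting it keeps the search order easy to inspect and test.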
11 changes: 11 additions & 0 deletions docs/reference/programmatic-api.md
@@ -42,6 +42,15 @@ const report = await runChecks('https://docs.example.com', {
    fail: 100000,
  },
});

// Or test specific pages with curated sampling:
const curatedReport = await runChecks('https://docs.example.com', {
  samplingStrategy: 'curated',
  curatedPages: [
    'https://docs.example.com/quickstart',
    { url: 'https://docs.example.com/api/auth', tag: 'api-reference' },
  ],
});
```

All options are optional. The defaults match the CLI defaults.
@@ -90,6 +99,8 @@ import type {
  RunnerOptions,
  CheckOptions,
  AgentDocsConfig,
  CuratedPageEntry,
  PageConfigEntry,
} from 'afdocs';
```

39 changes: 39 additions & 0 deletions docs/reference/scoring-api.md
@@ -42,6 +42,43 @@ This is the same function; the subpath is provided for consumers who want a narr
| `diagnostics` | `Diagnostic[]` | Interaction diagnostics that fired |
| `caps` | `ScoreCap[]` | Score caps that were applied |
| `resolutions` | `Record<string, string>` | Fix suggestions keyed by check ID |
| `tagScores` | `Record<string, TagScore>` | Per-tag aggregate scores (present when curated pages have tags) |

## TagScore

When curated pages have tags, each `TagScore` contains the aggregate score plus a per-check breakdown showing exactly which checks contributed and how each page fared:

| Field | Type | Description |
| ----------- | --------------------- | -------------------------------------------------------------- |
| `score` | `number` | Aggregate score for this tag (0-100) |
| `grade` | `Grade` | Letter grade |
| `pageCount` | `number` | Number of pages tagged with this tag |
| `checks` | `TagCheckBreakdown[]` | Per-check breakdown with weight, proportion, and page statuses |

Each `TagCheckBreakdown` contains:

| Field | Type | Description |
| ------------ | ---------------------------------------- | -------------------------------------------------- |
| `checkId` | `string` | The check ID |
| `category` | `string` | The check's category |
| `weight` | `number` | The check's effective weight in the scoring system |
| `proportion` | `number` | 0-1 proportion earned for this tag's pages |
| `pages` | `Array<{ url: string; status: string }>` | Per-page status within this check |

```ts
const score = computeScore(report);
if (score.tagScores) {
  for (const [tag, tagScore] of Object.entries(score.tagScores)) {
    console.log(`${tag}: ${tagScore.score}/100 (${tagScore.grade})`);
    for (const check of tagScore.checks) {
      if (check.proportion < 1) {
        const failing = check.pages.filter((p) => p.status === 'fail');
        console.log(`  ${check.checkId}: ${failing.length} failing pages`);
      }
    }
  }
}
```
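
For consumers who want to act on tag scores, for example to surface the weakest section in a CI summary, the tables above can be restated as local interfaces and a small helper. This is a sketch: the interfaces mirror the documented fields, but `Grade` is simplified to `string` here, and `lowestScoringTag` is a hypothetical helper that is not exported by `afdocs`.

```typescript
// Local restatement of the documented shapes (Grade simplified to string).
interface TagCheckBreakdown {
  checkId: string;
  category: string;
  weight: number;
  proportion: number; // 0-1 proportion earned for this tag's pages
  pages: Array<{ url: string; status: string }>;
}

interface TagScore {
  score: number; // 0-100
  grade: string;
  pageCount: number;
  checks: TagCheckBreakdown[];
}

// Hypothetical helper: returns the tag with the lowest aggregate score,
// or undefined when there are no tags. Ties keep the first tag encountered.
function lowestScoringTag(tagScores: Record<string, TagScore>): string | undefined {
  let worst: string | undefined;
  for (const [tag, tagScore] of Object.entries(tagScores)) {
    if (worst === undefined || tagScore.score < tagScores[worst].score) {
      worst = tag;
    }
  }
  return worst;
}
```

The result can be used to index back into `tagScores` and inspect that tag's `checks` for entries with `proportion < 1`.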

## Grade conversion

@@ -62,6 +99,8 @@ import type {
  ScoreResult,
  CheckScore,
  CategoryScore,
  TagScore,
  TagCheckBreakdown,
  ScoreCap,
  Diagnostic,
  DiagnosticSeverity, // 'info' | 'warning' | 'critical'
10 changes: 10 additions & 0 deletions examples/agent-docs.config.yml
@@ -11,3 +11,13 @@ url: https://docs.example.com
# options:
#   maxLinksToTest: 50
#   samplingStrategy: deterministic

# Optional: test specific pages instead of discovering via llms.txt/sitemap.
# When pages is present and no samplingStrategy is set, defaults to 'curated'.
# pages:
#   - https://docs.example.com/quickstart
#   - https://docs.example.com/install
#   - url: https://docs.example.com/api/auth
#     tag: api-reference
#   - url: https://docs.example.com/api/users
#     tag: api-reference
71 changes: 63 additions & 8 deletions src/cli/commands/check.ts
@@ -3,13 +3,13 @@ import { normalizeUrl, runChecks } from '../../runner.js';
import { formatText } from '../formatters/text.js';
import { formatJson } from '../formatters/json.js';
import { formatScorecard } from '../formatters/scorecard.js';
import type { SamplingStrategy } from '../../types.js';
import { findConfig } from '../../helpers/config.js';
import type { PageConfigEntry, SamplingStrategy } from '../../types.js';
import { findConfig, validatePages } from '../../helpers/config.js';

// Ensure all checks are registered
import '../../checks/index.js';

const SAMPLING_STRATEGIES = ['random', 'deterministic', 'none'] as const;
const SAMPLING_STRATEGIES = ['random', 'deterministic', 'curated', 'none'] as const;
const FORMAT_OPTIONS = ['text', 'json', 'scorecard'] as const;

export function registerCheckCommand(program: Command): void {
@@ -22,7 +22,14 @@ export function registerCheckCommand(program: Command): void {
.option('--max-concurrency <n>', 'Maximum concurrent requests')
.option('--request-delay <ms>', 'Delay between requests in ms')
.option('--max-links <n>', 'Maximum links to test')
.option('--sampling <strategy>', 'URL sampling strategy: random, deterministic, or none')
.option(
'--sampling <strategy>',
'URL sampling strategy: random, deterministic, curated, or none',
)
.option(
'--urls <urls>',
'Comma-separated page URLs for curated scoring (implies --sampling curated)',
)
.option('--pass-threshold <n>', 'Pass threshold in characters')
.option('--fail-threshold <n>', 'Fail threshold in characters')
.option('-v, --verbose', 'Show per-page details for checks with issues')
@@ -39,8 +46,49 @@ export function registerCheckCommand(program: Command): void {
return;
}

// Resolve URL: CLI arg > config url > error
const resolvedUrl = rawUrl ?? config?.url;
      // Determine curated pages and sampling strategy (before URL resolution,
      // since curated pages can provide a fallback base URL)
      let curatedPages: PageConfigEntry[] | undefined;
      let samplingRaw: string;

      if (opts.urls) {
        // --urls flag: parse comma-separated URLs, force curated strategy
        const rawUrls = (opts.urls as string)
          .split(',')
          .map((s) => s.trim())
          .filter(Boolean);
        if (rawUrls.length === 0) {
          process.stderr.write('Error: --urls requires at least one URL.\n');
          process.exitCode = 1;
          return;
        }
        try {
          validatePages(rawUrls, '--urls');
        } catch (err) {
          process.stderr.write(`Error: ${(err as Error).message}\n`);
          process.exitCode = 1;
          return;
        }
        curatedPages = rawUrls;
        samplingRaw = 'curated';
      } else if (config?.pages && config.pages.length > 0) {
        // Config has pages: use them, default to curated unless explicitly overridden
        curatedPages = config.pages;
        samplingRaw =
          (opts.sampling as string | undefined) ?? config?.options?.samplingStrategy ?? 'curated';
      } else {
        // No curated pages: standard behavior
        samplingRaw =
          (opts.sampling as string | undefined) ?? config?.options?.samplingStrategy ?? 'random';
      }

      // Resolve URL: CLI arg > config url > first curated page origin > error
      let resolvedUrl = rawUrl ?? config?.url;
      if (!resolvedUrl && curatedPages && curatedPages.length > 0) {
        const firstEntry = curatedPages[0];
        const firstUrl = typeof firstEntry === 'string' ? firstEntry : firstEntry.url;
        resolvedUrl = new URL(firstUrl).origin;
      }
if (!resolvedUrl) {
process.stderr.write(
'Error: No URL provided. Pass a URL as an argument or set "url" in agent-docs.config.yml\n',
Expand All @@ -64,8 +112,6 @@ export function registerCheckCommand(program: Command): void {
return;
}

const samplingRaw =
(opts.sampling as string | undefined) ?? config?.options?.samplingStrategy ?? 'random';
const sampling = samplingRaw as SamplingStrategy;
if (!SAMPLING_STRATEGIES.includes(sampling)) {
process.stderr.write(
@@ -75,6 +121,14 @@ export function registerCheckCommand(program: Command): void {
return;
}

      if (sampling === 'curated' && (!curatedPages || curatedPages.length === 0)) {
        process.stderr.write(
          'Error: Curated sampling requires pages. Use --urls or define "pages" in your config file.\n',
        );
        process.exitCode = 1;
        return;
      }

const maxConcurrency = parseInt(
String((opts.maxConcurrency as string | undefined) ?? config?.options?.maxConcurrency ?? 3),
10,
@@ -115,6 +169,7 @@ export function registerCheckCommand(program: Command): void {
requestDelay,
maxLinksToTest,
samplingStrategy: sampling,
curatedPages,
thresholds: {
pass: passThreshold,
fail: failThreshold,
30 changes: 30 additions & 0 deletions src/cli/formatters/scorecard.ts
@@ -104,6 +104,36 @@ export function formatScorecard(report: ReportResult, scoreResult?: ScoreResult)
}
lines.push('');

  // Tag scores (when curated pages have tags)
  if (score.tagScores) {
    lines.push(` ${chalk.bold('Tag Scores:')}`);
    const sortedTags = Object.entries(score.tagScores).sort(([a], [b]) => a.localeCompare(b));
    for (const [tag, tagScore] of sortedTags) {
      lines.push(
        formatCategoryLine(tag, tagScore.score, tagScore.grade) +
          chalk.dim(` · ${tagScore.pageCount} page${tagScore.pageCount !== 1 ? 's' : ''}`),
      );

      // Show checks that aren't fully passing
      const issues = tagScore.checks.filter((c) => c.proportion < 1);
      for (const check of issues) {
        const counts = { pass: 0, warn: 0, fail: 0 };
        for (const p of check.pages) {
          if (p.status in counts) counts[p.status as keyof typeof counts]++;
        }
        const parts: string[] = [];
        if (counts.fail > 0) parts.push(chalk.red(`${counts.fail} fail`));
        if (counts.warn > 0) parts.push(chalk.yellow(`${counts.warn} warn`));
        if (counts.pass > 0) parts.push(chalk.green(`${counts.pass} pass`));
        const worstStatus = counts.fail > 0 ? 'fail' : 'warn';
        const label = STATUS_LABELS[worstStatus];
        const color = STATUS_COLORS[worstStatus];
        lines.push(` ${color(label)} ${check.checkId.padEnd(30)} ${parts.join(', ')}`);
      }
    }
    lines.push('');
  }

// Interaction diagnostics
if (score.diagnostics.length > 0) {
lines.push(` ${chalk.bold('Interaction Diagnostics:')}`);