Cipherscope is a static analysis scanner designed to build a cryptographic inventory. It parses source files using Tree-sitter, matches library anchors and algorithm symbols, and emits JSONL findings that can be aggregated into an inventory.

    flowchart TD
        A[Discovery] --> B[Parsing]
        B --> C[Library Anchoring]
        C --> D[Algorithm Detection]
        D --> E[JSONL Output]
        A --> A1[File walk + filters]
        B --> B1[Tree-sitter AST]
        C --> C1[Import/include anchors]
        D --> D1[Symbol match + params]
        D1 --> D2[Local constant resolution]
        E --> E1[Library + algorithm assets]

- Library hit: name, file path, evidence location.
- Algorithm hit: name, file path, evidence location, metadata (e.g., key size, primitive).
- Output format is designed for tooling pipelines and inventory aggregation.
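
As a rough illustration of the aggregation step, the sketch below folds JSONL findings into a per-identifier summary. It assumes each line is one JSON object carrying the `assetType`, `identifier`, and `path` fields from the record shape documented in this README. The `aggregate` helper and the sample findings are hypothetical and are not part of Cipherscope.

```python
import json
from collections import defaultdict


def aggregate(jsonl_lines):
    """Group findings by (assetType, identifier); count hits and collect file paths."""
    buckets = defaultdict(lambda: {"count": 0, "paths": set()})
    for raw in jsonl_lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines between records
        finding = json.loads(raw)
        bucket = buckets[(finding["assetType"], finding["identifier"])]
        bucket["count"] += 1
        bucket["paths"].add(finding["path"])
    # Convert path sets to sorted lists so the result is JSON-serializable.
    return {
        key: {"count": b["count"], "paths": sorted(b["paths"])}
        for key, b in buckets.items()
    }


# Hypothetical findings, invented for illustration only.
SAMPLE = [
    '{"assetType": "library", "identifier": "openssl", "path": "src/tls.c", '
    '"evidence": {"line": 3, "column": 1}}',
    '{"assetType": "algorithm", "identifier": "AES-GCM", "path": "src/tls.c", '
    '"evidence": {"line": 40, "column": 5}, "metadata": {"primitive": "aead", "keySize": 256}}',
    '{"assetType": "algorithm", "identifier": "AES-GCM", "path": "src/store.c", '
    '"evidence": {"line": 12, "column": 9}, "metadata": {"primitive": "aead", "keySize": 128}}',
]

inventory = aggregate(SAMPLE)
for (asset_type, identifier), summary in sorted(inventory.items()):
    print(asset_type, identifier, summary["count"], summary["paths"])
```

In practice the input would be the scanner's output read line by line; keying on `(assetType, identifier)` keeps a library and an algorithm with the same name from colliding.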

    {
      "assetType": "library|algorithm",
      "identifier": "string",
      "path": "string",
      "evidence": {
        "line": 1,
        "column": 1
      },
      "metadata": {
        "primitive": "string",
        "keySize": 256
      }
    }

To reduce overcounting on a single callsite, Cipherscope applies a simple same-line dedupe rule after matching:
- If two algorithms share the same `primitive` and line, drop the generic identifier when a more specific variant is present.
- A more specific identifier is one that either:
  - starts with the generic identifier plus a `-` (e.g., `AES-GCM` over `AES`), or
  - shares the same non-numeric tokens but adds numeric detail (e.g., `ECDSA-P256` over `ECDSA`).
- Different primitives on the same line are kept.
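
The dedupe rule can be sketched as follows. This is one possible reading, not Cipherscope's actual implementation: it assumes identifiers are tokenized on `-` and `_`, that any token containing a digit counts as numeric detail, and that findings are grouped per file as well as per line and primitive. The helper names, the `alg` constructor, and the primitive values are all hypothetical.

```python
import re
from collections import defaultdict


def _split_tokens(identifier):
    """Split an identifier into (non-numeric tokens, tokens containing digits)."""
    tokens = [t for t in re.split(r"[-_]+", identifier) if t]
    numeric = [t for t in tokens if any(c.isdigit() for c in t)]
    plain = [t for t in tokens if not any(c.isdigit() for c in t)]
    return plain, numeric


def is_more_specific(candidate, generic):
    """True if `candidate` refines `generic` under either rule."""
    if candidate == generic:
        return False
    if candidate.startswith(generic + "-"):  # rule 1: generic + "-" prefix
        return True
    cand_plain, cand_num = _split_tokens(candidate)
    gen_plain, gen_num = _split_tokens(generic)
    # Rule 2: same non-numeric tokens, strictly more numeric detail.
    return (
        cand_plain == gen_plain
        and set(gen_num) <= set(cand_num)
        and len(cand_num) > len(gen_num)
    )


def _group_key(finding):
    primitive = finding.get("metadata", {}).get("primitive")
    return (finding["path"], finding["evidence"]["line"], primitive)


def dedupe_same_line(findings):
    """Drop generic algorithm hits shadowed by a more specific hit in the same group."""
    groups = defaultdict(list)
    for f in findings:
        if f["assetType"] == "algorithm":
            groups[_group_key(f)].append(f["identifier"])
    kept = []
    for f in findings:
        if f["assetType"] == "algorithm" and any(
            is_more_specific(other, f["identifier"]) for other in groups[_group_key(f)]
        ):
            continue
        kept.append(f)
    return kept


def alg(identifier, line, primitive):
    """Hypothetical helper that builds an algorithm finding for the demo."""
    return {
        "assetType": "algorithm",
        "identifier": identifier,
        "path": "src/crypto.py",
        "evidence": {"line": line, "column": 1},
        "metadata": {"primitive": primitive},
    }


FINDINGS = [
    alg("AES", 10, "symmetric"),
    alg("AES-GCM", 10, "symmetric"),
    alg("SHA-256", 10, "hash"),        # different primitive, same line: kept
    alg("ECDSA", 22, "signature"),
    alg("ECDSA-P256", 22, "signature"),
    alg("AES", 30, "symmetric"),       # no specific variant on this line: kept
]

deduped_ids = [f["identifier"] for f in dedupe_same_line(FINDINGS)]
print(deduped_ids)  # ['AES-GCM', 'SHA-256', 'ECDSA-P256', 'AES']
```

Requiring strictly more numeric tokens in the second rule keeps siblings such as `AES-128` and `AES-256` from eliminating each other.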
Patterns live in `patterns.toml`:
- Libraries define anchors and API regexes.
- Algorithms define symbol patterns and parameter extraction rules.

Adding a new library or algorithm usually only requires editing `patterns.toml`.
- Inventory-first: it focuses on discovering crypto usage and relevant metadata.
- Local constant resolution only; cross-file or full data-flow analysis is out of scope for now.
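
To make the scope of local constant resolution concrete, here is a toy sketch. It uses Python's standard `ast` module as a stand-in for Tree-sitter, and the `resolve_local_constant` function is hypothetical rather than taken from Cipherscope. A name bound to a literal in the same file resolves; a name imported from another file does not.

```python
import ast


def resolve_local_constant(source, name):
    """Return the literal assigned to `name` at module level in `source`, else None."""
    tree = ast.parse(source)  # parse only; the snippet is never executed
    for node in tree.body:
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Constant):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == name:
                    return node.value.value
    return None


# Hypothetical snippets: the constant is defined locally in the first,
# and imported from another module in the second.
SAME_FILE = "KEY_BITS = 256\nkey = generate_key(KEY_BITS)\n"
CROSS_FILE = "from settings import KEY_BITS\nkey = generate_key(KEY_BITS)\n"

print(resolve_local_constant(SAME_FILE, "KEY_BITS"))   # 256
print(resolve_local_constant(CROSS_FILE, "KEY_BITS"))  # None
```

In the cross-file case a scanner with this limitation would still report the algorithm hit, just without the key-size metadata.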