This repository contains two NLP classification models built with the Hugging Face Transformers library. The project focuses on two distinct domains:
- Adversarial Prompt Security (binary classification)
- Scientific Text Classification (multiclass classification)
Both projects share a common pipeline built around Transformer-based classification and data augmentation.
Table of Contents
- Overview
- Quick Start Guide
- Local
- Demo
- Tests
- Documentation
- Project Structure
- Conventions
- Project Description
- Project Extension and Future Work
This section is intentionally ordered so a first-time contributor can move from machine setup to secure configuration and then to deterministic testing.
To start locally, first ensure you have `just` and `uv` installed. If you
don't, run the following OS-specific commands:

macOS:

```shell
brew install just uv
```

Linux (Debian/Ubuntu):

```shell
sudo apt-get update
sudo apt-get install -y just
curl -LsSf https://astral.sh/uv/install.sh | sh
# then restart your shell so uv is on PATH
```

Windows:

```powershell
# uv (official installer)
irm https://astral.sh/uv/install.ps1 | iex

# just: pick one package manager you support in your project.
# winget (preferred if available)
winget install casey.just -e  # if this ID doesn't resolve, use one of the lines below
# scoop
scoop install just
# chocolatey
choco install just
```
Then, install the dependencies and activate the virtual environment by running:

```shell
just install
source .venv/bin/activate
```

The project expects two runtime keys:

- `OPENAI_API_KEY`
- `WANDB_API_KEY`

Use the committed template and keep real keys in a local, ignored file:

```shell
cp .env.example .env
```

Then edit `.env` and replace placeholder values with your own keys.
Notes:

- `.env` is ignored by Git and must never be committed.
- Shell environment variables override `.env` values.
- Missing required keys raise a configuration error early.
- Runtime configuration is modelled with `pydantic-settings` via `RuntimeSettings`. `load_settings()` uses process env and falls back to the default `.env`. `load_settings(env_file=None)` disables env-file loading entirely.
Example usage in code:

```python
from TransformerClassifiers import load_api_keys, load_settings, load_prompt_injection_dataset

api_keys = load_api_keys()
# api_keys.openai_api_key
# api_keys.wandb_api_key

settings = load_settings()
# settings.dataset_cache_dir
# settings.request_timeout_seconds
# settings.results_database_url

dataset_bundle = load_prompt_injection_dataset(settings=settings)
# dataset_bundle.dataset["train"]
# dataset_bundle.metadata.cache_key
# dataset_bundle.to_pandas("train")
```

Additional settings (datasets, network, database, artefact paths) use the
`TRANSFORMER_CLASSIFIERS_` env var prefix, for example:

- `TRANSFORMER_CLASSIFIERS_DATASET_CACHE_DIR`
- `TRANSFORMER_CLASSIFIERS_REQUEST_TIMEOUT_SECONDS`
- `TRANSFORMER_CLASSIFIERS_RESULTS_DATABASE_URL`
- `TRANSFORMER_CLASSIFIERS_MODEL_ARTIFACTS_DIR`
- `TRANSFORMER_CLASSIFIERS_PROMPT_INJECTION_DATASET_ID`
- `TRANSFORMER_CLASSIFIERS_PROMPT_INJECTION_DATASET_REVISION`
- `TRANSFORMER_CLASSIFIERS_PROMPT_INJECTION_VALIDATION_FRACTION`
- `TRANSFORMER_CLASSIFIERS_PROMPT_INJECTION_SPLIT_SEED`
Why this matters:
- It keeps credentials and environment-specific settings out of code.
- It makes local, CI, and deployment behaviour consistent.
- It gives us one typed configuration contract as the project grows.
The first formal pipeline stage is the prompt-injection data loader. It:
- loads the source dataset from Hugging Face
- validates the raw dataset contract
- derives a deterministic validation split from the source training split
- caches the prepared `DatasetDict` locally for reuse
Example:

```python
from TransformerClassifiers import load_prompt_injection_dataset

bundle = load_prompt_injection_dataset()
train_dataset = bundle.dataset["train"]
train_frame = bundle.to_pandas("train")
metadata = bundle.metadata
```

Prepared datasets are reused from the local cache by default. Use
`force_refresh=True` when you explicitly want to rebuild the prepared cache.
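The deterministic-split step can be illustrated with a stdlib-only sketch. The real loader operates on a Hugging Face `DatasetDict`; the function below is an assumption-level illustration whose parameters mirror the `..._VALIDATION_FRACTION` and `..._SPLIT_SEED` settings.

```python
import random


def deterministic_split(n_rows: int, validation_fraction: float, seed: int) -> tuple[list[int], list[int]]:
    """Shuffle row indices with a fixed seed and carve off a validation slice.

    The same (n_rows, fraction, seed) triple always yields the same split,
    so the prepared cache stays reproducible across runs.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_validation = int(n_rows * validation_fraction)
    return indices[n_validation:], indices[:n_validation]


train_idx, val_idx = deterministic_split(100, 0.2, seed=42)
```

Because the shuffle is seeded, rerunning the loader never silently reshuffles which rows land in validation, which is what makes the cached split safe to reuse.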
TBA
To run the default offline test suite, make sure you have the virtual environment activated and run:

```shell
python -m pytest
```

To check coverage, run:

```shell
python -m pytest -m "not external_api" --cov=src --cov-fail-under=90 --cov-report=term-missing
```

External live API tests (OpenAI/W&B) are opt-in and excluded from PR CI by default. Run them only when credentials and provider SDKs are available:

```shell
export RUN_EXTERNAL_API_TESTS=true
export OPENAI_API_KEY=...
export WANDB_API_KEY=...
uv pip install openai wandb
python -m pytest -m external_api -q
```

In GitHub Actions, the dedicated live-test workflow
`.github/workflows/external-api-tests.yml` is manual-only via `workflow_dispatch`.
The live-test gate accepts either the unprefixed credential variables above or their supported prefixed aliases:
- `TRANSFORMER_CLASSIFIERS_OPENAI_API_KEY`
- `TRANSFORMER_CLASSIFIERS_WANDB_API_KEY`
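As an illustration, the gate's "either unprefixed or prefixed alias" rule could be expressed as below. The function name and exact logic are assumptions for the sketch, not the project's actual implementation.

```python
import os


def live_tests_enabled(environ=None) -> bool:
    """Opt-in gate: requires the flag plus one credential per provider,
    accepting unprefixed names or their prefixed aliases."""
    environ = os.environ if environ is None else environ
    if environ.get("RUN_EXTERNAL_API_TESTS", "").lower() != "true":
        return False

    def has(name: str) -> bool:
        return bool(environ.get(name) or environ.get("TRANSFORMER_CLASSIFIERS_" + name))

    return has("OPENAI_API_KEY") and has("WANDB_API_KEY")
```

Note that the flag alone is not enough: without both credentials the gate still reports the live tier as disabled, so a misconfigured shell fails closed rather than burning API quota.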
Why two test tiers:
- Offline tests provide fast, stable feedback for every contributor and PR.
- Live tests validate real provider wiring without making normal PR CI flaky or costly.
CI is configured in .github/workflows/ci.yml and is intentionally PR-focused.
It runs for open, reopened, synchronised, and ready-for-review pull requests.
Draft pull requests are ignored until they are marked as ready.
Dependency installation in CI uses `uv sync --group dev --frozen` to enforce lockfile reproducibility.
CI pipeline stages (in execution order):

1. Check PR Commit Policy
   - Fails if the PR has anything other than exactly one commit.
   - Fails if commit messages start with `fixup!` or `squash!`.
   - Keeps PR history clean before merge.
2. Secret Scanning
   - Runs the `detect-secrets` pre-commit hook against all files.
   - Fails the PR on newly introduced secrets or private key material.
3. Pre-commit Checks
   - Runs all hooks from `.pre-commit-config.yaml`.
   - Enforces formatting, linting, and lightweight safety checks.
4. Type Check (Pyright)
   - Runs static type checks with `pyright`.
   - Catches interface/typing issues before runtime tests.
5. Smoke Tests
   - Runs the `smoke` marker subset (`pytest -m smoke`).
   - Provides a fast runtime sanity check before full tests.
6. Pytest (Python 3.11)
   - Runs all non-live tests (`-m "not external_api"`) and enforces a minimum coverage of 90%.
   - Uploads `coverage.xml` as a workflow artifact for inspection.
7. Docs Build
   - Runs `mkdocs build --strict`.
   - Fails the PR if documentation pages, links, or API autodoc references are invalid.
8. Dependency Vulnerability Audit (Non-blocking)
   - Runs `pip-audit` against installed dependencies.
   - Reports known vulnerabilities in CI logs.
   - Is intentionally non-blocking while security posture is being established.
   - Runs with `if: always()` so findings are still emitted when test stages fail.
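The first stage's rules are simple enough to sketch as a pure function. The actual workflow presumably inspects the PR via `git` or the GitHub API, so treat this as an illustrative sketch only; the function name is invented for the example.

```python
def check_pr_commit_policy(commit_messages: list[str]) -> list[str]:
    """Return a list of policy violations for the PR's commits (empty means pass)."""
    violations: list[str] = []
    # Rule 1: exactly one commit per PR.
    if len(commit_messages) != 1:
        violations.append(f"expected exactly one commit, found {len(commit_messages)}")
    # Rule 2: no autosquash markers left in the history.
    for message in commit_messages:
        if message.startswith(("fixup!", "squash!")):
            violations.append(f"disallowed prefix in: {message!r}")
    return violations
```

A single clean commit yields an empty list; multiple commits or a leftover `fixup!` marker each add an explicit violation, which is the behaviour the CI stage enforces by failing the check.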
Security and dependency maintenance is configured with Dependabot in
`.github/dependabot.yml`:

- Weekly Python dependency update PRs (from `pyproject.toml`).
- Weekly GitHub Actions version update PRs.
The workflow also uses concurrency cancellation:
- When new commits are pushed to the same PR, in-progress older runs are cancelled.
- This avoids stale CI feedback and reduces consumed GitHub Actions minutes.
Branch protection/ruleset alignment:
- Require a pull request before merging.
- Required approvals: `0` (solo workflow), while keeping code-owner and conversation rules.
- Require review from Code Owners.
- Require conversation resolution before merging.
- Require status checks to pass (must be enabled), with required checks:
  - `Check PR Commit Policy`
  - `Secret Scanning`
  - `Pre-commit Checks`
  - `Type Check (Pyright)`
  - `Smoke Tests`
  - `Pytest (Python 3.11)`
  - `Docs Build`
- Keep `Dependency Vulnerability Audit (Non-blocking)` as informational for now, rather than as a required blocking status check.
- Block force pushes.
- Require linear history.
- Allow squash merging (and optional rebase merging), with merge commits disabled.
Operational rules we enforce in day-to-day development:
- Never commit plaintext secrets, keys, or credential files.
- Run `pre-commit` before pushing; secret scanning is a required CI gate.
- Keep secrets in local `.env` files or GitHub Secrets, never in notebooks/docs.
- Rotate keys immediately after any suspected exposure.
- Use the smallest required permissions/scope for every issued key.
- For proven scanner false positives in test fixtures, use line-level `# pragma: allowlist secret` only; do not weaken global scanner rules.
Full policy and acceptance criteria: Security and Developer Experience Contract.
Project documentation is built with MkDocs Material and published to GitHub Pages.
The site combines hand-written guides from `docs/` and API reference pages generated
from in-code Google-style docstrings.
Local docs commands:

```shell
just docs-build
just docs-serve
```

Documentation workflows:

- PRs run a strict build (`uv run mkdocs build --strict`) as a blocking CI gate.
- Pushes to `main` trigger `.github/workflows/docs-publish.yml` to deploy to GitHub Pages.
GitHub repository settings needed once:
- Settings -> Pages -> Build and deployment -> Source: `GitHub Actions`.
- Branch protection/ruleset -> required status checks: include `Docs Build`.
The project structure can be seen below, with files having the following roles:
| Folder | File | Description |
|---|---|---|
| ... | ... | ... |
TBA
TBA
Designed to detect prompt injections and jailbreak attempts (e.g., "ignore previous instructions", "DAN", roleplay).
- Backbone: ...
- Focus: ...
- Techniques: ...
Classifies text into scientific/technical categories versus general content.
- Backbone: ...
- Focus: ...
- Techniques: ...
TBA