AndreiRoibu/Transformer-Classifiers


Transformer Classifiers

1. Overview

This repository contains two NLP classification models built with the Hugging Face Transformers library. The project focuses on two distinct domains:

  • Adversarial Prompt Security (binary classification)
  • Scientific Text Classification (multiclass classification)

Both projects are unified by a common pipeline of Transformer-based classification and data augmentation.

Table of Contents

  1. Overview
  2. Quick Start Guide
    1. Local
    2. Demo
    3. Tests and CI
    4. Documentation
    5. Project Structure
    6. Conventions
  3. Project Description
  4. Project Extension and Future Work

2. Quick Start Guide

This section is intentionally ordered so a first-time contributor can move from machine setup to secure configuration and then to deterministic testing.

2.1. Local

To start locally, first ensure you have just and uv installed. If you don't, run the following OS-specific commands:

macOS:

brew install just uv

Linux (Debian/Ubuntu):

sudo apt-get update
sudo apt-get install -y just
curl -LsSf https://astral.sh/uv/install.sh | sh
# then restart your shell so uv is on PATH

Windows:

# uv (official installer)
irm https://astral.sh/uv/install.ps1 | iex

# just: install with whichever package manager you use
# winget (preferred if available)
winget install casey.just -e  # if this ID doesn't resolve, use scoop or chocolatey below
# scoop
scoop install just
# chocolatey
choco install just

Then, install the dependencies and activate the virtual environment by running:

just install
source .venv/bin/activate  # on Windows: .venv\Scripts\activate

2.1.1. Configure API Keys Securely

The project expects two runtime keys:

  • OPENAI_API_KEY
  • WANDB_API_KEY

Use the committed template and keep real keys in a local, ignored file:

cp .env.example .env

Then edit .env and replace placeholder values with your own keys.

Notes:

  • .env is ignored by Git and must never be committed.
  • Shell environment variables override .env values.
  • Missing required keys raise a configuration error early.
  • Runtime configuration is modelled with pydantic-settings via RuntimeSettings.
  • load_settings() uses process env and falls back to the default .env.
  • load_settings(env_file=None) disables env-file loading entirely.
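The precedence rule above (shell environment over .env values) can be sketched as follows. This is an illustrative helper, not the project's API; pydantic-settings applies the same ordering internally:

```python
def resolve_setting(name: str, process_env: dict, dotenv_values: dict):
    # Process environment variables take precedence over .env entries;
    # a key missing from both yields None for the caller to validate.
    if name in process_env:
        return process_env[name]
    return dotenv_values.get(name)

# The shell environment wins over the .env file:
print(resolve_setting(
    "WANDB_API_KEY",
    {"WANDB_API_KEY": "from-shell"},
    {"WANDB_API_KEY": "from-dotenv"},
))  # from-shell
```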

Example usage in code:

from TransformerClassifiers import load_api_keys, load_settings, load_prompt_injection_dataset

api_keys = load_api_keys()
# api_keys.openai_api_key
# api_keys.wandb_api_key

settings = load_settings()
# settings.dataset_cache_dir
# settings.request_timeout_seconds
# settings.results_database_url

dataset_bundle = load_prompt_injection_dataset(settings=settings)
# dataset_bundle.dataset["train"]
# dataset_bundle.metadata.cache_key
# dataset_bundle.to_pandas("train")

Additional settings (datasets, network, database, artefact paths) use the TRANSFORMER_CLASSIFIERS_ env var prefix, for example:

  • TRANSFORMER_CLASSIFIERS_DATASET_CACHE_DIR
  • TRANSFORMER_CLASSIFIERS_REQUEST_TIMEOUT_SECONDS
  • TRANSFORMER_CLASSIFIERS_RESULTS_DATABASE_URL
  • TRANSFORMER_CLASSIFIERS_MODEL_ARTIFACTS_DIR
  • TRANSFORMER_CLASSIFIERS_PROMPT_INJECTION_DATASET_ID
  • TRANSFORMER_CLASSIFIERS_PROMPT_INJECTION_DATASET_REVISION
  • TRANSFORMER_CLASSIFIERS_PROMPT_INJECTION_VALIDATION_FRACTION
  • TRANSFORMER_CLASSIFIERS_PROMPT_INJECTION_SPLIT_SEED
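The mapping from prefixed environment variables to settings fields can be sketched as below. pydantic-settings performs this translation itself via its env_prefix option; the helper here is purely illustrative:

```python
PREFIX = "TRANSFORMER_CLASSIFIERS_"

def read_prefixed_settings(environ: dict) -> dict:
    # Collect variables carrying the project prefix, lower-casing the
    # remainder so names line up with settings field names.
    return {
        name[len(PREFIX):].lower(): value
        for name, value in environ.items()
        if name.startswith(PREFIX)
    }

env = {
    "TRANSFORMER_CLASSIFIERS_DATASET_CACHE_DIR": "/tmp/datasets",
    "TRANSFORMER_CLASSIFIERS_REQUEST_TIMEOUT_SECONDS": "30",
    "UNRELATED": "x",
}
print(read_prefixed_settings(env))
# {'dataset_cache_dir': '/tmp/datasets', 'request_timeout_seconds': '30'}
```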

Why this matters:

  • It keeps credentials and environment-specific settings out of code.
  • It makes local, CI, and deployment behaviour consistent.
  • It gives us one typed configuration contract as the project grows.

2.1.2. Load the Prompt-Injection Dataset

The first formal pipeline stage is the prompt-injection data loader. It:

  • loads the source dataset from Hugging Face
  • validates the raw dataset contract
  • derives a deterministic validation split from the source training split
  • caches the prepared DatasetDict locally for reuse

Example:

from TransformerClassifiers import load_prompt_injection_dataset

bundle = load_prompt_injection_dataset()

train_dataset = bundle.dataset["train"]
train_frame = bundle.to_pandas("train")
metadata = bundle.metadata

Prepared datasets are reused from the local cache by default. Use force_refresh=True when you explicitly want to rebuild the prepared cache.
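The deterministic split is the key reproducibility property here: the same seed and validation fraction must always yield the same partition of the source training split. A minimal sketch of the idea, using a hypothetical helper rather than the loader's actual implementation (the real split is governed by the split-seed and validation-fraction settings listed above):

```python
import random

def deterministic_split(indices: list, validation_fraction: float, seed: int):
    # Shuffle a copy with a seeded RNG so the partition is reproducible,
    # then carve the validation fraction off the front.
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * validation_fraction)
    return shuffled[cut:], shuffled[:cut]  # (train, validation)

train_idx, val_idx = deterministic_split(list(range(100)), 0.1, seed=13)
print(len(train_idx), len(val_idx))  # 90 10
```

Because the RNG is seeded, re-running with the same seed reproduces the exact same validation set, which is what makes cached prepared datasets safe to reuse.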

2.2. Demo

TBA

2.3. Tests and CI

To run the default offline test suite, make sure you have the virtual environment activated and run:

python -m pytest

To check coverage, run:

python -m pytest -m "not external_api" --cov=src --cov-fail-under=90 --cov-report=term-missing

External live API tests (OpenAI/W&B) are opt-in and excluded from PR CI by default. Run them only when credentials and provider SDKs are available:

export RUN_EXTERNAL_API_TESTS=true
export OPENAI_API_KEY=...
export WANDB_API_KEY=...
uv pip install openai wandb
python -m pytest -m external_api -q

In GitHub Actions, the dedicated live-test workflow .github/workflows/external-api-tests.yml is manual-only via workflow_dispatch.

The live-test gate accepts either the unprefixed credential variables above or their supported prefixed aliases:

  • TRANSFORMER_CLASSIFIERS_OPENAI_API_KEY
  • TRANSFORMER_CLASSIFIERS_WANDB_API_KEY
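The gate logic described above amounts to: the opt-in flag must be set, and each credential may come from either the unprefixed name or its prefixed alias. A sketch of that check (illustrative, not the repository's actual gate code):

```python
def live_tests_enabled(environ: dict) -> bool:
    # The opt-in flag must be explicitly set to "true".
    if environ.get("RUN_EXTERNAL_API_TESTS", "").lower() != "true":
        return False

    def credential_present(name: str) -> bool:
        # Accept either the unprefixed variable or its prefixed alias.
        return bool(
            environ.get(name)
            or environ.get("TRANSFORMER_CLASSIFIERS_" + name)
        )

    return credential_present("OPENAI_API_KEY") and credential_present("WANDB_API_KEY")
```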

Why two test tiers:

  • Offline tests provide fast, stable feedback for every contributor and PR.
  • Live tests validate real provider wiring without making normal PR CI flaky or costly.

CI is configured in .github/workflows/ci.yml and is intentionally PR-focused. It runs for open, reopened, synchronised, and ready-for-review pull requests. Draft pull requests are ignored until they are marked as ready. Dependency installation in CI uses uv sync --group dev --frozen to enforce lockfile reproducibility.

CI pipeline stages (in execution order):

  1. Check PR Commit Policy
    • Fails if the PR has anything other than exactly one commit.
    • Fails if commit messages start with fixup! or squash!.
    • Keeps PR history clean before merge.
  2. Secret Scanning
    • Runs the detect-secrets pre-commit hook against all files.
    • Fails the PR on newly introduced secrets or private key material.
  3. Pre-commit Checks
    • Runs all hooks from .pre-commit-config.yaml.
    • Enforces formatting, linting, and lightweight safety checks.
  4. Type Check (Pyright)
    • Runs static type checks with pyright.
    • Catches interface/typing issues before runtime tests.
  5. Smoke Tests
    • Runs the smoke marker subset (pytest -m smoke).
    • Provides a fast runtime sanity check before full tests.
  6. Pytest (Python 3.11)
    • Runs all non-live tests (-m "not external_api") and enforces a minimum coverage of 90%.
    • Uploads coverage.xml as a workflow artifact for inspection.
  7. Docs Build
    • Runs mkdocs build --strict.
    • Fails the PR if documentation pages, links, or API autodoc references are invalid.
  8. Dependency Vulnerability Audit (Non-blocking)
    • Runs pip-audit against installed dependencies.
    • Reports known vulnerabilities in CI logs.
    • Is intentionally non-blocking while security posture is being established.
    • Runs with if: always() so findings are still emitted when test stages fail.

Security and dependency maintenance is configured with Dependabot in .github/dependabot.yml:

  • Weekly Python dependency update PRs (from pyproject.toml).
  • Weekly GitHub Actions version update PRs.

The workflow also uses concurrency cancellation:

  • When new commits are pushed to the same PR, in-progress older runs are cancelled.
  • This avoids stale CI feedback and reduces consumed GitHub Actions minutes.

Branch protection/ruleset alignment:

  • Require a pull request before merging.
  • Required approvals: 0 (solo workflow), while keeping code-owner and conversation rules.
  • Require review from Code Owners.
  • Require conversation resolution before merging.
  • Require status checks to pass (must be enabled), with required checks:
    • Check PR Commit Policy
    • Secret Scanning
    • Pre-commit Checks
    • Type Check (Pyright)
    • Smoke Tests
    • Pytest (Python 3.11)
    • Docs Build
  • Keep Dependency Vulnerability Audit (Non-blocking) informational for now, rather than a required blocking status check.
  • Block force pushes.
  • Require linear history.
  • Allow squash merging (and optional rebase merging), with merge commits disabled.

2.3.1. Security Hardening Rules

Operational rules we enforce in day-to-day development:

  1. Never commit plaintext secrets, keys, or credential files.
  2. Run pre-commit before pushing; secret scanning is a required CI gate.
  3. Keep secrets in local .env files or GitHub Secrets, never in notebooks/docs.
  4. Rotate keys immediately after any suspected exposure.
  5. Use the smallest required permissions/scope for every issued key.
  6. For proven scanner false positives in test fixtures, use line-level # pragma: allowlist secret only; do not weaken global scanner rules.
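For rule 6, a line-level allowlist in a test fixture looks like this (the key value is hypothetical; the trailing pragma suppresses detect-secrets for that single line only):

```python
# A deliberately fake credential used only as a test fixture; the
# trailing pragma tells detect-secrets to skip this one line.
FAKE_OPENAI_KEY = "sk-test-0000000000"  # pragma: allowlist secret
```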

Full policy and acceptance criteria: Security and Developer Experience Contract.

2.4. Documentation

Project documentation is built with MkDocs Material and published to GitHub Pages. The site combines hand-written guides from docs/ and API reference pages generated from in-code Google-style docstrings.

Local docs commands:

just docs-build
just docs-serve

Documentation workflows:

  • PRs run a strict build (uv run mkdocs build --strict) as a blocking CI gate.
  • Pushes to main trigger .github/workflows/docs-publish.yml to deploy to GitHub Pages.

GitHub repository settings needed once:

  • Settings -> Pages -> Build and deployment -> Source: GitHub Actions.
  • Branch protection/ruleset -> required status checks: include Docs Build.

2.5. Project Structure

The project structure is shown below, with each file's role:

Folder File Description
... ... ...

2.6. Conventions

TBA

3. Project Description

TBA

3.1. Models

3.1.1. Adversarial Prompt Classifier

Designed to detect prompt injections and jailbreak attempts (e.g., "ignore previous instructions", "DAN", roleplay).

  • Backbone: ...
  • Focus: ...
  • Techniques: ...

3.1.2. Scientific Text Classifier

Classifies text into scientific/technical categories versus general content.

  • Backbone: ...
  • Focus: ...
  • Techniques: ...

4. Project Extension and Future Work

TBA
