A modular PDF toolkit that currently supports merging PDF files and rotating pages.
Designed to evolve into a toolkit for merging, splitting, rotating, and compressing PDF files.
Built as an ongoing hands-on practice project to deepen my Python skills.
Inspired by https://dailypythonprojects.substack.com/p/build-a-pdf-toolkit-with-python-day
The project follows this structure:
Layers: CLI → operations → core, with utils as shared helpers.
See docs/architecture.md for details.
pdf_tools/
│
├── main.py
│       # Application entry point
│       # CLI dispatch, logging configuration, exit codes
│
├── cli/
│   └── parser.py
│           # Defines CLI subcommands and argument parsing using argparse
│
├── core/
│   ├── merger.py
│   │       # Pure PDF logic
│   │       # Implements `merge_pdfs()`
│   └── transformer.py
│           # Pure PDF logic
│           # Implements `rotate_pages()`
│
├── operations/
│   ├── merge.py
│   │       # Operation orchestration layer for merge
│   │       # Performs validation, file discovery, and calls core logic
│   │       # Uses structured logging
│   │       # Raises `OperationError` on failure
│   └── rotate.py
│           # Operation orchestration layer for rotate
│           # Validates input file, parses page specs, and calls core logic
│           # Raises `OperationError` on failure
│
└── utils/
    ├── file_utils.py
    │       # File system helpers
    │       # e.g. `find_pdf_files()`
    │
    └── validation.py
            # Generic validation helpers
            # `validate_folder()`, `ensure_pdf_extension()`, `is_writable_directory()`
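To give a feel for the kind of helper that lives in `utils/`, here is a minimal sketch. The function names are taken from the tree above, but the signatures and exact behaviour are assumptions for illustration, not the project's actual code.

```python
from pathlib import Path


def ensure_pdf_extension(filename: str) -> str:
    """Return filename unchanged if it ends in .pdf (any case), else append .pdf.

    Assumed behaviour; the real helper may differ.
    """
    return filename if filename.lower().endswith(".pdf") else f"{filename}.pdf"


def find_pdf_files(folder: str) -> list[Path]:
    """Return the PDF files directly inside folder, sorted by path.

    Assumed behaviour: non-recursive, extension match is case-insensitive.
    """
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

Keeping these helpers free of logging and CLI concerns means both the merge and rotate operations can reuse them.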
- separation of responsibilities (CLI adapter → operations → core)
- core: pure functions, no logging/CLI/HTTP
- operations: orchestration, validation, logging, user-friendly error messages
- make PDF operations reusable as importable functions
- maintain a clean and readable project structure
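The split between operations and core can be sketched roughly as follows. This is a simplified illustration, not the real `operations/merge.py`: `MergeConfig`, `MergeResult`, and `run_merge` are hypothetical names, and the core function is passed in as a parameter so the example runs without `pypdf`.

```python
from dataclasses import dataclass
from typing import Callable


class OperationError(Exception):
    """Raised by the operations layer with a user-friendly message."""


@dataclass(frozen=True)
class MergeConfig:  # hypothetical name
    input_files: list[str]
    output_path: str


@dataclass(frozen=True)
class MergeResult:  # hypothetical name
    output_path: str
    files_merged: int


def run_merge(
    config: MergeConfig,
    merge_core: Callable[[list[str], str], None],
) -> MergeResult:
    """Validate the config, call the pure core function, translate failures."""
    if not config.input_files:
        raise OperationError("No PDF files to merge.")
    try:
        merge_core(config.input_files, config.output_path)
    except OSError as exc:
        # Exception chaining keeps the original error available as __cause__.
        raise OperationError(f"Could not write {config.output_path}") from exc
    return MergeResult(config.output_path, len(config.input_files))
```

The CLI (or a future web layer) only needs to build a config, call the operation, and handle `OperationError`; it never touches the PDF logic directly.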
- structuring small Python projects into modular components
- applying separation of concerns (CLI layer vs operations vs core logic)
- designing operation orchestration layers (config, result dataclasses, `OperationError`)
- designing CLI tools with `argparse` (subcommands, `dest`, `required`)
- instantiating classes vs calling functions; modules vs classes
- controlled exception handling: raising custom errors, exception chaining
- safe file handling using context managers
- working with `pypdf` to process structured documents
- validation and normalisation: input, directories, write permissions, output paths
- type hints
- dataclasses
- managing virtual environments and project dependencies with `uv` and `git`
- Runtime: Python 3.12
- Third-party: `pypdf`
- Standard library: `argparse`, `dataclasses`, `logging`, `os`, `sys`
- Tooling: `uv`, `git`
Install uv (if you don’t have it yet):
Follow the official instructions for your platform: https://docs.astral.sh/uv/getting-started/installation/.
Clone and set up:
# clone the repository:
git clone https://github.com/jmozzi/pdf-tools
cd pdf-tools
# create a virtual environment with uv (no manual activation needed when using `uv run`):
uv venv
# install dependencies with uv (using `pyproject.toml`)
uv sync
Once inside the virtual environment:
Run without arguments: shows the help menu (subcommand required)
uv run main.py
Merge command:
uv run main.py merge -p path/to/folder -o merged_file.pdf
Rotate command:
uv run main.py rotate -i "path/to/input.pdf" -pg "1,3,5-7" -o rotated_output.pdf
Options for merge:
| Option | Short | Default | Description |
|---|---|---|---|
| `--path` | `-p` | current directory | Folder containing PDF files |
| `--output` | `-o` | `merged_output.pdf` | Output filename |
Options for rotate:
| Option | Short | Default | Description |
|---|---|---|---|
| `--input` | `-i` | (required) | Input PDF file to rotate |
| `--output` | `-o` | `rotated_output.pdf` | Output PDF filename |
| `--pages` | `-pg` | (required) | Pages to rotate (1-based); accepts comma-separated pages and ranges, e.g. `1,3,5-7` |
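A page spec such as `1,3,5-7` can be expanded into individual page numbers along these lines. This is a minimal sketch; `parse_page_spec` is a hypothetical name, and the project's own parser may validate differently (for example, against the document's page count).

```python
def parse_page_spec(spec: str) -> list[int]:
    """Expand a spec such as "1,3,5-7" into sorted, unique 1-based page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"Empty entry in page spec: {spec!r}")
        # "5-7" splits into ("5", "-", "7"); a single page "3" gives ("3", "", "").
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)  # int() raises ValueError on non-numeric input
        end = int(end_text) if sep else start
        if start < 1 or end < start:
            raise ValueError(f"Invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_page_spec("1,3,5-7"))  # [1, 3, 5, 6, 7]
```

Since `pypdf` indexes pages from zero, code that applies the rotation would typically subtract 1 from each number before looking up the page.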
Help:
uv run main.py -h
uv run main.py merge -h
uv run main.py rotate -h
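For reference, a parser exposing the subcommands and options listed above could be built with `argparse` roughly like this. It is a sketch, not a copy of `cli/parser.py`: `build_parser` and the `dest="command"` name are assumptions, as is using `"."` to represent the current directory.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with `merge` and `rotate` subcommands (illustrative only)."""
    parser = argparse.ArgumentParser(prog="main.py", description="PDF toolkit")
    # dest stores the chosen subcommand name; required=True rejects a missing one.
    subparsers = parser.add_subparsers(dest="command", required=True)

    merge = subparsers.add_parser("merge", help="Merge all PDFs in a folder")
    merge.add_argument("-p", "--path", default=".", help="Folder containing PDF files")
    merge.add_argument("-o", "--output", default="merged_output.pdf", help="Output filename")

    rotate = subparsers.add_parser("rotate", help="Rotate selected pages")
    rotate.add_argument("-i", "--input", required=True, help="Input PDF file to rotate")
    rotate.add_argument("-pg", "--pages", required=True, help="Pages to rotate, e.g. 1,3,5-7")
    rotate.add_argument("-o", "--output", default="rotated_output.pdf", help="Output PDF filename")
    return parser


args = build_parser().parse_args(["rotate", "-i", "input.pdf", "-pg", "1,3,5-7"])
print(args.command, args.pages)  # rotate 1,3,5-7
```

The entry point can then dispatch on `args.command` to the matching operation.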
The docs/ folder holds detailed explanations:
| File | Content |
|---|---|
| `docs/architecture.md` | layers, responsibilities, error handling |
| `docs/flow-merge-cli-to-core.md` | how one merge command flows through the system |
| `docs/cli-parser.md` | how the CLI parser and subcommands work |
| `docs/operations-merge.md` | how `MergeOperation` works (config, result, validation, exceptions) |
| `docs/operations-merge-design-choices.md` | why I designed `operations/merge.py` this way |
| `docs/operations-rotate.md` | how `RotateOperation` works (config, result, validation, exceptions) |
- add unit tests for merger
- add split, compress commands and core functions
- add web layer (Flask or FastAPI), reusing operations
