A faithful “from-scratch” implementation of the Transformer architecture:
- Token/positional embeddings
- Multi-Head Self-Attention
- Encoder/Decoder stacks with residual connections + LayerNorm
- Position-wise feed-forward networks
- Teacher forcing / masking for decoder inputs (if seq2seq)
- Clear training pipeline with early stopping & LR scheduling
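The embedding step listed above (token embeddings plus sinusoidal positional encoding) can be sketched as follows; this is a minimal sketch, and the function name is illustrative, not from the repo:

```python
import math
import torch

def sinusoidal_positional_encoding(max_len: int, d_model: int) -> torch.Tensor:
    """Sin/cos positional encoding as in 'Attention Is All You Need'.

    Even feature indices get sin, odd indices get cos, with wavelengths
    forming a geometric progression from 2*pi to 10000*2*pi.
    """
    position = torch.arange(max_len).unsqueeze(1)                 # (max_len, 1)
    div_term = torch.exp(
        torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model)
    )
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)                  # even dims
    pe[:, 1::2] = torch.cos(position * div_term)                  # odd dims
    return pe                                                     # (max_len, d_model)
```

The encoding is typically added to the token embeddings (scaled by `sqrt(d_model)`) before the first encoder layer.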
- Embedding(d_model) + sinusoidal positional encoding
- N encoder layers:
- Multi-Head Self-Attention (Q/K/V projections, scaled dot-product, softmax)
- Residual + LayerNorm
- FFN: Linear → ReLU → Linear
- N decoder layers (if seq2seq):
- Masked self-attention
- Cross-attention with encoder outputs
- FFN block with residuals/LayerNorm
- Classifier head (for classification) or linear projection to vocab (for generation)
Optimizer: AdamW
Scheduler: Cosine or warmup + decay
Loss: CrossEntropy (label smoothing optional)
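The optimizer/scheduler/loss combination above can be wired up as follows; the learning rate, weight decay, warmup length, smoothing factor, and pad id are illustrative assumptions, not values from the repo:

```python
import math
import torch
import torch.nn as nn

def make_training_setup(model: nn.Module, lr: float = 3e-4,
                        warmup_steps: int = 1000, total_steps: int = 10000):
    """AdamW + linear warmup followed by cosine decay + label-smoothed CE.
    All hyperparameters here are assumed defaults for illustration."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=0.01)

    def lr_lambda(step: int) -> float:
        if step < warmup_steps:
            return step / max(1, warmup_steps)                 # linear warmup
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return 0.5 * (1.0 + math.cos(math.pi * progress))      # cosine decay to 0

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
    # ignore_index assumes pad token id 0 so padding never contributes to the loss
    criterion = nn.CrossEntropyLoss(label_smoothing=0.1, ignore_index=0)
    return optimizer, scheduler, criterion
```

Call `scheduler.step()` once per optimizer step (not per epoch) so the warmup counts gradient updates.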
torch==2.4.1
numpy==2.1.3
pandas==2.2.3
matplotlib==3.9.3
seaborn==0.13.2
scikit-learn==1.5.2
tqdm==4.66.5

- Proper masking (pad & causal) is crucial for stable training
- Warmup + cosine schedule helps initial convergence
- LayerNorm placement (Pre-Norm) can improve gradient flow on deeper stacks
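The pad & causal masks mentioned in the first note can be built together; a minimal sketch, assuming pad token id 0 and the convention that 1 = attend, 0 = blocked:

```python
import torch

def make_masks(src: torch.Tensor, tgt: torch.Tensor, pad_id: int = 0):
    """Build the encoder padding mask and the decoder pad+causal mask.

    Shapes are broadcastable against attention scores of shape
    (batch, heads, query_len, key_len). `pad_id` is an assumption.
    """
    # encoder mask: hide padded source positions
    src_mask = (src != pad_id).unsqueeze(1).unsqueeze(2)        # (B, 1, 1, S)

    # decoder mask: hide padded targets AND all future positions
    T = tgt.size(1)
    causal = torch.tril(torch.ones(T, T, dtype=torch.bool))     # lower triangle
    tgt_pad = (tgt != pad_id).unsqueeze(1).unsqueeze(2)         # (B, 1, 1, T)
    tgt_mask = tgt_pad & causal                                 # (B, 1, T, T)
    return src_mask, tgt_mask
```

Combining the two masks with `&` means a decoder position is attendable only if it is both non-pad and not in the future, which is what keeps teacher-forced training from leaking targets.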
The dataset is included in the repository.
All results, including train, validation, and test metrics and correlation analyses, are in the notebook.