- update `config.yaml`
- update `params.yaml`
- update the entity
- update the configuration manager in `src/config`
- update the components
- update the pipeline
- update `main.py`
- update `app.py`
This project is a comprehensive End-to-End Text Summarizer built with Python and the Hugging Face Transformers library. It uses Google's Pegasus model, fine-tuned on the SAMSum dataset, to generate concise summaries of dialogue-based text. The project follows a modular pipeline design for scalability and ease of maintenance.
- Modular Pipeline: Distinct stages for Data Ingestion, Validation, Transformation, Model Training, and Evaluation.
- State-of-the-Art Model: Utilizes Google's Pegasus model for high-quality abstractive summarization.
- Configuration Management: Centralized configuration via `config.yaml` and `params.yaml`.
- Logging: Robust logging system for tracking pipeline execution.
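The configuration manager typically reads `config.yaml` and exposes each section as a typed entity object. A minimal sketch of that pattern, assuming PyYAML is installed (the class names, YAML keys, and the `source_URL` value below are illustrative, not the repository's exact API):

```python
from dataclasses import dataclass
from pathlib import Path

import yaml  # PyYAML

# Inline stand-in for config.yaml; in the project this is read from disk.
CONFIG_YAML = """
data_ingestion:
  root_dir: artifacts/data_ingestion
  source_URL: https://example.com/samsum.zip
"""


@dataclass(frozen=True)
class DataIngestionConfig:
    """Entity: typed view of the data_ingestion section."""
    root_dir: Path
    source_URL: str


class ConfigurationManager:
    def __init__(self, config_text: str = CONFIG_YAML):
        # Parse the YAML once; getters hand out typed config entities.
        self.config = yaml.safe_load(config_text)

    def get_data_ingestion_config(self) -> DataIngestionConfig:
        section = self.config["data_ingestion"]
        return DataIngestionConfig(
            root_dir=Path(section["root_dir"]),
            source_URL=section["source_URL"],
        )
```

Keeping parsing in one place means pipeline components receive plain dataclasses and never touch YAML directly.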
- Language: Python
- Libraries: Hugging Face Transformers, PyTorch, Datasets, Pandas, NLTK
- Tools: Docker, FastAPI (planned)
To get started, clone the repository to your local machine:
```bash
git clone https://github.com/krishnab0841/End_To_End_Text_Summarizer.git
cd End_To_End_Text_Summarizer
```

It is recommended to use a virtual environment to manage dependencies:

```bash
conda create -n summary python=3.8 -y
conda activate summary
```

Install the required Python packages:

```bash
pip install -r requirements.txt
```

Execute the main script to run the entire training and evaluation pipeline:

```bash
python main.py
```

- `config/`: Configuration files.
- `src/`: Source code for components and pipelines.
- `research/`: Jupyter notebooks for experimentation.
- `artifacts/`: Generated artifacts (datasets, models, metrics).
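`main.py` usually just orchestrates the pipeline stages in order, wrapping each one in logging and error handling. A minimal sketch of that orchestration (the stage and class names here are illustrative placeholders, not the repository's exact API):

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="[%(asctime)s] %(levelname)s: %(message)s",
)
logger = logging.getLogger("textSummarizer")


class DataIngestionPipeline:
    def main(self):
        logger.info("Downloading and extracting the dataset...")


class ModelTrainerPipeline:
    def main(self):
        logger.info("Fine-tuning Pegasus on the prepared dataset...")


# Stages run strictly in order; a failure aborts the run.
STAGES = [
    ("Data Ingestion", DataIngestionPipeline),
    ("Model Trainer", ModelTrainerPipeline),
]


def run_all():
    completed = []
    for name, pipeline_cls in STAGES:
        try:
            logger.info(f">>> Stage {name} started <<<")
            pipeline_cls().main()
            logger.info(f">>> Stage {name} completed <<<")
            completed.append(name)
        except Exception:
            logger.exception(f"Stage {name} failed")
            raise
    return completed


if __name__ == "__main__":
    run_all()
```

Re-raising after logging keeps the traceback visible in the console while still recording which stage failed.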