This project reads nucleotide sequences from a FASTA file, finds all ATG start codons, and translates each open reading frame (ORF) into an amino acid sequence until a stop codon or the end of the sequence is reached.
- Handles multiple sequences in one FASTA file
- Finds all ATG start codons
- Stops translation at TAA, TAG, or TGA codons
- Outputs translated sequences in FASTA format
- Requires Python 3.x
- Place your nucleotide sequences in a FASTA file, e.g., input.fasta.
- Run the Python script:
python genetic_translation.py -i input.fasta -o output.fasta
- The translated protein sequences will be saved in output.fasta.
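For readers curious how the input step might work, here is a minimal sketch of command-line parsing and multi-sequence FASTA reading. It is not taken from genetic_translation.py: the function names `parse_args` and `read_fasta`, the `dest` names, and the choice to use the first word of the header as the sequence name are all assumptions made for illustration. Only the `-i` and `-o` flags come from the command above.

```python
import argparse


def parse_args(argv=None):
    """Parse the -i/-o flags used in the command above (dest names are assumed)."""
    parser = argparse.ArgumentParser(description="Translate ORFs from a FASTA file")
    parser.add_argument("-i", dest="input", required=True, help="input nucleotide FASTA")
    parser.add_argument("-o", dest="output", required=True, help="output protein FASTA")
    return parser.parse_args(argv)


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines.

    Assumption: the sequence name is the first whitespace-delimited
    token after '>'; sequence lines are joined without separators.
    """
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)  # emit the previous record
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)  # emit the last record
```

With these helpers, a script could call `parse_args()` and then iterate over `read_fasta(open(args.input))` to process each sequence in turn.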
>seq1
CGCGATATGCATGTACTAATATAAGATGATAATCA
>seq1-6
MHVLI
>seq1-10
MY
>seq1-25
MII
- Each sequence in the output FASTA file is named with the input sequence name, followed by the zero-based start position of the ATG start codon.
- Incomplete codons at the end of a sequence are ignored.
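The naming and translation rules above can be sketched in a few lines of Python. This is an illustrative reimplementation, not the code in genetic_translation.py: the function name `translate_orfs`, the use of the standard genetic code table, and the mapping of unrecognized codons (for example, ones containing N) to `X` are assumptions.

```python
from itertools import product

# Standard genetic code, laid out in TCAG order; '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orfs(name, seq):
    """Return (record_name, protein) pairs, one per ATG in seq."""
    seq = seq.upper()
    records = []
    start = seq.find("ATG")
    while start != -1:
        protein = []
        # Step through whole codons only; a trailing partial codon is ignored.
        for i in range(start, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[i:i + 3], "X")  # 'X' for unknown codons (assumption)
            if aa == "*":
                break  # TAA, TAG, or TGA ends translation
            protein.append(aa)
        # Record name: input name plus the zero-based position of the ATG.
        records.append((f"{name}-{start}", "".join(protein)))
        start = seq.find("ATG", start + 1)  # next ATG, overlapping ones included
    return records


for record_name, protein in translate_orfs("seq1", "CGCGATATGCATGTACTAATATAAGATGATAATCA"):
    print(f">{record_name}\n{protein}")
```

Run on the example sequence, this prints the same three records shown above (seq1-6, seq1-10, seq1-25). Note that every ATG starts its own record, even when it lies inside a region already covered by an earlier ORF in a different reading frame.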
If you want to try this project immediately:
- Clone the repository:
git clone https://github.com/marti-dotcom/DNA-Translation.git
- Navigate to the folder:
cd DNA-Translation
- Run the script as shown above.
Thank you for checking out my DNA to Protein Translation project!
I hope you find it useful and easy to run. Feel free to STAR the repo if you like it!
Made with <3 by Martina