Another Natural Language Processing (NLP) exploration of the bioinformatics literature, carried out in the autumn semester of 2017. For the previous exploration from autumn 2016, see here.
The task comes from the Neural Information Processing Systems (NIPS) 2017 Competition Track: "Classifying Clinically Actionable Genetic Mutations", hosted on Kaggle.
To predict the effect of genetic variants and thereby enable personalized medicine, we need to determine the oncogenicity (4 classes) and mutation effect (9 classes) of the genes based on abstracts from the biomedical literature. This can be framed as a knowledge discovery task, i.e. extracting gene-variation-effect relationships. I tried DeepDive, which has not been updated since 2017.
For the detailed procedure, refer to the ppt file.
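
As a rough illustration of the gene-variation-effect task described above, a DeepDive-style pipeline usually starts by extracting candidate mention pairs from sentences before any classification takes place. The sketch below is plain Python, not DDlog, and is not the code used in this work. The toy lexicon `GENES`, the pattern `VARIATION_RE`, and the function `extract_candidates` are hypothetical names chosen for illustration, and the pattern only covers simple protein-level point mutations such as V600E.

```python
import re
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A (gene, variation) pair that co-occurs in one sentence."""
    gene: str
    variation: str
    sentence: str


# Hypothetical toy lexicon; a real pipeline would load a curated gene dictionary.
GENES = {"BRAF", "EGFR", "KRAS"}

# Assumed pattern for point mutations written as
# <amino acid><position><amino acid>, e.g. V600E or G12D.
AMINO = "ACDEFGHIKLMNPQRSTVWY"
VARIATION_RE = re.compile(rf"\b[{AMINO}]\d+[{AMINO}]\b")


def extract_candidates(text: str) -> List[Candidate]:
    """Return every gene/variation pair found in the same sentence."""
    candidates = []
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
        genes = sorted(GENES & tokens)
        variations = sorted(set(VARIATION_RE.findall(sentence)))
        for gene in genes:
            for variation in variations:
                candidates.append(Candidate(gene, variation, sentence))
    return candidates


if __name__ == "__main__":
    abstract = (
        "The BRAF V600E mutation is oncogenic. "
        "EGFR was also studied. "
        "KRAS G12D drives tumor growth."
    )
    for cand in extract_candidates(abstract):
        print(cand.gene, cand.variation)
```

Each candidate pair would then be assigned an effect class by a downstream model. Sentences that mention a gene without a variation, such as the second one in the example abstract, produce no candidates.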
This work also required the following skills:
- DDlog, a special-purpose programming language introduced by the DeepDive authors
- Other programming skills: bash, Python, SQL, HTML
- A database system: PostgreSQL