PALAVA: a neural-network model for inferring genetic pathways from single-cell data

Built using scvi-tools.

Performs nonlinear factor analysis and incorporates gene set information to pre-annotate factors with gene sets.

Overview

PALAVA is a nonlinear factor analysis method that incorporates prior knowledge through gene sets. It provides interpretable dimension reduction for analysing biological signals in the data. Each annotated latent variable uses a gene set as prior knowledge, so the biological meaning of the gene set is associated with the corresponding latent variable. This gives a more meaningful latent space, because the latent variables are pre-annotated with biological meaning. The method does not assume that you know in advance which gene sets (or biological processes) are relevant to the data, so a reasonably large surplus of gene sets can be provided. PALAVA reports factor importance scores that rank the factors by importance. The model is flexible enough to infer nonlinear relationships between genes, and so to capture more complicated biological processes in the data. The design of the annotated decoder also accounts for errors in the gene sets. Consequently, interpretability techniques can be used to refine the gene sets based on information from the data: the method can introduce relevant genes into a gene set, or leave gene set genes unused if the counts data carries no such signal.

Installation

Create a conda environment with Python 3.10: conda create --name palava-env python=3.10
Activate the conda environment: conda activate palava-env
Then run: pip install git+ssh://git@github.com/shimlab/PALAVA.git

For an editable installation, or to install from a cloned repo:

Clone this repository.
After cloning, navigate to the repo root (the directory containing setup.py) and run pip install -e . for an editable installation, or pip install . for a normal installation.

Test run

The test notebook runs the method and visualises its output. It requires the palava_on_sim_data_a_test.h5ad data file in the directory. Using a GPU is highly recommended: training takes less than 10 minutes on a GPU and approximately 1 hour on a CPU.

Required input of PALAVA

The raw counts of the single-cell RNA-seq data to analyse (should not be log transformed).
A collection of gene sets that you think could be relevant to the data at hand (for example, the 50 Hallmark gene sets).
An example on simulated data can be found in example_notebooks.
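
The two inputs above can be sanity-checked before training. The following is a minimal sketch and not part of the PALAVA API: the helper names validate_raw_counts and build_gene_set_mask, the toy gene names, and the binary gene-set-by-gene membership matrix are all assumptions made for illustration. See example_notebooks for how the package actually expects gene sets to be passed.

```python
import numpy as np


def validate_raw_counts(counts):
    """Return True if the matrix looks like raw counts (non-negative integers).

    Hypothetical helper: log-transformed or normalised data usually
    contains non-integer values and fails this check.
    """
    counts = np.asarray(counts)
    is_non_negative = bool((counts >= 0).all())
    is_integer_valued = bool(np.equal(np.mod(counts, 1), 0).all())
    return is_non_negative and is_integer_valued


def build_gene_set_mask(gene_names, gene_sets):
    """Build a binary (n_gene_sets x n_genes) membership matrix.

    Hypothetical helper: genes listed in a gene set but absent from
    the data are silently skipped.
    """
    gene_index = {gene: i for i, gene in enumerate(gene_names)}
    mask = np.zeros((len(gene_sets), len(gene_names)), dtype=np.int8)
    for row, members in enumerate(gene_sets.values()):
        for gene in members:
            col = gene_index.get(gene)
            if col is not None:
                mask[row, col] = 1
    return mask


# Toy data: 2 cells x 4 genes, and 2 gene sets (one contains an unmeasured gene).
gene_names = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
gene_sets = {"SET_1": ["GENE_A", "GENE_B"], "SET_2": ["GENE_C", "GENE_X"]}
counts = np.array([[0, 3, 1, 0], [2, 0, 0, 5]])

mask = build_gene_set_mask(gene_names, gene_sets)
print(validate_raw_counts(counts))
print(mask)
```

Running the check on log-transformed values, for example np.log1p(counts), returns False, which is a quick way to catch data that has already been transformed.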

Output generated by PALAVA

Factor importance scores: rank the factors from most to least important.
Factor activations: the representation of the data in terms of factors (factor-to-cell relationship, analogous to the PCA representation of the data).
Factor scores: the factor loadings of the data (factor-to-gene relationship, analogous to factor loadings). These can be used to refine the gene sets.
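
As an illustration of how these outputs might be used together, the sketch below ranks factors by importance and lists high-scoring genes that are not yet in the factor's gene set. It works on plain NumPy arrays with made-up values. The function names, the array shapes (factors by genes), and the use of absolute score as the measure of a gene's relevance are assumptions, not the package's own interface.

```python
import numpy as np


def rank_factors(importance, factor_names):
    """Return factor names ordered from most to least important."""
    order = np.argsort(importance)[::-1]
    return [factor_names[i] for i in order]


def top_genes_for_factor(factor_scores, gene_names, factor_idx, k=2):
    """Return the k genes with the largest absolute score for one factor."""
    magnitudes = np.abs(factor_scores[factor_idx])
    top = np.argsort(magnitudes)[::-1][:k]
    return [gene_names[i] for i in top]


def suggest_additions(top_genes, gene_set_members):
    """High-scoring genes not in the gene set: candidates for refinement."""
    return [gene for gene in top_genes if gene not in gene_set_members]


# Made-up outputs for 3 factors and 4 genes (illustrative values only).
factor_names = ["SET_1", "SET_2", "SET_3"]
gene_names = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
importance = np.array([0.1, 0.7, 0.2])
factor_scores = np.array([
    [0.30, 0.10, 0.00, 0.20],
    [0.05, -0.90, 0.40, 0.00],
    [0.00, 0.20, 0.10, 0.60],
])

ranking = rank_factors(importance, factor_names)
best = factor_names.index(ranking[0])
top_genes = top_genes_for_factor(factor_scores, gene_names, best, k=2)
candidates = suggest_additions(top_genes, {"GENE_B"})
print(ranking, top_genes, candidates)
```

The same comparison can be run in the other direction: gene set members with scores near zero are candidates for removal from the set.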

TODO

resolve warnings

Notes

If you are using a Mac, set accelerator='cpu'; the package is not compatible with MPS.
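
One way to apply this note automatically is to pick the accelerator from the operating system. The sketch below uses only the Python standard library. The helper name pick_accelerator is hypothetical, and both the "auto" fallback for non-Mac systems and the idea that the value is passed to the training call are assumptions based on how scvi-tools models are usually trained.

```python
import platform


def pick_accelerator(system=None):
    """Return 'cpu' on macOS (to avoid MPS), otherwise 'auto'.

    Hypothetical helper; `system` can be supplied explicitly for testing.
    """
    system = platform.system() if system is None else system
    return "cpu" if system == "Darwin" else "auto"


accelerator = pick_accelerator()
print(accelerator)
# Assumed usage: pass the value to training, e.g. accelerator=accelerator
```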
