Commit 3fa1641
Updates to the beta-release-cleanup branch
One of the largest updates is moving several components under a new package called external_tools; the documentation has been updated to reflect this change. The gnn_embedding code now works as intended, and all tests have been updated to match. We will no longer support an install script: users will install PyTorch and R on their own, and the documentation points to where to get them. After this commit I will remove all sensitive data from previous commits; any .csv or .RData file will be removed from the repo, and I am also adding contingencies to prevent future uploads of sensitive files.

Other updates include:
- updated the .gitignore file
- updated the .pre-commit-config.yml file
- updated the ArunTest.py file (almost working)
- updated the README.md file
- updated the bioneuralnet/__init__.py file
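Since the refactor moves components such as SmCCNet from bioneuralnet.graph_generation to bioneuralnet.external_tools, downstream scripts need their imports updated. A minimal sketch of probing for whichever location resolves in the current environment; the helper `first_importable` is our own illustration, not part of the package:

```python
import importlib.util

def first_importable(candidates):
    """Return the first module path that resolves in this environment, else None."""
    for mod in candidates:
        try:
            if importlib.util.find_spec(mod) is not None:
                return mod
        except ModuleNotFoundError:
            # Parent package missing entirely; try the next candidate.
            continue
    return None

# With BioNeuralNet installed, this would prefer the new package location:
print(first_importable(["bioneuralnet.external_tools", "bioneuralnet.graph_generation"]))
```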
1 parent c4e9c40 commit 3fa1641

111 files changed

Lines changed: 2944 additions & 19552 deletions


.github/workflows/docs.yml

Lines changed: 9 additions & 7 deletions
@@ -2,9 +2,9 @@ name: Build and Deploy Documentation
 
 on:
   push:
-    branches: [ main ]
+    branches: [main]
   pull_request:
-    branches: [ main ]
+    branches: [main]
 
 jobs:
   build-deploy-docs:
@@ -19,22 +19,24 @@ jobs:
       - name: Set up Python 3.10
         uses: actions/setup-python@v4
         with:
-          python-version: '3.10'
+          python-version: "3.10"
           check-latest: true
 
-      - name: Install Dependencies and Set Up Environment
+      - name: Install Dependencies
         run: |
           python -m pip install --upgrade pip
-          python fast-install.py --cuda --cuda-version 12.1 --dev
+          pip install -r requirements.txt
+          pip install -r requirements-dev.txt
+          pip install torch
+          pip install torch_geometric
         shell: bash
 
       - name: Build Documentation
         run: |
-          source ./bioneuralnet-env/bin/activate
           mkdir -p docs/build/html
           sphinx-build -b html docs/source/ docs/build/html/
         shell: bash
-
+
       - name: Deploy to GitHub Pages
         uses: peaceiris/actions-gh-pages@v3
         with:

.github/workflows/python-app.yml

Lines changed: 46 additions & 63 deletions
@@ -1,84 +1,67 @@
-name: BioNeuralNet
+name: BioNeuralNet CI
 
 on:
   push:
-    branches: [ main ]
+    branches: [main]
   pull_request:
-    branches: [ main ]
+    branches: [main]
 
 jobs:
   build:
     strategy:
       matrix:
         os: [ubuntu-latest, macos-latest, windows-latest]
-        python-version: ['3.10', '3.11']
+        python-version: ["3.10", "3.11"]
 
     runs-on: ${{ matrix.os }}
 
     steps:
-    - name: Checkout repository
-      uses: actions/checkout@v3
+      - name: Checkout repository
+        uses: actions/checkout@v3
 
-    - name: Set up Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v4
-      with:
-        python-version: ${{ matrix.python-version }}
-        check-latest: true
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+          check-latest: true
 
-    - name: Cache pip dependencies
-      uses: actions/cache@v3
-      with:
-        path: |
-          ~/.cache/pip
-        key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt', '**/scripts/requirements-dev.txt', 'fast-install.py') }}
-        restore-keys: |
-          ${{ runner.os }}-pip-
+      - name: Cache pip dependencies
+        uses: actions/cache@v3
+        with:
+          path: ~/.cache/pip
+          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt', '**/requirements-dev.txt') }}
+          restore-keys: ${{ runner.os }}-pip-
 
-    - name: Install dependencies using fast-install.py (Unix)
-      if: matrix.os != 'windows-latest'
-      run: |
-        chmod +x fast-install.py
-        ./fast-install.py --cuda --cuda-version 12.1 --dev
-      shell: bash
+      - name: Install Python dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+          pip install -r requirements-dev.txt
+          pip install torch
+          pip install torch_geometric
+        shell: bash
 
-    - name: Install dependencies using fast-install.py (Windows)
-      if: matrix.os == 'windows-latest'
-      run: |
-        python fast-install.py --cuda --cuda-version 12.1 --dev
-      shell: powershell
+      - name: Install R
+        uses: r-lib/actions/setup-r@v2
+        with:
+          r-version: "latest"
 
-    - name: Verify installed Python packages (Unix)
-      if: matrix.os != 'windows-latest'
-      run: |
-        source ./bioneuralnet-env/bin/activate
-        pip list
-      shell: bash
+      - name: Install R packages
+        run: |
+          Rscript -e "if (!requireNamespace('BiocManager', quietly = TRUE)) install.packages('BiocManager')"
+          Rscript -e "BiocManager::install(update = TRUE, ask = FALSE)"
+          Rscript -e "install.packages(c('SmCCNet', 'jsonlite', 'dplyr'))"
+          Rscript -e "BiocManager::install(c('WGCNA', 'impute', 'GO.db', 'dynamicTreeCut', 'fastcluster'))"
+        shell: bash
 
-    - name: Verify installed Python packages (Windows)
-      if: matrix.os == 'windows-latest'
-      run: |
-        .\bioneuralnet-env\Scripts\Activate.ps1
-        pip list
-      shell: powershell
+      - name: Run tests with pytest
+        run: |
+          pytest --cov=bioneuralnet --cov-report=xml tests/
 
-    - name: Run tests with pytest (Unix)
-      if: matrix.os != 'windows-latest'
-      run: |
-        source ./bioneuralnet-env/bin/activate
-        pytest --cov=bioneuralnet --cov-report=xml tests/
-      shell: bash
-
-    - name: Run tests with pytest (Windows)
-      if: matrix.os == 'windows-latest'
-      run: |
-        .\bioneuralnet-env\Scripts\Activate.ps1
-        pytest --cov=bioneuralnet --cov-report=xml tests/
-      shell: powershell
-
-    - name: Upload coverage to Codecov
-      uses: codecov/codecov-action@v3
-      with:
-        token: ${{ secrets.CODECOV_TOKEN }}
-        files: ./coverage.xml
-        flags: unittests
-        name: codecov-umbrella
+      - name: Upload coverage to Codecov
+        uses: codecov/codecov-action@v3
+        with:
+          token: ${{ secrets.CODECOV_TOKEN }}
+          files: ./coverage.xml
+          flags: unittests
+          name: codecov-umbrella
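The CI now installs dependencies with plain pip in the default environment instead of running fast-install.py inside a bioneuralnet-env virtualenv, which also removes the removed "Verify installed Python packages" steps. A stdlib sketch of the same sanity check those steps performed; the function name `missing_packages` is our own, not part of the workflow:

```python
import importlib.util

def missing_packages(required):
    """Return the import names from `required` that cannot be resolved here."""
    return [pkg for pkg in required if importlib.util.find_spec(pkg) is None]

# In CI this might be called with ["torch", "torch_geometric"] after pip install.
print(missing_packages(["json", "not_a_real_package_xyz"]))  # -> ['not_a_real_package_xyz']
```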

.github/workflows/readthedocs.yml

Lines changed: 0 additions & 18 deletions
This file was deleted.

.gitignore

Lines changed: 9 additions & 0 deletions
@@ -11,6 +11,15 @@ venv.bak/
 .pytest_cache
 release.md
 bioneuralnet.egg-info
+/dist/
+/build/
+/docker_files/
+
+
+# Block sensitive file types globally
+*.csv
+*.RData
+
 
 # Sphinx documentation build
 docs/build/

.pre-commit-config.yml

Lines changed: 4 additions & 0 deletions
@@ -6,6 +6,10 @@ repos:
       - id: end-of-file-fixer
       - id: check-yaml
       - id: check-added-large-files
+      - id: forbidden-files
+        args:
+          - '*.csv'
+          - '*.RData'
 
   - repo: https://github.com/psf/black
     rev: 23.3.0
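The new forbidden-files hook is the contingency against committing sensitive *.csv or *.RData files. The check it performs can be sketched in a few lines of stdlib Python; this is a hypothetical stand-in for the hook, not its actual implementation:

```python
import fnmatch

# Glob patterns blocked from commits, mirroring the pre-commit config above.
BLOCKED = ["*.csv", "*.RData"]

def forbidden(filenames, patterns=BLOCKED):
    """Return the subset of filenames that match a blocked pattern."""
    return [f for f in filenames
            if any(fnmatch.fnmatch(f, p) for p in patterns)]

staged = ["README.md", "data/COPDGeneCounts.csv", "model.RData"]
print(forbidden(staged))  # -> ['data/COPDGeneCounts.csv', 'model.RData']
```

A real hook would exit nonzero when this list is non-empty, causing the commit to be rejected.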

ArunTest.py

Lines changed: 57 additions & 45 deletions
@@ -12,13 +12,13 @@
 """
 
 import pandas as pd
-from bioneuralnet.graph_generation import SmCCNet
+from bioneuralnet.external_tools import SmCCNet
 from bioneuralnet.downstream_task import DPMON
 
-def run_smccnet_dpmon_workflow(omics_genes: pd.DataFrame,
-
-                               phenotype: pd.Series,
-                               clinical_data: pd.DataFrame) -> pd.DataFrame:
+
+def run_smccnet_dpmon_workflow(
+    omics_genes: pd.DataFrame, phenotype: pd.Series, clinical_data: pd.DataFrame
+) -> pd.DataFrame:
     """
     Executes the hybrid workflow combining SmCCNet for network generation and DPMON for disease prediction.
 
@@ -39,10 +39,10 @@ def run_smccnet_dpmon_workflow(omics_genes: pd.DataFrame,
     try:
         smccnet_instance = SmCCNet(
             phenotype_df=phenotype,
-            omics_dfs=[gene_names],
-            data_types=['genes'],
+            omics_dfs=[omics_genes],
+            data_types=["genes"],
             kfold=5,
-            summarization='PCA',
+            summarization="PCA",
             seed=732,
         )
         adjacency_matrix = smccnet_instance.run()
@@ -53,63 +53,75 @@
             omics_list=[omics_genes],
             phenotype_data=phenotype,
             features_data=clinical_data,
-            model='GCN',
-            tune=False,
-            gpu=False
+            model="GCN",
+            tune=False,
+            gpu=False,
         )
 
         predictions_df = dpmon_instance.run()
         if not predictions_df.empty:
             print("DPMON workflow completed successfully. Predictions generated.")
         else:
-            print("DPMON hyperparameter tuning completed. No predictions were generated.")
+            print(
+                "DPMON hyperparameter tuning completed. No predictions were generated."
+            )
 
         return predictions_df
 
     except Exception as e:
         print(f"An error occurred during the SmCCNet + DPMON workflow: {e}")
         raise e
 
+
 if __name__ == "__main__":
     try:
         print("Starting SmCCNet + DPMON Hybrid Workflow...")
 
-        # omics_proteins = pd.DataFrame({
-        #     'protein_feature1': [0.1, 0.2],
-        #     'protein_feature2': [0.3, 0.4]
-        # }, index=['Sample1', 'Sample2'])
-
-        # omics_metabolites = pd.DataFrame({
-        #     'metabolite_feature1': [0.5, 0.6],
-        #     'metabolite_feature2': [0.7, 0.8]
-        # }, index=['Sample1', 'Sample2'])
-
-        # phenotype_data = pd.Series([1, 0], index=['Sample1', 'Sample2'])
-
-        # clinical_data = pd.DataFrame({
-        #     'clinical_feature1': [5, 3],
-        #     'clinical_feature2': [7, 2]
-        # }, index=['Sample1', 'Sample2'])
-
-        ## COPDGeneCounts.csv: 1st column is geneID_split containing the gene names
-        omics_genes = pd.read_csv("/Users/sarkara/Desktop/GitHub/BioNeuralNet/COPDGeneCounts.csv")
-
-        #gene_names = omics_genes["geneID_split"]
-        values = omics_genes[omics_genes.columns[0:337]]
-        gene_names = values.drop(["geneID"], axis=1)
-        gene_names_transposed = gene_names.reset_index().transpose()
-        print(gene_names_transposed)
-
-        md = pd.read_csv("/Users/sarkara/Desktop/GitHub/BioNeuralNet/COPDGeneMetadata.csv")
+        omics_genes = pd.read_csv("example_data/COPDGeneCounts.csv")
+        omics_genes = omics_genes.drop(
+            [
+                "geneID",
+                "end",
+                "strand",
+                "gene_id",
+                "gene_name",
+                "gene_type",
+                "chr",
+                "start",
+            ],
+            axis=1,
+        )
+        omics_genes_t = omics_genes.T
+        omics_genes_t = omics_genes_t.reset_index().rename(columns={"index": "sid"})
+
+        new_header = omics_genes_t.iloc[0].copy()
+        new_header.iloc[0] = "sid"
+        omics_genes_t = omics_genes_t[1:]
+        omics_genes_t.columns = new_header
+        omics_genes_t = omics_genes_t.reset_index(drop=True)
+
+        # gene_names = omics_genes["geneID_split"]
+        md = pd.read_csv("example_data/COPDGeneMetadata.csv")
         phenotype = md[["sid", "finalgold_visit"]].reset_index()
-        clinical_data = md[["sid", "age_visit", "gender", "smoking_status"]].reset_index()
-        print(clinical_data)
-        print(phenotype)
-        gene_names = gene_names.reset_index()
-        predictions = run_smccnet_dpmon_workflow(gene_names, phenotype, clinical_data)
+        clinical_data = md[
+            ["sid", "age_visit", "gender", "smoking_status"]
+        ].reset_index(drop=True)
+
+        phenotype_subset = phenotype[["sid", "finalgold_visit"]]
+        phenotype_subset["finalgold_visit"] = pd.to_numeric(
+            phenotype_subset["finalgold_visit"], errors="coerce"
+        )
+
+        print(f"Gene names:\n {omics_genes_t}")
+        print(f"Phenotype subset: \n{phenotype_subset}")
+        print(f"Clinical data: \n{clinical_data}")
+
+        predictions = run_smccnet_dpmon_workflow(
+            omics_genes_t, phenotype_subset, clinical_data
+        )
 
         print("DPMON Predictions:")
-        #print(predictions)
+        print(predictions)
 
         print("Hybrid Workflow completed successfully.\n")
     except Exception as e:
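The reworked ArunTest.py reshapes the counts matrix from genes-as-rows to samples-as-rows, then promotes the gene-ID row to the column header. A self-contained miniature of that transpose step, using a tiny made-up counts table (the column names here are illustrative, not the real COPDGene schema):

```python
import pandas as pd

# Miniature counts file: genes as rows, samples as columns,
# plus an annotation column that must be dropped before modelling.
counts = pd.DataFrame({
    "geneID": ["g1", "g2"],
    "chr": ["1", "2"],
    "S001": [10, 20],
    "S002": [30, 40],
})

# Drop annotation columns, keeping the gene identifier for the header.
expr = counts.drop(columns=["chr"])

# Transpose so samples become rows, then promote the gene-ID row to the header.
expr_t = expr.T.reset_index().rename(columns={"index": "sid"})
header = expr_t.iloc[0].copy()
header.iloc[0] = "sid"        # first cell holds "geneID"; relabel it "sid"
expr_t = expr_t[1:]           # drop the promoted row
expr_t.columns = header
expr_t = expr_t.reset_index(drop=True)

print(expr_t)
```

The result has one row per sample (`sid` = S001, S002) and one column per gene, which is the shape SmCCNet and DPMON expect for `omics_genes`.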
