55 changes: 55 additions & 0 deletions scripts/README.md
@@ -49,3 +49,58 @@ In this command:
- `tmp-fg` is the filegroup slug to which files should be copied.
- `12345` is the ID of the request for which the files belong.
- `runId=ABCD_123 runMode="XSeq"` are optional key-value pairs that can be added as metadata when copying files.

## parse_cohort_files.py
This script is typically used before submitting Tempo Jobs for alignment.
It provides three commands:
- `parse`: Creates the input file for running Tempo jobs by comparing the cohort directory with the bams directory and parsing the differences.
- `check`: Creates a script that checks whether the files exist.
- `remove`: Creates a script that deletes the files.

Usage:

Prerequisites:

Switch to tempobot for access to the cohort directory:
`source /usersoftware/core006/dodzdo.sh`

Activate the conda environment:
`conda activate py37`

Set the following environment variables:
- BEAGLE_USER
- BEAGLE_PW
- BEAGLE_ENDPOINT
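
Before running the script, it can help to confirm that all three variables are set. The following is a minimal sketch, not part of the repository: the helper name `missing_beagle_vars` and the example values are hypothetical, and only the three variable names come from the list above.

```python
import os

# Variable names taken from the prerequisites list above.
REQUIRED_VARS = ("BEAGLE_USER", "BEAGLE_PW", "BEAGLE_ENDPOINT")


def missing_beagle_vars(environ=None):
    """Return the required variable names that are unset or empty.

    `environ` defaults to the real process environment; passing a dict
    makes the check easy to try out without touching the shell.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping (placeholder values, not real credentials).
example_env = {"BEAGLE_USER": "someone", "BEAGLE_ENDPOINT": "placeholder-endpoint"}
print(missing_beagle_vars(example_env))  # prints ['BEAGLE_PW']
```

Calling `missing_beagle_vars()` with no argument checks the current shell environment instead of the example mapping.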

Here is an example of the parse_cohort_files.py *parse* command:
```

python3 parse_cohort_files.py parse <input_files> <directory_path> <parse_output> <diff_output> [<file_group_id>]
- <input_files> can be a single file, multiple files, or a wildcard (e.g., /path/to/files/*.txt)
- <directory_path> is the path containing existing directories to compare

python3 parse_cohort_files.py parse /data1/core006/ccs_pipelines/tempo/wes_repo/Results/v2.1.x/cohort_level/*.txt /data1/core006/ccs_pipelines/tempo/wes_repo/Results/v2.1.x/bams/ parse_output.txt diff_output.txt
```
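
Because the shell expands the wildcard before the script starts, the script receives a variable number of input files followed by three fixed positional arguments. The sketch below illustrates that split using the same slicing that appears in the script's `__main__` block in this change. The function name `split_parse_args` and the sample file names are hypothetical.

```python
def split_parse_args(argv):
    """Split an argv list for the parse command into named parts.

    Everything between the command name and the last three arguments is
    treated as input files; the last three are the directory path, the
    parse output, and the diff output, in that order.
    """
    if len(argv) < 6 or argv[1] != "parse":
        raise ValueError(
            "expected: parse <input_files>... <directory_path> "
            "<parse_output> <diff_output>"
        )
    return {
        "input_files": argv[2:-3],
        "directory_path": argv[-3],
        "parse_output": argv[-2],
        "diff_output": argv[-1],
    }


# What the script would see after the shell expands a wildcard to two files.
argv = [
    "parse_cohort_files.py", "parse",
    "a.cohort.txt", "b.cohort.txt",
    "bams/", "parse_output.txt", "diff_output.txt",
]
args = split_parse_args(argv)
print(args["input_files"])     # prints ['a.cohort.txt', 'b.cohort.txt']
print(args["directory_path"])  # prints bams/
```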
Here is an example of the parse_cohort_files.py *check* command:
```
python3 parse_cohort_files.py check <input> <output>

python3 parse_cohort_files.py check CCS_F00000.cohort.txt CCS_F00000.cohort.check.sh
```
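
To give a rough idea of what a generated check script could contain, here is an illustrative sketch. It is not the script's actual implementation: the function `build_check_script`, the `MISSING:` message, and the assumption that the file paths are already available as a plain list are all hypothetical, since the cohort file format is not described here.

```python
import shlex


def build_check_script(paths):
    """Build the text of a bash script that reports paths that do not exist.

    Each path is shell-quoted so that spaces or special characters do not
    break the generated script.
    """
    lines = ["#!/bin/bash"]
    for path in paths:
        quoted = shlex.quote(path)
        # `[ -e ... ]` succeeds when the path exists; otherwise echo a notice.
        lines.append(f"[ -e {quoted} ] || echo MISSING: {quoted}")
    return "\n".join(lines) + "\n"


# Hypothetical paths, used only to show the shape of the output.
print(build_check_script(["/data/a.bam", "my file.bam"]))
```

Writing the returned text to a file and running it with `bash` would print one `MISSING:` line per absent path.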
Here is an example of the parse_cohort_files.py *remove* command:
```
python3 parse_cohort_files.py remove <input> <output>

python3 parse_cohort_files.py remove CCS_F00000.cohort.txt CCS_F00000.cohort.remove_file.sh
```
## submit_tempo_jobs.py

The submit_tempo_jobs.py script submits Tempo jobs to voyager using the `<diff_output>` file created by the *parse* command.

Here is a usage example for submit_tempo_jobs.py:

```
python submit_tempo_jobs.py diff_output.txt CCS_F00000

python submit_tempo_jobs.py <diff_output> <cohort_id> [<job_group_id>]
```
20 changes: 13 additions & 7 deletions scripts/parse_cohort_files.py
@@ -136,38 +136,44 @@ def parse_cohort_file(input_files, directory_path, output_file, diff_output_file
python3 parse_cohort_files.py parse <input_files> <directory_path> <parse_output> <diff_output> [<file_group_id>]
- <input_files> can be a single file, multiple files, or a wildcard (e.g., /path/to/files/*.txt)
- <directory_path> is the path containing existing directories to compare
python3 parse_cohort_files.py remove <input> [<output>]
python3 parse_cohort_files.py check <input> [<output>]
python3 parse_cohort_files.py list_dir <directory> [<output>]
python3 parse_cohort_files.py compare <file1.txt> <file2.txt> <report_file>
python3 parse_cohort_files.py remove <input> <output>
python3 parse_cohort_files.py check <input> <output>

"""

if __name__ == "__main__":
if len(sys.argv) < 6:
if len(sys.argv) < 2:
print(HELP)
exit(1)
command = sys.argv[1]
if command == "parse":
if len(sys.argv) < 6:
print(HELP)
exit(1)
input_files = sys.argv[2:-3]
directory_path = sys.argv[-3]
output_file = sys.argv[-2]
diff_output_file = sys.argv[-1]
parse_cohort_file(input_files, directory_path, output_file, diff_output_file)
elif command == "remove":
if len(sys.argv) != 4:
print(HELP)
exit(1)
input_file = sys.argv[2]
if len(sys.argv) > 2:
output_file = sys.argv[3]
create_remove_script(input_file, output_file)
else:
create_remove_script(input_file)
elif command == "check":
if len(sys.argv) != 4:
print(HELP)
exit(1)
input_file = sys.argv[2]
if len(sys.argv) > 2:
output_file = sys.argv[3]
create_check_script(input_file, output_file)
else:
create_check_script(input_file)
else:
print(HELP)

print(HELP)