Greetings! I am attempting to feed ms2rescore with outputs (.pin, .pepXML, .tsv) from MSFragger (run as part of the FragPipe workflow, but without any validation tools or downstream software). I set top N to 5 so that the rescoring tool has more candidates per spectrum.
My regex patterns for psm_id_pattern and spectrum_id_pattern parse successfully in both the GUI and the command line interface. MSFragger denotes the amino acid modification as M[147], which DeepLC does not recognize, but I was able to fix this by modifying the JSON file.
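For illustration, the kind of relabeling involved can be sketched as below. This is not the actual JSON edit and does not use any ms2rescore or DeepLC API; the `LABEL_MAP` table, the `relabel` helper, and the example peptide are all hypothetical, and the target names are only an assumption about what a downstream tool would accept.

```python
import re

# Hypothetical lookup: (residue, integer mass label) -> modification name.
# 147 is roughly Met (131.04) + oxidation (15.99); 160 is roughly Cys (103.01) + 57.02.
LABEL_MAP = {
    ("M", "147"): "Oxidation",
    ("C", "160"): "Carbamidomethyl",
}

# Matches a residue letter followed by an integer mass in brackets, e.g. "M[147]".
_MOD_PATTERN = re.compile(r"([A-Z])\[(\d+)\]")


def relabel(peptide: str) -> str:
    """Replace integer-mass labels with names; leave unknown labels untouched."""

    def _swap(match: re.Match) -> str:
        residue, mass = match.group(1), match.group(2)
        name = LABEL_MAP.get((residue, mass))
        if name is None:
            return match.group(0)  # unknown label: keep as-is so it can be spotted
        return f"{residue}[{name}]"

    return _MOD_PATTERN.sub(_swap, peptide)


print(relabel("PEPTM[147]IDEC[160]K"))
```

Unknown labels are deliberately left unchanged rather than dropped, so that any modification missing from the table remains visible in the output.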
However, I am encountering several problems with the GUI (Windows). First, the GUI cannot parse my .pepXML, so I can only use the .tsv for processing. Second, the GUI reports errors in several functions in downstream processing (possibly somewhere after DeepLC, because I did not see these errors until I had fixed the DeepLC recognition problem).
When I switch to the command line interface to run the rescoring pipeline manually, additional issues occur.
1. MS2rescore cannot recognize my .tsv file.
I believe that, at least for MSFragger outputs, the issue is specific to PSM files with multiple candidates per spectrum: a smaller PSM file with only one candidate per spectrum did not trigger this problem.
Specifically:
from psm_utils.io import read_file
input_psm = read_file("filename.tsv", filetype="fragpipe")
File ~/miniforge3/lib/python3.12/site-packages/psm_utils/io/__init__.py:210, in read_file(filename, filetype, *args, **kwargs)
206 raise PSMUtilsIOException(
207 f"Filetype '{filetype}' unknown or not supported for reading."
208 ) from e
209 reader = reader_cls(filename, *args, **kwargs)
--> 210 psm_list = reader.read_file()
211 return psm_list
File ~/miniforge3/lib/python3.12/site-packages/psm_utils/io/_base_classes.py:46, in ReaderBase.read_file(self)
44 def read_file(self) -> PSMList:
45 """Read full PSM file into a PSMList object."""
---> 46 return PSMList(psm_list=[psm for psm in self.__iter__()])
File ~/miniforge3/lib/python3.12/site-packages/psm_utils/io/fragpipe.py:71, in FragPipeReader.__iter__(self)
69 reader = csv.DictReader(msms_in, delimiter="\t")
70 for row in reader:
---> 71 yield self._get_peptide_spectrum_match(row)
...
96 rescoring_features=rescoring_features,
97 metadata={},
98 )
KeyError: 'Modified Peptide'
Alternatively:
input_psm = read_file("file_name.tsv", filetype="tsv")
PSMUtilsIOException: Could not parse PSM from three consecutive rows. Verify that the file is formatted correctly as a psm_utils TSV file or that the correct file type reader is used.
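A quick way to narrow down both errors is to inspect the table header and the number of candidates per spectrum before handing the file to a reader. The sketch below uses only the standard library; the sample table, the `Spectrum` column name, and the `inspect_psm_table` helper are assumptions for illustration, and only the `Modified Peptide` column name comes from the KeyError above.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt of a tab-separated PSM table (not real data).
SAMPLE_TSV = (
    "Spectrum\tPeptide\tCharge\n"
    "run1.00001.00001.2\tPEPTIDEK\t2\n"
    "run1.00001.00001.2\tPEPTLDEK\t2\n"
    "run1.00002.00002.3\tACDEFGHK\t3\n"
)


def inspect_psm_table(handle, required=("Modified Peptide",), spectrum_column="Spectrum"):
    """Return (missing required columns, number of rows per spectrum)."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [column for column in required if column not in header]
    counts = Counter()
    if spectrum_column in header:
        counts.update(row[spectrum_column] for row in reader)
    return missing, counts


missing, counts = inspect_psm_table(io.StringIO(SAMPLE_TSV))
print("Missing columns:", missing)
print("Max candidates per spectrum:", max(counts.values(), default=0))
```

For a real file, the same function can be called on `open("filename.tsv", newline="")`. A non-empty list of missing columns would point to a header mismatch rather than to the multiple-candidate rows themselves.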
2. When using the .pepXML as the PSM file input in the command line interface, I see log messages such as:
INFO ms2rescore.feature_generators.ms2pip // Running MS²PIP for PSMs from run (1/1) `None`...
...
...
...
INFO ms2rescore.feature_generators.deeplc // Running DeepLC for PSMs from run (1/1): `None`...
I am not sure whether this matters in downstream processing. After following the ms2rescore tutorial line by line, the PSMList returned by read_file() also has run = None in every entry. This may be due to the get_psm_dict() function used when initializing the PSMList object, which depends on the functions in the psm_utils package that parse the input file.
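As a possible workaround, a missing run name could be filled in from the input file name before rescoring. The sketch below is only an illustration: plain dictionaries stand in for PSM objects, `fill_missing_run` is a hypothetical helper, and the assumption that the run name equals the file stem may not hold for every setup.

```python
from pathlib import Path


def fill_missing_run(psms, psm_file):
    """Set 'run' to the file stem for every record where it is None."""
    # Strips only the final extension: "data/sample1.pepXML" -> "sample1".
    run_name = Path(psm_file).stem
    for psm in psms:
        if psm.get("run") is None:
            psm["run"] = run_name
    return psms


# Stand-in records; real PSM objects would carry many more fields.
example_psms = [
    {"spectrum_id": "1", "run": None},
    {"spectrum_id": "2", "run": "other_run"},
]
fill_missing_run(example_psms, "data/sample1.pepXML")
print([psm["run"] for psm in example_psms])
```

Records that already carry a run name are left untouched, so the helper is safe to apply to a mixed list.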
I have a separate issue in another ticket.
Update: when I try to use the GUI, it reports errors such as [access denied]. Thank you so much!