Description
Hello,
After reading the guidelines for interpreting results (written for pLink2, but most of it should still apply: https://github.com/pFindStudio/pLink2/wiki/CSV-result), I'm having a hard time recreating the _filtered.csv results from the large file containing all PSMs.
My goal is to have a unified report of all the crosslinks, looplinks, monolinks and linear peptides above the FDR threshold (which the _filtered files provide), but I need to include in this report all the target-decoy AND decoy-decoy that are also above this threshold (which the _filtered files exclude).
I'm trying to get this data from the large file, since it should contain everything. I'm keeping all Peptide_Type values (0, 1, 2 and 3) and all Target_Decoy values (0, 1 and 2), and my initial thought was to filter by Q-value, keeping everything less than or equal to 0.05 (for a 5% FDR, for example). However, the dataset I end up with doesn't match what I have in the _filtered files. It is expected to be larger because decoys are now included, but when I search for specific crosslinked peptides, they have vastly different scores in the large file than in the _filtered file. Additionally, a lot of entries present in the _filtered file are not found in the large one, which doesn't make sense to me.
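A minimal sketch of the filter described above, in Python using only the standard library. The column name `Q-value` is the one mentioned above; the helper names, the file names in the usage note, and the `Title` key column are placeholders I'm assuming for illustration and may not match the actual pLink output.

```python
import csv


def load_rows(path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def filter_psms(rows, q_max=0.05, q_col="Q-value"):
    """Keep rows whose Q-value is <= q_max.

    No filtering is applied on Peptide_Type or Target_Decoy, so all
    peptide types and all target/decoy combinations are retained.
    Rows with a missing or non-numeric Q-value are skipped.
    """
    kept = []
    for row in rows:
        try:
            q_value = float(row[q_col])
        except (KeyError, TypeError, ValueError):
            continue
        if q_value <= q_max:
            kept.append(row)
    return kept


def missing_keys(filtered_rows, all_rows, key_cols):
    """Return the keys present in filtered_rows but absent from all_rows.

    key_cols is an assumption: pick whichever columns identify a PSM
    in both files.
    """
    seen = {tuple(row.get(col) for col in key_cols) for row in all_rows}
    missing = []
    for row in filtered_rows:
        key = tuple(row.get(col) for col in key_cols)
        if key not in seen:
            missing.append(key)
    return missing
```

With hypothetical file names, this would be used as `kept = filter_psms(load_rows("all_psms.csv"))` and `missing_keys(load_rows("report_filtered.csv"), load_rows("all_psms.csv"), ["Title"])`; the second call lists the _filtered entries that I cannot find in the large file.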
Is there a better way to introduce the decoy hits back into the _filtered results, or are there extra steps I'm missing when filtering the large file that you could help me with?
Thank you!