Search predicted peptides from proteomes
2.8 years ago
tomas4482 ▴ 420

I created a FASTA file containing about 1,000 predicted peptide sequences, which are my targets of interest. I concatenated it with the UniProt reference FASTA to create a target-decoy database. The search engine then searches the proteome data against this database and reports any hits.

I have some questions.

  1. I would like to apply a 5% FDR filter for peptide identification. Someone suggested that I should apply it exclusively to my predicted peptides, but I have found little literature that describes how to perform this step. Some studies apply FDR filtering to the whole database (UniProt reference + predicted peptides). Which approach is preferred?

  2. If I want to apply FDR filtering only to my predicted sequences, how can I do this?

peptide proteome search • 729 views

If you want to apply FDR control to your predicted sequences only, filter out the reference-mapped PSMs from your results (both target and decoy). You are then left with only the target and decoy PSMs that map to the predicted sequences. Apply FDR control to those PSMs. This practice is generally called separated FDR.
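To make the steps concrete, here is a minimal sketch of separated FDR in Python. It assumes a higher score is better and estimates FDR as decoys/targets at each score cutoff, which is one common target-decoy estimate and not the only one. The `PSM` record, its field names, and `separated_fdr` are hypothetical names for illustration, not the output format of any particular search engine.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    peptide: str
    score: float     # assumption: higher score means a better match
    is_decoy: bool   # True if the PSM maps to a decoy sequence
    is_ref: bool     # True if the PSM maps to a reference (UniProt) sequence


def separated_fdr(psms, threshold=0.05):
    """Return target PSMs on predicted sequences that pass the FDR threshold."""
    # Step 1: drop every reference-mapped PSM, target or decoy.
    predicted = [p for p in psms if not p.is_ref]

    # Step 2: rank the remaining PSMs from best to worst score.
    predicted.sort(key=lambda p: p.score, reverse=True)

    # Step 3: running FDR estimate (decoys / targets) at each rank.
    targets = decoys = 0
    fdrs = []
    for p in predicted:
        if p.is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # Step 4: q-value = lowest FDR at this rank or any worse rank.
    qvals = fdrs[:]
    for i in range(len(qvals) - 2, -1, -1):
        qvals[i] = min(qvals[i], qvals[i + 1])

    # Step 5: keep target PSMs whose q-value is within the threshold.
    return [p for p, q in zip(predicted, qvals)
            if q <= threshold and not p.is_decoy]
```

Calling `separated_fdr(psms, threshold=0.05)` on your parsed search results would give the predicted-sequence PSMs accepted at 5% FDR. Running the same function without step 1 gives the global FDR result for comparison.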

However, be careful if you have only a small number of PSMs to apply FDR control to, because there may be too few PSMs to represent the null distribution.

In practice, there is often no significant difference between separated FDR and global FDR (using the whole database), so in many cases it is a matter of choice.

If you are a beginner in proteomics, I recommend plotting the score distributions for 1) the whole database and 2) the predicted and reference sequences separately. Then decide which one shows the better null distribution (before FDR control, of course) and the better identification rate (after FDR control, of course).
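As a rough way to look at those distributions without any plotting library, the sketch below bins scores and renders a text histogram. The function names and the bin width are my own choices for illustration; you would feed it the decoy scores (the null distribution) from the whole database and from the predicted-only subset, then compare the shapes.

```python
import math
from collections import Counter


def score_histogram(scores, bin_width=1.0):
    """Count scores per bin; each key is the lower edge of its bin."""
    counts = Counter(math.floor(s / bin_width) * bin_width for s in scores)
    return dict(sorted(counts.items()))


def render(hist, label):
    """Render a histogram as text, one row of '#' marks per bin."""
    lines = [label]
    for edge, n in hist.items():
        lines.append(f"{edge:8.1f} | {'#' * n}")
    return "\n".join(lines)
```

For example, `print(render(score_histogram(decoy_scores), "decoys, predicted only"))`, where `decoy_scores` is a list you build from your own results. If the predicted-only decoy histogram is sparse or ragged compared with the global one, that is a sign the subset may be too small for a stable FDR estimate.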


P.S. Filtering out reference-mapped PSMs needs some care when a peptide maps to more than one sequence, so here are some examples for your convenience:

PEPTIDE1 maps to both target-REF-protein1 and target-PREDICTED-protein1 => PEPTIDE1 is assigned to target-REF-protein1 => This is because a reference protein takes priority over a predicted protein.

PEPTIDE2 maps to both decoy-REF-protein1 and target-PREDICTED-protein1 => PEPTIDE2 is assigned to target-PREDICTED-protein1 => This is because a target sequence takes priority over a decoy sequence.

PEPTIDE3 maps to both decoy-REF-protein1 and decoy-PREDICTED-protein1 => PEPTIDE3 is assigned to both of them => This is because there is no priority between decoys.
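The three examples above can be sketched as a small priority rule. This is my reading of them: target reference first, then target predicted, then any decoy, with ties kept together. The case of a target reference protein versus a predicted decoy is not listed above, so treating it as "target wins" is an assumption. The `Mapping` type and `assign` function are hypothetical names for illustration.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    protein: str
    is_decoy: bool
    is_ref: bool   # True for a reference (UniProt) entry


def rank(m):
    """Lower rank means higher priority."""
    if m.is_decoy:
        return 2               # all decoys share the lowest priority
    return 0 if m.is_ref else 1  # target reference beats target predicted


def assign(mappings):
    """Return the protein(s) a peptide is assigned to under the priority rule."""
    best = min(rank(m) for m in mappings)
    return [m.protein for m in mappings if rank(m) == best]
```

Once each peptide has its assigned protein(s), the PSMs assigned to reference proteins are the ones to drop before separated FDR control.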

Good luck to you!

