Search predicted peptides from proteomes
2.7 years ago
tomas4482 ▴ 400

I created a FASTA file containing about 1,000 predicted peptide sequences, which are my targets of interest. It is concatenated with the UniProt reference FASTA to create a target-decoy database. The search engine then searches the proteomics data against this database to look for hits.

I have some questions.

  1. I would like to apply a 5% FDR filter to peptide identifications. Someone suggested that I should apply it exclusively to my predicted peptides, but I found little literature describing how to perform this step. Some studies apply FDR filtering to the whole database (UniProt reference + predicted peptides). Which approach is preferred?

  2. If I want to apply FDR filtering only to my predicted sequences, how can I do this?

peptide proteome search • 703 views

If you want to apply FDR control to your predicted sequences only, remove the PSMs that map to reference sequences from your results, whether they are targets or decoys. That leaves only the target and decoy PSMs that map to the predicted sequences. Apply FDR control to those PSMs. This practice is generally called separated FDR.

However, be careful when applying FDR control to a small number of PSMs: there may be too few of them to represent the null distribution.
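The separated FDR procedure described above can be sketched roughly as follows. This is a minimal illustration, not the method of any particular search engine: the `PSM` class, its field names, and the `separated_fdr` function are hypothetical, and it assumes that a higher score means a better match, that each PSM has already been assigned to either a reference or a predicted sequence, and that FDR is estimated as decoys divided by targets.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    peptide: str
    score: float        # assumption: higher score = better match
    is_decoy: bool
    is_predicted: bool  # True if assigned to a predicted sequence, False if reference


def separated_fdr(psms, fdr_threshold=0.05):
    """Return target PSMs of the predicted class whose q-value passes the cutoff."""
    # Step 1: drop every reference-mapped PSM, whether target or decoy.
    subset = [p for p in psms if p.is_predicted]
    # Step 2: rank the remaining PSMs from best to worst score.
    subset.sort(key=lambda p: p.score, reverse=True)

    # Step 3: estimate FDR at each rank as (#decoys / #targets) seen so far.
    fdrs = []
    targets = decoys = 0
    for p in subset:
        if p.is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / targets if targets else 1.0)

    # Step 4: q-value = lowest FDR at this rank or at any worse rank.
    qvalues = fdrs[:]
    for i in range(len(qvalues) - 2, -1, -1):
        qvalues[i] = min(qvalues[i], qvalues[i + 1])

    return [p for p, q in zip(subset, qvalues)
            if not p.is_decoy and q <= fdr_threshold]


# Hypothetical PSMs for illustration only, not real search results.
example_psms = [
    PSM("REFPEP", 20.0, False, False),  # reference-mapped, removed in step 1
    PSM("A", 10.0, False, True),
    PSM("B", 9.0, False, True),
    PSM("C", 8.0, False, True),
    PSM("D", 7.0, False, True),
    PSM("X", 6.5, True, True),
    PSM("E", 6.0, False, True),
    PSM("Y", 5.0, True, True),
]
passed = separated_fdr(example_psms, fdr_threshold=0.05)
print([p.peptide for p in passed])
```

With so few PSMs, a single decoy moves the estimated FDR a lot, which is exactly the small-sample caveat mentioned above.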

In practice, there is often no significant difference between separated FDR and global FDR (using the whole database), so in many cases it is a matter of choice.

If you are a beginner in proteomics, I recommend plotting the score distributions for 1) the whole database and 2) the predicted and reference sequences separately. Then decide which approach shows the better null distribution (before FDR control, of course) and the better identification rate (after FDR control, of course).
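As a rough way to compare those score distributions without any plotting library, the sketch below bins target and decoy scores and prints a text histogram for each grouping. The function names and the `(score, is_decoy)` tuple layout are hypothetical, and the scores are made up for illustration.

```python
import math


def score_histogram(psms, bin_width=1.0):
    """Bin (score, is_decoy) pairs; returns {bin_index: [n_targets, n_decoys]}."""
    bins = {}
    for score, is_decoy in psms:
        index = math.floor(score / bin_width)
        counts = bins.setdefault(index, [0, 0])
        counts[1 if is_decoy else 0] += 1
    return dict(sorted(bins.items()))


def print_histogram(label, psms, bin_width=1.0):
    """Print one text row per bin: T marks a target PSM, D marks a decoy PSM."""
    print(label)
    for index, (n_targets, n_decoys) in score_histogram(psms, bin_width).items():
        lower_edge = index * bin_width
        print(f"  {lower_edge:6.1f}  {'T' * n_targets}{'D' * n_decoys}")


# Hypothetical (score, is_decoy) pairs, not real search results.
predicted = [(10.2, False), (10.7, False), (9.4, False),
             (3.1, True), (3.8, False), (2.5, True)]
reference = [(12.0, False), (11.3, False), (4.2, True), (3.3, True)]

print_histogram("predicted only", predicted)
print_histogram("reference only", reference)
print_histogram("whole database", predicted + reference)
```

If the decoy scores in the predicted-only group are too sparse to form a recognizable distribution, that is a hint that global FDR may be the safer choice for that dataset.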

P.S. Filtering out reference-mapped PSMs can be tricky when a peptide maps to more than one sequence, so here are some examples for your convenience:

PEPTIDE1 maps to both target-REF-protein1 and target-PREDICTED-protein1 => PEPTIDE1 is assigned to target-REF-protein1. => This is because a reference protein is prioritized over a predicted protein.

PEPTIDE2 maps to both decoy-REF-protein1 and target-PREDICTED-protein1 => PEPTIDE2 is assigned to target-PREDICTED-protein1. => This is because a target sequence is prioritized over a decoy sequence.

PEPTIDE3 maps to both decoy-REF-protein1 and decoy-PREDICTED-protein1 => PEPTIDE3 is assigned to both of them. => This is because there is no priority between decoys.
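The three examples above could be encoded along these lines. This is only a sketch: the `assign_peptide` function and the `(is_decoy, source, protein)` tuple layout are hypothetical, and the ordering used here (target reference first, then target predicted, then all decoys) is my generalization of the three cases, not a rule taken from a specific tool.

```python
def assign_peptide(matches):
    """Pick the protein(s) a peptide is assigned to.

    matches: list of (is_decoy, source, protein) tuples,
             where source is "REF" or "PREDICTED".
    """
    # Targets beat decoys, and among targets the reference beats predicted.
    target_ref = [m for m in matches if not m[0] and m[1] == "REF"]
    if target_ref:
        return target_ref
    target_pred = [m for m in matches if not m[0] and m[1] == "PREDICTED"]
    if target_pred:
        return target_pred
    # Only decoys remain: no priority between them, so keep all.
    return list(matches)


peptide1 = [(False, "REF", "protein1"), (False, "PREDICTED", "protein1")]
peptide2 = [(True, "REF", "protein1"), (False, "PREDICTED", "protein1")]
peptide3 = [(True, "REF", "protein1"), (True, "PREDICTED", "protein1")]

print(assign_peptide(peptide1))  # target reference only
print(assign_peptide(peptide2))  # target predicted only
print(assign_peptide(peptide3))  # both decoys
```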

Good luck to you!

