Question: How to interpret Log.final.out in STAR
14 months ago by
XBria50
germany
XBria50 wrote:

Hi,

Can anyone please tell me which percentages in the STAR Log.final.out file show the precision and recall of the reads? Is it the number of annotated splice junctions divided by the total number of splice junctions?

And what about the reads mapped correctly and mapped incorrectly?

Thanks

rna-seq • 708 views
modified 14 months ago by h.mon25k • written 14 months ago by XBria50
14 months ago by
h.mon25k
Brazil
h.mon25k wrote:

There is no such information in Log.final.out. To calculate precision and recall, STAR would have to know the true location of each read, map the reads, and check whether it mapped each one correctly. Only with this information could it calculate precision and recall.

STAR (or any other mapper) doesn't know anything about the true position of the reads. STAR is trying to estimate the true mapping location of each read, but it doesn't know whether it got it right.
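
To make this concrete, here is a minimal Python sketch of the calculation that needs a truth set. It assumes you already have the true position of every read (for example from simulated reads) and the position the aligner reported. The function name, the two dictionaries and the `tolerance` parameter are made up for illustration and are not part of STAR. It uses one common convention: precision is correct reads over mapped reads, recall is correct reads over all reads.

```python
def precision_recall(truth, mapped, tolerance=0):
    """Compute (precision, recall) of read placement against a truth set.

    truth:  read_id -> (chrom, pos) for every read in the test set.
    mapped: read_id -> (chrom, pos) for reads the aligner placed;
            unmapped reads are simply absent from this dict.
    tolerance: allowed distance in bases between true and reported position.
    All of these names are hypothetical; nothing here comes from STAR.
    """
    correct = 0
    for read_id, (chrom, pos) in mapped.items():
        true_location = truth.get(read_id)
        if true_location is None:
            continue  # read not in the truth set: counted as not correct
        true_chrom, true_pos = true_location
        if chrom == true_chrom and abs(pos - true_pos) <= tolerance:
            correct += 1

    precision = correct / len(mapped) if mapped else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Toy example: 4 reads in the truth set, 3 mapped, 2 of them correctly.
truth = {"r1": ("chr1", 100), "r2": ("chr1", 500),
         "r3": ("chr2", 40), "r4": ("chr2", 900)}
mapped = {"r1": ("chr1", 100), "r2": ("chr1", 500), "r3": ("chr3", 40)}
precision, recall = precision_recall(truth, mapped)
```

Without the `truth` dictionary there is nothing to compare against, which is why these numbers cannot appear in Log.final.out.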

written 14 months ago by h.mon25k

So, how can I tell whether a mapping command with optimized parameters is the best among the options, using only the files that STAR produces?

modified 14 months ago • written 14 months ago by XBria50

You could:

1) Find or create a test set with the known true location of each read, and tune your parameters on this set. You can then calculate precision, recall, and so forth. However, the optimal parameters will likely be data-dependent, at least to some extent, and there is no guarantee that the parameters optimized on the test set will be the best for your real data.

2) Perform downstream analysis on each of the alignments and compare the quality of the results, or try to find out which one "makes more sense" or is "biologically more relevant". If you are not careful, though, this will turn into an intricate form of p-hacking.

3) Explore the mappings visually with IGV or another genome browser. You can open several BAM files simultaneously in IGV (though it is better to convert them to bigWig first) and look for discordant regions to assess the mappings visually.

4) Stop overthinking (see the threads "Improving the mapping rate by aligner parameters", "STAR outputs interpretation" and "Parameter optimization STAR") and use the default parameters. In those other threads, you have been told that your mapping rate seems just fine and is similar to the mapping rate of good datasets.

Probably (4) is the best suggestion, and it is not the first time you have heard it.
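
If you still want to put the runs side by side, a rough Python sketch for reading what Log.final.out does contain (mapping rates and similar summary statistics, not precision or recall) could look like this. It assumes the usual `name | value` layout of the file. The function names are made up, and the metric names mentioned afterwards should be checked against your own file, since they may differ between STAR versions.

```python
def parse_star_final_log(text):
    """Return {metric name: value string} from the text of a Log.final.out file.

    Assumes each metric line has the form "name | value"; lines without a
    "|" (section headers such as "UNIQUE READS:") are skipped.
    """
    metrics = {}
    for line in text.splitlines():
        if "|" not in line:
            continue
        name, value = line.split("|", 1)
        metrics[name.strip()] = value.strip()
    return metrics


def as_percent(value):
    """Convert a value string such as "90.00%" to the float 90.0."""
    return float(value.rstrip("%"))


def summarize_runs(log_texts, metric_names):
    """Collect selected metrics per run.

    log_texts: run label -> text of that run's Log.final.out (hypothetical labels).
    Returns run label -> {metric name: value string or None if missing}.
    """
    summary = {}
    for label, text in log_texts.items():
        parsed = parse_star_final_log(text)
        summary[label] = {name: parsed.get(name) for name in metric_names}
    return summary
```

You would read each file yourself (for example with `pathlib.Path(...).read_text()`) and ask for metrics such as "Uniquely mapped reads %" or "% of reads unmapped: too short". Keep in mind that a higher mapping rate only says more reads were placed, not that they were placed correctly.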

modified 14 months ago • written 14 months ago by h.mon25k

I have to write about the optimization process. That is why I need to perform the optimization, even though the default parameter values seem to be the best.

Thanks

written 14 months ago by XBria50

If you had said this first, in your first question, you probably could have gotten better answers.

written 14 months ago by swbarnes25.5k