Question

Multi-mapped reads in Ribo-Seq data, discard or keep?

8 months ago

Carmen • 0

Hi all,

This is my first time doing a translation efficiency (TE) analysis using Ribo-seq and RNA-seq data, and I have a few questions about it.

I have used STAR to align the reads from both datasets to the human genome. For the RNA-seq datasets (average read length 100 nt) I get a very good proportion of uniquely mapped reads. For the Ribo-seq datasets I get only 13-15% uniquely mapped reads, which I guess is understandable given that the average read length is very short (30 nt).

I am now planning to get the gene counts (as TPM) for both datasets and use these to calculate the translation efficiency of each gene. Should I use all mapped reads, regardless of the fact that most of them are multi-mapped, or should I use only the uniquely mapped reads? In the second case, won't this underrepresent the real counts?
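For concreteness, the calculation described above could look roughly like the sketch below. It is only an illustration under stated assumptions: TE is taken as the log2 ratio of Ribo-seq TPM to RNA-seq TPM, a pseudocount of 1 is added to avoid dividing by zero, and the function names, gene IDs and feature lengths are hypothetical rather than taken from any particular package.

```python
import math


def tpm(counts, lengths_bp):
    """Convert raw per-gene counts to TPM, given feature lengths in bp."""
    # Length-normalised rate for each gene (reads per base).
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    # Rescale so that all values sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


def translation_efficiency(ribo_tpm, rna_tpm, pseudocount=1.0):
    """log2((Ribo TPM + pc) / (RNA TPM + pc)) for genes present in both inputs.

    The pseudocount is an assumption of this sketch, not a fixed convention.
    """
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    return {
        gene: math.log2(
            (ribo_tpm[gene] + pseudocount) / (rna_tpm[gene] + pseudocount)
        )
        for gene in ribo_tpm.keys() & rna_tpm.keys()
    }


if __name__ == "__main__":
    # Hypothetical counts and lengths, for illustration only.
    lengths = {"geneA": 1000, "geneB": 2000}
    ribo = tpm({"geneA": 30, "geneB": 10}, lengths)
    rna = tpm({"geneA": 100, "geneB": 200}, lengths)
    for gene, te in sorted(translation_efficiency(ribo, rna).items()):
        print(gene, round(te, 3))
```

Whichever reads end up being counted (unique only, or unique plus multi-mapped), the same choice would presumably need to be applied to both the Ribo-seq and the RNA-seq counts so that the ratio stays comparable.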

Thanks a lot in advance for your help.

STAR ribo-seq multi-mapped rna-seq TE • 649 views

score 0 · Answer 1 · 2023-12-22

Hi Carmen, a low unique-mapping rate might indicate that you either didn't perform rRNA depletion on your Ribo-seq samples, or that the depletion you performed was not very successful. An abundance of rRNA reads can account for a high percentage of multi-mapping reads (I haven't worked with human Ribo-seq datasets myself, but I guess this might be the case). Multi-mapping reads are usually discarded from further analysis. You can also try another alignment tool, or adjust the alignment parameters (e.g. the mismatch tolerance), and check what results you get. Running FastQC is also an easy way to check for contaminating or overrepresented sequences and to quickly see whether there is an issue with the sample. Good luck! Hope this helps.
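To see how many reads would be lost by discarding multi-mappers, one option is to tally reads in the aligner's SAM output. The sketch below is a minimal illustration, assuming single-end reads and an uncompressed SAM file; it relies on the standard `NH:i:` tag (number of reported alignments for the read) and, if that tag is absent, falls back to STAR's default convention of MAPQ 255 for uniquely mapped reads. The function names and the file name are hypothetical.

```python
def classify_sam_line(line):
    """Return 'unique', 'multi' or 'unmapped'; None for header lines."""
    if line.startswith("@"):
        return None
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:  # SAM flag bit 0x4: read is unmapped
        return "unmapped"
    # Optional tags start at column 12; NH gives the number of alignments.
    for tag in fields[11:]:
        if tag.startswith("NH:i:"):
            return "unique" if int(tag[5:]) == 1 else "multi"
    # Fallback (assumption): STAR's default MAPQ for unique mappers is 255.
    return "unique" if int(fields[4]) == 255 else "multi"


def count_reads(sam_lines):
    """Count distinct read names per class.

    A multi-mapped read has one SAM line per alignment, so read names
    are collected in sets rather than counting lines.
    """
    names = {"unique": set(), "multi": set(), "unmapped": set()}
    for line in sam_lines:
        category = classify_sam_line(line)
        if category is not None:
            names[category].add(line.split("\t", 1)[0])
    return {category: len(reads) for category, reads in names.items()}


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python count_reads.py Aligned.out.sam
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            print(count_reads(handle))
```

If most of the multi-mapped reads turn out to align to rRNA loci, that would support the contamination explanation rather than a read-length problem.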