Question

Small RNA sequence, how to calculate read counts? What's the criterion? Is there software to do this work?

2

Entering edit mode

9.3 years ago

xuling2015 ▴ 20

After alignment small RNA sequence to the reference genome using bowtie. I got SAM file, then how to calculate read counts? What's the criterion? Is there software to do this work?

sequence RNA-Seq • 8.8k views

ADD COMMENT • link updated 2.0 years ago by Ram 43k • written 9.3 years ago by xuling2015 ▴ 20

1

Entering edit mode

Are you interested in small RNAs in general or more specifically miRNAs? There are a lot of tools specifically meant to handle miRNAs, so if you're happy restricting yourself slightly to just those then just go with one of those tools (e.g., miRDeep2).

ADD REPLY • link 9.3 years ago by Devon Ryan 104k

0

Entering edit mode

If you use Bowtie2, assure that you set the '-a' parameter. Otherwise you will only get one random hit reported, even if there are multiple mapping loci. small RNAs are known to occur in multiple copies (miRNAs, tRNAs, etc.).

Some more information about the problem: Attention: Bowtie2 And Multiple Hits

ADD REPLY • link updated 2.0 years ago by Ram 43k • written 9.2 years ago by David Langenberger 11k

Ram · Answer 1 · 2015-01-31

To directly answer your question, HTseq count, featureCounts from the subread package, and GenomicRanges summarizeOverlaps(). For example usage of the latter, see:

http://bioconductor.org/packages/release/bioc/vignettes/GenomicRanges/inst/doc/GenomicRangesHOWTOs.pdf

In your case, if you have a set of regions defining the miRNA locations on the genome, you can create a GRanges or GRangesList (see the rtracklayer package for how to import a BED file to a GRanges object) and then use summarizeOverlaps() as described. Once you have a table of counts, the typical RNA-seq tools like DESeq2, edgeR, and limma voom can be applied.

Ram · Answer 2 · 2015-02-01

0

Entering edit mode

9.2 years ago

Whoknows ▴ 960

You can use HTSeq for this,

And you can also do this on Fastq files with miRanalyzer2, groupReads from this link: http://bioinfo5.ugr.es/miRanalyzer/standalone.html#toc-Section-1

ADD COMMENT • link updated 2.0 years ago by Ram 43k • written 9.2 years ago by Whoknows ▴ 960

0

Entering edit mode

Regarding HTSeq, one has to be quite careful with this when looking at small RNAs. Small RNAs can often exist with multiple copies in the genome and htseq (well, htseq-count) will default to ignoring these if they can't be uniquely aligned against.

ADD REPLY • link 9.2 years ago by Devon Ryan 104k

0

Entering edit mode

When converting read counts of small RNA into reads per million (RPM), how to define "Total_number_mapped_reads".

For example, in one of the mapped small RNA library

2601833 reads; of these: 2601833 (100.00%) were unpaired; of these:
352919 (13.56%) aligned 0 times

362282 (13.92%) aligned exactly 1 time

1886632 (72.51%) aligned >1 times
  
86.44% overall alignment rate

Total_number_mapped_reads= 362282 + 1886632 = 2248914

But the fact that some of the "1886632" reads are mapped multiple times, the actual Total_number_mapped_reads(location) will be higher.

To compute reads per million, which "Total_number_mapped_reads" should be used. The one which is got after multimapping or the actual number of mapped reads.

thanks !!

ADD REPLY • link 6.2 years ago by Chirag Nepal ★ 2.4k

0

Entering edit mode

Please ask a new question.

ADD REPLY • link 6.2 years ago by Sean Davis 26k

0

Entering edit mode

Thanks Sean !! I did it here Best/right way to quantify small RNA transcripts

ADD REPLY • link 6.2 years ago by Chirag Nepal ★ 2.4k