Question: Small RNA sequence, how to calculate read counts? What's the criterion? Is there software to do this work?
xuling2015 (United States) wrote 3.8 years ago:

After aligning small RNA sequences to the reference genome using Bowtie, I got a SAM file. How do I calculate read counts from it? What are the criteria for counting a read? Is there software that does this?

rna-seq sequence • 5.8k views

Are you interested in small RNAs in general, or more specifically in miRNAs? There are many tools built specifically for miRNAs, so if you're happy to restrict yourself to those, just use one of them (e.g., miRDeep2).

Reply written 3.8 years ago by Devon Ryan

If you use Bowtie2, make sure you set the '-a' option. Otherwise only one hit is reported per read, chosen at random among the best alignments, even if the read maps to multiple loci. Small RNAs (miRNAs, tRNAs, etc.) are known to occur in multiple copies in the genome.

Some more information about the problem: Attention: Bowtie2 And Multiple Hits

Reply written 3.7 years ago by David Langenberger
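
To see how much multi-mapping there is once all hits are reported, one can tally the number of alignment records per read name in the SAM file. The following is a minimal sketch, assuming single-end reads and one SAM record per reported alignment; the function name `alignments_per_read` and the toy records are hypothetical, not part of Bowtie2 or any library.

```python
from collections import Counter

def alignments_per_read(sam_lines):
    """Return a Counter mapping read name -> number of reported alignments."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):           # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:                # not a complete SAM record
            continue
        flag = int(fields[1])
        if flag & 0x4:                      # 0x4 = read is unmapped
            continue
        counts[fields[0]] += 1
    return counts

# Toy SAM records (hypothetical read names and coordinates).
toy_sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "read1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t0\tchr1\t500\t1\t4M\t*\t0\t0\tGGCC\tIIII",
    "read2\t256\tchr2\t900\t1\t4M\t*\t0\t0\tGGCC\tIIII",
    "read2\t272\tchr3\t40\t1\t4M\t*\t0\t0\tGGCC\tIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII",
]

counts = alignments_per_read(toy_sam)
multi_mapped = {name for name, n in counts.items() if n > 1}
print(counts)
print("multi-mapped reads:", multi_mapped)
```

For a real file, the same function can be given an open file handle instead of the toy list, since it only iterates over lines.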
Sean Davis (National Institutes of Health, Bethesda, MD) wrote 3.7 years ago:

To directly answer your question: htseq-count (from HTSeq), featureCounts from the Subread package, and GenomicRanges summarizeOverlaps() can all do this. For example usage of the latter, see:

http://bioconductor.org/packages/release/bioc/vignettes/GenomicRanges/inst/doc/GenomicRangesHOWTOs.pdf

In your case, if you have a set of regions defining the miRNA locations on the genome, you can create a GRanges or GRangesList (see the rtracklayer package for how to import a BED file to a GRanges object) and then use summarizeOverlaps() as described.  Once you have a table of counts, the typical RNA-seq tools like DESeq2, edgeR, and limma voom can be applied.

 

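
To illustrate what such counting tools do conceptually, here is a minimal sketch that counts alignments overlapping a set of annotated regions. It is not the implementation of htseq-count, featureCounts, or summarizeOverlaps(); it assumes BED-style 0-based half-open coordinates, ignores strand, and drops alignments that overlap more than one feature. The function name `count_overlaps` and all feature names and coordinates are hypothetical.

```python
from collections import defaultdict

def count_overlaps(features, alignments):
    """Count alignments per feature.

    features:   iterable of (chrom, start, end, name), 0-based half-open
    alignments: iterable of (chrom, start, end), 0-based half-open
    An alignment is counted only if it overlaps exactly one feature.
    """
    by_chrom = defaultdict(list)
    counts = {}
    for chrom, start, end, name in features:
        by_chrom[chrom].append((start, end, name))
        counts[name] = 0

    for chrom, a_start, a_end in alignments:
        hits = [
            name
            for f_start, f_end, name in by_chrom.get(chrom, [])
            if a_start < f_end and f_start < a_end   # intervals overlap
        ]
        if len(hits) == 1:                           # skip ambiguous or no-feature
            counts[hits[0]] += 1
    return counts

# Hypothetical annotated loci and alignments.
features = [
    ("chr1", 100, 122, "locus_A"),
    ("chr1", 115, 140, "locus_B"),
    ("chr2", 50, 72, "locus_C"),
]
alignments = [
    ("chr1", 100, 121),   # overlaps locus_A and locus_B: ambiguous, skipped
    ("chr1", 95, 112),    # overlaps locus_A only
    ("chr1", 125, 139),   # overlaps locus_B only
    ("chr2", 50, 72),     # overlaps locus_C
    ("chr2", 60, 70),     # overlaps locus_C
    ("chr3", 10, 30),     # no annotated feature on chr3
]

overlap_counts = count_overlaps(features, alignments)
print(overlap_counts)
```

The linear scan over features per alignment is fine for a small annotation; real tools use indexed interval structures and offer several policies for ambiguous overlaps.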
Whoknows (Tehran, Iran) wrote 3.7 years ago:

You can use HTSeq for this.

You can also do this directly on FASTQ files with miRanalyzer2, groupReads, from the following link:

http://bioinfo5.ugr.es/miRanalyzer/standalone.html#toc-Section-1

 


Regarding HTSeq, one has to be quite careful when using it for small RNAs. Small RNAs often exist in multiple copies in the genome, and HTSeq (well, htseq-count) by default ignores reads that cannot be uniquely aligned.

Reply written 3.7 years ago by Devon Ryan
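
One commonly used alternative to discarding multi-mapped reads is to split each read's count evenly across the features it hits. The sketch below shows that idea only; it is not what htseq-count does by default, and the function name `fractional_counts`, the read names, and the locus names are all hypothetical.

```python
from collections import Counter, defaultdict

def fractional_counts(hits):
    """Distribute each read evenly over the features its alignments hit.

    hits: iterable of (read_name, feature_name), one entry per alignment
          that overlaps a feature. A read with n entries adds 1/n to each.
    """
    hits = list(hits)                        # iterated twice below
    per_read = Counter(read for read, _ in hits)
    totals = defaultdict(float)
    for read, feature in hits:
        totals[feature] += 1.0 / per_read[read]
    return dict(totals)

# Hypothetical hits: readB aligns to two copies of the same small RNA.
toy_hits = [
    ("readA", "locus_1"),
    ("readB", "locus_2a"),
    ("readB", "locus_2b"),
    ("readC", "locus_2a"),
]

weighted = fractional_counts(toy_hits)
print(weighted)
```

With this weighting the per-feature counts sum to the number of reads that hit at least one feature, so each read still contributes exactly one count in total.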

When converting read counts of small RNAs into reads per million (RPM), how should "Total_number_mapped_reads" be defined?

For example, for one of the mapped small RNA libraries, the alignment summary is:

    2601833 reads; of these:
      2601833 (100.00%) were unpaired; of these:
        352919 (13.56%) aligned 0 times
        362282 (13.92%) aligned exactly 1 time
        1886632 (72.51%) aligned >1 times
    86.44% overall alignment rate

Total_number_mapped_reads= 362282 + 1886632 = 2248914

But because the 1886632 multi-mapped reads each align to more than one location, the total number of mapped locations (alignments) will be higher than this.

To compute reads per million, which "Total_number_mapped_reads" should be used: the number of alignments after multi-mapping, or the actual number of mapped reads?

thanks !!

Reply written 7 months ago by Chirag Nepal
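
The arithmetic in the question above can be sketched as follows, using the read-based total from the alignment summary. Which denominator is appropriate is exactly the open question, so this only shows the calculation; the function name `reads_per_million` and the example count of 1000 reads are hypothetical.

```python
def reads_per_million(count, total_mapped_reads):
    """Scale a raw read count to reads per million mapped reads."""
    return count / total_mapped_reads * 1_000_000

# Numbers taken from the alignment summary in the question.
uniquely_mapped = 362282
multi_mapped = 1886632
total_mapped_reads = uniquely_mapped + multi_mapped

# Hypothetical raw count for one small RNA.
example_count = 1000
example_rpm = reads_per_million(example_count, total_mapped_reads)

print("total mapped reads:", total_mapped_reads)
print("RPM for example count:", round(example_rpm, 2))
```

Using the number of alignments instead would only change the denominator passed to the function; that number is not given in the summary and would have to be counted from the SAM file.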

Please ask a new question.

Reply written 7 months ago by Sean Davis

Thanks, Sean! I did so here: "Best/right way to quantify small RNA transcripts"

Reply written 7 months ago by Chirag Nepal