Question: Distribution of mapped reads
13 months ago, ilovesuperheroes1993 wrote:

Hi, I have BAM files of small RNA sequencing data mapped to the human reference genome with STAR. I need to find the percentage of reads that map to:

(1) miRNA (2) lncRNA (3) piRNA (4) other non-coding RNA (5) introns (6) 3' and 5' UTRs (7) promoters

I started by counting the reads that mapped to known mature miRNAs. The command I used is: bedtools intersect -abam bam_file -b mature_mirna_gff_file -bed | wc -l

Next I use the lncRNA annotation file to count reads in the same way, then the piRNA annotation, and so on.

Is this methodology correct? Do I need to remove the reads which mapped to a specific class from the bam file after each step?
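
On the second question: with the command above, a read that overlaps features in two annotation files is counted in both classes, so the percentages can sum to more than 100. If each read should count toward exactly one class, one option is to test the classes in a fixed priority order and drop matched reads before testing the next class. The following is a minimal sketch of that idea, assuming bedtools and samtools are installed. The function names (pct, classify_reads), the file names in the usage line, and the priority order are hypothetical placeholders, not something stated in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: assumes bedtools and samtools are on PATH.
# Function names and file names are illustrative placeholders.

# pct <count> <total>: percentage with two decimals, or NA if total is 0.
pct() {
    awk -v n="$1" -v t="$2" \
        'BEGIN { if (t > 0) printf "%.2f\n", 100 * n / t; else print "NA" }'
}

# classify_reads <mapped.bam> <class1.gff> [<class2.gff> ...]
# Annotation files are tested in the order given. A read counted for one
# class is removed before the next class is tested.
classify_reads() {
    local bam=$1; shift
    local remaining next total annot n
    remaining=$(mktemp)
    # Keep one primary alignment per mapped read (drop unmapped, secondary
    # and supplementary records) so that each read is counted once.
    samtools view -b -F 0x904 "$bam" > "$remaining"
    total=$(samtools view -c "$remaining")
    for annot in "$@"; do
        # -u: report each alignment at most once if it overlaps any feature
        n=$(bedtools intersect -u -abam "$remaining" -b "$annot" | samtools view -c -)
        printf '%s\t%s\t%s\n' "$annot" "$n" "$(pct "$n" "$total")"
        # -v: keep only alignments with no overlap, for the next class
        next=$(mktemp)
        bedtools intersect -v -abam "$remaining" -b "$annot" > "$next"
        rm -f "$remaining"
        remaining=$next
    done
    # Whatever is left overlapped none of the annotation files.
    n=$(samtools view -c "$remaining")
    printf 'unassigned\t%s\t%s\n' "$n" "$(pct "$n" "$total")"
    rm -f "$remaining"
}
```

Usage would look like `classify_reads sample.bam mirna.gff pirna.gff lncrna.gtf`, printing one tab-separated line per annotation file (file name, read count, percentage of mapped reads) plus a final unassigned line. Adding -s to both bedtools calls would require same-strand overlap, which may be appropriate for a stranded small RNA library.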


Just to be sure: is it small RNA-Seq (libraries built with a specific kit to capture small RNAs such as miRNAs) or standard RNA-Seq (polyA-selected or rRNA-depleted library)?

(comment written 13 months ago by Nicolas Rosewick)

Yes, it was done with the NEBNext small RNA kit, specifically for miRNA and piRNA.

(comment written 13 months ago by ilovesuperheroes1993)

I don't understand why you want to annotate lncRNAs in small RNA-seq data. You won't find any, or rather you shouldn't if the library preparation was done properly. You also wrote in your comment that you enriched for miRNAs and piRNAs; neither of these has introns.

(comment written 13 months ago by Grinch)
13 months ago, Grinch (Germany) wrote:

Firstly, for small RNAs you should map reads with Bowtie2 or a similar aligner: small RNAs don't have introns, so a spliced aligner is not necessary, and in my experience it actually performs worse on such data. After aligning the reads to the reference genome, count the number of reads per gene, for example with featureCounts from the Subread package. The RNAcentral database has a very extensive annotation of small RNAs, and you can download an annotation file in GTF or GFF format and use it to annotate your data.
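
The two steps described above could be sketched roughly as follows. This is only an illustration, assuming bowtie2, samtools and featureCounts are installed and that the reads are already adapter-trimmed. The function names, thread count, and all file arguments are hypothetical placeholders, and the -t / -g values are featureCounts' defaults, which may need changing to match the feature type and attribute key of the annotation file actually downloaded.

```shell
#!/usr/bin/env bash
# Sketch only: assumes bowtie2, samtools and featureCounts (Subread) are
# on PATH. Function names and arguments are illustrative placeholders.

# align_small_rna <bowtie2_index_prefix> <trimmed_reads.fastq.gz> <out.bam>
# Unspliced, end-to-end alignment of single-end reads, sorted by coordinate.
align_small_rna() {
    bowtie2 -p 4 -x "$1" -U "$2" | samtools sort -o "$3" -
}

# count_per_feature <annotation.gtf> <in.bam> <counts.txt>
# -t selects the feature type (column 3 of the annotation) and -g the
# attribute used to group features; adjust both to the annotation in use.
count_per_feature() {
    featureCounts -T 4 -t exon -g gene_id -a "$1" -o "$3" "$2"
}

# assigned_pct <counts.txt.summary>
# Percentage of all reads that featureCounts reports as "Assigned",
# computed from the tab-separated summary file of a single-sample run.
assigned_pct() {
    awk -F'\t' '
        NR > 1 { total += $2; if ($1 == "Assigned") assigned = $2 }
        END    { if (total > 0) printf "%.2f\n", 100 * assigned / total }
    ' "$1"
}
```

featureCounts writes a `<counts.txt>.summary` file next to its output, so running `assigned_pct counts.txt.summary` after `count_per_feature` gives the fraction of reads assigned to the annotated features. Running the counting step once per annotation class (miRNA, piRNA, and so on) yields one such percentage per class.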
