Hi, I have BAM files of small RNA sequencing data mapped to the human reference genome with STAR. I need to find the percentage of reads that map to:
(1) miRNA (2) lncRNA (3) piRNA (4) other non-coding RNA (5) introns (6) 3' and 5' UTRs (7) promoters
I started by counting the reads that map to known mature miRNAs. The command I used is:
bedtools intersect -abam bam_file -b mature_mirna_gff_file -bed | wc -l
I then repeat this with the lncRNA annotation file to count its reads, then with the piRNA annotation, and so on.
Is this methodology correct? Do I need to remove the reads that mapped to a given class from the BAM file after each step?
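To make the procedure concrete, here is a rough sketch of the per-class counting as a small shell script. It is only an illustration under some assumptions: the function names (pct, count_overlaps, count_mapped, summarize) and all file names are hypothetical, samtools is assumed to be installed alongside bedtools, and it counts alignments rather than unique reads, so multi-mapped reads may be counted more than once. Unlike the command above, it adds -u so that an alignment overlapping several features in one annotation file is reported only once. It does not remove reads between classes, so a read can still be counted in more than one class.

```shell
#!/usr/bin/env bash
# Sketch only: all function and file names here are hypothetical placeholders.

# Percentage of `part` ($1) out of `total` ($2), printed with two decimals.
pct() {
    awk -v p="$1" -v t="$2" \
        'BEGIN { if (t > 0) printf "%.2f\n", 100 * p / t; else print "NA" }'
}

# Count alignments in a BAM ($1) that overlap at least one feature in an
# annotation file ($2). -u reports each alignment at most once.
count_overlaps() {
    bedtools intersect -u -a "$1" -b "$2" -bed | wc -l
}

# Count all mapped alignments in a BAM ($1); -F 4 excludes unmapped records.
count_mapped() {
    samtools view -c -F 4 "$1"
}

# Print "annotation  count  percent" for each annotation file given after the BAM.
summarize() {
    local bam="$1"; shift
    local total n ann
    total=$(count_mapped "$bam")
    for ann in "$@"; do
        n=$(count_overlaps "$bam" "$ann")
        printf '%s\t%s\t%s\n' "$ann" "$n" "$(pct "$n" "$total")"
    done
}
```

It could be called as, for example, summarize sample.bam mature_mirna.gff lncrna.gtf pirna.bed, where each annotation file is whatever you have for that class.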