Hi,
I was trying to get the counts of miRNAs using htseq-count. The sam file is generated from adapter trimmed miRNA-seq alignment (bowtie) against miRBase mature miRNAs.
The sam file look like this:
HISEQ2000:383:C2452ACXX:1:2316:21334:99296 0 hsa-miR-30c-5p 1 255 23M * 0 0 TGTAAACATCCTACACTCTCAGC @@@DDDD>FFHFBH<C:FB9EE: XA:i:0 MD:Z:23 NM:i:0
HISEQ2000:383:C2452ACXX:1:2316:21266:99322 0 hsa-miR-486-5p 1 255 22M * 0 0 TCCTGTACTGAGCTGCCCCGAG @C@FFFDFFHAHHIIIIIIEED XA:i:0 MD:Z:22 NM:i:0
The reference fasta file used to create bowtie index:
>hsa-let-7a-5p
TGAGGTAGTAGGTTGTATAGTT
>hsa-let-7a-3p
CTATACAATCTACTGTCTTTC
Gtf file used :
hsa-mir-6859-1 . miRNA_primary_transcript 17369 17436 . - . ID "MI0022705"; Alias "MI0022705"; Name "hsa-mir-6859-1"
hsa-miR-6859-5p . miRNA 17409 17431 . - . ID "MIMAT0027618"; Alias "MIMAT0027618"; Name "hsa-miR-6859-5p"; Derives_from "MI0022705"
I am not getting any counts for miRNAs based on these files with htseq-count. At the same time, I can see from the sam file that there are mappings to miRNAs.
htseq command used:
htseq-count --mode=intersection-strict --stranded=no --type=miRNA --idattr=Name eg.sam hsa.gff3
Could anyone advise me whether there is anything wrong in this approach or in the files I have used to get the counts ?
Thanks
As sanders.muc noted, it looks like your annotation won't match the alignments. How long is the "hsa-miR-6859-5p" transcript in the fasta file? If it's less than ~17.4kb then this will never work. Why don't you just look at how many reads map to each contig in the resulting BAM (look at samtools idxstats)? Presuming you first filtered out multimappers, that would seem to work.