I'm new to analysing small RNA-seq data and I have some questions. I hope someone experienced with small RNA-seq can give me some advice.
- I'm mapping my single-end smRNA-seq data to the hg19 and hg38 references. I used the CAP-miRSeq pipeline for this, so the aligner was bowtie. When I got the BAM files, I checked the mapped reads with samtools flagstat and was surprised: the BAM mapped to hg19 had about 50,000,000 mapped reads, while the BAM mapped to hg38 had only about half as many. I tried three other samples as well. Here are the flagstat results.
hg19:
58275643 + 0 mapped (90.82% : N/A)
25226589 + 0 mapped (87.94% : N/A)
36270257 + 0 mapped (86.49% : N/A)
27601897 + 0 mapped (91.43% : N/A)

hg38:
23224974 + 0 mapped (53.08% : N/A)
18395834 + 0 mapped (74.52% : N/A)
20368027 + 0 mapped (62.12% : N/A)
17979959 + 0 mapped (73.06% : N/A)
Is this possible?
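To make the comparison easier to check, here is a small sketch that back-computes the total read count implied by each flagstat "mapped" line (mapped reads divided by the mapped percentage). It assumes that the first set of four numbers above is hg19, the second set is hg38, and that the samples are listed in the same order in both; the function name `implied_total` is just something I made up for this example. Paired this way, the implied totals for the first sample are not the same (roughly 64.2 million for hg19 versus 43.8 million for hg38), so the two BAM files may not contain the same set of input reads.

```python
import re

# Matches a samtools flagstat "mapped" line such as
# "58275643 + 0 mapped (90.82% : N/A)"
MAPPED_RE = re.compile(r"(\d+) \+ (\d+) mapped \(([\d.]+)%")


def implied_total(line):
    """Return (mapped_reads, percent_mapped, implied_total_reads) for one line."""
    match = MAPPED_RE.search(line)
    if match is None:
        raise ValueError(f"not a flagstat 'mapped' line: {line!r}")
    mapped = int(match.group(1))        # QC-passed mapped reads
    percent = float(match.group(3))     # percentage of all reads that mapped
    total = round(mapped / (percent / 100.0))
    return mapped, percent, total


# Assumption: first set of numbers = hg19, second set = hg38, same sample order.
hg19_lines = [
    "58275643 + 0 mapped (90.82% : N/A)",
    "25226589 + 0 mapped (87.94% : N/A)",
    "36270257 + 0 mapped (86.49% : N/A)",
    "27601897 + 0 mapped (91.43% : N/A)",
]
hg38_lines = [
    "23224974 + 0 mapped (53.08% : N/A)",
    "18395834 + 0 mapped (74.52% : N/A)",
    "20368027 + 0 mapped (62.12% : N/A)",
    "17979959 + 0 mapped (73.06% : N/A)",
]

for sample, (old, new) in enumerate(zip(hg19_lines, hg38_lines), start=1):
    mapped19, pct19, total19 = implied_total(old)
    mapped38, pct38, total38 = implied_total(new)
    print(
        f"sample {sample}: "
        f"hg19 {mapped19:,} mapped of ~{total19:,} ({pct19}%) | "
        f"hg38 {mapped38:,} mapped of ~{total38:,} ({pct38}%)"
    )
```

The implied totals are only approximate, because flagstat rounds the percentage to two decimal places.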
- I'm going to run a differential expression (DEG) analysis with these data, but I'm confused about how to get a raw count file from smRNA-seq data. This is different from ordinary RNA-seq, right? Could someone give me some pointers on how to do this? Should I use a normal GTF file or a miRNA database's annotation file (such as hairpin.gtf)?
(For example, if using htseq-count: which GTF or GFF file, which feature type, which ID attribute to use, etc.)
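To make the question concrete, this is roughly the kind of htseq-count call I have in mind, written as a small Python helper that only builds and prints the command line. Everything specific here is my assumption and not something I have confirmed: the file names `sample1.hg38.bam` and `hsa.gff3` are placeholders, and I am guessing that a miRBase-style GFF3 with feature type `miRNA` and ID attribute `Name` would be the right combination. The options themselves (`--format`, `--type`, `--idattr`, `--stranded`) are standard htseq-count options.

```python
import shlex


def build_htseq_count_cmd(bam_path, annotation_path,
                          feature_type="miRNA", id_attr="Name",
                          stranded="yes"):
    """Build (but do not run) an htseq-count command line as a list of arguments.

    feature_type and id_attr are guesses for a miRBase-style GFF3;
    stranded="yes" is simply htseq-count's own default.
    """
    return [
        "htseq-count",
        "--format", "bam",          # input is a BAM file
        "--type", feature_type,     # which feature rows to count over
        "--idattr", id_attr,        # attribute used as the row label in the counts
        "--stranded", stranded,
        bam_path,
        annotation_path,
    ]


# Hypothetical file names, for illustration only.
cmd = build_htseq_count_cmd("sample1.hg38.bam", "hsa.gff3")
print(shlex.join(cmd))
```

Whatever annotation file is used, its coordinates would need to be on the same genome build as the reference the reads were aligned to.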
Thank you very much for your help!