I have small RNA-seq data and a workflow for analyzing it. Any suggestions, or is it fine to work with as is?
Trim Galore > Collapse > RNA STAR > featureCounts > limma-voom
I think if you collapse, you are going to have problems: more or less every read from a given miRNA will have the same sequence, so collapsing will mean you only ever get one read for each isomiR.
Also, STAR isn't really necessary, because miRNAs aren't spliced. You should be okay using BWA or bowtie2.
Be careful when running featureCounts. Many miRNAs multi-map because they are present at more than one location in the genome, and featureCounts discards these reads by default unless told not to.
So why does every workflow I look at use collapse? I don't understand. And instead of featureCounts, do you suggest using HTSeq?
Most workflows collapse but still keep track of the number of identical sequences (so you don't end up with the problem of one read for each isomiR).
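To illustrate what "collapse but keep the counts" means, here is a minimal Python sketch. It is not the code of any particular collapsing tool; the function names and the `seqN_xCOUNT` header convention are hypothetical and chosen only for illustration.

```python
from collections import Counter

def collapse_reads(sequences):
    """Collapse identical sequences, remembering how often each was seen."""
    return Counter(sequences)

def to_collapsed_fasta(counts):
    """Write one FASTA record per unique sequence, with the count in the header.

    The 'seqN_xCOUNT' header format is an assumption made for this example.
    """
    lines = []
    for i, (seq, n) in enumerate(counts.most_common(), start=1):
        lines.append(f">seq{i}_x{n}")
        lines.append(seq)
    return "\n".join(lines)

# Toy reads: three identical copies of one sequence plus one shorter variant.
reads = [
    "TGAGGTAGTAGGTTGTATAGTT",
    "TGAGGTAGTAGGTTGTATAGTT",
    "TGAGGTAGTAGGTTGTATAGT",
    "TGAGGTAGTAGGTTGTATAGTT",
]
counts = collapse_reads(reads)
print(to_collapsed_fasta(counts))
```

Each unique sequence is aligned only once, but its count travels with it in the header, so the abundance can be restored after mapping.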
You can still use featureCounts and specify the -M flag to count multi-mapping reads. The issue of multi-mapping depends on what you're mapping against (e.g. are you aligning against the entire reference genome?).
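As a rough sketch, a featureCounts call with -M could be assembled like this from Python. The file names are placeholders, and the `-t miRNA -g Name` choice assumes a miRBase-style GFF3 in which mature miRNAs have feature type `miRNA` and a `Name` attribute; check your own annotation file before relying on it.

```python
import shlex

def featurecounts_command(bam, annotation, out, count_multimappers=True):
    """Build a featureCounts command line as a list of arguments.

    Assumes mature miRNAs are annotated with feature type 'miRNA' and a
    'Name' attribute (an assumption about the annotation file).
    """
    cmd = ["featureCounts", "-a", annotation, "-o", out, "-t", "miRNA", "-g", "Name"]
    if count_multimappers:
        cmd.append("-M")  # also count reads flagged as multi-mapping
    cmd.append(bam)
    return cmd

# Placeholder file names, for illustration only.
cmd = featurecounts_command("aligned.bam", "hsa.gff3", "mirna_counts.txt")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.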
Yes, the human reference genome. Another thing: with bowtie, should I use the miRBase GFF file (hsa) instead of the UCSC GFF file that I downloaded from the UCSC website?
Because bowtie isn't a spliced aligner, it doesn't use a GFF file. The annotation only comes into play at the counting step (e.g. featureCounts).
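To make that concrete, here is a minimal sketch of the two alignment commands, assuming bowtie2 (suggested earlier in the thread) rather than the original bowtie. All file names and the index prefix are placeholders. Note that neither command takes an annotation file.

```python
import shlex

def bowtie2_commands(genome_fasta, index_prefix, reads_fastq, out_sam):
    """Return the index-building and alignment commands as argument lists."""
    # Build the index from the reference FASTA alone.
    build = ["bowtie2-build", genome_fasta, index_prefix]
    # Align unpaired reads (-U) against the index (-x), writing SAM output (-S).
    align = ["bowtie2", "-x", index_prefix, "-U", reads_fastq, "-S", out_sam]
    return build, align

# Placeholder paths, for illustration only.
build, align = bowtie2_commands("hg38.fa", "hg38_index", "trimmed.fastq", "aligned.sam")
print(shlex.join(build))
print(shlex.join(align))
```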