Hi biostars
"I am analyzing human RNA-seq samples. In the mapping step, I used STAR and obtained an alignment rate of more than 80%. However, when I used featureCounts, the percentage of assigned reads dropped to less than 50%, and a large proportion of the unassigned reads are reported as 'no feature'. I am confident about the strandedness setting, and I used the same GTF file for both STAR and featureCounts. What could be the reason for this discrepancy?"
Depending on whether this is an mRNA-seq or a total RNA-seq dataset, one explanation is that the "no feature" reads are rRNA, especially if your GTF file does not contain rRNA genes. With total RNA-seq, a large percentage of rRNA reads would be expected. If this is mRNA-seq, then the polyA capture (or rRNA depletion step) may not have worked as well as expected.
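To see how the unassigned reads split up per sample, you can look at the `.summary` file that featureCounts writes alongside the count table. Below is a minimal Python sketch, assuming the usual tab-separated layout (a `Status` column followed by one column per BAM file). The function name is hypothetical and the numbers are made up for illustration, not taken from your data.

```python
import csv
import io


def summarize_featurecounts(summary_text):
    """Return {sample: {status: fraction of reads}} from a featureCounts summary table."""
    rows = [r for r in csv.reader(io.StringIO(summary_text), delimiter="\t") if r]
    samples = rows[0][1:]  # first column is "Status", the rest are BAM names
    counts = {s: {} for s in samples}
    for row in rows[1:]:
        status = row[0]
        for sample, value in zip(samples, row[1:]):
            counts[sample][status] = int(value)

    fractions = {}
    for sample, by_status in counts.items():
        total = sum(by_status.values())
        fractions[sample] = {
            status: (n / total if total else 0.0) for status, n in by_status.items()
        }
    return fractions


# Illustrative (made-up) summary content; in practice read your own .summary file.
example = (
    "Status\tsample1.bam\n"
    "Assigned\t4500\n"
    "Unassigned_NoFeatures\t4000\n"
    "Unassigned_Ambiguity\t500\n"
    "Unassigned_MultiMapping\t1000\n"
)

fractions = summarize_featurecounts(example)
for status, frac in sorted(fractions["sample1.bam"].items(), key=lambda kv: -kv[1]):
    print(f"{status}\t{frac:.1%}")
```

If `Unassigned_NoFeatures` dominates in only some samples, that points to a library-level problem in those samples rather than to the annotation.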
You can also examine the aligned data in a genome viewer such as IGV and check where the reads are aligning, e.g. inside or outside gene models (exons, UTRs). Reads aligning outside annotated exons may represent DNA contamination (random alignments), intronic reads, or previously unannotated genes.
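The same exonic / intronic / intergenic distinction you would eyeball in IGV can be written as a simple overlap test. This is only a toy sketch to illustrate the logic: the coordinates are invented, intervals are treated as half-open `(start, end)` on a single chromosome, strand is ignored, and the function names are hypothetical. For real data you would use a proper interval library or a QC tool.

```python
from collections import Counter


def overlaps(start, end, intervals):
    """True if the half-open interval [start, end) overlaps any interval in the list."""
    return any(start < i_end and end > i_start for i_start, i_end in intervals)


def classify_read(start, end, exons, genes):
    """Label a read as exonic, intronic (in a gene but no exon), or intergenic."""
    if overlaps(start, end, exons):
        return "exonic"
    if overlaps(start, end, genes):
        return "intronic"
    return "intergenic"


# Toy annotation: one gene with two exons (invented coordinates).
genes = [(100, 1000)]
exons = [(100, 200), (800, 1000)]

# Toy read alignments as (start, end).
reads = [(150, 200), (190, 240), (400, 450), (1500, 1550)]

labels = Counter(classify_read(s, e, exons, genes) for s, e in reads)
for label, n in labels.items():
    print(label, n)
```

A high intronic share is common for total RNA libraries (unspliced pre-mRNA), while a roughly uniform intergenic background is more suggestive of genomic DNA contamination.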
Since you mentioned Galaxy in the title, any questions specifically about it should be posted on their help forum to get a specific response from their team: https://help.galaxyproject.org/
"I'm performing a meta-analysis across multiple human RNA-seq datasets. Can I proceed with these data as they are, or should I make changes (e.g., filtering rRNA, changing counting method, or excluding samples)? What best practices would you recommend for meta-analysis in this situation?"