I am doing a test run of htseq-count, using the Galaxy tool, on a data set previously used by a member of my lab. I am getting an unusually high number of no_feature results. The raw data contain approximately 26 million forward reads and 26 million reverse reads. The organism is zebrafish, reference genome GRCz10.
The settings were: Mode = Union, Stranded = Yes (the experiment was strand-specific), Minimum Alignment quality = 10, Feature type = Exon, ID Attribute = gene_id, Additional BAM = No, Force Sorting of BAM by name = Yes.
The htseq-count results are:
no_feature 22001714, ambiguous 13105, too_low_aQual 0, not_aligned 0, alignment_not_unique 19024306
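For scale, here is a minimal Python sketch that tabulates the special counters above as shares of their own total and relative to the approximate number of read pairs. The counter values are copied from the output; `APPROX_READ_PAIRS` is my rough figure of 26 million, and the function name `summarize` is just illustrative. The per-gene counts are not included, and I have not verified whether each counter tallies read pairs or individual alignments, so the last column is only a rough comparison.

```python
# Special counters copied from the htseq-count output above.
special_counters = {
    "no_feature": 22_001_714,
    "ambiguous": 13_105,
    "too_low_aQual": 0,
    "not_aligned": 0,
    "alignment_not_unique": 19_024_306,
}

# Approximate number of read pairs in the raw data (assumption: ~26 million).
APPROX_READ_PAIRS = 26_000_000


def summarize(counters, approx_pairs):
    """Return the counter total and, per counter, its share of that total
    and its ratio to the approximate number of read pairs."""
    total = sum(counters.values())
    rows = []
    for name, count in counters.items():
        share = count / total if total else 0.0
        per_pair = count / approx_pairs
        rows.append((name, count, share, per_pair))
    return total, rows


total_special, rows = summarize(special_counters, APPROX_READ_PAIRS)
print(f"total in special counters: {total_special:,}")
for name, count, share, per_pair in rows:
    print(f"{name:<22}{count:>12,}  {share:6.1%} of special  "
          f"{per_pair:6.1%} relative to ~26M pairs")
```

On these numbers, no_feature and alignment_not_unique together account for nearly all of the special-counter total, and their sum exceeds 26 million.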
Can anyone explain why the majority of the reads are being counted as no_feature? Is this normal for zebrafish, or is something going wrong?