Hello,
I am using featureCounts from the Subread package to count third-generation reads produced by Nanopore sequencing (MinION) and mapped to a reference genome. Although overall basecall quality was high and the mapping rate was good (94%), featureCounts only assigned 50% to 60% of the reads.
The largest unassigned group is "Unassigned_NoFeatures", which made me wonder where those reads mapped.
Assigned                        1057725
Unassigned_Unmapped             62207
Unassigned_Read_Type            0
Unassigned_Singleton            0
Unassigned_MappingQuality       0
Unassigned_Chimera              0
Unassigned_FragmentLength       0
Unassigned_Duplicate            0
Unassigned_MultiMapping         0
Unassigned_Secondary            0
Unassigned_NonSplit             0
Unassigned_NoFeatures           457608
Unassigned_Overlapping_Length   0
Unassigned_Ambiguity            283748
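To put numbers on the breakdown, here is a minimal Python sketch that turns the summary above into percentages. The counts are copied from my featureCounts summary; the function name `parse_summary` is just something I made up for illustration, and the percentages are relative to the sum of all categories in the summary.

```python
# Non-zero categories copied from the featureCounts summary above.
summary_text = """\
Assigned 1057725
Unassigned_Unmapped 62207
Unassigned_NoFeatures 457608
Unassigned_Ambiguity 283748
"""


def parse_summary(text):
    """Parse 'Status count' lines into a dict (hypothetical helper)."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip header lines or anything that is not 'name integer'.
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        counts[parts[0]] = int(parts[1])
    return counts


counts = parse_summary(summary_text)
total = sum(counts.values())
fractions = {status: n / total for status, n in counts.items()}

# Print categories from largest to smallest.
for status, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{status:<24}{n:>10}{fractions[status]:>8.1%}")
```

By this calculation roughly 56.8% of the entries are assigned, 24.6% fall under NoFeatures, 15.2% under Ambiguity, and 3.3% are unmapped.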
I used a custom annotation GFF (GENCODE plus custom features) to count the mappings. Does anybody know a tool or a straightforward way (other than checking visually in IGV) to find out where those reads are?
I am especially interested in whether we might have some kind of contamination by genomic DNA.
Any suggestions for QC, tools, or procedures are welcome. Thanks!