I ran an htseq-count command and would like to cross-check it with you. I am also wondering what an ideal output looks like. Below is the command:
htseq-count -f bam --idattr=gene -r pos /home/user/scratch60/STARresults/SRR7059136Aligned.sortedByCoord.out.bam /home/user/scratch60/NCBI_files/GCF_000001405.26_GRCh38_genomic.gff >/home/user/scratch60/HTseq_annotation/annotated_SRR7059136.txt
and this is the output:
12600000 SAM alignment record pairs processed.
Warning: Mate pairing was ambiguous for 22805 records; mate key for first such record: ('SRR7059136.1152992', 'first', 'NC_000001.11', 135867, 'NC_000001.11', 493007, 357290).
12621898 SAM alignment pairs processed.
My questions are:
- Should I be concerned about missing mate encountered warnings? Is there an ideal number one should be aiming for?
- Am I right to run -r pos because my STAR command included --outSAMtype BAM SortedByCoordinate? I'm trying to understand the logic of this; if someone can explain it, it would be much appreciated!
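
For context on the second question, here is a rough sketch of how the sort order could be checked against the -r value. It assumes the BAM header carries an @HD line with an SO: tag (coordinate or queryname, as defined in the SAM specification) and that samtools is installed. The helper name order_flag is made up for illustration and is not part of HTSeq or samtools.

```shell
# Hypothetical helper: reads SAM header text on stdin and prints the
# htseq-count -r value that matches the SO: tag on the @HD line.
order_flag() {
  awk -F'\t' '
    $1 == "@HD" {
      for (i = 2; i <= NF; i++)
        if ($i ~ /^SO:/) so = substr($i, 4)   # strip the "SO:" prefix
    }
    END {
      if (so == "coordinate")      print "pos"      # sorted by position
      else if (so == "queryname")  print "name"     # sorted by read name
      else                         print "unknown"  # unsorted or tag missing
    }'
}

# Usage (assumes samtools is on PATH):
# samtools view -H /home/user/scratch60/STARresults/SRR7059136Aligned.sortedByCoord.out.bam | order_flag
```

If this prints pos, the -r pos in my command should agree with how the file is actually sorted.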