Question: HT-seq - what is ideal?
8 months ago by
junsionglow20 wrote:

Hi guys,

I ran an htseq-count command and would like to cross-check it with you. What output would be ideal? The command is below:

htseq-count -f bam --idattr=gene -r pos /home/user/scratch60/STARresults/SRR7059136Aligned.sortedByCoord.out.bam /home/user/scratch60/NCBI_files/GCF_000001405.26_GRCh38_genomic.gff >/home/user/scratch60/HTseq_annotation/annotated_SRR7059136.txt

and the output is this

12600000 SAM alignment record pairs processed.
Warning: Mate pairing was ambiguous for 22805 records; mate key for first such record: ('SRR7059136.1152992', 'first', 'NC_000001.11', 135867, 'NC_000001.11', 493007, 357290).
12621898 SAM alignment pairs processed.

My questions are:

  1. Should I be concerned about mate-pairing warnings like the one above? Is there an ideal number I should be aiming for?
  2. Am I right to run -r pos because my STAR command included --outSAMtype BAM SortedByCoordinate? I'm trying to understand the logic of this, so an explanation would be much appreciated!

Thanks guys!

8 months ago by
Devon Ryan89k
Freiburg, Germany
Devon Ryan89k wrote:
  1. In an ideal world it'd be 0, but losing <1% won't affect anything.
  2. Sure, though you can just have STAR quantify things for you and not have to wait as long.
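
To put point 1 in numbers, here is a quick check using the two figures from the htseq-count log in the question. It treats the 22805 flagged records as a share of the 12621898 processed pairs, which is only a rough approximation because the log reports records in one line and pairs in the other. Under that assumption the share comes out at roughly 0.18%, well under the 1% mentioned above.

```python
# Figures copied from the htseq-count log in the question.
ambiguous_records = 22805
pairs_processed = 12621898

# Rough share of the data affected by the ambiguous-mate warning
# (assumption: records and pairs are treated as comparable units).
fraction = ambiguous_records / pairs_processed

print(f"{fraction:.2%} of processed pairs were flagged")
print("below 1%:", fraction < 0.01)
```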

In general HTSeq-count isn't much used these days because it's quite slow. Either have STAR do the counting for you or use featureCounts and you'll get the results quicker.
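
If you go the STAR route (its `--quantMode GeneCounts` option), the counts land in a tab-separated `ReadsPerGene.out.tab` file: four summary rows starting with `N_`, then one row per gene with unstranded counts in the second column and the two stranded variants in the third and fourth. Below is a minimal sketch of reading such a table. The sample rows and gene names are made up for illustration, and `read_star_counts` is a hypothetical helper, not part of STAR or HTSeq.

```python
import csv
import io

# Made-up example in the layout of STAR's ReadsPerGene.out.tab:
# column 1 = gene ID, column 2 = unstranded counts,
# columns 3 and 4 = counts for the two stranded protocols.
SAMPLE = """N_unmapped\t100\t100\t100
N_multimapping\t50\t50\t50
N_noFeature\t30\t400\t35
N_ambiguous\t5\t1\t2
GENE_A\t120\t3\t117
GENE_B\t0\t0\t0
GENE_C\t42\t1\t41
"""


def read_star_counts(handle, column=1):
    """Return (summary, counts) dicts from a ReadsPerGene-style table.

    column is a 0-based index: 1 = unstranded, 2 or 3 = stranded counts.
    """
    summary, counts = {}, {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        name, value = row[0], int(row[column])
        # Assumption: only the summary rows have IDs starting with "N_".
        if name.startswith("N_"):
            summary[name] = value
        else:
            counts[name] = value
    return summary, counts


summary, counts = read_star_counts(io.StringIO(SAMPLE))
print("ambiguous:", summary["N_ambiguous"])
print("GENE_A:", counts["GENE_A"])
```

For a real run you would pass an open file handle instead of the `io.StringIO` sample, and pick the column that matches your library's strandedness.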


Thanks Devon! That makes sense.

written 8 months ago by junsionglow20