Question: HT-seq - what is ideal?
14 months ago, junsionglow wrote:

Hi guys,

I ran an htseq-count command and would like to cross-check it with you. What would ideal output look like? Here is the command:

htseq-count -f bam --idattr=gene -r pos /home/user/scratch60/STARresults/SRR7059136Aligned.sortedByCoord.out.bam /home/user/scratch60/NCBI_files/GCF_000001405.26_GRCh38_genomic.gff >/home/user/scratch60/HTseq_annotation/annotated_SRR7059136.txt

and the output is this

12600000 SAM alignment record pairs processed.
Warning: Mate pairing was ambiguous for 22805 records; mate key for first such record: ('SRR7059136.1152992', 'first', 'NC_000001.11', 135867, 'NC_000001.11', 493007, 357290).
12621898 SAM alignment pairs processed.

My questions are:

  1. Should I be concerned about mate-pairing warnings like the one above? Is there an ideal number one should be aiming for?
  2. Am I right to run -r pos because my STAR command included --outSAMtype BAM SortedByCoordinate? I'm trying to understand the logic of this, so if someone can explain it, it would be much appreciated!

Thanks guys!
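
On the second question: for paired-end data htseq-count has to match each read with its mate. In name-sorted input the mates sit next to each other; in position-sorted input it has to hold reads in memory until the mate turns up, so `-r` tells it which case applies (`name` or `pos`). A rough way to check which value fits is to read the sort order from the BAM header. The sketch below assumes the header carries an `@HD` line with an `SO:` tag, which STAR's coordinate-sorted output should have; the helper name `htseq_order_from_header` is made up for illustration.

```shell
# Print the htseq-count -r value that matches a SAM/BAM header read from stdin.
# Assumption: the header has an @HD line with an SO: tag.
htseq_order_from_header() {
  awk -F'\t' '
    $1 == "@HD" {
      for (i = 2; i <= NF; i++) {
        if ($i == "SO:coordinate") { print "pos";  found = 1 }
        if ($i == "SO:queryname")  { print "name"; found = 1 }
      }
    }
    END { if (!found) print "unknown" }
  '
}

# Usage (needs samtools; the path is the one from the command above):
# samtools view -H /home/user/scratch60/STARresults/SRR7059136Aligned.sortedByCoord.out.bam | htseq_order_from_header
```

If this prints `pos`, then `-r pos` matches the file; `unknown` means the header does not state a sort order and the file should be checked or re-sorted.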

14 months ago, Devon Ryan (Freiburg, Germany) wrote:
  1. In an ideal world it'd be 0, but losing <1% won't meaningfully affect the results.
  2. Sure, though you can just have STAR quantify things for you and not have to wait as long.
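
To put the "<1%" rule of thumb next to the numbers in the question, the fraction can be worked out from the two figures htseq-count printed. This is only a sketch: it treats each flagged record as one pair, which is an assumption about how the warning counts things.

```shell
# Figures copied from the htseq-count output in the question.
ambiguous=22805     # records with ambiguous mate pairing
total=12621898      # alignment pairs processed

# Percentage of pairs affected, to two decimal places.
pct=$(awk -v a="$ambiguous" -v t="$total" 'BEGIN { printf "%.2f", 100 * a / t }')
echo "Ambiguous mate pairing: ${pct}% of pairs"
```

Under that assumption the result is well below the 1% mark.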

In general, htseq-count isn't used much these days because it's quite slow. Either have STAR do the counting for you or use featureCounts, and you'll get the results more quickly.
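
As a rough illustration of the featureCounts route, here is a sketch that mirrors the htseq-count call from the question. The input paths are copied from that command; the output file name and the thread count are hypothetical, and the choice of `-t exon` with `-g gene` assumes the GFF has exon features carrying a `gene` attribute (as `--idattr=gene` suggests).

```shell
# Inputs copied from the htseq-count command in the question.
bam=/home/user/scratch60/STARresults/SRR7059136Aligned.sortedByCoord.out.bam
gff=/home/user/scratch60/NCBI_files/GCF_000001405.26_GRCh38_genomic.gff
out=featureCounts_SRR7059136.txt   # hypothetical output file name

# -p: paired-end input; -t exon: count over exon features;
# -g gene: summarise by the "gene" attribute (mirrors --idattr=gene);
# -T 4: four threads (arbitrary choice).
# Note: recent Subread releases also want --countReadPairs to count fragments;
# check `featureCounts -v` and the help text for your version.
run_featurecounts() {
  featureCounts -p -t exon -g gene -T 4 -a "$gff" -o "$out" "$bam"
}

# Only run when the tool and both inputs are actually available.
if command -v featureCounts >/dev/null 2>&1 && [ -r "$bam" ] && [ -r "$gff" ]; then
  run_featurecounts
else
  echo "featureCounts or input files not found; skipping"
fi
```

The STAR route is the `--quantMode GeneCounts` option, which needs the annotation supplied to STAR and writes a `ReadsPerGene.out.tab` file alongside the alignments.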


Thanks Devon! That makes sense.

(reply written 14 months ago by junsionglow)