Htseq-count output file having a high number of __not_aligned

19 months ago

Nemo • 0

I aligned my RNA-seq reads to the human genome GRCh38.p13 and am now using htseq-count to count the reads per gene. This is the command I am using:

htseq-count ./sorted-bams/f.bam ./gencode.v41.chr_patch_hapl_scaff.annotation.gff3.gz >  ./htseq/f.txt

In the output file, I got:

__no_feature    2280226
__ambiguous 244
__too_low_aQual 3761161
__not_aligned   34990259
__alignment_not_unique  0

Are these numbers reasonable?


Written 19 months ago by Nemo • updated 19 months ago by Shred
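
One way to put the counters in the question in proportion is to express each one as a share of the special-counter total. The following is a minimal sketch with awk, assuming the output file has the two-column name/count layout shown in the question. `summarize_special` is a hypothetical helper name. Because the per-gene rows are not shown in the post, the shares are relative to the five `__` counters only, not to all reads.

```shell
# Hypothetical helper: reads "name count" lines on stdin and prints every
# "__" counter with its share of the summed "__" counters.
summarize_special() {
  awk '$1 ~ /^__/ { name[++n] = $1; count[n] = $2; total += $2 }
       END {
         for (i = 1; i <= n; i++) {
           share = (total > 0) ? 100 * count[i] / total : 0
           printf "%s\t%d\t%.1f%%\n", name[i], count[i], share
         }
       }'
}

# Counters copied from the question; gene rows are not available here.
summarize_special <<'EOF'
__no_feature    2280226
__ambiguous 244
__too_low_aQual 3761161
__not_aligned   34990259
__alignment_not_unique  0
EOF
```

With the posted numbers, `__not_aligned` makes up most of the special-counter total, which suggests looking at the alignment step before the annotation.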

How are your reads distributed in terms of length, and what is their quality? Which software did you use to align them, and with which parameters? Without these details there is nothing to go on to investigate the cause.

Reply by Shred • 19 months ago

Thanks, Shred, for your response. Regarding software, I am using GATK DRAGEN Map for alignment. The command I am using does not set any specific parameters:

dragen-os -r /human -1 R1 -2 R2 > /samFiles/sample.sam

I'm not sure how to get the other information you asked for. Maybe with samtools view?

Reply by Nemo • 19 months ago
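
For the information asked about in this exchange, samtools can report most of it directly from the alignment file. This is a rough sketch, assuming samtools is installed and on the PATH. `alignment_overview` is a hypothetical helper name, and the MAPQ cutoff of 10 is chosen to match htseq-count's default `-a` threshold.

```shell
# Hypothetical helper: prints basic alignment statistics for one SAM/BAM file.
alignment_overview() {
  aln="$1"

  # Totals for mapped, unmapped, and properly paired reads.
  samtools flagstat "$aln"

  # Records flagged as unmapped (SAM flag 0x4).
  unmapped=$(samtools view -c -f 4 "$aln")
  # Mapped records, in total and with MAPQ >= 10.
  mapped=$(samtools view -c -F 4 "$aln")
  mapped_q10=$(samtools view -c -F 4 -q 10 "$aln")

  echo "unmapped records: $unmapped"
  echo "mapped records with MAPQ < 10: $((mapped - mapped_q10))"

  # Read length distribution: the RL rows of samtools stats (length, count).
  echo "read length, count:"
  samtools stats "$aln" | grep '^RL' | cut -f 2-
}

# Example call, using the file name from the alignment command in this thread:
# alignment_overview /samFiles/sample.sam
```

The unmapped count can be compared with `__not_aligned`, and the low-MAPQ count with `__too_low_aQual`, to see whether htseq-count is simply reporting what the aligner wrote.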

I'm not sure how DRAGEN writes the alignment file, or whether there are any incompatibilities with quantification software such as htseq-count. Why not follow the Illumina protocol for the quantification as well?

Before doing any kind of analysis, you need to check the quality of the sequencing files. Run FastQC on them to see the read length distribution, sequencing quality, and other features.

Reply by Shred • 19 months ago
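
The FastQC check suggested in this reply could be run along these lines. This is only a sketch, assuming FastQC is installed and on the PATH. `run_fastqc`, the report directory, and the FASTQ file names are placeholders, since the thread does not give the real paths.

```shell
# Hypothetical helper: runs FastQC on the given FASTQ files and writes the
# reports to the directory passed as the first argument.
run_fastqc() {
  outdir="$1"
  shift
  # FastQC expects the output directory to exist already.
  mkdir -p "$outdir"
  fastqc --outdir "$outdir" "$@"
}

# Example call with placeholder file names for the two read files:
# run_fastqc ./fastqc-reports R1.fastq.gz R2.fastq.gz
```

Each input file gets its own HTML report, which includes the sequence length distribution and per-base quality plots.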