HTSeq-count output file sizes are always 10 MB
8.0 years ago

Hi.

As the post title says, when I run this command:

python -m HTSeq.scripts.count -f bam -s no -i ID -t exon -r name ./name_sorted.bam ./Ppersica_298_v2.1.gene_exons.gff3 > ./Counts

the output file size is always 10 MB. I have done this at least 5 times in a row, using the same command on different Prunus persica BAM files (TopHat2 output from RNA-Seq data) that were name-sorted with samtools.

It's weird that all the files have the same size, as if HTSeq-count had a size limit for its output. Or maybe my command is wrong?

RNA-Seq HTSeq-Count Python • 2.1k views
8.0 years ago

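If you're counting reads per feature against the same annotation file, the output will always list the same features, so every output file has the same number of lines.

Only the counts column differs between samples, and a count takes up more characters only when it gains a digit. The total number of reads in the input would have to differ by an order of magnitude to have any noticeable impact on the file size.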


Exactly. Every feature in your annotation file (GFF3 in your case) will be present in the output of htseq-count. But for a lot of the exons in there, the count will be 0.
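
If you want to convince yourself that two equally sized count files really hold different data, a quick check along these lines may help. This is only a minimal sketch: the function names `read_counts` and `compare_counts` and the tiny inline samples are made up for illustration, and it assumes the usual two-column, tab-separated htseq-count output, with the summary rows at the end prefixed by `__` (for example `__no_feature`).

```python
import io


def read_counts(handle):
    """Parse htseq-count style output: feature ID, a tab, then the count."""
    counts = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        parts = line.split("\t")
        # First column is the feature ID, last column is the count.
        counts[parts[0]] = int(parts[-1])
    return counts


def compare_counts(a, b):
    """Compare two count tables, ignoring the '__' summary rows."""
    features_a = {k for k in a if not k.startswith("__")}
    features_b = {k for k in b if not k.startswith("__")}
    shared = features_a & features_b
    return {
        "same_features": features_a == features_b,
        "differing_counts": sum(1 for k in shared if a[k] != b[k]),
        "zero_in_a": sum(1 for k in features_a if a[k] == 0),
        "zero_in_b": sum(1 for k in features_b if b[k] == 0),
    }


# Made-up data standing in for two real output files (hypothetical values).
sample_a = "exon1\t0\nexon2\t15\nexon3\t0\n__no_feature\t120\n__ambiguous\t3\n"
sample_b = "exon1\t7\nexon2\t15\nexon3\t0\n__no_feature\t95\n__ambiguous\t1\n"

result = compare_counts(
    read_counts(io.StringIO(sample_a)),
    read_counts(io.StringIO(sample_b)),
)
print(result)
```

For real data you would replace the `io.StringIO(...)` objects with open file handles for your own count files. If `same_features` is true and `differing_counts` is greater than zero, the files list the same features but carry different counts, which is the expected situation.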


So my file sizes are fine. Thanks a lot for the quick answer.
