Question

High no feature counts in ht-seq counts

0

6.6 years ago

eozcan ▴ 10

Hi,

I know this question has been posted before, but I could not find answers on how to increase my counts for RNA-seq reads. I did RNA-seq on an Illumina NextSeq and prepared the libraries with the NEBNext Ultra II Directional library kit. I used bowtie2 to map my reads to the reference genome, which is a bacterial genome. I have alignment rates above 95% in my samples. I converted the SAM files to BAM and name-sorted them. Then I used htseq-count to get the RNA-seq counts. Here is my command line:

-o eo8_FP1_2_counts.txt htseq-count -f bam -t gene -i ID --stranded=reverse eo8_FP1_2_trimmed_sorted.bam ../Lplantarum.gff3


__no_feature            3587262
__ambiguous             39161
__too_low_aQual         15356
__not_aligned           35528
__alignment_not_unique  0

This is for a sample in which 5128080 reads (100.00%) were paired, with a 99.24% overall alignment rate. It seems like I am losing half of my reads.
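
To put the counters above in proportion, here is a minimal sketch that expresses each one as a fraction of the reported read pairs. It assumes (this is not stated in the post) that htseq-count counted each read pair once and that the 5128080 pairs reported by bowtie2 are the total it processed; the function name is illustrative only.

```python
# Special counters copied from the htseq-count output quoted above.
special_counters = {
    "__no_feature": 3587262,
    "__ambiguous": 39161,
    "__too_low_aQual": 15356,
    "__not_aligned": 35528,
    "__alignment_not_unique": 0,
}

# Read pairs reported by bowtie2 for this sample (taken from the post).
total_pairs = 5128080


def fraction_of_total(counters, total):
    """Return each counter as a fraction of the total number of read pairs."""
    return {name: count / total for name, count in counters.items()}


fractions = fraction_of_total(special_counters, total_pairs)
for name, frac in fractions.items():
    print(f"{name}\t{frac:.1%}")

# Assumption: whatever is not in a special counter was assigned to a feature.
assigned_pairs = total_pairs - sum(special_counters.values())
print(f"assigned (by subtraction)\t{assigned_pairs / total_pairs:.1%}")
```

Under those assumptions, `__no_feature` accounts for roughly 70% of the pairs, so the loss is somewhat larger than half.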

What should I do to improve my ht-seq counts?

Thanks.

RNA-Seq sequencing • 3.0k views

1

And the annotation file matches your fasta genome?

ADD REPLY • link 6.6 years ago by WouterDeCoster 48k
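
One way to check the question above is to compare the sequence names in column 1 of the GFF3 file with the names in the FASTA headers. The sketch below is only an illustration: the helper names and the toy records are hypothetical, and it assumes the aligner uses the first whitespace-delimited token of each FASTA header as the reference name.

```python
def gff_seqids(gff_lines):
    """Collect sequence names from column 1 of GFF3 feature lines."""
    seqids = set()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section ends the feature table
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        seqids.add(line.split("\t", 1)[0])
    return seqids


def fasta_seqids(fasta_lines):
    """Collect the first token of every FASTA header line."""
    return {
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and line[1:].strip()
    }


# Toy input with hypothetical names, standing in for the real files.
toy_gff = [
    "##gff-version 3\n",
    "chrom_hypothetical\tsrc\tgene\t1\t900\t.\t+\t.\tID=gene0\n",
]
toy_fasta = [">chrom_hypothetical some description\n", "ACGT\n"]

only_in_gff = gff_seqids(toy_gff) - fasta_seqids(toy_fasta)
only_in_fasta = fasta_seqids(toy_fasta) - gff_seqids(toy_gff)
print("only in GFF:", sorted(only_in_gff))
print("only in FASTA:", sorted(only_in_fasta))
```

With real files, the lists can be replaced by open file handles. Any name that appears in only one of the two sets points to a mismatch between annotation and reference.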

0

I downloaded them both from NCBI, from the same genome.

ADD REPLY • link 6.6 years ago by eozcan ▴ 10

0

A bacterial genome should be gene dense (and you would not need to worry about splicing). Have you reviewed the alignment in a genome browser?

ADD REPLY • link 6.6 years ago by GenoMax 152k

0

I have not. Which genome browser would you recommend, and what should I check when I review it?

ADD REPLY • link 6.6 years ago by eozcan ▴ 10

0

Use the Integrative Genomics Viewer (IGV, from the Broad Institute). You will need to create a custom genome for your bacterium, for which you will need a FASTA-format genome sequence file and a GFF file to go with it. Once you open the alignment, make sure reads are mapping under the genes.

ADD REPLY • link 6.6 years ago by GenoMax 152k

0

You confirmed with the people who made the library that the prep was stranded?

ADD REPLY • link 6.6 years ago by swbarnes2 15k

0

Quote from original question:

I prepared libraries with NEBNext Ultra II Directional library kit.

ADD REPLY • link 6.6 years ago by GenoMax 152k

0

I made the libraries myself, and they are stranded.

ADD REPLY • link 6.6 years ago by eozcan ▴ 10

0

Your command line is broken (incomplete), but apparently you didn't specify the stranded option for htseq-count.

ADD REPLY • link 6.6 years ago by h.mon 35k

0

The stranded option is set to reverse; I specified it, and it is there. What else is incomplete? I did not understand.

ADD REPLY • link 6.6 years ago by eozcan ▴ 10

0

[Update] My sorted BAM files were name-sorted, and htseq-count somehow does not like that kind of sort for my samples. So I reran htseq-count with coordinate-sorted BAM files; however, that did not improve the no-feature counts. I made some additions to my command line, as below:

-o eo8_sortedcoordinate_counts.txt htseq-count -f bam -t gene -i ID --stranded=reverse --nonunique=all --mode=union -a=0 --order=name eo8_FP1_2_trimmed.sorted.bam ../Lplantarum.gff3

Are there any comments on how to reduce the high no-feature count?

ADD REPLY • link updated 6.6 years ago by WouterDeCoster 48k • written 6.6 years ago by eozcan ▴ 10
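
The update above describes coordinate-sorted BAM files, while the quoted command passes `--order=name`. As a minimal sketch, the sort order recorded in the alignment header can be mapped to the value htseq-count expects for `--order` (`name` or `pos`). The header text is assumed to come from something like `samtools view -H`; the function name and the example header are hypothetical, and a header without an `SO:` tag gives no answer.

```python
def htseq_order_from_header(header_text):
    """Map the SO tag on the @HD header line to an htseq-count --order value.

    Returns "name", "pos", or None when the header does not declare
    a queryname or coordinate sort.
    """
    so_to_order = {"queryname": "name", "coordinate": "pos"}
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return so_to_order.get(field[3:])
    return None


# Hypothetical header text for illustration only.
example_header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chrom_hypothetical\tLN:1000"
print(htseq_order_from_header(example_header))
```

If the value returned here disagrees with the `--order` option on the command line, paired-end mates may not be matched up as intended.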

0

Have you looked at the alignments in a genome browser as I had suggested above?

ADD REPLY • link 6.6 years ago by GenoMax 152k

0

[Image: IGV snapshot]

Does this mean they are aligning to the genome?

ADD REPLY • link updated 6.6 years ago by GenoMax 152k • written 6.6 years ago by eozcan ▴ 10

0

I am not sure if you can see the image. But it is aligning to the genes.

ADD REPLY • link 6.6 years ago by eozcan ▴ 10

0

Please use this to post images properly: How to add images to a Biostars post

Looking at the images it does look like you have reads aligning to all regions so not sure why they are not being counted. Can you use "unstranded" counting option to see if you are able to get read counts to go up?

ADD REPLY • link 6.6 years ago by GenoMax 152k
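
For context on the stranded versus unstranded suggestion above, here is a minimal sketch of how the `--stranded` setting decides which feature strand a paired-end alignment is compared against, following the rule in the htseq-count documentation (with `yes`, the first mate must be on the feature's strand and the second mate on the opposite strand; `reverse` swaps this; `no` ignores strand). The function name is hypothetical and this is not HTSeq's own code.

```python
def feature_strand_for_alignment(read_strand, is_first_mate, stranded):
    """Return the feature strand ('+' or '-') an alignment can count towards.

    Returns None when strand is ignored (stranded == "no").
    """
    if stranded == "no":
        return None
    flip = {"+": "-", "-": "+"}
    # Under "yes": first mate matches the feature strand, second mate is flipped.
    strand = read_strand if is_first_mate else flip[read_strand]
    if stranded == "reverse":
        strand = flip[strand]
    return strand


# A first mate aligned to '+' counts towards '-' strand genes under "reverse".
print(feature_strand_for_alignment("+", True, "reverse"))
```

If switching between `reverse` and `no` leaves the counts unchanged, as reported in the reply that follows, strand assignment is unlikely to be the cause of the high no-feature count.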

0

I tried that; it does not change or improve my read counts.

ADD REPLY • link 6.6 years ago by eozcan ▴ 10

0

The only thing I can think of is that the annotation file you downloaded is either not in the correct format or has some other errors in it.

ADD REPLY • link 6.6 years ago by GenoMax 152k
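
A quick way to look for format problems of the kind suggested above is to summarize the annotation file: which feature types appear in column 3, and how many features of the counted type carry the attribute used as the identifier (the command in this thread uses `-t gene -i ID`). This is a sketch with hypothetical helper names and toy records, not a full GFF3 validator.

```python
from collections import Counter


def summarize_gff(gff_lines, feature_type="gene", id_attribute="ID"):
    """Count feature types, malformed lines, and target features with the ID attribute."""
    type_counts = Counter()
    malformed = 0
    with_attribute = 0
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section ends the feature table
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            malformed += 1  # GFF3 feature lines have nine tab-separated columns
            continue
        type_counts[cols[2]] += 1
        if cols[2] == feature_type:
            keys = {f.split("=", 1)[0] for f in cols[8].split(";") if "=" in f}
            if id_attribute in keys:
                with_attribute += 1
    return type_counts, malformed, with_attribute


# Toy records with hypothetical names, standing in for the real annotation.
toy_gff = [
    "##gff-version 3\n",
    "chrom_hypothetical\tsrc\tgene\t1\t900\t.\t+\t.\tID=gene0;Name=abc\n",
    "chrom_hypothetical\tsrc\tCDS\t1\t900\t.\t+\t0\tID=cds0;Parent=gene0\n",
    "chrom_hypothetical src gene 1000 1500\n",  # spaces instead of tabs
]
type_counts, malformed, with_attribute = summarize_gff(toy_gff)
print(dict(type_counts), "malformed:", malformed, "genes with ID:", with_attribute)
```

A file with few or no `gene` lines, many malformed lines, or `gene` lines lacking the chosen attribute would be consistent with most reads ending up as no-feature.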

0

I tried another annotation file and FASTA file from Ensembl (the previous ones were from NCBI) to see whether there is a problem with the annotation file. I got lower counts and a higher no-feature count.

ADD REPLY • link 6.6 years ago by eozcan ▴ 10

0

I think this is a point where access to your data/analysis may be needed to figure out what is going on. A forum is not the best place to do that.

Perhaps someone else may be along with suggestions on other things to try.

ADD REPLY • link 6.6 years ago by GenoMax 152k