Question: High no-feature counts in htseq-count
eozcan wrote, 5 weeks ago:

Hi,

I know this question has been posted before, but I could not find answers on how to increase my counts for RNA-seq reads. I did RNA-seq on an Illumina NextSeq. I prepared libraries with the NEBNext Ultra II Directional library kit. I used bowtie2 to map my reads to the reference genome, which is a bacterial genome. I have alignment rates above 95% in my samples. I converted the SAM file to a BAM file and sorted it by name. Then I used htseq-count to get the RNA-seq counts. Here is my command line:

-o eo8_FP1_2_counts.txt htseq-count -f bam -t gene -i ID --stranded=reverse eo8_FP1_2_trimmed_sorted.bam ../Lplantarum.gff3


__no_feature	3587262
__ambiguous	39161
__too_low_aQual	15356
__not_aligned	35528
__alignment_not_unique	0

This is for a sample in which 5128080 (100.00%) reads were paired, with a 99.24% overall alignment rate. It seems like I am losing at least half of my reads.
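As a rough check of the arithmetic, here is a small Python sketch. It assumes that the 5128080 pairs reported by bowtie2 is also the total that htseq-count processed (an assumption; the per-gene counts are not shown in this post). Under that assumption, about 70% of the pairs end up in `__no_feature`.

```python
# Summary lines copied from the htseq-count output above.
summary = {
    "__no_feature": 3587262,
    "__ambiguous": 39161,
    "__too_low_aQual": 15356,
    "__not_aligned": 35528,
    "__alignment_not_unique": 0,
}

# Assumption: htseq-count saw the same 5128080 read pairs that bowtie2 reported.
total_pairs = 5128080

unassigned = sum(summary.values())
assigned = total_pairs - unassigned  # pairs that were counted for some gene


def percent(n, total=total_pairs):
    """Share of the total, as a percentage."""
    return 100.0 * n / total


for name, n in summary.items():
    print(f"{name:<24}{n:>10}  {percent(n):6.2f}%")
print(f"{'assigned to a gene':<24}{assigned:>10}  {percent(assigned):6.2f}%")
```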

What should I do to improve my htseq-count results?

Thanks.

sequencing rna-seq • 222 views
modified 5 weeks ago • written 5 weeks ago by eozcan

And the annotation file matches your fasta genome?

written 5 weeks ago by WouterDeCoster

I downloaded them both from NCBI, from the same genome.

written 5 weeks ago by eozcan
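Beyond both files coming from the same source, one thing that can be checked directly is whether the sequence names agree: htseq-count matches alignments to features by reference name, so the FASTA headers used to build the bowtie2 index need to match column 1 of the GFF. A minimal sketch in Python follows; the toy IDs are hypothetical and only illustrate the comparison.

```python
def fasta_ids(lines):
    """Sequence IDs from FASTA header lines (text after '>' up to first whitespace)."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and len(line.strip()) > 1
    }


def gff_seqids(lines):
    """Sequence IDs from column 1 of a GFF3 file, skipping comments and blanks."""
    ids = set()
    for line in lines:
        if line.startswith("##FASTA"):  # embedded sequence section, no more features
            break
        if not line.strip() or line.startswith("#"):
            continue
        ids.add(line.rstrip("\n").split("\t", 1)[0])
    return ids


def compare_ids(fasta_lines, gff_lines):
    """Report which sequence IDs are shared and which appear in only one file."""
    fa, gff = fasta_ids(fasta_lines), gff_seqids(gff_lines)
    return {"only_in_fasta": fa - gff, "only_in_gff": gff - fa, "shared": fa & gff}


# Toy input with hypothetical IDs; for real files, pass open file handles instead.
fasta = [">chr1 some description\n", "ACGT\n", ">plasmid1\n", "ACGT\n"]
gff = [
    "##gff-version 3\n",
    "chr1\tsrc\tgene\t1\t4\t.\t+\t.\tID=gene0\n",
    "pA\tsrc\tgene\t1\t4\t.\t+\t.\tID=gene1\n",
]
result = compare_ids(fasta, gff)
print(result)
```

Any ID that shows up under `only_in_fasta` or `only_in_gff` would mean reads on that sequence cannot be assigned to a feature.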

A bacterial genome should be gene dense (and you would not need to worry about splicing). Have you reviewed the alignment in a genome browser?

written 5 weeks ago by genomax
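The "gene dense" expectation can be turned into a number by measuring what fraction of the genome the `gene` features in the GFF cover. The sketch below is a simplification that treats the genome as a single sequence and ignores strand; the intervals and genome length in the example are made up for illustration.

```python
def merged_length(intervals):
    """Bases covered by 1-based inclusive intervals, counting overlaps once."""
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end + 1:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered


def gene_density(gff_lines, genome_length, feature_type="gene"):
    """Fraction of the genome covered by features of the given type."""
    intervals = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[2] == feature_type:
            intervals.append((int(cols[3]), int(cols[4])))
    return merged_length(intervals) / genome_length


# Hypothetical toy annotation on a 1000 bp sequence.
toy_gff = [
    "chr1\tsrc\tgene\t1\t100\t.\t+\t.\tID=g1\n",
    "chr1\tsrc\tgene\t51\t150\t.\t-\t.\tID=g2\n",
    "chr1\tsrc\tgene\t301\t400\t.\t+\t.\tID=g3\n",
]
density = gene_density(toy_gff, genome_length=1000)
print(density)
```

A low value for a bacterial annotation would point at the annotation file rather than at the alignments.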

I have not. Which genome browser would you recommend, and what should I check when I review it?

written 5 weeks ago by eozcan

Use the Integrative Genomics Viewer (IGV, from the Broad Institute). You will need to create a custom genome for your bacterium, for which you will need a FASTA-format genome sequence file and a GFF file to go with it. Once you open the alignment, make sure reads are mapping under the genes.

written 5 weeks ago by genomax

You confirmed with the people who made the library that the prep was stranded?

modified 5 weeks ago • written 5 weeks ago by swbarnes2

Quote from original question:

I prepared libraries with NEBNext Ultra II Directional library kit.

modified 5 weeks ago • written 5 weeks ago by genomax

I made the libraries myself, and they are stranded.

written 5 weeks ago by eozcan

Your command line is broken (incomplete), but apparently you didn't specify the stranded option for htseq-count.

written 5 weeks ago by h.mon

The stranded option is reverse; I specified it, it is there. I didn't understand what else is incomplete.

written 5 weeks ago by eozcan

[Update] My sorted BAM files were name sorted, and htseq-count somehow does not like that kind of sort for my samples. So I reran htseq-count with coordinate-sorted BAM files; however, that did not improve the no-feature counts. I also added some options to my command line, as below:

-o eo8_sortedcoordinate_counts.txt htseq-count -f bam -t gene -i ID --stranded=reverse --nonunique=all --mode=union -a=0 --order=name eo8_FP1_2_trimmed.sorted.bam ../Lplantarum.gff3

Are there any suggestions on how to reduce the high no-feature count?

modified 5 weeks ago by WouterDeCoster • written 5 weeks ago by eozcan
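One detail that may be worth double-checking in the updated command: it passes `--order=name` while the BAM is described as coordinate sorted. htseq-count's `--order` option takes `name` or `pos` and is meant to describe how the input is sorted. Below is a small Python sketch that reads the sort order declared in a BAM header (the `SO:` tag on the `@HD` line) using only the standard library. The in-memory "BAM" is a tiny stand-in built for the example, not real data, and a header can be missing the tag or be out of date, so treat the result as a hint.

```python
import gzip
import io
import struct


def bam_sort_order(path_or_file):
    """Return the SO value from the @HD header line of a BAM file, or None."""
    # BAM is BGZF-compressed, which Python's gzip module can read.
    with gzip.open(path_or_file, "rb") as bam:
        if bam.read(4) != b"BAM\x01":
            raise ValueError("not a BAM file")
        (l_text,) = struct.unpack("<I", bam.read(4))  # length of the header text
        text = bam.read(l_text).decode("utf-8", errors="replace").rstrip("\x00")
    for line in text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


# Mapping from the header's sort order to the htseq-count --order value.
ORDER_FLAG = {"coordinate": "pos", "queryname": "name"}

# Tiny in-memory stand-in for a BAM header (illustration only).
header = "@HD\tVN:1.6\tSO:coordinate\n"
fake_bam = gzip.compress(
    b"BAM\x01" + struct.pack("<I", len(header)) + header.encode()
)
so = bam_sort_order(io.BytesIO(fake_bam))
print(so, "-> --order=" + ORDER_FLAG.get(so, "?"))
```

For a real file, the function can be given the path directly, for example `bam_sort_order("eo8_FP1_2_trimmed.sorted.bam")`.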

Have you looked at the alignments in a genome browser as I had suggested above?

written 5 weeks ago by genomax

Does this mean they are aligning to the genome?

[Image: IGV snapshot]

modified 5 weeks ago by genomax • written 5 weeks ago by eozcan

I am not sure if you can see the image. But it is aligning to the genes.

modified 5 weeks ago • written 5 weeks ago by eozcan

To post images properly, please see: How to add images to a Biostars post

Looking at the images, it does look like you have reads aligning to all regions, so I am not sure why they are not being counted. Can you use the "unstranded" counting option to see if the read counts go up?

modified 5 weeks ago • written 5 weeks ago by genomax

I tried that; it does not change or improve my counts.

written 5 weeks ago by eozcan

The only thing I can think of is that the annotation file you downloaded is either not in the correct format or has some other errors in it.

written 5 weeks ago by genomax
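A quick way to look for that kind of problem is to tally what the GFF actually contains. With `-t gene -i ID`, htseq-count only uses rows whose type column is `gene` and groups them by their `ID` attribute, so it helps to see how many such rows exist, which strands they are on, and whether any lack an `ID`. The Python sketch below does this; the toy lines are hypothetical and stand in for the real file.

```python
from collections import Counter


def summarize_gff(lines, feature_type="gene", id_attr="ID"):
    """Count feature types, strands of the chosen type, and rows missing id_attr."""
    types = Counter()
    strands = Counter()
    missing_attr = 0
    for line in lines:
        if line.startswith("##FASTA"):  # embedded sequences follow, stop here
            break
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:  # not a well-formed GFF3 feature row
            continue
        types[cols[2]] += 1
        if cols[2] == feature_type:
            strands[cols[6]] += 1
            keys = {kv.split("=", 1)[0] for kv in cols[8].split(";") if kv}
            if id_attr not in keys:
                missing_attr += 1
    return types, strands, missing_attr


# Hypothetical toy annotation; for a real file, pass an open file handle.
toy = [
    "##gff-version 3\n",
    "chr1\tsrc\tgene\t1\t100\t.\t+\t.\tID=gene0;Name=abc\n",
    "chr1\tsrc\tCDS\t1\t100\t.\t+\t0\tID=cds0;Parent=gene0\n",
    "chr1\tsrc\tgene\t200\t300\t.\t-\t.\tName=noid\n",
]
types, strands, missing = summarize_gff(toy)
print(dict(types), dict(strands), missing)
```

If the `gene` count is much lower than expected for the organism, or many rows lack the attribute, that would be consistent with a formatting problem in the annotation.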

I tried another annotation file and FASTA file from Ensembl (the previous ones were from NCBI) to see whether there is a problem with the annotation file. I got lower counts and a higher no-feature count.

modified 5 weeks ago • written 5 weeks ago by eozcan

I think this is a point where access to your data/analysis may be needed to figure out what is going on. A forum is not the best place to do that.

Perhaps someone else may be along with suggestions on other things to try.

written 5 weeks ago by genomax