Question: High no-feature counts in htseq-count
4 months ago, eozcan wrote:

Hi,

I know this question has been posted before, but I could not find an answer on how to increase my counts for RNA-seq reads. I sequenced RNA on an Illumina NextSeq, with libraries prepared using the NEBNext Ultra II Directional library kit. I used bowtie2 to map the reads to the reference genome, which is a bacterial genome, and I get alignment rates above 95% in my samples. I converted the SAM file to BAM and name-sorted it. Then I used htseq-count to get the RNA-seq counts. Here is my command line:

-o eo8_FP1_2_counts.txt htseq-count -f bam -t gene -i ID --stranded=reverse eo8_FP1_2_trimmed_sorted.bam ../Lplantarum.gff3


__no_feature            3587262
__ambiguous             39161
__too_low_aQual         15356
__not_aligned           35528
__alignment_not_unique  0

This is for a sample in which 5128080 reads (100.00%) were paired, with a 99.24% overall alignment rate. It seems like I am losing half of my reads.

What should I do to improve my htseq-count results?

Thanks.
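To put the special counters above in proportion, here is a minimal sketch that expresses each one as a fraction of the library. It assumes that 5128080 (taken from the bowtie2 summary quoted in the question) is the number of read pairs and that htseq-count counts each pair once; `special_counter_fractions` is a name made up for this illustration, not part of HTSeq.

```python
def special_counter_fractions(summary_text, total_fragments):
    """Return each htseq-count special counter as a fraction of total_fragments."""
    fractions = {}
    for line in summary_text.splitlines():
        parts = line.split()
        # Special counters start with two underscores, e.g. "__no_feature 3587262".
        if len(parts) == 2 and parts[0].startswith("__"):
            fractions[parts[0]] = int(parts[1]) / total_fragments
    return fractions


# Counters copied from the question.
SUMMARY = """\
__no_feature    3587262
__ambiguous 39161
__too_low_aQual 15356
__not_aligned   35528
__alignment_not_unique  0
"""

# Assumption: 5128080 is the number of read pairs reported by bowtie2.
for name, frac in special_counter_fractions(SUMMARY, 5128080).items():
    print(f"{name:<24}{frac:7.2%}")
```

If that denominator is right, `__no_feature` comes out at roughly 70% of the pairs rather than half, which would make the problem somewhat larger than described.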


And the annotation file matches your fasta genome?

written 4 months ago by WouterDeCoster

I downloaded them both from NCBI, for the same genome.

written 4 months ago by eozcan
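Even when both files come from the same source, it can be worth confirming that the sequence names agree, because htseq-count matches alignments to features by sequence name. The sketch below compares the first word of each FASTA header (which is the name bowtie2 uses for the reference) with column 1 of the GFF. The function names are invented for this example.

```python
def fasta_names(path):
    """Collect the first whitespace-delimited word of every FASTA header."""
    names = set()
    with open(path) as fh:
        for line in fh:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    names.add(parts[0])
    return names


def gff_seqids(path):
    """Collect the sequence IDs found in column 1 of a GFF file."""
    seqids = set()
    with open(path) as fh:
        for line in fh:
            if line.startswith("##FASTA"):
                break  # an embedded sequence section follows; stop reading
            if line.startswith("#") or not line.strip():
                continue
            seqids.add(line.split("\t", 1)[0])
    return seqids


def compare_names(fasta_path, gff_path):
    """Report names that appear in only one of the two files."""
    fa, gff = fasta_names(fasta_path), gff_seqids(gff_path)
    return {"only_in_fasta": sorted(fa - gff), "only_in_gff": sorted(gff - fa)}


# Example call (the FASTA file name is hypothetical):
# print(compare_names("Lplantarum.fasta", "../Lplantarum.gff3"))
```

Two empty lists would mean the names agree; anything listed under `only_in_gff` points at features that no alignment can ever be assigned to.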

A bacterial genome should be gene dense (and you would not need to worry about splicing). Have you reviewed the alignment in a genome browser?

written 4 months ago by genomax

I have not. Which genome browser would you recommend, and what should I check when I review it?

written 4 months ago by eozcan

Use the Integrative Genomics Viewer (IGV, from the Broad Institute). You will need to create a custom genome for your bacterium, which requires a FASTA-format genome sequence file and a matching GFF file. Once you open the alignment, make sure the reads are mapping under the genes.

written 4 months ago by genomax

You confirmed with the people who made the library that the prep was stranded?

modified 4 months ago • written 4 months ago by swbarnes2

Quote from original question:

I prepared libraries with NEBNext Ultra II Directional library kit.

modified 4 months ago • written 4 months ago by genomax

I made the libraries myself, and they are stranded.

written 4 months ago by eozcan

Your command line is broken (incomplete), but apparently you didn't specify the stranded option for htseq-count.

written 4 months ago by h.mon

The stranded option is reverse; I specified it, and it is there. I don't understand what else is incomplete.

written 4 months ago by eozcan
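For anyone following the strandedness discussion, here is a small sketch of how the strand setting is applied to paired-end reads, as I understand the HTSeq documentation: with `--stranded=yes`, read 1 must lie on the same strand as the feature and read 2 on the opposite strand, and `--stranded=reverse` swaps those rules. The helper `feature_strand` is hypothetical and only decodes standard SAM flag bits; it is not HTSeq's own code.

```python
def feature_strand(flag, stranded="reverse"):
    """Return the feature strand ('+', '-', or '.') implied by one alignment."""
    if stranded == "no":
        return "."  # unstranded counting ignores the strand
    if stranded not in ("yes", "reverse"):
        raise ValueError("stranded must be 'yes', 'no' or 'reverse'")

    read_strand = "-" if flag & 0x10 else "+"  # 0x10: read aligned to reverse strand
    # 0x1: read is paired, 0x80: read is the second mate of its pair
    is_second_mate = bool(flag & 0x1) and bool(flag & 0x80)

    # Under 'yes', mate 2 points the opposite way to the feature; 'reverse' inverts that.
    flip = is_second_mate
    if stranded == "reverse":
        flip = not flip

    if flip:
        return "-" if read_strand == "+" else "+"
    return read_strand


# A typical proper pair: flag 99 (mate 1, forward) and flag 147 (mate 2, reverse).
print(feature_strand(99, "reverse"), feature_strand(147, "reverse"))
print(feature_strand(99, "yes"), feature_strand(147, "yes"))
```

Both mates of a pair should imply the same feature strand under a given setting. If the reads seen in a genome browser sit on the strand this predicts for `reverse`, the setting is probably not the cause of the high no-feature count.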

[Update] My sorted BAM files were name-sorted, and htseq-count somehow did not handle that sort order for my samples. So I reran htseq-count with coordinate-sorted BAM files; however, that did not improve the no-feature counts. I also added some options to my command line, as below:

-o eo8_sortedcoordinate_counts.txt htseq-count -f bam -t gene -i ID --stranded=reverse --nonunique=all --mode=union -a=0 --order=name eo8_FP1_2_trimmed.sorted.bam ../Lplantarum.gff3

Are there any suggestions on how to reduce the high no-feature count?

modified 4 months ago by WouterDeCoster • written 4 months ago by eozcan
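One thing that stands out in the updated command is that it passes `--order=name` while the BAM is described as coordinate-sorted; htseq-count's `--order` option accepts `name` or `pos`, so the two may not agree. The sort order a BAM declares is normally checked with `samtools view -H`. As a dependency-free alternative, the sketch below reads the `SO:` tag from the BAM header with the Python standard library only. It relies on BAM being gzip-compatible (BGZF) and starting with the magic bytes, a text length, and the header text; `bam_sort_order` is a name invented for this example.

```python
import gzip
import struct


def bam_sort_order(path):
    """Return the SO: value from a BAM header's @HD line, or 'unknown'."""
    with gzip.open(path, "rb") as fh:
        if fh.read(4) != b"BAM\x01":
            raise ValueError(f"{path} does not look like a BAM file")
        # l_text: little-endian 32-bit length of the plain-text SAM header
        (l_text,) = struct.unpack("<i", fh.read(4))
        text = fh.read(l_text).decode("ascii", errors="replace").rstrip("\x00")

    for line in text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return "unknown"


# Example call, using the file name from the command above:
# print(bam_sort_order("eo8_FP1_2_trimmed.sorted.bam"))
```

A result of `coordinate` would suggest trying `--order=pos`, and `queryname` would match `--order=name`. The tag only reports what the file declares, so it is a hint rather than proof of the actual record order.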

Have you looked at the alignments in a genome browser as I had suggested above?

written 4 months ago by genomax

Does this mean they are aligning to the genome?

[Image: igvsnapshot]

modified 4 months ago by genomax • written 4 months ago by eozcan

I am not sure if you can see the image, but the reads are aligning to the genes.

modified 4 months ago • written 4 months ago by eozcan

To post images properly, please see: How to add images to a Biostars post

Looking at the images, it does look like you have reads aligning to all regions, so I am not sure why they are not being counted. Can you use the "unstranded" counting option (--stranded=no) to see whether the read counts go up?

modified 4 months ago • written 4 months ago by genomax

I tried that; it does not change or improve my counts.

written 4 months ago by eozcan

The only thing I can think of is that the annotation file you downloaded is either not in the correct format or has some other errors in it.

written 4 months ago by genomax
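A rough way to test that idea is to tabulate what the GFF actually contains and check it against the options used in the command (`-t gene -i ID`). The sketch below counts the feature types in column 3, flags rows that do not have nine tab-separated columns, and counts rows of the chosen type that lack the chosen attribute. `summarize_gff` and its parameters are illustrative names, not part of HTSeq.

```python
from collections import Counter


def summarize_gff(path, feature_type="gene", id_attr="ID"):
    """Return (feature type counts, rows of feature_type lacking id_attr)."""
    types = Counter()
    missing_id = 0
    with open(path) as fh:
        for line in fh:
            if line.startswith("##FASTA"):
                break  # an embedded sequence section follows; stop reading
            if line.startswith("#") or not line.strip():
                continue
            cols = line.rstrip("\n").split("\t")
            if len(cols) != 9:
                types["<malformed>"] += 1
                continue
            types[cols[2]] += 1
            if cols[2] == feature_type:
                # Column 9 holds semicolon-separated key=value attributes.
                attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
                if id_attr not in attrs:
                    missing_id += 1
    return types, missing_id


# Example call, using the annotation file from the question:
# types, missing = summarize_gff("../Lplantarum.gff3")
# print(types.most_common(), missing)
```

If `gene` rows are few, absent, or mostly missing an `ID` attribute, the `-t`/`-i` choice would be the first thing to revisit; if the file looks sane, the annotation format is less likely to be the culprit.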

To see whether there is a problem with the annotation file, I tried another annotation file and FASTA file from Ensembl (the previous ones were from NCBI). I got lower counts and a higher no-feature count.

modified 4 months ago • written 4 months ago by eozcan

I think this is a point where access to your data/analysis may be needed to figure out what is going on. A forum is not the best place to do that.

Perhaps someone else may be along with suggestions on other things to try.

written 4 months ago by genomax