Question

RNA Seq Analysis:Feature Counts are zero

0

Entering edit mode

3.9 years ago

Genomics ▴ 20

I have feature counts with zero for most of genes in most of the samples. I am worrying as Alignment statistics shows 90% (12M reads aligned) so if not exon then where it is going and aligning.

Not all counts are zero, but for some specific genes i would expect counts to be higher in samples Following command i have used

featureCounts -p -t exon -g gene_id -a Refseq.gtf -o output.txt sample.bam -M -O

For Example this is feature count of CXCL2 gene in different samples

CXCL2   1   0   0   0   0   0   0   0

RNA-Seq feature counts • 2.3k views

ADD COMMENT • link updated 3.8 years ago by Biostar 20 • written 3.9 years ago by Genomics ▴ 20

1

Entering edit mode

Have you visually verified alignments using IGV? You may have uneven coverage and/or over-amplified libraries. You are even counting multi-mapping reads (which is odd) and still getting zeros?

ADD REPLY • link 3.9 years ago by GenoMax 141k

0

Entering edit mode

I have visualized in IGV , it shows the same as what i get in feature Count.

ADD REPLY • link 3.9 years ago by Genomics ▴ 20

2

Entering edit mode

As in there are no reads mapping to that gene? If that is the case why are you surprised by featureCounts result? Perhaps there is something wrong with the experiment itself?

ADD REPLY • link 3.9 years ago by GenoMax 141k

0

Entering edit mode

Agreed to this point. But why am I getting aligned reads (12-20 M) for all samples. I want to know where are they going and aligning.

ADD REPLY • link 3.9 years ago by Genomics ▴ 20

0

Entering edit mode

What strandedness was your library prep kit?

What strandedness flag did you use? (-s 0, 1, 2)

Do those match?

ADD REPLY • link 3.9 years ago by bruce.moran ▴ 960

0

Entering edit mode

Yes it is matching my library is F/R strand and I've used following in Hisat --fr -1

ADD REPLY • link 3.9 years ago by Genomics ▴ 20

0

Entering edit mode

featureCounts has isStrandSpecific flag also, default is 0 indicating unstranded.

See this helpful thread.

ADD REPLY • link 3.9 years ago by bruce.moran ▴ 960

0

Entering edit mode

I did this. Doesn't help

ADD REPLY • link 3.9 years ago by Genomics ▴ 20

0

Entering edit mode

What's the experimental design here? What material are samples from? Did you do the labwork, or can you get RNA integrity/RINs? What were duplicate rates like? What are levels of CXCL1, 3, 5 like?

ADD REPLY • link 3.9 years ago by bruce.moran ▴ 960

0

Entering edit mode

My samples are from swab, I've not done any lab work, just analyzing the data.But I can get RNA integrity. All CXCls have no counts in samples.

ADD REPLY • link 3.9 years ago by Genomics ▴ 20

0

Entering edit mode

I'd really want to know how long after swabs were collected was RNA extracted, what was the swab stored in (RNAlater, buffer from extraction kit)?

What library kit was used? Previously might have said to look into rRNA contamination if an rRNA-depletion method was used, but nearly always polyA-selection now.

My guess would be low complexity library based on sample degradation. There would be a lot of duplicates (50%+) if so.

If you want to know where reads align and were counted, sort your featureCounts output file on the sum of count columns

awk '{print $0, $1+$2+$3+$4+$5+$6+$7+$8}' output.txt | sort -k10rn

I would also test for exonic reads using samtools view with a relevant bed file of gene regions using -L and -M.

ADD REPLY • link 3.9 years ago by bruce.moran ▴ 960

0

Entering edit mode

I've removed rRNA by mapping them to rRNA databse Also how about doing a Denovo assembly and checking where the reads are aligning?

ADD REPLY • link 3.9 years ago by Genomics ▴ 20

0

Entering edit mode

Could do de novo, presume this is human if swabs (buccal?), if so I'd be more inclined to see what the issue was with the data.

What level of rRNA reads were in the samples? Was it polyA selection kit or rRNA depletion?

Still interested in the duplication rate, think ruling out low complexity library is useful. That does assume some genes had relatively 'normal' levels of expression(?)

ADD REPLY • link 3.9 years ago by bruce.moran ▴ 960