featureCounts outputs zero raw read counts!!
1
0
Entering edit mode
3.0 years ago
kuntalasb ▴ 10

Hello all, I have a set of lncRNAs identified from a plant species for which I need to perform differential gene expression through DESeq2 package. So when I run featureCounts of "Rsubread" package of R to generate raw read counts of the lncRNAs, the output counts.txt shows read counts "0" for most of the lncRNAs, which shouldn't be the case as I have inputted only those lncRNAs that have FPKM value of greater than 1 and hence are expressed in the samples. My featureCounts command line was :

    files=c("Con-1-sorted.sam","Con-2-sorted.sam","Con-3-sorted.sam","Treat-1-sorted.sam","Treat-2-sorted.sam","Treat-3-sorted.sam")

    fc=featureCounts(files,annot.ext="expressed-lnc.gtf",isGTFAnnotationFile=TRUE, GTF.attrType="transcript_id", allowMultiOverlap=TRUE, strandSpecific=1, isPairedEnd=TRUE)

write.table(x=data.frame(fc$annotation[,c("GeneID","Length")],fc$counts,stringsAsFactors=FALSE),file="counts.txt",quote=FALSE,sep="\t",row.names=FALSE)

The counts.txt output looks like this:

GeneID  Length  Con.1.sorted.sam    Con.2.sorted.sam    Con.3.sorted.sam    Treat.1.sorted.sam  Treat.2.sorted.sam  Treat.3.sorted.sam
TCONS_00010754  676 0   0   0   0   0   0
TCONS_00003631  1769    0   0   0   0   0   0
TCONS_00009207  659 0   0   0   0   0   0
TCONS_00000101  283 15  31  17  12  12  9
TCONS_00005923  1184    0   0   0   0   0   0
TCONS_00007573  1205    0   0   0   0   0   0
TCONS_00009271  761 0   0   0   0   0   0
TCONS_00000248  590 0   0   0   0   0   0
TCONS_00000249  864 0   0   0   0   0   0
TCONS_00003762  680 0   0   0   0   0   0
TCONS_00000286  848 0   0   0   0   0   0
TCONS_00000293  533 0   0   0   0   0   0

Am I missing something? Previously I didn't use allowMultiOverlap=TRUE but saw somewhere that this might give actual counts for the genes and it worked for some people. However it didn't work for me. I am new to this field so I don't have much knowledge regarding bioinformatic tools and commands. Any help would be appreciated. Thank you.

featureCounts DGE • 1.1k views
ADD COMMENT
1
Entering edit mode

You could simply be looking at an initial few genes that don't have any counts. You have this line

TCONS_00000101  283 15  31  17  12  12  9

so at least one gene in your example has counts. It is possible that based on pointed noted by @Istvan below it may be a matter of checking the cause of why you are mostly getting zero counts. Remember that featureCounts will not count multi-mapped reads by default.

ADD REPLY
1
Entering edit mode
3.0 years ago

Here it would be difficult to advise as no one else can see your data and getting zero counts can have have many reasons - from using the tools incorrectly or having mismatching data to legitimate biological reasons.

In general never "just run" a counter tool without fully understanding and visualizing your alignment data. So do that first:

Take your BAM files and your GTF files and visualize them in IGV.

With that you will establish that:

  1. the data was mapped correctly
  2. the annotations match the chromosomal naming/coordinates in the data
  3. the reads align in the expected directionality
  4. you can count by eye how many reads should you be getting
ADD COMMENT

Login before adding your answer.

Traffic: 2571 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6