featureCounts outputs zero raw read counts!!
1
0
Entering edit mode
6 months ago
kuntalasb ▴ 10

Hello all, I have a set of lncRNAs identified from a plant species for which I need to perform differential gene expression through DESeq2 package. So when I run featureCounts of "Rsubread" package of R to generate raw read counts of the lncRNAs, the output counts.txt shows read counts "0" for most of the lncRNAs, which shouldn't be the case as I have inputted only those lncRNAs that have FPKM value of greater than 1 and hence are expressed in the samples. My featureCounts command line was :

    files=c("Con-1-sorted.sam","Con-2-sorted.sam","Con-3-sorted.sam","Treat-1-sorted.sam","Treat-2-sorted.sam","Treat-3-sorted.sam")

fc=featureCounts(files,annot.ext="expressed-lnc.gtf",isGTFAnnotationFile=TRUE, GTF.attrType="transcript_id", allowMultiOverlap=TRUE, strandSpecific=1, isPairedEnd=TRUE)

write.table(x=data.frame(fc$annotation[,c("GeneID","Length")],fc$counts,stringsAsFactors=FALSE),file="counts.txt",quote=FALSE,sep="\t",row.names=FALSE)


The counts.txt output looks like this:

GeneID  Length  Con.1.sorted.sam    Con.2.sorted.sam    Con.3.sorted.sam    Treat.1.sorted.sam  Treat.2.sorted.sam  Treat.3.sorted.sam
TCONS_00010754  676 0   0   0   0   0   0
TCONS_00003631  1769    0   0   0   0   0   0
TCONS_00009207  659 0   0   0   0   0   0
TCONS_00000101  283 15  31  17  12  12  9
TCONS_00005923  1184    0   0   0   0   0   0
TCONS_00007573  1205    0   0   0   0   0   0
TCONS_00009271  761 0   0   0   0   0   0
TCONS_00000248  590 0   0   0   0   0   0
TCONS_00000249  864 0   0   0   0   0   0
TCONS_00003762  680 0   0   0   0   0   0
TCONS_00000286  848 0   0   0   0   0   0
TCONS_00000293  533 0   0   0   0   0   0


Am I missing something? Previously I didn't use allowMultiOverlap=TRUE but saw somewhere that this might give actual counts for the genes and it worked for some people. However it didn't work for me. I am new to this field so I don't have much knowledge regarding bioinformatic tools and commands. Any help would be appreciated. Thank you.

featureCounts DGE • 389 views
1
Entering edit mode

You could simply be looking at an initial few genes that don't have any counts. You have this line

TCONS_00000101  283 15  31  17  12  12  9


so at least one gene in your example has counts. It is possible that based on pointed noted by @Istvan below it may be a matter of checking the cause of why you are mostly getting zero counts. Remember that featureCounts will not count multi-mapped reads by default.

1
Entering edit mode
6 months ago

Here it would be difficult to advise as no one else can see your data and getting zero counts can have have many reasons - from using the tools incorrectly or having mismatching data to legitimate biological reasons.

In general never "just run" a counter tool without fully understanding and visualizing your alignment data. So do that first:

Take your BAM files and your GTF files and visualize them in IGV.

With that you will establish that:

1. the data was mapped correctly
2. the annotations match the chromosomal naming/coordinates in the data
3. the reads align in the expected directionality
4. you can count by eye how many reads should you be getting