Question

featureCounts micounts reads?

0

Entering edit mode

7 weeks ago

tessa2525 • 0

Hi,

I have an RNA-seq data set with nuclear and total (whole cell) RNA under multiple conditions. I first aligned them with STAR and now I'm trying to quantify the presence of different transcript regions (e.g. introns, CDS, 3UTR etc.) in my reads. For this, I ran featureCounts multiple times on different GTF.feature types (see example below).

featureCounts(files = bam.list,annot.ext = "Araport11_GTF_genes_transposons.Jan2023.gtf", isGTFAnnotationFile = TRUE,  GTF.featureType = "CDS", GTF.attrType = "gene_id",   countMultiMappingReads = TRUE, fraction = TRUE, allowMultiOverlap = FALSE, isPairedEnd = TRUE, strandSpecific = 2, useMetaFeatures = TRUE, nthreads = 20) #do the same for GTF.featureType "intron", "three_prime_utr", "five_prime_utr, "exon"

And also ran featureCounts on the full gene to calculate relative abundances of each transcript region:

featureCounts(files = bam.list,annot.ext = "Araport11_GTF_genes_transposons.Jan2023.gtf", isGTFAnnotationFile = TRUE,  GTF.featureType = "gene", GTF.attrType = "gene_id",   countMultiMappingReads = TRUE, fraction = TRUE, allowMultiOverlap = FALSE, isPairedEnd = TRUE, strandSpecific = 2, useMetaFeatures = TRUE, nthreads = 20)

However, when I looked at the relative abundance of each region (e.g. counts per CDS / counts per gene), I noticed that for quite some genes something strange is happening. The gene GTF feature should contain all counts from the other transcript regions, but for some genes the counts in e.g. the CDS are a lot higher than the counts in the gene. See image below:

enter image description here

I checked the count data for one of those genes, and indeed the counts in both the exon and gene were really low, but they were high for the 3UTR, 5UTR and CDS. enter image description here

I cannot find an error in my GTF annotation file: enter image description here

And on IGV, the reads seem to truly align to the gene: enter image description here

Example of a BAM file that is counted for the CDS, but not for the gene: enter image description here

Any idea what might be going on here??

RNA-seq GTF introns featureCounts • 170 views

ADD COMMENT • link 7 weeks ago by tessa2525 • 0