If you used a GTF file as the reference annotation:
1) Convert the annotation into table format.
Example: C: How do I get the gene annotation for the latest version (GRCh38)?
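As a rough illustration of step 1, here is a minimal base-R sketch that turns the gene lines of a GTF into the table layout used in step 2. It assumes GENCODE-style attributes (gene_id, gene_name, gene_type); Ensembl GTFs call the last one gene_biotype instead. The two GTF lines and the helper get_attr are made up for the example, and Length here is simply the genomic span, which may not be the length you want.

```r
# Toy GTF lines, inlined so the sketch runs on its own (assumption: GENCODE-style attributes).
gtf_lines <- c(
  'chr1\tHAVANA\tgene\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1";',
  'chr1\tHAVANA\tgene\t65419\t71585\t.\t+\t.\tgene_id "ENSG00000186092"; gene_type "protein_coding"; gene_name "OR4F5";'
)

# quote = "" is needed because the attribute column contains double quotes.
gtf <- read.table(text = gtf_lines, sep = "\t", quote = "", comment.char = "#",
                  stringsAsFactors = FALSE,
                  col.names = c("Chromosome", "source", "feature", "Start", "End",
                                "score", "Strand", "frame", "attributes"))

# Hypothetical helper: pull the value of one key out of the attribute string.
get_attr <- function(attributes, key) {
  pattern <- paste0('.*', key, ' "([^"]+)".*')
  ifelse(grepl(pattern, attributes), sub(pattern, "\\1", attributes), NA_character_)
}

# Keep gene-level records only and build the annotation table.
genes <- gtf[gtf$feature == "gene", ]
annotation <- data.frame(
  Geneid     = get_attr(genes$attributes, "gene_id"),
  GeneSymbol = get_attr(genes$attributes, "gene_name"),
  Chromosome = genes$Chromosome,
  Start      = genes$Start,
  End        = genes$End,
  Class      = get_attr(genes$attributes, "gene_type"),
  Strand     = genes$Strand,
  Length     = genes$End - genes$Start + 1,  # genomic span, not exonic length
  stringsAsFactors = FALSE
)
```

For a real file you would read the GTF path instead of text = gtf_lines and then save the result with write.table(annotation, "annotation.txt", sep="\t", quote=FALSE, row.names=FALSE).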
2) Import your GTF-converted table (columns: Geneid GeneSymbol Chromosome Start End Class Strand Length) and your count matrix from featureCounts (columns: Geneid sample1expr Sample2expr Sample3expr) into R, then use 'merge' on the 'Geneid' column.
x <- read.table("featurecounts.matrix", header=TRUE, sep="\t")
annotation <- read.table("annotation.txt", header=TRUE, sep="\t")
featurecounts_annotated <- merge(annotation, x, by='Geneid')
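One thing worth checking after the merge: by default 'merge' keeps only Geneids present in both tables, so genes missing from the annotation silently drop out of the percentages. A small sketch with made-up data frames (the gene IDs and counts are invented for illustration):

```r
# Toy tables standing in for annotation.txt and featurecounts.matrix (hypothetical values).
annotation <- data.frame(Geneid = c("g1", "g2", "g3"),
                         Class  = c("protein_coding", "lincRNA", "miRNA"),
                         stringsAsFactors = FALSE)
x <- data.frame(Geneid = c("g1", "g2", "g4"),
                sample1expr = c(10, 5, 7),
                stringsAsFactors = FALSE)

# Default merge is an inner join: g4 has no annotation and is dropped.
merged <- merge(annotation, x, by = "Geneid")

# List the counted genes that have no annotation row.
missing_ids <- setdiff(x$Geneid, annotation$Geneid)

# all.y = TRUE keeps every counted gene; unannotated ones get NA in Class.
kept_all <- merge(annotation, x, by = "Geneid", all.y = TRUE)
```

If missing_ids is not empty, the Geneid formats in the two files probably differ (for example, version suffixes on the IDs in one file but not the other).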
3) Then sum the counts in each sample column by the RNA class you are interested in.
Two-step:
### Two-step 1) sum the reads by the Class column
sample1_countSum <- aggregate(cbind(featurecounts_annotated$sample1expr) ~ Class, data = featurecounts_annotated, sum)
### Two-step 2) calculate percentage
sample1_countSum[,"percentage"] <- ( sample1_countSum$V1/sum( sample1_countSum$V1))*100
Single-step:
sample1_result <- aggregate((cbind(featurecounts_annotated$sample1expr)/sum(featurecounts_annotated$sample1expr))*100 ~ Class, data = featurecounts_annotated, sum)
The final output lists each RNA class with the corresponding percentage of mapped reads from sample1.
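If you have several samples, the same calculation can be done for all of them in one go instead of repeating it per column. This is only a sketch on a made-up table; the column names follow the example above, and the counts are invented.

```r
# Toy merged table (hypothetical counts) with two sample columns.
featurecounts_annotated <- data.frame(
  Geneid      = c("g1", "g2", "g3", "g4"),
  Class       = c("protein_coding", "protein_coding", "lincRNA", "miRNA"),
  sample1expr = c(60, 20, 15, 5),
  Sample2expr = c(10, 30, 40, 20),
  stringsAsFactors = FALSE
)

sample_cols <- c("sample1expr", "Sample2expr")

# Sum the counts per RNA class for every sample column at once.
class_sums <- aggregate(featurecounts_annotated[sample_cols],
                        by = list(Class = featurecounts_annotated$Class),
                        FUN = sum)

# Convert each sample column to a percentage of that sample's total.
class_pct <- class_sums
class_pct[sample_cols] <- lapply(class_sums[sample_cols],
                                 function(v) v / sum(v) * 100)
```

Each sample column of class_pct sums to 100, with one row per RNA class.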
Hi EagleEye, I've already generated a table in the terminal that looks like the following:
It is saved as a .txt file. Would I still have to carry out 2), or could I go straight to 3)?
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. This comment belongs under @EagleEye's answer.
You mean you got a matrix like this?
Yes, exactly, that's the matrix I've got!
Consider this matrix as 'featurecounts.matrix' in my example, and follow the other steps I mentioned.