I have a very trivial question for you all. I am running a bioinformatics pipeline (with Rsubread and EdgeR) on some RNA-seq data, and am trying to find the most upregulated genes In one library. To be more specific; I have sequencing data from three cell types: Cardiomyocyte, Endothelial, and Fibroblasts and I am well on my way into finding genes that are differentially expressed across the three groups, but how do I find the most highly expressed genes in ONLY the cardiomyocyte group? Here is the sample code that I've used so far to do the differential expression analysis across multiple groups:
all_bam_8 <- "//ALL OF THE BAM FILES THAT I NEED TO RUN FEATURECOUNTS ON" featureCounts_8 <- featureCounts(files= all_bam_8, annot.inbuilt = "hg38", allowMultiOverlap = TRUE, nthreads = 16) grouping <- factor(c("A","A","A","A","B","B","B","B","B","C","C","C","C","C","C","D","D","D","D","E","E","E","F","F","F","F","G","G","G","G")) // GROUP A are the cardiomyocte libraries, group D are the endothelial cell libraries, and group F are the fibroblast samples. This line of code is because I have 30 libraries, and each correspond to a different group. DGEList_8 <- DGEList(counts=featureCounts_8$counts, genes=featureCounts_8$annotation, group = grouping) DGEList_8 <- calcNormFactors(DGEList_8) design <- model.matrix(~DGEList_8$samples$group) // THERE ARE 7 TOTAL GROUPS IN THIS DGELIST. Remember that the only ones I'm interested in are A, D, and F (aka groups 0, 3, and 5) The rest of the groups are control groups DGEList_8 <- estimateDisp(DGEList_8, design) fit <- glmQLFit(DGEList_8, design) // BELOW IS WHERE I RUN THE QLFTESTS for each comparison, assuming that the CARDIOMYOCYTE group is my intercept qlf <- glmQLFTest(fit, coef = 3) //to compare group A (group 1: cardiomyocyte) vs group D (group 3: endothelial) topTags(qlf) qlf <- glmQLFTest(fit, coef = 6) //to compare group A (group 1: cardiomyocyte) vs group F (group 6: Fibroblast) topTags(qlf)
I posted my code so that hopefully someone may point out the step in which I should do my intra-library expression test. I will re-iterate the question: how do I find the highest upregulated genes in ONLY group A? Meaning, how do I find the genes with the highest read counts relative to all of the other genes found in library A?
Thanks in advance for your help, Damon