Question: Need help correcting overlappinge genes on valcano plot
gravatar for Adeler001
17 months ago by
Adeler0010 wrote:

I'm doing an RNA seq analysis and I'm trying to show my results using a volcano plot on R studio. I used the the follow script to make my volcano plot on R studio:

#Import count table#
countdata <- read.table("family1301RNA-seq.countsfixed.txt", header=TRUE, row.names=1)

#Convert to matrix#
countdata <- as.matrix(countdata) head(countdata)

#Assign condition (first four are controls, second four and third four contain two different experiments)#
condition<-factor(c("unaffected","unaffected","unaffected","affected","affected","affected"),levels=c("unaffected","affected")) subject <- factor(c("1","2","3","3","2","1"))


#Create a coldata frame and instantiate the DESeqDataSet#
coldata <- data.frame(row.names=colnames(countdata), subject, condition)

dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~ subject + condition) dds

#pre-filtering to keep only rows that have at least 1 reads total#
keep <- rowSums(counts(dds)) > 1 dds <- dds[keep,]

#Run the DESeq#
dds <- DESeq(dds)

#Regularized log transformation for clustering/heatmaps#
rld <- rlogTransformation(dds) head(assay(rld)) hist(assay(rld))

#Colors for plots below#
library(RColorBrewer) (mycols <- brewer.pal(8, "Dark2")[1:length(unique(condition))])

#Sample distance heatmap#
sampleDists <- as.matrix(dist(t(assay(rld)))) library(gplots) png("qc-heatmap_baker.png", w=1000, h=1000, pointsize=20) heatmap.2(as.matrix(sampleDists), key=F, trace="none", col=colorpanel(100, "black", "white"), ColSideColors=mycols[condition], RowSideColors=mycols[condition], margin=c(10, 10), main="Sample Distance Matrix")

#Principal components analysis#
rldpca <- function (rld, intgroup = "condition", ntop = 500, colors=NULL, legendpos="bottomleft", main="PCA Biplot", textcx=1, ...) { require(genefilter) require(calibrate) require(RColorBrewer) rv = rowVars(assay(rld)) select = order(rv, decreasing = TRUE)[seqlen(min(ntop, length(rv)))] pca = prcomp(t(assay(rld)[select, ])) fac = factor(apply([, intgroup, drop = FALSE]), 1, paste, collapse = " : ")) if (is.null(colors)) { if (nlevels(fac) >= 3) { colors = brewer.pal(nlevels(fac), "Paired") } else { colors = c("black", "red") } } pc1var <- round(summary(pca)$importance[2,1]100, digits=1) pc2var <- round(summary(pca)$importance[2,2]100, digits=1) pc1lab <- paste0("PC1 (",as.character(pc1var),"%)") pc2lab <- paste0("PC2 (",as.character(pc2var),"%)") plot(PC2~PC1,$x), bg=colors[fac], pch=21, xlab=pc1lab, ylab=pc2lab, main=main, ...) with($x), textxy(PC1, PC2, labs=rownames($x)), cex=textcx)) legend(legendpos, legend=levels(fac), col=colors, pch=20)

png("qc-pca.png", 1000, 1000, pointsize=20) rld_pca(rld, colors=mycols, intgroup="condition", xlim=c(-20, 20))

#Get differential expression results#
res <- results(dds) table(res$padj<0.05)

#Order by adjusted p-value#
res <- res[order(res$padj), ]

#Merge with normalized count data#
resdata <- merge(,, normalized=TRUE)), by="row.names", sort=FALSE) names(resdata)[1] <- "Gene" head(resdata)

#get significant results (FDR<0.05)
Write results#
write.csv(resdata, file="sig_diffexpr-results.csv")

#Volcano plot with significant DE genes#
volcanoplot <- function (res, lfcthresh=2, sigthresh=0.05, main="Volcano Plot", legendpos="bottomright", labelsig=TRUE, textcx=1, ...) { with(res, plot(log2FoldChange, -log10(padj), pch=20, main=main, ...)) with(subset(res, padj<sigthresh ),="" points(log2foldchange,="" -log10(padj),="" pch="20," col="red" ,="" ...))="" with(subset(res,="" abs(log2foldchange)&gt;lfcthresh),="" points(log2foldchange,="" -log10(padj),="" pch="20," col="orange" ,="" ...))="" with(subset(res,="" padj<sigthresh="" &amp;="" abs(log2foldchange)&gt;lfcthresh),="" points(log2foldchange,="" -log10(padj),="" pch="20," col="green" ,="" ...))="" if="" (labelsig)="" {="" require(calibrate)="" with(subset(res,="" padj<sigthresh="" &amp;="" abs(log2foldchange)&gt;lfcthresh),="" textxy(log2foldchange,="" -log10(padj),="" labs="Gene," cex="textcx," ...))="" }="" legend(legendpos,="" xjust="1," yjust="1," legend="c(paste("FDR&lt;",sigthresh,sep="")," paste("|logfc|&gt;",lfcthresh,sep="" ),="" "both"),="" pch="20," col="c("red","orange","green"))" }="" png("diffexpr-volcanoplot.png",="" 1200,="" 1000,="" pointsize="20)" volcanoplot(resdata,="" lfcthresh="1," sigthresh="0.05," textcx=".8," xlim="c(-3," 3))=""<="" p="">

I am novice R studio user , the issue I'm having is that the gene name labels displayed on my Volcano plot overlap, making them unreadable how can I prevent this overlap of the gene labels?

rna-seq • 753 views
ADD COMMENTlink modified 17 months ago by jared.andrews078.4k • written 17 months ago by Adeler0010

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

Thank you!

ADD REPLYlink written 17 months ago by GenoMax95k

Hello Adeler001!

It appears that your post has been cross-posted to another site:

This is typically not recommended as it runs the risk of annoying people in both communities.

ADD REPLYlink written 17 months ago by ATpoint44k
gravatar for jared.andrews07
17 months ago by
Memphis, TN
jared.andrews078.4k wrote:

Rather than try to parse that plotting function, I'm just going to recommend EnhancedVolcano. It will take care of the label overlap and drastically simplify plotting for you.

ADD COMMENTlink written 17 months ago by jared.andrews078.4k

ok thanks ill try that

ADD REPLYlink written 17 months ago by Adeler0010
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1746 users visited in the last hour