PCA plot, fewer points than samples
2
0
13 months ago
eridanus ▴ 30

Hello. I am working with RNA-seq data and trying to do a differential expression analysis. I am making a PCA plot with DESeq2. However, although I have 12 samples, the PCA plot shows only 11 dots. When I make an MDS plot with limma-voom I see the same thing (11 dots instead of 12). I have checked the dds object (and the rlog-transformed object), and both have 12 columns. What could be the cause? Thank you!

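library(DESeq2)
library(ggplot2)

# Build the dataset and drop genes with fewer than 10 reads in total
dds <- DESeqDataSetFromMatrix(countData = countData, colData = colData, design = ~ group)
keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep, ]
# rlog-transform, then run the PCA on the transformed values
rld <- rlog(dds)
plot_pca <- plotPCA(rld, intgroup = "group")
# Label every point with its sample name
plot_pca <- plot_pca + geom_text(aes(label = name), color = "black")
print(plot_pca)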

RNA-Seq • 432 views
0

This is my PCA plot. I checked, and all the samples are in fact included. I wonder if I should remove some samples from the differential expression analysis. What do you think?

2
13 months ago
ATpoint 55k

Use returnData=TRUE for plotPCA and see whether there are 12 or 11 samples in the returned data.frame. If 12 then two points are overlapping perfectly.
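To make that check concrete, here is a minimal sketch. The helper `find_overlaps` is hypothetical (not part of DESeq2), and the toy `pca_df` only stands in for what `plotPCA(rld, intgroup = "group", returnData = TRUE)` would return, i.e. a data.frame with `PC1`, `PC2` and `name` columns. The tolerance is an arbitrary choice.

```r
# Hypothetical helper: list pairs of samples whose PC1/PC2 coordinates coincide.
# Expects a data.frame with columns PC1, PC2 and name.
find_overlaps <- function(pca_df, tol = 1e-6) {
  d <- as.matrix(dist(pca_df[, c("PC1", "PC2")]))
  # Look only above the diagonal so each pair is reported once
  idx <- which(d < tol & upper.tri(d), arr.ind = TRUE)
  data.frame(sample_a = pca_df$name[idx[, "row"]],
             sample_b = pca_df$name[idx[, "col"]],
             stringsAsFactors = FALSE)
}

# Toy stand-in for the real plotPCA(..., returnData = TRUE) result
pca_df <- data.frame(
  PC1  = c(-5.2, -5.2, 3.1, 7.3),
  PC2  = c( 1.4,  1.4, -2.0, 0.6),
  name = c("s1", "s2", "s3", "s4"),
  stringsAsFactors = FALSE
)

print(nrow(pca_df))        # number of samples that went into the plot
overlaps <- find_overlaps(pca_df)
print(overlaps)            # pairs drawn on top of each other
```

On the real data you would pass the returned data.frame directly to the helper; an empty result would mean no two samples share the same coordinates.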

0

Thank you. I tried it and there are 12 samples. I had thought about overlapping, but when I added the labels I expected any overlap to be obvious in the labels, and that was not the case. What can I do about the overlap? Thanks a lot!

1

You can try to increase the number of genes used for the PCA (the ntop argument of plotPCA, which defaults to 500), maybe use 1000 and hope they separate better. See ?plotPCA for details.
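With DESeq2 the call would be `plotPCA(rld, intgroup = "group", ntop = 1000)`. As a rough illustration of what that argument controls, here is a base-R sketch of top-variance gene selection followed by `prcomp`. The function name `pca_top_genes` and the random matrix are made up for the example; on real data `mat` would be something like `assay(rld)`.

```r
# Sketch: keep the ntop most variable genes, then run PCA on samples.
# `mat` is a genes x samples matrix of transformed expression values.
pca_top_genes <- function(mat, ntop = 500) {
  rv <- apply(mat, 1, var)                       # per-gene variance
  select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]
  prcomp(t(mat[select, , drop = FALSE]))         # samples become rows
}

# Random toy data: 2000 genes, 12 samples (an assumption for illustration)
set.seed(1)
mat <- matrix(rnorm(2000 * 12), nrow = 2000,
              dimnames = list(paste0("gene", 1:2000), paste0("s", 1:12)))

pca_500  <- pca_top_genes(mat, ntop = 500)
pca_1000 <- pca_top_genes(mat, ntop = 1000)

# Compare the first two components under both settings
print(head(pca_500$x[, 1:2]))
print(head(pca_1000$x[, 1:2]))
```

Note that if two samples are truly identical, no choice of ntop will pull them apart.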

2

Or, I'd double-check that your 12 samples really are 12 distinct samples, and that you don't have a goof where the same fastq was analyzed under two different names.

0

Good point. For this I guess a simple cor() would do and you should get a Pearson correlation of 1 in that case.
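A minimal sketch of that check, using a made-up count matrix in which `s3` is deliberately an exact copy of `s1`. On the real data you would use `counts(dds)` instead of `counts_mat`; the variable names and the threshold are assumptions for illustration.

```r
# Toy count matrix: 200 genes x 4 samples, s3 duplicated from s1 on purpose
set.seed(42)
counts_mat <- matrix(rpois(200 * 4, lambda = 50), nrow = 200,
                     dimnames = list(NULL, paste0("s", 1:4)))
counts_mat[, "s3"] <- counts_mat[, "s1"]

# Pairwise Pearson correlation between samples (columns)
cor_mat <- cor(counts_mat)

# Report sample pairs whose correlation is 1 up to floating-point error
dup <- which(cor_mat > 1 - 1e-12 & upper.tri(cor_mat), arr.ind = TRUE)
dup_pairs <- data.frame(sample_a = colnames(cor_mat)[dup[, "row"]],
                        sample_b = colnames(cor_mat)[dup[, "col"]],
                        stringsAsFactors = FALSE)
print(dup_pairs)
```

Any pair listed there would be worth tracing back to the input files.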

0

This is my PCA plot. The clustering is not that good, but in differential expression analysis I can see differences in expression. Should I remove some samples? What do you think? Thank you!

0

You can't remove samples just because they don't cluster how you like. It might be nice if you could figure out what's driving that first principal component.
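One possible way to look into that, sketched in base R on simulated data: inspect which genes carry the largest PC1 loadings, and compare the PC1 scores against any known covariate. The `batch` variable and the shifted genes here are invented for the example; on real data `mat` would be the transformed matrix (e.g. `assay(rld)`) and the covariate would come from `colData`.

```r
# Simulated data: 300 genes x 12 samples, with 20 genes shifted in "batch B"
set.seed(7)
n_genes <- 300
n_samples <- 12
mat <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)),
                              paste0("s", seq_len(n_samples))))
batch <- rep(c("A", "B"), each = 6)          # hypothetical covariate
mat[1:20, batch == "B"] <- mat[1:20, batch == "B"] + 6

pca <- prcomp(t(mat))                        # samples as rows

# Genes with the largest absolute loadings on PC1
pc1_loadings <- pca$rotation[, "PC1"]
top_genes <- names(sort(abs(pc1_loadings), decreasing = TRUE))[1:10]
print(top_genes)

# Mean PC1 score per level of the covariate
pc1_by_batch <- tapply(pca$x[, "PC1"], batch, mean)
print(pc1_by_batch)
```

If PC1 lines up with something like batch, extraction date or sex, that is usually a reason to account for it in the design rather than to drop samples.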

0
Entering edit mode

Yes you are right. Thank you for your response!