Question

How to evaluate the similarity between two different samples by using RNA-Seq?

0


6.0 years ago

dz2353 ▴ 120

Hi, there!

I have RNA-Seq data for two cell types: amniotic epithelial cells (AEC) and keratinocytes (KRT). I have finished the upstream analysis and have the raw read count matrix. After that, I ran PCA and a differential gene expression analysis with DESeq2. What I actually want is to assess the similarity between AEC and KRT at the gene level, but I do not know how to do that, because I do not think the non-differentially expressed genes in the DESeq2 results can represent similarity. I only have two samples and each one has one replicate, so I cannot do co-expression network analysis. Can anyone help me? Thanks in advance!

RNA-Seq rna-seq gene • 5.6k views

ADD COMMENT • link updated 6.0 years ago by Charles Warden 8.3k • written 6.0 years ago by dz2353 ▴ 120

0


I only have two samples and each one has one replicate...

Do you mean you have n=1 for each group? It is impossible to do proper statistics with such a 'poor' design (no offense); please consider adding more biological replicates.

Similarity between samples can be evaluated with clustering, such as hierarchical clustering. Correlation can also be used as a measure of similarity.

ADD REPLY • link 6.0 years ago by Benn 8.4k

1


Sorry, I mean each group has two replicates. A_1 and A_2 in AEC group, K_1 and K_2 in KRT group.

ADD REPLY • link 6.0 years ago by dz2353 ▴ 120

1


Are these technical replicates? For sound statistics you need biological replicates. Try calculating the correlation between your samples and making a heatmap such as here. A correlation close to 1 means the samples are similar; a correlation close to 0 means they are not.
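A minimal sketch of the sample-to-sample correlation suggested above, using only the Python standard library. The count values are made-up toy numbers, the sample names A_1, A_2, K_1, K_2 follow the thread, and the log2(count + 1) transform is an assumption chosen to keep a few highly expressed genes from dominating the Pearson correlation.

```python
import math

def log2_counts(counts):
    """Log-transform raw counts with a pseudocount of 1 (an assumed choice)."""
    return [math.log2(c + 1) for c in counts]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(samples):
    """Return a nested dict: corr[sample_a][sample_b] = Pearson r."""
    logged = {name: log2_counts(values) for name, values in samples.items()}
    return {a: {b: pearson(logged[a], logged[b]) for b in logged} for a in logged}

# Hypothetical toy counts: one list per sample, one entry per gene.
counts = {
    "A_1": [10, 200, 35, 0, 900, 48],
    "A_2": [12, 180, 40, 1, 950, 50],
    "K_1": [300, 15, 33, 80, 20, 47],
    "K_2": [280, 20, 30, 95, 25, 52],
}

corr = correlation_matrix(counts)
for a in corr:
    print(a, " ".join(f"{corr[a][b]:6.3f}" for b in corr))
```

The printed matrix is what a correlation heatmap would color: replicates of the same group should show values near 1, while the between-group entries show how similar the two cell types are overall.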

ADD REPLY • link 6.0 years ago by Benn 8.4k

0


Yes, they are biological replicates, and I have already done the correlation analysis. But what I actually want is a list of genes that show the same expression level in both groups. Do you think the complement of the differentially expressed gene list is what I am after? Thanks for your reply!

ADD REPLY • link 6.0 years ago by dz2353 ▴ 120

0


Sounds like you are looking for an equivalence test. I haven't seen that used with RNA-seq data before, but if that's what you need, it may be worth a try.
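To make the idea concrete, here is a rough sketch of a per-gene equivalence test using two one-sided t-tests (TOST) on log2 expression values. Everything in it is an assumption for illustration: the toy values, the gene names, the equivalence margin of 1 log2 unit, and the restriction to exactly two replicates per group (so the pooled test has 2 degrees of freedom, for which the t CDF has a simple closed form). With so few replicates the test has very little power, and p-values across many genes would still need multiple-testing correction. For a count-based alternative, DESeq2's `results()` accepts `altHypothesis="lessAbs"` together with a non-zero `lfcThreshold`.

```python
import math
from statistics import mean, variance

def t_cdf_df2(t):
    """CDF of Student's t distribution with exactly 2 degrees of freedom."""
    return 0.5 + t / (2.0 * math.sqrt(2.0 + t * t))

def tost_p_value(group_a, group_b, margin):
    """TOST p-value for H0: |mean difference| >= margin (2 replicates per group)."""
    if len(group_a) != 2 or len(group_b) != 2:
        raise ValueError("this sketch only handles 2 replicates per group (df = 2)")
    diff = mean(group_a) - mean(group_b)
    # Pooled variance for equal group sizes; standard error of the difference.
    pooled_var = (variance(group_a) + variance(group_b)) / 2.0
    se = math.sqrt(pooled_var * (1 / 2 + 1 / 2))
    if se == 0.0:
        # Degenerate case: no within-group variation at all.
        return 0.0 if abs(diff) < margin else 1.0
    p_lower = 1.0 - t_cdf_df2((diff + margin) / se)  # H0: diff <= -margin
    p_upper = t_cdf_df2((diff - margin) / se)        # H0: diff >= +margin
    return max(p_lower, p_upper)

# Hypothetical log2 expression values: (AEC replicates, KRT replicates).
log_expr = {
    "gene_same": ([5.0, 5.2], [5.1, 5.05]),
    "gene_diff": ([8.0, 8.3], [3.0, 3.4]),
}

tost_results = {
    gene: tost_p_value(aec, krt, margin=1.0) for gene, (aec, krt) in log_expr.items()
}
for gene, p in tost_results.items():
    print(gene, f"p = {p:.4f}", "equivalent" if p < 0.05 else "not shown equivalent")
```

A small p-value here supports "the difference is smaller than the margin", which is a different statement from a gene simply failing to reach significance in a differential expression test.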

ADD REPLY • link 6.0 years ago by Benn 8.4k

score 2 · Answer 1 · 2019-04-17

2


6.0 years ago

Charles Warden 8.3k

PCA and a dendrogram from hierarchical clustering (with Pearson dissimilarity and/or Euclidean distance as the distance metric) are the main things I would use to assess replicates before differential expression.
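As a small illustration of the clustering step, the sketch below runs agglomerative clustering with either metric on toy log2 expression profiles. The values are invented, the sample names follow the thread, and average linkage is an assumed choice (the answer does not name a linkage method). The merge order it prints is the information a dendrogram draws.

```python
import math
from itertools import combinations

def euclidean(x, y):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_dissimilarity(x, y):
    """1 - Pearson r, so perfectly correlated profiles have distance 0."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / math.sqrt(sxx * syy)

def average_linkage(profiles, metric):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    def cluster_distance(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        total = sum(metric(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: cluster_distance(*pair))
        merges.append((c1, c2, cluster_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical log2 expression profiles (one value per gene).
profiles = {
    "A_1": [3.5, 7.6, 5.2, 0.0, 9.8, 5.6],
    "A_2": [3.7, 7.5, 5.4, 1.0, 9.9, 5.7],
    "K_1": [8.2, 4.0, 5.1, 6.3, 4.4, 5.6],
    "K_2": [8.1, 4.4, 5.0, 6.6, 4.7, 5.7],
}

merges_euclidean = average_linkage(profiles, euclidean)
merges_pearson = average_linkage(profiles, pearson_dissimilarity)
for c1, c2, d in merges_euclidean:
    print("+".join(c1), "joins", "+".join(c2), f"at distance {d:.3f}")
```

If replicates are consistent, each replicate pair should merge before the two groups are joined, under both metrics.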

Otherwise, I would create a heatmap of the differentially expressed genes. Even if the gene list sizes are similar, you may visually see better consistency between replicates with one method than with another (and I would test DESeq2/edgeR/limma-voom for your n=4 comparison).
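Heatmaps of differentially expressed genes are commonly drawn on row-scaled values, so that each gene is centered on its own mean. A minimal sketch of that scaling step, assuming toy log2 expression values, hypothetical gene names, and columns ordered A_1, A_2, K_1, K_2:

```python
from statistics import mean, stdev

def row_zscores(values):
    """Center a gene's values on its mean and scale by its sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    if s == 0:
        # A gene with identical values in every sample carries no contrast.
        return [0.0] * len(values)
    return [(v - m) / s for v in values]

# Hypothetical log2 expression of differentially expressed genes.
# Column order (assumed): A_1, A_2, K_1, K_2.
de_genes = {
    "gene_up_in_AEC": [9.8, 9.9, 4.4, 4.7],
    "gene_up_in_KRT": [3.5, 3.7, 8.2, 8.1],
}

scaled = {gene: row_zscores(values) for gene, values in de_genes.items()}
for gene, z in scaled.items():
    print(gene, " ".join(f"{v:6.2f}" for v in z))
```

Replicates that behave consistently show z-scores of the same sign and similar size within each group, which is the pattern to look for when comparing gene lists from different methods.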