Reg: Changing outliers in RNASEQ
8.7 years ago
harishk0201 ▴ 130

Hi all,

I'm running an RNA-seq experiment that will feed into differential gene expression analysis, and I've run into the following problem. I'm using the Trinity RNA-seq suite, and I've used both kallisto and RSEM to check replicate correlation. The species is a crocodilian, and there is no reference genome for it yet.

  1. When comparing replicates of the same tissue, e.g. brain or genital ridge, I see that the brain replicates do not correlate with each other (quantitative heatmap).

    sample 1 -> cond 1 
    sample 2 -> cond 1
    sample 3 -> cond 2 
    sample 4 -> cond x (sample 4 should segregate with cond 2)
    

    In fact, sample 4 shows very low or even negative correlation with the other brain samples.

  2. When I add samples from other tissues, the outlier changes entirely. Kidney and skin samples then correlate with the other brain samples (any tissue correlates reasonably well with them), except for sample 3 in the example above.

Could someone point out what might be the probable reason for this behavior, other than sequencing and sample preparation error?

Harish

RNA-Seq Kallisto RSEM • 1.8k views
8.7 years ago
biostart ▴ 370

What are the absolute values of the pairwise correlation between your samples? In this situation I would calculate pairwise correlations between all replicates and discard those which don't correlate well with other replicates (which requires at least three replicates for each tissue).
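As a rough illustration of that check, here is a minimal sketch in plain Python. It assumes you already have one expression vector per replicate (for example TPM values from kallisto or RSEM, over the same transcripts in the same order). The sample names, the toy numbers, the log2(x + 1) transform, the use of the median and the 0.8 cutoff are all assumptions made for the example, not recommended settings.

```python
import math
from itertools import combinations
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def flag_poor_replicates(expr, threshold=0.8):
    """Return all pairwise correlations plus the replicates whose median
    correlation with the other replicates falls below `threshold`.

    `expr` maps a replicate name to its expression values. The threshold
    is a hypothetical cutoff; pick one that suits your own data.
    """
    # Log-transform so a few highly expressed transcripts do not dominate.
    logged = {s: [math.log2(v + 1) for v in vals] for s, vals in expr.items()}

    pairwise = {}
    for a, b in combinations(sorted(logged), 2):
        pairwise[(a, b)] = pearson(logged[a], logged[b])

    flagged = []
    for s in sorted(logged):
        rs = [r for pair, r in pairwise.items() if s in pair]
        # The median keeps one bad replicate from dragging down the good ones.
        if median(rs) < threshold:
            flagged.append(s)
    return pairwise, flagged


# Toy, made-up values for four replicates of one tissue.
toy = {
    "brain_rep1": [10.0, 200.0, 35.0, 0.0, 80.0, 5.0],
    "brain_rep2": [12.0, 180.0, 40.0, 1.0, 75.0, 6.0],
    "brain_rep3": [9.0, 210.0, 30.0, 0.0, 90.0, 4.0],
    "brain_rep4": [150.0, 3.0, 2.0, 60.0, 1.0, 90.0],
}
pairwise, flagged = flag_poor_replicates(toy)
print(flagged)
```

The median is used instead of the mean because, with only three or four replicates, a single bad sample lowers every other replicate's mean correlation as well.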

8.7 years ago
harishk0201 ▴ 130

For the pairwise correlations, the absolute values range from 0.75-0.9 for the confounding sample, and from 0.85-1 for the rest.

I removed the confounding sample and the range for all the samples (sort of a pooled correlation heatmap) went from 0.2-1 to 0.6-1, which makes me wonder whether those samples have problems.
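For anyone wanting to reproduce this kind of before/after comparison, a small sketch of the idea follows. The sample names and numbers are invented for illustration (they are not my data), and the values are assumed to be already on a log scale.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def correlation_range(expr, exclude=()):
    """Return (min, max) of all pairwise correlations, skipping `exclude`."""
    kept = {s: v for s, v in expr.items() if s not in exclude}
    rs = [pearson(kept[a], kept[b]) for a, b in combinations(sorted(kept), 2)]
    return min(rs), max(rs)


# Hypothetical log-scale expression values, pooled across tissues.
toy = {
    "brain_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "brain_2": [1.1, 2.1, 2.9, 4.2, 4.8],
    "kidney_1": [1.0, 2.5, 2.5, 4.5, 5.0],
    "suspect": [5.0, 1.0, 4.0, 2.0, 3.0],
}
before = correlation_range(toy)
after = correlation_range(toy, exclude={"suspect"})
print("range with all samples:   ", before)
print("range without the suspect:", after)
```

If the lower bound rises sharply once a single sample is dropped, that sample is driving the low correlations; if it barely moves, the problem is spread across several samples.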

I have to ask the sequencing center at this point to verify integrity of the data and sequencing run logs.

Thanks for the help. If needed, I'll post the PCA and Heatmap plots.
