Question: How to remove batch effects when I have different organs?
written 19 months ago by jyshiningsoul (0):


I am working on three subspecies of bat. Individuals of the three subspecies were sequenced on different platforms and are not all of the same sex. I have an RNA-seq count matrix containing the counts for three organs per individual. When I run PCA on each organ separately, the samples cluster according to sequencing platform.

To remove the batch effects, I plan to use 'sva' in R. However, I am unsure whether I should run 'sva' on the counts of all organs together or on the counts of each organ separately.


Can you give some more details on the library prep and sequencing platform? It is quite unusual for the platform to be the dominating factor when other covariates such as sex and organ are present. Please also show the PCA plot.

written 19 months ago by ATpoint (35k)

It isn't that uncommon for batch effects to dominate other experimental design factors; in fact, there is a famous case:

A reanalysis of mouse ENCODE comparative gene expression data

But I fully agree: jyshiningsoul should provide more details, for example by explaining the experimental design and how batch is confounded with it. In the worst case, the two can't be untangled.
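To make the confounding point concrete, here is a minimal Python sketch. It assumes a sample sheet of (subspecies, batch) pairs; the labels `A`, `B`, `C`, `company1`, and so on are hypothetical and are not the actual design from this thread. The idea is that in an additive model (subspecies + batch), subspecies differences are only separable from batch if the subspecies and batches are linked through shared samples, which can be checked by counting connected components.

```python
def count_components(samples):
    """Count connected components in the bipartite graph that links each
    subspecies to every batch in which it was sequenced."""
    parent = {}

    def find(node):
        # Follow parent links to the root, compressing the path as we go.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for subspecies, batch in samples:
        s_node, b_node = ("subspecies", subspecies), ("batch", batch)
        parent.setdefault(s_node, s_node)
        parent.setdefault(b_node, b_node)
        root_s, root_b = find(s_node), find(b_node)
        if root_s != root_b:
            parent[root_s] = root_b  # merge the two components

    return len({find(node) for node in list(parent)})


def is_confounded(samples):
    """True if some subspecies contrasts cannot be separated from batch
    in an additive subspecies + batch model."""
    return count_components(samples) > 1


# Hypothetical designs, for illustration only.
nested = [("A", "company1"), ("B", "company2"), ("C", "company3")]
crossed = [("A", "company1"), ("C", "company1"),
           ("B", "company2"), ("A", "company2")]

print(is_confounded(nested))   # every subspecies has its own batch
print(is_confounded(crossed))  # batches share subspecies
```

If the check reports confounding, no correction tool can recover the subspecies effect for the disconnected groups, because the data carry no information that distinguishes it from the batch effect.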

written 19 months ago by h.mon (29k)

Thank you for your reply. In my experimental design, I want to test whether phylogenetic trees constructed from the organ gene expression of the subspecies agree with the real phylogenetic tree. However, the different platforms and the inconsistent sex of the individuals may have an effect. For example, the PCA plot for brain shows A1 and C1 clustering together, and these two individuals (A1, C1) were sequenced by the same company. I want to remove the batch effects so that the PCA plot for each organ reflects the true grouping, but I don't know how to remove the batch effects in my project. (You can see the PCA plot and the information on the individuals in the post.)

written 19 months ago by jyshiningsoul (0)

Thank you for your reply, and I am sorry for replying so late. Library construction (average insert size = 300 bp) and sequencing on an Illumina HiSeq 4000 sequencer (150 bp paired-end) were carried out by different companies. However, the platform didn't dominate the distribution of the samples in the plot; it only affected several of them. The PCA plot ( and information of individuals (

Thank you a lot.

modified 19 months ago • written 19 months ago by jyshiningsoul (0)