Batch effects in sequencing data
6.3 years ago
Andrew ▴ 60

I am looking for biases in sequencing data that can be introduced as batch effects at any step of the sequencing process, for example the impact of library prep. I know where these biases could occur; I just don't know what kind of impact they have (on coverage, GC content, etc.).

Does anyone know of any papers that discuss this? I have found a few, but they mostly just mention that they correct for batch effects without saying what exactly they are correcting.

Any help is much appreciated.

batch effects • 3.1k views
6.3 years ago
Asaf 8.6k

When comparing expression levels with RNA-seq, you have to make sure library prep and sequencing are the same across samples. Issues such as ligation bias and RNA fragment length can influence the number of reads each mRNA gets in the sequencing results, even if the initial amount of mRNA was the same. The easiest way to correct for a batch effect is to add the batch to the table of conditions (and to the linear model) in DESeq2. Differences between samples that can be explained by the batch are then attributed to the batch term rather than to the condition when the effect of the condition is estimated.
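To make the idea of "adding the batch to the linear model" concrete, here is a minimal sketch in Python. It is not DESeq2 (which is an R package and fits a negative binomial model; there the design would be written as `~ batch + condition`). It is only an ordinary least-squares illustration on simulated log-scale values for a single gene. The function name `fit_condition_effect` and all numbers are made up for the example.

```python
import numpy as np

def fit_condition_effect(log_expr, condition, batch=None):
    """Least-squares estimate of the condition coefficient for one gene.

    log_expr  : 1-D array of log-scale expression values
    condition : 1-D array of 0/1 labels (control / treated)
    batch     : optional 1-D array of 0/1 batch labels; when given, it is
                added as an extra column of the design matrix
    """
    columns = [np.ones_like(log_expr), condition.astype(float)]
    if batch is not None:
        columns.append(batch.astype(float))
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, log_expr, rcond=None)
    return coef[1]  # coefficient of the condition column

# Simulated data (hypothetical): batch and condition are partially confounded,
# because most treated samples were processed in batch 1.
rng = np.random.default_rng(0)
condition = np.array([0, 0, 0, 1, 0, 1, 1, 1])
batch = np.array([0, 0, 0, 0, 1, 1, 1, 1])
true_condition_effect = 1.0
batch_shift = 2.0
log_expr = (5.0
            + true_condition_effect * condition
            + batch_shift * batch
            + rng.normal(0.0, 0.05, size=condition.size))

naive = fit_condition_effect(log_expr, condition)            # batch ignored
adjusted = fit_condition_effect(log_expr, condition, batch)  # batch in the model

print(f"condition effect without batch term: {naive:.2f}")
print(f"condition effect with batch term:    {adjusted:.2f}")
```

Without the batch column, part of the batch shift is absorbed into the condition estimate; with it, the estimate should land close to the simulated effect. Note that this only works when batch and condition are not completely confounded: if every treated sample is in one batch and every control in the other, the two columns cannot be separated.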

6.3 years ago

I have seen some large batch effects coming from batches that used different versions of Illumina chemistry. With the older chemistry on HiSeq we see many regions with very low coverage, and these regions tend to be GC-rich. More recently we've done more HiSeq runs but with PCR-free libraries, and a lot of these problems go away. We have had some success correcting for this batch effect in GWAS by introducing the batch number as a covariate.
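One way to check whether a batch shows this kind of GC-related coverage drop is to bin genomic windows by GC fraction and look at the mean depth per bin. The following is a rough sketch using only the Python standard library. The helper names (`gc_fraction`, `mean_coverage_by_gc`) and the toy windows are hypothetical; in practice the sequences and depths would come from your reference and alignment files.

```python
from statistics import mean

def gc_fraction(seq):
    """Fraction of called bases (A/C/G/T) that are G or C."""
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        return 0.0
    return sum(base in "GC" for base in called) / len(called)

def mean_coverage_by_gc(windows, n_bins=4):
    """Group (sequence, mean_depth) windows into GC bins.

    Returns {bin_index: mean depth}, where bin 0 is the most AT-rich
    and bin n_bins - 1 the most GC-rich.
    """
    bins = {}
    for seq, depth in windows:
        idx = min(int(gc_fraction(seq) * n_bins), n_bins - 1)
        bins.setdefault(idx, []).append(depth)
    return {idx: mean(depths) for idx, depths in sorted(bins.items())}

# Toy windows for one batch (made-up sequences and depths).
windows = [
    ("ATATATATAT", 30.0),
    ("ATGCATATTA", 28.0),
    ("ATGCGCATGC", 22.0),
    ("GCGCGCGCGG", 6.0),
]
profile = mean_coverage_by_gc(windows)
for idx, depth in profile.items():
    print(f"GC bin {idx}: mean depth {depth:.1f}")
```

Running this separately for each batch and comparing the profiles gives a quick picture of whether one chemistry or library prep loses coverage in the GC-rich bins.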
