Question: Using PhiX to estimate bisulfite conversion rate
igor (United States) wrote 3.8 years ago:

I was trying to use a PhiX spike-in to estimate the bisulfite conversion rate. When I processed my reads with Bismark, aligning to the mouse genome (this is a mouse sample), the results seemed reasonable:

C methylated in CpG context:    52.6%
C methylated in CHG context:    4.5%
C methylated in CHH context:    4.5%

Then I align the same sample against the PhiX genome. The alignment rate is less than 1% as expected, but the methylation rates are odd:

C methylated in CpG context:    98.0%
C methylated in CHG context:    97.9%
C methylated in CHH context:    97.8%

Why is it the opposite of what it should be? PhiX should be unmethylated. The output is generated by Bismark, so it's not possible that I used a wrong formula. The bisulfite conversion also worked at least partially, as demonstrated by the sample of interest. What am I missing here?
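For reference, the arithmetic behind using an unmethylated spike-in as a conversion control can be sketched as below. This is a minimal illustration, not Bismark's own code: the function name `conversion_rate` is hypothetical, and it assumes every cytosine in the control is truly unmethylated, so any remaining C call counts as a conversion failure.

```python
def conversion_rate(percent_methylated: float) -> float:
    """Estimate bisulfite conversion (%) from an unmethylated control.

    Assumption: the control DNA carries no methylation, so every
    cytosine still read as C is treated as a non-converted base.
    """
    if not 0.0 <= percent_methylated <= 100.0:
        raise ValueError("percentage must be between 0 and 100")
    return 100.0 - percent_methylated


# Percentages reported above for the PhiX alignment.
phix = {"CpG": 98.0, "CHG": 97.9, "CHH": 97.8}
for context, pct in phix.items():
    print(f"{context}: apparent conversion {conversion_rate(pct):.1f}%")
```

Taken at face value, the PhiX numbers would imply a conversion rate of only about 2%. The mouse CHH rate of 4.5% would imply roughly 95.5%, if non-CpG methylation in that sample is assumed to be negligible. That mismatch is the puzzle.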

wgbs bismark rrbs • 2.3k views
modified 3.8 years ago by Antonio R. Franco • written 3.8 years ago by igor

That is extremely odd. Are you getting a methylation_ratio file for the PhiX genome? If so, look directly at the values to see if there's an error in the calculation.
Also, where in the pipeline did you add the PhiX and are you sure it is an unmethylated variety (both may be sold)?
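One way to look at the values directly is to recompute the pooled percentage from per-position counts. The sketch below assumes a Bismark-style coverage file with six tab-separated columns (chromosome, start, end, methylation percentage, methylated count, unmethylated count). The function name `pooled_methylation` and the example records are hypothetical, not real data.

```python
import io


def pooled_methylation(lines):
    """Return pooled % methylation from Bismark-style coverage lines.

    Assumed columns (tab-separated): chromosome, start, end,
    methylation percentage, methylated count, unmethylated count.
    """
    meth = unmeth = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        meth += int(fields[4])
        unmeth += int(fields[5])
    total = meth + unmeth
    if total == 0:
        raise ValueError("no cytosine calls found")
    return 100.0 * meth / total


# Made-up example records, for illustration only (not real data).
example = io.StringIO(
    "phiX\t10\t10\t100.0\t5\t0\n"
    "phiX\t25\t25\t75.0\t3\t1\n"
)
print(pooled_methylation(example))
```

For a real file, an open file handle can be passed in place of `example` (a gzipped file would need `gzip.open(path, "rt")`). If the recomputed value agrees with the report, the calculation itself is probably not where the problem lies.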

written 3.8 years ago by Jautis
Antonio R. Franco (Spain, Universidad de Córdoba) wrote 3.8 years ago:

PhiX can be added to an Illumina sequencing run at different steps:

1. BEFORE the bisulfite treatment, as a way to estimate the C to U conversion

2. AFTER the bisulfite conversion, as a way to balance the base composition of the DNA; otherwise Illumina base calling will not work properly. In this latter case, and depending on the software version, you need to add from 10 to 50% of PhiX DNA to your samples to balance the base composition (or you can choose to run a separate lane with only PhiX DNA as a control). More information HERE and HERE

So if you have not done the sequencing yourself, it is likely that the PhiX sequence you are analyzing was added after the bisulfite treatment. Adding it at that stage is mandatory.
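To see what this explanation would imply numerically, here is a rough sketch under a simple mixture model. The model is an assumption, not something established in this thread: a fraction f of the PhiX reads never saw bisulfite and reads as 100% "methylated", the rest is converted at the same rate as the sample, and sequencing errors are ignored. The function name `unconverted_fraction` is hypothetical.

```python
def unconverted_fraction(observed_meth: float, conversion: float) -> float:
    """Fraction of control reads that never saw bisulfite.

    Model (an assumption): observed = f + (1 - f) * (1 - conversion),
    where f is the share of reads from PhiX added after treatment.
    All arguments are fractions between 0 and 1; sequencing errors
    are ignored.
    """
    if not 0.0 < conversion <= 1.0:
        raise ValueError("conversion must be in (0, 1]")
    f = (observed_meth - (1.0 - conversion)) / conversion
    if not 0.0 <= f <= 1.0:
        raise ValueError("inputs are inconsistent with the mixture model")
    return f


# 0.98 is the PhiX CpG rate from the question; 0.955 is an assumed
# conversion rate taken from the 4.5% CHH rate of the mouse sample.
print(f"{unconverted_fraction(0.98, 0.955):.1%}")
```

Under these assumptions, roughly 98% of the aligned PhiX reads would have to come from untreated PhiX to produce the reported numbers.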

 

modified 3.8 years ago • written 3.8 years ago by Antonio R. Franco

Seems like a good explanation. However, I would say that 4.5% methylation in non-CpG context is a little too high; we usually see <1%, though it might be expected in this case, of course. In contrast, if PhiX was added later, ~98% C seems too low; I would expect >99%. Maybe worth checking the sequence quality?

written 3.8 years ago by dariober

I agree. In my hands, sequence quality plays a major role in the analysis of GC content. It would not be the first time that a peak or valley in the GC content reported by FastQC disappears after trimming the sequences for quality.

My question: could 98% be fair for what you expect of a sequencing platform that is nice, but still far from perfect? Or are you still confident that you should get >99%?

written 3.8 years ago by Antonio R. Franco

"98% could be fair for what you expect of a sequencing platform that is nice, but still far to be perfect" It's not unusual these days to have reads with quality consistently above Q30, especially on MiSeq (I'm talking about Illumina platforms here), which translates to an error rate of 0.1%, or 99.9% C. A FastQC report would be helpful for the OP...
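The Q30 figure follows from the standard Phred definition, Q = -10 * log10(p). A minimal sketch of the conversion (the function name `phred_to_error` is just for illustration):

```python
def phred_to_error(q: float) -> float:
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


for q in (20, 30, 40):
    p = phred_to_error(q)
    print(f"Q{q}: error {p:.4%}, accuracy {1 - p:.4%}")
```

So at Q30, sequencing error alone would account for about 0.1% of miscalled bases, which is well short of the 2% gap being discussed.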

written 3.8 years ago by dariober

The quality is great. That is not the issue.

written 3.8 years ago by igor

This was added before bisulfite treatment specifically to measure bisulfite conversion rates.

The PhiX added during sequencing does not get indexed, so it would disappear after demultiplexing.

written 3.8 years ago by igor

It will be hard to find a logical explanation, though.

written 3.8 years ago by Antonio R. Franco

I think the indexing of PhiX is not really an issue. One can align to PhiX the reads that failed to demultiplex, which would be a mixture of junk and real PhiX.

written 3.8 years ago by dariober