Question

Why do I get a different number of peaks with different reference versions of hexaploid wheat when analyzing ChIP-seq data?


6.6 years ago

ellahappy • 0

Hi everyone!

I used Bowtie2 to align the reads to the wheat reference. First I used the first version of the reference, which is assembled only to scaffold level and has a genome size of about 4.6 Gb. I compared the peaks between the replicates and found that 65% overlap, so I concluded that the reproducibility of the two experiments was good. Then I used the new chromosome-level assembly (14.6 Gb) and got only 33% overlapping peaks for the same data sets. So I am confused and don't know what the problem is. Are my experiments poor, so that the repeatability is low? Is there something I have overlooked in my data analysis? Or is it due to the many repeats and transposons in the genome?

Thanks for your insights.
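For reference, the overlap percentage described above can be pinned down with a small sketch. It assumes peaks are BED-style `(chrom, start, end)` tuples with half-open coordinates and counts a peak as overlapping if it shares at least 1 bp with any peak in the other replicate. The function names and the example coordinates are hypothetical, not taken from the original analysis.

```python
from bisect import bisect_left
from collections import defaultdict


def merge_by_chrom(peaks):
    """Group peaks by chromosome and merge overlapping or touching intervals."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = out
    return merged


def fraction_overlapping(peaks_a, peaks_b):
    """Fraction of peaks in A that overlap at least one peak in B by >= 1 bp."""
    if not peaks_a:
        return 0.0
    merged_b = merge_by_chrom(peaks_b)
    starts_b = {c: [s for s, _ in ivs] for c, ivs in merged_b.items()}
    hits = 0
    for chrom, start, end in peaks_a:
        starts = starts_b.get(chrom)
        if not starts:
            continue
        # Last merged interval in B that begins before this peak ends.
        i = bisect_left(starts, end) - 1
        if i >= 0 and merged_b[chrom][i][1] > start:
            hits += 1
    return hits / len(peaks_a)


# Hypothetical peak lists for two replicates.
rep1 = [("chr1A", 100, 200), ("chr1A", 500, 600), ("chr2B", 50, 80)]
rep2 = [("chr1A", 150, 250), ("chr2B", 80, 120)]
frac_1_in_2 = fraction_overlapping(rep1, rep2)
frac_2_in_1 = fraction_overlapping(rep2, rep1)
```

Note that the result depends on which replicate is the denominator (here 1 of 3 peaks versus 1 of 2), so a figure such as 65% or 33% is only comparable across references if it is computed the same way each time.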

ChIP-Seq genome alignment next-gen • 1.4k views



How did you map the reads? What processing steps did you take after mapping? Which software and versions? And so on...

Please read How To Ask Good Questions On Technical And Scientific Forums.

Reply 6.6 years ago by h.mon 35k

score 0 · Answer 1 · 2017-09-20


6.6 years ago

colindaven 6.4k

Why would this be surprising? The wheat genome was a poor assembly and has been massively improved over the last several years. You state that the 14.6 Gbp genome contains about 3x as much sequence as before. Why would it surprise you to get very different results?

You might look at multimapping reads in particular. The new genome FASTA is much more likely to contain the 3 (6) different copies of each gene due to wheat's hexaploid nature. Is your mapping program/approach set to allow multimapping reads?
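One way to check the multimapping point is to tally how many alignments are ambiguous under each reference. The sketch below reads uncompressed SAM text lines and relies on two things from the SAM format and Bowtie2 output: column 5 is MAPQ, and Bowtie2 writes the optional `AS:i` (best alignment score) and `XS:i` (second-best score) fields. The MAPQ cutoff of 10, the category names, and the example records are assumptions for illustration only.

```python
from collections import Counter


def classify_sam_line(line, min_mapq=10):
    """Classify one SAM alignment line (min_mapq=10 is an assumed cutoff)."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    mapq = int(fields[4])
    if flag & 0x4:  # read is unmapped
        return "unmapped"
    if flag & 0x900:  # secondary or supplementary alignment
        return "secondary"
    tags = {}
    for field in fields[11:]:
        name, _type, value = field.split(":", 2)
        tags[name] = value
    best = int(tags["AS"]) if "AS" in tags else None
    second = int(tags["XS"]) if "XS" in tags else None
    # Ambiguous: low MAPQ, or another locus scores at least as well.
    if mapq < min_mapq or (best is not None and second is not None and second >= best):
        return "ambiguous"
    return "confident"


def summarize(lines, min_mapq=10):
    """Count alignment categories, skipping header and blank lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("@") or not line.strip():
            continue
        counts[classify_sam_line(line, min_mapq)] += 1
    return counts


# Hypothetical SAM records: one confident, one ambiguous, one unmapped.
example_sam = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1A\t100\t42\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0",
    "r2\t0\tchr1B\t200\t1\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tXS:i:0",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
counts = summarize(example_sam)
```

Running this on the alignments against both references would show whether the share of ambiguous reads rises with the chromosome-level assembly, which would be consistent with reads now having several homoeologous loci to choose from.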



When mapping to the new wheat reference, I allowed two mismatches. What confuses me is the reproducibility of the three replicate data sets. When I used the former, scaffold-level reference, the coefficient between biological replicates ranged from 0.6 to 0.8, but when I changed to the new, larger and better-assembled reference, the coefficient was very low. Now I can't judge the repeatability of the replicates, and I don't know how to explain this. Can an incomplete genome reference produce such different results? I have no idea...

Reply 6.5 years ago by ellahappy • 0