Question: Correction of GC content to No. of reads for NIFTY
6.2 years ago by
filipzembol110 wrote:

Hi to all,

I have one really big problem. I am writing software to detect aneuploidies from whole-genome sequencing data. First I filter out noisy data, then I split the genome into bins of 60 kb each. I now have a table with the coordinates of each bin, the GC content of the reads in the bin, and the number of reads. The next step is to correct the number of reads for GC content. Could you please help with this? Why do I have to do it, and what equation should I use?

Thank you

6.2 years ago by
Devon Ryan95k
Freiburg, Germany
Devon Ryan95k wrote:

The "why" is that biology is full of small biases, such as steps in the Illumina sequencing pipeline that favor particular GC contents, so the number of reads in a bin depends partly on its GC content and not only on copy number. The "how" involves plotting your test data (read count per bin against GC content per bin) and seeing what it looks like. That should lead you to your answer. I don't actually know how big of an issue this is between chromosomes as a whole, so I can't say how much this would actually affect you.
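
If the plot does show read counts drifting with GC content, one common way to remove the trend (not necessarily what the answer above had in mind) is to group bins into GC strata and rescale each bin by the ratio of the global median count to the median count of its stratum. Below is a minimal sketch of that idea. The function name `gc_correct`, the 1% stratum width, and the input layout (two parallel lists, GC as a fraction between 0 and 1) are assumptions made for illustration. Fitting a LOESS curve of counts against GC is another frequently used option.

```python
from collections import defaultdict
from statistics import median


def gc_correct(gc_fractions, read_counts, step=0.01):
    """Rescale each bin's read count by global median / median of its GC stratum.

    gc_fractions: GC content per bin as a fraction (0.0 to 1.0), assumed layout.
    read_counts:  number of reads per bin, in the same order.
    step:         width of a GC stratum (0.01 = 1%), an illustrative default.
    """
    if len(gc_fractions) != len(read_counts):
        raise ValueError("gc_fractions and read_counts must have the same length")

    # Group bin indices into GC strata of width `step`.
    strata = defaultdict(list)
    for i, gc in enumerate(gc_fractions):
        strata[round(gc / step)].append(i)

    global_median = median(read_counts)

    corrected = [0.0] * len(read_counts)
    for indices in strata.values():
        stratum_median = median(read_counts[i] for i in indices)
        # A stratum whose median is zero has no usable signal; leave it at 0.
        factor = global_median / stratum_median if stratum_median > 0 else 0.0
        for i in indices:
            corrected[i] = read_counts[i] * factor
    return corrected


if __name__ == "__main__":
    # Toy data: low-GC bins are under-represented relative to 50% GC bins.
    gc = [0.35, 0.35, 0.50, 0.50, 0.50, 0.50]
    counts = [80, 120, 200, 220, 180, 200]
    for g, raw, adj in zip(gc, counts, gc_correct(gc, counts)):
        print(f"GC={g:.2f} raw={raw} corrected={adj:.1f}")
```

After this correction every GC stratum has the same median count, so remaining differences between bins are less likely to be driven by GC content. Strata with very few bins give noisy medians, so with real data it may be worth using wider strata or dropping extreme-GC bins.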

