Question

Manipulating Normal Individual Data For Cnv Analysis

0

12.2 years ago

Vikas Bansal ★ 2.4k

Dear all,

I also posted this question on SEQanswers but lost track of my thread there. I am working on CNV analysis for heart disease patients, trying to find CNVs that may be responsible for causing the disease. I have exome-seq data for about 50 patients, and for CNV analysis I found some interesting tools such as VarScan and ExomeCNV. However, all of these tools also need normal exome-seq data. I managed to get exome data for 7 normal individuals. My question: should I randomly select one normal sample for each patient, or is there a way to combine all the normal samples into a single sample and supply it to one of these tools along with the patients?

Best regards, Vikas

cnv varscan exome next-gen sequencing • 4.1k views

ADD COMMENT • link updated 12.2 years ago by Christof Winter ★ 1.0k • written 12.2 years ago by Vikas Bansal ★ 2.4k

0

Hi,

I have the same question as the one you describe above. How did you solve the problem? Thanks a lot.

Best regards,

Di Zhang

ADD REPLY • link updated 4.4 years ago by Ram 43k • written 8.4 years ago by 1036268670 • 0

score 1 · Answer 1 · 2012-02-20

1

12.2 years ago

Alex Paciorkowski 3.5k

First of all, I would be cautious about basing CNV analysis on what I presume are whole-exome sequencing data. Your coverage may not be uniform enough in depth across samples to judge copy number reliably. It can be done (Karakoc et al., 2011), but whole-genome sequence data are generally more reliable for copy number analyses.

That said, I would not limit your control sample in the way you propose. After you have aligned your sequence and identified indels, why not draw on the very large amount of publicly available human CNV data? Examples include the HapMap Project, the Database of Genomic Variants, the Sanger CNV Project, and the 1000 Genomes Project, as well as tools such as DECIPHER. This is one area where there is a wealth of publicly available control data.

Do you have SNP array data on your subjects as well? This would be one way of validating apparent copy number changes in your exome sequencing data, and may be a lot more accurate (although the resolution is not as high as with next-gen sequencing). Parental samples may also be important for evaluating the pathogenicity of identified CNVs.

ADD COMMENT • link 12.2 years ago by Alex Paciorkowski 3.5k

0

Thanks a lot for your reply. As you said, I can use publicly available data, but I am looking for novel CNVs that have not been reported yet. Also, I do not have SNP array data. Does that mean there is no method or tool I can use for the problem I described in my question?

ADD REPLY • link 12.2 years ago by Vikas Bansal ★ 2.4k

0

Vikas -- I don't know the details of your experimental design, but the way to evaluate the novelty of any CNVs in your data is to compare to the control population data. But be warned -- just because a CNV is in the population data (and not "novel") doesn't mean it's not pathogenic.

ADD REPLY • link 12.2 years ago by Alex Paciorkowski 3.5k

0

"The way to evaluate the novelty of any CNVs in your data is to compare to the control population data": this is exactly what I am looking for. I have sequenced about 1000 genes from 50 patients, and I have now sequenced 1000 genes from 7 normal individuals as well. I am just unsure how to use the data from these 7 normal individuals: should I pick only one normal individual at random for all patients, or is there a way to use all 7 normal individuals against all patients to find CNVs?

ADD REPLY • link 12.2 years ago by Vikas Bansal ★ 2.4k
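
One possible way to use all 7 normals at once, rather than picking one at random, is to build a pooled reference from them. The sketch below is only an illustration and not the method of VarScan or ExomeCNV: it assumes you already have per-target read counts for each sample (for example, one count per captured exon, in the same target order for every sample), scales each sample by its total count, takes the per-target median across the normals, and reports a log2 ratio for a patient against that median. The function names and the pseudocount are hypothetical choices made for this example.

```python
import math
from statistics import median


def normalize(counts, pseudocount=0.5):
    """Scale per-target read counts to fractions of the sample total.

    The pseudocount (an assumption of this sketch) avoids zero depths,
    which would break the log2 ratio below.
    """
    adjusted = [c + pseudocount for c in counts]
    total = sum(adjusted)
    return [a / total for a in adjusted]


def pooled_reference(normal_counts):
    """Per-target median of normalized depth across all normal samples.

    normal_counts: list of samples, each a list of read counts per target,
    with targets in the same order in every sample.
    """
    normalized = [normalize(sample) for sample in normal_counts]
    # zip(*...) regroups the values target by target across samples.
    return [median(per_target) for per_target in zip(*normalized)]


def log2_ratios(patient_counts, reference):
    """log2(patient depth / pooled normal depth) for each target."""
    patient = normalize(patient_counts)
    return [math.log2(p / r) for p, r in zip(patient, reference)]


if __name__ == "__main__":
    # Toy numbers, invented purely for illustration.
    normals = [[100, 200, 300], [110, 190, 310], [90, 210, 290]]
    reference = pooled_reference(normals)
    patient = [100, 200, 150]  # third target has roughly half the expected depth
    for i, ratio in enumerate(log2_ratios(patient, reference)):
        print(f"target {i}: log2 ratio = {ratio:.2f}")
```

Using the median rather than the mean keeps a single unusual normal (for example, one that itself carries a CNV at that target) from shifting the reference. With only 7 normals the reference will still be noisy, so any calls made this way would need validation.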

Ram · Answer 2 · 2012-02-20

0

12.2 years ago

Christof Winter ★ 1.0k

This does not explicitly answer your question, but you could try Control-FREEC, which does not require a normal control.

I found it easy to use and well documented.

ADD COMMENT • link updated 4.6 years ago by Ram 43k • written 12.2 years ago by Christof Winter ★ 1.0k

0

Thanks a lot for your reply. I looked at it, but it does not say whether it works on exome data.

ADD REPLY • link 12.2 years ago by Vikas Bansal ★ 2.4k

0

The new version 5.0 also runs on exome data (http://bioinfo-out.curie.fr/projects/freec/tutorial.html#EXOME), although you need to provide a control sample in that case.

ADD REPLY • link updated 4.6 years ago by Ram 43k • written 12.2 years ago by Christof Winter ★ 1.0k

0

Does anyone have a working script for Control-FREEC, or any documentation that shows how to create the config file for CNV/LOH analysis of my samples? I have normal and tumor samples on which I want to run CNV analysis, and I am having some trouble understanding the manual. How should I create the config file with a normal and a tumor sample? Can Control-FREEC be run directly on BAM files with a single command? I would be happy if anyone could share something like that; in the meantime I am trying to figure it out.

ADD REPLY • link 9.8 years ago by ivivek_ngs ★ 5.2k
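
As a rough starting point for the config question, here is a small Python sketch that writes out a Control-FREEC config for a tumor/normal exome pair from BAM files. The section and key names follow my reading of the Control-FREEC manual and exome tutorial and should be checked against the manual for the version you run. All file names are hypothetical placeholders, the numeric settings are illustrative rather than recommended values, and the helper function itself is invented for this example. LOH calling would additionally need a `[BAF]` section, which is left out here.

```python
def build_freec_config(tumor_bam, normal_bam, capture_bed, chr_len_file, out_dir):
    """Return the text of a Control-FREEC config for one tumor/normal exome pair.

    All arguments are paths supplied by the caller; nothing is read from disk.
    """
    lines = [
        "[general]",
        f"chrLenFile = {chr_len_file}",  # chromosome lengths file
        f"outputDir = {out_dir}",
        "ploidy = 2",
        "window = 0",                    # exome mode: use capture regions, not fixed windows
        "noisyData = TRUE",
        "printNA = FALSE",
        "readCountThreshold = 50",       # illustrative value, tune for your coverage
        "",
        "[sample]",
        f"mateFile = {tumor_bam}",
        "inputFormat = BAM",
        "mateOrientation = FR",          # assumes standard Illumina paired-end reads
        "",
        "[control]",
        f"mateFile = {normal_bam}",
        "inputFormat = BAM",
        "mateOrientation = FR",
        "",
        "[target]",
        f"captureRegions = {capture_bed}",  # BED file of the exome capture targets
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder file names, replace with your own paths.
    config_text = build_freec_config(
        tumor_bam="tumor.bam",
        normal_bam="normal.bam",
        capture_bed="capture_targets.bed",
        chr_len_file="genome.len",
        out_dir="freec_out",
    )
    with open("config_exome.txt", "w") as handle:
        handle.write(config_text)
```

If the key names match your version, the resulting file would then be passed to the tool with something like `freec -conf config_exome.txt`, so BAM files can be used directly once they are referenced in the config.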