Manipulating Normal Individual Data For CNV Analysis
12.2 years ago
Vikas Bansal ★ 2.4k

Dear all,

I posted this question on SEQanswers as well but lost track of my thread there. I am working on CNV analysis for heart disease patients, trying to find CNVs that could be responsible for the disease. I have exome sequencing data for about 50 patients. For CNV analysis I found some interesting tools such as VarScan and ExomeCNV, but all of them also need exome sequencing data from a normal sample. I managed to get exome data for 7 normal individuals. My question: should I randomly pick 1 normal sample (for every patient) for the CNV analysis, or is there a way to combine all the normal samples into 1 sample and feed that into one of the tools along with the patients?

Best regards, Vikas

cnv varscan exome next-gen sequencing • 4.1k views

Hi,

I have the same question as the one you describe above, and I would like to know how you solved the problem. Thanks a lot.

Best regards,

Di Zhang

12.2 years ago

I would first of all be cautious about basing CNV analysis on what I presume are whole exome sequencing data. Coverage depth may not be uniform enough across your samples to judge copy number reliably. It can be done (Karakoc et al., 2011), but whole genome sequence data are generally more reliable for copy number analyses.

That said, I would not limit your control sample in the way you propose. After you have aligned your sequence and identified indels, why not draw on the very large amount of publicly available human CNV data? Examples include the HapMap Project, the Database of Genomic Variants, the Sanger CNV Project and the 1000 Genomes Project, as well as tools such as DECIPHER. This is one area where there is a wealth of publicly available control data.

Do you have SNP array data on your subjects as well? This would be one way of validating apparent copy number changes in your exome sequencing data, and may be a lot more accurate (although the resolution is not as great as with next-gen sequence). Parental samples may be important as well for evaluating pathogenicity of identified CNVs.


Thanks a lot for your reply. As you said, I can use publicly available data, but I am looking for novel CNVs that have not been reported yet. Also, I do not have SNP array data. Does that mean there is no method or tool I can use for the problem I described in my question?


Vikas -- I don't know the details of your experimental design, but the way to evaluate the novelty of any CNVs in your data is to compare to the control population data. But be warned -- just because a CNV is in the population data (and not "novel") doesn't mean it's not pathogenic.
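As a rough illustration of what "compare to the control population data" can look like in practice, here is a minimal sketch that flags a CNV call as already known when it shares a reciprocal overlap with any entry in a control catalogue. The reciprocal-overlap criterion and the 50% threshold are assumptions chosen for illustration, not something prescribed in this thread, and the function names and coordinates are hypothetical.

```python
def reciprocal_overlap(a, b):
    """Return the reciprocal overlap of two intervals.

    Each interval is (chrom, start, end) with half-open coordinates.
    The result is the overlap length divided by the longer interval,
    i.e. the smaller of the two overlap fractions.
    """
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def is_known(call, catalogue, threshold=0.5):
    """True if the call matches any catalogue entry at the given threshold."""
    return any(reciprocal_overlap(call, known) >= threshold for known in catalogue)


# Hypothetical control catalogue and patient calls, for illustration only.
catalogue = [("chr1", 150, 250), ("chr2", 1000, 5000)]
calls = [("chr1", 100, 200), ("chr3", 100, 200)]
novel = [c for c in calls if not is_known(c, catalogue)]
print(novel)
```

A call that survives this filter is only "novel" relative to the catalogue used, and, as noted above, a call that is filtered out is not thereby shown to be benign.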


"the way to evaluate the novelty of any CNVs in your data is to compare to the control population data" -> This is exactly what I am looking for. I have sequenced about 1000 genes from 50 patients, and now I have also sequenced 1000 genes from 7 normal individuals. I am just unsure how to use the data from these 7 normal individuals: should I pick only 1 normal individual at random for all patients, or is there a way to use all 7 normal individuals against all patients to find CNVs?
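One common way to use several normals at once is to build a pooled reference from their read depths and compare each patient against it. The sketch below assumes per-exon read depths have already been computed for every sample. It scales each sample by its total depth, takes the per-exon median across the normals, and reports a log2 ratio per exon for a patient. This illustrates the idea only; it is not how VarScan or ExomeCNV work internally, and all names and numbers are made up.

```python
import math
from statistics import median


def normalize(depths):
    """Scale per-exon depths to fractions of the sample total,
    which removes differences in overall sequencing depth."""
    total = sum(depths.values())
    return {exon: depth / total for exon, depth in depths.items()}


def pooled_reference(normals):
    """Median normalized depth per exon across all normal samples."""
    scaled = [normalize(sample) for sample in normals]
    return {exon: median(s[exon] for s in scaled) for exon in scaled[0]}


def log2_ratios(patient, reference, pseudo=1e-9):
    """log2(patient / reference) per exon; pseudo avoids division by zero."""
    scaled = normalize(patient)
    return {
        exon: math.log2((scaled[exon] + pseudo) / (reference[exon] + pseudo))
        for exon in reference
    }


# Toy depths: three normals sequenced to different depths, one patient
# with roughly half the expected depth at exon B.
normals = [
    {"A": 100, "B": 100, "C": 100, "D": 100},
    {"A": 200, "B": 200, "C": 200, "D": 200},
    {"A": 50, "B": 50, "C": 50, "D": 50},
]
patient = {"A": 100, "B": 50, "C": 100, "D": 100}

reference = pooled_reference(normals)
ratios = log2_ratios(patient, reference)
for exon, ratio in ratios.items():
    print(exon, round(ratio, 2))
```

The median is used rather than the mean so that a single normal carrying its own CNV at an exon does not distort the reference. With only 7 normals the reference will still be noisy, so calls should be treated as candidates for validation.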

12.2 years ago
Christof Winter ★ 1.0k

This does not explicitly answer your question, but you could try Control-FREEC, which does not require a normal control.

I found it easy to use and well documented.


Thanks a lot for your reply. I looked at it, but it does not say whether it will work on exome data.


The new version 5.0 also runs with exome data: http://bioinfo-out.curie.fr/projects/freec/tutorial.html#EXOME although you need to provide a control sample in that case.


Does anyone have a working script for Control-FREEC, or any documentation that shows how to create the config file for CNV-LOH analysis of my samples? I have normal and tumor samples on which I want to run the CNV analysis, and I am having some trouble understanding the manual. How should I create the config file for a normal and a tumor sample? Can Control-FREEC be run directly on BAM files with a single command? I would be grateful if anyone could share something like that; in the meantime I am trying to figure it out.
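For what it is worth, a minimal sketch of generating such a config file from Python is below. The section and parameter names (`chrLenFile`, `window`, `ploidy`, `outputDir`, `mateFile`, `inputFormat`, `mateOrientation`, `captureRegions`) are written as I recall them from the Control-FREEC manual and should be checked against the version you run. All file paths are hypothetical placeholders. The `[BAF]` section needed for LOH is left out, since it requires additional SNP and reference files.

```python
def freec_config(sample_bam, control_bam, chr_len_file, capture_bed, output_dir):
    """Build the text of a Control-FREEC style config for a tumor/normal pair.

    Parameter names are assumptions based on the Control-FREEC manual;
    verify them against the manual for your installed version.
    """
    sections = {
        "general": {
            "chrLenFile": chr_len_file,  # chromosome lengths file
            "ploidy": 2,
            "window": 0,                 # 0 = use capture regions (exome mode)
            "outputDir": output_dir,
        },
        "sample": {
            "mateFile": sample_bam,
            "inputFormat": "BAM",
            "mateOrientation": "FR",     # paired-end Illumina reads
        },
        "control": {
            "mateFile": control_bam,
            "inputFormat": "BAM",
            "mateOrientation": "FR",
        },
        "target": {
            "captureRegions": capture_bed,
        },
    }
    lines = []
    for name, options in sections.items():
        lines.append(f"[{name}]")
        lines.extend(f"{key} = {value}" for key, value in options.items())
        lines.append("")
    return "\n".join(lines)


# Hypothetical paths, for illustration only.
config_text = freec_config(
    sample_bam="tumor.bam",
    control_bam="normal.bam",
    chr_len_file="hg19.len",
    capture_bed="capture_regions.bed",
    output_dir="freec_out",
)
print(config_text)
```

The printed text can be saved to a file such as `config.txt`. As far as I recall, the tool is then started with `freec -conf config.txt`, and BAM input requires samtools to be available, but please confirm both in the manual.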
