Finding Novel Disease-Causing CNVs in a Large Number of Patients
10.6 years ago
Vikas Bansal ★ 2.4k

Dear all,

After some analysis, I used a tool to call copy numbers from my sequencing data. The output looks like this:

CHROM  START  END   CopyNumber
chr1   0      1000  2.151000
chr2   0      1000  4.478000
chr2   1000   2000  5.431000

I repeated this analysis for 50 patients, so I have 50 files like the one shown above, each with about 10,000 CNVs. Now I want to find out which of these CNVs are disease-causing. This is what I am thinking:

1.) Take the common CNVs that are present in all 50 patients.

2.) Filter out those that are already present in a database (DGV).
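
The two steps above can be sketched roughly as follows. This is only a minimal illustration, not a validated pipeline, and it rests on several assumptions: every patient file uses the same fixed windows as in the output shown earlier, a window counts as altered when its copy number is below 1.5 or above 2.5 (arbitrary cutoffs around the diploid value of 2), and `known` is a plain list of regions that stands in for DGV entries loaded separately. All function names are hypothetical.

```python
from collections import Counter

def parse_calls(text):
    """Parse whitespace-separated CHROM START END CopyNumber rows into a dict."""
    calls = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or fields[0] == "CHROM":
            continue  # skip blank lines and the header row
        chrom, start, end, cn = fields
        calls[(chrom, int(start), int(end))] = float(cn)
    return calls

def is_altered(cn, loss_below=1.5, gain_above=2.5):
    """Assumed cutoffs around the diploid copy number of 2 (not from the tool)."""
    return cn < loss_below or cn > gain_above

def recurrent_windows(patients, min_fraction=1.0):
    """Step 1: windows altered in at least min_fraction of the patients."""
    counts = Counter()
    for calls in patients:
        counts.update(w for w, cn in calls.items() if is_altered(cn))
    needed = min_fraction * len(patients)
    return sorted(w for w, n in counts.items() if n >= needed)

def overlaps(window, known):
    """True if window overlaps any (chrom, start, end) region, half-open."""
    chrom, start, end = window
    return any(c == chrom and s < end and start < e for c, s, e in known)

def novel_windows(windows, known):
    """Step 2: drop windows that overlap an already catalogued region."""
    return [w for w in windows if not overlaps(w, known)]
```

With `min_fraction=1.0` a window must be altered in every patient, which matches step 1 as written; lowering it (for example to 0.8) is one way to tolerate missed calls in a few samples. Gains and losses are counted together here, so a real analysis would probably want to track them separately.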

Is there a better strategy (a pipeline, a filtering method, or a way to visualize all samples at once) for finding novel CNVs in this kind of data?

Thanks and Best regards,

Vikas

cnv next-gen sequencing visualization • 2.6k views

What is your phenotype? Is this tumor vs. normal tissue? Several of the replies here, I think, assume you are looking at a cancer phenotype. However, if you want to identify "disease-causing CNVs" in a phenotype associated with germline mutations/CNVs, you need to know the status of parental inheritance (inherited CNVs are less likely to be pathogenic), and it really comes down to size of CNV (larger = more likely pathogenic) and gene content (very small CNVs can be pathogenic if the right gene is deleted).
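
To make the size and gene-content point concrete, here is a small sketch of how CNVs could be annotated and ranked. It assumes each CNV is a `(chrom, start, end)` tuple and that gene coordinates are available as `(chrom, start, end, name)` tuples; the gene name in the usage below is a placeholder, and the ordering (gene-containing first, then larger first) is just one simple heuristic, not an established pathogenicity score.

```python
def annotate(cnv, genes):
    """Attach size and overlapping gene names to one (chrom, start, end) CNV."""
    chrom, start, end = cnv
    hits = [name for c, s, e, name in genes
            if c == chrom and s < end and start < e]  # half-open overlap
    return {"cnv": cnv, "size": end - start, "genes": hits}

def prioritise(cnvs, genes):
    """Rank CNVs: those overlapping a gene first, then by decreasing size."""
    annotated = [annotate(c, genes) for c in cnvs]
    return sorted(annotated, key=lambda a: (len(a["genes"]) == 0, -a["size"]))

# Usage with made-up coordinates and a placeholder gene name.
genes = [("chr1", 500, 1500, "GENE_A")]
cnvs = [("chr1", 0, 1000), ("chr2", 0, 50000), ("chr1", 2000, 3000)]
for entry in prioritise(cnvs, genes):
    print(entry["cnv"], entry["size"], entry["genes"])
```

Inheritance status is not modelled here, since it needs parental data that the original post does not mention.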


@Alex: Could you please explain what you mean by "and it really comes down to size of CNV (larger = more likely pathogenic) and gene content (very small CNVs can be pathogenic if the right gene is deleted)"?


Just curious: which tool did you use in the end?


I used mrCaNaVaR.


@Vikas, take a look at the 2011 review by Girirajan, Campbell, and Eichler (PMID:21854229). I think that paper gives a good overview.

10.6 years ago
B. Arman Aksoy ★ 1.2k

For visualization I would recommend Broad's IGV -- it is a great tool to start exploring the genomic alteration data across many samples. If you import your file into IGV, you will probably have a chance to see highly recurrent alterations just by eye. As far as I know, you can import BED files, but here is the documentation on supported file formats just in case: http://www.broadinstitute.org/software/igv/FileFormats
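
If it helps, a table like the one in the question can be turned into a bedGraph track, which is one of the formats IGV can load. The sketch below assumes the START/END columns are already 0-based, half-open coordinates (which the windows starting at 0 suggest, but check your caller's documentation); the function name and track name are made up for illustration.

```python
def to_bedgraph(table_text, track_name):
    """Convert CHROM START END CopyNumber rows into bedGraph text."""
    lines = [f'track type=bedGraph name="{track_name}"']
    for row in table_text.splitlines():
        fields = row.split()
        if len(fields) != 4 or fields[0] == "CHROM":
            continue  # skip blank lines and the header row
        chrom, start, end, cn = fields
        # bedGraph columns: chrom, start, end, value (tab-separated)
        lines.append(f"{chrom}\t{int(start)}\t{int(end)}\t{float(cn):g}")
    return "\n".join(lines) + "\n"

table = """CHROM  START  END   CopyNumber
chr1   0      1000  2.151000
chr2   0      1000  4.478000
chr2   1000   2000  5.431000
"""
print(to_bedgraph(table, "patient01"), end="")
```

Writing one such file per patient and loading them together should give one track per sample, so recurrent regions line up vertically.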

For the analysis, I have heard of a couple of people using the RAE algorithm to find significant/recurrent CNAs. Here is the original paper if you want to see it in action:

Taylor BS, Barretina J, Socci ND, DeCarolis P, Ladanyi M, et al. (2008) Functional Copy-Number Alterations in Cancer. PLoS ONE 3(9): e3179. doi:10.1371/journal.pone.0003179

10.6 years ago

In addition to RAE, there are algorithms called RTS, GISTIC, and JISTIC, which do much the same thing: they look across a cohort and find focal regions of statistically significant amplification and deletion.
