Question: common CNVs among multiple files
popayekid55 wrote (12 weeks ago):

Hi all, I have analyzed 35 normal WGS samples for CNVs using CNVnator. Now I want to find the CNV regions that are common to these files so that they can be used as a control panel.

Is there a tool or method to obtain these common regions across all files at once?

Thank you

Tags: cnv, genome
modified 12 weeks ago by Kevin Blighe37k • written 12 weeks ago by popayekid5550

We are not necessarily familiar with the output format of cnvNator, so it would be best if you could elaborate on which files you have.

written 12 weeks ago by WouterDeCoster36k

The output will be converted to BED format, like this:

1   629471  638210  1   0.431094
1   671461  675070  3   2.75301
1   1414076 1416640 1   0.560963
1   2583526 2591885 3   12.4121
1   2634161 2684320 1   0.000940585

The columns are: chromosome, start and end of the CNV, type of CNV, and a score.

modified 12 weeks ago • written 12 weeks ago by popayekid5550
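For anyone who wants to load the five-column layout described above into Python before intersecting, here is a minimal sketch. The names `Cnv`, `parse_cnv_line` and `parse_cnv_text` are made up for illustration, and the type column is kept as text because the thread does not say what the codes 1 and 3 stand for.

```python
from typing import List, NamedTuple


class Cnv(NamedTuple):
    chrom: str
    start: int
    end: int
    cnv_type: str  # kept as text; the meaning of 1 vs 3 is not stated in the thread
    score: float


def parse_cnv_line(line: str) -> Cnv:
    """Parse one whitespace-separated line: chrom, start, end, type, score."""
    chrom, start, end, cnv_type, score = line.split()
    return Cnv(chrom, int(start), int(end), cnv_type, float(score))


def parse_cnv_text(text: str) -> List[Cnv]:
    """Parse a block of lines, skipping blank lines and '#' comment lines."""
    return [
        parse_cnv_line(line)
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]


# Two rows copied from the example posted above.
example = """\
1   629471  638210  1   0.431094
1   671461  675070  3   2.75301
"""
records = parse_cnv_text(example)
```

Splitting on any whitespace means the same function works whether the columns are separated by tabs or by runs of spaces.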

bedtools multiinter will help you

written 12 weeks ago by IP530

Parse the chromosome, start, and end from the CNVnator results, then overlap the resulting files using BEDTools multiIntersectBed or BEDOPS.

written 12 weeks ago by arup850
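To illustrate what a multi-file intersection computes, here is a pure-Python sketch of the idea: count how many samples cover each stretch of the genome and keep the stretches covered by at least a chosen number of samples. It is not a replacement for BEDTools or BEDOPS; the function names and the toy coordinates are invented for this example, and coordinates are assumed to be half-open as in BED.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals from one sample."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def shared_regions(samples, min_samples):
    """samples: dict of sample name -> list of (chrom, start, end).

    Returns sorted (chrom, start, end, n_samples) tuples for every stretch
    covered by at least min_samples samples. A new stretch starts wherever
    any sample's interval starts or ends.
    """
    events = defaultdict(list)  # chrom -> list of (position, +1 or -1)
    for regions in samples.values():
        by_chrom = defaultdict(list)
        for chrom, start, end in regions:
            by_chrom[chrom].append((start, end))
        for chrom, intervals in by_chrom.items():
            # Merge first so one sample never counts twice at a position.
            for start, end in merge_intervals(intervals):
                events[chrom].append((start, 1))
                events[chrom].append((end, -1))

    out = []
    for chrom in sorted(events):
        depth = 0
        prev = None
        for pos, delta in sorted(events[chrom]):
            if prev is not None and pos > prev and depth >= min_samples:
                out.append((chrom, prev, pos, depth))
            depth += delta
            prev = pos
    return out


# Toy data, not taken from the thread.
samples = {
    "s1": [("1", 100, 200)],
    "s2": [("1", 150, 250)],
    "s3": [("1", 180, 300), ("2", 10, 20)],
}
in_all = shared_regions(samples, min_samples=len(samples))
in_two_or_more = shared_regions(samples, min_samples=2)
```

Setting `min_samples` to the number of files gives the regions present in every sample; lowering it gives regions shared by a subset, which may be more useful for a control panel when a call is missing in a few samples.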
Kevin Blighe (Republic of Ireland) wrote (12 weeks ago):

GAIA will find recurrent copy number regions from your input data and assign a p-value to each region to help with filtering them. I believe you have the required information to run GAIA. The starting data is a single table of all regions from all samples stacked row-wise, with an extra column that indicates the sample from which each region derived. You decide your own cut-off points for gain (1) and loss (0) based on the segment mean.

A practical example for cancer is given here: C: How to extract the list of genes from TCGA CNV data

Kevin

written 12 weeks ago by Kevin Blighe37k
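The stacked table described in this answer can be sketched in Python as follows. This only builds the rows; it does not call GAIA, and the column order is not claimed to match what GAIA expects. The function name `stack_regions`, the cut-offs 0.75 and 1.25, and the use of the fifth BED column as a stand-in for the segment mean are all assumptions made for illustration.

```python
def stack_regions(per_sample, loss_below, gain_above):
    """Stack regions from all samples into one list of rows.

    per_sample: dict of sample name -> list of (chrom, start, end, segment_mean).
    Returns rows of (sample, chrom, start, end, aberration), where
    aberration is 1 for gain and 0 for loss. Segments whose mean falls
    between the two cut-offs are treated as neutral and dropped.
    """
    rows = []
    for sample, regions in per_sample.items():
        for chrom, start, end, segment_mean in regions:
            if segment_mean >= gain_above:
                aberration = 1
            elif segment_mean <= loss_below:
                aberration = 0
            else:
                continue
            rows.append((sample, chrom, start, end, aberration))
    return rows


# Sample names and the third region are invented; the first two regions
# reuse coordinates and scores from the BED example earlier in the thread.
per_sample = {
    "s1": [("1", 629471, 638210, 0.431094), ("1", 671461, 675070, 2.75301)],
    "s2": [("1", 700000, 710000, 1.02)],
}
# Hypothetical cut-offs; choose your own based on your data.
rows = stack_regions(per_sample, loss_below=0.75, gain_above=1.25)
```

The extra sample column is what lets a recurrence method count how many distinct samples support each region.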

I did not understand completely. I am looking for common (overlapping) CNV coordinates among these 35 files.

written 12 weeks ago by popayekid5550

GAIA will find the common regions and assign a p-value based on how recurrent (frequent) they are in your dataset. The idea is that the more recurrent ones are more important.

If you literally just want to see the overlapping BED regions, even if they occur in just 2 samples, then use the BEDTools solutions that were suggested. However, what would you do in the situation where a region is a gain (amplified) in one sample but a loss (deleted) in another? Does it make sense to merge these, given your downstream analysis plan?

modified 12 weeks ago • written 12 weeks ago by Kevin Blighe37k
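One way to check for the gain-versus-loss situation raised above, before merging anything, is to list the places where a gain in one sample overlaps a loss in a different sample. The sketch below does this with a simple pairwise comparison; the function name, the "gain"/"loss" labels and the toy calls are assumptions for illustration, and mapping your own type column onto those labels is left to you.

```python
def conflicting_overlaps(calls):
    """Find overlaps between a gain in one sample and a loss in another.

    calls: list of (sample, chrom, start, end, kind), kind is "gain" or "loss".
    Returns sorted (chrom, start, end, gain_sample, loss_sample) tuples,
    one per overlapping gain/loss pair, with half-open coordinates.
    """
    gains = [c for c in calls if c[4] == "gain"]
    losses = [c for c in calls if c[4] == "loss"]
    conflicts = []
    for g_sample, g_chrom, g_start, g_end, _ in gains:
        for l_sample, l_chrom, l_start, l_end, _ in losses:
            if g_sample == l_sample or g_chrom != l_chrom:
                continue
            start = max(g_start, l_start)
            end = min(g_end, l_end)
            if start < end:  # a non-empty overlap
                conflicts.append((g_chrom, start, end, g_sample, l_sample))
    return sorted(conflicts)


# Toy calls, not taken from the thread.
calls = [
    ("s1", "1", 100, 200, "gain"),
    ("s2", "1", 150, 250, "loss"),
    ("s3", "1", 300, 400, "loss"),
    ("s1", "2", 100, 200, "loss"),
]
conflicts = conflicting_overlaps(calls)
```

If the list comes back empty, intersecting gains and losses together loses nothing; otherwise it may be safer to intersect the gain calls and the loss calls separately. The pairwise loop is quadratic, which is fine for a few thousand calls but would need an interval index for much larger inputs.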