Question: common CNVs among multiple files
popayekid55 wrote (8 months ago):

Hi all, I have analyzed 35 normal WGS samples for CNVs using CNVnator. Now I want to find the CNV regions that are common among these files so that they can be used as a control panel.

Is there a tool or method to obtain these common regions across all files at once?

Thank you.

cnv genome • 301 views
modified 8 months ago by Kevin Blighe • written 8 months ago by popayekid55

We are not necessarily familiar with the output format of CNVnator, so it would be best if you could elaborate on which files you have.

written 8 months ago by WouterDeCoster

The output will be converted into BED format, like below:

1   629471  638210  1   0.431094
1   671461  675070  3   2.75301
1   1414076 1416640 1   0.560963
1   2583526 2591885 3   12.4121
1   2634161 2684320 1   0.000940585

The columns are: chromosome, start and end of the CNV, type of CNV, and a score.

modified 8 months ago • written 8 months ago by popayekid55

bedtools multiinter will help you

written 8 months ago by IP590

Parse the chr, start, and end from the CNVnator results, then overlap the output files using Bedtools multiIntersectBed or BEDOPS.
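
To illustrate what this kind of multi-file intersection computes, here is a minimal pure-Python sketch of a coverage sweep over intervals from several samples. It is not the bedtools implementation, only an approximation of the idea behind `bedtools multiinter -i a.bed b.bed ...`. The function names (`merge_intervals`, `shared_regions`) and the toy coordinates are made up for the example, and intervals are assumed to be half-open as in BED.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching (chrom, start, end) intervals of ONE sample."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(m) for m in merged]


def shared_regions(samples, min_samples):
    """Return (chrom, start, end, n_samples) for every stretch that is
    covered by at least `min_samples` samples.

    samples: list of interval lists, one list per sample.
    """
    # +1 where a sample's interval starts, -1 where it ends
    deltas = defaultdict(lambda: defaultdict(int))
    for intervals in samples:
        # merge first so each sample counts at most once per position
        for chrom, start, end in merge_intervals(intervals):
            deltas[chrom][start] += 1
            deltas[chrom][end] -= 1

    regions = []
    for chrom in sorted(deltas):
        depth = 0
        positions = sorted(deltas[chrom])
        for pos, nxt in zip(positions, positions[1:]):
            depth += deltas[chrom][pos]
            if depth >= min_samples:
                regions.append((chrom, pos, nxt, depth))
    return regions


# Toy data (hypothetical coordinates, three samples)
toy = [
    [("1", 100, 200)],
    [("1", 150, 300)],
    [("1", 180, 250)],
]
print(shared_regions(toy, min_samples=3))
```

With 35 samples, `min_samples=35` would keep only stretches present in every file, while a lower value keeps regions shared by a subset.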

written 8 months ago by arup

Kevin Blighe wrote (8 months ago):

GAIA will find recurrent copy number regions from your input data and assign a p-value to each region to help with filtering them. I believe you have the required information to run GAIA. The starting data is a row-bound list of all regions from all samples, with an extra column that indicates the sample from which each region derived. You decide your own cut-off points for gain (1) and loss (0) based on the segment mean.
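
As a rough sketch of that starting table, the following Python stacks per-sample regions into one list with a sample column and a 0/1 aberration code. Everything here is an assumption for illustration: the function name `stack_samples`, the sample name `s1`, the treatment of the fifth BED column as the value to threshold, and the cut-offs 0.75 and 1.25. The exact column layout GAIA expects should be checked against its own documentation.

```python
def stack_samples(sample_rows, loss_max, gain_min):
    """Row-bind regions from all samples and code them as loss (0) or gain (1).

    sample_rows: dict mapping sample name -> list of (chrom, start, end, score)
    loss_max:    scores at or below this are called loss (user-chosen cut-off)
    gain_min:    scores at or above this are called gain (user-chosen cut-off)
    Regions between the two cut-offs are dropped as neutral.
    """
    stacked = []
    for sample, rows in sample_rows.items():
        for chrom, start, end, score in rows:
            if score <= loss_max:
                aberration = 0
            elif score >= gain_min:
                aberration = 1
            else:
                continue
            stacked.append((sample, chrom, start, end, aberration))
    return stacked


# Two rows taken from the BED example in this thread; "s1" is a made-up name
example = {
    "s1": [
        ("1", 629471, 638210, 0.431094),
        ("1", 671461, 675070, 2.75301),
    ]
}
for row in stack_samples(example, loss_max=0.75, gain_min=1.25):
    print(*row, sep="\t")
```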

A practical example for cancer is given here: C: How to extract the list of genes from TCGA CNV data

Kevin

written 8 months ago by Kevin Blighe

I did not understand completely. I am looking for common (overlapping) CNV coordinates among these 35 files.

written 8 months ago by popayekid55

GAIA will find the common regions and assign a p-value based on how recurrent (frequent) they are in your dataset. The idea is that the more recurrent ones are more important.

If you literally just want to see the overlapping BED regions, even if they occur in just 2 samples, then use the BEDTools solutions that were suggested. However, what would you do in the situation where one region is a gain (amplified) in one sample but a loss (deleted) in another? Does it make sense to merge these, given your downstream analysis plan?
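
One way to see whether this gain-versus-loss situation actually arises in a dataset is to list overlaps between calls of opposite type from different samples before merging anything. The sketch below is a simple pairwise check, quadratic in the number of calls, so it is meant for inspection rather than speed. The function name `conflicting_overlaps`, the `"gain"`/`"loss"` labels, and the toy calls are hypothetical.

```python
def conflicting_overlaps(calls):
    """Find places where a gain in one sample overlaps a loss in another.

    calls: list of (sample, chrom, start, end, kind), kind is "gain" or "loss".
    Returns (chrom, overlap_start, overlap_end, gain_sample, loss_sample).
    Intervals are treated as half-open, as in BED.
    """
    gains = [c for c in calls if c[4] == "gain"]
    losses = [c for c in calls if c[4] == "loss"]
    conflicts = []
    for g_sample, g_chrom, g_start, g_end, _ in gains:
        for l_sample, l_chrom, l_start, l_end, _ in losses:
            same_place = g_chrom == l_chrom and g_start < l_end and l_start < g_end
            if g_sample != l_sample and same_place:
                conflicts.append(
                    (g_chrom, max(g_start, l_start), min(g_end, l_end), g_sample, l_sample)
                )
    return conflicts


# Toy calls (made-up samples and coordinates)
toy_calls = [
    ("s1", "1", 100, 200, "gain"),
    ("s2", "1", 150, 250, "loss"),
    ("s3", "1", 300, 400, "loss"),
]
print(conflicting_overlaps(toy_calls))
```

If the list is empty, merging all calls together loses nothing; otherwise it may be safer to intersect gains and losses separately.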

modified 8 months ago • written 8 months ago by Kevin Blighe