Question: Find 3 copies of CNV in 1000 Genomes VCF
Asked 2.7 years ago by Joe Ashmore (United States):

I am trying to search the 1000 Genomes VCF data for genes that are present in more than two copies (copy number > 2).

I have downloaded the most recent VCF for my chromosome of interest (chr4) from: http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/

I searched the VCF with the command below; its output (header matches first, then the first few records) follows:

gunzip -c ALL.chr4.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz | grep CN3 | cut -f 1-5

##ALT=<ID=CN3,Description="Copy number allele: 3 copies">
##ALT=<ID=CN30,Description="Copy number allele: 30 copies">
##ALT=<ID=CN31,Description="Copy number allele: 31 copies">
##ALT=<ID=CN32,Description="Copy number allele: 32 copies">
##ALT=<ID=CN33,Description="Copy number allele: 33 copies">
##ALT=<ID=CN34,Description="Copy number allele: 34 copies">
##ALT=<ID=CN35,Description="Copy number allele: 35 copies">
##ALT=<ID=CN36,Description="Copy number allele: 36 copies">
##ALT=<ID=CN37,Description="Copy number allele: 37 copies">
##ALT=<ID=CN38,Description="Copy number allele: 38 copies">
##ALT=<ID=CN39,Description="Copy number allele: 39 copies"> 

4   3467434 esv3599431;esv3599432   T   <CN2>,<CN3>
4   8965379 esv3599554;esv3599555;esv3599556    C   <CN0>,<CN2>,<CN3>
4   9104669 esv3599560;esv3599561;esv3599562    G   <CN0>,<CN2>,<CN3>
4   9126509 esv3599563;esv3599564;esv3599565    C   <CN0>,<CN2>,<CN3>
4   9370866 esv3599568;esv3599569;esv3599570    G   <CN0>,<CN2>,<CN3>
4   9418201 esv3599572;esv3599573;esv3599574    G   <CN0>,<CN2>,<CN3>
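
A note on the command above: `grep CN3` is a substring match, so it also picks up the `CN30` through `CN39` header lines and would match that text anywhere on a line. A minimal sketch of a stricter filter, assuming ALT is column 5 as in the standard VCF layout, skips the header lines and tests each comma-separated ALT allele for an exact `<CN3>`. The records in `demo.vcf` and the function name `cn3_sites` are made up for illustration.

```shell
# Tiny made-up VCF fragment (tab-separated, first five columns only).
printf '##ALT=<ID=CN3,Description="Copy number allele: 3 copies">\n' > demo.vcf
printf '##ALT=<ID=CN30,Description="Copy number allele: 30 copies">\n' >> demo.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\n' >> demo.vcf
printf '4\t67914\tesvA\tA\t<CN0>,<CN2>\n' >> demo.vcf
printf '4\t3467434\tesvB\tT\t<CN2>,<CN3>\n' >> demo.vcf
printf '4\t5000000\tesvC\tG\t<CN30>\n' >> demo.vcf

# Keep only data lines whose ALT column (5) contains an exact <CN3> allele.
cn3_sites() {
  awk -F'\t' -v OFS='\t' '
    /^#/ { next }                      # skip header and meta lines
    {
      n = split($5, alt, ",")          # ALT may list several alleles
      for (i = 1; i <= n; i++)
        if (alt[i] == "<CN3>") { print $1, $2, $3, $4, $5; break }
    }' "$1"
}

cn3_sites demo.vcf
```

On the real file the same function could read from a pipe, for example `gunzip -c ALL.chr4.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz | cn3_sites -`.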

I have also tried searching using grep CNV with similar results.

##ALT=<ID=CNV,Description="Copy Number Polymorphism">
4   67914   esv3599345;esv3599346   A   <CN0>,<CN2>
4   138870  esv3599353;esv3599354   C   <CN0>,<CN2>

How do I find out whether an individual carries extra copies of a gene/region? I assume the best way is to narrow the search to the region of the gene of interest, but I don't want to lose relevant information by doing so.
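
One possible way to approach the per-individual question is to decode each genotype against the ALT list. The sketch below rests on two assumptions that should be checked against the file's header: genotype index 0 (the reference allele) stands for one copy per haplotype, and index k points at the k-th ALT allele `<CNn>`, which contributes n copies. Summing over both haplotypes gives a sample's total copy number, so anything above 2 counts as a gain. The sample names `S1` and `S2`, the records in `demo_gt.vcf`, and the function name `copy_gains` are invented for illustration.

```shell
# Made-up records: fixed VCF columns 1-9, then two hypothetical samples.
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n' > demo_gt.vcf
printf '4\t3467434\tesvB\tT\t<CN2>,<CN3>\t100\tPASS\tSVTYPE=CNV\tGT\t0|0\t0|2\n' >> demo_gt.vcf
printf '4\t8965379\tesvC\tC\t<CN0>,<CN2>,<CN3>\t100\tPASS\tSVTYPE=CNV\tGT\t1|2\t2|3\n' >> demo_gt.vcf

# Print CHROM, POS, sample name and total copies for samples with more than 2 copies.
copy_gains() {
  awk -F'\t' -v OFS='\t' '
    /^##/     { next }                                   # meta lines
    /^#CHROM/ { for (i = 10; i <= NF; i++) name[i] = $i; next }
    $5 !~ /<CN[0-9]+>/ { next }                          # not a CNV record
    {
      split("", cn)                                      # clear the per-record lookup
      cn[0] = 1                                          # assumption: REF = 1 copy per haplotype
      n = split($5, alt, ",")
      for (k = 1; k <= n; k++) {
        a = alt[k]
        if (a ~ /^<CN[0-9]+>$/) { gsub(/[^0-9]/, "", a); cn[k] = a + 0 }
        else cn[k] = 1                                   # assumption: other alleles keep 1 copy
      }
      for (i = 10; i <= NF; i++) {
        split($i, f, ":")                                # GT is the first FORMAT field
        m = split(f[1], h, "[|/]")
        if (m != 2) continue                             # skip haploid or malformed calls
        if (h[1] == "." || h[2] == ".") continue         # skip missing calls
        total = cn[h[1]] + cn[h[2]]
        if (total > 2) print $1, $2, name[i], total
      }
    }' "$1"
}

copy_gains demo_gt.vcf
```

To restrict this to a gene, a POS range test on column 2 could be added to the record block; because a CNV spans an interval whose end is typically recorded in the INFO `END` field, a proper overlap check would need that value as well.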

If there is a way to do this in R, it would be even better.

Thanks!
