Question: How to merge VCF files for SNPs?
1
3.8 years ago by
bingnas10
United States
bingnas10 wrote:

Hi all

I called SNPs separately for six individuals, which gave me six VCF files. I want to merge them by chromosome and position, and then convert the genotypes to the integer codes 0, 1, 2.

The question is:

Could anyone please help me merge them as follows?

REF is hg19, ALT1 is the first patient, ALT2 the second patient, and so on up to ALT6 for the sixth patient.

#CHROM POS REF ALT1 ALT2 ALT3 ALT4 ALT5 ALT6
chrM 3 T C G A C T C
chrM 4 C A C T A G C
chrM 150 T C T C C G A
chrM 195 C T C T C A T
chrM 410 A T T C C T C
chrM 711 G A C T T G G
chrM 1890 G . C T C A C
chrM 2354 C T T C A G C
chrM 2485 C T A G G A C,T
chrM 3457 T C G A G A C
chrM 4162 C T T A T C,A A
chrM 4217 T C G T A G T
chrM 4918 A G C . G A A
chrM 5581 C T G A A G .
chrM 8698 G A G A A C A
chrM 8702 G A G C G C A
chrM 9378 G A C T G A C
chrM 9541 C T C T C T C
chrM 10284 A G G A A C C
chrM 10399 G A G A A G T
chrM 10464 T C C G T C G
chrM 10820 G A G T . C A
chrM 10874 C T G T G C,T G
chrM 11018 C T C T A C C
chrM 11252 A G . C G A T
chrM 11723 C T . A C T T
chrM 11813 A G G A C A C

 

Is that possible?

I put in the periods because someone told me they are needed at positions where there is no call.

 

Thank you in advance

Bing 

sequencing snp next-gen sequence • 1.6k views
ADD COMMENTlink modified 3.7 years ago • written 3.8 years ago by bingnas10
1

If I understand correctly, you want to recode SNPs from A/C/T/G to 0, 1, 2?

You can use plink. First convert the VCF into plink format, then run plink --recode12. If you are more comfortable working with VCF, you can convert the result back to VCF afterwards.

ADD REPLYlink written 3.8 years ago by stolarek.ir600

Thank you, stolarek, for your answer. Yes, that is what I want. I will try it.

Bing

ADD REPLYlink written 3.8 years ago by bingnas10
5
3.8 years ago by
ebrown1955300
United States
ebrown1955300 wrote:

Assuming you have 6 VCF files, you can use GATK.

java -jar GenomeAnalysisTK.jar \
-T CombineVariants \
-R reference.fasta \
--variant input1.vcf \
--variant input2.vcf \
--variant input3.vcf \
--variant input4.vcf \
--variant input5.vcf \
--variant input6.vcf \
-o output.vcf \
-genotypeMergeOptions UNIQUIFY

Then you can use VariantsToTable to turn output.vcf into a table as requested:

java -jar GenomeAnalysisTK.jar \
-R reference.fasta -T VariantsToTable \
-V output.vcf \
-F CHROM -F POS -F ID -F QUAL -GF GT \
-o output.table
ADD COMMENTlink written 3.8 years ago by ebrown1955300
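
To make the merge step concrete, here is a minimal Python sketch of merging per-sample calls by chromosome and position. It is only an illustration of the idea, not a replacement for CombineVariants: it ignores REF/ALT reconciliation, headers and every other VCF detail. The function name `merge_calls` and the input layout (one dictionary per sample, keyed by `(chrom, pos)`) are assumptions made for this example. The genotypes in the example are taken from the merged output posted later in the thread.

```python
def merge_calls(per_sample_calls):
    """Merge per-sample genotype calls into one row per site.

    per_sample_calls maps a sample name to a dict {(chrom, pos): genotype}.
    Sites absent from a sample are filled with "./." (no call).
    """
    samples = list(per_sample_calls)
    # Union of all sites seen in any sample; chromosome names sort
    # lexically and positions numerically.
    sites = sorted({site for calls in per_sample_calls.values() for site in calls})
    rows = []
    for chrom, pos in sites:
        genotypes = [per_sample_calls[s].get((chrom, pos), "./.") for s in samples]
        rows.append([chrom, pos] + genotypes)
    return samples, rows


# Two of the six samples, with a few chrM sites each.
calls = {
    "3395_167": {("chrM", 150): "1/1", ("chrM", 195): "1/1"},
    "3395_1": {("chrM", 3): "0/1", ("chrM", 4): "0/1", ("chrM", 150): "1/1"},
}
samples, rows = merge_calls(calls)
print("\t".join(["#CHROM", "POS"] + samples))
for row in rows:
    print("\t".join(str(value) for value in row))
```

Note that a "./." produced this way only means the site was not present in that sample's file. It does not distinguish a homozygous-reference genotype from a position with no coverage.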
0
3.7 years ago by
bingnas10
United States
bingnas10 wrote:

Hi ebrown1955,

Thank you very much for your great answer. Here is what I got from the first command (CombineVariants):

 

 

#CHROM POS ID REF ALT  3395_167 3395_1 3395_341 3395_343 3395_49 3395_60    
chrM 3 . T C ./. 0/1:6:2,0,4,0:4:54,0,27:0 ./. ./. ./. ./.    
chrM 4 . C A,G ./. 0/1:6:1,0,5,0:5:0 2/2:2:0,0,2,0:2:0 ./. ./. 2/2:2:0,0,2,0:2:0  
chrM 72 . T C ./. ./. 1/1:21:0,0,21,0:21:178,63,0:0 1/1:15:0,0,15,0:15:174,45,0:0 ./. ./.    
chrM 73 . G A ./. ./. 1/1:21:0,0,21,0:21:179,63,0:0 1/1:15:0,0,15,0:15:173,45,0:0 ./. ./.    
chrM 150 . T C 1/1:13:0,0,4,9:13:255,39,0:0 1/1:19:0,0,16,3:19:255,57,0:0 1/1:20:0,0,20,0:20:185,60,0:0 1/1:6:0,0,6,0:6:142,18,0:0 1/1:8:0,0,7,1:8:178,24,0:0 1/1:2:0,0,2,0:2:66,6,0:0
chrM 152 . T C ./. ./. ./. ./. 1/1:8:0,0,7,1:8:180,24,0:0 1/1:2:0,0,2,0:2:64,6,0:0
chrM 182 . C T ./. ./. ./. ./. 1/1:9:0,0,8,1:9:151,27,0:0 1/1:9:0,0,8,1:9:161,27,0:0
chrM 195 . C T 1/1:15:0,0,5,10:15:255,45,0:0 1/1:22:0,0,13,9:22:255,66,0:0 1/1:14:0,0,6,8:14:255,42,0:0 1/1:3:0,0,3,0:3:83,9,0:0 1/1:9:0,0,9,0:9:116,27,0:0 1/1:10:0,0,7,3:10:191,30,0:0
chrM 199 . T C ./. ./. ./. ./. 1/1:7:0,0,7,0:7:103,21,0:0 1/1:10:0,0,7,3:10:191,30,0:0
chrM 204 . T C ./. ./. ./. ./. 1/1:7:0,0,7,0:7:91,21,0:0 1/1:10:0,0,7,3:10:188,30,0:0
chrM 207 . G A ./. ./. ./. ./. 1/1:8:0,0,6,2:8:118,24,0:0 1/1:10:0,0,7,3:10:195,30,0:0
chrM 235 . A G ./. ./. 1/1:24:0,0,11,13:24:255,72,0:0 1/1:10:0,0,6,4:10:255,30,0:0 ./. ./.    
chrM 250 . T C ./. ./. ./. ./. 1/1:19:0,0,6,13:19:255,57,0:0 1/1:18:0,0,15,3:18:243,54,0:0
chrM 410 . A T 1/1:13:0,0,3,10:13:255,39,0:0 1/1:27:0,0,19,8:27:255,81,0:0 1/1:26:0,0,19,7:26:255,78,0:0 1/1:12:0,0,11,1:12:226,36,0:0 1/1:5:0,0,2,3:5:148,15,0:0 1/1:6:0,0,3,3:6:119,18,0:0

 

ADD COMMENTlink modified 3.7 years ago • written 3.7 years ago by bingnas10
0
3.7 years ago by
bingnas10
United States
bingnas10 wrote:

And this is the output of the second command (VariantsToTable):

CHROM POS QUAL 1nt.GT 49nt2.GT 60nt3.GT 167nt4.GT 341nt5.GT 343nt6.GT
chrM 3 24.03 T/C ./. ./. ./. ./. ./.
chrM 4 22.12 C/A ./. G/G ./. G/G ./.
chrM 72 145 ./. ./. ./. ./. C/C C/C
chrM 73 146 ./. ./. ./. ./. A/A A/A
chrM 150 222 C/C C/C C/C C/C C/C C/C
chrM 152 147.03 ./. C/C C/C ./. ./. ./.
chrM 182 118.02 ./. T/T T/T ./. ./. ./.
chrM 195 222 T/T T/T T/T T/T T/T T/T
chrM 199 70.07 ./. C/C C/C ./. ./. ./.
chrM 204 58.07 ./. C/C C/C ./. ./. ./.
chrM 207 85.03 ./. A/A A/A ./. ./. ./.
chrM 235 222 ./. ./. ./. ./. G/G G/G
chrM 250 222 ./. C/C C/C ./. ./. ./.
chrM 410 222 T/T T/T T/T T/T T/T T/T

Could you please tell me what I should do now? I would like to code dominant homozygous as 2, recessive homozygous as 0, and heterozygous as 1.

 

Thank you 

Bing

ADD COMMENTlink written 3.7 years ago by bingnas10
1

You could write a Python program to do this for you. You'll have to parse each line one by one, split each genotype on "/", and check whether it's homozygous or heterozygous. I have a script that tells whether a genotype includes the alternative allele, and it can be modified to do what you'd like.

ADD REPLYlink written 3.7 years ago by ebrown1955300
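
The parsing described in this reply could look roughly like the sketch below. It is not the script mentioned in the reply, just an illustration under some assumptions: it reads the merged VCF from CombineVariants rather than the VariantsToTable output, because the VCF genotypes are allele indexes (0 = REF, 1 and above = ALT), so the 0/1/2 code is simply the number of non-reference alleles. The function names, the `missing` placeholder and the `first_sample_col` parameter are all hypothetical. In a standard VCF the sample columns start at index 9; the trimmed listing posted above starts them at index 5.

```python
import re


def genotype_to_code(sample_field, missing="NA"):
    """Count non-reference alleles in one VCF sample field.

    "0/0" -> 0, "0/1" -> 1, "1/1" -> 2; a no-call such as "./." returns
    the `missing` placeholder.
    """
    gt = sample_field.split(":", 1)[0]   # GT is the first ":"-separated subfield
    alleles = re.split(r"[/|]", gt)      # "/" is unphased, "|" is phased
    if "." in alleles:
        return missing
    return sum(1 for allele in alleles if allele != "0")


def recode_line(line, first_sample_col=9, missing="NA"):
    """Return [CHROM, POS, code1, code2, ...] for one VCF data line."""
    fields = line.split()
    codes = [genotype_to_code(f, missing) for f in fields[first_sample_col:]]
    return fields[:2] + codes


# First data line of the merged output shown in the thread (trimmed layout).
line = "chrM 3 . T C ./. 0/1:6:2,0,4,0:4:54,0,27:0 ./. ./. ./. ./."
print(recode_line(line, first_sample_col=5))
```

Any non-reference allele counts here, so "2/2" at a multi-allelic site is coded as 2. Whether "./." should stay missing or be treated as homozygous reference (0) depends on whether the site was actually covered in that sample, which the merged file alone cannot tell you.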
0
3.7 years ago by
bingnas10
United States
bingnas10 wrote:

Thank you, ebrown1955, for your help.

Yes please, I would like to see that code if you do not mind!

To be honest, I am not familiar with bioinformatics. This is my first time dealing with SNP data, and I would like to convert the data to 0, 1, 2 and 5 so that I can use regression analysis.

 

Bing

ADD COMMENTlink modified 3.7 years ago • written 3.7 years ago by bingnas10