Question: Individual VCF files from main VCF file
0
4.2 years ago by
win810
India
win810 wrote:

Hi all,

In the 1000 genomes project there is one large VCF file which has all the samples represented in columns.

I want to generate one VCF file for each sample. How can this be done?

Also, can the script that does this stream the main VCF, so that I don't have to store it locally?

Thanks in advance

vcf • 2.5k views
modified 2.1 years ago by Jorge Amigo11k • written 4.2 years ago by win810
2
4.2 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum119k wrote:

I wrote Biostar130456 https://github.com/lindenb/jvarkit/wiki/Biostar130456

$   curl -sL "https://raw.githubusercontent.com/arq5x/bedtools2/bc2f97d565c36a82c1a0b12f570fed4398001e5f/test/map/test.vcf" |\
    java -jar dist/biostar130456.jar -x -z -p "sample.__SAMPLE__.vcf.gz" 
sample.NA00003.vcf.gz
sample.NA00001.vcf.gz
sample.NA00002.vcf.gz

$ gunzip -c sample.NA00003.vcf.gz
(...)
##source=myImputationProgramV3.1
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  NA00003
chr1    10  rs6054257   G   A   29  PASS    AF=0.5;DB;DP=14;H2;NS=3 GT:DP:GQ:HQ 1/1:5:43
chr1    20  rs6040355   A   G,T 67  PASS    AA=T;AF=0.333,0.667;DB;DP=10;NS=2   GT:DP:GQ 2/2:4:35
chr1    130 microsat1   GTC G,GTCT  50  PASS    AA=G;DP=9;NS=3  GT:DP:GQ    1/1:3:40
chr2    130 microsat1   GTC G,GTCT  50  PASS    AA=G;DP=9;NS=3  GT:DP:GQ    1/1:3:40

Does this code create only one VCF file for a specific sample ID, or one VCF file per sample in the original file?

written 8 months ago by c77500

It creates one VCF file per sample in the original file.

written 8 months ago by Pierre Lindenbaum119k
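
For readers without jvarkit, the one-file-per-sample behaviour described above can be approximated on a streamed VCF with plain awk. This is only a rough sketch under stated assumptions: the function name `split_vcf_stream` and the `sample.<name>.vcf` output naming are made up for illustration, the input is assumed to be uncompressed, tab-delimited VCF on stdin, and INFO fields such as AF or NS are copied unchanged rather than recomputed per sample.

```shell
#!/usr/bin/env bash
# Sketch only: split_vcf_stream and the "sample.<name>.vcf" naming are hypothetical.
# Reads an uncompressed multi-sample VCF on stdin and writes one VCF per sample.
split_vcf_stream() {
  local outdir="${1:-.}"
  awk -v outdir="$outdir" 'BEGIN { FS = OFS = "\t" }
    # Buffer the "##" meta lines until the sample names are known.
    /^##/ { meta = meta $0 "\n"; next }
    # The "#CHROM" line names the samples in columns 10 and up.
    /^#CHROM/ {
      for (i = 10; i <= NF; i++) {
        out[i] = outdir "/sample." $i ".vcf"
        printf "%s", meta > out[i]
      }
    }
    # Column header and data lines alike: keep the 9 fixed columns plus one sample column.
    {
      fixed = $1
      for (j = 2; j <= 9; j++) fixed = fixed OFS $j
      for (i = 10; i <= NF; i++) print fixed, $i > out[i]
    }'
}
```

It could be fed from a download without storing the main file, for example by piping `curl -sL` through `gunzip -c` into `split_vcf_stream outdir`. Every sample gets its own open output file, so with thousands of samples some awk implementations may run into the open-file limit.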
2
2.1 years ago by
Jorge Amigo11k
Santiago de Compostela, Spain
Jorge Amigo11k wrote:

already stated here and here:

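# Split every VCF in the current directory into one bgzipped VCF per sample.
for file in *.vcf*; do
  # bcftools query -l lists the sample names found in this file.
  for sample in $(bcftools query -l "$file"); do
    # -c1 sets a minimum non-reference allele count of 1; -Oz writes compressed VCF.
    bcftools view -c1 -Oz -s "$sample" -o "${file/.vcf*/.$sample.vcf.gz}" "$file"
  done
done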
0
4.2 years ago by
Lee Katz2.9k
Atlanta, GA
Lee Katz2.9k wrote:

You could do something with the new bcftools v1.1 like this:

bcftools query -H pooled.vcf.gz -f '%CHROM\t%POS\t%REF\t%ALT[\t%SAMPLE=%GT]\n' --samples 'favoriteSample'

I don't think it keeps all the headers, but it will give you the information you might want. If you don't want any headers at all, you can remove the -H.


I wanted each sample to have its own VCF.

written 4.2 years ago by win810

The --samples argument lets you choose one sample at a time, so you'd have to run this command once per sample.

written 4.2 years ago by Lee Katz2.9k
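
Running the command once per sample, as suggested above, could be scripted roughly like this. It is a sketch rather than a tested pipeline: the function name `query_each_sample` and the `<sample>.tsv` output naming are made up for illustration, and the sample list comes from `bcftools query -l`. Note that the result is tab-delimited text per sample, not a VCF.

```shell
#!/usr/bin/env bash
# Sketch only: query_each_sample and the "<sample>.tsv" naming are hypothetical.
# Runs the bcftools query from the answer above once for every sample in the file.
query_each_sample() {
  local vcf="$1" outdir="${2:-.}" sample
  # bcftools query -l prints one sample name per line.
  bcftools query -l "$vcf" | while IFS= read -r sample; do
    bcftools query -H -f '%CHROM\t%POS\t%REF\t%ALT[\t%SAMPLE=%GT]\n' \
      --samples "$sample" "$vcf" > "$outdir/$sample.tsv"
  done
}
```

Calling `query_each_sample pooled.vcf.gz out` would then leave one file per sample in the existing directory `out`.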
0
4.2 years ago by
geek_y9.4k
Barcelona/CRG/London/Imperial
geek_y9.4k wrote:

How To Split Multiple Samples In Vcf File Generated By Gatk?
