Question: bcftools merge; retaining sample names
Lee Katz (Atlanta, GA) wrote, 4.3 years ago:

When I run bcftools merge, the header does not retain the filenames as sample names. How can I specify the sample names?

This is my command:

bcftools merge vcf/unfiltered/*.vcf.gz -O z > msa/pooled.vcf.gz

However, this is the relevant part of my header, despite the filenames I gave it. Is it just up to me to parse the mergeCommand line? Or is there a way to use bcftools query to get the right headers after the fact?

##bcftools_mergeVersion=0.2.0-rc7-47-g02a1fb3+htslib-0.2.0-rc7-36-g6e2ebc4
##bcftools_mergeCommand=merge -O z vcf/unfiltered/lambda_virus.fasta.wgsim.fastq.gz-lambda_virus.vcf.gz vcf/unfiltered/lambda_virus.fasta.wgsim.fastq.gz-reference.vcf.gz vcf/unfiltered/sample1.fastq.gz-lambda_virus.vcf.gz vcf/unfiltered/sample1.fastq.gz-reference.vcf.gz vcf/unfiltered/sample2.fastq.gz-lambda_virus.vcf.gz vcf/unfiltered/sample2.fastq.gz-reference.vcf.gz vcf/unfiltered/sample3.fastq.gz-lambda_virus.vcf.gz vcf/unfiltered/sample3.fastq.gz-reference.vcf.gz vcf/unfiltered/sample4.fastq.gz-lambda_virus.vcf.gz vcf/unfiltered/sample4.fastq.gz-reference.vcf.gz 
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  Sample1 2:Sample1       3:Sample1       4:Sample1       5:Sample1       6:Sample1       7:Sample1       8:Sample1       9:Sample1       10:Sample1

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 4.3 years ago:

How about renaming the samples in all your input *.vcf files before calling bcftools?


How do you do that?  Just change the header from 

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  Sample1

to

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  lambda_virus.fasta

?

Or is there a Vcf.pm method?  Or bcftools/vcftools method?

Reply written 4.3 years ago by Lee Katz

sed '/^#CHROM/s/Sample1/lambda_virus.fasta/' in.vcf > out.vcf

Reply written 4.3 years ago by Pierre Lindenbaum

I can't figure out your sed magic but it essentially works, thanks!  This is my full system call.

varscan.sh mpileup2cns $pileup --min-coverage $$settings{coverage} --min-coverage 10 --min-var-freq 0.75 --output-vcf 1 |\
    perl -lane 's/Sample1/\Q$vcf\E/; print;' |\
    bgzip -c > $vcf
Reply written 4.3 years ago by Lee Katz

Read the sed command like this:

From the file in.vcf,

in lines that begin with "#CHROM" (/^#CHROM/),

substitute "Sample1" with "lambda_virus.fasta" (s/Sample1/lambda_virus.fasta/),

and write the output to "out.vcf" (> out.vcf).

Put together:

sed '/^#CHROM/s/Sample1/lambda_virus.fasta/' in.vcf > out.vcf
Reply written 4.3 years ago by RamRS
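
To check that the /^#CHROM/ address really limits the substitution to the column-header line, here is a minimal sketch on a made-up two-line file. The temporary directory, the file contents, and the variable names are assumptions for illustration only; the data line deliberately carries "Sample1" in its ID column so the difference is visible.

```shell
# Build a tiny VCF-like file: one column-header line and one data line.
# The data line contains "Sample1" in the ID column on purpose.
tmpdir=$(mktemp -d)
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSample1\n' > "$tmpdir/in.vcf"
printf 'chr1\t100\tSample1\tA\tG\t50\tPASS\t.\tGT\t1/1\n' >> "$tmpdir/in.vcf"

# The address /^#CHROM/ restricts s/// to lines starting with "#CHROM".
sed '/^#CHROM/s/Sample1/lambda_virus.fasta/' "$tmpdir/in.vcf" > "$tmpdir/out.vcf"

# Show the header line and the data line separately.
header=$(grep '^#CHROM' "$tmpdir/out.vcf")
data=$(grep -v '^#' "$tmpdir/out.vcf")
printf '%s\n%s\n' "$header" "$data"
```

The header should now end in lambda_virus.fasta, while the data line keeps "Sample1" in its ID column. Without the address, sed would rewrite that data line too.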
Lee Katz wrote, 4.3 years ago:

The answer was essentially Pierre's: find and replace Sample1 with the correct sample name in each corresponding VCF file before merging.
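
For several files at once, the same idea could be sketched as a loop that takes each sample name from the file name. This is only a sketch under assumptions not stated in the thread: the inputs are single-sample, uncompressed VCFs, the base name of each file is the desired sample name, and the directories and toy files below are invented for the demonstration. It uses awk to overwrite the tenth column of the #CHROM line rather than sed, so names containing "/" or "&" need no escaping.

```shell
# Hypothetical layout: one single-sample, uncompressed VCF per input in $indir.
indir=$(mktemp -d)
outdir=$(mktemp -d)
for s in sample1 sample2; do
    printf '##fileformat=VCFv4.1\n' > "$indir/$s.vcf"
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSample1\n' >> "$indir/$s.vcf"
    printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t1/1\n' >> "$indir/$s.vcf"
done

# Set the sample column (field 10) of the #CHROM line to the file's base name.
for vcf in "$indir"/*.vcf; do
    name=$(basename "$vcf" .vcf)
    awk -v name="$name" 'BEGIN { FS = OFS = "\t" } /^#CHROM/ { $10 = name } { print }' \
        "$vcf" > "$outdir/$name.vcf"
done

# List the resulting sample names, one per file.
grep -h '^#CHROM' "$outdir"/*.vcf | cut -f10
```

With unique names in every input, the merged header should no longer need prefixes such as 2:Sample1. The files in this thread are .vcf.gz, so they would have to be decompressed first and recompressed with bgzip afterwards, as in the pipeline above.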


I moved my comment to an answer

Reply written 4.3 years ago by Pierre Lindenbaum