Question: Problem of bcftools merge TCGA vcf files
l66081129 wrote, 4 weeks ago:

Hi everyone, I am a college student trying to run a case-control GWAS on TCGA VCF data. I used gdc-client to download the TCGA VCFs and got many single-sample VCF files, so I tried to merge them into one multi-sample VCF with bcftools. I ran into the problems below and searched online for a fix, but found nothing that worked. Please help, or give me some ideas on how to solve this. This is my command:

bcftools merge  *.vcf.gz -Oz -o merge.vcf.gz

The error is: Duplicate sample names (NORMAL), use --force-samples to proceed anyway. So I ran:

bcftools merge --force-samples *.vcf.gz -Oz -o merge.vcf.gz

But I lost most of the chrom info in merge.vcf. I then looked at the header:

zgrep -n '^#CHROM'  merge.vcf.gz |cat

The output looks like:

CHROM   POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  NORMAL  TUMOR   2:NORMAL    2:TUMOR 3:NORMAL    3:TUMOR 4:NORMAL    4:TUMOR 5:NORMAL    5:TUMOR
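The 2:NORMAL, 3:TUMOR style columns are how --force-samples resolves clashes: it prepends the index of the input file to each conflicting sample name. A quick way to see which names collide before merging is to list the sample columns of every input file. The helper below is only a sketch; the function name list_samples is made up for illustration, and it relies on bcftools query -l, which prints one sample name per line.

```shell
# Print "file<TAB>sample" for every sample column in each VCF given.
# list_samples is a hypothetical helper name, not part of bcftools.
list_samples() {
  for f in "$@"; do
    bcftools query -l "$f" | while IFS= read -r s; do
      printf '%s\t%s\n' "$f" "$s"
    done
  done
}

# Example: count how often each sample name occurs across all inputs.
#   list_samples *.vcf.gz | cut -f2 | sort | uniq -c
```

Any name with a count above 1 will trigger the "Duplicate sample names" error in bcftools merge.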

Thanks in advance, and sorry for my poor English.

Jorge Amigo (Santiago de Compostela, Spain) wrote, 4 weeks ago:

If your vcf samples are all internally named "NORMAL" and "TUMOR", my first suggestion would be to rename them before merging them:

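for file in *.vcf.gz; do
 case $file in *renamed*|*merge*) continue ;; esac   # skip outputs of earlier runs
 base=${file%.vcf.gz}
 # each file holds two samples (NORMAL, TUMOR), so rename both via "old new" pairs
 bcftools reheader -s <(printf 'NORMAL %s_NORMAL\nTUMOR %s_TUMOR\n' "$base" "$base") -o "$base.renamed.vcf.gz" "$file"
 tabix -fp vcf "$base.renamed.vcf.gz"
done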

You shouldn't have any problem merging them afterwards:

bcftools merge -Oz -o merge.vcf.gz *.renamed.vcf.gz
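With many TCGA files, the *.renamed.vcf.gz glob can expand to a very long command line. bcftools merge can instead read the input paths from a file with -l (--file-list), one path per line. The sketch below assumes the renamed files sit in a single directory; make_merge_list and merge_list.txt are hypothetical names chosen for the example.

```shell
# Write the paths of all renamed VCFs in a directory, one per line.
# make_merge_list is a hypothetical helper name.
make_merge_list() {
  dir=$1
  for f in "$dir"/*.renamed.vcf.gz; do
    # Skip the unexpanded pattern when nothing matches.
    if [ -e "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
}

# Usage sketch:
#   make_merge_list . > merge_list.txt
#   bcftools merge -l merge_list.txt -Oz -o merge.vcf.gz
#   tabix -p vcf merge.vcf.gz
```

The inputs still need to be bgzip-compressed and indexed, as they are after the renaming loop.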

You should see all the correct sample names on the last header line:

zgrep '^#CHROM' merge.vcf.gz

Thank you sincerely for your answer! It helps me a lot. I tried it just now but got an error, so I am going step by step:

for file in $(ls *.vcf.gz |grep -v rename |grep -v merge);do bcftools reheader -s ${file/.vcf.gz} -Oz -o ${file/.vcf.gz/.renamed.vcf.gz} $file ;done  2>>error

The error suggests that bcftools reheader -s ${file/.vcf.gz} does not read the VCF header in. I will try to learn more and adjust it. bcftools suggests using bcftools view -h old.bcf > header.txt and then bcftools reheader -h header.txt.
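For reference, the header-file route mentioned here could look roughly like the sketch below: dump the header, rewrite the sample columns of the #CHROM line, and hand the edited header back to bcftools reheader -h. The function name prefix_samples_in_header, the "<prefix>_<old name>" naming scheme, and the case1 file name are all assumptions made for the example.

```shell
# Read a VCF header on stdin and prefix every sample column
# (column 10 onwards of the #CHROM line) with "<prefix>_".
# prefix_samples_in_header is a hypothetical helper name.
prefix_samples_in_header() {
  awk -v p="$1" 'BEGIN { FS = OFS = "\t" }
    /^#CHROM/ { for (i = 10; i <= NF; i++) $i = p "_" $i }
    { print }'
}

# Usage sketch (case1.vcf.gz is a made-up file name):
#   bcftools view -h case1.vcf.gz | prefix_samples_in_header case1 > header.txt
#   bcftools reheader -h header.txt -o case1.renamed.vcf.gz case1.vcf.gz
```

All "##" meta lines pass through unchanged; only the sample names on the final header line are touched.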

Thanks again, have a nice day.

Reply written 4 weeks ago by l66081129

Sorry, I hadn't tested the reheader command before posting. It doesn't accept a sample name but a file containing sample names, nor does it accept the common bcftools -Oz option, so I've edited the code and it now works flawlessly.
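To make the "file containing sample names" point concrete, here is a sketch that writes the names to a real file on disk rather than passing them inline, which also leaves a record of the renaming. It assumes every input carries exactly the two samples NORMAL and TUMOR and uses the "old_name new_name" pair format that reheader -s accepts; write_sample_map, rename_samples and the .samples.txt suffix are hypothetical names.

```shell
# Print an "old_name new_name" map for one tumor/normal file.
# Assumption: the samples are literally named NORMAL and TUMOR.
write_sample_map() {
  prefix=$1
  printf 'NORMAL %s_NORMAL\nTUMOR %s_TUMOR\n' "$prefix" "$prefix"
}

# Rename the samples of one file and index the result.
rename_samples() {
  in=$1
  base=$(basename "$in" .vcf.gz)
  write_sample_map "$base" > "$base.samples.txt"
  bcftools reheader -s "$base.samples.txt" -o "$base.renamed.vcf.gz" "$in"
  tabix -f -p vcf "$base.renamed.vcf.gz"
  # Show the new sample names as a sanity check.
  bcftools query -l "$base.renamed.vcf.gz"
}

# Usage sketch: rename_samples case1.vcf.gz
```

reheader only swaps the header and keeps the compression of the input, which is why no -Oz is needed.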

Reply written 4 weeks ago by Jorge Amigo (modified 4 weeks ago)