Question: How to convert a file to vcf format
kamanovae wrote:

Hello, I have one file per chromosome, each in the following format:

CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT Sample1 Sample2 Sample3
10      62010  10:62010        C       T       999     PASS    SING;AA=NN;AN=CC;AD=CC  GT      0|0     0|0     0|0
10      110548  10:110548       T       C       999     PASS    SING;AA=T;AN=TT;AD=TT   GT      0|0     0|0     0|0
10      110848  10:110848       T       C       999     PASS    SING;AA=T;AN=TT;AD=TT   GT      1|1     1|1     1|1

How can I combine all the files and convert them to a standard VCF file?

Thank you so much for your help!

combine file vcf • 507 views
written 12 months ago by kamanovae • modified 12 months ago by Pierre Lindenbaum
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum wrote:

Assuming the delimiter is a tab, run the following command for each file:

awk -f script.awk input.txt | bcftools view -O z -o out.vcf.gz

where script.awk is (add a '##contig' line for each of your chromosomes):

# Input fields are tab-separated.
BEGIN   {
    FS="\t";
    }
# Column-header line: emit the VCF meta-information lines first,
# then the header itself with the leading '#' that VCF requires.
/^CHROM/ {
    printf("##fileformat=VCFv4.2\n");
    printf("##FORMAT=<ID=GT,Number=1,Type=String,Description=\"Genotype\">\n");
    printf("##INFO=<ID=AA,Number=1,Type=String,Description=\"TODO\">\n");
    printf("##INFO=<ID=AN,Number=1,Type=String,Description=\"TODO\">\n");
    printf("##INFO=<ID=AD,Number=1,Type=String,Description=\"TODO\">\n");
    printf("##INFO=<ID=SING,Number=0,Type=Flag,Description=\"TODO\">\n");
    printf("##contig=<ID=10,length=135534747,assembly=human_g1k_v37>\n");
    printf("#%s\n",$0);
    next;
    }
# Every other line is a variant record and is printed unchanged.
    {
    print;
    }
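
The '##contig' lines do not have to be typed by hand. A minimal sketch, assuming you have a samtools faidx index of your reference (the file name reference.fa.fai is hypothetical); the first two tab-separated columns of a .fai file are the sequence name and its length:

```shell
# contig_lines FAI
# Print one ##contig header line per sequence listed in a .fai index.
contig_lines() {
    awk -F '\t' '{ printf("##contig=<ID=%s,length=%s>\n", $1, $2) }' "$1"
}
```

Running `contig_lines reference.fa.fai` prints lines that can be turned into the matching printf statements in the /^CHROM/ block. Unlike the line in the script above, these carry no assembly attribute; add one if you need it.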

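Index each file with 'bcftools index', then combine the per-chromosome VCFs with 'bcftools concat' ('bcftools merge' is for combining files that contain different samples).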

written 12 months ago by Pierre Lindenbaum
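
The whole workflow from the answer (convert each file, compress, index, combine) could be scripted roughly as follows. This is only a sketch under assumptions not stated in the thread: the inputs are named like chr10.txt, a hypothetical manifest file lists "FILE CONTIG LENGTH" per line in genome order, and every file has the same samples in the same order. Because the samples are shared, the sketch stacks the files with bcftools concat; bcftools merge would be the tool for files that hold different samples.

```shell
#!/bin/sh
# Sketch only. The manifest layout and all file names are assumptions.

# to_vcf_text FILE CONTIG LENGTH
# Print FILE (tab-delimited, header line starting with CHROM) as VCF text.
to_vcf_text() {
    awk -F '\t' -v contig="$2" -v len="$3" '
    /^CHROM/ {
        print "##fileformat=VCFv4.2"
        print "##FORMAT=<ID=GT,Number=1,Type=String,Description=\"Genotype\">"
        print "##INFO=<ID=AA,Number=1,Type=String,Description=\"TODO\">"
        print "##INFO=<ID=AN,Number=1,Type=String,Description=\"TODO\">"
        print "##INFO=<ID=AD,Number=1,Type=String,Description=\"TODO\">"
        print "##INFO=<ID=SING,Number=0,Type=Flag,Description=\"TODO\">"
        printf("##contig=<ID=%s,length=%s>\n", contig, len)
        print "#" $0
        next
    }
    NF > 0 { print }    # drop empty lines, pass records through unchanged
    ' "$1"
}

# convert_and_concat MANIFEST OUT.vcf.gz   (needs bcftools on PATH)
# MANIFEST has one line per chromosome: FILE CONTIG LENGTH, in genome order.
convert_and_concat() {
    list=$(mktemp) || return 1
    while read -r file contig len; do
        vcf="${file%.txt}.vcf.gz"
        to_vcf_text "$file" "$contig" "$len" |
            bcftools view -O z -o "$vcf" - || return 1
        bcftools index "$vcf" || return 1
        printf '%s\n' "$vcf" >> "$list"
    done < "$1"
    # Stack the per-chromosome files in the order given by the manifest.
    bcftools concat -f "$list" -O z -o "$2" || return 1
    rm -f "$list"
    bcftools index "$2"
}
```

A call would look like `convert_and_concat manifest.txt all_chromosomes.vcf.gz`, where a manifest line might read `chr10.txt 10 135534747`.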

Thanks for the answer! I described my question poorly. I need to create one file per sample: for each sample, I need to pull the information about specific SNPs out of each per-chromosome file and write it to a separate VCF file. I hope there are ready-made programs for this.

written 12 months ago by kamanovae

See the related post: "Splitting vcf files to individual samples"

written 12 months ago by Pierre Lindenbaum
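
For the per-sample split asked about above, the usual tool is bcftools, where `bcftools query -l` lists the sample names and `bcftools view -s NAME` keeps a single sample. The idea itself is simple enough to sketch without dependencies: keep the nine fixed VCF columns plus the one sample column. The sketch below is not taken from the linked post; it assumes uncompressed, tab-delimited VCF text and sample names without whitespace, and unlike bcftools it does not recompute any INFO fields.

```shell
# extract_sample VCF_TEXT SAMPLE
# Print a single-sample VCF: columns 1-9 plus the column named SAMPLE.
extract_sample() {
    awk -F '\t' -v OFS='\t' -v sample="$2" '
    /^##/ { print; next }                  # meta-information lines unchanged
    /^#CHROM/ {
        for (i = 10; i <= NF; i++) if ($i == sample) col = i
        if (!col) { print "sample not found: " sample > "/dev/stderr"; exit 1 }
    }
    { print $1, $2, $3, $4, $5, $6, $7, $8, $9, $col }
    ' "$1"
}

# split_all VCF_TEXT
# Write one SAMPLE.vcf per sample column found in the #CHROM line.
split_all() {
    for s in $(awk -F '\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }' "$1"); do
        extract_sample "$1" "$s" > "$s.vcf"
    done
}
```

For example, `extract_sample chr10.vcf Sample2 > Sample2.chr10.vcf` (hypothetical file names) would keep only the Sample2 genotypes.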