Question

Merge VCF files using PLINK with repetition of the same VCF file for more than one individual

7.9 years ago

sarah • 0

Dear all, I'm trying to merge VCF files using PLINK.

I have more than 1,400 mice, each belonging to one of 19 strains, and I'm trying to generate an input file for a GWAS analysis.

So I have 19 VCF files (one per strain), and I used the following two commands:

for i in *.vcf; do f="${i%.*}"; plink --vcf "$i" --recode --out "$f"; done

./plink --merge-list allfiles.txt --make-bed --out input

allfiles.txt has the following form:

C57L_J.ped C57L_J.map

LP_J.ped LP_J.map

C3H_HeJ.ped C3H_HeJ.map

C57BL_6NJ.ped C57BL_6NJ.map

A_J.ped A_J.map

A_J.ped A_J.map

FVB_NJ.ped FVB_NJ.map

BUB_BnJ.ped BUB_BnJ.map

129S1_SvImJ.ped 129S1_SvImJ.map

KK_HiJ.ped KK_HiJ.map

......

where each line corresponds to one mouse (so a strain is listed once for every mouse that belongs to it),

and I got

Performing single-pass merge (19 people, 17631239 variants)

So it only added each strain once. Is there a way to make PLINK accept the same strain's VCF file more than once?

plink vcf vcftools • 8.2k views

ADD COMMENT • link updated 7.8 years ago by Biostar 20 • written 7.9 years ago by sarah • 0

Hey Sarah,

The only explanation I can think of is that the sample name is the same every time a strain's file is listed, so the repeated entries are treated as one individual. I cannot be sure, though. Take a look at the extra VCF import parameters here to see if any of them help: https://www.cog-genomics.org/plink/1.9/input#vcf
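
One way to test that guess is to list the sample names stored in each VCF header and look for repeats. The snippet below is only a sketch in plain shell and awk: the function name list_vcf_samples is made up for illustration, and it assumes uncompressed, tab-delimited VCF files with a standard #CHROM header line.

```shell
# Print "file<TAB>sample" for every sample column found in each VCF header.
# list_vcf_samples is a hypothetical helper name, not part of PLINK or BCFtools.
list_vcf_samples() {
    for vcf in "$@"; do
        # The #CHROM line has 9 fixed columns; every column after that is a sample.
        awk -F'\t' -v f="$vcf" '
            /^#CHROM/ { for (i = 10; i <= NF; i++) print f "\t" $i; exit }
        ' "$vcf"
    done
}
```

Running `list_vcf_samples *.vcf | cut -f2 | sort | uniq -d` would then print any sample name that appears more than once across the files.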

Otherwise, I'd recommend renaming all of your samples to make the names unique. This could be achieved using BCFtools reheader, found here: https://samtools.github.io/bcftools/bcftools.html
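
As a rough illustration of the renaming idea, here is a sketch that writes one uniquely named VCF per mouse and builds the matching merge list. Several things in it are assumptions, not something from your post: a two-column table (called mice.txt here) holding a mouse ID and its strain on each line, strain VCFs named <strain>.vcf with exactly one sample column, and made-up mouse IDs without underscores such as mouse0001. The renaming itself is done with plain awk as a stand-in; bcftools reheader with its -s/--samples option is the tool-based way to do the same step.

```shell
# Write a copy of a single-sample VCF with its sample column renamed.
# Usage: rename_vcf_sample in.vcf new_name out.vcf   (hypothetical helper)
rename_vcf_sample() {
    awk -v name="$2" 'BEGIN { FS = OFS = "\t" }
        /^#CHROM/ { $10 = name }   # column 10 is the first (and only) sample
        { print }' "$1" > "$3"
}

# Read "mouse_id strain" pairs from a table (assumed layout) and create one
# uniquely named VCF per mouse, plus a merge list in the .ped/.map form
# used in the question.
# Usage: make_per_mouse_vcfs mice.txt allfiles.txt   (hypothetical helper)
make_per_mouse_vcfs() {
    : > "$2"                                  # start with an empty merge list
    while read -r mouse strain; do
        [ -n "$mouse" ] || continue           # skip blank lines
        rename_vcf_sample "${strain}.vcf" "$mouse" "${mouse}.vcf"
        printf '%s.ped %s.map\n' "$mouse" "$mouse" >> "$2"
    done < "$1"
}
```

After `make_per_mouse_vcfs mice.txt allfiles.txt`, the --recode loop from the question would need to run over the per-mouse VCFs (not the original strain files) before the --merge-list step. Bear in mind that this makes one full copy of a strain's VCF for every mouse, so it may need a lot of disk space.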

Kevin

ADD REPLY • link 7.8 years ago by Kevin Blighe 89k