7.9 years ago
sarah
Dear All, I'm trying to merge VCF files using plink.
I have more than 1400 mice, each belonging to one of 19 strains, and I'm trying to generate an input file for a GWAS analysis.
So I have 19 VCF files, and I used the following 2 commands:
for i in *.vcf; do f="${i%.*}"; plink --vcf "$i" --recode --out "$f"; done
./plink --merge-list allfiles.txt --make-bed --out input
allfiles.txt has the following form:
C57L_J.ped C57L_J.map
LP_J.ped LP_J.map
C3H_HeJ.ped C3H_HeJ.map
C57BL_6NJ.ped C57BL_6NJ.map
A_J.ped A_J.map
A_J.ped A_J.map
FVB_NJ.ped FVB_NJ.map
BUB_BnJ.ped BUB_BnJ.map
129S1_SvImJ.ped 129S1_SvImJ.map
KK_HiJ.ped KK_HiJ.map
......
where each line corresponds to a mouse,
and I got
Performing single-pass merge (19 people, 17631239 variants)
So it only added each strain once. Is there a way I could ask plink to accept the same strain's VCF file more than once?
Hey Sarah,
The only explanation I can think of is that the sample name is the same every time a given strain's file is listed, so the repeated entries are treated as a single sample. I cannot be sure, though. You may take a look at the extra parameters listed here to see if they help: https://www.cog-genomics.org/plink/1.9/input#vcf
Otherwise, I'd recommend renaming all of your samples to make the names unique. This could be achieved using BCFtools reheader, found here: https://samtools.github.io/bcftools/bcftools.html
Kevin
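As a rough sketch of the renaming idea above (not tested on the original data), the functions below give every mouse its own uniquely named copy of its strain's VCF and write a matching merge list. Several things here are assumptions: the table `mice.txt` (one `<mouse_id> <strain>` pair per line), the function names, the per-mouse file names, and the premise that each strain VCF holds exactly one sample and is named `<strain>.vcf`. Mouse IDs are also assumed to differ from strain names so that no input file is overwritten. `bcftools reheader -s` reads the new sample names from a file, one per line.

```shell
# Hypothetical input: mice.txt, one line per mouse: "<mouse_id> <strain>",
# e.g. "mouse0001 A_J". Strain VCFs are assumed to be named <strain>.vcf
# and to contain a single sample each.

# Write a PLINK merge list with one uniquely named .ped/.map pair per mouse.
make_merge_list() {
    # $1 = mouse table, $2 = merge list to write
    while read -r mouse strain; do
        [ -n "$mouse" ] || continue              # skip blank lines
        printf '%s.ped %s.map\n' "$mouse" "$mouse"
    done < "$1" > "$2"
}

# Create one renamed single-sample VCF per mouse with bcftools reheader.
rename_samples() {
    # $1 = mouse table
    while read -r mouse strain; do
        [ -n "$mouse" ] || continue
        # File holding the new sample name (one name per line).
        printf '%s\n' "$mouse" > "$mouse.samples.txt"
        bcftools reheader -s "$mouse.samples.txt" -o "$mouse.vcf" "$strain.vcf"
    done < "$1"
}
```

After `rename_samples mice.txt` and `make_merge_list mice.txt allfiles.txt`, the two commands from the question could be run unchanged on the per-mouse VCFs. Note that this duplicates each strain's genotypes once per mouse, so with about 1400 mice and 17 million variants the intermediate files will be large.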