I am using plink to export sequence data in VCF format to a raw file, but I keep getting the following error at a specific line, which halts the entire process:
"ALT allele duplicates REF allele on line 166993 of .vcf file"
using the following command
plink --vcf "infile.vcf.gz" --extract exomeids.txt --out "outfile" --recodeA
When I examine this line I see that all subjects appear to be homozygous (listed as either 0/0 or 1/1). I can remove this exact chromosome or position but then get the same error on a different line. Is there a more succinct way to remove this error than listing all the positions that result in this error?
Note that this position is not one that I want to extract for my results, but it is still processed in the initial part of the run. I have used similar commands on other VCF files with no errors, so I am curious whether something is wrong with the coding in this file, since I cannot find documentation of this error elsewhere.
Thanks for your help in advance.
Yes, thanks for the code. I was about to go down that path but was hoping there was a simpler way to exclude them within the plink command itself.
Note: setting --maf to restrict minor allele frequencies to larger values hasn't excluded these lines either, which doesn't fully make sense to me.
This is because VCF import happens before everything else plink does: the entire file is imported, and only then are flags like --maf applied. Otherwise, it would be necessary to add --maf logic to every single import routine, etc.
(The full order of operations is at https://www.cog-genomics.org/plink/1.9/order .)
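Since the import step sees every line, one workaround is to drop the offending lines before plink reads the file. Below is a minimal Python sketch of that idea, not an official plink tool. It assumes a tab-delimited, gzip-compressed VCF with REF in column 4 and ALT in column 5, and it treats a line as bad when REF appears verbatim among the comma-separated ALT alleles (plink's exact check may differ, e.g. in case handling). The function names and the output filename are made up for illustration.

```python
import gzip


def alt_duplicates_ref(line: str) -> bool:
    """Return True if a VCF data line repeats its REF allele among the ALT alleles."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        # Not a well-formed data line; leave it for plink to judge.
        return False
    ref = fields[3]
    alts = fields[4].split(",")
    # Assumption: exact string match is what triggers the plink error.
    return ref in alts


def filter_vcf_lines(lines):
    """Yield header lines unchanged and data lines whose ALT does not duplicate REF."""
    for line in lines:
        if line.startswith("#") or not alt_duplicates_ref(line):
            yield line


def filter_vcf(in_path: str, out_path: str) -> None:
    """Copy a gzipped VCF to out_path, skipping lines where ALT duplicates REF."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        dst.writelines(filter_vcf_lines(src))
```

Calling filter_vcf("infile.vcf.gz", "infile.filtered.vcf.gz") and then pointing --vcf at the filtered file should let the rest of the original command run unchanged, provided those lines are the only problem in the file.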
Thanks. Now I understand why --maf is not working while --not-chr is effective, since it is processed before --vcf turns the file into binary format. Based on this order of operations, it appears that I would not be able to exclude an exact position within a chromosome using --from and --to, since those are applied after --vcf processing. I guess I'm back to deleting the lines from the file, as was originally suggested.
Appreciate all the help!