Manifest Reference Strand
12 months ago

Hi

I am trying to generate a VCF from SNP array data exported from GenomeStudio. So far I can export PLINK format from GenomeStudio and then create a VCF with these plink2 commands:

plink2 \
--pedmap gs-out-plink \
--make-pgen \
--sort-vars \
--merge-x \
--out test-1

then:

plink2 \
--threads 10 \
--pfile test-1 \
--fa ${ref} \
--snps-only just-acgt \
--export vcf \
--output-chr chrM \
--out out-test-1

I would then like to validate the vcf I created with GATK:

java -jar gatk ValidateVariants -R ${ref} -V plink_vcf.vcf.gz

However, I get the following error (chromosome and position renamed):

The REF allele is incorrect for the record at position chrq:56789 fasta says C vs VCF says G

I believe this is down to Illumina's TOP/BOT strand nomenclature. The array manifest file gives the following (shortened example with altered names):

Name         IlmnStrand  SNP    GenomeBuild  Chr  MapInfo    RefStrand
rs123456789  TOP         [A/G]  38           15   987654321  -
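
To make clear what I mean by converting, here is a minimal Python sketch of the logic I have in mind. It rests on an assumption I have not confirmed: that the alleles in the SNP column are reported on the strand whose orientation relative to the reference is given by RefStrand, so a "-" means both alleles need complementing. The function names are my own, not from any existing tool, and indel entries such as [D/I] are not handled.

```python
import re

# Base complements used to move an allele from the - strand to the + strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def forward_strand_alleles(snp, ref_strand):
    """Return the two alleles of a manifest SNP field such as "[A/G]" on the + strand.

    Assumption: the alleles are given on the strand described by RefStrand.
    """
    match = re.fullmatch(r"\[([ACGT])/([ACGT])\]", snp.strip())
    if match is None:
        # Indels and other non-SNP entries are outside this sketch.
        raise ValueError(f"unsupported SNP field: {snp!r}")
    alleles = match.groups()
    if ref_strand == "+":
        return alleles
    if ref_strand == "-":
        return tuple(COMPLEMENT[a] for a in alleles)
    raise ValueError(f"unexpected RefStrand value: {ref_strand!r}")


def pick_ref_alt(alleles, fasta_base):
    """Return (REF, ALT) where REF is the allele matching the reference base.

    Returns None when neither allele matches, which would point to a
    remaining strand or coordinate problem.
    """
    first, second = alleles
    if fasta_base == first:
        return first, second
    if fasta_base == second:
        return second, first
    return None


if __name__ == "__main__":
    plus_alleles = forward_strand_alleles("[A/G]", "-")
    print(plus_alleles, pick_ref_alt(plus_alleles, "C"))
```

For the example row above, complementing [A/G] gives T and C, and with the FASTA base C from the error message that would mean REF C and ALT T, which at least agrees with what ValidateVariants expects.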

Is there any software, or another approach, that can convert these calls to give the correct reference and alternate alleles?

Thanks in advance!
