GenomeStudio genotype format to VCF?
6.9 years ago
always_learning ★ 1.1k

Dear All,

We are working on a project to check concordance between genotype array samples and NGS samples. We are using the MEGA genotyping array. We processed the genotype samples with Illumina's GenomeStudio, following the procedure given in "Is there a tool to transform GenomeStudio genotype format to VCF?".

The generated genotype VCF has some very odd entries with chromosome 0 and position 0, and I am not sure why they appear. We use this genotype file to check concordance with the sequencing VCF file, and we get only around 80-85% concordance, which ideally should be more than 90%. Could someone suggest improvements, and in particular explain why we are getting chromosome 0 and position 0? Apart from this, none of the VCF entries has a QUAL score.

Thanks

6.9 years ago
vassialk ▴ 200

NextGENe, Lasergene, Geneious Pro and CLC Genomics can help you work with the data.

6.9 years ago
Sam ★ 4.5k

If you do not have to work with the VCF data directly, you could convert the GenomeStudio data into PLINK format and then use PLINK's --vcf option to convert the sequencing VCF into PLINK format as well. That will let you check the concordance easily. However, something might be wrong with the SNPs at chromosome 0 and position 0. How did you call the variants for the VCF file from the NGS samples?

Side note: vassialk, it is not really helpful to keep posting commercial software here without actually answering the question.


We used PLINK format first and then converted it to VCF.


So you don't need to transform the GenomeStudio format into VCF, then? If you already have the VCF files, you should follow Robert's suggestion and use bcftools to estimate the concordance.

6.9 years ago
Robert Sicko ▴ 630

Chr0 probes in a GenomeStudio project are problem probes (mapping to multiple locations, etc.) and should be omitted before calculating concordance.
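If it helps, here is a minimal Python sketch of that filtering step, assuming the array VCF is plain tab-delimited text and the problem probes show up with CHROM `0` (or `chr0`) or POS `0`. The function name `drop_unmapped_records` and the example records are made up for illustration; they do not come from GenomeStudio or any other tool.

```python
def drop_unmapped_records(vcf_lines):
    """Yield header lines plus records whose CHROM and POS are both non-zero."""
    for line in vcf_lines:
        if line.startswith("#"):
            # Keep meta-information and the column header untouched.
            yield line
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            # Skip blank or malformed lines.
            continue
        chrom, pos = fields[0], fields[1]
        # Assumption: unmapped/problem probes are written as chromosome 0 or position 0.
        if chrom in ("0", "chr0") or pos == "0":
            continue
        yield line


# Hypothetical example records, for illustration only.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n",
    "0\t0\tprobe_a\tA\tG\t.\t.\t.\tGT\t0/1\n",
    "1\t12345\tprobe_b\tC\tT\t.\t.\t.\tGT\t0/0\n",
]
kept = list(drop_unmapped_records(example))
```

On a real file you would pass the open file handle to the function and write the kept lines to a new VCF.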

You can also generate a VCF from GenomeStudio using: https://github.com/jaredo/chiamante/wiki

After that, you can use bcftools to intersect array.vcf with sequencing.vcf and create union.vcf. Then compare concordance between array.vcf and sequencing.vcf only at the locations in union.vcf.
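To make the comparison itself concrete, here is a rough Python sketch of the same idea: keep only sites present in both files, then count matching genotypes. It is not a replacement for bcftools. It assumes single-sample VCFs, identical chromosome naming in both files, and matching REF/ALT alleles (no strand flips), and the names `parse_genotypes` and `concordance` are hypothetical. Note that a site missing from the sequencing VCF may simply be homozygous reference there; this sketch ignores such sites.

```python
def parse_genotypes(vcf_lines):
    """Map (CHROM, POS, REF, ALT) to the unphased genotype of the first sample."""
    calls = {}
    for line in vcf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no sample column
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt = fields[9].split(":")[fmt.index("GT")]
        alleles = gt.replace("|", "/").split("/")
        if "." in alleles:
            continue  # skip missing calls
        # Sort alleles so that 0/1 and 1|0 compare equal.
        calls[(fields[0], fields[1], fields[3], fields[4])] = tuple(sorted(alleles))
    return calls


def concordance(array_calls, seq_calls):
    """Return (fraction of matching genotypes, number of shared sites)."""
    shared = array_calls.keys() & seq_calls.keys()
    if not shared:
        return None, 0
    matches = sum(array_calls[key] == seq_calls[key] for key in shared)
    return matches / len(shared), len(shared)


# Hypothetical records, for illustration only.
array_vcf = [
    "1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/1\n",
    "1\t200\t.\tC\tT\t.\t.\t.\tGT\t1/1\n",
    "2\t300\t.\tG\tA\t.\t.\t.\tGT\t0/0\n",
]
seq_vcf = [
    "1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t1|0:30\n",
    "1\t200\t.\tC\tT\t50\tPASS\t.\tGT:DP\t0/1:28\n",
    "3\t400\t.\tT\tC\t50\tPASS\t.\tGT:DP\t0/1:25\n",
]
rate, n_shared = concordance(parse_genotypes(array_vcf), parse_genotypes(seq_vcf))
```

In this toy example two sites are shared and one of them matches, so `rate` is 0.5 and `n_shared` is 2.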