How to convert vcf dosage file format to mldose format?
9.4 years ago
genetic ▴ 40

Recently, we imputed our data using https://imputationserver.sph.umich.edu.

However, since the output file format is vcf dosage data, I cannot use Mach2dat for GWAS association analysis.

Does anyone know how to convert vcf dosage file format to mldose format?

Thank you so much!

Dosage file format (mldose for Mach2dat input format):

10009->QZ0526   DOSE    1.997   2.000   1.285   1.997   1.996   1.997   1.994   1.999   0.735   1.750   1.936
10010->QZ0488   DOSE    0.783   0.002   1.996   0.853   0.011   0.791   0.830   1.998   0.930   1.996   1.987

VCF file format from imputation:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  QZ0001  QZ0002  QZ0003  QZ0006
20      60479   20:60479        C       T       .       PASS    MAF=0.00078;R2=0.01037  GT:DS   0|0:0.001       0|0:0.009       0|0:0.008       0|0:0.000
20      60522   20:60522        T       TC      .       PASS    MAF=0.00249;R2=0.02914  GT:DS   0|0:0.001       0|0:0.002       0|0:0.000       0|0:0.002
vcf imputation • 5.2k views

That VCF would be easily converted to a PLINK .dat file - would that suit your purpose?


Could you be more specific on that? Thank you.


Yeah, could you elaborate more? I only know how to use PLINK to convert to a ped file.


Hi, I have this problem too. Have you solved it? Could you share the solution?

7.6 years ago

If you are open to using different software, you can run both binary and quantitative GWAS directly on the Michigan server output with SNPTEST: https://mathgen.stats.ox.ac.uk/genetics_software/snptest/old/snptest_v2.3.0.html#Support_for_vcf_files. Note that the example in that documentation uses the genotype field (GT), whereas the command below uses genotype probabilities (GP).

The command is as follows (include -cov_names only if you need covariates):

    snptest \
      -data yourvcf.vcf.gz sample_file.sample \
      -genotype_field GP \
      -o outputfilename.txt \
      -frequentist 1 \
      -method score \
      -pheno Affected \
      -cov_names cov1 cov2 cov3
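The SNPTEST command reads the GP field, while the VCF excerpt in the question lists only GT:DS in its FORMAT column, so it may be worth checking which fields your files actually carry before choosing a value for -genotype_field. A minimal sketch of such a check in Python follows; the function name `format_keys` is made up for illustration, and it only inspects the first variant record.

```python
def format_keys(vcf_lines):
    """Return the FORMAT keys of the first variant record, e.g. ['GT', 'DS'].

    vcf_lines is any iterable of VCF text lines (an open file works).
    Header lines (starting with '#') and blank lines are skipped.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        # Column 9 (index 8) of a VCF record is the colon-separated FORMAT field.
        return line.split()[8].split(":")
    return []
```

For a compressed file, something like `with gzip.open("yourvcf.vcf.gz", "rt") as fh: print(format_keys(fh))` should work; if "GP" is missing from the result, the GP-based command will probably not run as written.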

Hope this helps
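For the conversion originally asked about, the two formats differ mainly in orientation: the VCF has one row per variant with a DS value per sample, while mldose has one row per sample with a dosage per variant. A rough Python sketch of that transpose is below. It is not an official converter, and several things in it are assumptions: the function name `vcf_to_mldose` and the `family_of` mapping are hypothetical (the VCF carries no family IDs, so the sample ID is reused as the family ID by default), the whole file is held in memory, and the DS value is taken to be the dosage of the ALT allele, which should be checked against the allele order in the info file that Mach2dat also needs.

```python
def vcf_to_mldose(vcf_lines, family_of=None):
    """Transpose per-variant DS values into per-sample mldose rows.

    vcf_lines: iterable of VCF text lines (an open file works).
    family_of: optional dict mapping sample ID -> family ID (hypothetical;
               defaults to reusing the sample ID as the family ID).
    Returns (markers, rows): markers is a list of (ID, ALT, REF) tuples in
    file order, rows is a list of tab-separated mldose lines, one per sample.
    """
    family_of = family_of or {}
    samples, markers, doses = [], [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split()
        if line.startswith("#CHROM"):
            samples = fields[9:]            # sample IDs start at column 10
            doses = [[] for _ in samples]   # one dosage list per sample
            continue
        # Locate DS inside the FORMAT column, e.g. "GT:DS" -> index 1.
        ds_index = fields[8].split(":").index("DS")
        markers.append((fields[2], fields[4], fields[3]))
        for per_sample, entry in zip(doses, fields[9:]):
            per_sample.append(float(entry.split(":")[ds_index]))
    rows = []
    for sample, per_sample in zip(samples, doses):
        label = f"{family_of.get(sample, sample)}->{sample}"
        values = "\t".join(f"{d:.3f}" for d in per_sample)
        rows.append(f"{label}\tDOSE\t{values}")
    return markers, rows
```

To use it on a compressed file, open it with `gzip.open(path, "rt")`, pass the file handle in, and write the returned rows to a file one per line. The returned markers list holds the IDs and alleles you would need to build the matching info file; its exact column layout is not covered here and should be taken from the Mach2dat documentation.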
