Question: Which R package to use for quality control of SNP data?
0
2.4 years ago by
mms14013060
mms14013060 wrote:

Hello,

I have a large SNP dataset, and I'm trying to remove the SNPs with minor allele frequency (MAF) < 5% and the ones that deviate from Hardy-Weinberg equilibrium. I'm using R and I don't know which package does that. Any help, please?

snp R • 1.4k views
modified 2.4 years ago by willgilks260 • written 2.4 years ago by mms14013060

What does the input data look like? If it is GWAS data, then try GWASTools: https://bioconductor.org/packages/release/bioc/manuals/GWASTools/man/GWASTools.pdf

written 2.4 years ago by zx87547.5k
2
2.4 years ago by
willgilks260
United Kingdom
willgilks260 wrote:

Hi MMS,

You could try vcfR (https://cran.r-project.org/web/packages/vcfR/index.html), although, without meaning to sound vain, I think my graphs are better :) See https://f1000research.com/articles/5-2644/v3, with code available at https://zenodo.org/record/159272#.WKCKsBAnp7E. To visualise the QC in R, you can use the GATK VariantsToTable tool to make a readable table: https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_gatk_tools_walkers_variantutils_VariantsToTable.php

GATK also has a Hardy-Weinberg calculator, but I'm not sure whether it can filter variants directly: https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_gatk_tools_walkers_annotator_HardyWeinberg.php

If you don't use GATK, then programs like VCFtools and BCFtools could probably help; otherwise you have to write your own Perl/Bash/Python/whatever scripts. PLINK 1.9 is good too: https://www.cog-genomics.org/plink2. You have to convert your VCF into PLINK-format genotypes; then it's easy to filter by MAF and HWE.
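If you do end up writing your own script, the two filters could be sketched roughly as below in plain Python. This is only an illustration under some assumptions: per-SNP genotype counts (AA, AB, BB) are already available, the function names and the example SNP IDs are made up, and the HWE p-value cutoff of 1e-6 is an assumed value since the question only specifies the MAF threshold. The HWE check here is a simple chi-square goodness-of-fit test with 1 degree of freedom, which is an approximation; dedicated tools such as PLINK use an exact test, so results can differ for rare alleles.

```python
import math


def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped samples")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    return min(p, 1 - p)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) p-value for deviation from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped samples")
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0:
        # Monomorphic SNP: the test is undefined, so report no deviation.
        return 1.0
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(counts, min_maf=0.05, min_hwe_p=1e-6):
    """Keep a SNP only if MAF >= min_maf and HWE p-value >= min_hwe_p.

    min_hwe_p is an assumed cutoff, not one given in the question.
    """
    return maf(*counts) >= min_maf and hwe_chi2_pvalue(*counts) >= min_hwe_p


# Hypothetical genotype counts per SNP: (AA, AB, BB)
snps = {
    "snp_common": (25, 50, 25),  # in HWE, MAF 0.5
    "snp_rare": (98, 2, 0),      # MAF about 0.01
    "snp_no_het": (50, 0, 50),   # no heterozygotes, strong HWE deviation
}
kept = [name for name, counts in snps.items() if passes_qc(counts)]
print(kept)
```

For a large dataset, PLINK or VCFtools will be much faster than looping in a script like this; the sketch is mainly useful for checking what the filters actually compute.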

written 2.4 years ago by willgilks260

Thank you, I appreciate your help.

written 2.4 years ago by mms14013060