Question: Interesting things to do with 23andMe data
LizzAlice wrote, 4 months ago:


I am a bioinformatics student, but unfortunately I have gained no practical experience so far, and my lectures were not very informative. So I decided to start doing something fun in my free time to get some experience. I recently used 23andMe, so now I have a file with SNPs. I would really appreciate suggestions about which questions could be answered with this, as well as recommendations for tools. I was thinking about programming some kind of pipeline in Python. Thanks!

Brice Sarver (United States) wrote, 4 months ago:

If the SNPs are reported with dbSNP or other IDs, use tools and APIs to look up the chromosome, position, reference allele, and alternate allele for each one. Search for clinical annotations in ClinVar or other resources. Join with dbNSFP or other functional databases to try to predict the biological consequence of any mutations in splicing or coding regions. See what the frequencies of your mutations are in larger cohorts, like gnomAD, 1000 Genomes, or HapMap. Try converting your coordinates from one reference build (e.g., GRCh37) to another (e.g., GRCh38). Do other organisms have similar mutations in orthologous genes? Explore this and other evolutionary questions using the UCSC Genome Browser.
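
As a possible first step for the lookups above, here is a minimal sketch of reading the raw file in Python. It assumes the usual 23andMe raw-data layout: tab-separated columns for rsid, chromosome, position, and genotype, with header lines starting with `#`. The function name `read_23andme` and the sample rows are made up for illustration; check the header of your own file, which should also state the reference build the positions refer to.

```python
import io


def read_23andme(handle):
    """Yield (rsid, chromosome, position, genotype) tuples from an open file.

    Assumes tab-separated columns and '#' comment lines (hypothetical helper).
    """
    for line in handle:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header
        rsid, chrom, pos, genotype = line.split("\t")[:4]
        yield rsid, chrom, int(pos), genotype


# Made-up rows standing in for a real file; "--" marks a no-call.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAG\n"
    "rs0000002\tX\t2000\tC\n"
    "rs0000003\t2\t3000\t--\n"
)
records = list(read_23andme(sample))
no_calls = sum(1 for record in records if "-" in record[3])
print(len(records), "records,", no_calls, "no-call(s)")
```

For a real file you would pass `open("your_file.txt")` instead of the `StringIO` object. Keeping the chromosome as a string avoids trouble with X, Y, and MT.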

I think this is a great idea for someone diving into bioinformatics a bit! There are lots of resources, including the Biostars Handbook, that can help you if you get stuck.

Charles Warden (Duarte, CA) wrote, 4 months ago:

I think working with your raw data is usually a good idea, but getting the most out of it may require a non-trivial time commitment.

That said, there are some things that don't require coding experience. For example, here are some links from this blog post:

If you are OK with making your data publicly available, you can also generate a GET-Evidence report from the Personal Genome Project (and you can see my data as an example here).

I am currently looking into […]. My preliminary guess from MySeq and the 23andMe diabetes report is that the PRS percentiles may be helpful for critically assessing your data, but may not actually be the most helpful (although I am sure there must be some exceptions). I think they also provide some other things. However, even with the $5 donation, I don't have results for what I submitted yesterday.

I also think […] is making some changes, but I believe most of the other links that I have provided have free options.

In general, I would recommend against options that I have seen to re-analyze your data for a charge:

There might also be exceptions that I don't know about. However, I think learning about your data in greater detail with free options (learning more coding and biology) is really the best approach, all other things being equal.

WouterDeCoster wrote, 4 months ago:

Things that I did with my 23andMe data:

  • Look at my APOE allele (major risk locus for Alzheimer's disease)
  • Look at heterozygous recessive alleles (carrier status, which could be important if you are thinking about having kids)
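
For the APOE check, a rough sketch in Python is below. It assumes the two SNPs commonly used to define the ε2/ε3/ε4 alleles, rs429358 and rs7412, are both present in your file and reported on the forward strand (ε2 = T/T, ε3 = T/C, ε4 = C/C at rs429358/rs7412). The function name `apoe_genotype` is made up for illustration, and this is only a lookup exercise, not a clinical result.

```python
def apoe_genotype(rs429358, rs7412):
    """Infer an APOE genotype from two unphased genotype strings, e.g. 'CT', 'CC'.

    Hypothetical helper; returns 'undetermined' for no-calls or unusual combinations.
    """
    g1 = "".join(sorted(rs429358.upper()))  # normalise 'TC' to 'CT'
    g2 = "".join(sorted(rs7412.upper()))
    table = {
        ("TT", "TT"): "e2/e2",
        ("TT", "CT"): "e2/e3",
        ("TT", "CC"): "e3/e3",
        # Without phase this could also be e1/e3, but e1 is very rare.
        ("CT", "CT"): "e2/e4",
        ("CT", "CC"): "e3/e4",
        ("CC", "CC"): "e4/e4",
    }
    return table.get((g1, g2), "undetermined")


print(apoe_genotype("TT", "CC"))
print(apoe_genotype("TC", "CC"))
print(apoe_genotype("--", "CC"))
```

The genotype strings would come from the genotype column of your raw file for those two rsIDs.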

Yes – good point. I think rare disease carrier status is an excellent example of a robust genomics application.

I also checked my APOE status, and I did some extra research to see if I could understand more about what data is being used for the risk associations. I also have somewhat similar blog posts for moderate-to-high risk cancer genes, in terms of population frequency and/or risk estimates.

(Reply written 4 months ago by Charles Warden)
benformatics (ETH Zurich) wrote, 8 weeks ago:

You can convert your results into a VCF and plug them into Ensembl VEP, then take a look at the nonsense/missense mutations that affect protein-coding genes. However, I would take the results with a grain of salt.
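
To give an idea of what the VCF conversion involves, here is a minimal Python sketch for a single record. The 23andMe file has no reference-allele column, so the `ref` argument is an assumption: you would have to look it up yourself, for example from a reference genome FASTA matching the build of your file. The function name `genotype_to_vcf_line` is made up, and a real VCF also needs header lines (at least `##fileformat` and the `#CHROM` column line) that are not shown here.

```python
def genotype_to_vcf_line(rsid, chrom, pos, genotype, ref):
    """Build one VCF data line from a 23andMe call, or return None to skip it.

    `ref` is the reference base at this position (assumed to be supplied by you).
    """
    alleles = list(genotype.upper())
    # Skip no-calls ("--"), indels coded as I/D, and anything else that is not a base.
    if not alleles or any(a not in "ACGT" for a in alleles):
        return None
    alts = sorted({a for a in alleles if a != ref})
    if not alts:
        return None  # homozygous reference: no variant to annotate
    # Allele index 0 is the reference; ALT alleles are numbered from 1.
    index = {ref: 0}
    index.update({a: i + 1 for i, a in enumerate(alts)})
    gt = "/".join(str(i) for i in sorted(index[a] for a in alleles))
    fields = [chrom, str(pos), rsid, ref, ",".join(alts), ".", ".", ".", "GT", gt]
    return "\t".join(fields)


print(genotype_to_vcf_line("rs0000001", "1", 1000, "AG", "A"))
```

Homozygous-reference calls are dropped here because an annotator has nothing to report for them; keep them if you want a complete genotype record.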


Another variant annotation alternative is OpenCRAVAT, which can run directly on the 23andMe files.

(Reply written 7 weeks ago by Collin)