Forum: How To Analyze Your Own Exome
11.3 years ago
alexej.knaus ▴ 130

Are you curious about your own exome and genetic variants?

Would you like to get your exome sequenced and analyze it yourself?

-> Participate in the Personal Exome Project run by GeneTalk

  1. Visit www.gene-talk.de/pep and click the "donate DNA" button
  2. Read the participant information, enter your personal information and hit the submit button -> PEP support will take care of your participation and contact you
  3. We send you a saliva extraction kit
  4. Send back the saliva sample to GeneTalk
  5. Create an account at GeneTalk
  6. Your VCF file will be directly uploaded onto your account.
    • If you have your own VCF file already, simply register at GeneTalk and upload your VCF file onto your account
  7. Start filtering your own exome VCF at GeneTalk
  8. Provide annotations and view what other users write about specific variants
  9. Rate and comment on existing annotations
  10. Help the community find disease-causing mutations by providing annotations, gene panels for filtering, or your expertise.

If you would like to help finance the PEP and the free platform GeneTalk, please donate here

exome filtering analysis vcf variant • 7.2k views

Is this work being done with the oversight of an ethics review board?


Should it be? Do I need an ethics board to approve my own investigation of my exome for my own purposes? Does 23andMe get reviewed? I don't think so. What if I want to order my own paternity test or blood tests for my own information? I agree that an ethics board should review any research conducted by a commercial entity or academic institute that is carrying out research into new treatments, diagnostics, etc. on a patient population. But it is less clear to me when this is self-directed. Unless we can't trust people to use the information safely. What other information isn't safe for people to have? Just asking questions with hard answers. :-)


Obi, I think you need to look at this project's documentation first. It is not clear to me that it is "self-directed" -- actually I'm not sure what "self-directed" means as you use it. Your examples do not match this situation, either. If you order a paternity test on yourself, I assume you are not doing it with a research intent. You pay a lab for that service, and they provide it. That is not research. In this case, however, it sounds like participants submit saliva samples to be sequenced at GeneTalk. From their documentation, "Your participation in this study will help to establish a database of genetic sequence variants, which will be freely accessible for the scientific community on the platform GeneTalk." They have a web page with what appears to be a consent document, although it is not explicitly labeled as such. It does not approach the level of detail necessary for any consent document I have ever seen. They state: "The data collected within the study will not be disclosed to third parties and are only accessible by physicians and the participant of the study." Which physicians? "The DNA will be stored for the duration of the study." Stored where? What is the duration of the study? "The results of the study will be used in anonymized form for publications in scientific journals." This sounds like research to me. Which should have oversight by a professional ethics review board. So, to answer your question, yes. I don't think it's actually a hard answer at all.


Yes. You are correct. The documentation at the project's website is completely different from how it was presented in the OP. I was responding to the idea presented in the title and body of the post which was to "get your exome sequenced and analyze it by yourself". And, since it read like an advertisement I assumed (always dangerous) the idea was to send a sample, pay a fee, get some exome sequence returned to you, do your own analysis. It was not clear to me that such a scheme (comparable to 23andMe) would require IRB approval. Or maybe it does? But, looking at the details of this project which were not described in the post, it is clear that this is a research project looking for participants and would almost certainly require ethics review.


The data obtained within the PEP that will be used in publications will not require review by an ethics committee. However, we will establish a committee of scientists and clinicians that will discuss the project and provide guidance.


You should really talk to a lawyer before proceeding. We're also based in Germany and I remember many of my clinical colleagues talking about the legal difficulties surrounding collecting/using genetic information here.

Also, you need to run things past the ethics panel before collecting the samples. Any data from humans needs to be run by a review board prior to data collection.


Alexej, who determined that your work does not need ethics board oversight? That determination is actually made by an ethics board, not you. Has an ethics board reviewed your protocol and determined your organization does not need ethics board oversight? Or have you internally made this decision? All of your group's documentation makes it clear that this is a research endeavor, and it involves human subjects. International standards of human subject protection require ethics board review prior to any work being started with human subjects. Please do this -- you may have a wonderful study proposal, but it will make it very difficult for other scientists to make use of your data in the way you seem to intend if the data were not collected in agreement with international ethical research standards.


The idea behind the PEP is that sequence variants from all participants will be counted (as in the large-scale studies), but rather than publishing information about a participant's variants, we or the participant will provide only annotations about those variants. This means that the patient data will not be publicly available in its complete form. GeneTalk will separate patient information (exome data) into single-variant information and store it in the GeneTalk knowledge base. While we will release only the genotype frequencies in a collapsed manner to reduce data abuse, it's up to you to share your whole data with other GeneTalk users.


This seems to be out of place in the tutorial section and reads more like an advertisement. Maybe it would fit better in the Forum section?


So...has there been ethics board review? And I agree with you, Chris, this is more of an advertisement for a research project, not a tutorial.


Details about the PEP study:

Sequencing: Illumina HiSeq2000 (100bp paired-end run)
Enrichment:  SureSelectXT Human All Exon V4
Coverage: 50x
Variant calling: GATK v2.6 (please ask me if you wish to know further details)

We hope to get more participants so that the cost of sequencing drops below $1000.


I second Alex's question regarding oversight by a review board. I am also interested in whether the raw data (FASTQ or BAM) would be provided to the subjects for further independent analysis. Can you provide more technical information as well? Platform, targeted depth of coverage, algorithm used for generating the VCF, etc.


Raw data can be provided to the participants if they want it. The details are explained in the second post.

11.3 years ago
alexej.knaus ▴ 130

OK, here comes the how-to. There are several options for filtering for medically relevant variants, so I will explain several settings and the output information.

Filter for disease-causing variants

Filter settings:

Functional:

  • Nonsynonymous
    • Missense
    • Nonsense
    • Stop loss
  • Frameshift (insertion/deletion)
  • Nonframeshift (insertion/deletion)
  • Splice site affecting

Inheritance:

  • Homozygous positions (i.e., x=y, no matter whether x is reference allele or not)
    • Homozygous variants (i.e., x=y and x≠reference allele)
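For readers who want to reproduce the two inheritance settings on their own VCF outside GeneTalk, here is a minimal sketch. It relies only on the VCF convention that the GT field lists allele indices, with 0 meaning the reference allele and "." meaning a missing call. The function names are made up for illustration and are not part of GeneTalk.

```python
def parse_gt(gt: str) -> list[str]:
    """Split a VCF GT string such as '0/1' or '1|1' into allele indices."""
    return gt.replace("|", "/").split("/")


def is_homozygous_position(gt: str) -> bool:
    """True if both alleles are called and identical (x=y), reference or not."""
    alleles = parse_gt(gt)
    return len(alleles) == 2 and "." not in alleles and alleles[0] == alleles[1]


def is_homozygous_variant(gt: str) -> bool:
    """True if x=y and x is not the reference allele (index 0)."""
    return is_homozygous_position(gt) and parse_gt(gt)[0] != "0"
```

With these definitions, "0/0" counts as a homozygous position but not as a homozygous variant, "1/1" counts as both, and "0/1" or "./." count as neither.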

Annotation:

  • Medical relevance at least 5 star (Disease causing)
  • Scientific evidence at least 5 star (Very high (e.g., therapy studies)) (all dbSNP annotations that have the "clinically precious" flag and HGMD annotations that are tagged as "disease causing" are initially rated 5 star for medical relevance and 5 star for scientific evidence. If you disagree with this rating, you have the power to change it!)
  • remove variants without annotation (1 star)

The filtered VCF file will contain variants that have an effect at the protein level and are rated as disease causing with very high scientific evidence. This information comes largely from existing databases such as HGMD and dbSNP, but may also include variants that were annotated by experts from the GeneTalk community.
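To make the logic of these settings concrete, here is a rough sketch of the same filter applied to already-annotated variants. It is not GeneTalk's implementation: the `AnnotatedVariant` record, its field names, and the effect labels are all hypothetical, and the star ratings are assumed to be integers from 1 (no annotation) to 5.

```python
from dataclasses import dataclass

# Hypothetical labels for the functional classes selected in the filter.
PROTEIN_AFFECTING = {
    "missense", "nonsense", "stop_loss",
    "frameshift_indel", "nonframeshift_indel", "splice_site",
}


@dataclass
class AnnotatedVariant:
    variant_id: str
    effect: str                   # one of the labels above, or e.g. "synonymous"
    homozygous_variant: bool      # x=y and x is not the reference allele
    medical_relevance: int = 1    # stars; 1 is assumed to mean "no annotation"
    scientific_evidence: int = 1  # stars; 1 is assumed to mean "no annotation"


def passes_filter(v, min_relevance=5, min_evidence=5):
    """Apply the functional, inheritance, and annotation settings together."""
    return (
        v.effect in PROTEIN_AFFECTING
        and v.homozygous_variant
        and v.medical_relevance >= min_relevance
        and v.scientific_evidence >= min_evidence
    )


def apply_filter(variants, **thresholds):
    return [v for v in variants if passes_filter(v, **thresholds)]
```

Lowering `min_relevance` or `min_evidence` widens the result set; with both at 5, variants without annotation (1 star) drop out automatically.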

In my case (exome) I found over 600 variants with this filter setting. Many were initially ranked as disease causing, but came from GWAS studies and had a high genotype frequency in the population (1kGP and ESP) and were therefore definitely not disease causing. By rating down the variants that are not disease causing but might be disease associated, you can help the community of scientists and clinicians look for disease-causing variants efficiently.

With an additional filter step:

Frequency: 1%

You can filter out common polymorphisms and reduce the number of annotated variants to around a dozen, which is a very manageable number.
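As an illustration of the frequency step, here is a small sketch. It assumes the 1% setting is an upper bound on the frequency reported by each reference population (for example 1kGP and ESP), that the bound is inclusive, and that a variant with no known frequency is treated as rare. The record layout is hypothetical and the filter on GeneTalk may handle these cases differently.

```python
def is_rare(population_freqs, max_freq=0.01):
    """True if no known population frequency exceeds max_freq.

    population_freqs maps a population name to a frequency in [0, 1],
    or to None when the variant was not observed or reported there.
    """
    known = [f for f in population_freqs.values() if f is not None]
    return all(f <= max_freq for f in known)


def filter_rare(variants, max_freq=0.01):
    """Keep only the variants that are rare in every population with data."""
    return [v for v in variants if is_rare(v["freqs"], max_freq)]
```

A variant seen at 30% in one population is dropped even if another population has no data for it, which matches the idea of removing common polymorphisms first.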

If you have any specific questions that you do not want to discuss on this forum you can send me an email: knaus@gene-talk.de

11.2 years ago
alexej.knaus ▴ 130

The Personal Exomes Project - Participate

Information for participants of the personal exome project for the identification of genetic sequence variants

Dear study participant, Exome sequencing is an effective diagnostic method for identifying rare, disease-causing variants in genetic disorders. The challenge is to discriminate sequence variants that are medically relevant from those that are without clinical impact in a patient. Many monogenic diseases are almost completely penetrant. This means that such disease-causing genotypes will hardly occur in the healthy population. Homozygous sequence variants identified in some unrelated, healthy adults are therefore most likely not the cause of recessive, fully penetrant, early-onset genetic disorders, and heterozygous sequence variants are most likely not the cause of dominant, fully penetrant, early-onset genetic disorders. Knowing that these variants are not disease-causing may be crucial for the identification of disease-relevant mutations, since these variants can be filtered out when the sequence variants of a patient are analyzed.

Your participation in this study will help to establish a database of genetic sequence variants similar to those of other large population studies, such as the 1000 Genomes Project or the 6500 exomes project. We will count how often a certain genotype is observed in all participants of GeneTalk's Personal Exome Project (PEP). We will release an updated collapsed genotype frequency vector whenever there are ten new participants in the PEP. This genotype frequency vector of healthy individuals will be freely accessible to the scientific community on the platform GeneTalk. While we will release only the genotype frequencies in a collapsed manner to reduce data abuse, it's up to you to share your whole data with other GeneTalk users.
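The counting scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the PEP pipeline: the class name, the variant keys, and the genotype labels are invented, and the frequency is assumed to be the number of participants carrying a genotype divided by the number of participants at the time of release.

```python
from collections import Counter


class CollapsedGenotypeCounts:
    """Count genotypes across participants; publish a snapshot every N additions."""

    def __init__(self, release_every: int = 10):
        self.release_every = release_every
        self.n_participants = 0
        self._pending = 0           # participants added since the last release
        self._counts = Counter()    # (variant, genotype) -> number of carriers
        self.released = {}          # last published frequencies
        self.released_n = 0         # participants covered by that release

    def add_participant(self, genotypes: dict) -> None:
        """genotypes maps a variant key to a genotype label, e.g. 'het' or 'hom'."""
        for variant, genotype in genotypes.items():
            self._counts[(variant, genotype)] += 1
        self.n_participants += 1
        self._pending += 1
        if self._pending >= self.release_every:
            self._release()

    def _release(self) -> None:
        # Only aggregated frequencies leave this object, never per-person data.
        self.released = {
            key: count / self.n_participants
            for key, count in self._counts.items()
        }
        self.released_n = self.n_participants
        self._pending = 0
```

Because only the aggregated snapshot in `released` is meant to be published, an individual's genotypes cannot be read back from it directly, which is the point of releasing the data "in a collapsed manner".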

To improve the classification of variant annotations in GeneTalk, we also count on your expertise! This means we would greatly appreciate it if you participate actively in rating the medical relevance and scientific evidence of sequence variants and contribute your expert knowledge. You should filter your own exome with a five star rating for medical relevance and a one star rating for scientific evidence in GeneTalk's annotation filter. All mutations that pass this filter currently have the rating "disease causing" for a rare genetic disorder that is linked in the annotation view. As you are not affected by this disease, you should reduce the rating for medical relevance. You should leave a comment on this annotation so that other users may benefit from your assessment. We will also organize personal exome workshops to discuss the findings of your exome analysis.

For this study, human genetic material (DNA) will be isolated from a saliva sample. Personal data (name, date of birth, address) and molecular genetic data (derived genetic information, results of sequencing) collected within this study will be encrypted (pseudonymized) and digitally stored separately by GeneTalk, Berlin. The physician and principal investigator Dr. Peter Krawitz is responsible for data processing. The data collected within the study will not be disclosed to third parties and are only accessible by physicians and the participant of the study. The DNA will be stored for the duration of the study. The results of the study will be used in anonymized form for publications in scientific journals.

Participation in the study is free of charge. Participants do not receive any reward or payment for donating the saliva sample. Participants agree that they have no claim to compensation, bonuses, or other benefits, or to a share in financial gains that may be achieved on the basis of this study and follow-up research.

You have the right to request all information from the principal investigator about any existing personal data that is obtained during the study. Consent to the use of the DNA sample may be revoked at any time and without giving reasons. In case of cancellation, the DNA sample can be stored for control purposes. However, the participant has the right to demand its destruction. We would like to point out that molecular genetic studies can take a long time; results are therefore unlikely to be available within six months of sample collection. Participation in this study is completely voluntary. A withdrawal of consent is possible at any time and without notice. In case of revocation of consent, all relevant personal data and the samples will be destroyed. We declare that the collected samples are purely for scientific investigations. They are not used for commercial purposes, which excludes the patenting and sale of genetic data.

If you need any further assistance or have any queries regarding the study, please do not hesitate to contact knaus@gene-talk.de
