Question: Gene by Gene Exome file processing
3.7 years ago, pgrovetom wrote:

I took the leap of getting my exome sequenced by Gene by Gene and have now begun trying to go from the files I downloaded to being able to browse genes and SNPs within my exome.

I'm just looking at the files they sent on a Windows system, listed here with the file type and size that Windows shows:

BAI file  7,430 KB

File Association Manager  4,194,304 KB

File Association Manager  2,731,837 KB

GRC15551831_Exome      Windows Live Mail Contact File

GRC15551831_S4_L001_R1_001.fasq    GZ file  2,332,240 KB

GRC15551831_S4_L001_R2_001.fasq    GZ file  2,391,791 KB

GRC15551831_S4_L002_R1_001.fasq    GZ file  2,371,114 KB

GRC15551831_S4_L002_R2_001.fasq    GZ file  2,398,929 KB

I'm looking for any input on how to process the files down to a usable gene/SNP format. Even a service or consultant that could assist me, or someone I could hire to help, would be welcome. I would appreciate any thoughts or help.



I suppose, all the information you need is in the "Windows Live Mail Contact File" (which is in fact a Variant Call Format VCF file)
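
For what it's worth, a VCF is plain tab-separated text, so a few lines of Python are enough to peek inside it. The following is only a minimal sketch, assuming a standard VCF layout (`##` meta lines, one `#CHROM` column header, then one variant per line); the function names `read_vcf_records` and `open_vcf` are made up for illustration and are not part of any library.

```python
import gzip
from typing import Dict, Iterable, Iterator, List

# The eight fixed VCF columns; used only if no "#CHROM" header line is seen.
FIXED_COLUMNS: List[str] = [
    "CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
]


def read_vcf_records(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per variant line, keyed by column name."""
    columns = FIXED_COLUMNS
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue  # blank line or meta-information header
        if line.startswith("#"):
            # Column header: "#CHROM POS ID ... FORMAT <sample name>"
            columns = line.lstrip("#").split("\t")
            continue
        yield dict(zip(columns, line.split("\t")))


def open_vcf(path: str):
    """Open a plain or gzip-compressed VCF file as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")
```

Looping over `read_vcf_records(open_vcf("my_exome.vcf"))` and printing `rec["CHROM"]`, `rec["POS"]`, `rec["REF"]` and `rec["ALT"]` would list the variants one by one; `my_exome.vcf` is a placeholder for whatever the file is actually called once it is renamed.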

Reply written 3.7 years ago by Pierre Lindenbaum

"I suppose, all the information you need is in the "Windows Live Mail Contact File" (which is in fact a Variant Call Format VCF file)"

I've looked at the Broad Institute's "Integrative Genomics Viewer" pages on VCF files, for example "Viewing VCF File with genotypes" and "Viewing Variants".

Are these all my "variants", that is, alleles that are considered SNPs because they differ from the "norm"?

Is it the file that contains the final end product of the exome data and the SNPs found?

I have a limited understanding of bioinformatics, but I do understand that each allele, or difference from the norm, is characterized by its zygosity. Some SNP differences are simply traits, some are harmless, some are fatal, and others have been found to cause disease or raise the probability of disease.
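
To make the zygosity idea concrete, here is a rough sketch of how it could be read from the genotype (GT) field of a VCF record. It assumes the usual GT encoding, where `0` is the reference allele, `1`, `2`, ... index the comma-separated ALT alleles, and `.` means no call. The helper names are invented for this example.

```python
import re
from typing import List


def split_genotype(gt: str) -> List[str]:
    """Split a GT string such as "0/1" or "1|1" into allele indices."""
    return re.split(r"[/|]", gt)


def classify_zygosity(gt: str) -> str:
    """Describe a genotype call in plain words."""
    alleles = split_genotype(gt)
    if "." in alleles:
        return "missing"
    if len(alleles) == 1:
        return "haploid"
    if len(set(alleles)) > 1:
        return "heterozygous"
    # Both alleles are the same index: reference (0) or an alternate (1, 2, ...)
    return "homozygous reference" if alleles[0] == "0" else "homozygous alternate"


def genotype_bases(ref: str, alt: str, gt: str) -> List[str]:
    """Translate allele indices into the actual bases, skipping no-calls."""
    choices = [ref] + alt.split(",")
    return [choices[int(a)] for a in split_genotype(gt) if a != "."]
```

So a record with REF `A`, ALT `G` and GT `0/1` would be heterozygous, with the nucleotide pair A and G.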

I guess there are a couple of levels of viewing the data, depending on the SNPs involved and the genotype (the specific nucleotide pair) at each:

If they are fatal, I wouldn't be asking this question.

If a single SNP corresponds to a very specific disease risk, it should be easy to identify

If a disease risk is based on many SNPs and is probabilistic, it's complicated

So if one had a list of all the SNPs found in one's exome, some could be errors due to the sequencing error rate, but the ones that are not errors could be used to identify genetically based disease probability.
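
As a sketch of how the likely errors might be set aside, VCF records carry a QUAL score and a FILTER column that can be used for a first pass. The code below assumes each record is a dict of column name to string; the `min_qual=30.0` threshold is an arbitrary illustration, not a recommended value, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List


def passes_basic_filters(record: Dict[str, str], min_qual: float = 30.0) -> bool:
    """Keep a record only if FILTER is PASS (or unset) and QUAL is high enough."""
    if record.get("FILTER", ".") not in ("PASS", "."):
        return False  # the variant caller flagged this site
    qual = record.get("QUAL", ".")
    if qual == ".":
        return False  # no quality score available, treat as unreliable
    return float(qual) >= min_qual


def keep_confident(records: Iterable[Dict[str, str]],
                   min_qual: float = 30.0) -> List[Dict[str, str]]:
    """Return only the records that pass the basic filters."""
    return [r for r in records if passes_basic_filters(r, min_qual)]
```

This only removes calls the pipeline itself already considered doubtful; it says nothing about whether a remaining variant matters medically.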

For example, rather than have individual testing for something like the Human Leukocyte Antigen (HLA), for example HLA-B27, I should be able to find my genotype within my exome if it is not normal. Maybe it's not that simple, but I believe HLA-B27 is a specific allele that is well studied in autoimmune conditions.

Is it true that most of the SNPs studied for association with disease probability are located in the exome, and that this is because these are the protein-coding genes?

So should it be possible for me to find a viewer or software that would allow me to identify my HLA-B27 type, or any similar allele-specific genetic test such as BRCA1/2? A result could be an error, given the 70x coverage by Gene by Gene on their Illumina HiSeq and a 0.1% error rate, but it could easily point to specific testing to verify correctness.
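
To put rough numbers on the error question, here is a back-of-the-envelope sketch using the 70x coverage and 0.1% error rate mentioned above. It assumes each read is wrong at a given position independently, which is a simplification (real errors are often systematic, and mapping problems are ignored), and the 14-read threshold (20% of 70) is just an illustrative cutoff I picked.

```python
from math import comb


def prob_at_least(n_reads: int, error_rate: float, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n_reads, error_rate).

    The upper tail is summed directly so that very small
    probabilities are not lost to floating-point rounding.
    """
    return sum(
        comb(n_reads, i) * error_rate**i * (1.0 - error_rate) ** (n_reads - i)
        for i in range(k, n_reads + 1)
    )


coverage = 70        # reads covering one position (from the post)
error_rate = 0.001   # 0.1% per-base error rate (from the post)

# Chance that at least one of the 70 reads is wrong at a given position
p_any_error = prob_at_least(coverage, error_rate, 1)

# Chance that 14 or more reads are wrong at the same position
p_many_errors = prob_at_least(coverage, error_rate, 14)

print(f"P(at least 1 erroneous read)   = {p_any_error:.4f}")
print(f"P(at least 14 erroneous reads) = {p_many_errors:.3e}")
```

Under these assumptions, a stray wrong base in a single read is fairly common, while many reads being wrong by chance at the same position is extremely unlikely, which is why deep coverage helps.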

Thanks for any input.




Reply written 3.7 years ago by pgrovetom