How To Get The Genotype From Sequence Data?
11.9 years ago
Stingery ▴ 20

I have a data set of unphased sequence data, i.e. one nucleotide sequence per locus per individual. From this, I would like to compute the genotype, because a wide range of statistical software works with genotypes but not with sequences. How can I do that? Is there a programme that I could use?

Thanks!

genotyping nucleotide sequence haplotype • 4.8k views
11.9 years ago

You need a type of analysis called SNP calling. There are many tools for it, depending on the type of data you have. A good starting point may be bowtie used in combination with samtools; have a look at the tutorial, which should be a good primer. Another tool is GATK.
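As a rough illustration of what such a workflow can look like, here is a minimal sketch that only builds the command lines for an align, sort, and call sequence. It assumes bowtie2 (rather than the original bowtie), samtools, and bcftools for the calling step; the file names and the function `build_snp_calling_commands` are hypothetical, and the options will need adjusting to your data.

```python
def build_snp_calling_commands(reference, reads, sample):
    """Build argument lists for a simple single-end SNP calling run.

    Assumption: bowtie2, samtools and bcftools are the tools in use.
    Nothing is executed here; the lists are only assembled.
    """
    index_prefix = f"{sample}_index"  # hypothetical index name
    return [
        # Index the reference, then align the reads to it.
        ["bowtie2-build", reference, index_prefix],
        ["bowtie2", "-x", index_prefix, "-U", reads, "-S", f"{sample}.sam"],
        # Sort and index the alignments.
        ["samtools", "sort", "-o", f"{sample}.bam", f"{sample}.sam"],
        ["samtools", "index", f"{sample}.bam"],
        # Pile up the reads and call variant sites into a VCF file.
        ["bcftools", "mpileup", "-f", reference, "-Ou",
         "-o", f"{sample}.bcf", f"{sample}.bam"],
        ["bcftools", "call", "-mv", "-Ov",
         "-o", f"{sample}.vcf", f"{sample}.bcf"],
    ]


commands = build_snp_calling_commands("ref.fa", "reads.fq", "ind1")
for argv in commands:
    print(" ".join(argv))
```

Each list could then be passed to `subprocess.run(argv, check=True)` once the tools are installed and the inputs exist.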

Also have a look at the other discussions on this website.

11.9 years ago

If you have a VCF file after the QC, alignment/mapping, and variant calling steps, you can get the genotype data from it, either with a utility or with vcftools, which can also filter on various conditions (see the VCF specification and the VCFtools manuscript). If you have any missing genotypes, you can perform genotype imputation. If you are new to the analysis of NGS data, refer to Giovanni's answer and the other sequencing-related discussions at BioStar.
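To make the VCF route concrete, here is a minimal sketch of reading the `GT` field from an uncompressed VCF and translating the allele indices into nucleotide genotypes. The function name `parse_genotypes` and the example records are made up for illustration; for real data a dedicated tool such as vcftools is likely the safer choice.

```python
EXAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1\tind2\tind3\n"
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/0:12\t0/1:9\t./.:0\n"
    "chr1\t250\t.\tC\tT,G\t60\tPASS\t.\tGT\t1/2\t1|1\t0/2\n"
)


def parse_genotypes(vcf_lines):
    """Return (sample_names, rows); each row is (chrom, pos, genotypes).

    Genotypes are strings such as "A/G", or None when the call is missing.
    Phasing is ignored, so "1|1" and "1/1" are treated the same.
    """
    samples, rows = [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns follow FORMAT
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        alleles = [ref] + alt.split(",")  # index 0 is REF, 1.. are ALT
        gt_index = fields[8].split(":").index("GT")
        calls = []
        for sample_field in fields[9:]:
            gt = sample_field.split(":")[gt_index]
            indices = gt.replace("|", "/").split("/")
            if "." in indices:
                calls.append(None)  # missing genotype
            else:
                calls.append("/".join(alleles[int(i)] for i in indices))
        rows.append((chrom, int(pos), calls))
    return samples, rows


samples, rows = parse_genotypes(EXAMPLE_VCF.splitlines())
for chrom, pos, calls in rows:
    print(chrom, pos, dict(zip(samples, calls)))
```

The missing calls returned as `None` are the ones that genotype imputation would aim to fill in.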

+1 for the nice answer and for the new photo plus hair cut! :-)


Thanks Giovanni :) !
