Genotype Illumina Sequence Data At A Small Number Of Snp Positions
1
0
Entering edit mode
11.9 years ago
dan.brewer ▴ 120

I have got Illumina whole genome sequence data from a set of samples in BAM format and I have also got genotype data on 90 snps from a Sequenom. What I would like to be able to do is to quickly compare the sequenom data with the sequence data.

As a first step I would like to get the genotype of the positions of the 90 SNPs in the sequence data. Has anyone got an idea of the best approach to do this.

I have been looking at using a combination of samtools mpileup with bcftools but I am not sure if this is appropriate. Problems I'm having:

  1. It seems to require a reference sequence
  2. Using a -r in mpileup is very fast but providing the same data in the bed file is very slow.
  3. I'm not sure that I'm using the correct options as I'm not getting what I would expect.

Command line:

samtools mpileup -u -r 10:106039185-106039185 ../sample_BAMs/ComplexMen/PD7445a.bam | /software/CGP/bin/bcftools view -g -

Output:

#CHROM    POS    ID    REF    ALT    QUAL    FILTER    INFO    FORMAT    PD7445a
10    106039185    .    N    A    222    .    DP=75;VDB=0.0404;AF1=1;AC1=2;DP4=0,0,43,32;MQ=60;FQ=-253    GT:PL:GQ    1/1:255,226,0:995

The test bed file:

10    106039185    106039185

Sorry if this is a dumb question, I'm new to this type of stuff.

genotyping next-gen • 2.1k views
ADD COMMENT
1
Entering edit mode
11.9 years ago

That vcf looks fine. You have an A allele, with 75x high coverage.

I suppose the other thing you could try, assuming that your SNPs are all in unique regions, is to skip the alignment step, and just grep your fastq for all the reads that contain the same sequence as your Sequenom extend primer. You could use a script, or some combo of grep and awk or sed or whatever to get the letter just after the extend primer in each read, and then sum those up.

ADD COMMENT
0
Entering edit mode

So in the example would it be AA because it is 1/1? How would it look if you had CT or something like that?

ADD REPLY
0
Entering edit mode

Yes, and because the DP4 shows all the reads are for the alterante letter, and there are none for the reference letter.

ADD REPLY

Login before adding your answer.

Traffic: 2652 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6