Question: ambiguous characters are not reported in Bowtie2 alignments
Asked 12 months ago by Famf (United States):

I am trying to align a set of read sequences (from a SNP chip array) to a reference genome. I want to get the physical position of the SNP in each read.

My reads are in FASTA format, e.g.

>id00001Zh:Chr02:57,645,640
TGCAGACYCAGACAAGGTTTAACACAGATTGGAACCGTTA
>id00002Zh:Chr07:53,650,797
TGTAAGATTGCYGGCAATGAATGTCAGTCGAGATGAAGAC
>id00003Zh:Chr06:48,898,851
CAGTMATTTTGATCCCTCGGTTGATGTGACTTCAAGCAGTA

I am using Bowtie2. Each read contains one ambiguous character (an IUPAC code), which marks the SNP whose position in the reference genome I want to find. Based on what I understood from the Bowtie2 manual, I was expecting to get the following values in the alignment section of my output SAM file:

CIGAR = 39M and MD tag = MD:Z:7Y32
CIGAR = 39M and MD tag = MD:Z:11Y28
CIGAR = 39M and MD tag = MD:Z:4M35
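
The number in front of the letter in each of those MD strings corresponds to the 0-based offset of the IUPAC code within the read. As a cross-check that does not depend on the aligner, that offset can be computed directly from the FASTA records. The following is only a minimal sketch: the names `IUPAC_AMBIGUOUS`, `parse_fasta`, and `ambiguous_offsets` are illustrative and not part of Bowtie2, and the three example reads are pasted inline rather than read from `snp_chip.fa`.

```python
# IUPAC ambiguity codes, i.e. everything except the plain bases A, C, G, T.
IUPAC_AMBIGUOUS = set("RYSWKMBDHVN")


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def ambiguous_offsets(seq):
    """Return (0-based offset, code) for every ambiguity code in seq."""
    return [(i, base) for i, base in enumerate(seq.upper())
            if base in IUPAC_AMBIGUOUS]


# The three example reads from this post.
reads = """>id00001Zh:Chr02:57,645,640
TGCAGACYCAGACAAGGTTTAACACAGATTGGAACCGTTA
>id00002Zh:Chr07:53,650,797
TGTAAGATTGCYGGCAATGAATGTCAGTCGAGATGAAGAC
>id00003Zh:Chr06:48,898,851
CAGTMATTTTGATCCCTCGGTTGATGTGACTTCAAGCAGTA
"""

for header, seq in parse_fasta(reads):
    print(header, len(seq), ambiguous_offsets(seq))
```

For these reads the offsets are 7, 11, and 4. Assuming a forward-strand alignment whose CIGAR contains only M operations, the reference coordinate of the SNP would be the 1-based SAM POS field plus that offset; reverse-strand or gapped alignments need extra handling that this sketch does not cover.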

Instead, I got the following values for almost all of the alignments:

CIGAR = 40M and MD tag = MD:Z:40

This is the code I am using:

bowtie2 -x <bt2.idx> -f snp_chip.fa -S results.sam --no-unal

Any idea what I could be doing wrong?
