Question: Aligning Colorspace Reads Using Bwa
Asked 9.5 years ago by Farhat (Pune, India):

I have about 100 million reads from a SOLiD run. I am trying to align them using bwa and I got 0 alignments. What am I doing wrong here? Here are the commands that I am using

~/software/bwa-0.5.9/bwa aln -n 6 -t 6 -o 2 -c ~/genomes/hsap/hg19.fa sampleTF5.fastq.gz > sampleTF5.sai
~/software/bwa-0.5.9/bwa samse ~/genomes/hsap/hg19.fa sampleTF5.sai sampleTF5.fastq.gz | samtools view -bS - | samtools sort - sampleTF5

About 40% of the reads align using Bioscope so I know that at least some reads should align. The index was created using -c so it is a colorspace index.

ETA: A couple of reads from the FASTQ file:

@853_2_23
T10201001101112312122022330313023.22201032232203002
+
.06%8+23,-/,740&+2,&(*+&26%&%'';!%'(&)':2((,,-'%(.
@853_2_76
T00221112202322220011002232000222000212301132232001
+
&<*(%'?'&'&5)*'%%%&('-'(()-')&)&%)*'/%%&%'%(%&&'&%
Tags: alignment, solid, bwa (modified 8.7 years ago)

What do your reads look like? Did you use solid2fastq.pl?

Comment written 9.5 years ago by brentp

There are a couple of different scripts called solid2fastq.pl floating around (see http://kevin-gattaca.blogspot.com/2010/05/plethora-of-solid2fastq-or-csfasta.html). The one shipped with bwa double-encodes the reads and the BFAST one doesn't, or at least that was the case a while ago.
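A quick way to tell which kind of file you have is to look at the first sequence line. The sketch below is only an illustration: `guess_encoding` is a made-up helper name, and it assumes a plain four-line-per-record FASTQ on standard input.

```shell
# guess_encoding: report how the first read of a FASTQ (on stdin) looks.
# Hypothetical helper for illustration; assumes four lines per record.
guess_encoding() {
    awk 'NR == 2 {
        # Optional primer base followed only by color digits or "."
        if ($0 ~ /^[ACGT]?[0-3.]+$/)  print "colorspace"
        # Only letters: base-space or already double-encoded
        else if ($0 ~ /^[ACGTN]+$/)   print "letters"
        else                          print "unknown"
        exit
    }'
}
```

For example, `gzip -dc sampleTF5.fastq.gz | guess_encoding` should report `colorspace` for reads like the ones posted in the question, which means they have not been double-encoded yet.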

Comment written 9.5 years ago by Mikael Huss (modified 8.5 years ago by Istvan Albert)

Yes, I used solid2fastq.pl. The reads are 50 bp long colorspace reads. The quality statistics looked okay with FASTQC.

Comment written 9.5 years ago by Farhat

@853_2_23
T10201001101112312122022330313023.22201032232203002
+
.06%8+23,-/,740&+2,&(*+&26%&%'';!%'(&)':2((,,-'%(.
@853_2_76
T00221112202322220011002232000222000212301132232001
+
&<*(%'?'&'&5)*'%%%&('-'(()-')&)&%)*'/%%&%'%(%&&'&%

Comment written 9.5 years ago by Farhat

Answered 9.5 years ago by Farhat (Pune, India):

I found the solution to this, and Alastair's link helped. Apparently bwa needs the colorspace FASTQ file to be 'double encoded': on each sequence line the colors 0, 1, 2, 3 and '.' have to be rewritten as A, C, G, T and N (in Perl, tr/0123./ACGTN/) before bwa will align the reads. I am adding the solution here in case others run into this issue too.
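A minimal sketch of that conversion, assuming a four-line-per-record FASTQ on standard input. `double_encode` is a hypothetical helper name, not part of bwa. Two choices here are assumptions rather than something stated in this thread. First, only the sequence line is translated, because quality strings can themselves contain 0, 1, 2, 3 and '.', so running tr over the whole file would corrupt them. Second, the primer base and the first color are dropped and the quality string is trimmed from the left to the same length, which is what I recall the solid2fastq.pl shipped with bwa doing. The reads posted in the question have 51 sequence characters but only 50 quality characters, so some trimming is needed for the two lines to match.

```shell
# double_encode: colorspace FASTQ on stdin -> double-encoded FASTQ on stdout.
# Hypothetical helper for illustration; assumes four lines per record.
double_encode() {
    awk '
        NR % 4 == 2 {
            # Assumption: drop the primer base and the first color.
            sub(/^[ACGT]./, "")
            # Equivalent of tr/0123./ACGTN/, applied to the sequence line only.
            gsub(/0/, "A"); gsub(/1/, "C"); gsub(/2/, "G"); gsub(/3/, "T")
            gsub(/\./, "N")
            ncol = length($0)
        }
        NR % 4 == 0 {
            # Trim leading quality values so both lines have the same length.
            if (length($0) > ncol) $0 = substr($0, length($0) - ncol + 1)
        }
        { print }
    '
}
```

Usage might look like `gzip -dc sampleTF5.fastq.gz | double_encode | gzip > sampleTF5.de.fastq.gz`, where the output file name is just a placeholder. The converted file then takes the place of the original FASTQ in the bwa aln and bwa samse commands.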

Answered 9.5 years ago by Alastair Kerr (Manchester/UK/Cancer Biomarker Centre at CRUK-MI):

You may need the -a bwtsw option when building the index with bwa index. Check out this thread on SeqAnswers for further information.

EDIT: 2016

bwa does not seem to support colorspace from version 0.6 onwards. The last versions I am aware of that worked were in the 0.5 series (the question above used 0.5.9).

I would suggest looking at BFAST, SHRiMP or NovoalignCS, but I have not needed to work with colorspace reads for many years.

Modified 4.3 years ago

@Alastair: I am trying to index the genome in color space with bwa index -a bwtsw -c GRCh38.r76.fa, but I get the error index: invalid option -- 'c'. Is the -c option deprecated, or is something else wrong here? Thanks.

Comment written 4.3 years ago by Bioinformatist Newbie