I have downloaded some data from the Sequence Read Archive (SRA) using the SRA Toolkit. It is SOLiD data. I have seen people use LifeScope (Life Technologies) to align such reads, as I presume it works for this type of data. Unfortunately, I can't get anyone to help with administrator permissions on our cluster, so I'm looking for alternative ways to do this.
My data, acquired using fastq-dump, looks like this:
My question is: once I have converted the data using one of these, can I proceed with the bwa aligner, filter for mapping quality, and continue? Or should I align using bfast? What I need is a resulting BAM file, as I would like to merge the lanes (each sample is spread across 5 or so), edit the read groups, and sort the files so they can be variant called together with another data set.
This is so-called colorspace data. Bowtie could align colorspace data up to version 1.2.3; support was dropped in version 1.3.0. Hence I would get version 1.2.3 and align with that. The easiest way is probably to install it with conda:
conda install -c bioconda bowtie=1.2.3
You will first have to build a colorspace index from the reference genome with bowtie-build (or use one provided on the bowtie website). Then align with bowtie, which can optionally output a SAM file (see the -S option in its manual) that you can later convert to BAM with samtools.
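As a rough sketch of those steps, something like the following could work. All file names (reference.fa, reference_color, lane1.fastq, lane1) are hypothetical placeholders, and the helper functions run, build_index and align_lane are my own illustration, not part of bowtie or samtools. It assumes the fastq-dump output is a colorspace FASTQ that bowtie 1.2.3 accepts, which you should check on a small subset first. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
# Sketch only. All file names are hypothetical placeholders.
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build a colorspace index (-C) from the reference FASTA.
# usage: build_index <reference.fa> <index_basename>
build_index() {
    run bowtie-build -C "$1" "$2"
}

# Align one lane in colorspace (-C), write SAM (-S),
# then sort to BAM and index it with samtools.
# usage: align_lane <index_basename> <reads.fastq> <out_prefix>
align_lane() {
    run bowtie -C -S "$1" "$2" "$3.sam"
    run samtools sort -o "$3.bam" "$3.sam"
    run samtools index "$3.bam"
}

build_index reference.fa reference_color
align_lane reference_color lane1.fastq lane1
```

Once every lane has its own sorted BAM, the lanes of one sample can be combined with samtools merge. Keeping one BAM per lane until that point makes it easier to assign a read group per lane.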