Question: convert from solid fastq to sanger fastq
4.8 years ago, pcantalupo (United States) wrote:

Hello,

I downloaded a SOLiD fastq file from here: ftp://ftp.ddbj.nig.ac.jp/ddbj_database/dra/fastq/SRA062/SRA062077/SRX207729/. I need to convert it to Sanger fastq because the software pipeline that I wrote only handles Sanger fastq.

Here is what the first two sequences in the file look like:

@SRR943115.1 solid0530_20110107_PE_HIVCB212TRL_HIVCB212TRL_2_27_44_F3 length=50
T.03.0.1.....................................1....3
+SRR943115.1 solid0530_20110107_PE_HIVCB212TRL_HIVCB212TRL_2_27_44_F3 length=50
!!A5!+!:!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!%!!!!%
@SRR943115.2 solid0530_20110107_PE_HIVCB212TRL_HIVCB212TRL_2_27_210_F3 length=50
T.12.1.0.....................................1....0
+SRR943115.2 solid0530_20110107_PE_HIVCB212TRL_HIVCB212TRL_2_27_210_F3 length=50
!!8%!,!%!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!8!!!!&

I already tried to use 'solid2fastq' in BFAST but that program requires a csfasta file and a qual file as input. I have not been able to find a csfastq -> fastq converter script.

Thank you,

Paul

 


I think you will find your answer if you look for similar posts.

Reply by Chris Evelo, written 4.8 years ago

As Chris said, this issue has been widely discussed on Biostars. BTW, which aligner are you using?

Reply by Ashutosh Pandey, written 4.8 years ago (modified 4.8 years ago)

I am aligning with Bowtie2, but then I want to capture the unmapped sequences (I couldn't care less about the human reads) for analysis in my viral metagenomics pipeline, which needs real Gs, As, Ts and Cs. Thank you again... Paul

Reply by pcantalupo, written 4.8 years ago

There are two ways to convert color space data (.csfasta, .qual) into fastq. One is a lossy direct translation of the colors into nucleotides. This is never recommended, because a single sequencing error in the read is propagated to every subsequent base during translation. The other method does not translate the reads into A, T, C, G; it produces colorspace fastq files that can be aligned against a reference genome that has been indexed for color space reads. Both SHRiMP2 and Bowtie can map such reads. Now, coming to your question: Bowtie2 can't map colorspace fastq or csfasta reads, so you will have to use Bowtie. I think the fastq file you have shown above (two records) is already formatted and can be used by Bowtie (remember: Bowtie, not Bowtie2). See this: http://bowtie-bio.sourceforge.net/manual.shtml#colorspace-alignment
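
To make the error-propagation point concrete, here is a minimal Python sketch of the direct translation. It is not taken from any existing tool; the function names `decode_colors` and `convert_record` are made up for illustration. It assumes the standard SOLiD two-base encoding (each color is the XOR of the 2-bit codes of two adjacent bases, with A=0, C=1, G=2, T=3) and a csfastq layout like the records in the question, where the read starts with the primer base and the quality string carries one leading value for it.

```python
BASES = "ACGT"  # 2-bit codes: A=0, C=1, G=2, T=3


def decode_colors(primer_base, colors):
    """Translate a color string into bases, starting from the primer base.

    Each base depends on the previous one: next = previous XOR color.
    """
    prev = BASES.index(primer_base)
    out = []
    for i, color in enumerate(colors):
        if color not in "0123":
            # A missing call ('.') breaks the chain, so this position and
            # everything after it can only be reported as N.
            out.append("N" * (len(colors) - i))
            break
        prev ^= int(color)
        out.append(BASES[prev])
    return "".join(out)


def convert_record(header, read, plus, qual):
    """Convert one csfastq record (four strings, no newlines) to base space.

    Assumption: read[0] is the primer base and qual has one extra leading
    value for it, which is dropped here.
    """
    seq = decode_colors(read[0], read[1:])
    return header, seq, plus, qual[len(qual) - len(seq):]


# One wrong color (second position) changes every base after it.
print(decode_colors("T", "0123"))  # TGAT
print(decode_colors("T", "0023"))  # TTCG
```

Note that both reads shown in the question have a missing call (`.`) as their very first color, so under this scheme they would translate to nothing but Ns.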

Reply by Ashutosh Pandey, written 4.8 years ago (modified 4.8 years ago)

4.8 years ago, pcantalupo (United States) wrote:

Hello Stars,

Thank you to those that commented above. Since I couldn't find a script to suit my needs, I wrote my own colorspace fastq to sanger fastq Perl script in case anybody is interested. If you detect a bug in the script, please let me know.

P


In my comments, I pointed out problems with this kind of conversion. Read this post: Transforming And Manipulating Color Space Reads

Reply by Ashutosh Pandey, written 4.8 years ago

Duly noted.

Reply by pcantalupo, written 4.8 years ago