Question: Generating Srf Files From Solid Reads For Archiving - Best Practice?
Ian (University of Manchester, UK) wrote, 9.8 years ago:

Does anyone have experience of generating SRF formatted SOLiD reads for publication?

I know there is solid2srf, but I do not know whether there is a 'best practice' for generating the files required by sequence archives. There does not seem to be much guidance on what is expected.

I am aiming for ArrayExpress, if that helps. Has anyone else submitted sequences there?


Drio (United States) wrote, 9.5 years ago:

There are really no best practices. The SRF conversion is the easy part. The most annoying part is creating metadata that matches the specifications required by the recipient.


Thanks for answering. Like you, I found the conversion easy, but there is so little documentation that I had no idea what metadata was needed. It turned out that ArrayExpress accepts gzipped .csfasta and .qual files.

Reply written 9.5 years ago by Ian
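For anyone preparing the same kind of submission, here is a minimal Python sketch of gzipping a .csfasta file together with its matching .qual file. The helper names (`gzip_file`, `gzip_solid_pair`) and the example file names are hypothetical, and nothing here reflects an archive-specific requirement beyond the gzipped .csfasta and .qual files mentioned above.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(src, keep_original=True):
    """Compress src to '<src>.gz' and return the path of the new file."""
    src = Path(src)
    dst = src.with_name(src.name + ".gz")
    # Stream the file through gzip so large read files are not loaded into memory.
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    if not keep_original:
        src.unlink()
    return dst


def gzip_solid_pair(csfasta, qual):
    """Compress a .csfasta file and its matching .qual file (hypothetical helper)."""
    return gzip_file(csfasta), gzip_file(qual)
```

A call such as `gzip_solid_pair("run1_F3.csfasta", "run1_F3_QV.qual")` would leave the originals in place and write the two `.gz` files next to them.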
Jorge Amigo (Santiago de Compostela, Spain) wrote, 9.5 years ago:

Of course, it depends on the repository you are submitting to, and this answer may not help if the repositories you have in mind do not accept these formats. Still, I think storing the data directly in BAM format would be wise: the files are smaller, and the reads can be recovered and reprocessed if needed (although some reads may have been dropped at the mapping step). If all raw results must be kept, I would go for FASTQ instead (several mapping tools ship their own solid2fastq implementation), which is also considerably smaller than the csfasta + qual files. Sorry if this does not directly answer your question; I just wanted to leave a few ideas here for anyone who lands on this post looking for ways to store SOLiD data.
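To make the csfasta + qual to FASTQ idea concrete, here is a rough Python sketch of the merge. It is an illustration only, not the output format of any particular solid2fastq tool: the function names are hypothetical, negative quality values are assumed to be clamped to 0, and the primer base is kept at the start of the sequence line, so the sequence is one character longer than the quality string. Tools differ on that last point, so check what your aligner or archive expects.

```python
def read_records(lines):
    """Yield (name, body) pairs from FASTA-like lines, skipping '#' comments."""
    name, parts = None, []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, " ".join(parts)
            name, parts = line[1:], []
        else:
            parts.append(line)
    if name is not None:
        yield name, " ".join(parts)


def phred33(values):
    """Encode integer qualities as Phred+33 characters (assumption: clamp to 0..93)."""
    return "".join(chr(min(max(v, 0), 93) + 33) for v in values)


def solid_to_fastq(csfasta_lines, qual_lines):
    """Yield FASTQ records as strings from paired csfasta and qual lines."""
    pairs = zip(read_records(csfasta_lines), read_records(qual_lines))
    for (name, colors), (qname, quals) in pairs:
        if name != qname:
            raise ValueError(f"read name mismatch: {name} vs {qname}")
        seq = colors.replace(" ", "")  # primer base followed by the colour calls
        qual = phred33(int(q) for q in quals.split())
        yield f"@{name}\n{seq}\n+\n{qual}\n"
```

Both inputs can be open file handles, so the records stream through without holding the whole run in memory.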


The solid2fastq version I have found most useful is the BFAST C implementation, which is extremely fast. If you combine it with pbzip2 (parallel bzip2, which uses all available cores rather than just one, as gzip does), you will shrink your results in no time.

Reply written 9.5 years ago by Jorge Amigo
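The general idea behind block-parallel bzip2 compression can be sketched in Python: split the input into chunks, compress each chunk as its own bzip2 stream, and concatenate the streams in order. This is an illustration of the technique, not pbzip2's actual implementation; the function name, chunk size, and worker count are arbitrary assumptions. A thread pool is used for brevity, and how well it scales across cores depends on the interpreter (a process pool is an alternative).

```python
import bz2
from concurrent.futures import ThreadPoolExecutor


def parallel_bz2(data, chunk_size=900_000, workers=4):
    """Compress bytes as independent bzip2 streams, concatenated in input order."""
    # An empty input still yields one valid (empty) bzip2 stream.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        return b"".join(pool.map(bz2.compress, chunks))
```

The result is a multi-stream bzip2 file, which `bz2.decompress` reads back in one call.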