Is the following script the right way to convert a fastq file (named test.fastq) to a fasta file? Thanks a lot!
In the list, seqtk, bioawk and seqret work with multi-line fastq; the rest don't. If you just want to use standard unix tools, rtliu's sed solution is preferred: it is both short and efficient. Note that the file t.fq was placed in /dev/shm and the results were written to /dev/null, so disk I/O is excluded from these timings. In real applications, I/O may take more wall-clock time than CPU. In addition, the sequence file is frequently gzip'd. For seqtk, decompression takes more CPU time than parsing fastq.
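For reference, a sed-only conversion for strictly 4-line fastq could look like the sketch below. It assumes GNU sed (the `first~step` address form is a GNU extension) and is not necessarily the exact command that was benchmarked; the function name fq2fa_sed is only illustrative.

```shell
# Sketch: convert strictly 4-line fastq to fasta with GNU sed.
# fq2fa_sed is an illustrative name, not a standard tool.
fq2fa_sed() {
    # 1~4 selects header lines (1, 5, 9, ...): swap the leading @ for > and print
    # 2~4 selects sequence lines (2, 6, 10, ...): print unchanged
    sed -n '1~4s/^@/>/p;2~4p' "$1"
}

# Usage: fq2fa_sed test.fastq > test.fasta
```

Because lines are selected by position rather than by pattern, a quality line that happens to start with '@' is never mistaken for a header. The same property means this breaks on multi-line fastq.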
SES observed that seqret was faster than Irsan's command. On my machine, seqret is always slower than most of the others. Is that because of the version, the locale or something else?
I have not tried the native bioperl parser. Probably it is much slower. Using bioperl on large fastq is discouraged.
I do agree 4-line fastq is much more convenient for many people. However, fastq was never "defined" to consist of 4 lines. When it was first introduced at the Sanger Institute for ~1kb capillary reads, it allowed multiple lines.
Tools in many ancient unix distributions (e.g. older AIX) do not work with long lines. I was still working with a couple of such unix systems in 2005. I guess this is why older outputs/formats, such as fasta, blast and genbank/embl, used short lines. This is no longer a concern.
To convert multi-line fastq to 4-line (or multi-line fasta to 2-line fasta): seqtk seq -l0 multi-line.fq > 4-line.fq
why so complicated ?
Why so complicated? :)
seqkit fq2fa in.fastq.gz -o out.fasta
and 2.4 years later...
Hi Pierre, I recently learned that the '@' from a fastq file is trouble when it's left in the fasta file for alignment... https://github.com/samtools/samtools/issues/773
So maybe your oneliner could use another update almost two years later... :)
Here's an update with '@' removal and multi-threaded unzipping thrown in as a bonus. https://zlib.net/pigz/
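A minimal sketch of what such a one-liner could look like, assuming 4-line fastq: pigz (a parallel gzip implementation) does the decompression and awk drops the leading '@' while writing the '>' header. The helper name fqgz2fa and the fallback to gzip when pigz is not installed are my own additions, not part of the original post.

```shell
# Sketch: gzipped 4-line fastq -> fasta, without keeping @ in the read name.
# fqgz2fa is a hypothetical helper name; the gzip fallback is an assumption
# added so the snippet still runs where pigz is not installed.
fqgz2fa() {
    {
        if command -v pigz >/dev/null 2>&1; then
            pigz -dc "$1"
        else
            gzip -dc "$1"
        fi
    } | awk 'NR % 4 == 1 { print ">" substr($0, 2) }  # header: drop @, add >
             NR % 4 == 2 { print }                     # sequence: print as-is'
}

# Usage: fqgz2fa in.fastq.gz > out.fasta
```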
This is a common problem for newcomers. There's no need for a perl script to do this; built-in commands, like the awk one posted by Pierre, work well.
Or this might be a little easier to remember / type:
Can we also apply this to a whole folder, so that all files are converted at once?
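One way to do this is a shell loop over the files. The sketch below assumes uncompressed, 4-line fastq files ending in .fastq in a single directory; the function name fq2fa_dir and the choice to write each .fasta next to its input are illustrative.

```shell
# Sketch: convert every *.fastq file in a directory to fasta.
# Assumes 4-line fastq; fq2fa_dir is an illustrative name.
fq2fa_dir() {
    dir=${1:-.}
    for fq in "$dir"/*.fastq; do
        # Skip the unexpanded pattern when no file matches
        [ -e "$fq" ] || continue
        # sample.fastq -> sample.fasta, written next to the input
        awk 'NR % 4 == 1 { print ">" substr($0, 2) }
             NR % 4 == 2 { print }' "$fq" > "${fq%.fastq}.fasta"
    done
}

# Usage: fq2fa_dir /path/to/folder
```

For gzipped or multi-line input, the awk call inside the loop could be swapped for one of the tools mentioned above that handles those cases.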