Merge a large number of fastq files into a single one
6.8 years ago
catherine ▴ 190

I have 30 small fastq files from the same sample, and I want to merge them into one file. I know the command is

cat file1.fastq file2.fastq > bigfile.fastq

but is there any shortcut for doing it? It just looks silly to type 30 file names one by one...

Thank you for any idea!

ChIP-Seq fastq
Windows users can use this GUI tool (it also works on Linux via Wine): http://www.dnabaser.com/download/Merge%20Fasta/index.html

6.8 years ago
cat file*.fastq > bigfile.fastq
Oh yeah! I was so stupid!

Be cautious about this approach!  Depending on your system, you can enter an endless loop of concatenating the new file to itself.  I strictly do:

cat *.fq > merged.fastq

or

cat *.fastq > merged.fq

...or whatever is needed to ensure the pattern does not match the new file being created.

Does this happen? My understanding is that shell first parses "*.fq" and at that time "merged.fq" has not been generated yet. I bet a lot of people must have typed "cat *.txt > out.txt". Shell developers should have been aware of such an issue for many years. I could be wrong, though.

Actually, it happened to me once. That's why I put the 'file' as prefix for the input and 'bigfile' for the output. But I didn't know that it is system dependent. Thanks for mentioning it, Brian.

I was wrong. You and Brian are right. I can reproduce this endless loop.
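One way to sidestep the issue regardless of system, sketched here assuming a bash-like shell: write the merged file into a separate directory, so the pattern can never match it, even on a second run. The scratch directory, the file names (sample_a.fastq, sample_b.fastq, merged/all.fastq) and the reads are made up for the demo.

```shell
# Scratch directory with two tiny demo FASTQ files (hypothetical names and reads).
workdir=$(mktemp -d)
cd "$workdir"
printf '@read1\nACGT\n+\nIIII\n' > sample_a.fastq
printf '@read2\nTTGA\n+\nIIII\n' > sample_b.fastq

# The output lives in a subdirectory, so *.fastq in the current directory
# cannot pick it up, no matter how often the command is repeated.
mkdir -p merged
cat *.fastq > merged/all.fastq

# Run it a second time: the result is overwritten, not appended to itself.
cat *.fastq > merged/all.fastq

wc -l merged/all.fastq
```

Using a different extension for the output, as suggested above, achieves the same thing; the separate directory just removes the need to think about the pattern at all.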

6.8 years ago

"It just looks silly to type 30 file names one by one..."

with file globbing: http://en.wikipedia.org/wiki/Glob_%28programming%29

cat file*.fastq > bigfile.fastq

note: it also works with fastq.gz files. ( http://stackoverflow.com/questions/8005114 )

cat file*.fastq.gz > bigfile.fastq.gz
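If you want to convince yourself that concatenating gzip files is safe, a small check along these lines should do, assuming gzip, gunzip and cmp are available. The scratch directory, the reads and expected.fastq are made up for the demo.

```shell
# Scratch directory with two tiny demo FASTQ files (hypothetical reads).
workdir=$(mktemp -d)
cd "$workdir"
printf '@read1\nACGT\n+\nIIII\n' > file1.fastq
printf '@read2\nTTGA\n+\nIIII\n' > file2.fastq

# Reference: the plain-text concatenation, made before compressing.
cat file1.fastq file2.fastq > expected.fastq

# Compress each input separately, then concatenate the compressed files.
gzip file1.fastq file2.fastq
cat file*.fastq.gz > bigfile.fastq.gz

# Decompressing the merged file should give back the plain concatenation.
gunzip -c bigfile.fastq.gz | cmp - expected.fastq && echo "identical"
```

This works because a gzip file may consist of several compressed members back to back, and gunzip decompresses them one after another.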

Error while using: cat *.R1_unmapped.fq > unmapped_R1.fq

216_7W_Ca1_R1_unmapped.fq
216_9W_Co2_R1_unmapped.fq
218_5W_Pa1_R1_unmapped.fq
218_7W_Pa2_R1_unmapped.fq

[root@psgl unmapped]# cat *.R1_unmapped.fq > unmapped_R1.fq

cat: *.R1_unmapped.fq: No such file or directory

(The pattern has an extra dot: the file names contain _R1, not .R1.)

cat *_R1_unmapped.fq > unmapped_R1.fq
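To catch this kind of typo before merging, it can help to print what the pattern matches first and to compare line counts afterwards. A rough sketch, assuming bash with default globbing settings; two of the file names come from the listing above, and the reads inside them are made up.

```shell
# Scratch directory with two tiny demo files (read contents are hypothetical).
workdir=$(mktemp -d)
cd "$workdir"
printf '@r1\nACGT\n+\nIIII\n' > 216_7W_Ca1_R1_unmapped.fq
printf '@r2\nTTGA\n+\nIIII\n' > 218_5W_Pa1_R1_unmapped.fq

# Preview the match. With default bash settings an unmatched pattern is
# printed back literally, which makes a typo easy to spot.
printf '%s\n' *_R1_unmapped.fq

# The output name does not end in _R1_unmapped.fq, so the pattern skips it.
cat *_R1_unmapped.fq > unmapped_R1.fq

# Sanity check: the merged file should have as many lines as the inputs
# together, and the FASTQ record count is that number divided by 4.
in_lines=$(cat *_R1_unmapped.fq | wc -l)
out_lines=$(wc -l < unmapped_R1.fq)
echo "input lines: $in_lines, merged lines: $out_lines, records: $((out_lines / 4))"
```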

Nice solution. Yes. Basically you need to do a 'dumb' file merge.