Question

Merge a large number of fastq files into a single one

10

9.5 years ago

catherine ▴ 250

I have 30 small fastq files from the same sample, and I want to merge them into one file. I know the command is

cat file1.fastq file2.fastq > bigfile.fastq

but is there a shortcut for doing it? It just looks silly to type 30 file names one by one...

Thank you for any idea!

ChIP-Seq fastq • 98k views

ADD COMMENT • link updated 2.3 years ago by Ram 44k • written 9.5 years ago by catherine ▴ 250

0

Windows users can use this GUI tool (it also works on Linux via Wine): http://www.dnabaser.com/download/Merge%20Fasta/index.html

ADD REPLY • link 8.3 years ago by BioApps ▴ 800

16

9.5 years ago

Pierre Lindenbaum 164k

It just looks silly to type 30 file names one by one...

With file globbing

cat file*.fastq > bigfile.fastq

Note: It also works with fastq.gz files. (http://stackoverflow.com/questions/8005114)

cat file*.fastq.gz > bigfile.fastq.gz
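A quick way to check a merge like this is to compare line counts before and after. The sketch below is only an illustration: the two tiny files and their reads are made up, and it assumes a bash-like shell with `gzip` installed.

```shell
# Scratch directory with two tiny, hypothetical FASTQ files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '@read1\nACGT\n+\nIIII\n' > file1.fastq
printf '@read2\nTTGA\n+\nIIII\n' > file2.fastq

# Merge with a glob. "bigfile.fastq" does not match file*.fastq.
cat file*.fastq > bigfile.fastq

# Count lines going in and coming out; the two numbers should agree.
in_lines=$(cat file*.fastq | wc -l | tr -d ' ')
out_lines=$(wc -l < bigfile.fastq | tr -d ' ')

# Same idea for gzipped input: concatenated gzip files decompress as one stream.
gzip -c file1.fastq > file1.fastq.gz
gzip -c file2.fastq > file2.fastq.gz
cat file*.fastq.gz > bigfile.fastq.gz
gz_lines=$(gzip -dc bigfile.fastq.gz | wc -l | tr -d ' ')

echo "in=$in_lines out=$out_lines gz=$gz_lines"
```

If the files use the usual four lines per record, dividing the line count by 4 gives the number of reads.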

ADD COMMENT • link updated 2.3 years ago by Ram 44k • written 9.5 years ago by Pierre Lindenbaum 164k

0

Error while using: cat *.R1_unmapped.fq > unmapped_R1.fq

216_7W_Ca1_R1_unmapped.fq
216_9W_Co2_R1_unmapped.fq
218_5W_Pa1_R1_unmapped.fq
218_7W_Pa2_R1_unmapped.fq

[root@psgl unmapped]# cat *.R1_unmapped.fq > unmapped_R1.fq

cat: *.R1_unmapped.fq: No such file or directory

ADD REPLY • link updated 2.3 years ago by Ram 44k • written 7.7 years ago by Bioinfonext ▴ 470

1

(You have an extra dot: the pattern has a dot before R1 where the file names have an underscore.)

cat *_R1_unmapped.fq > unmapped_R1.fq
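One way to catch this kind of mismatch is to count what a pattern matches before passing it to cat. The following is a minimal sketch, assuming a bash-like shell where an unmatched glob is left as literal text; `count_matches` is a hypothetical helper, and the empty files only stand in for the real ones.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
# Empty stand-ins named like two of the files listed above.
touch 216_7W_Ca1_R1_unmapped.fq 216_9W_Co2_R1_unmapped.fq

# Hypothetical helper: prints how many of its arguments are existing files.
# An unmatched glob arrives as literal text, which does not exist as a file.
count_matches() {
  n=0
  for f in "$@"; do
    if [ -e "$f" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

dot_matches=$(count_matches *.R1_unmapped.fq)         # pattern with the extra dot
underscore_matches=$(count_matches *_R1_unmapped.fq)  # corrected pattern
echo "dot pattern: $dot_matches, underscore pattern: $underscore_matches"
```

A plain `ls` with the same pattern gives the same information interactively.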

ADD REPLY • link 7.7 years ago by vmicrobio ▴ 290

0

Nice solution. Yes. Basically you need to do a 'dumb' file merge.

ADD REPLY • link 7.6 years ago by BioApps ▴ 800

Ram · Accepted Answer · 2015-03-26

12

9.5 years ago

David Langenberger 11k

cat file*.fastq > bigfile.fastq

ADD COMMENT • link updated 2.3 years ago by Ram 44k • written 9.5 years ago by David Langenberger 11k

0

Oh yeah! I was so stupid!

ADD REPLY • link 9.5 years ago by catherine ▴ 250

13

Be cautious about this approach! Depending on your system, you can enter an endless loop of concatenating the new file to itself. I strictly do:

cat *.fq > merged.fastq

or

cat *.fastq > merged.fq

...or whatever is needed to ensure the pattern does not match the new file being created.
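Another way to keep the output out of reach of the pattern is to write it into a separate directory. This is a sketch under assumptions: the file names, the reads, and the `merged/` directory are made up for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '@read1\nACGT\n+\nIIII\n' > a.fq
printf '@read2\nTTGA\n+\nIIII\n' > b.fq

# The output lives in a subdirectory, so *.fq in the current
# directory cannot match it, even when the command is re-run.
mkdir -p merged
cat *.fq > merged/all.fq
cat *.fq > merged/all.fq   # second run overwrites with the same content

merged_lines=$(wc -l < merged/all.fq | tr -d ' ')
echo "merged lines: $merged_lines"
```

The glob `*.fq` does not match the directory name `merged`, so the inputs stay the same on every run.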

ADD REPLY • link updated 2.3 years ago by Ram 44k • written 9.5 years ago by Brian Bushnell 20k

0

Does this happen? My understanding is that the shell first expands "*.fq", and at that time "merged.fq" has not been generated yet. I bet a lot of people must have typed "cat *.txt > out.txt". Shell developers should have been aware of such an issue for many years. I could be wrong, though.
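A small experiment can show what the glob expands to. The sketch below uses `echo` instead of `cat` so that nothing is actually concatenated, and it assumes a POSIX-style shell such as bash; other shells or setups may behave differently. The file names are made up.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a.txt b.txt

# First run: out.txt does not exist yet when *.txt is expanded.
echo *.txt > out.txt
first_run=$(cat out.txt)

# Second run: out.txt now exists, so the same pattern also matches it.
echo *.txt > out.txt
second_run=$(cat out.txt)

echo "first run:  $first_run"
echo "second run: $second_run"
```

In this setup the output file only shows up in the expansion once it already exists from an earlier run, which is one situation where a pattern like "*.txt" can pick up its own output.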

ADD REPLY • link 9.5 years ago by lh3 33k

1

Actually, it happened to me once. That's why I use 'file' as the prefix for the input and 'bigfile' for the output. But I didn't know that it is system dependent. Thanks for mentioning it, Brian.