Question: Merge fastq files
Asked 9 months ago by rse (90), Singapore

Hi,

I have the following fastq files: *_L001_R1.fastq, *_L001_R2.fastq, *_L002_R1.fastq, *_L002_R2.fastq, *_L001_I1.fastq, *_L001_I2.fastq, *_L002_I1.fastq, *_L002_I2.fastq

How do I merge these files?

Thank you

Tags: sequencing, next-gen

Do you mean: cat file1 file2 file3 > mergefile ?

written 9 months ago by zhangdengwei (50)

Yes, I want to merge these files into paired-end files, R1.fastq and R2.fastq, so I will merge all the R1s and all the R2s together using separate cat commands. But I am not sure what to do with I1 and I2. Do I just ignore them?

modified 9 months ago • written 9 months ago by rse (90)

Perhaps first explain what the goal of the merging is.

If it is simply to 'reduce' the number of files, then yes, cat (as suggested by zhangdengwei) will do, but that actually makes little 'biological/technical' sense.

written 9 months ago by lieven.sterck (6.7k)

I want to merge these files into paired-end files, R1.fastq and R2.fastq, so I will merge all the R1s and all the R2s together using separate cat commands. But I am not sure what to do with I1 and I2. Do I just ignore them?

written 9 months ago by rse (90)

OK, then cat is NOT the correct approach. What you should look for are tools that create interleaved fastq files from separate fastq files. Simply cat-ing R1 and R2 together will not generate a valid paired-end fastq file.

I still don't fully get why you want to merge them though; most programs expect two files when processing paired-end data anyway (or at best the interleaved format as explained above).
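
For reference, interleaving means writing each R1 record immediately followed by its R2 mate. Below is a minimal Python sketch of that idea, assuming uncompressed FASTQ with exactly 4 lines per read and both files listing reads in the same order. The names `read_fastq` and `interleave` are illustrative and do not come from any particular tool.

```python
from itertools import islice, zip_longest


def read_fastq(path):
    """Yield FASTQ records as lists of 4 newline-terminated lines."""
    with open(path) as handle:
        while True:
            record = list(islice(handle, 4))
            if not record:
                return
            if len(record) != 4:
                raise ValueError(f"truncated record in {path}")
            # Normalise line endings so a missing final newline cannot glue records.
            yield [line.rstrip("\n") + "\n" for line in record]


def interleave(r1_path, r2_path, out_path):
    """Write R1 and R2 records alternately; return the number of pairs written."""
    pairs = 0
    with open(out_path, "w") as out:
        for rec1, rec2 in zip_longest(read_fastq(r1_path), read_fastq(r2_path)):
            if rec1 is None or rec2 is None:
                raise ValueError("R1 and R2 contain different numbers of reads")
            out.writelines(rec1)
            out.writelines(rec2)
            pairs += 1
    return pairs
```

This sketch does not check that the read names of each pair match; a dedicated tool would normally do that as well.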

modified 9 months ago • written 9 months ago by lieven.sterck (6.7k)

OK, thank you. Yes, I have 4 files (2 R1 and 2 R2 files from different lanes), so I am merging the 4 files into 2 files.

written 9 months ago by rse (90)

Does anyone know how to handle the I1 and I2 files? Thank you

written 9 months ago by rse (90)

Those (the I files, I mean) you can omit; they are index files and are not needed for typical downstream analysis.

If you want to join the two R1 files together and then the two R2 files, you can use cat (but make sure you keep the lane order the same for R1 and R2). If you want to join R1 with R2, you will have to go for the interleaved format.
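
A minimal sketch of the lane merge in Python, assuming uncompressed FASTQ with 4 lines per read. `merge_lanes`, `count_reads` and `merge_paired_lanes` are made-up helper names. The lanes are concatenated in the same order for R1 and R2 so that mates stay at matching positions, and the read counts are compared afterwards as a sanity check.

```python
import shutil


def merge_lanes(lane_paths, out_path):
    """Concatenate files in the given order, like `cat a b > out`."""
    with open(out_path, "wb") as out:
        for path in lane_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


def count_reads(path):
    """Count reads, assuming plain 4-line FASTQ records."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle) // 4


def merge_paired_lanes(r1_lanes, r2_lanes, out_r1, out_r2):
    """Merge R1 and R2 lane files in matching order; return the read count."""
    if len(r1_lanes) != len(r2_lanes):
        raise ValueError("need one R2 file per R1 file, in the same lane order")
    merge_lanes(r1_lanes, out_r1)
    merge_lanes(r2_lanes, out_r2)
    n1, n2 = count_reads(out_r1), count_reads(out_r2)
    if n1 != n2:
        raise ValueError(f"R1 has {n1} reads but R2 has {n2}")
    return n1
```

For the files in the question, the call would pass the L001 and L002 R1 files as the first list and the L001 and L002 R2 files as the second, in that same lane order, with the real sample prefix in place of the `*`.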

written 9 months ago by lieven.sterck (6.7k)

OK, understood. Thank you for the help.

written 9 months ago by rse (90)

lieven.sterck: That is only correct if one has no interest in the index sequences (not sure why one would run these samples as indexed in the first place, but stuff happens).

It sounds like these samples are not demultiplexed: the index reads are present in separate files. This type of data is generally required for QIIME analysis.

rse: Are these 16S/metagenomic samples? If so, you will need to make use of those I* files. If these are not for QIIME analysis, are you interested in separating the samples based on the index sequences?
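
One rough way to check whether the data are still multiplexed is to tally the sequences in an I1 file. The sketch below assumes uncompressed FASTQ with 4 lines per read; `index_counts` is an illustrative name, not part of QIIME or any other tool.

```python
from collections import Counter
from itertools import islice


def index_counts(i1_path):
    """Tally index sequences in an I1 FASTQ file (line 2 of each 4-line record)."""
    counts = Counter()
    with open(i1_path) as handle:
        # Start at the second line and step by 4 to visit only sequence lines.
        for seq in islice(handle, 1, None, 4):
            counts[seq.strip()] += 1
    return counts
```

Calling `index_counts(path).most_common(10)` lists the most frequent index sequences. Several distinct sequences with high counts would suggest that multiple samples are pooled in the file.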

modified 9 months ago • written 9 months ago by genomax (76k)

I stand corrected.

Indeed, I jumped to a conclusion too soon. Of course the index files are useful (and required) for some analyses.

written 9 months ago by lieven.sterck (6.7k)