Question: Merging fastq files
11 months ago, william.mcentegart wrote:

Good day, colleagues. I have been running RNA-seq on the Illumina MiSeq with a control and two test groups, with 3 samples in each group. I did one MiSeq run with all 9 samples, then two more MiSeq runs with 5 samples on one and 4 on the other (to increase the amount of data).

I have analysed the data and it all looks good. My boss, however, is unhappy that the data has come from different MiSeq runs on separate days. They want me to merge the fastq files from the different runs into one fastq file for each sample, so that it looks better to reviewers when it comes to publishing time.

I feel this is (a) a bit pointless and (b) like it might be fudging the data. I also couldn't merge the fastq files successfully using cat.

Does anyone know a way to successfully merge fastq files, or have a solid explanation as to why it will make no difference (so I can just tell him it won't work and we can all move on)?

Thanks

Tags: rna-seq, merge, miseq, fastq, illumina

Give us more details about not being able to cat the files together. That should almost always work.

BTW, point out to your boss that we can determine the number of runs from looking at the fastq files, so it's not like concatenating the files is really going to mask that from reviewers that care.

Reply by Devon Ryan, written 11 months ago
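To illustrate the point about run information surviving a merge, here is a minimal Python sketch. It assumes the reads carry standard Illumina (Casava 1.8+) headers, where the first three colon-separated fields are the instrument, run number, and flowcell ID, and that each record is exactly four lines long. The function name `run_keys` and the example read names are made up for illustration.

```python
from collections import Counter

def run_keys(fastq_lines):
    """Count reads per (instrument, run number, flowcell ID).

    Assumes 4-line fastq records with Illumina-style headers, e.g.
    @INSTRUMENT:RUN:FLOWCELL:LANE:TILE:X:Y READ:FILTER:CONTROL:INDEX
    """
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:  # header line of each record
            read_id = line[1:].split(" ")[0]  # drop "@" and the comment part
            counts[tuple(read_id.split(":")[:3])] += 1
    return counts

# Hypothetical merged file: two reads from run 15, one from run 17.
merged = [
    "@M01234:15:000000000-A1B2C:1:1101:15589:1332 1:N:0:1", "ACGT", "+", "IIII",
    "@M01234:15:000000000-A1B2C:1:1101:15600:1340 1:N:0:1", "GGCA", "+", "IIII",
    "@M01234:17:000000000-D3E4F:1:1101:14000:1400 1:N:0:1", "TTAG", "+", "IIII",
]
runs = run_keys(merged)
print(len(runs), "distinct run(s) found in the merged reads")
```

If the headers were rewritten or stripped by an upstream tool, this check would not apply.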

"the data has come from different miseq runs on separate days."

Do we have to worry about batch effect?

Reply by RamRS, written 11 months ago

Not if they're the same library. If a different library was made then quite likely (less so if they use a robot for library prep).

Reply by Devon Ryan, written 11 months ago

Using cat should work (see the comment by RamRS); I also recently used cat to merge fastq files without problems. However, as Devon correctly pointed out, if the fastq files originate from different libraries, merging them might not be a good idea. In that case you might want to include the run as a covariate in the differential expression model instead of merging. See the DESeq2 manual for a hint; the same idea should carry over to most other methods for detecting DE.

Reply by Fabio Marroni, written 8 months ago

If your fastq files are gzipped, you might need to use zcat instead of cat.

Reply by grant.hovhannisyan, written 11 months ago

I think cat works fine with gzipped files also. See: A: How To Merge Two Fastq.Gz Files?

Reply by RamRS, written 11 months ago
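For anyone who wants to check the gzip claim for themselves, here is a small Python sketch. It relies on the fact that a gzip file may contain several compressed members back to back, so joining two .gz files byte for byte (which is what cat does) gives a valid .gz file. The file names, the helper functions `write_gz` and `concat_bytes`, and the toy reads are made up for illustration.

```python
import gzip
import os
import shutil
import tempfile

def write_gz(path, text):
    """Write text to a gzip-compressed file."""
    with gzip.open(path, "wt") as fh:
        fh.write(text)

def concat_bytes(sources, dest):
    """Append the raw bytes of each source file to dest, like `cat a b > dest`."""
    with open(dest, "wb") as out:
        for src in sources:
            with open(src, "rb") as fh:
                shutil.copyfileobj(fh, out)

tmp = tempfile.mkdtemp()
run1 = os.path.join(tmp, "sampleA_run1.fastq.gz")
run2 = os.path.join(tmp, "sampleA_run2.fastq.gz")
merged = os.path.join(tmp, "sampleA_merged.fastq.gz")

# Two toy single-read fastq files standing in for two sequencing runs.
write_gz(run1, "@r1\nACGT\n+\nIIII\n")
write_gz(run2, "@r2\nGGCA\n+\nIIII\n")

# Join the compressed files without decompressing them.
concat_bytes([run1, run2], merged)

# The merged file decompresses to the reads of both runs, in order.
with gzip.open(merged, "rt") as fh:
    merged_text = fh.read()
print(merged_text.count("\n") // 4, "records in merged file")
```

This only shows that the merged file is readable; whether merging is appropriate still depends on the library and batch considerations discussed above.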