I have FASTQ files from a bacterium cultured in two different media. Each medium has 3 replicates. If I were to perform a de novo transcriptome assembly, how should I proceed? Should I merge the 3 replicate FASTQ files into one, or just proceed with a single file? And if I need to merge, which tool would be the best?
Since this is bacterial data, Rockhopper is an option: https://cs.wellesley.edu/~btjaden/Rockhopper/
The question is whether "best" refers to which tool is best for merging or which tool is best for assembling.
Truth be told, I've never heard of Rockhopper, but it looks interesting. I ought to give it a go and benchmark it.
As for merging tools, typically, there is no need for a tool there. You can usually list multiple files as input; if not, you can concatenate files.
By the way, quite surprisingly, even gzipped files can simply be concatenated without decompressing them, and the result remains a valid gzipped file.
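To make the gzip point concrete, here is a minimal sketch in Python. The three reads are made-up stand-ins for the replicate files, and everything happens in memory. Joining the compressed bytes end to end is the same operation that `cat` performs on `.gz` files on disk.

```python
import gzip
import io

def gzip_bytes(text: str) -> bytes:
    """Compress a string into one standalone gzip member."""
    return gzip.compress(text.encode("ascii"))

# Made-up single-read FASTQ contents standing in for three replicate files.
rep1 = "@read1\nACGT\n+\nIIII\n"
rep2 = "@read2\nTTGA\n+\nIIII\n"
rep3 = "@read3\nGGCC\n+\nIIII\n"

# Byte-level concatenation of the compressed data, with no decompression step.
merged = gzip_bytes(rep1) + gzip_bytes(rep2) + gzip_bytes(rep3)

# Reading the merged stream back yields every record, in the original order.
with gzip.open(io.BytesIO(merged), "rt") as handle:
    merged_text = handle.read()

# Each FASTQ record is 4 lines long.
n_reads = merged_text.count("\n") // 4
print(n_reads)
```

This works because the gzip format allows a file to consist of several compressed members back to back, and standard readers decompress them one after another.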
The way the original post is worded is open to interpretation, but in my mind the "best" part refers to the original aim of the question: bacterial RNA-seq. Rockhopper is designed specifically for it.
I was thinking of concatenating the same way, but what would the resulting file look like? Wouldn't it be cluttered if the 3 replicate FASTQ files were merged? That is why I asked for a tool.
There is a nuance. For de novo assembly of the transcriptome, you can concatenate the files so that a single transcriptome is created for the genome. This ensures that expression is represented comprehensively. Since these are biological replicates, the actual differential expression analysis should be done with the individual files, using the transcriptome you assemble.
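The split described above (pooled reads for assembly, separate replicates for differential expression) could be organized roughly as in the sketch below. The file names, the `mediumA`/`mediumB` labels, and the `pool_fastq` helper are all hypothetical placeholders, and the dummy reads exist only so the example runs end to end. No assembler or mapper is invoked.

```python
import shutil
import tempfile
from pathlib import Path

def pool_fastq(inputs, output):
    """Append every input FASTQ to one pooled file (hypothetical helper)."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return output

# Assumed design: 2 media x 3 biological replicates; names are placeholders.
samples = [
    {"sample": f"{medium}_rep{i}", "condition": medium,
     "fastq": f"{medium}_rep{i}.fastq"}
    for medium in ("mediumA", "mediumB")
    for i in (1, 2, 3)
]

# Write one dummy read per sample into a temporary directory.
workdir = Path(tempfile.mkdtemp())
for s in samples:
    (workdir / s["fastq"]).write_text(f"@{s['sample']}\nACGT\n+\nIIII\n")

# Assembly input: all replicates from both media pooled into one file.
pooled = pool_fastq(
    [workdir / s["fastq"] for s in samples],
    workdir / "pooled_for_assembly.fastq",
)

# Differential expression input: one file per replicate, kept separate
# so each can be mapped to the assembled transcriptome on its own.
de_inputs = {s["sample"]: workdir / s["fastq"] for s in samples}
```

Keeping the per-sample table alongside the pooled file means the replicate and condition labels are still available when the counts are compared later.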
OK, got it. I'll use all the files to create the de novo transcriptome, but for DGE I'll map each file individually. Thanks a lot.