Question: Assembling a Transcriptome using Trinity with multiple Illumina Fastq files
21 months ago, Bioinformatics_finn wrote:

Hey y'all,

I am an undergraduate biology major and have just started exploring the field of bioinformatics. I have a few questions about how I should go about running Trinity when I have multiple FASTQ files from each sample. Should I concatenate all of the forward files together and all of the reverse read files together, so that I'm left with one forward and reverse? Or is it better to just pass them all to Trinity at once, as in the following command:

    Trinity --seqType fq --max_memory 50G --left 2016BZ017_S11_L001.cleaned.1.fastq,2016BZ017_S11_L002.cleaned.1.fastq,2016BZ017_S2_L001.cleaned.1.fastq,2016BZ017_S2_L002.cleaned.1.fastq --right 2016BZ017_S11_L001.cleaned.2.fastq,2016BZ017_S11_L002.cleaned.2.fastq,2016BZ017_S2_L001.cleaned.2.fastq,2016BZ017_S2_L002.cleaned.2.fastq --CPU 6

or to run it on the concatenated files:

    Trinity --seqType fq --max_memory 50G --left 2016BZ017_1_concatenated.fastq --right 2016BZ017_2_concatenated.fastq --CPU 6

My sample folder contains the following FASTQ files:

    2016BZ017_S11_L001.cleaned.1.fastq
    2016BZ017_S11_L001.cleaned.2.fastq
    2016BZ017_S11_L002.cleaned.1.fastq
    2016BZ017_S11_L002.cleaned.2.fastq
    2016BZ017_S2_L001.cleaned.1.fastq
    2016BZ017_S2_L001.cleaned.2.fastq
    2016BZ017_S2_L002.cleaned.1.fastq
    2016BZ017_S2_L002.cleaned.2.fastq

Also: Would y'all suggest trimming the Illumina adapters using the Trimmomatic option within Trinity, or running Trimmomatic standalone?

Thank you all so much


Should I concatenate all of the forward files together and all of the reverse read files together, so that I'm left with one forward and reverse?

I would recommend doing this. You want to create a single assembly across all samples. If you're doing differential gene expression (DGE) analysis downstream, you can specify replicates later.
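A minimal sketch of the concatenation, assuming the file naming from the question (prefix.cleaned.1.fastq for forward reads, prefix.cleaned.2.fastq for reverse reads). The helper concat_mates is hypothetical and not part of Trinity; the only point it illustrates is that forward and reverse files should be appended in the same lane order so that mates stay in step:

```shell
# concat_mates MATE OUT PREFIX...
# Hypothetical helper (not a Trinity tool): appends PREFIX.cleaned.MATE.fastq
# for every PREFIX, in the order given, into the single file OUT.
concat_mates() {
    mate=$1
    out=$2
    shift 2
    : > "$out"    # start from an empty output file
    for prefix in "$@"; do
        cat "${prefix}.cleaned.${mate}.fastq" >> "$out" || return 1
    done
}

# Usage with the files from the question (same prefix order for both mates):
# concat_mates 1 2016BZ017_1_concatenated.fastq 2016BZ017_S11_L001 2016BZ017_S11_L002 2016BZ017_S2_L001 2016BZ017_S2_L002
# concat_mates 2 2016BZ017_2_concatenated.fastq 2016BZ017_S11_L001 2016BZ017_S11_L002 2016BZ017_S2_L001 2016BZ017_S2_L002
```

Passing the same list of prefixes to both calls is what keeps read N of the forward file paired with read N of the reverse file.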

Reply written 21 months ago by st.ph.n

Would you suggest removing the header of the fastq files that are being concatenated?

Reply written 21 months ago by Bioinformatics_finn

No, the header information is needed to pair and assemble the sequences.
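As a rough illustration of why the headers matter, here is a sanity check that the forward and reverse files list the same read names in the same order. Both read_names and pairs_match are hypothetical helpers written for this sketch, and it assumes plain four-line FASTQ records whose names either end in /1 and /2 or carry the mate number after a space:

```shell
# read_names FILE
# Print the read name from each FASTQ header (lines 1, 5, 9, ...), without the
# leading "@", a trailing /1 or /2, or anything after the first whitespace.
read_names() {
    awk 'NR % 4 == 1 { name = substr($1, 2); sub(/\/[12]$/, "", name); print name }' "$1"
}

# pairs_match FORWARD REVERSE
# Succeeds only if both files contain the same read names in the same order.
pairs_match() {
    names_fwd=$(mktemp) || return 1
    names_rev=$(mktemp) || { rm -f "$names_fwd"; return 1; }
    if read_names "$1" > "$names_fwd" &&
       read_names "$2" > "$names_rev" &&
       cmp -s "$names_fwd" "$names_rev"; then
        status=0
    else
        status=1
    fi
    rm -f "$names_fwd" "$names_rev"
    return "$status"
}

# Usage:
# pairs_match 2016BZ017_1_concatenated.fastq 2016BZ017_2_concatenated.fastq && echo "pairs in sync"
```

If the headers were stripped, a check like this would have nothing to compare, and the same goes for any tool that relies on read names to match mates.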

Reply written 21 months ago by st.ph.n

21 months ago, h.mon (Brazil) wrote:

Concatenating the files and passing them as --left 1.fq --right 2.fq, or passing comma-separated lists of several files, should not make a difference for the assembly: when you pass several files, as in your first command line, Trinity will convert them to FASTA and concatenate them prior to assembly. Not concatenating will save disk space, at least temporarily.
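If you go the list route, typing the comma-separated lists by hand is error-prone. A small sketch, where join_commas is a hypothetical helper and the glob patterns assume the file naming from the question:

```shell
# join_commas ARG...
# Join all arguments with commas: a b c -> a,b,c
# The subshell keeps the IFS change from leaking into the caller.
join_commas() {
    ( IFS=,; printf '%s\n' "$*" )
}

# Usage with the files from the question:
# left=$(join_commas 2016BZ017_*.cleaned.1.fastq)
# right=$(join_commas 2016BZ017_*.cleaned.2.fastq)
# Trinity --seqType fq --max_memory 50G --left "$left" --right "$right" --CPU 6
```

The shell expands each glob in sorted order, so both lists come out in the same lane order as long as every forward file has a matching reverse file.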


Not concatenating will save disk space, at least temporarily

If the OP has high coverage, one can use the in-silico read normalization parameter as well.

Reply written 21 months ago by st.ph.n

Latest versions of Trinity - starting from 2.3.2 - have digital normalization on by default.

Reply written 21 months ago by h.mon

Thanks, I was unaware.

Reply written 21 months ago by st.ph.n

21 months ago, carlopecoraro2 (Berlin) wrote:

Hi, I would suggest posting this question here: https://groups.google.com/forum/#!forum/trinityrnaseq-users The Trinity Google group is very active and you will always find @BrianHaas there.

Hope this helps.
