Assembling a Transcriptome using Trinity with multiple Illumina Fastq files
3.8 years ago

Hey y'all,

I am an undergraduate biology major and have just started exploring the field of bioinformatics. I have a few questions about how I should go about running Trinity when I have multiple FASTQ files from each sample. Should I concatenate all of the forward files together and all of the reverse read files together, so that I'm left with one forward and one reverse file? Or is it better to just pass them all to Trinity at once, as in the following command:

    Trinity --seqType fq --max_memory 50G --left 2016BZ017_S11_L001.cleaned.1.fastq,2016BZ017_S11_L002.cleaned.1.fastq,2016BZ017_S2_L001.cleaned.1.fastq,2016BZ017_S2_L002.cleaned.1.fastq --right 2016BZ017_S11_L001.cleaned.2.fastq,2016BZ017_S11_L002.cleaned.2.fastq,2016BZ017_S2_L001.cleaned.2.fastq,2016BZ017_S2_L002.cleaned.2.fastq --CPU 6

or to run it on the concatenated files:

    Trinity --seqType fq --max_memory 50G --left 2016BZ017_1_concatenated.fastq --right 2016BZ017_2_concatenated.fastq --CPU 6

My sample folder contains the following FASTQ files: 2016BZ017_S11_L001.cleaned.1.fastq, 2016BZ017_S11_L001.cleaned.2.fastq, 2016BZ017_S11_L002.cleaned.1.fastq, 2016BZ017_S11_L002.cleaned.2.fastq, 2016BZ017_S2_L001.cleaned.1.fastq, 2016BZ017_S2_L001.cleaned.2.fastq, 2016BZ017_S2_L002.cleaned.1.fastq, 2016BZ017_S2_L002.cleaned.2.fastq

Also: Would y'all suggest trimming the Illumina adapters using the Trimmomatic option within Trinity, or running Trimmomatic standalone?

Thank you all so much

RNA-Seq Trinity Illumina Transcriptome • 2.0k views
Should I concatenate all of the forward files together and all of the reverse read files together, so that I'm left with one forward and reverse?

I would recommend doing this. You want to create a single assembly across all samples. If you're doing DGE downstream, you can specify replicates later.
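For what it's worth, here is a minimal sketch of the concatenation step. It builds tiny stand-in files named after the ones in the question so that it runs anywhere; the toy reads and the temporary directory are assumptions for illustration only. The point to get right is that the forward and reverse files are concatenated in the same library order, so read N on the left still pairs with read N on the right.

```shell
set -eu

# Toy stand-ins for the real lane files (one read each), in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
for lib in S11_L001 S11_L002 S2_L001 S2_L002; do
  printf '@%s_read1/1\nACGT\n+\nIIII\n' "$lib" > "2016BZ017_${lib}.cleaned.1.fastq"
  printf '@%s_read1/2\nTGCA\n+\nIIII\n' "$lib" > "2016BZ017_${lib}.cleaned.2.fastq"
done

# Shell globs expand in sorted order, so both commands visit the
# libraries in the same sequence and the pairs stay aligned.
cat 2016BZ017_*.cleaned.1.fastq > 2016BZ017_1_concatenated.fastq
cat 2016BZ017_*.cleaned.2.fastq > 2016BZ017_2_concatenated.fastq

# Sanity check: a FASTQ record is 4 lines, so both files should
# report the same number of reads.
left_reads=$(( $(wc -l < 2016BZ017_1_concatenated.fastq) / 4 ))
right_reads=$(( $(wc -l < 2016BZ017_2_concatenated.fastq) / 4 ))
echo "left=$left_reads right=$right_reads"
```

On real data, drop the toy-file loop and run the two `cat` commands in the sample folder; if the read counts differ, something went wrong upstream.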

Would you suggest removing the header of the fastq files that are being concatenated?

No, this information is needed to pair and assemble the sequences.
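Since the headers carry the pairing information, a quick check that the forward and reverse files are still in sync can be sketched as below. The file names `left.fastq` and `right.fastq`, the toy reads, and the helper `read_names` are all hypothetical, and the sketch assumes read names end in `/1` and `/2` or carry the mate number after a space.

```shell
set -eu

# Toy paired files in a scratch directory, two reads each.
workdir=$(mktemp -d)
cd "$workdir"
printf '@read1/1\nACGT\n+\nIIII\n@read2/1\nGGCC\n+\nIIII\n' > left.fastq
printf '@read1/2\nTGCA\n+\nIIII\n@read2/2\nCCGG\n+\nIIII\n' > right.fastq

# Print the read name from every header line (lines 1, 5, 9, ...),
# keeping only the first field and dropping a trailing /1 or /2.
read_names() {
  awk 'NR % 4 == 1 { name = $1; sub(/\/[12]$/, "", name); print name }' "$1"
}

read_names left.fastq > left.names
read_names right.fastq > right.names

# Identical name lists mean the mates appear in the same order.
if cmp -s left.names right.names; then
  pairing="in sync"
else
  pairing="out of sync"
fi
echo "pairing: $pairing"
```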

3.8 years ago
h.mon 32k

Concatenating the files and passing them as --left 1.fq --right 2.fq, or passing comma-separated lists of several files, should not make a difference for the assembly: when you pass several files, as in your first command line, Trinity will convert them to FASTA and concatenate them prior to assembly. Not concatenating will save disk space, at least temporarily.
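If you go the list route, typing the comma-separated file names by hand is error-prone. A small sketch of building both lists from a glob follows; the empty placeholder files and the scratch directory are assumptions so that it runs without the real data, and the final line only prints the command from the question rather than running Trinity.

```shell
set -eu

# Empty placeholder files named like the ones in the question.
workdir=$(mktemp -d)
cd "$workdir"
for lib in S11_L001 S11_L002 S2_L001 S2_L002; do
  : > "2016BZ017_${lib}.cleaned.1.fastq"
  : > "2016BZ017_${lib}.cleaned.2.fastq"
done

# Expand each glob (sorted order), one name per line, then join with commas.
left=$(printf '%s\n' 2016BZ017_*.cleaned.1.fastq | paste -sd, -)
right=$(printf '%s\n' 2016BZ017_*.cleaned.2.fastq | paste -sd, -)

# Dry run: print the command instead of executing it.
echo Trinity --seqType fq --max_memory 50G --left "$left" --right "$right" --CPU 6
```

Because both globs sort the same way, the i-th entry of `--left` lines up with the i-th entry of `--right`.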

Not concatenating will save disk space, at least temporarily

If the OP has high coverage, one can use the in-silico read normalization parameter as well.

Recent versions of Trinity, starting from 2.3.2, have digital normalization on by default.

Thanks, I was unaware.

3.8 years ago
carlopecoraro2 ★ 2.0k

Hi, I would suggest posting this question here: https://groups.google.com/forum/#!forum/trinityrnaseq-users The Trinity Google forum is very active and you will always find @BrianHaas there.

Hope this helps.
