combine trinity outputs
3.3 years ago
jrenart47 • 0

Hi all, I had to split my fastq files (downloaded with fasterq-dump) because of RAM limitations on my system in order to run Trinity for RNA-seq analysis of a non-model organism (elephant shark). After running Trinity I have four outputs. To continue with differential gene expression analysis, I am not sure what to do: 1) run DGE on each of the outputs separately, or 2) concatenate them first. In either case it is not clear to me whether the final results would be correct. I would appreciate any advice on this issue.

RNA-Seq • 1.1k views

Did you try running Trinity with in silico normalization to 50x coverage using all reads? I think this would be the best solution if you don't have enough RAM.
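A sketch of what that could look like, assuming a recent Trinity release in which in silico normalization runs by default and `--normalize_max_read_cov` sets the target coverage (check `Trinity --help` for your version). The samples file name, memory and CPU values are placeholders. The command is only printed unless `RUN=1` is set:

```shell
# Assemble all reads at once, capping in silico normalization at 50x coverage.
# sample_file.config, 20G and 8 CPUs are placeholder values, not recommendations.
cmd="Trinity --seqType fq --samples_file sample_file.config --normalize_max_read_cov 50 --max_memory 20G --CPU 8"

echo "$cmd"
# Dry run by default; set RUN=1 to actually launch Trinity.
if [ "${RUN:-0}" = 1 ]; then
    $cmd
fi
```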


Thank you for the advice, Shelkmike! I'll try to run it this way. Jaime

3.3 years ago

All samples should be assembled together into a single transcriptome. My workflow:

1: Genome-guided assembly with Trinity

Trinity --seqType fq \
    --max_memory 20G \
    --samples_file sample_file.config \
    --genome_guided_bam ref_sorted.bam \
    --genome_guided_max_intron 10000 \
    --CPU 60

The assembled transcripts are written to: trinity_out_dir/Trinity-GG.fasta

sample_file.config is a tab-delimited text file. Suppose there are 4 samples with 3 replicates each; sample_file.config then looks like this:

sample1    sample1-rep1    sample1-rep1_1.fq    sample1-rep1_2.fq
sample1    sample1-rep2    sample1-rep2_1.fq    sample1-rep2_2.fq
sample1    sample1-rep3    sample1-rep3_1.fq    sample1-rep3_2.fq
sample2    sample2-rep1    sample2-rep1_1.fq    sample2-rep1_2.fq
sample2    sample2-rep2    sample2-rep2_1.fq    sample2-rep2_2.fq
sample2    sample2-rep3    sample2-rep3_1.fq    sample2-rep3_2.fq
sample3    sample3-rep1    sample3-rep1_1.fq    sample3-rep1_2.fq
sample3    sample3-rep2    sample3-rep2_1.fq    sample3-rep2_2.fq
sample3    sample3-rep3    sample3-rep3_1.fq    sample3-rep3_2.fq
sample4    sample4-rep1    sample4-rep1_1.fq    sample4-rep1_2.fq
sample4    sample4-rep2    sample4-rep2_1.fq    sample4-rep2_2.fq
sample4    sample4-rep3    sample4-rep3_1.fq    sample4-rep3_2.fq
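Typing this file by hand makes it easy to slip in spaces instead of tabs. A small sketch that generates the same 12 rows, assuming the sample and file naming pattern shown above (adjust the loops to your real names):

```shell
# Write a tab-delimited samples file with four columns:
# condition, replicate name, left reads, right reads.
out=sample_file.config
: > "$out"                      # create or truncate the file
for s in 1 2 3 4; do            # 4 samples (conditions)
    for r in 1 2 3; do          # 3 replicates per sample
        rep="sample${s}-rep${r}"
        printf '%s\t%s\t%s\t%s\n' "sample${s}" "$rep" "${rep}_1.fq" "${rep}_2.fq" >> "$out"
    done
done
```

Running `cat -A sample_file.config` (GNU coreutils) shows tabs as `^I`, which is a quick way to confirm the delimiters.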

Before running the Trinity assembly, align all the fq files to the elephant shark genome (HISAT2), then sort and merge (samtools) all the alignments into a coordinate-sorted "ref_sorted.bam".
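The alignment and merge step could be sketched as follows. The genome file name, index prefix, thread count and the two replicate names are placeholders, not values from this thread; each SAM is coordinate-sorted before merging because `samtools merge` expects sorted inputs. Commands are only printed unless `RUN=1` is set:

```shell
set -eu

# Print each command; execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

genome=genome.fa                    # placeholder: elephant shark genome FASTA
index=genome_index                  # placeholder: HISAT2 index prefix
threads=8                           # placeholder
reps="sample1-rep1 sample1-rep2"    # extend to all replicates

run hisat2-build "$genome" "$index"

bams=""
for rep in $reps; do
    # Align paired reads, then coordinate-sort the alignments.
    run hisat2 -p "$threads" -x "$index" -1 "${rep}_1.fq" -2 "${rep}_2.fq" -S "${rep}.sam"
    run samtools sort -@ "$threads" -o "${rep}.sorted.bam" "${rep}.sam"
    bams="${bams:+$bams }${rep}.sorted.bam"
done

# Merge the per-replicate BAMs into the file passed to --genome_guided_bam.
# $bams is left unquoted on purpose so it splits into one argument per file.
run samtools merge -@ "$threads" ref_sorted.bam $bams
run samtools index ref_sorted.bam
```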

2: Transcript Quantification with salmon

trinityrnaseq-v2.11.0/util/align_and_estimate_abundance.pl \
    --transcripts ./trinity_out_dir/Trinity-GG.fasta \
    --seqType fq \
    --samples_file sample_file.config \
    --output_dir salmon_transcript_quantification \
    --aln_method bowtie2 \
    --thread_count 60 \
    --est_method salmon \
    --trinity_mode --prep_reference
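Step 3 reads `salmon.gene.counts.matrix`, which the command above does not write by itself: Trinity's `abundance_estimates_to_matrix.pl` builds it from the per-replicate `quant.sf` files. A sketch, assuming that salmon output lands in one directory per replicate name (column 2 of the samples file) and that `--trinity_mode` left a gene-to-transcript map next to the assembly; both paths should be checked against your run. If no samples file is present, a two-line stand-in is written so the sketch can be dry-run. The command is only printed unless `RUN=1` is set:

```shell
set -eu

samples_file=sample_file.config

# Stand-in samples file for a dry run without real data.
if [ ! -f "$samples_file" ]; then
    printf 'sample1\tsample1-rep1\tsample1-rep1_1.fq\tsample1-rep1_2.fq\n' >  "$samples_file"
    printf 'sample2\tsample2-rep1\tsample2-rep1_1.fq\tsample2-rep1_2.fq\n' >> "$samples_file"
fi

# One quant.sf path per replicate (assumed layout: <replicate name>/quant.sf).
cut -f2 "$samples_file" | sed 's|$|/quant.sf|' > quant_files.list

# Write the matrices where the DE step expects to find them.
mkdir -p salmon_transcript_quantification

# Print the command; execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

run trinityrnaseq-v2.11.0/util/abundance_estimates_to_matrix.pl \
    --est_method salmon \
    --gene_trans_map trinity_out_dir/Trinity-GG.fasta.gene_trans_map \
    --name_sample_by_basedir \
    --out_prefix salmon_transcript_quantification/salmon \
    --quant_files quant_files.list
```

With `--name_sample_by_basedir`, the matrix columns take the replicate directory names, which is what lets them match column 2 of the samples file in the DE step.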

3: DE analysis with DESeq2

trinityrnaseq-v2.11.0/Analysis/DifferentialExpression/run_DE_analysis.pl \
    --matrix salmon_transcript_quantification/salmon.gene.counts.matrix \
    --method DESeq2 \
    --samples_file sample_file.config \
    --contrasts contrasts.file \
    --output Differential_Expression_Analysis

contrasts.file looks like this (assuming sample1 vs sample2 and sample3 vs sample4 for the DE analysis):

sample1    sample2
sample3    sample4

reference: https://github.com/trinityrnaseq/trinityrnaseq/wiki


Thank you wangmingcheng, I will try to run the hisat2 pipeline. I had used it before, but didn't think to do it with the elephant shark, because I didn't know I could make the gff file with it.


Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. If your code has long lines with a single command, break those lines into multiple lines with proper escape sequences so they're easier to read and still run when copy-pasted. I've done it for you this time.