Quantification after transcriptome assembly with Trinity
8 months ago
langziv ▴ 50

Hi.
During the Trinity run, an output directory named Trinity_outputs was created. After the run finished, it contained files with the following extensions: fastq.gz.P.qtrim.gz, fastq.gz.PwU.qtrim.fq, fastq.gz.U.qtrim.gz, fastq.gz.P.qtrim. Another output file that could be relevant is "Trinity_outputs/insilico_read_normalization/tmp_normalized_reads/left.fa".
The documentation mentions the output file "trinity_out_dir.Trinity.fasta", which I couldn't find.

If I understand correctly, I now need to use "align_and_estimate_abundance.pl" and then possibly "run_DE_analysis.pl" to quantify gene expression in the various RNA-seq datasets I have.
The documentation says that "align_and_estimate_abundance.pl" requires the Trinity.fasta file. Do any of the output files qualify as that file?

EDIT
Here's an error message from the Trinity run printout:

-preparing to extract selected reads from: /home/Ziv/Trinity_output/GMCF-1464-9S-04-13_S104_L002_R1_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9S-13-09_S100_L002_R1_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9S-13-11_S102_L002_R1_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9C-04-14_S105_L002_R1_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9C-13-10_S101_L002_R1_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9C-13-12_S103_L002_R1_001.fastq.gz.PwU.qtrim.fq ... done prepping, now search and capture.
-capturing normalized reads from: /home/Ziv/Trinity_output/GMCF-1464-9S-04-13_S104_L002_R1_001.fastq.gz.PwU.qtrim.fq
-capturing normalized reads from: /home/Ziv/Trinity_output/GMCF-1464-9S-13-09_S100_L002_R1_001.fastq.gz.PwU.qtrim.fq
-capturing normalized reads from: /home/Ziv/Trinity_output/GMCF-1464-9S-13-11_S102_L002_R1_001.fastq.gz.PwU.qtrim.fq
-capturing normalized reads from: /home/Ziv/Trinity_output/GMCF-1464-9C-04-14_S105_L002_R1_001.fastq.gz.PwU.qtrim.fq
-capturing normalized reads from: /home/Ziv/Trinity_output/GMCF-1464-9C-13-10_S101_L002_R1_001.fastq.gz.PwU.qtrim.fq
-capturing normalized reads from: /home/Ziv/Trinity_output/GMCF-1464-9C-13-12_S103_L002_R1_001.fastq.gz.PwU.qtrim.fq
Thread 3 terminated abnormally: Error, not all specified records have been retrieved (missing 168) from /home/Ziv/Trinity_output/GMCF-1464-9S-04-13_S104_L002_R1_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9S-13-09_S100_L002_R1_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9S-13-11_S102_L002_R1_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9C-04-14_S105_L002_R1_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9C-13-10_S101_L002_R1_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9C-13-12_S103_L002_R1_001.fastq.gz.PwU.qtrim.fq, see file: /home/Ziv/Trinity_output/insilico_read_normalization/GMCF-1464-9S-04-13_S104_L002_R1_001.fastq.gz.PwU.qtrim.fq_ext_all_reads.normalized_K25_maxC200_minC1_maxCV10000.fq.missing_accs for list of missing entries at /usr/local/bin/util/insilico_read_normalization.pl line 579.
Error encountered with thread.
-preparing to extract selected reads from: /home/Ziv/Trinity_output/GMCF-1464-9S-04-13_S104_L002_R2_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9S-13-09_S100_L002_R2_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9S-13-11_S102_L002_R2_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9C-04-14_S105_L002_R2_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9C-13-10_S101_L002_R2_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9C-13-12_S103_L002_R2_001.fastq.gz.PwU.qtrim.fq ... done prepping, now search and capture.
-capturing normalized reads from: /home/Ziv/Trinity_output/GMCF-1464-9S-04-13_S104_L002_R2_001.fastq.gz.PwU.qtrim.fq
-capturing normalized reads from: /home/Ziv/Trinity_output/GMCF-1464-9S-13-09_S100_L002_R2_001.fastq.gz.PwU.qtrim.fq
-capturing normalized reads from: /home/Ziv/Trinity_output/GMCF-1464-9S-13-11_S102_L002_R2_001.fastq.gz.PwU.qtrim.fq
-capturing normalized reads from: /home/Ziv/Trinity_output/GMCF-1464-9C-04-14_S105_L002_R2_001.fastq.gz.PwU.qtrim.fq
-capturing normalized reads from: /home/Ziv/Trinity_output/GMCF-1464-9C-13-10_S101_L002_R2_001.fastq.gz.PwU.qtrim.fq
-capturing normalized reads from: /home/Ziv/Trinity_output/GMCF-1464-9C-13-12_S103_L002_R2_001.fastq.gz.PwU.qtrim.fq
Thread 4 terminated abnormally: Error, not all specified records have been retrieved (missing 168) from /home/Ziv/Trinity_output/GMCF-1464-9S-04-13_S104_L002_R2_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9S-13-09_S100_L002_R2_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9S-13-11_S102_L002_R2_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9C-04-14_S105_L002_R2_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9C-13-10_S101_L002_R2_001.fastq.gz.PwU.qtrim.fq /home/Ziv/Trinity_output/GMCF-1464-9C-13-12_S103_L002_R2_001.fastq.gz.PwU.qtrim.fq, see file: /home/Ziv/Trinity_output/insilico_read_normalization/GMCF-1464-9S-04-13_S104_L002_R2_001.fastq.gz.PwU.qtrim.fq_ext_all_reads.normalized_K25_maxC200_minC1_maxCV10000.fq.missing_accs for list of missing entries at /usr/local/bin/util/insilico_read_normalization.pl line 579.
Error encountered with thread.
Error, at least one thread died at /usr/local/bin/util/insilico_read_normalization.pl line 434.
Error, cmd: /usr/local/bin/util/insilico_read_normalization.pl --seqType fq --JM 50G  --max_cov 200 --min_cov 1 --CPU 2 --output /home/Ziv/Trinity_output/insilico_read_normalization --max_CV 10000  --left GMCF-1464-9S-04-13_S104_L002_R1_001.fastq.gz.PwU.qtrim.fq,GMCF-1464-9S-13-09_S100_L002_R1_001.fastq.gz.PwU.qtrim.fq,GMCF-1464-9S-13-11_S102_L002_R1_001.fastq.gz.PwU.qtrim.fq,GMCF-1464-9C-04-14_S105_L002_R1_001.fastq.gz.PwU.qtrim.fq,GMCF-1464-9C-13-10_S101_L002_R1_001.fastq.gz.PwU.qtrim.fq,GMCF-1464-9C-13-12_S103_L002_R1_001.fastq.gz.PwU.qtrim.fq --right GMCF-1464-9S-04-13_S104_L002_R2_001.fastq.gz.PwU.qtrim.fq,GMCF-1464-9S-13-09_S100_L002_R2_001.fastq.gz.PwU.qtrim.fq,GMCF-1464-9S-13-11_S102_L002_R2_001.fastq.gz.PwU.qtrim.fq,GMCF-1464-9C-04-14_S105_L002_R2_001.fastq.gz.PwU.qtrim.fq,GMCF-1464-9C-13-10_S101_L002_R2_001.fastq.gz.PwU.qtrim.fq,GMCF-1464-9C-13-12_S103_L002_R2_001.fastq.gz.PwU.qtrim.fq --pairs_together  --PARALLEL_STATS   died with ret 7424 at /usr/local/bin/Trinity line 2919.
    main::process_cmd("/usr/local/bin/util/insilico_read_normalization.pl --seqType "...) called at /usr/local/bin/Trinity line 3472
    main::normalize("/home/Ziv/Trinity_output/insilico_read_normalization", 200, ARRAY(0x55cc898f2c60), ARRAY(0x55cc898f2c90)) called at /usr/local/bin/Trinity line 3412
    main::run_normalization(200, ARRAY(0x55cc898f2c60), ARRAY(0x55cc898f2c90)) called at /usr/local/bin/Trinity line 1450
De-novo-transcriptome-assembly Trinity RNA-seq-analysis • 1.2k views

Please do not delete posts once they have received at least one comment/answer.


Hi, this sounds like the run might have died. Did you collect the output from STDERR or STDOUT? Did the pipeline say the run completed successfully? You can capture both streams like this:

Trinity --seqType fq --max_memory 50G --left reads_1.fq.gz  --right reads_2.fq.gz --CPU 6 > trinity.out 2> trinity.err
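Once the two log files exist, a quick scan for failure phrases can tell you whether the run died. The sketch below is only an illustration: `check_trinity_log` is a hypothetical helper, not part of Trinity, and the phrases it searches for ("Error" at the start of a line, "terminated abnormally", "died with ret") are an assumption about what a failed run prints, so adjust them to your own log.

```shell
#!/bin/sh
# Hypothetical helper (not part of Trinity): scan a captured log for
# phrases that are assumed to signal a failed run and print a verdict.
check_trinity_log() {
    log_file="$1"
    # Assumed failure phrases; adjust them to match your own log.
    pattern='^Error|terminated abnormally|died with ret'
    if grep -q -E "$pattern" "$log_file"; then
        echo "FAILED"
        # Show the first few matching lines, truncated to stay readable.
        grep -n -E "$pattern" "$log_file" | cut -c 1-120 | head -n 5
    else
        echo "no error lines found"
    fi
}

# Usage (file names as in the command above):
#   check_trinity_log trinity.err
#   check_trinity_log trinity.out
```

A clean result here does not prove the assembly finished; it only means none of the assumed phrases appeared in the log.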

Cheers


Thanks.
I couldn't find a relevant log file, so I ran the command again with nohup.


I'll modify the question with error messages I got.


I've found a thread on the Trinity GitHub that hints it may have to do with the IDs of the reads (thread here: https://github.com/trinityrnaseq/trinityrnaseq/issues/481).

Would you be able to post just the first few reads of one of your files? (assuming they all have the same format)
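One way to pull out the first few reads is sketched below. `show_first_reads` is a hypothetical helper name, and it assumes the file is gzip-compressed; for a plain .fq file, `head -n 12 file.fq` would do the same for three reads.

```shell
#!/bin/sh
# Hypothetical helper: print the first N reads (4 lines each) of a gzipped
# FASTQ file without writing a decompressed copy to disk.
show_first_reads() {
    fastq_gz="$1"
    n_reads="${2:-3}"   # defaults to 3 reads when no count is given
    gzip -dc "$fastq_gz" | head -n "$((n_reads * 4))"
}

# Usage (placeholder file name):
#   show_first_reads FileR1.fq.gz 3
```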

@A00419:697:HN5NFDRX2:2:2101:1832:1000 1:N:0:GATTCGAG+ATACTCGG
CNCGCTGAAAGTGCTTTACAACCCGAAGGCCTTCTTCACACACGCGGCATGGCTGCATCAGGCTTGCGCCCATTGTGCAATATTCCCCACTGCTGCCTCCCGTAGGAGTCTGGACCGTGTCTCAGTTCCAGTGTGGCTGGTCATCCTCTCA
+
F#:FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F::FFFFF:FFFFFFFF
@A00419:697:HN5NFDRX2:2:2101:1226:1016 1:N:0:GATTCGAG+ATACTCGG
CTTTAACGTTCCTTCAGGAGACTTAAAGTCTCAGGGAGAACTCATCTCGGGGCAAGTTTCGTGCTTAGATGCTTTCAGCACTTATCTCTTCCGCATTTAGCTACCGGGCAGTGCCATTGGCATGACAACCCGAACACCAGTGATGCGTCCA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFF:FFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF:FFFFFF:FFFFFFF:F,FFFFFFF:FFFFFFFFF
@A00419:697:HN5NFDRX2:2:2101:1515:1016 1:N:0:GATTCGAG+ATACTCGG
GTAAAGAAGCCGATGGTGTAGTCAGTGCCTTTGATATCGTTGTCGATGGTGCCCGGCAGGCCGATGCATGGGAAGCCCATCTCGGTCAGGCGCATCGCCCCCAGATAGGAACCGTCACCGCCGATAACCACCAGCGCGTCCAGGCCGCGCG
+
,FFFFFFFFFFFFFF::FFFFF:FFFFF,F:FFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFF:FFFF:FF:F:FFFFFF,FFF:FF:FFFFFFFFFF,F:FF:FFFFF:FFFFFFFFFF:,F:FFF,:FFF:FFFFFFF:FF,F,

Would that suffice?


It should. You could try something like this:

For the R1 reads (only header lines, i.e. every 4th line starting at line 1, are rewritten):

zcat FileR1.fq.gz | awk 'NR % 4 == 1 { sub(/ .*/, "/1") } { print }' | gzip -c > FileR1.Renamed.fq.gz

And for the R2 reads:

zcat FileR2.fq.gz | awk 'NR % 4 == 1 { sub(/ .*/, "/2") } { print }' | gzip -c > FileR2.Renamed.fq.gz

With these modified reads it should run.
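Before rerunning, it may be worth confirming that the renamed files still pair up read for read. The following is a minimal sketch, assuming the headers already end in /1 and /2 as produced by the renaming step; `check_pairs` is a hypothetical helper, not a Trinity utility.

```shell
#!/bin/sh
# Hypothetical helper: check that two gzipped FASTQ files list the same
# read IDs in the same order once the /1 and /2 suffixes are removed.
check_pairs() {
    ids_r1=$(mktemp)
    ids_r2=$(mktemp)
    # Header lines are every 4th line starting at line 1.
    gzip -dc "$1" | awk 'NR % 4 == 1 { sub(/\/1$/, ""); print }' > "$ids_r1"
    gzip -dc "$2" | awk 'NR % 4 == 1 { sub(/\/2$/, ""); print }' > "$ids_r2"
    if cmp -s "$ids_r1" "$ids_r2"; then
        echo "paired"
    else
        echo "mismatch"
    fi
    rm -f "$ids_r1" "$ids_r2"
}

# Usage (file names from the commands above):
#   check_pairs FileR1.Renamed.fq.gz FileR2.Renamed.fq.gz
```

A "mismatch" result means the two files differ in read order, read count, or read names, any of which is worth resolving before another assembly attempt.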

Hopefully it helps!


Thanks. I'll try it.
Do these commands read a .gz file without decompressing it?


It didn't work. Seems like the same result as before. Thanks for trying (:
I'll try the Trinity forum.


Yes: zcat decompresses the .gz file on the fly and streams the result into the pipe, so the original file stays compressed and no decompressed copy is written to disk. The end goal of these commands is to go from this:

@A00419:697:HN5NFDRX2:2:2101:1832:1000 1:N:0:GATTCGAG+ATACTCGG
CNCGCTGAAAGTGCTTTACAACCCGAAGGCCTTCTTCACACACGCGGCATGGCTGCATCAGGCTTGCGCCCATTGTGCAATATTCCCCACTGCTGCCTCCCGTAGGAGTCTGGACCGTGTCTCAGTTCCAGTGTGGCTGGTCATCCTCTCA
+
F#:FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F::FFFFF:FFFFFFFF

to this

@A00419:697:HN5NFDRX2:2:2101:1832:1000/1
CNCGCTGAAAGTGCTTTACAACCCGAAGGCCTTCTTCACACACGCGGCATGGCTGCATCAGGCTTGCGCCCATTGTGCAATATTCCCCACTGCTGCCTCCCGTAGGAGTCTGGACCGTGTCTCAGTTCCAGTGTGGCTGGTCATCCTCTCA
+
F#:FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F::FFFFF:FFFFFFFF

It is just a matter of changing the read IDs by removing the text after the space and appending /1 or /2, depending on the read file.
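To see the effect before touching the real files, the renaming can be tried on a single record. This is a sketch only: `rename_headers` is a hypothetical wrapper name, and the sequence and quality strings in the dry run are shortened placeholders rather than the full read from this thread.

```shell
#!/bin/sh
# Hypothetical helper: rewrite FASTQ header lines read from stdin, dropping
# everything from the first space onward and appending /1 or /2.
rename_headers() {
    mate="$1"   # 1 for the R1 file, 2 for the R2 file
    # Only header lines (every 4th line, starting at line 1) are modified.
    awk -v suffix="/$mate" 'NR % 4 == 1 { sub(/ .*/, suffix) } { print }'
}

# Dry run on one header from this thread (sequence and quality shortened):
printf '%s\n' '@A00419:697:HN5NFDRX2:2:2101:1832:1000 1:N:0:GATTCGAG+ATACTCGG' \
    'CNCG' '+' 'F#:F' | rename_headers 1
# first output line: @A00419:697:HN5NFDRX2:2:2101:1832:1000/1
```

Restricting the edit to header lines by position, rather than by a leading "@", avoids touching quality lines, which can also begin with "@".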


And in your experience, is the format of my files incompatible with Trinity?
