Question

In silico read normalization with several pairs of reads

0

5.6 years ago

luzglongoria ▴ 50

Hi there, I am trying to do some normalization before starting the assembly. I am using in silico normalization. I have been looking at the manual and I have figured out the command for a single pair of read files:

/PATH/insilico_read_normalization.pl \
 --seqType fq --JM 1G --max_cov 50 \
 --left /PATH/s21_1.fq --right /PATH/s21_2.fq \
 --pairs_together --output /PATH/insil_norm_ex

This command works, but I have more data (s22_1.fq, s22_2.fq, s23_1.fq, s24_2.fq, etc.). I have seen that there are options for this kind of thing (--left_list or --right_list), but I don't understand very well what I have to do. Do I need to create a text file (.txt) with the names of all the _1.fq (left) files and another one with the names of all the _2.fq (right) files, and then give the path to each of those list files in the command?

RNA-Seq • 2.3k views

updated 5.6 years ago by h.mon 35k • written 5.6 years ago by luzglongoria ▴ 50

1

Help for the program says this:

--left  <string>    :left reads    if specifying multiple files, list them as comma-delimited. eg. leftA.fq,leftB.fq,...)

You may want to look at bbnorm.sh from BBMap as an option. There is a guide here.

5.6 years ago by GenoMax 141k

0

So, the command will be

/PATH/insilico_read_normalization.pl \
 --seqType fq --JM 1G --max_cov 50 \
 --left /PATH/s21_1.fq,s22_1.fq,s23_1.fq \
 --right /PATH/s21_2.fq,s22_2.fq,s23_2.fq \
 --pairs_together --output /PATH/insil_norm_ex

Right?

5.6 years ago by luzglongoria ▴ 50

score 1 · Answer 1 · 2018-09-13

In silico normalization has been performed by default since Trinity release v2.3.2 (Nov 20, 2016), so you don't need to run the insilico_read_normalization.pl script separately. To pass multiple input files to the Trinity assembler, you can either pass a comma-separated list of files:

Trinity --seqType fq \
 --left condA_1.fq.gz,condB_1.fq.gz,condC_1.fq.gz \
 --right condA_2.fq.gz,condB_2.fq.gz,condC_2.fq.gz
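With many samples, typing the comma-separated lists by hand gets error-prone. Below is a minimal sketch of building them from a directory listing. The helper name join_reads and the <sample>_1.fq / <sample>_2.fq naming scheme are assumptions for illustration, not something Trinity provides or requires.

```shell
# join_reads DIR SUFFIX (hypothetical helper):
# print every DIR/*SUFFIX path as a single comma-separated string.
# Shell globs expand in sorted order, so calling this once for the left
# suffix and once for the right suffix yields lists in the same sample
# order, provided every sample has both files.
join_reads() {
  printf '%s\n' "$1"/*"$2" | paste -sd, -
}

# Example use (paths are placeholders):
#   Trinity --seqType fq \
#     --left "$(join_reads /PATH _1.fq)" \
#     --right "$(join_reads /PATH _2.fq)"
```

Each entry in the resulting list carries its full path, which matters because the files are resolved individually rather than relative to the first entry.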

Or use the --samples_file parameter to point to a tab-delimited 'samples.txt' file that describes the data (columns: condition, replicate name, left reads, right reads):

cond_A    cond_A_rep1    A_rep1_left.fq    A_rep1_right.fq
cond_A    cond_A_rep2    A_rep2_left.fq    A_rep2_right.fq
cond_B    cond_B_rep1    B_rep1_left.fq    B_rep1_right.fq
cond_B    cond_B_rep2    B_rep2_left.fq    B_rep2_right.fq
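A file in this layout can also be generated rather than written by hand. The following is a rough sketch only: the function name make_samples_file is made up, it assumes files are named <sample>_1.fq / <sample>_2.fq as in the question, and it treats every sample as its own condition with a single replicate, which you would need to adjust to your real experimental design.

```shell
# make_samples_file DIR (hypothetical helper):
# print one tab-delimited line per read pair found in DIR, in the order
# condition, replicate name, left reads, right reads.
# Assumption: each sample is its own condition with one replicate (_rep1).
make_samples_file() {
  for left in "$1"/*_1.fq; do
    sample=$(basename "$left" _1.fq)
    right="$1/${sample}_2.fq"
    # Skip samples whose mate file is missing instead of writing a broken line.
    [ -f "$right" ] || continue
    printf '%s\t%s_rep1\t%s\t%s\n' "$sample" "$sample" "$left" "$right"
  done
}

# Example use (path is a placeholder):
#   make_samples_file /PATH > samples.txt
#   Trinity --seqType fq --samples_file samples.txt
```

Using printf with \t guarantees real tab characters, which is easy to get wrong when the file is typed in an editor that inserts spaces.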