Question: merging all coordinate-sorted bam files for Trinity genome-guided approach?
Farbod3.3k wrote:

Dear friends, hi. (I'm not a native English speaker, so please excuse any language flaws.)

I have aligned my six paired-end RNA-seq samples (left and right fastq files; 2 treatments with 3 biological replicates each, from a non-model fish) with STAR to create the coordinate-sorted BAM files that Trinity needs for its genome-guided approach.

But now I have six coordinate-sorted BAM files, and the Trinity command takes just one:

Trinity --genome_guided_bam rnaseq.coordSorted.bam \
        --genome_guided_max_intron 10000 \
        --max_memory 10G --CPU 10

What should I do now? Must I run the above command six times, or merge all six coordinate-sorted BAM files into a single coordinate-sorted BAM file?

These are the commands I used for STAR indexing and aligning, in case they are needed:

STAR --runMode genomeGenerate --runThreadN 20 --genomeDir '/home/Zebrafish-genome-index-STAR' --genomeFastaFiles '/home/Zebrafish-genome-index-STAR/GCF_000002035.5_GRCz10_genomic.fasta'

Then I ran the command below six times, once for each of my six fastq sets:

STAR --genomeDir '/home/Zebrafish-genome-index-STAR' --runThreadN 24 --readFilesIn '/home/F1left.fastq' '/home/F1right.fastq' --outFileNamePrefix F1_Zebra --outSAMtype BAM SortedByCoordinate
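
The six alignment runs could also be scripted as a loop. This is only a sketch: the sample prefixes F2 to F6 and their fastq names are assumptions (only F1 appears above), and the `run` helper and `DRY_RUN` switch are hypothetical conveniences that print each command instead of executing it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# DRY_RUN=1 (the default here) prints each command; set DRY_RUN=0 to really run STAR.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

align_all() {
    # Sample names F1..F6 are hypothetical; only F1 is shown in the post.
    local s
    for s in F1 F2 F3 F4 F5 F6; do
        run STAR --genomeDir /home/Zebrafish-genome-index-STAR \
            --runThreadN 24 \
            --readFilesIn "/home/${s}left.fastq" "/home/${s}right.fastq" \
            --outFileNamePrefix "${s}_Zebra" \
            --outSAMtype BAM SortedByCoordinate
    done
}

align_all
```

With the prefix `F1_Zebra`, STAR writes the sorted alignments to `F1_ZebraAligned.sortedByCoord.out.bam`, and likewise for the other prefixes.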

modified 2.8 years ago by Biogeek (350) • written 2.8 years ago by Farbod (3.3k)

Since you can specify only one bam as a genome guide I suppose you will have to merge all six.

written 2.8 years ago by genomax (68k)

Dear genomax2, Hi and thank you.

Should I merge them with the Linux "cat" command, or is a dedicated tool or program needed?

written 2.8 years ago by Farbod (3.3k)

samtools merge would be the way to do it. Re-sort after merging. It does not look like you need to keep sample identification lines.

written 2.8 years ago by genomax (68k)

For "Re-sort after merging", do I need other tools?

About "to keep sample identification lines", I really don't know yet!

Thanks

written 2.8 years ago by Farbod (3.3k)

samtools sort -o merged_sorted.bam merged.bam will do it.
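
Put together, the merge, re-sort, and Trinity steps from this thread might look like the sketch below. The six input file names are assumptions built from the `F1_Zebra` prefix in the question, and the `run` helper with its `DRY_RUN` switch is a hypothetical wrapper that prints the commands by default. As far as I know, `samtools merge` expects inputs that share a sort order and writes sorted output, so the re-sort is probably a safeguard rather than a strict requirement.

```shell
#!/usr/bin/env bash
set -euo pipefail

# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical inputs: STAR appends "Aligned.sortedByCoord.out.bam" to the
# --outFileNamePrefix value; the prefixes F1..F6 are assumed, not from the thread.
bams=()
for i in 1 2 3 4 5 6; do
    bams+=("F${i}_ZebraAligned.sortedByCoord.out.bam")
done

merge_and_assemble() {
    # Merge the six coordinate-sorted BAM files into one.
    run samtools merge merged.bam "${bams[@]}"
    # Re-sort by coordinate, as suggested in the thread.
    run samtools sort -o merged_sorted.bam merged.bam
    # Give Trinity the single BAM, with the options from the question.
    run Trinity --genome_guided_bam merged_sorted.bam \
        --genome_guided_max_intron 10000 \
        --max_memory 10G --CPU 10
}

merge_and_assemble
```

Plain `cat` is not suitable here, because each BAM file carries its own header and compressed blocks; a BAM-aware tool is needed to combine them.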

written 2.8 years ago by genomax (68k)
Biogeek350 wrote:

In addition to what genomax2 said: sambamba may offer a much faster way to merge coordinate-sorted files. I used it instead of samtools and it was very fast.
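
A rough sketch of the equivalent sambamba call, for comparison. The file names and thread count are assumptions, and the `run` helper with `DRY_RUN` is a hypothetical wrapper that only prints the command by default; `sambamba merge` takes the output file first, then the inputs, with `-t` setting the number of threads.

```shell
#!/usr/bin/env bash
set -euo pipefail

# DRY_RUN=1 (the default here) only prints the command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Usage: sambamba_merge THREADS OUTPUT.bam INPUT1.bam INPUT2.bam [...]
sambamba_merge() {
    local threads="$1" out="$2"
    shift 2
    run sambamba merge -t "$threads" "$out" "$@"
}

# Hypothetical inputs following STAR's naming for prefixes F1..F6.
inputs=()
for i in 1 2 3 4 5 6; do
    inputs+=("F${i}_ZebraAligned.sortedByCoord.out.bam")
done

sambamba_merge 10 merged.bam "${inputs[@]}"
```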

http://lomereiter.github.io/sambamba/

written 2.8 years ago by Biogeek (350)

Thank you dear Biogeek, I have used "samtools" and it was not very slow.

written 2.8 years ago by Farbod (3.3k)