Question: Format files in Trinity
5 months ago, luzglongoria wrote:

Hi there,

I am working with parasites, so my reads contain both parasite and host sequences. To keep only the reads that come from the parasite, I ran a selective capture and a selective depletion with bowtie2 (I have the genome of a closely related parasite and the host genome):

Command for selective capture:

bowtie2 --threads 4 --local --no-unal \
-x /home/luz_garcia_longoria/workspace/reference_genomes/parasitereference.fasta \
-q -k 1 --al aligned_reads.fastq \
-1 /home/luz_garcia_longoria/workspace/s21_1.fq,s22_1.fq,s23_1.fq,s24_1.fq,s25_1.fq,s31_1.fq,s32_1.fq,s33_1.fq,s34_1.fq,s35_1.fq \
-2 /home/luz_garcia_longoria/workspace/s21_2.fq,s22_2.fq,s23_2.fq,s24_2.fq,s25_2.fq,s31_2.fq,s32_2.fq,s33_2.fq,s34_2.fq,s35_2.fq | samtools view -b -o aligned_parasite.bam

Command for selective depletion:

bowtie2 --threads 4 --local --no-unal \
-x /home/luz_garcia_longoria/workspace/reference_genomes/parasite_host_reference.fasta \
-q -k 1 --un no_aligned_reads.fastq \
-1 /home/luz_garcia_longoria/workspace/s21_1.fq,s22_1.fq,s23_1.fq,s24_1.fq,s25_1.fq,s31_1.fq,s32_1.fq,s33_1.fq,s34_1.fq,s35_1.fq \
-2 /home/luz_garcia_longoria/workspace/s21_2.fq,s22_2.fq,s23_2.fq,s24_2.fq,s25_2.fq,s31_2.fq,s32_2.fq,s33_2.fq,s34_2.fq,s35_2.fq | samtools view -b -o no_aligned_host_parasite.bam
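
A note on the file lists in both commands: only the first file in each comma-separated list carries the full path, so the remaining names are resolved relative to the current directory. A minimal sketch for building the lists with a full path on every entry is shown below. It assumes all FASTQ files sit in the same workspace directory, which the post does not state explicitly; the variable names are illustrative.

```shell
# Assumption: every sample's FASTQ files live in this one directory.
dir=/home/luz_garcia_longoria/workspace
samples="s21 s22 s23 s24 s25 s31 s32 s33 s34 s35"

left=""
right=""
for s in $samples; do
    # ${var:+$var,} adds a comma only when the list is already non-empty.
    left="${left:+$left,}$dir/${s}_1.fq"
    right="${right:+$right,}$dir/${s}_2.fq"
done

# These strings can be passed to bowtie2 as: -1 "$left" -2 "$right"
echo "-1 $left"
echo "-2 $right"
```
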

Now I have two BAM files: one with the reads captured against the related parasite genome, and one with the reads that (I guess) come specifically from my parasite species.

My next step is a de novo assembly with Trinity. The problem is that I am not sure whether I can pass both files to Trinity in one command. I know I could merge the two files and use the result, but I am not sure this is correct. While searching, I found a page that briefly explains how to run Trinity with BAM files. This is the command it suggests:

$TRINITY_HOME/Trinity --genome_guided_bam alignments.hisat2.bam \
   --CPU 2 --max_memory 1G --genome_guided_max_intron 5000

So, my questions are:

Is it fine to use these two BAM files (combined into one) in this case? How do I choose the value for the '--genome_guided_max_intron' option? And would it be OK to convert my BAM file to FASTQ and then run Trinity on that, or would that be a very stupid thing to do?
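
For reference, the FASTQ route from the last question could look roughly like the sketch below: name-sort the BAM so mates are adjacent, split it back into paired FASTQ with samtools fastq, and hand the pairs to Trinity in its ordinary de novo mode (not the genome-guided mode). The input name is taken from the capture command above; the output prefix, CPU count, and memory limit are placeholder assumptions, and the `run` helper is hypothetical. It only prints the commands unless DRY_RUN=0 is set.

```shell
# Hypothetical helper: print each command by default, execute it when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

bam=aligned_parasite.bam   # BAM from the capture step (or a merged BAM)
prefix=parasite            # assumed output prefix

# 1. Sort by read name so that both mates of a pair sit next to each other.
run samtools sort -n -o "${prefix}.namesorted.bam" "$bam"

# 2. Write mate 1 and mate 2 to separate files; discard singletons and other reads.
run samtools fastq -1 "${prefix}_1.fq" -2 "${prefix}_2.fq" \
    -0 /dev/null -s /dev/null "${prefix}.namesorted.bam"

# 3. De novo assembly from the paired FASTQ files (resource values are placeholders).
run Trinity --seqType fq --left "${prefix}_1.fq" --right "${prefix}_2.fq" \
    --CPU 4 --max_memory 10G --output trinity_out
```
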

Thank you very much in advance.

Tags: rna-seq format files trinity