Question: How to make a combined reference index for Bowtie2 alignment
2.2 years ago, bandanaschapagain (30) wrote:

Hi all, I am using Bowtie2 to align my reads to their respective references. I have three reference genomes, and so far I have used the -f option to supply the three references separately. Instead of giving three files, I want to concatenate all three reference genomes and build a single index. I used cat to concatenate the files. Is this the right way to combine the reference genomes into one? Will this work for Bowtie2 alignment?

Thank you

rna-seq assembly • 3.0k views
modified 19 months ago by Arindam Ghosh (160) • written 2.2 years ago by bandanaschapagain (30)

From the Bowtie2 manual:

<reference_in> A comma-separated list of FASTA files containing the reference sequences to be aligned to, [...]

edit: but be sure there are no duplicated contig / scaffold / chromosome names.
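One way to check for duplicated names before building the index is sketched below. It is a minimal sketch, not part of Bowtie2: `fasta_dup_names` is a hypothetical helper, and it assumes a sequence name is the header text up to the first whitespace.

```shell
# Print every sequence name that occurs more than once across the given FASTA files.
# Assumption: the name is the header text after '>' up to the first whitespace.
fasta_dup_names() {
    grep -h '^>' "$@" | sed 's/^>//' | awk '{print $1}' | sort | uniq -d
}

# Example call (hypothetical file names, left commented out):
# fasta_dup_names ReferenceGenome1.fa ReferenceGenome2.fa ReferenceGenome3.fa
```

If this prints nothing, the names are unique, and the files can be passed as the comma-separated list the manual describes, e.g. `bowtie2-build ReferenceGenome1.fa,ReferenceGenome2.fa,ReferenceGenome3.fa CombinedRef`.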

modified 2.2 years ago • written 2.2 years ago by h.mon (26k)

Can you give me an idea of how to concatenate the FASTA files so that I can align the reads to the reference genomes?

written 2.2 years ago by bandanaschapagain (30)

What about reading the manual? It is long, but it will pay off:

The original sequence FASTA files are no longer used by Bowtie 2 once the index is built.

edit: or do you mean concatenating the input files which will be mapped? Tip: search for -1 and -2.

modified 2.2 years ago • written 2.2 years ago by h.mon (26k)

I read the manual, but I am not sure whether I can concatenate the three reference files and then use the result as one file to build the index. Can you please give me some information on that?

written 2.2 years ago by bandanaschapagain (30)

That is a different question, then; it is not "how do I do this".

You want to know whether it is sensible to concatenate multiple reference genomes for mapping with Bowtie2. To answer that, we need to know why you want to use three genomes as reference, how close or distant these genomes are, whether you expect all three genomes to be present in your samples, etc.

written 2.2 years ago by h.mon (26k)

This is a good point. With combined references you risk reads from one sample (species?) mapping to a different reference. In theory you could use strict mapping, throw out multimappers, and keep only the alignments that correspond to the proper reference. I'm not sure why you'd want to unless the samples are pooled or something. But if this is just a way to avoid running several alignments, I wouldn't recommend it.
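The filtering idea above could look roughly like the sketch below, which works on plain SAM text with awk. Several things here are assumptions rather than anything stated in this thread: `filter_sam_by_ref` is a hypothetical helper, each genome's sequence names are assumed to share a distinguishing prefix (such as `g1_`), and a MAPQ cutoff is used only as a rough proxy for "not a multimapper".

```shell
# Keep SAM header lines plus alignments that
#   (a) have MAPQ (column 5) at or above MIN_MAPQ, and
#   (b) hit a reference (RNAME, column 3) whose name starts with PREFIX.
# Usage: filter_sam_by_ref PREFIX MIN_MAPQ < in.sam > out.sam
filter_sam_by_ref() {
    awk -F'\t' -v prefix="$1" -v minq="$2" '
        /^@/ { print; next }    # header lines pass through unchanged
        $5 + 0 >= minq + 0 && index($3, prefix) == 1 { print }
    '
}

# Example call (hypothetical prefix and cutoff, left commented out):
# filter_sam_by_ref g1_ 10 < results.sam > results.g1.sam
```

Note that the header still lists every reference sequence, and the right cutoff depends on the data; the value 10 in the example is arbitrary.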

written 2.2 years ago by Jake Warner (730)

Just out of curiosity: why do you need to combine multiple references?

written 19 months ago by Arindam Ghosh (160)
2.2 years ago, Jake Warner (730) wrote:

You're on the right track:

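# Concatenate the three reference FASTA files into one
cat ReferenceGenome1.fa ReferenceGenome2.fa ReferenceGenome3.fa > CombinedReference.fa
# Build a single Bowtie2 index (-f: the input is FASTA)
bowtie2-build -f CombinedReference.fa CombinedRef
# Align the paired-end reads against the combined index
bowtie2 -x CombinedRef -1 XXX_fastq.R1 -2 XXX_fastq.R2 -S results.sam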
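If the three genomes might share sequence names (for example, each has a `chr1`), one option is to tag the names per genome while concatenating. This is a minimal sketch under that assumption; `tag_fasta` and the labels `g1`, `g2`, `g3` are hypothetical, and the labels are assumed to contain only letters, digits, and underscores.

```shell
# Write a FASTA file to stdout with LABEL_ prepended to every sequence name.
# Usage: tag_fasta LABEL file.fa
tag_fasta() {
    sed "s/^>/>${1}_/" "$2"
}

# Example (hypothetical labels, left commented out):
# { tag_fasta g1 ReferenceGenome1.fa
#   tag_fasta g2 ReferenceGenome2.fa
#   tag_fasta g3 ReferenceGenome3.fa; } > CombinedReference.fa
```

With tagged names, the reference column of the resulting SAM file shows which genome each read aligned to.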
written 2.2 years ago by Jake Warner (730)

Thank you, Jacob, it helped me a lot.

written 2.2 years ago by bandanaschapagain (30)