To make the reference index for bowtie alignment.
7.1 years ago

Hi all, I am using Bowtie2 to align my reads to their respective references. I have three reference genomes, and so far I have used Bowtie2's -f option to supply the three references. Instead of giving three files, I would like to concatenate the three reference genomes and build a single index. I used cat to concatenate the files. Is this the right way to combine the reference genomes into one? Does this work for Bowtie2 alignment?

Thank you

Assembly rna-seq • 6.8k views

From the Bowtie2 manual:

<reference_in> A comma-separated list of FASTA files containing the reference sequences to be aligned to, [...]

edit: but be sure there are no duplicated contig / scaffold / chromosome names.
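
A quick way to check for duplicated names before building the index might look like the sketch below. It assumes names are compared up to the first whitespace in each header line, and the function name find_duplicate_names is made up for illustration:

```shell
# Print sequence names that occur more than once across the given FASTA files.
# The name is taken as the text after '>' up to the first whitespace (assumption).
# No output means no duplicated names were found.
find_duplicate_names() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$@" | sort | uniq -d
}

# Hypothetical usage with the file names used elsewhere in this thread:
#   find_duplicate_names ReferenceGenome1.fa ReferenceGenome2.fa ReferenceGenome3.fa
# If nothing is printed, the comma-separated form from the manual can be used
# without concatenating anything:
#   bowtie2-build ReferenceGenome1.fa,ReferenceGenome2.fa,ReferenceGenome3.fa CombinedRef
```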


Can you give me an idea of how to concatenate the FASTA files so that I can align the reads to the reference genomes?


What about reading the manual? It is long, but it will pay off:

The original sequence FASTA files are no longer used by Bowtie 2 once the index is built.

edit: or do you mean concatenating the input read files that will be mapped? Tip: search the manual for -1 and -2.


I read the manual, but I am not sure whether I can concatenate the three reference files and then use the result as one file to build the index. Can you please give me some information on that?


That is a different question, then. It is not "how do I do this?".

You want to know whether it is sensible to concatenate multiple reference genomes for mapping with Bowtie2. To answer that, we need to know why you want to use three genomes as the reference, how close or distant these genomes are, whether you expect all three genomes to be present in your samples, etc.


This is a good point. With combined references you risk reads from one sample (species?) mapping to a different reference. In theory you could use strict mapping settings, throw out multimappers, and keep only the alignments that correspond to the proper reference. I'm not sure why you'd want to, unless the samples are pooled or something. But if this is just a way to avoid running several alignments, I wouldn't recommend it.
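
For what it's worth, the filtering idea could be sketched roughly as follows. This is only an illustration under a few assumptions: the sequence names of each genome share a recognizable prefix (the g1_ here is hypothetical), a minimum MAPQ is used as a rough stand-in for discarding multimappers, and the cut-off of 20 is arbitrary. The function name is made up too.

```shell
# Keep SAM header lines plus alignments whose reference name (column 3, RNAME)
# starts with the given prefix and whose mapping quality (column 5, MAPQ)
# is at least the given minimum.
# Usage: filter_sam_by_reference PREFIX MIN_MAPQ FILE.sam
filter_sam_by_reference() {
    awk -F '\t' -v prefix="$1" -v min_mapq="$2" '
        /^@/ { print; next }                                   # header lines pass through
        index($3, prefix) == 1 && $5 + 0 >= min_mapq + 0 { print }
    ' "$3"
}

# Hypothetical usage:
#   filter_sam_by_reference g1_ 20 results.sam > results.g1.sam
```

Note that a MAPQ threshold is a blunt tool: reads that map equally well to two closely related genomes are dropped entirely, so how much data survives depends on how similar the references are.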


Just out of curiosity: why do you need to combine multiple references?

7.1 years ago
Jake Warner ▴ 830

You're on the right track:

cat ReferenceGenome1.fa ReferenceGenome2.fa ReferenceGenome3.fa > CombinedReference.fa
bowtie2-build -f CombinedReference.fa CombinedRef
bowtie2 -x CombinedRef -1 XXX_fastq.R1 -2 XXX_fastq.R2 -S results.sam
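
One optional refinement, given the warning in the comments about duplicated sequence names: label each genome's headers before concatenating, so every alignment can be traced back to its genome. A minimal sketch, where the labels g1, g2, g3 and the function name prefix_fasta_names are hypothetical:

```shell
# Write a FASTA file to stdout with every header prefixed by a label,
# e.g. ">chr1 some description" becomes ">g1_chr1 some description".
# Usage: prefix_fasta_names LABEL FILE.fa
prefix_fasta_names() {
    awk -v label="$1" '
        /^>/ { print ">" label "_" substr($0, 2); next }   # rewrite header lines
        { print }                                          # sequence lines unchanged
    ' "$2"
}

# Hypothetical usage, replacing the plain cat step:
#   { prefix_fasta_names g1 ReferenceGenome1.fa
#     prefix_fasta_names g2 ReferenceGenome2.fa
#     prefix_fasta_names g3 ReferenceGenome3.fa; } > CombinedReference.fa
```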

Thank you, Jacob, it helped me a lot.
