Question: Align reads on multi-fasta (contigs) file?
ThePresident wrote:

I have paired-end fastq files, but my reference genome is a list of contigs, so the reference fasta file looks something like this:

>contig1
AGTGCAGAC.....
>contig2
GCGATCACA......
>contig3
....

Is there a way to instruct bwa to perform the alignment on 'contig1', then move on to 'contig2', and so on? Concatenating the contigs into one single fasta file is not an option, as I'll have random pairing of contigs, and paired-end reads might be mapped to different contigs, producing all sorts of funny stuff.

I tried looking for this in previous posts but couldn't find anything similar.

TP

bwa • 177 views
written 10 weeks ago by ThePresident

Concatenating contigs into one single fasta file is not an option as I'll have random pairing of contigs

What do you mean by "random pairing of contigs"?

BWA works on multi-contig fasta files; in fact, most reference genomes are multi-contig fasta files. I don't understand what the problem is, or maybe I don't understand what you want to do.
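As a rough illustration of the point above, here is a minimal Python sketch that indexes a multi-contig fasta once and then aligns the paired-end reads against all contigs in a single run. It only wraps the standard `bwa index` and `bwa mem` subcommands; the function names and the file names in the usage note are hypothetical placeholders, and it assumes `bwa` is installed and on your PATH.

```python
import subprocess


def bwa_commands(reference, reads_1, reads_2):
    """Build the argv lists for indexing a multi-contig fasta and aligning a read pair."""
    # One index covers every contig in the fasta file.
    index_cmd = ["bwa", "index", reference]
    # bwa mem takes the indexed reference followed by the two paired-end fastq files.
    mem_cmd = ["bwa", "mem", reference, reads_1, reads_2]
    return index_cmd, mem_cmd


def align_pairs(reference, reads_1, reads_2, out_sam):
    """Run both commands; bwa mem writes SAM to stdout, so redirect it to a file."""
    index_cmd, mem_cmd = bwa_commands(reference, reads_1, reads_2)
    subprocess.run(index_cmd, check=True)
    with open(out_sam, "w") as sam:
        subprocess.run(mem_cmd, stdout=sam, check=True)
```

A call would look something like `align_pairs("contigs.fa", "reads_1.fq", "reads_2.fq", "aln.sam")`, with your own file names substituted.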

written 10 weeks ago by h.mon

What I meant by "random pairing of contigs" is that the contigs probably won't be in the correct order, and some of them might be reversed, compared to the reference sequence, i.e. the actual genome. When paired-end reads are aligned on these incorrectly joined contigs, they might produce discordant pairs (outward-oriented reads and such). Hence the idea of aligning independently on each contig instead of on concatenated contigs.

I didn't know that bwa could work on multi-fasta files. Does indexing work the same way?

modified 10 weeks ago • written 10 weeks ago by ThePresident

When you concatenate several fasta sequences into a single multi-fasta file, all contigs remain separate from each other. If the two reads of a pair map to different contigs, the result is a discordant pair regardless of contig orientation. This is a by-product of incomplete assemblies, and there is nothing you can do about it except get a better assembly somehow.
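To see how many pairs are affected, one option is to count them from the SAM output. The sketch below is only an illustration: it reads the standard SAM columns (FLAG in column 2, RNAME in column 3, RNEXT in column 7) and tallies, for the first read of each pair, whether the mate landed on the same contig or a different one. The function name and category labels are made up for this example.

```python
from collections import Counter


def classify_pairs(sam_lines):
    """Tally read pairs by whether both mates map to the same contig."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        rname, rnext = fields[2], fields[6]
        # Count each pair once: paired reads (0x1), first in pair (0x40).
        if not flag & 0x1 or not flag & 0x40:
            continue
        # Skip secondary (0x100) and supplementary (0x800) alignments.
        if flag & (0x100 | 0x800):
            continue
        if flag & 0x4 or flag & 0x8:
            counts["read_or_mate_unmapped"] += 1
        elif rnext == "=" or rnext == rname:
            counts["same_contig"] += 1
        else:
            counts["different_contigs"] += 1
    return counts
```

With a real alignment you could pass an open file handle, e.g. `classify_pairs(open("aln.sam"))`, where `aln.sam` is a placeholder name. A large `different_contigs` count mostly reflects how fragmented the assembly is, not a problem with the aligner.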

written 10 weeks ago by h.mon