Human+Viral Genomes In A Single Index (BWA Or Other Aligners)
10.9 years ago
Rm 8.2k

I am interested in identifying potential viral causes in tumor samples.

I need comments/suggestions on the advantages and disadvantages of building a combined human + viral genome index (full genomes are available at NCBI) with bwa or another aligner, and mapping the paired-end reads against it to look for reads that map to viral genomes.

I would also like to know how aligners might behave with such an index.

Thanks in advance

human genome index bwa bowtie • 2.8k views
10.9 years ago
Gww ★ 2.7k

If you are combining thousands of viral genomes into the same index, you may run into performance issues due to the large number of similar sequences (e.g. all of the Papillomaviridae). This may also cause a lot of problems with alignments to repeated sequences.

You may have better luck mapping the reads to the human genome first, then aligning the remaining (unmapped) reads to the viral genomes using megablast or a similar tool.
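As a rough sketch of the filtering step between the two stages, the snippet below scans SAM text from the human alignment and collects the names of reads where at least one mate is unmapped. It relies only on the standard SAM FLAG bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary); the function name `reads_to_rescreen` and the idea of keeping whole pairs are my own assumptions, not something a particular aligner requires.

```python
def reads_to_rescreen(sam_lines):
    """Return sorted names of reads with at least one unmapped primary record.

    Hypothetical helper: sam_lines is any iterable of SAM text lines,
    e.g. an open file produced by a short-read aligner run against human.
    """
    unmapped_names = set()
    for line in sam_lines:
        # Skip blank lines and header lines (@HD, @SQ, ...).
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        # Ignore secondary (0x100) and supplementary (0x800) records.
        if flag & 0x100 or flag & 0x800:
            continue
        # 0x4 means this segment is unmapped.
        if flag & 0x4:
            unmapped_names.add(name)
    return sorted(unmapped_names)
```

Keeping the whole pair whenever one mate is unmapped is deliberate here: it retains pairs where only one end looks human, which matters for the integration case.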

This strategy can still run into problems, for example with read pairs derived from an integrated viral genome, where one read aligns to the human genome and the other to a viral sequence. You may have to write a custom solution to resolve these kinds of alignments.
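One possible starting point for such a custom solution is sketched below. It assumes the pairs were aligned against references that include both human and viral sequences, and that you can supply the set of viral reference names yourself (`viral_refs` and the name `HPV16` in the usage are hypothetical). Each pair is labelled by where its two mates landed; "chimeric" pairs are the candidates for integration sites.

```python
from collections import defaultdict

def classify_pairs(sam_lines, viral_refs):
    """Label each read pair as human, viral, chimeric, unmapped or partial.

    Hypothetical helper: viral_refs is a set of reference names that the
    caller treats as viral; every other reference is assumed to be human.
    """
    mates = defaultdict(dict)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, rname = fields[0], int(fields[1]), fields[2]
        # Skip secondary (0x100) and supplementary (0x800) records.
        if flag & 0x900:
            continue
        mate = 1 if flag & 0x40 else 2
        # An unmapped mate may carry its partner's RNAME, so trust the flag.
        mates[name][mate] = None if flag & 0x4 else rname

    labels = {}
    for name, refs in mates.items():
        origins = set()
        for ref in refs.values():
            if ref is None:
                origins.add("unmapped")
            elif ref in viral_refs:
                origins.add("viral")
            else:
                origins.add("human")
        if origins == {"human", "viral"}:
            labels[name] = "chimeric"   # one mate human, one mate viral
        elif len(origins) == 1:
            labels[name] = origins.pop()
        else:
            labels[name] = "partial"    # one mate mapped, the other not
    return labels
```

Pairs labelled "partial" may also be worth a second look, since the unmapped mate could span a human/viral junction.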


Running megablast on more than 100M reads generated terabytes of data and took forever.


@RM: That is why I suggest pre-aligning them with your favorite short-read aligner first, then aligning the leftover reads with megablast or something else that is efficient at searching a large database.
