Question

Detection of viral nucleic acids in metagenomic aDNA samples

1

9.3 years ago

stu111392 ▴ 30

Hello there,

at my institute we are now aiming to detect viral nucleic acids in metagenomic aDNA samples. As someone who is totally new to bioinformatics, at the moment I am just looking for programs that seem useful. The main idea for the identification is to take all the reads from the sequencer and align them against a database with a tool like DIAMOND or BLAST. We may do some de novo assembly with MetaVelvet beforehand. Has anyone here done this, or is anyone working with metagenomic samples on a similar task? What programs are you using, and is there a better approach?

With kind regards,

Julian

alignment sequence blast Assembly • 2.1k views

ADD COMMENT • link updated 2.1 years ago by Ram 43k • written 9.3 years ago by stu111392 ▴ 30

0

This paper may help you: Metagenomic Detection of Viruses in Aerosol Samples from Workers in Animal Slaughterhouses

And

Metagenomics for pathogen detection in public health

ADD REPLY • link updated 2.1 years ago by Ram 43k • written 9.3 years ago by Medhat 9.7k

0

Thank you for that. I'll have a look :)

ADD REPLY • link 9.3 years ago by stu111392 ▴ 30

score 1 · Answer 1 · 2014-12-16

BLAST would definitely be too slow for mapping sequencing reads to a database. DIAMOND seems much faster (I have only tried it a few times) and may be a viable alternative, although I would probably prefer to use it for mapping assembled contigs to a protein database. It depends on how many reads you have.

You could try something like this:

1. Try to remove host DNA by mapping the reads to, e.g., the human genome (if that is the host) using bowtie2 or bwa, and discard the reads that match the host. Here you can also add further filters, such as suspected contaminating bacteria.
2. Map the remaining reads with a short-read aligner (e.g. bowtie2 or bwa) against a viral nucleotide database. You could get that from GenBank, for example.
3. You could possibly use DIAMOND at this step as well, to map the reads to a viral protein reference database.
4. The reads that still remain at this point (those that didn't map to the host, the viral nt database or the viral aa database) can be assembled with e.g. IDBA-UD, MEGAHIT or something similar.
5. The resulting contigs can be mapped with DIAMOND to the NR database.
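
To make the steps above a bit more concrete, here is a rough Python sketch that only builds the command lines for such a pipeline. It is not a tested workflow. It assumes single-end reads, picks bowtie2 for both read-mapping steps and MEGAHIT for the assembly, and assumes the bowtie2 indexes and DIAMOND databases have already been built. The function names, file names and index/database names are all made up for illustration, and the flags should be checked against the versions you have installed.

```python
import shlex
import subprocess
from pathlib import Path


def build_pipeline(reads, host_index, viral_nt_index, viral_aa_db, nr_db,
                   outdir="virus_screen", threads=8):
    """Return the five steps as argument lists. Nothing is executed here."""
    out = Path(outdir)
    t = str(threads)
    non_host = str(out / "non_host.fq")
    non_viral_nt = str(out / "non_viral_nt.fq")
    non_viral_aa = str(out / "non_viral_aa.fa")
    assembly_dir = str(out / "megahit")
    contigs = str(Path(assembly_dir) / "final.contigs.fa")
    return [
        # 1. Host removal: --un writes the reads that did NOT align to the host.
        ["bowtie2", "-p", t, "-x", host_index, "-U", reads,
         "--un", non_host, "-S", str(out / "host.sam")],
        # 2. Map non-host reads to a viral nucleotide index; keep the leftovers.
        ["bowtie2", "-p", t, "-x", viral_nt_index, "-U", non_host,
         "--un", non_viral_nt, "-S", str(out / "viral_nt.sam")],
        # 3. Translated search of the leftovers against a viral protein database.
        ["diamond", "blastx", "-p", t, "-d", viral_aa_db, "-q", non_viral_nt,
         "-o", str(out / "viral_aa.tsv"), "-f", "6", "--un", non_viral_aa],
        # 4. Assemble whatever is still unexplained.
        ["megahit", "-t", t, "-r", non_viral_aa, "-o", assembly_dir],
        # 5. Search the contigs against NR (as a DIAMOND database).
        ["diamond", "blastx", "-p", t, "-d", nr_db, "-q", contigs,
         "-o", str(out / "contigs_vs_nr.tsv"), "-f", "6"],
    ]


def run_pipeline(commands, outdir="virus_screen"):
    """Run the steps in order and stop at the first failure."""
    Path(outdir).mkdir(parents=True, exist_ok=True)
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Print the commands for inspection (placeholder file names, nothing is run).
for cmd in build_pipeline("reads.fq", "host_index", "viral_nt_index",
                          "viral_proteins", "nr"):
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to check them by eye before calling `run_pipeline` on real data. For paired-end reads or bwa, the read-mapping steps would need different options.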