Question

How To Go Fishing In A Metagenomics Sample Using Raw Reads As Bait

0

10.8 years ago

Lee Katz ★ 3.2k

Hi, I have a metagenomics sample. I also have a set of reads from a genome. How can I pull similar or identical reads from the metagenomics sample? My ultimate goal is to reconstruct the genome hidden in the metagenomics sample.

metagenomics fastq illumina • 3.1k views

ADD COMMENT • link updated 10.8 years ago by Pavel Senin ★ 1.9k • written 10.8 years ago by Lee Katz ★ 3.2k

0

I know that assembling the genome and mapping the metagenomic reads to that assembly would be one way to do it. That is a legitimate approach, but some regions of the genome will be left unassembled. What I'm asking is whether there is a way to match reads to reads.

ADD REPLY • link 10.8 years ago by Lee Katz ★ 3.2k

0

What exactly would you assemble: the metagenome or the genome? If the latter, you reduce the search space from a set of reads to contigs and singletons (= the index size), which speeds up the selection process. I'm curious why bwa wouldn't work. I use this approach in my current project and it works.

ADD REPLY • link 10.8 years ago by Pavel Senin ★ 1.9k

0

After getting all the reads, I would assemble the genome hidden in the metagenome. The reason I don't want to map to an assembly is that some regions of the genome are not retained in the assembly, usually because of repeats, misassemblies, or lack of coverage. I am willing to spend extra compute time on this in order to recover more or better reads.

ADD REPLY • link 10.8 years ago by Lee Katz ★ 3.2k

0

Two good suggestions I have received in conversation are either 1) find reads that share identical k-mers or 2) blast the metagenomic reads against the genomic reads and keep any queries that match.
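To make suggestion 1 concrete, here is a minimal Python sketch of read-to-read k-mer baiting. It is only an illustration under stated assumptions, not a tested pipeline: the function names (`build_bait_index`, `fish_reads`), the choice of canonical k-mers, and the `min_shared` threshold are all hypothetical, and the whole bait k-mer set is held in memory, which may not scale to a large read set.

```python
from typing import Iterable, Iterator, Set, Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(seq: str, k: int) -> Iterator[str]:
    """Yield each k-mer as the lexicographically smaller of itself and its
    reverse complement, so that strand does not matter. K-mers with N are skipped."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if "N" in kmer:
            continue
        yield min(kmer, reverse_complement(kmer))


def build_bait_index(bait_seqs: Iterable[str], k: int) -> Set[str]:
    """Collect every canonical k-mer seen in the genome (bait) reads."""
    index: Set[str] = set()
    for seq in bait_seqs:
        index.update(canonical_kmers(seq, k))
    return index


def fish_reads(
    metagenome_reads: Iterable[Tuple[str, str]],
    bait_index: Set[str],
    k: int,
    min_shared: int = 1,  # hypothetical threshold; tune for your data
) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) for metagenome reads that share at least
    min_shared k-mers with the bait index."""
    for name, seq in metagenome_reads:
        shared = sum(1 for kmer in canonical_kmers(seq, k) if kmer in bait_index)
        if shared >= min_shared:
            yield name, seq


if __name__ == "__main__":
    bait = ["ACGTACGTTAGC"]
    metagenome = [
        ("r1", "GGGACGTACGGG"),   # shares k-mers with the bait read
        ("r2", "TTTTTTTTTTTT"),   # unrelated
        ("r3", reverse_complement(bait[0])),  # same locus, opposite strand
    ]
    index = build_bait_index(bait, k=5)
    for name, _ in fish_reads(metagenome, index, k=5):
        print(name)
```

Raising `min_shared` or `k` makes the selection stricter. A very small `k` will pull in many unrelated reads by chance.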

ADD REPLY • link 10.8 years ago by Lee Katz ★ 3.2k

score 1 · Answer 1 · 2014-01-13

1

10.8 years ago

Pavel Senin ★ 1.9k

I use bwa to index the set of genome reads, then align the metagenomic sample against that index as the reference. The metagenomic reads that align are the "similar or identical reads".

ADD COMMENT • link 10.8 years ago by Pavel Senin ★ 1.9k

score 0 · Answer 2 · 2014-01-13

0

10.8 years ago

IV ★ 1.3k

How many reads from the genome do you already have?

If you have enough, you could assemble them into contigs (or even rough scaffolds of a genome) and then align against those.

If you have too few, then you could create a blast database or an aligner index from the reads and search against that.

Cheers,

IV

ADD COMMENT • link 10.8 years ago by IV ★ 1.3k

0

As Pavel and I already mentioned, you can build a bowtie, bwa, or other aligner index, or a blast database, from your available reads. First collapse the reads and convert them to a FASTA file. Then use that FASTA file to build your index or your blast database, and search the metagenomic reads against it.
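The collapse-and-convert step could be sketched in Python roughly as follows. This is an illustration only, and it makes several assumptions: "collapse" is taken to mean merging reads with identical sequences, the input is plain 4-line FASTQ, and the function names and the `collapsed_N count=M` header format are hypothetical.

```python
from collections import Counter
from typing import Iterator, TextIO


def read_fastq_sequences(handle: TextIO) -> Iterator[str]:
    """Yield the sequence line of each record in a 4-line-per-record FASTQ file."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line (discarded; FASTA has no qualities)
        if seq:
            yield seq.upper()


def collapse_to_fasta(fastq_in: TextIO, fasta_out: TextIO) -> int:
    """Merge identical sequences and write one FASTA record per unique sequence.

    Returns the number of unique sequences written. All unique sequences are
    held in memory, so this is only a sketch for modest read sets.
    """
    counts = Counter(read_fastq_sequences(fastq_in))
    for i, (seq, n) in enumerate(counts.most_common(), start=1):
        fasta_out.write(f">collapsed_{i} count={n}\n{seq}\n")
    return len(counts)


if __name__ == "__main__":
    import sys

    # Usage (file names are placeholders):
    #   python collapse.py myreads.fastq myreadscollapsed.fa
    with open(sys.argv[1]) as fin, open(sys.argv[2], "w") as fout:
        unique = collapse_to_fasta(fin, fout)
    print(f"wrote {unique} unique sequences", file=sys.stderr)
```

The resulting FASTA file is what you would hand to the index builder or to the blast database builder.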

ADD REPLY • link 10.8 years ago by IV ★ 1.3k

0

I haven't used bowtie for quite a while, but I think the syntax would be something like:

bowtie-build myreadscollapsed.fa reads.index

ADD REPLY • link 10.8 years ago by IV ★ 1.3k