Question

k-mer based methods to piece sequenced amplicons together?

0

5.5 years ago

laurenkleine18 ▴ 20

I have a genomic region, approximately 600-900 bp long, that I would like to sequence on my MiSeq.

I have primers that amplify each half of this region separately. I plan to amplify both halves, sequence them using a 300-cycle Nano kit, then piece them back together to get the entire 600-900 bp region sequence.

The issues I am facing are:

1) How to clean up the Illumina Adapter Overhang sequences:

Forward overhang: 5’ TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG‐[locus specific forward primer sequence]
Reverse overhang: 5’ GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG‐[locus specific reverse primer sequence]

2) How to piece together the sequences using a k-mer based tool

Thank you for your feedback and advice!

k-mer Assembly alignment • 1.1k views

updated 5.5 years ago by GenoMax 141k • written 5.5 years ago by laurenkleine18 ▴ 20

1

Pardon me for asking, but are you only sequencing one region from one sample? Can you not do this using old-fashioned Sanger sequencing? Or am I missing something?

5.5 years ago by GenoMax 141k

0

Yes, we could use Sanger sequencing; that is the method traditionally used to sequence this region of interest.

Our lab (industry) already has a MiSeq and all of the reagents. The MiSeq Nano Kit is only ~$300. Rather than sending our samples out to a third-party lab, running them in-house would be much more convenient and would let us get results on our own time.

5.5 years ago by laurenkleine18 ▴ 20

0

You might be able to do it 'old school' and put the assemblies together with Phrap and Consed (though these are not k-mer based tools): http://www.phrap.org/phredphrapconsed.html

5.5 years ago by Joe 21k

score 0 · Answer 1 · 2018-10-10

How to clean up the Illumina Adapter Overhang sequences

Use a program like bbduk.sh from the BBMap suite.
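To make concrete what adapter trimming does to a read, here is a minimal Python sketch. It is not how bbduk.sh works internally (bbduk.sh uses k-mer matching and tolerates mismatches); this version only does exact string matching. The overhang sequences are the ones from the question; the function name `trim_overhangs` and the `min_match` parameter are made up for illustration. Whether the overhang shows up at the 5' end of a read, or as a reverse-complemented read-through at the 3' end, depends on the library design, so the sketch handles both.

```python
# Overhang sequences as quoted in the question.
FWD_OVERHANG = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
REV_OVERHANG = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def trim_overhangs(read, five_prime, opposite, min_match=8):
    """Strip overhang sequence from one read (hypothetical helper, exact matches only).

    five_prime: overhang that may prefix this read.
    opposite:   overhang on the other end of the amplicon; on read-through it
                appears at the 3' end as its reverse complement.
    min_match:  shortest partial adapter at the read end that is still trimmed.
    """
    # 5' end: drop the overhang if the read starts with it.
    if read.startswith(five_prime):
        read = read[len(five_prime):]

    # 3' end: full read-through into the opposite adapter.
    rc = reverse_complement(opposite)
    pos = read.find(rc)
    if pos != -1:
        return read[:pos]

    # 3' end: only the first part of the adapter fits before the read ends.
    for n in range(min(len(rc), len(read)) - 1, min_match - 1, -1):
        if read.endswith(rc[:n]):
            return read[:-n]
    return read
```

For real data, a dedicated trimmer is still the better choice because sequencing errors inside the adapter will defeat exact matching.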

How to piece together the sequences using a k-mer based tool

I don't think you need to do anything fancy. A simple multiple sequence alignment (after converting the reads to FASTA) could do the trick. You may also be able to use tadpole.sh from BBMap for assembly.
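If you end up with one consensus sequence per half, joining them can be as simple as a k-mer seeded overlap check. The Python sketch below is only an illustration and is not what tadpole.sh does. It assumes the two halves actually overlap (the question does not say so), that both sequences are on the same strand, and that the overlap is error-free. The function name `merge_by_kmer_overlap` and the default `k=15` are arbitrary choices for the example.

```python
def merge_by_kmer_overlap(left, right, k=15):
    """Join two sequences whose ends overlap by at least k bases.

    The first k-mer of `right` is used as a seed: every place it occurs in
    `left` is a candidate overlap start, and the candidate is accepted only
    if the whole tail of `left` from that point matches the start of `right`.
    Returns the merged sequence, or None if no exact overlap is found.
    """
    if len(left) < k or len(right) < k:
        return None

    seed = right[:k]
    pos = left.find(seed)
    while pos != -1:
        overlap = left[pos:]
        # Leftmost hit first, so the longest valid overlap wins.
        if right.startswith(overlap):
            return left + right[len(overlap):]
        pos = left.find(seed, pos + 1)
    return None


# Toy example with a short k so the sequences stay readable.
left_half = "AAACCCGGGTTTACGTACGT"
right_half = "TTTACGTACGTGGGAAA"
merged = merge_by_kmer_overlap(left_half, right_half, k=8)
```

If the halves do not overlap, or the primers sit inside the overlap and were trimmed away, this returns `None` and you would need a reference or the alignment route instead.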

Even at about a million reads per Nano run, you are going to have far more coverage than a 600-900 bp region needs, so you will likely need to normalize the data down before doing the assembly.
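To put a rough number on "a ton of coverage", here is a small Python sketch of the arithmetic plus plain random subsampling. Assumptions that are not in the thread: 150 bp reads (a 300-cycle kit run as 2 x 150), a 900 bp region, and a target depth of 100x picked purely as an example. Random subsampling is a simpler technique than the k-mer based normalization BBMap provides in bbnorm.sh; it is named here only to show the idea. The function names are hypothetical.

```python
import random


def estimated_coverage(n_reads, read_len, region_len):
    """Average depth if all reads land on the region."""
    return n_reads * read_len / region_len


def subsample_reads(reads, target_coverage, read_len, region_len, seed=0):
    """Randomly keep just enough reads to reach roughly target_coverage.

    reads is any sequence of read records; seed makes the draw repeatable.
    """
    reads = list(reads)
    # Number of reads that gives the target depth over the region.
    n_keep = round(target_coverage * region_len / read_len)
    if n_keep >= len(reads):
        return reads
    rng = random.Random(seed)
    return rng.sample(reads, max(1, n_keep))


# Roughly 1,000,000 reads x 150 bp over 900 bp (assumed values).
depth = estimated_coverage(1_000_000, 150, 900)

# Toy demonstration on 1,000 placeholder reads.
toy_reads = [f"read_{i}" for i in range(1000)]
kept = subsample_reads(toy_reads, target_coverage=100, read_len=150, region_len=900)
```

With those assumed numbers the depth works out to well over 100,000x, which is why downsampling before assembly makes sense.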