Question: Questions about the Getting-started of wtdbg2
4 weeks ago by
boymin202020 wrote:


This is my first time assembling long reads from Nanopore sequencing. I also have short reads generated on an Illumina sequencer. My plan is to use wtdbg2 to get a draft genome FASTA file, then polish it with pilon. However, I am stuck at the getting-started part of wtdbg2. I am confused by the input and output files in the following command lines. Do they form one pipeline, or are they independent examples?

#quick start with
./ -t 16 -x rs -g 4.6m -o dbg reads.fa.gz

# Step by step commandlines

# assemble long reads
./wtdbg2 -x rs -g 4.6m -i reads.fa.gz -t 16 -fo dbg

# derive consensus
./wtpoa-cns -t 16 -i dbg.ctg.lay.gz -fo dbg.raw.fa
reads.fa.gz is the input sequence file; substitute your own.

dbg.raw.fa would be the final consensus fasta file.
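
To make the file flow concrete, here is a minimal dry-run sketch that only prints the two step-by-step commands with your own file names filled in. The variable names (READS, PREFIX, THREADS, GENOME_SIZE) and the function name are made up for illustration; the flags are the ones from the commands quoted in the question.

```shell
#!/bin/sh
# Dry run: print the two step-by-step commands, do not execute them.
# READS, PREFIX, THREADS and GENOME_SIZE are hypothetical variable names.
READS="${READS:-reads.fa.gz}"        # your long-read file
PREFIX="${PREFIX:-dbg}"              # prefix shared by the output files
THREADS="${THREADS:-16}"
GENOME_SIZE="${GENOME_SIZE:-4.6m}"   # approximate genome size

step_by_step() {
  # Step 1: assembly; its outputs are named after PREFIX,
  # including the layout file $PREFIX.ctg.lay.gz used in step 2.
  echo "./wtdbg2 -x rs -g $GENOME_SIZE -i $READS -t $THREADS -fo $PREFIX"
  # Step 2: consensus; reads the layout from step 1, writes $PREFIX.raw.fa.
  echo "./wtpoa-cns -t $THREADS -i $PREFIX.ctg.lay.gz -fo $PREFIX.raw.fa"
}

step_by_step
```

The point is that the `-fo` prefix of the first command determines the `-i` file of the second, so the two lines are one pipeline and not independent examples. The `-x rs` preset is kept as in the question; as far as I recall the wtdbg2 README lists a separate preset for Nanopore reads, so check the program's help output before running on your data.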

4 weeks ago by
h.mon31k wrote:

The quick-start command runs a Perl script that wraps the whole wtdbg2 pipeline in one command. It assembles the reads (with wtdbg2), derives the consensus (with wtpoa-cns), maps (with minimap2) and filters (with samtools) the reads back to the consensus, and then produces a polished assembly (again with wtpoa-cns).

The two commands below the quick-start line (wtdbg2 and wtpoa-cns) correspond to the first two steps of the Perl pipeline.

So you can run the Perl script and be done with it, or run each command separately.
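
If you go the step-by-step route and also want the polishing steps (map, filter, polish), a rough dry-run sketch could look like the one below. It only prints the commands. The flags are as I recall them from the wtdbg2 README, not taken from this thread, so treat them as assumptions and check them against your installed versions; the variable names, the function name, and the output names `$PREFIX.bam` and `$PREFIX.cns.fa` are illustrative.

```shell
#!/bin/sh
# Dry run: print the polishing commands that follow the raw consensus.
# Flags are recalled from the wtdbg2 README (an assumption); verify before use.
READS="${READS:-reads.fa.gz}"
PREFIX="${PREFIX:-dbg}"
THREADS="${THREADS:-16}"
MM2_PRESET="${MM2_PRESET:-map-pb}"   # minimap2 preset; map-ont is the Nanopore one

polish_steps() {
  # Map the reads back to the raw consensus and sort the alignments.
  echo "minimap2 -t $THREADS -ax $MM2_PRESET $PREFIX.raw.fa $READS | samtools sort -@ 4 > $PREFIX.bam"
  # Drop secondary and supplementary alignments (-F 0x900), then polish.
  echo "samtools view -F 0x900 $PREFIX.bam | ./wtpoa-cns -t $THREADS -d $PREFIX.raw.fa -i - -fo $PREFIX.cns.fa"
}

polish_steps
```

For Nanopore reads you would set `MM2_PRESET=map-ont` before calling it. If you plan to polish with another tool anyway, you can stop at the raw consensus and skip these steps.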

Thanks, h, your answer helped me a lot.

