Finding repeats in a de novo assembled contig from PacBio reads
5.4 years ago
aindap ▴ 120

Dear BioStars Community:

I performed a de novo assembly of PacBio reads for a viral genome using Canu, and I now have the resulting unitig from the Canu pipeline. I would like to characterize the repeats in this assembly, but I'm new to both assembly and repeat identification. One simple approach I considered is to align the reads used for the assembly back against it with MUMmer and inspect the resulting dot plot. Are there more sophisticated approaches that would yield better results?

Assembly PacBio repeat • 2.1k views

Why not use Tandem Repeats Finder or RepeatMasker for this?


Assuming the assembly is correct, it seems to make more sense to align the assembly to itself rather than aligning reads to the assembly.
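
To illustrate the self-comparison idea, here is a minimal sketch in plain Python, not a replacement for MUMmer. It indexes every k-mer of a sequence and reports the off-diagonal position pairs where the same k-mer occurs twice, which is roughly what a self dot plot displays. The function names and the default `k` are assumptions chosen for this example, and it only looks at the forward strand, so inverted repeats are not detected.

```python
from collections import defaultdict


def repeated_kmers(seq, k=12):
    """Map each k-mer that occurs more than once to its start positions."""
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if "N" not in kmer:  # skip windows containing ambiguous bases
            positions[kmer].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


def self_dot_points(seq, k=12):
    """Return sorted (i, j) pairs with i < j where the same k-mer starts.

    These are the off-diagonal points of a forward-strand self dot plot.
    """
    points = []
    for pos in repeated_kmers(seq, k).values():
        for a in range(len(pos)):
            for b in range(a + 1, len(pos)):
                points.append((pos[a], pos[b]))
    return sorted(points)


if __name__ == "__main__":
    # Toy sequence: the same 8-mer at both ends, separated by "GGG".
    toy = "ACGTTGCA" + "GGG" + "ACGTTGCA"
    print(self_dot_points(toy, k=8))
```

For a real viral contig, runs of consecutive points along a diagonal would suggest a longer repeat; a dedicated aligner will handle mismatches and the reverse strand far better than this exact-match sketch.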

5.3 years ago
arnstrm ★ 1.8k

Yes, you should probably try RepeatModeler, which detects repeat families de novo and classifies them. It has worked well for both model and non-model species and is very easy to run, although it does have a few dependencies to install (TRF, RECON, RepeatScout, NSEG).

EDIT: you can find my sample run script here!
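
For readers without access to that script, the following is a hedged sketch of what a basic RepeatModeler run typically involves; it is not the script referred to above. It assumes `BuildDatabase` and `RepeatModeler` are on your `PATH`, and the file name `contig.fasta` and database name `viral_db` are hypothetical placeholders. The wrapper defaults to a dry run that only prints the commands.

```python
import subprocess


def repeatmodeler_commands(fasta, db_name):
    """Build the two commands of a basic RepeatModeler run.

    fasta and db_name are placeholders supplied by the caller.
    """
    return [
        # Step 1: build a sequence database from the assembly FASTA.
        ["BuildDatabase", "-name", db_name, str(fasta)],
        # Step 2: run de novo repeat family discovery on that database.
        ["RepeatModeler", "-database", db_name],
    ]


def run_repeatmodeler(fasta, db_name, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    commands = repeatmodeler_commands(fasta, db_name)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the pipeline if a step fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_repeatmodeler("contig.fasta", "viral_db", dry_run=True)
```

Threading options differ between RepeatModeler versions, so they are left out here; check the help output of your installed version before adding them.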
