Question: finding repeats in de-novo assembled contig from PacBio reads
1
4.0 years ago by
aindap120
United States
aindap120 wrote:

Dear BioStars Community:

I performed a de novo assembly of PacBio reads for a viral genome using Canu, and I have the resulting unitig from the Canu pipeline. I am now interested in characterizing repeats in this assembly, but I'm new to assembly and repeat identification. One simple approach would be to take the reads used to build the assembly, align them against the assembly with MUMmer, and look at the resulting dot plot. Are there more sophisticated approaches that would yield better results?

pacbio repeat assembly • 1.8k views
modified 3.9 years ago by arnstrm1.8k • written 4.0 years ago by aindap120
2

Why not use Tandem Repeats Finder or RepeatMasker for this?

written 4.0 years ago by reza240
1

Assuming the assembly is correct, it seems to make more sense to align the assembly to itself rather than aligning reads to the assembly.
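To illustrate the self-comparison idea, here is a minimal Python sketch that finds exact k-mers occurring at more than one position in a single sequence and reports them as off-diagonal dot plot coordinates. It is only a toy under stated assumptions: the function names and the default k are hypothetical, it considers exact forward-strand matches only, and it is not how MUMmer works internally. For a real assembly a proper aligner is the better choice.

```python
from collections import defaultdict


def repeated_kmers(seq, k=12):
    """Map each k-mer that occurs more than once in seq to its start positions."""
    positions = defaultdict(list)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    # Keep only k-mers seen at two or more positions (candidate repeats).
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


def self_dotplot_points(seq, k=12):
    """Return sorted (i, j) pairs with i < j where the k-mers at i and j are identical.

    These are the off-diagonal points of a self-versus-self dot plot;
    the trivial main diagonal (i == j) is omitted.
    """
    points = []
    for pos in repeated_kmers(seq, k).values():
        for a in range(len(pos)):
            for b in range(a + 1, len(pos)):
                points.append((pos[a], pos[b]))
    return sorted(points)


if __name__ == "__main__":
    # Toy sequence (made up for illustration): a 14 bp unit, a spacer, the same unit again.
    toy = "ACGTTGCAAGGCTA" + "TTTT" + "ACGTTGCAAGGCTA"
    for i, j in self_dotplot_points(toy, k=8):
        print(i, j)
```

Runs of consecutive points with a constant offset (j - i) correspond to a repeated segment; on a genome-sized sequence you would want to merge such runs and also check the reverse complement.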

written 3.9 years ago by Brian Bushnell17k
1
3.9 years ago by
arnstrm1.8k
Ames, IA
arnstrm1.8k wrote:

Yes, you should probably try RepeatModeler, which detects repeat families de novo and classifies them. It has worked well for both model and non-model species and is very easy to run, although it does have a few dependencies to install (TRF, RECON, RepeatScout, NSEG).

EDIT: you can find my sample run script here!

modified 3.9 years ago • written 3.9 years ago by arnstrm1.8k