Question: finding repeats in de-novo assembled contig from PacBio reads
2.3 years ago by
aindap110
United States
aindap110 wrote:

Dear BioStars Community:

I performed a de novo assembly of PacBio reads for a viral genome using Canu, and I have the resulting unitig from the Canu pipeline. I am now interested in characterizing repeats in the assembly. I'm new to assembly and repeat identification. One simple approach I considered is to take the reads used to build the assembly, align them against the assembly with MUMmer, and look at the resulting dot plot. Are there more sophisticated approaches that would yield better results?

pacbio repeat assembly • 1.3k views

Why not use Tandem Repeats Finder or RepeatMasker for this?

written 2.3 years ago by reza210

Assuming the assembly is correct, it seems to make more sense to align the assembly to itself rather than aligning reads to the assembly.

written 2.2 years ago by Brian Bushnell16k
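
To make the self-comparison idea concrete, here is a minimal sketch in Python that lists exact k-mers occurring more than once in a single contig, on either strand. It is only an illustration of what a self-alignment dot plot shows, not a replacement for MUMmer: it finds exact matches only, and the function names and the default k are assumptions chosen for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def repeated_kmers(seq: str, k: int = 20) -> Dict[str, List[Tuple[int, str]]]:
    """Map each canonical k-mer seen more than once to its (position, strand) hits.

    A k-mer and its reverse complement are treated as the same repeat unit;
    the lexicographically smaller of the two is used as the key.
    Positions are 0-based offsets into seq.
    """
    seq = seq.upper()
    hits: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if "N" in kmer:
            continue  # skip windows containing ambiguous bases
        rc = reverse_complement(kmer)
        canonical = min(kmer, rc)
        strand = "+" if kmer == canonical else "-"
        hits[canonical].append((i, strand))
    # Keep only k-mers that occur at two or more positions
    return {kmer: pos for kmer, pos in hits.items() if len(pos) > 1}
```

Calling repeated_kmers(contig, k=20) on the assembled sequence gives the coordinates that would appear as off-diagonal dots in a self-versus-self plot. Hits with strand "-" correspond to inverted repeats.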
2.2 years ago by
arnstrm1.7k
Ames, IA
arnstrm1.7k wrote:

Yes, you should probably try RepeatModeler, which can detect repeat families de novo and classify them. It has worked well for both model and non-model species and is very easy to run (it does have a few dependencies to install, though: TRF, RECON, RepeatScout, NSEG).

EDIT: you can find my sample run script here!
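
For readers without access to the linked script, a typical RepeatModeler run has two steps: build a sequence database from the assembly, then run the modeler against it. The sketch below only assembles those two command lines in Python. It is not the run script mentioned above. The file name assembly.fasta and database name assembly_db are placeholders, and thread options are left out because the flag differs between RepeatModeler versions.

```python
import shlex
from typing import List


def repeatmodeler_commands(fasta: str, db_name: str) -> List[List[str]]:
    """Return the two command lines for a basic RepeatModeler run.

    fasta and db_name are caller-supplied; the values used below are
    hypothetical placeholders, not names from the original thread.
    """
    return [
        # Step 1: index the assembly into a RepeatModeler database
        ["BuildDatabase", "-name", db_name, fasta],
        # Step 2: de novo repeat family discovery on that database
        ["RepeatModeler", "-database", db_name],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so this works
    # even where RepeatModeler is not installed.
    for argv in repeatmodeler_commands("assembly.fasta", "assembly_db"):
        print(shlex.join(argv))
```

To actually execute the steps, each list can be passed to subprocess.run(argv, check=True) on a machine where RepeatModeler and its dependencies are installed.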
