Question: Suggestions for a contig assembler that respects a single-copy mutation in a nonaploid genome?
johnnytam100 wrote (23 months ago):

As in the title. Here is the background:

I am trying to locate a possible single-copy insertional mutation in a nonaploid plant genome.

Previously, I used SPAdes to assemble contigs of the mutant from reads containing the potentially mutated sequence, which gave me ~1000 contigs of the mutant genome.

After mapping reads (0 mismatches allowed) to the set of ~1000 contigs, about 10 contigs appeared to contain breakpoints, but PCR screening showed that all of them were false positives.

I then repeated the mapping with the default number of mismatches allowed by bwa. This time, reads carrying mismatches at the breakpoint positions mapped across the putative breakpoints and joined them, so the wild-type and mutant mapping results no longer differed.
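The comparison between zero-mismatch and default mapping described above can be sketched as a filter on the `NM` (edit distance) tag of SAM records. This is only a minimal illustration, assuming the aligner writes an `NM:i:` tag (bwa does); the function names and the toy records are hypothetical. Note that `NM` counts indels as well as mismatches, and it ignores soft-clipped bases, so a clipped read can still report `NM:i:0`.

```python
def edit_distance(sam_line):
    """Return the NM tag value of a SAM record, or None if the tag is absent."""
    fields = sam_line.rstrip("\n").split("\t")
    for tag in fields[11:]:  # optional tags start at column 12
        if tag.startswith("NM:i:"):
            return int(tag[5:])
    return None


def split_by_edit_distance(sam_lines, max_nm=0):
    """Split mapped read names into those with NM <= max_nm and those above it."""
    within, above = [], []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):  # skip blanks and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & 0x4:  # FLAG bit 0x4: read is unmapped
            continue
        nm = edit_distance(line)
        if nm is None:
            continue
        (within if nm <= max_nm else above).append(fields[0])
    return within, above


# Toy records (hypothetical read and contig names) to show the behaviour.
toy_sam = [
    "@SQ\tSN:contig_1\tLN:5000",
    "read_a\t0\tcontig_1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tFFFFFFFFFF\tNM:i:0",
    "read_b\t0\tcontig_1\t120\t60\t10M\t*\t0\t0\tACGTACGTAC\tFFFFFFFFFF\tNM:i:2",
    "read_c\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tFFFFFFFFFF",
]
exact, inexact = split_by_edit_distance(toy_sam)
print("exact:", exact)
print("inexact:", inexact)
```

Reads that appear only in the second list are the ones that map across a candidate breakpoint solely because mismatches are tolerated, which is the group that erased the wild-type versus mutant difference here.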

After that, I started to wonder whether the assembler "generalizes" my contigs so much that the breakpoint becomes difficult to find by comparing the wild type with the mutant.

That brings me to my question: are there any contig assemblers that respect a single-copy mutation in a nonaploid genome? The task is to find the 1 in a 1:8 situation in the genome, i.e. one mutated copy alongside eight unmutated ones.

Thank you!

modified 23 months ago • written 23 months ago by johnnytam100

Which dataset? PacBio or Illumina? What coverage? If it is Illumina, I would say this is impossible.

Assemblers are not really up to the task of generating diploid assemblies yet, with a few exceptions such as FALCON-Unzip.

written 22 months ago by colindaven

I used a 150 bp Illumina library. Coverage is 11x for the wild type and 38x for the mutant. Would you suggest getting some long reads anyway?

written 22 months ago by johnnytam100

38x coverage is potentially useful; 11x not so much. I would expect heavily fragmented assemblies even for a haploid or diploid genome. This data is not sufficient for your goals.
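To put rough numbers on why these depths are thin for a 1-in-9 allele, here is a back-of-the-envelope sketch. It assumes the quoted coverage is the total depth when all nine copies pile up on one collapsed locus, and that read counts are roughly Poisson distributed; both are simplifications, and the function names are hypothetical. Under those assumptions a single-copy allele is expected to be supported by 38/9, or about 4.2, reads in the mutant.

```python
import math


def expected_allele_depth(total_depth, ploidy, allele_copies=1):
    """Mean number of reads carrying an allele present in allele_copies of ploidy copies."""
    return total_depth * allele_copies / ploidy


def prob_at_least(k, lam):
    """P(X >= k) for a Poisson-distributed read count X with mean lam."""
    below = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


# Depths quoted in the thread; ploidy 9 for a nonaploid genome.
for label, depth in [("wild type", 11), ("mutant", 38)]:
    lam = expected_allele_depth(depth, ploidy=9)
    print(f"{label}: expected single-copy depth = {lam:.2f}, "
          f"P(at least 5 supporting reads) = {prob_at_least(5, lam):.2f}")
```

The threshold of 5 supporting reads is arbitrary; changing `k` shows how quickly the chance of seeing enough evidence for the single copy falls off at these depths.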

If you are interested in one region, can't you generate a whole range of PCR products and/or, preferably, BACs, and sequence those with a long-read technology such as PacBio or ONT?

This is a very difficult project, though. Is there any successful public data from the same organism?

written 22 months ago by colindaven

There is one example, but it is for a diploid relative species, and I don't think any projects are working on the nonaploid form of this species.

Let me ask whether I can get some long reads. More suggestions on wet-lab or dry-lab experiments would be much appreciated!

written 22 months ago by johnnytam100