Copy Number Variation For Haploid Organisms
9.8 years ago
Raygozak ★ 1.4k

I've generated artificial copy number variations in a microbial genome and simulated reads from it, then tried to detect those copies with a number of tools. All of the tools I've tried seem to be tailored to human data and are very hard to use on another genome. I've also searched the literature on this, but wanted to know whether someone with experience would be willing to give me some advice on tools or publications that deal with this in an easy way.

Thanks

Hi. I am the developer and maintainer of CNAnorm. The tool was developed for tumour samples and I have only tested it on human and mouse, but I have always tried to keep it general enough that it should work regardless of the species. If you give it a go and find it doesn't work, I'd be interested to know why. If you use it, set method = 'closest' when you call peakPloidy.

Let me know, and good luck.
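For anyone who wants to sanity-check a simulated haploid genome before running a full tool, here is a minimal read-depth sketch in Python. It is not CNAnorm's method or API, just the generic idea of comparing windowed depth to a single-copy baseline. The function name, the window size and the input format (a plain list of per-base depths, e.g. parsed from `samtools depth -a` output) are all assumptions made for illustration, and there is no GC or mappability correction.

```python
from statistics import median


def windowed_copy_number(depths, window=1000, baseline_ploidy=1):
    """Estimate an integer copy number per window from per-base read depth.

    depths: per-base depths along one contig (hypothetical input format).
    baseline_ploidy: 1 for a haploid organism.
    """
    # Mean depth of each fixed-size, non-overlapping window
    # (the last window may be shorter than `window`).
    means = []
    for start in range(0, len(depths), window):
        chunk = depths[start:start + window]
        means.append(sum(chunk) / len(chunk))

    # Assumption: most of the genome is single copy, so the median
    # window depth is a reasonable baseline for one copy.
    base = median(means)
    if base == 0:
        raise ValueError("median depth is zero; cannot normalise")

    # Depth ratio to baseline, scaled by ploidy, rounded to an integer.
    return [round(baseline_ploidy * m / base) for m in means]


# Toy example: 10 kb contig at 30x with one duplicated 1 kb window at 60x.
toy_depths = [30] * 5000 + [60] * 1000 + [30] * 4000
calls = windowed_copy_number(toy_depths, window=1000)
```

In the toy example the sixth window comes out as copy number 2 and the rest as 1. On real or simulated reads the depth is much noisier, so you would want larger windows or some smoothing before trusting individual calls.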

People mean many things by "CNV". What size of event are you looking for, and at what resolution? You can use Cortex to assemble structural variants. It works best in the range from SNPs up to a few kb, although it also has a reference-based method for calling large (tens of kb) variants (mostly deletions).

Main paper: De novo assembly and genotyping of variants using colored de Bruijn graphs. Z Iqbal, M Caccamo, I Turner, P Flicek, G McVean. Nature Genetics (2012)

Latest paper, on microbes: High-throughput microbial population genomics using the Cortex variation assembler. Z Iqbal, I Turner, G McVean. Bioinformatics (2012)

Cortex is set up to allow the user to specify ploidy and is definitely not "tuned" for human. If you are interested in very large (tens or hundreds of kb) duplications, then I'm afraid Cortex is not able to call these. (I should add that I'm one of the authors.)

In addition, this recent paper (nothing to do with me) looks interesting:

http://bioinformatics.oxfordjournals.org/content/early/2012/10/09/bioinformatics.bts601.abstract De novo detection of copy number variation by co-assembly. Jurgen F. Nijkamp, Marcel A. van den Broek, Jan-Maarten A. Geertman, Marcel J.T. Reinders, Jean-Marc G. Daran and Dick de Ridder

The authors apply their new tool to haploids (yeast).

There are a lot of other CNV tools, but my experience of those is limited to human data, so I can't really comment.

Josh Herr 5.7k

This isn't my area, but I've been interested in CNVs for a while; I've just never spent more than a few minutes investigating them. It does seem like all the tools are for human genomes with good references. I think PennCNV and VarScan both take short reads as SNP data with a reference. This paper from a few years back might be interesting to you. I also saw this recent paper just this week, on human data, so perhaps Aaron Quinlan can give more information, as he's an author on the paper.