Searching For Deletion In A Low Number Of Reads
11.6 years ago
Assa Yeroslaviz ★ 1.8k

Hi everybody,

I have a somewhat unusual problem.

We have a mutated mitochondrial genome (circular) with high coverage (~10,000x). We know that there are a few deletions and insertions in it due to genomic re-organizations, breakdown and re-alignments. We have the mutated line and a wild-type line, and we would like to compare the number of indels between the two lines.

Problem: As a single cell contains mitochondria with different genomes, not all of the mitochondria will carry these deletions. We also estimate that the number of reads that can show these deletions is relatively low. We expect to see big deletions (5-8 kb in size).

Proposed solution: I ran tophat to look for junctions. The junctions mark split reads that were mapped at both ends of a deletion. We did find some very large deletions, but only in very few reads (1-3).
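For anyone wanting to pull such candidates out of the junction output, here is a minimal sketch. It assumes the standard TopHat `junctions.bed` layout (12-column BED with two blocks per junction, the score column holding the number of supporting alignments). The function name `large_junctions`, the 5-8 kb size range and the example lines are made up for illustration, and the sketch ignores junctions that wrap around the origin of the circular genome.

```python
def large_junctions(bed_lines, min_span=5000, max_span=8000):
    """Return (chrom, left, right, span, reads) for junctions whose gap
    falls inside [min_span, max_span], sorted by read support."""
    hits = []
    for line in bed_lines:
        # Skip empty lines, the UCSC track header and comments.
        if not line.strip() or line.startswith(("track", "#")):
            continue
        f = line.rstrip("\n").split("\t")
        chrom, start, score = f[0], int(f[1]), int(f[4])
        block_sizes = [int(x) for x in f[10].rstrip(",").split(",")]
        block_starts = [int(x) for x in f[11].rstrip(",").split(",")]
        left = start + block_sizes[0]    # end of the left overhang
        right = start + block_starts[1]  # start of the right overhang
        span = right - left              # size of the skipped region
        if min_span <= span <= max_span:
            hits.append((chrom, left, right, span, score))
    return sorted(hits, key=lambda h: -h[4])


# Hypothetical example lines, not real data.
example = [
    'track name=junctions description="TopHat junctions"',
    "chrM\t1000\t7100\tJUNC00000001\t3\t+\t1000\t7100\t255,0,0\t2\t50,40\t0,6060",
    "chrM\t200\t400\tJUNC00000002\t120\t+\t200\t400\t255,0,0\t2\t60,60\t0,140",
]
hits = large_junctions(example)
for chrom, left, right, span, reads in hits:
    print(chrom, left, right, span, reads)
```

With a real file you would pass `open("junctions.bed")` instead of the example list. Candidates supported by only 1-3 reads would still need independent confirmation.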

I was wondering whether there are any tools out there that can confirm such results, or maybe do a better indel analysis on a low number of reads.

(I already tried Pindel and it did not report the correct deletions, probably because there were not enough reads.)

Thanks for any help or ideas.

Assa

tophat indel mitochondria • 2.9k views

Try aligning with gmap/gsnap or stampy and see whether that helps.


Thanks, I will.


Hi, I tried gmap/gsnap with little to no success. I couldn't run gmap with the FASTA file converted from FASTQ. With gsnap I found very few hits. Besides, if I understand the papers correctly, it specializes in (much) smaller indels. Quote: "detecting complex variants with four or more mismatches or insertions of 1–9 nt and deletions of 1–30 nt". I, on the other hand, have deletions many kb in length.

11.6 years ago

We found that CREST works quite well. It is designed for cancer, so it allows for heterogeneity (some reads still show the "wild type" version) and it is relatively robust to chimeric reads. What do you mean by a low number of reads?

We developed CNAnorm, a tool for copy number analysis (also designed for cancer) that allows for heterogeneity. It will let you find indels larger than twice your "window", where a window should contain an average of 50 reads. Basically, you divide your genome into a number of windows and look at the depth of coverage. If you use CNAnorm, set

method = 'closest'

in your peakPloidy function, since talking about ploidy for mitochondria does not make sense.
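To illustrate the window idea only (this is not CNAnorm's algorithm or API), here is a rough sketch. It assumes you already have per-base depth as a list of numbers, for example parsed from `samtools depth` output. The function names, the window size and the 0.8 ratio threshold are arbitrary choices for illustration.

```python
def window_means(depth, window):
    """Mean depth of consecutive non-overlapping windows.
    A trailing partial window is dropped."""
    n = len(depth) // window
    return [sum(depth[i * window:(i + 1) * window]) / window for i in range(n)]


def low_windows(means, ratio=0.8):
    """Indices of windows whose mean depth is below ratio * median."""
    s = sorted(means)
    mid = len(s) // 2
    median = s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2
    return [i for i, m in enumerate(means) if m < ratio * median]


# Toy depth profile (made up): a 400 bp region with reduced coverage.
depth = [100] * 1000 + [60] * 400 + [100] * 600
means = window_means(depth, window=200)
flagged = low_windows(means)
print(flagged)
```

A deletion present in only a small fraction of the mitochondria would lower the depth only slightly, so a fixed ratio like this would likely miss it. Comparing the mutant profile against the wild-type line window by window is probably more informative than a single threshold.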

I hope this helps. Let me know.


By a low number of reads I mean 'just' 2-3 reads that confirm these big deletions, at least as far as I have found in the junctions files of tophat. What I don't really understand is why I don't see any big deletions in the deletion file of tophat.

Thanks for the two software tips, I will try them.
