Tool: MAGERI: a software tool for calling rare variants and detecting circulating tumor DNA from UMI-tagged high-throughput sequencing data
2.1 years ago by
Czech Republic, Brno, CEITEC
mikhail.shugay3.3k wrote:

Dear Colleagues,

I would like to announce our recently published software tool, MAGERI, which is designed to facilitate the detection of ultra-rare variants in various kinds of high-throughput sequencing datasets prepared using molecular barcoding technology.

The ability to detect ultra-rare variants with a frequency of ~0.1% in the sample is a key requirement for successful circulating tumor DNA screening and for studying rare tumor subpopulations and rare drug-resistant variants in viral populations.

However, even for sequencing datasets of top-tier quality, the sequencing error rate is far higher than accurate calling of such rare variants can tolerate. The recent development of molecular barcoding technology makes it possible to eliminate sequencing errors by tagging each input molecule with a unique molecular identifier (UMI) [Marx. Nature Methods 2017]. UMI-tagged read groups can then be assembled into consensus sequences, which corrects sequencing errors. Still, residual PCR errors introduced during the first PCR cycles and during UMI tag attachment can decrease the accuracy of variant calling. Moreover, to the best of my knowledge, there has so far been no dedicated variant caller that can model error rates in UMI-tagged read group consensus sequences. MAGERI aims to solve this problem by implementing a consensus assembly, alignment and variant calling pipeline optimized for UMI-tagged data [Shugay et al. PLoS Comput Biol 2017].
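To illustrate the general idea of consensus assembly, here is a minimal Python sketch that groups reads by UMI and takes a per-position majority vote. It is not MAGERI's actual algorithm: the function name `assemble_consensuses`, the `min_reads` threshold, the toy reads and the truncation to the shortest read are all assumptions made for illustration, and real pipelines also handle quality scores, indels and errors in the UMI itself.

```python
from collections import Counter, defaultdict


def assemble_consensuses(tagged_reads, min_reads=2):
    """Group (umi, sequence) pairs by UMI and build a majority-vote consensus.

    Hypothetical helper for illustration only; not part of MAGERI.
    """
    groups = defaultdict(list)
    for umi, seq in tagged_reads:
        groups[umi].append(seq)

    consensuses = {}
    for umi, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few reads to out-vote a sequencing error
        # Simplification: compare only up to the shortest read in the group.
        length = min(len(s) for s in seqs)
        consensuses[umi] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensuses


# Toy input: (UMI, read sequence) pairs.
reads = [
    ("AACGT", "ACGTACGT"),
    ("AACGT", "ACGTACGT"),
    ("AACGT", "ACGAACGT"),  # one read carries an error at position 3
    ("GGTCA", "ACGTTCGT"),
    ("GGTCA", "ACGTTCGT"),
    ("TTTTT", "ACGTACGT"),  # single-read UMI, dropped by min_reads
]
result = assemble_consensuses(reads)
print(result)
```

In this toy input the isolated error in the first UMI group is out-voted, while the difference carried by every read of the second group survives into its consensus. Errors that arise in early PCR cycles behave like the second case, which is why consensus assembly alone does not remove them.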

Note that the datasets containing rare variants with known frequency and a control dataset from healthy donor plasma DNA are publicly available at SRA; see this repository for metadata and analysis scripts/templates. We hope that these benchmark datasets will be of use to the community, especially to researchers developing software tools for UMI-tagged data processing and rare variant calling.

modified 4 months ago by alons270 • written 2.1 years ago by mikhail.shugay3.3k

Hi, I am new here and I am interested in using MAGERI to reanalyze my data, which were generated with a Thermo Fisher Ion Torrent targeted panel that uses molecular tags to detect variants. I am not a bioinformatics expert, so I would be glad to find the simplest way to apply the tool. Could you suggest all the steps necessary to make it work? Thank you very much. Giusy

written 6 months ago by giuseppa.deluca0

Hi Mikhail!

We're trying to run MAGERI on one of our cloud machines. However, it seems to run out of memory.

We're looking to upgrade the machine. What are the minimum and optimal system requirements for your software?

Thank you very much in advance!

Best, Alon

written 4 months ago by alons270

Dear Alon,

MAGERI loads all reads into memory for consensus assembly, so we've tested it on servers with 64 GB of RAM for HiSeq analysis. The answer basically depends on the structure of your dataset: the number of reads, the number of unique UMI tags and the reads-per-UMI distribution (e.g. a dataset in which all UMIs are uniformly covered behaves differently from one with low coverage for most UMIs and a single UMI with a million associated reads). You can also try our new MiNNN software, which is in a beta stage right now. If you have any questions, feel free to mail me (my userid at gmail dot com).
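As a rough way to inspect the dataset structure described above, the following Python sketch summarizes the reads-per-UMI distribution from a list of UMI tags. The function `umi_coverage_summary` and the toy inputs are hypothetical, and how UMIs are extracted from reads depends on the library design and is not shown; the sketch only describes the data and does not estimate MAGERI's memory use.

```python
from collections import Counter
from statistics import median


def umi_coverage_summary(umis):
    """Summarize how reads are distributed across UMIs.

    `umis` holds one UMI string per read. Hypothetical helper, not part of MAGERI.
    """
    sizes = sorted(Counter(umis).values())
    total = sum(sizes)
    return {
        "total_reads": total,
        "unique_umis": len(sizes),
        "median_reads_per_umi": median(sizes),
        "max_reads_per_umi": sizes[-1],
        "largest_umi_fraction": sizes[-1] / total,
    }


# Two toy datasets with the same number of reads and UMIs but different structure.
uniform = ["A"] * 5 + ["B"] * 5 + ["C"] * 5 + ["D"] * 5
skewed = ["A"] * 17 + ["B", "C", "D"]

uniform_summary = umi_coverage_summary(uniform)
skewed_summary = umi_coverage_summary(skewed)
print(uniform_summary)
print(skewed_summary)
```

Both toy datasets have the same total read count and number of unique UMIs, yet in the skewed one a single UMI holds most of the reads. Looking at the median and the largest group together is one simple way to tell the two situations apart.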

written 4 months ago by mikhail.shugay3.3k

Thank you very much Mikhail, we will look into it!

written 4 months ago by alons270

Hi Mikhail, I am doing a study to determine the frequency of rare SNPs in ctDNA. I am specifically looking at mutations in 4 genes. I am preparing the NGS libraries using the Takara SMARTer ThruPLEX Tag-Seq kit and using a probe-based capture method to isolate the 8 exons of interest (~4.5 kb total). I was hoping to use your software, but the library kit uses UMIs that consist of degenerate bases, and I understand from Clement et al. (2018) that MAGERI doesn't allow degenerate bases in the UMIs. Is this correct? Many thanks for your help. Seanna

written 3 months ago by seanna.mctaggart0
