Tool: umitools - working with UMI-incorporated data
9.8 years ago
Joe Brown ▴ 70

availability: https://github.com/brwnj/umitools

umitools facilitates the processing of sequencing data that incorporates unique molecular identifiers (UMIs). It assumes the UMI is part of the read sequence.

Given the IUPAC sequence design of the UMI, strip the UMI from the 5' end of each read in the FASTQ:

umitools trim --end 5 unprocessed_fastq.gz NNNNNV > out.fq
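To make the trimming step concrete, here is a minimal Python sketch of what matching an IUPAC design against the 5' end of a read could look like. It is not the umitools source: the function names and the `:UMI_` read-name suffix are assumptions chosen for illustration, and reads that do not match the design are simply reported as `None`.

```python
import re

# IUPAC nucleotide codes mapped to the bases each code allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def umi_regex(pattern):
    """Compile an IUPAC design such as NNNNNV into a regular expression."""
    return re.compile("".join("[%s]" % IUPAC[c] for c in pattern.upper()))


def trim_5prime(name, seq, qual, pattern):
    """Move a 5' UMI from the sequence onto the read name.

    Returns (name, seq, qual), or None when the start of the read does not
    match the design. The ':UMI_' suffix is a hypothetical naming scheme.
    """
    match = umi_regex(pattern).match(seq)  # re.match anchors at position 0
    if match is None:
        return None
    umi = match.group(0)
    n = len(umi)
    return "%s:UMI_%s" % (name, umi), seq[n:], qual[n:]
```

With the design `NNNNNV`, a read starting with `ACGTAG` passes (the sixth base, G, is allowed by V), while a read starting with `ACGTAT` is rejected because V excludes T.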

The UMI sequence of each read is appended to the read name and used again after the reads are mapped. Reads with duplicate UMIs at any given start site then need to be removed:

umitools rmdup unprocessed.bam out.bam > before_after.bed
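The duplicate-removal idea can be sketched in a few lines of Python. This is an illustration only, not the umitools implementation: it assumes reads have already been reduced to hypothetical `(chrom, start, strand, umi, read_id)` tuples, keeps the first read seen for each start site and UMI, and reports before/after counts per site, which is my guess at the kind of summary `before_after.bed` holds.

```python
from collections import defaultdict


def rmdup(reads):
    """Keep one read per (chrom, start, strand, umi).

    reads: iterable of (chrom, start, strand, umi, read_id) tuples.
    Returns the kept read IDs and, per start site, (before, after) counts.
    """
    seen = set()
    kept = []
    before = defaultdict(int)
    after = defaultdict(int)
    for chrom, start, strand, umi, read_id in reads:
        site = (chrom, start, strand)
        before[site] += 1
        key = site + (umi,)
        if key in seen:
            continue  # same UMI at the same start site: treat as a duplicate
        seen.add(key)
        kept.append(read_id)
        after[site] += 1
    counts = {site: (before[site], after[site]) for site in before}
    return kept, counts
```

Reads that share a start site but carry different UMIs are all kept, which is the point of using UMIs instead of position-only duplicate marking.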

EDIT:

I've updated this to account for mismatches among the UMI sequences observed at a given start site. This lets the user merge very similar UMIs into fewer representative sequences.

umitools rmdup --mismatches 1 unprocessed.bam out.bam > before_after.bed
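One way to picture the merging of similar UMIs is a greedy collapse by Hamming distance. The sketch below is an assumption about how such merging could work, not a description of what `--mismatches` does internally: UMIs at a single start site are visited from most to least frequent, and each one is either merged into the first existing representative within the allowed distance or becomes a new representative. The function names are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def collapse_umis(umis, mismatches=1):
    """Map each observed UMI at one start site to a representative UMI."""
    counts = Counter(umis)
    representatives = []
    assignment = {}
    # Most frequent first; ties broken alphabetically for determinism.
    for umi, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for rep in representatives:
            if len(rep) == len(umi) and hamming(rep, umi) <= mismatches:
                assignment[umi] = rep  # close enough: merge into rep
                break
        else:
            representatives.append(umi)  # nothing close: new representative
            assignment[umi] = umi
    return assignment
```

For example, with three copies of `AAAAAC`, one `AAAAAG`, and one `TTTTTA` at a site, `AAAAAG` merges into `AAAAAC` with one mismatch allowed, while `TTTTTA` stays separate.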

Does umitools support paired-end data? (PE is popular in NGS analysis.)


PE is popular? What are you trying to do? What's your UMI incorporation design?


Hello, in my PE reads, both 1.fq and 2.fq have UMIs.

1.fq: UMI1=============
2.fq: UMI2=============

To take advantage of the UMIs, I need to take both of them into consideration.

So, can umitools solve my problem?


I hit an unexpected problem with this tool: after trimming, the two mates of a paired-end read end up with different names, which causes BWA-MEM to quit. What aligner do you use downstream of umitools that does not require paired reads to have the same name?


I could make this work on PE reads, but it's unclear how I would be counting the UMIs at a given start. Would you want to remove R1s independently of R2s?

If you were interested in sharing data with me I think we can get it worked out. If you've already solved it and made the code available somewhere, I'd love to check it out!
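As a starting point for the paired-end case discussed in this thread, here is a rough Python sketch of one possible approach: concatenate the UMI from each mate and append the same combined tag to both read names, so the mates keep identical IDs for the aligner. This is not something umitools does as far as this post describes. The fixed UMI lengths, the `:UMI_` suffix, and the function names are all assumptions.

```python
def tag_name(name, umi):
    """Tag the read ID itself, leaving any whitespace-delimited comment intact."""
    read_id, sep, comment = name.partition(" ")
    return "%s:UMI_%s%s%s" % (read_id, umi, sep, comment)


def tag_pair(r1, r2, len1, len2):
    """Trim a fixed-length 5' UMI from each mate and tag both with the pair UMI.

    r1, r2: (name, seq, qual) tuples for mate 1 and mate 2.
    len1, len2: assumed UMI lengths on mate 1 and mate 2.
    """
    (name1, seq1, qual1), (name2, seq2, qual2) = r1, r2
    umi = seq1[:len1] + seq2[:len2]  # combined UMI shared by both mates
    return (
        (tag_name(name1, umi), seq1[len1:], qual1[len1:]),
        (tag_name(name2, umi), seq2[len2:], qual2[len2:]),
    )
```

Because both mates receive the same combined tag, duplicates could then be counted per pair rather than for R1 and R2 independently, though whether that is the right definition depends on the UMI incorporation design.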
