Question

mirdeep2: UMI information lost when reads are collapsed

5.3 years ago

Rituriya ▴ 40

Hi All,

I have UMI-extracted miRNA reads that I want to analyse with miRDeep2 for known and novel miRNA discovery. But when I give these as input, mapper.pl collapses the reads drastically, bringing 7 million reads down to 1200 reads with a count tag on each read.

My questions:

1) Is there a way to retain the UMI information in the FASTQ header, so that deduplication can be done later, after alignment by miRDeep2?

2) Has anyone successfully analyzed UMI-tagged miRNA data using miRDeep2?

Thank you, Pratibha.

mirdeep2 umi mirna • 2.1k views

You might need to align to a genome with a regular aligner, run something like UMI-tools on that alignment to deduplicate it, then put the deduplicated reads through miRDeep2.

5.3 years ago by swbarnes2 14k

That was exactly my initial thought, swbarnes2. But to do that, mirdeep2.pl requires both a reads_collapsed.fa file and an .arf file as mandatory inputs. Should I convert the bowtie output to ARF format with this command:

convert_bowtie_output.pl reads_vs_refdb.bwt > reads_vs_refdb.arf

Let me try and see if it works downstream. Do let me know if you think otherwise.

Thanks, Rituriya.

5.3 years ago by Rituriya ▴ 40

I found that bwa_sam_converter.pl is exactly what I need, but it does not retain the UMI tag in the header (so I am back to square one and will have to add it programmatically). I am also unable to run the Perl script:

./bwa_sam_converter.pl -i /home/xyz/aligned-bowtie1.sam -o xyz_reads_collapsed.fa -a reads_aligned-bowtie1.arf -c

It always throws the error "Sam file not found", even though I checked that the file is there. I also learnt from this link that not all SAM files will work, but I need to know which SAM fields must be present or absent. Any ideas?

5.3 years ago by Rituriya ▴ 40
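One possible way to do the "add it programmatically" step is to deduplicate by UMI before collapsing, so that the count tag on each collapsed read already reflects unique molecules rather than PCR copies. The Python sketch below is only an illustration under stated assumptions, not a tested miRDeep2 workflow: it assumes the reads are already adapter-trimmed, that the UMI is the last underscore-separated field of the read name (the layout `umi_tools extract` writes), and that collapsed IDs should look like `>seq_0_x2` (three-letter prefix, running index, `_x` plus count). The function names `parse_fastq` and `collapse_by_umi` are hypothetical helpers, and the demo reads are made up.

```python
from collections import defaultdict
from io import StringIO


def parse_fastq(handle):
    """Yield (read_name, sequence) from a plain 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header.strip():
            break  # end of file (or a trailing blank line)
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        handle.readline()  # quality line, ignored
        # Drop the leading '@' and anything after the first whitespace.
        yield header.strip()[1:].split()[0], seq


def collapse_by_umi(records, prefix="seq"):
    """Collapse identical sequences, counting distinct UMIs instead of reads.

    Assumption: the UMI is the text after the last '_' in the read name.
    No UMI error correction is attempted; UMIs must match exactly.
    """
    umis_per_seq = defaultdict(set)
    for name, seq in records:
        umi = name.rsplit("_", 1)[-1]
        umis_per_seq[seq].add(umi)

    # Most abundant sequences first; ties broken alphabetically for stable output.
    ranked = sorted(umis_per_seq.items(), key=lambda kv: (-len(kv[1]), kv[0]))

    lines = []
    for index, (seq, umis) in enumerate(ranked):
        lines.append(f">{prefix}_{index}_x{len(umis)}")
        lines.append(seq)
    return "\n".join(lines) + "\n"


def _record(name, seq):
    """Build one FASTQ record with a dummy quality string (demo only)."""
    return f"@{name}\n{seq}\n+\n{'I' * len(seq)}\n"


# Made-up demo input: three reads of one sequence (two share a UMI), one of another.
demo_fastq = (
    _record("read1_AACG", "TGAGGTAGTAGGTTGTATAGTT")
    + _record("read2_AACG", "TGAGGTAGTAGGTTGTATAGTT")
    + _record("read3_CCTA", "TGAGGTAGTAGGTTGTATAGTT")
    + _record("read4_GGAT", "TAGCTTATCAGACTGATGTTGA")
)

collapsed = collapse_by_umi(parse_fastq(StringIO(demo_fastq)))
print(collapsed, end="")
```

In the demo, the first sequence is seen three times but with only two distinct UMIs, so it is written with `_x2`. Note that this deduplicates on exact sequence plus exact UMI, which is stricter than position-based deduplication after alignment and ignores sequencing errors in the UMI, so the counts may differ from what UMI-tools would report on an alignment.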