Low miRNA mapping rate (Qiagen miRNA Library Prep)
4 months ago

I have raw data from a library prepared with the Qiagen miRNA library prep kit. According to my understanding and everything I have read online, the reads have the following structure: [biological sequence]-[constant_region]-[umi]-[adapter]. I originally used this:

For UMI extraction and adapter removal:

umi_tools extract --extract-method=regex --stdin=$read1 --bc-pattern=".+(?P<discard_1>AACTGTAGGCACCATCAAT){s<=2}(?P<umi_1>.{12})(?P<discard_2>.*)" --stdout umiE-${sample_id}.fastq.gz
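To sanity-check what this pattern captures, it can help to apply it to a single read outside umi_tools. The following is only a sketch under stated assumptions: it uses Python's standard `re` module, which does not support the fuzzy-match clause `{s<=2}`, so the constant region has to match exactly here. The group name `insert`, the helper `split_read`, and the example read are made up for illustration.

```python
import re

# Same layout as the --bc-pattern above, minus the fuzzy-match clause {s<=2},
# which the standard-library `re` module does not support (assumption: exact match only).
PATTERN = re.compile(
    r"(?P<insert>.+)"                     # biological sequence (unnamed in the original pattern)
    r"(?P<discard_1>AACTGTAGGCACCATCAAT)"  # constant region, discarded
    r"(?P<umi_1>.{12})"                   # 12 nt UMI, moved to the read name by umi_tools
    r"(?P<discard_2>.*)"                  # everything after the UMI, discarded
)

def split_read(seq):
    """Return (insert, umi) if the constant region is found, else None."""
    m = PATTERN.fullmatch(seq)
    if m is None:
        return None
    return m.group("insert"), m.group("umi_1")

# Synthetic read assembled for illustration:
# 22 nt insert + constant region + 12 nt UMI + a few stand-in adapter bases.
insert = "TGAGGTAGTAGGTTGTATAGTT"
umi = "ACGTACGTACGT"
read = insert + "AACTGTAGGCACCATCAAT" + umi + "GGGG"

print(split_read(read))
```

If a large share of raw reads returns no match with a check like this, the constant region or UMI length assumed in the pattern probably does not match the actual library layout.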

and in a later step I had the following:

 fastp -i $umiE_read1 -o "comptrim-${sample_id}.fastq.gz" -A -Q -L --low_complexity_filter

followed by:

trim_galore -q 28 --phred33 --length 16 --basename "$sample_id" --fastqc -o . "$complexity_trimmed_read1"

The percentage of total reads retained is around 58% after everything is said and done. From what I've read, this is typical for miRNA data, since it is mostly adapter. But when I align against miRBase (previously extracted hsa entries), I get low mapping when checking with seqkit. The UMI extraction is based on my understanding of the prep method and on what I've found online. What could be my issue? Help is greatly appreciated.

Qiagen pipeline trimming alignment miRNA • 1.4k views

when I align against miRbase(previously extracted hsa)

What aligner are you using for this? Have you converted the U's in the miRBase sequences to T's before creating the index?


I'm using Bowtie, since it is meant for short reads. After much digging I found that the U-to-T conversion is something I didn't do and need to do, so thank you so much for answering, since that just confirms it. I'll get back with results after I do it in the morning. I believe this should help:

sed '/^[^>]/s/U/T/g'
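For anyone who prefers to do the same conversion in a script, here is a minimal Python sketch of what the sed expression does: it replaces U with T on sequence lines only and leaves header lines (those starting with `>`) untouched. The function name `rna_to_dna_fasta` and the example record are made up for illustration.

```python
def rna_to_dna_fasta(fasta_text):
    """Replace U with T on sequence lines; header lines starting with '>' are left as-is."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            out.append(line)  # keep headers unchanged, even if they contain a 'U'
        else:
            out.append(line.replace("U", "T"))
    return "\n".join(out)

# Made-up record; the header deliberately contains a capital U.
example = ">example-miR-U1\nUGAGGUAGUAGGUUGUAUAGUU\n"
print(rna_to_dna_fasta(example))
```

Like the sed command, this only touches uppercase U, so lowercase sequence characters (if any) would pass through unchanged.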


So I got some alignment, but basically nothing. I used samtools to check the alignment and I see 0.31%. I changed U to T and then built the Bowtie index. Will touch base if I figure out the issue.


Can you post examples of a few reads that have survived the pre-processing you did?

@LH00150:566:22TM3VLT3:6:1101:51626:1064_CTTTCACACGCG 1:N:0:GTCAGACCTA+CTCCTCGTAT
GNGTGAAAGTAGGTCATCGTCAGAC
+
I#IIIII9II9IIIIIIIIIIIIII

@LH00150:566:22TM3VLT3:6:1101:48695:1144_ATTCCCCGTGCC 1:N:0:GTCAGACCTA+CGCCTCGTAT
TGTCAATTTCGAGGGGCGAAAA
+
IIIIIIIIII-IIIIIIIIIII

@LH00150:566:22TM3VLT3:6:1101:33339:1096_ACCACCGCCAGG 1:N:0:GTCAGACCTA+CTCCTCTTAT
GTGAAAGTAGGTCATCGTCAGAC
+
IIIIIIIIIIIII9IIIIIIIII

@LH00150:566:22TM3VLT3:6:1101:34689:1096_CGACACATGACG 1:N:0:GTCAGACCTA+CTCCTCGGAT
CTCGCAGTCGGGGTTC
+
IIIIIIIIIIIIIIII

@LH00150:566:22TM3VLT3:6:1101:36002:1096_TAGGATCGCCCG 1:N:0:GTCAGACCTA+CTCCGCGTAT
GCAAGCGGCGGAGCATGTGGATTA
+
IIIIIIIIII-9IIIIIIIIIIII

I posted five example reads. Again, thank you so much for helping me navigate this. I'm currently trying other options, like aligning against the genome instead of miRBase.


Mature miRNAs should be about 21-22+ nt long. Looks like you have some shorter sequences in there as well.
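A quick way to see how many trimmed reads fall in a plausible miRNA size range is to tabulate read lengths. The sketch below assumes uncompressed 4-line FASTQ text; the function names and the default 20-24 nt window are illustrative assumptions, not values taken from the kit documentation.

```python
from collections import Counter

def read_length_histogram(fastq_text):
    """Count read lengths from uncompressed 4-line FASTQ text."""
    lines = [ln for ln in fastq_text.splitlines() if ln.strip()]
    # In 4-line FASTQ, the sequence is the second line of each record.
    return Counter(len(seq) for seq in lines[1::4])

def fraction_in_range(hist, lo=20, hi=24):
    """Fraction of reads whose length lies in [lo, hi] (window is an assumption)."""
    total = sum(hist.values())
    in_range = sum(n for length, n in hist.items() if lo <= length <= hi)
    return in_range / total if total else 0.0

# Made-up reads: two of 22 nt and one of 16 nt.
fastq = (
    "@r1\n" + "A" * 22 + "\n+\n" + "I" * 22 + "\n"
    "@r2\n" + "C" * 22 + "\n+\n" + "I" * 22 + "\n"
    "@r3\n" + "G" * 16 + "\n+\n" + "I" * 16 + "\n"
)
hist = read_length_histogram(fastq)
print(sorted(hist.items()), fraction_in_range(hist))
```

A length distribution with most reads well below about 20 nt would suggest over-trimming or a large fraction of degradation fragments rather than mature miRNAs.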


Hello, thank you both for helping me. I ended up aligning against the genome and got the following:

72352566 + 0 in total (QC-passed reads + QC-failed reads)
72352566 + 0 primary

63258933 + 0 mapped (87.43% : N/A)
63258933 + 0 primary mapped (87.43% : N/A)

It finally worked. You don't understand how good this feels. Thank you so much.


You will need to see what you get in terms of counts, but the reads should also align against miRBase. You will need to allow the reads to multi-map, as suggested by colindaven below. Use bowtie v.1.x.

I tried to look up the recommended data analysis protocol for this kit but it appears that Qiagen wants you to use some web-based pipeline that I suppose you get a license for when you buy this kit. No details were readily available.


Yeah, that's exactly the issue I've been having: there is hardly anything on this kit, probably because they want you to go through their pipeline (I'm doing this as a personal project). I tried Bowtie v1 and also allowed v2 with -m 5000 (the -m value was based on a UCSC pipeline). I got 70.53% alignment, which is not as good, but I would rather take a more stringent approach. I'll let you know how it progresses. Thank you so much for all the help and the depth you go into. It is truly and greatly appreciated. I can't wait to see my next steps and the results. Also, regarding the short sequences: I set the lower length limit to 16.


A 70% alignment rate should be fine for miRNA. You can move on to the remainder of the analysis.

3 months ago

This might be a mapping quality issue, since many identical sequences are in miRBase.

Suggestions

  • Align against the genome the reads were generated from (human?). Quantify later using a miRNA annotation file (GTF, GFF3, etc.) with featureCounts or a similar tool.
  • Set the maximum number of mappings to a very high value in bowtie (version 1 or 2, by the way?). Remove any mapping quality (MAPQ) filters.
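
The point about identical sequences can be checked directly on the reference FASTA before building the index. The following is a rough Python sketch, assuming each record's sequence sits on a single line; the function name `shared_sequences` and the example records are hypothetical.

```python
from collections import defaultdict

def shared_sequences(fasta_text):
    """Map each sequence that appears under more than one header to the list of those headers.

    Assumes one sequence line per record (no line-wrapped sequences).
    """
    by_seq = defaultdict(list)
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # first token of the header as the record name
        elif name is not None:
            by_seq[line].append(name)
    return {seq: names for seq, names in by_seq.items() if len(names) > 1}

# Made-up records: two entries share the same sequence, one is unique.
fasta = (
    ">example-miR-A\nTGAGGTAGTAGGTTGTATAGTT\n"
    ">example-miR-B\nTGAGGTAGTAGGTTGTATAGTT\n"
    ">example-miR-C\nTAGCTTATCAGACTGATGTTGA\n"
)
print(shared_sequences(fasta))
```

A read derived from a shared sequence has several equally good hits, so a MAPQ filter or a low cap on reported alignments can drop it even though it matches the reference perfectly.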

I'm aligning against the human genome as we speak; hopefully I get a slight improvement. Will follow up. Thank you so much for taking the time to answer and help. Also, I'm using bowtie 1.
