miRNA low mapping rates
1 day ago
Ant ▴ 50

Hi everyone,

I'm working on a miRNA-seq experiment using human plasma samples and the QIAseq miRNA Library Kit (Qiagen). My FastQC reports look good, but after trimming, alignment, and running miRDeep2, the number of raw reads passing filters is extremely low, and very few reads align to the mature.fa (I also tried aligning to mature + genome, but again very few raw reads and very few known miRNAs were detected).

In particular, for the miRNAs that were found, the majority of raw reads are shorter than 20 nt. I have tried various parameter adjustments for Cutadapt and Bowtie, but the results do not improve much. I'm concerned I might be making a mistake somewhere in the processing.

Here’s a summary of my workflow using one example sample:

1. Cutadapt trimming:

cutadapt --minimum-length=18 --maximum-length=30 \
    -o example_trimmed.fastq \
    example.fastq


2. Alignment with Bowtie 1:

bowtie -n 0 -l 32 --norc --best --strata -M 5000 --threads 16 \
    -x bowtie_index_hg38 \
    example_trimmed.fastq \
    -S example.sam


3. miRDeep2 analysis:

miRDeep2.pl \
    example_collapsed.fa \
    Genome_Index/hg38.fa \
    $(ls example/*.arf | tr '\n' ',') \
    mature_hsa.fa \
    hairpin_hsa.fa \
    -t hsa

Results for this sample:

Total reads processed: 50,693
Reads that were too short: 41,238 (81.3%)
Reads that were too long: 9 (0.0%)
Reads written (passing filters): 9,446 (18.6%)
Reads aligning to genome: <1%

Another example when I aligned first to mature.fa and then to the genome:

mature.fa:
 reads processed: 128,440
 reads with at least one alignment: 466 (0.36%)
 reads that failed to align: 127,974 (99.64%)
 Reported 476 alignments

genome:
 reads processed: 127,974
 reads with at least one alignment: 16,987 (13.27%)
 reads that failed to align: 110,987 (86.73%)
 Reported 120,986 alignments

I know plasma samples generally have low miRNA content, but compared to other studies using the same Qiagen kit on plasma with their Data Analysis Center, they report much higher raw read counts (see PMC8539647 – supplementary table).

Could I be doing something wrong in the processing steps (Cutadapt, Bowtie, or miRDeep2)? Any insights or suggestions would be greatly appreciated.

So this is a public dataset? QIAseq miRNA libraries may require special handling. Have you seen the QIAseq miRNA analysis manual? https://resources.qiagenbioinformatics.com/manuals/biomedicalgenomicsanalysis/120/index.php?manual=QIAseq_miRNA_Analysis.html
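
As a rough illustration of what "special handling" could mean here: the cutadapt call in the question has no adapter option, so reads are only length-filtered and never adapter-trimmed. The sketch below prints (it does not run) a cutadapt command that removes a 3' adapter before filtering. The adapter sequence is the one commonly cited for the QIAseq miRNA kit and is an assumption to confirm against the kit handbook; the function name `qiaseq_trim_cmd` and the file names are hypothetical.

```shell
#!/usr/bin/env bash
# ASSUMPTION: 3' adapter commonly cited for QIAseq miRNA libraries.
# Confirm it against the kit handbook before relying on it.
ADAPTER_3P="AACTGTAGGCACCATCAAT"

# Print a cutadapt command that trims the 3' adapter, drops reads in which
# no adapter was found, and then applies the 18-30 nt length filter.
# Arguments: input FASTQ, output FASTQ (placeholder names, no spaces).
qiaseq_trim_cmd() {
    local in_fq="$1" out_fq="$2"
    printf 'cutadapt -a %s --discard-untrimmed --minimum-length 18 --maximum-length 30 -o %s %s\n' \
        "$ADAPTER_3P" "$out_fq" "$in_fq"
}

# Inspect the command first; pipe the output to `bash` to actually run it.
qiaseq_trim_cmd example.fastq example_trimmed.fastq
```

Note that in this kit the UMI is reported to sit downstream of the 3' adapter, so trimming with `-a` also discards it; if UMI-based deduplication matters, it would need to be extracted before this step.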

No, it's a personal dataset. I haven’t seen the link, but if I understand correctly, it's not possible to use the software for free, right?

What happens if you remove the --minimum-length requirement from cutadapt and then run FastQC on the result? What size distribution do you get?

I don't propose that you use the output of that run for downstream processing, but it might give you more information about what is happening.

One possibility is a high rate of primer dimers in the sequencing library.
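
If FastQC is not at hand, the same size distribution can be tabulated directly from the FASTQ file. This is a minimal sketch, assuming an uncompressed FASTQ with the standard four lines per record; the function name `fastq_length_hist` and the file name in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Print a read-length histogram (length<TAB>count), sorted by length.
# Assumes 4-line FASTQ records, so the sequence is line 2 of each record.
fastq_length_hist() {
    awk 'NR % 4 == 2 { counts[length($0)]++ }
         END { for (len in counts) printf "%d\t%d\n", len, counts[len] }' "$1" |
        sort -n
}

# Usage (placeholder name for cutadapt output written without --minimum-length):
# fastq_length_hist example_trimmed_nomin.fastq
```

A pile-up at zero or very short lengths, with little signal around 22 nt where mature miRNAs are expected, would be consistent with the primer-dimer explanation.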

Also, can you clarify whether the output above is from cutadapt or from miRDeep2?
