Hi everyone,
I'm working on a miRNA-seq experiment using human plasma samples and the QIAseq miRNA Library Kit (Qiagen). My FastQC reports look good, but after trimming, alignment, and running miRDeep2, the number of reads passing filters is extremely low, and very few reads align to mature.fa. I also tried aligning to mature.fa first and then to the genome, but again very few reads aligned and very few known miRNAs were detected.
In particular, for the miRNAs that were found, most of the reads are shorter than 20 nt. I have tried various parameter adjustments for Cutadapt and Bowtie, but the results do not improve much. I'm concerned I might be making a mistake somewhere in the processing.
Here’s a summary of my workflow using one example sample:
1. Cutadapt trimming:
cutadapt --minimum-length=18 --maximum-length=30 \
-o example_trimmed.fastq \
example.fastq
2. Alignment with Bowtie 1:
bowtie -n 0 -l 32 --norc --best --strata -M 5000 --threads 16 \
-x bowtie_index_hg38 \
example_trimmed.fastq \
-S example.sam
3. miRDeep2 analysis:
miRDeep2.pl \
example_collapsed.fa \
Genome_Index/hg38.fa \
$(ls example/*.arf | tr '\n' ',') \
mature_hsa.fa \
hairpin_hsa.fa \
-t hsa
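Note that the Cutadapt call in step 1 only filters by length and passes no adapter sequence, so one thing that could be checked is whether the raw reads still contain the kit's 3' adapter. A minimal sketch, assuming an uncompressed FASTQ with four lines per record; the function name count_adapter_reads is just illustrative, and the adapter sequence is a placeholder argument that would have to come from the kit handbook:

```shell
#!/usr/bin/env bash
# Count how many reads in a FASTQ contain a given adapter sequence.
# $1 = FASTQ path (uncompressed, 4 lines per record; assumption)
# $2 = adapter sequence (placeholder; take it from the kit documentation)
count_adapter_reads() {
    awk -v a="$2" '
        NR % 4 == 2 {                      # sequence line of each record
            total++
            if (index($0, a) > 0) hits++   # adapter found anywhere in the read
        }
        END { printf "%d of %d reads contain %s\n", hits, total, a }
    ' "$1"
}
```

It would be called as count_adapter_reads example.fastq ADAPTER_SEQUENCE. If most raw reads contain the adapter, the length filter is being applied to untrimmed reads.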
Results for this sample:
Total reads processed: 50,693
Reads that were too short: 41,238 (81.3%)
Reads that were too long: 9 (0.0%)
Reads written (passing filters): 9,446 (18.6%)
Reads aligning to genome: <1%
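With --minimum-length=18, the "too short" count means reads under 18 nt in the file given to Cutadapt. To see where those reads sit, the read lengths in the input FASTQ can be tallied. A minimal sketch, assuming an uncompressed FASTQ with four lines per record; the function name fastq_length_hist is just illustrative:

```shell
#!/usr/bin/env bash
# Tally read lengths in a FASTQ file.
# Prints "length<TAB>count", sorted by length.
# $1 = FASTQ path (uncompressed, 4 lines per record; assumption)
fastq_length_hist() {
    awk '
        NR % 4 == 2 { n[length($0)]++ }    # sequence line of each record
        END { for (l in n) print l "\t" n[l] }
    ' "$1" | sort -n
}
```

It would be called as fastq_length_hist example.fastq. A peak well below 18 nt in the untrimmed file would match the 81.3% figure above.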
Another example when I aligned first to mature.fa and then to the genome:
mature.fa:
reads processed: 128,440
reads with at least one alignment: 466 (0.36%)
reads that failed to align: 127,974 (99.64%)
Reported 476 alignments
genome:
reads processed: 127,974
reads with at least one alignment: 16,987 (13.27%)
reads that failed to align: 110,987 (86.73%)
Reported 120,986 alignments
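The "reads with at least one alignment" figures can be cross-checked against a SAM file such as the example.sam from step 2. A minimal sketch, assuming a standard SAM file in which column 2 is the FLAG and bit 0x4 marks an unmapped read; the function name count_aligned_reads is just illustrative:

```shell
#!/usr/bin/env bash
# Count distinct read names with at least one mapped record in a SAM file.
# $1 = SAM path (header lines start with "@"; column 2 is the FLAG)
count_aligned_reads() {
    awk '
        # skip header lines; keep records whose FLAG has bit 0x4 (unmapped) unset
        !/^@/ && int($2 / 4) % 2 == 0 { seen[$1] = 1 }
        END { n = 0; for (r in seen) n++; print n }
    ' "$1"
}
```

It would be called as count_aligned_reads example.sam. Reads reported at several positions are counted once, so the result should be comparable to the "at least one alignment" line rather than to the "Reported ... alignments" line.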
I know plasma samples generally have low miRNA content, but other studies that used the same Qiagen kit on plasma and analyzed the data with Qiagen's Data Analysis Center report much higher raw read counts (see PMC8539647 – supplementary table).
Could I be doing something wrong in the processing steps (Cutadapt, Bowtie, or miRDeep2)? Any insights or suggestions would be greatly appreciated.