I am currently evaluating miRNA-seq data and would like to present my pipeline for your review. My department has no dedicated bioinformatician for this use case, so I would appreciate feedback on whether this workflow is robust enough to withstand peer review.
Here is an overview of the key steps in my analysis:
1. I use Cutadapt to remove the adapters and the common sequence that is ligated directly to the 3' end of the read.
2. Because I used a miRNA-specific extraction and library kit, I align the reads to a human miRNA reference obtained from RNAcentral (the miRBase data for human). Given the specificity of the kits, this choice is meant to reduce multi-mapping problems.
3. After trying several aligners, I found that STAR gave the most satisfactory results. I am also most familiar with it, so I stayed with STAR. I tuned the settings to optimize the mapping and now get 35-50% uniquely mapped reads, with most of the remaining reads classified as multi-mappers.
4. For the subsequent differentially expressed gene (DEG) analysis with DESeq2, I intend to use only the uniquely mapped reads to limit potential bias.
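
To make the last step concrete, here is a minimal sketch of how uniquely mapped reads could be counted per miRNA from STAR's SAM output. It assumes single-end reads, STAR's default MAPQ of 255 for unique alignments, and a reference whose sequence names are the miRNA identifiers. The function name `count_unique_reads` and the toy records are illustrative only and should be adapted to the actual files.

```python
from collections import Counter
from typing import Iterable

# Assumption: STAR was run with its default setting, where uniquely
# mapped reads receive MAPQ 255.
STAR_UNIQUE_MAPQ = 255


def count_unique_reads(sam_lines: Iterable[str]) -> Counter:
    """Count uniquely mapped primary alignments per reference name."""
    counts: Counter = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        flag = int(fields[1])
        rname = fields[2]
        mapq = int(fields[4])
        if flag & 0x4:
            continue  # read is unmapped
        if flag & 0x100 or flag & 0x800:
            continue  # secondary or supplementary alignment
        if mapq != STAR_UNIQUE_MAPQ:
            continue  # multi-mapper under the MAPQ assumption above
        counts[rname] += 1
    return counts


# Toy records (hypothetical read names and reference names).
toy_sam = [
    "@SQ\tSN:hsa-miR-21-5p\tLN:22",
    "r1\t0\thsa-miR-21-5p\t1\t255\t22M\t*\t0\t0\tTAGCTTATCAGACTGATGTTGA\t*\tNH:i:1",
    "r2\t0\thsa-let-7a-5p\t1\t3\t22M\t*\t0\t0\tTGAGGTAGTAGGTTGTATAGTT\t*\tNH:i:2",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTACGTACGTAC\t*",
]
print(dict(count_unique_reads(toy_sam)))
```

The per-sample counters produced this way could then be merged into a miRNA-by-sample count matrix and passed to DESeq2 as raw integer counts.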