I have cDNA data comparing several mutants to a control (3+ replicates each), generated with a modified version of the direct cDNA sequencing kit (SQK-DCS109) from ONT.
Because of the modified protocol, we’ve gotten rather low yields of usable reads for downstream analyses. For example, the control replicates range from only 80k to 290k total reads. Of the 20k genes detected in my dataset (those with at least one read), ~14k have fewer than 20 reads in each sample.
We would like to do a DE analysis between the control and mutant samples; however, I’m not sure what the best practices are for datasets with such low total counts. In the past, with Illumina data, I used edgeR. For this ONT dataset, I thought that setting min.count = 1 in edgeR::filterByExpr (instead of the default of 10) would be appropriate, but I’m not entirely sure.
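To make the question concrete, here is a rough sketch of the filtering step I have in mind. The count matrix is simulated placeholder data (not my real counts), and the names `counts`, `group`, `keep_default` and `keep_relaxed` are just for illustration. It only compares how many genes survive the default filter versus min.count = 1; it is not meant as a recommended workflow.

```r
library(edgeR)

# Placeholder data: simulated low-depth counts, NOT my real dataset.
set.seed(1)
counts <- matrix(
  rnbinom(2000 * 6, mu = 5, size = 2),
  nrow = 2000,
  dimnames = list(paste0("gene", 1:2000), paste0("s", 1:6))
)
group <- factor(rep(c("control", "mutant"), each = 3))

y <- DGEList(counts = counts, group = group)

# Default filter (min.count = 10) versus the relaxed one I am considering.
# min.total.count is left at its default (15) in both calls.
keep_default <- filterByExpr(y, group = group)
keep_relaxed <- filterByExpr(y, group = group, min.count = 1)

# Cross-tabulate which genes pass each filter.
print(table(default = keep_default, relaxed = keep_relaxed))

# Keep the relaxed set, recompute library sizes, then normalise.
y <- y[keep_relaxed, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)
```

As far as I understand, filterByExpr converts min.count into a CPM cutoff based on the median library size, so with libraries this small the default already corresponds to a fairly high CPM threshold. That is part of why I am unsure whether simply lowering it is sound.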
I’m currently trying ONT’s epi2me-labs/wf-transcriptomes pipeline, but I’m not sure it is the best option for low-yield data. Does anyone have experience with, or references on, DE analysis of low-yield ONT data?