Hi all,
I’m analyzing RNA-seq data generated with Illumina short reads from the Singapore Consortium, and I’m interested in using Bambu for transcript variant quantification. I have already used it on Nanopore long-read data, but my target gets very low TPM values there.
Most of the examples and tutorials I’ve seen are based on long-read data (e.g. PacBio or Oxford Nanopore), so I was wondering whether it is possible or appropriate to use Bambu with Illumina short-read data.
P.S. I tried it, and Bambu reports a new model with ROC AUC 0.914 and PR AUC 0.589 on my Illumina data, compared to 0.66 and 0.113 for the ONT pretrained model. Is this performance sufficient for reliable isoform quantification, or should I switch to Salmon?
Did you see the recent thread "Isoform analysis after quantification with Salmon/Star"? It outlines standard methods for short-read data.
Bambu is meant to be used with long reads and does not mention short/Illumina reads anywhere, so this would be an off-label use.
I agree with GenoMax here. While bambu's approach shares several methodological underpinnings with methods used for short-read quantification, the tool is explicitly designed and optimized for long-read assembly and quantification. I would avoid using it for short-read quantification unless the authors explicitly recommend it.
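As a side note on the two metrics quoted in the question: ROC AUC and PR AUC both measure how well a classifier ranks positives above negatives, so on their own they say nothing about how accurate abundance estimates are. The sketch below is only an illustration of how the two numbers can diverge when positives are rare. It is not Bambu's implementation, the function names `roc_auc` and `average_precision` are hypothetical helpers, the scores and labels are made-up toy values, and average precision is used here as a common stand-in for PR AUC.

```python
def roc_auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of precision@k taken at the rank k of each positive (assumes >= 1 positive)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / k
    return total / hits


# Toy data (made up): 2 positives among 10 candidates.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]

print("ROC AUC:", roc_auc(scores, labels))
print("Average precision:", average_precision(scores, labels))
```

With these toy values the ROC AUC comes out at 0.875 and the average precision at 0.75: the two negatives ranked above the second positive cost little in ROC terms but halve the precision at that rank, which is why the PR figure tends to be the more sobering one when positives are rare.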