I have a set of RNA-Seq samples sequenced on a HiSeq 2000 (2x100 bp) and another set sequenced on a NextSeq 500 (2x150 bp or 2x76 bp, depending on the sample). The sequencing depth varies from 25M to 60M on average. All libraries were prepared using the same kit (Ribo-Zero). Only the sequencing kit differs between the HiSeq and the NextSeq.
I ran kallisto on all these samples and now want to check the expression of a specific gene across the samples. Is it correct to use the TPM from kallisto to compare the samples, or should I use another metric, or maybe add an additional per-sample normalization step?
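For context, kallisto reports TPM per transcript (the tpm column of abundance.tsv), so looking at a single gene usually means summing the TPM of its transcripts first. Below is a minimal Python sketch of that step for one sample. The transcript-to-gene mapping (tx2gene), the identifiers, and the toy table are made up for illustration; only the column names follow kallisto's abundance.tsv layout.

```python
import csv
import io
from collections import defaultdict


def gene_tpm(abundance_tsv, tx2gene):
    """Sum transcript-level TPM into gene-level TPM for one sample.

    abundance_tsv: open file-like object in kallisto's abundance.tsv layout
                   (columns: target_id, length, eff_length, est_counts, tpm).
    tx2gene:       dict mapping transcript ID -> gene ID (hypothetical here;
                   in practice it would come from the annotation).
    """
    totals = defaultdict(float)
    for row in csv.DictReader(abundance_tsv, delimiter="\t"):
        gene = tx2gene.get(row["target_id"])
        if gene is not None:  # skip transcripts missing from the mapping
            totals[gene] += float(row["tpm"])
    return dict(totals)


# Toy input standing in for one sample's abundance.tsv (values are invented).
toy_abundance = io.StringIO(
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "tx1\t1000\t850\t120\t10.5\n"
    "tx2\t1500\t1350\t80\t4.5\n"
    "tx3\t800\t650\t15\t2.0\n"
)
toy_tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

gene_level = gene_tpm(toy_abundance, toy_tx2gene)
print(gene_level)
```

Running this once per sample gives one gene-level TPM value per sample for the gene of interest, which is the quantity the question is about comparing.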
Indeed, I could do a PCA plot and see whether my samples cluster by sequencer. I'll try it.
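One way to run that check, sketched in Python with NumPy. This assumes a samples-by-genes TPM matrix has already been assembled; the log2(TPM + 1) transform, the random toy matrix, and the sequencer labels are all illustrative assumptions, not something from this thread's data.

```python
import numpy as np


def pca_scores(tpm, n_components=2):
    """PCA of a samples x genes TPM matrix via SVD.

    Returns (scores, explained_variance_ratio) for the leading components.
    """
    logged = np.log2(np.asarray(tpm, dtype=float) + 1.0)  # damp highly expressed genes
    centered = logged - logged.mean(axis=0)               # center each gene across samples
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]       # sample coordinates on the PCs
    var_ratio = (s ** 2) / np.sum(s ** 2)
    return scores, var_ratio[:n_components]


# Toy matrix: 6 samples x 50 genes of random values (NOT real expression data).
rng = np.random.default_rng(0)
toy_tpm = rng.gamma(shape=2.0, scale=10.0, size=(6, 50))
sequencer = ["HiSeq"] * 3 + ["NextSeq"] * 3  # hypothetical labels

scores, ratio = pca_scores(toy_tpm)
for label, (pc1, pc2) in zip(sequencer, scores):
    print(f"{label}\tPC1={pc1:.2f}\tPC2={pc2:.2f}")
```

If the samples separate by sequencer along PC1 or PC2 rather than by biological condition, that would point to a batch effect worth handling before comparing the gene across samples.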
Question: I didn't run the bootstrap (the -b option in kallisto). Should I?
I've not tested its effect in DESeq2, to be honest, but I noticed the other day, while looking through the tximport package, that there's a parameter called varReduce, which I think uses the bootstraps. You'll be fine not using bootstraps.
I tested a PCA plot using the TPM, and it seems OK to compare samples sequenced on different sequencers (at least in my case).