What is the effect of setting "--fragment-length" (-l) too low or too high for Kallisto single-end quantification, and how could this affect your conclusions?
I've seen some variations of this question on here, but not a clear explanation of what the fragment length argument (which is required when running Kallisto on single-end data) actually does and how it affects quantification.
Some background/motivation: As others have noted, it is sometimes difficult to determine what this length should be, especially for data that you didn't generate yourself. I'm reluctant to simply guess a number, because I've seen quantification results change substantially when I changed the fragment length, and I'm not sure how to interpret that. For example, in one experiment where there was ribosomal RNA contamination in an NEB-Next library, using the Bioanalyzer-derived fragment size (~300 bp) gave an rRNA contamination of about 10% of the reads, but setting the fragment length to 1 gave about 30%, which is closer to what I got by mapping the reads with Bowtie. Why does this happen?
The reason I tried -l 1 is that I found a paper on QuantSeq data that used a fragment length of 1 (-l 1 -s 1), but unfortunately I can no longer find that reference. What is the effect of setting the fragment length to 1, and could this lead to incorrect conclusions?
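For concreteness, the two runs being compared would look roughly like the sketch below. The index name, read file, output directories, and the -s 30 standard deviation for the 300 bp run are placeholders assumed for illustration, not values from the experiment described above. The commands are printed rather than executed so they can be checked first.

```shell
#!/usr/bin/env bash
# Placeholder paths (assumptions); substitute your own index and reads.
INDEX="transcripts.idx"
READS="reads.fastq.gz"

# Run A: mean fragment length taken from the Bioanalyzer trace (~300 bp).
# The -s value of 30 is an assumed placeholder, not a measured value.
RUN_A="kallisto quant -i $INDEX -o out_l300 --single -l 300 -s 30 $READS"

# Run B: the -l 1 -s 1 setting mentioned for QuantSeq data.
RUN_B="kallisto quant -i $INDEX -o out_l1 --single -l 1 -s 1 $READS"

# Print the commands for inspection; nothing is executed here.
echo "$RUN_A"
echo "$RUN_B"
```

The only difference between the two is the pair of -l/-s values, so any difference in the estimated rRNA fraction comes from those arguments alone.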