Question: estimating fragment length and SD for Kallisto (single-end QuantSeq reads)
cats_dogs wrote (7 months ago):

Hi all,

Deleted and reposted this to ask an actually sensible question. Sorry to those who've had to see an iteration of this twice.

I am intending to use Kallisto and was looking for an opinion re: estimating fragment length. We used a Lexogen 3’ QuantSeq kit. Here is a sample Bioanalyzer trace of one of the library preps.

[Image: Bioanalyzer trace of library prep]

Using kallisto for 3' QuantSeq was discussed in this paper, where kallisto quant was run with l=100 and s=30. However, when I looked at the GEO page for the prior study whose data the authors used, I did not see these values listed (was this an approximation for convenience?).

Should I use the smear analysis (in red), or the data called for peak 2, the 258 bp peak? The smear analysis boundaries were set manually at our sequencing core, while the peak calling was automated. I am a little more inclined to go with the automated peak calling, but I wanted to verify. Or should I use the approximation from the publication, which gave useful results?

Thank you!


For future reference, you could just edit the existing question; that way everybody who has already responded would be notified.

written 7 months ago by kristoffer.vittingseerup

ah, thanks! noted for the future! no one had replied yet though

written 7 months ago by cats_dogs
michael.ante (Austria/Vienna) wrote (7 months ago):

Hi Cats_dogs,

According to the QuantSeq FAQ page, "mean library sizes of about 335 – 456 bp" result in "mean insert sizes of 203 – 324 bp". Since your average fragment length (including adapters) is 28 bp shorter than the lower end of that range, the insert should have a mean of about 175 bp (203 - 28). Note that the Kallisto manual refers to the average insert size, not the peak insert size.
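The arithmetic above can be written out as a small sketch. It assumes that the difference between library size and insert size in the quoted FAQ ranges (132 bp at both ends) is a fixed adapter contribution, and that the mean library size in question is 307 bp, which is only inferred from the "28 bp shorter" statement. The function name `insert_mean` is made up for illustration.

```python
# Adapter contribution implied by the quoted FAQ ranges:
# library size minus insert size is the same at both ends of the range.
ADAPTER_BP = 335 - 203  # 132 bp; 456 - 324 gives the same value


def insert_mean(library_mean_bp, adapter_bp=ADAPTER_BP):
    """Estimate the mean insert size from a mean library size (both in bp)."""
    return library_mean_bp - adapter_bp


# Assumption: mean library size of 307 bp, i.e. 28 bp below the 335 bp bound.
example_library_mean = 335 - 28
print(insert_mean(example_library_mean))  # 175
```

The same subtraction could be applied per library if the Bioanalyzer means differ between samples.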

The fragments that are actually sequenced may be shorter than the library average due to sequencing bias.

I'd run it once with the published mean of 100 bp and once with the mean set to 175 bp. If you have spike-ins such as the ERCCs, you can compare the measured against the expected concentrations to see which setting fits better.
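A rough sketch of that comparison in Python, not a tested pipeline: it builds the two `kallisto quant` command lines (single-end mode needs `--single` together with `-l` and `-s`) and provides a helper that correlates expected spike-in concentrations with measured abundances on a log scale. The file names, output directories, the SD of 30 for both runs, and the helper names are placeholders and assumptions.

```python
import math
import shlex


def kallisto_quant_cmd(index, fastq, out_dir, frag_mean, frag_sd):
    """Build a single-end `kallisto quant` command as an argument list."""
    return ["kallisto", "quant", "-i", index, "-o", out_dir, "--single",
            "-l", str(frag_mean), "-s", str(frag_sd), fastq]


def log_pearson(expected, measured, pseudocount=1.0):
    """Pearson correlation of log2(expected) vs log2(measured).

    A pseudocount is added so that zero abundances can be logged.
    Raises ZeroDivisionError if either input has no variation.
    """
    x = [math.log2(v + pseudocount) for v in expected]
    y = [math.log2(v + pseudocount) for v in measured]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paths; SD of 30 is taken from the published settings.
for mean in (100, 175):
    cmd = kallisto_quant_cmd("transcripts.idx", "sample.fastq.gz",
                             f"quant_l{mean}", mean, 30)
    print(shlex.join(cmd))
```

After both runs, `log_pearson` could be called with the known ERCC concentrations and the corresponding TPM values from each `abundance.tsv`; the run with the higher correlation would be the better fit by this criterion.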

The standard deviation is harder to estimate. I'd stick to the published values.

Cheers,

Michael


Ah, rats, okay. Thank you, you saved me from a fairly large goof. Do you think it's advisable to adjust the insert size for libraries with different fragment lengths in addition to doing a run with 100 bp for all? Cheers!

written 7 months ago by cats_dogs