Question: Average Insert Size When Using Transcript Quantification Tools Such as Salmon?
2.1 years ago, Dru Zod wrote:

I'm currently uploading my dataset to the Gene Expression Omnibus (GEO). The metadata form asks for an average insert size, but I'm not sure where to find this information. I used the transcript quantification tool Salmon in my analysis, and I don't know whether this information even exists, given that I didn't map the reads against a reference genome.

Question: Is it possible to get an average insert size when mapping against a transcriptome using the newer transcript quantification tools such as Salmon?

Any help would be greatly appreciated.

rna-seq salmon next-gen • 1.1k views
modified 2.1 years ago by Rob • written 2.1 years ago by Dru Zod

This doesn't actually answer the question, but the average insert size field is optional, so you can leave it blank if you don't know it.

modified 2.1 years ago • written 2.1 years ago by igor

That's good to know. Thanks.

written 2.1 years ago by Dru Zod
2.1 years ago, Rob (United States) wrote:

Is the data paired-end? If so, salmon estimated the fragment length distribution during quantification. In the quantification directory, the file libParams/flenDist.txt contains the normalized distribution of observed fragment lengths. It holds 1001 numbers, and the number at each position gives the probability of observing a fragment of that length. You can then extract any information you wish (mean, median, standard deviation, etc.) from this distribution.
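As a rough illustration of pulling summary statistics out of that distribution, here is a minimal Python sketch. It assumes the file is a whitespace-separated list of probabilities and that the first value corresponds to fragment length 0 (so 1001 values cover lengths 0 to 1000); both are assumptions worth checking against your own output. The function names and the example path are made up for illustration.

```python
import math
from pathlib import Path


def read_flen_dist(path):
    """Read whitespace-separated probabilities from a flenDist.txt-style file."""
    return [float(value) for value in Path(path).read_text().split()]


def fragment_length_stats(probabilities):
    """Return mean, median and sd of a discrete fragment length distribution.

    Assumption: probabilities[i] is the probability of fragment length i.
    """
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("distribution has no probability mass")
    # Renormalize in case the values do not sum to exactly 1.
    probs = [p / total for p in probabilities]

    mean = sum(length * p for length, p in enumerate(probs))
    variance = sum(p * (length - mean) ** 2 for length, p in enumerate(probs))

    # Median: smallest length whose cumulative probability reaches 0.5.
    cumulative = 0.0
    median = len(probs) - 1
    for length, p in enumerate(probs):
        cumulative += p
        if cumulative >= 0.5:
            median = length
            break

    return {"mean": mean, "median": median, "sd": math.sqrt(variance)}


if __name__ == "__main__":
    # Hypothetical path; replace with your own salmon output directory.
    example = Path("salmon_quant/libParams/flenDist.txt")
    if example.exists():
        print(fragment_length_stats(read_flen_dist(example)))
```

The mean from this calculation is what you would report as the average insert size, keeping in mind that it describes fragment lengths as salmon inferred them from the mapped read pairs.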


Thank you, both for your great work with Salmon and for following up on questions like this!

written 24 months ago by cursons.j

Yes, my data was paired-end. I don't think I have this file saved, but I can re-run the analysis on the same files, take a look, and mark this as the correct answer if it all works. Thanks.

written 2.1 years ago by Dru Zod
2.1 years ago, Devon Ryan (Freiburg, Germany) wrote:

It's highly likely that whoever did the library preparation will have this information. They probably ran the samples on a Bioanalyzer, TapeStation, or something similar and would have these values from it. So ask them.

Alternatively, align a subset to the transcriptome and get the values from that.


If the wet-lab data are unavailable, you could also align a subset of the reads and get the insert size from the paired-end information. Picard's CollectInsertSizeMetrics is an option.
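To show what "insert size from the paired-end information" means in practice, here is a small Python sketch that reads SAM-format text lines and averages the template length (TLEN, column 9) over properly paired primary alignments. It is not how Picard computes its metrics, just a simplified stand-in; the function name and the filtering choices (proper pairs only, positive TLEN so each pair is counted once) are assumptions made for this example.

```python
import statistics

# SAM flag bits defined in the SAM specification.
FLAG_PROPER_PAIR = 0x2
FLAG_SECONDARY = 0x100
FLAG_SUPPLEMENTARY = 0x800


def insert_size_summary(sam_lines):
    """Summarize template lengths (TLEN) from an iterable of SAM text lines."""
    sizes = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        tlen = int(fields[8])
        if not flag & FLAG_PROPER_PAIR:
            continue
        if flag & (FLAG_SECONDARY | FLAG_SUPPLEMENTARY):
            continue
        # TLEN is positive for the leftmost mate and negative for the other,
        # so keeping only positive values counts each pair once.
        if tlen > 0:
            sizes.append(tlen)
    if not sizes:
        raise ValueError("no properly paired alignments with positive TLEN")
    return {
        "pairs": len(sizes),
        "mean": statistics.mean(sizes),
        "median": statistics.median(sizes),
    }
```

You could feed it the lines of a SAM file (for example via `open("subset.sam")`, where the file name is hypothetical). When the reads are aligned to the transcriptome rather than the genome, TLEN is not inflated by introns, which is why the answers above suggest that route.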

written 2.1 years ago by ATpoint

I can check with our university's core facility, but they did give me a file of QC information and the average insert size wasn't included, so I'm not sure I'll have much luck.

written 2.1 years ago by Dru Zod