Question: What Are The Effects Of The TopHat --mate-std-dev Parameter On Finding Fusion Genes
7.5 years ago by
The Earth
samsara600 wrote:

I am new to TopHat. I need to find fusion genes in RNA-seq data, and I am using --mate-inner-dist 130. I am unsure whether to use the default value for --mate-std-dev or to calculate it somehow. If the default is not advisable, how can I calculate a value for --mate-std-dev? And how do these two parameters affect the output, that is, which fusion genes are found?

modified 5.3 years ago by Biostar ♦♦ 20 • written 7.5 years ago by samsara600
7.5 years ago by
Istvan Albert ♦♦ 84k
University Park, USA
Istvan Albert ♦♦ 84k wrote:

TopHat-Fusion uses the interval --mate-inner-dist ± --mate-std-dev when it tries to find fusions.

It is probably not something you can calculate directly; rather, you estimate it from the sample preparation. It should correspond to the expected width of the size histogram of your DNA fragments.

In general I would not fret too much about it. Set it fairly large at first (80 or 100) and see what happens.
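
To make the interval concrete, here is a minimal sketch of the arithmetic only. It is not TopHat-Fusion's internal logic, and the function name `inner_distance_window` is hypothetical. It uses the --mate-inner-dist of 130 from the question and the two suggested --mate-std-dev values.

```python
def inner_distance_window(mate_inner_dist, mate_std_dev):
    """Return the (low, high) bounds of mate_inner_dist +/- mate_std_dev.

    Illustration only: this reproduces the interval described above,
    not how TopHat-Fusion scores or filters read pairs internally.
    """
    low = mate_inner_dist - mate_std_dev
    high = mate_inner_dist + mate_std_dev
    return low, high


if __name__ == "__main__":
    # Values taken from this thread: inner distance 130, std dev 80 or 100.
    for std_dev in (80, 100):
        low, high = inner_distance_window(130, std_dev)
        print(f"--mate-std-dev {std_dev}: inner distances {low} to {high}")
```

A larger --mate-std-dev widens the interval, so more read pairs have an inner distance that falls inside it.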

modified 7.5 years ago • written 7.5 years ago by Istvan Albert ♦♦ 84k
7.5 years ago by
DG7.1k wrote:

A typical way to estimate these values from your data is to take a subset of your reads and align them to a reference transcriptome using BWA. You can then use Picard to calculate the insert size metrics, which give you the mean insert size and its standard deviation.
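
As a rough sketch of how those metrics map onto the TopHat options: assuming the reported insert size is the full fragment length including both reads, the inner distance is the mean insert size minus twice the read length, and the standard deviation carries over unchanged. The function name `estimate_mate_params` and the example numbers are hypothetical.

```python
import statistics


def estimate_mate_params(insert_sizes, read_length):
    """Estimate (mate_inner_dist, mate_std_dev) from observed insert sizes.

    Assumption: each insert size is the full fragment length, i.e. it
    spans both reads, so the inner distance excludes 2 * read_length.
    """
    # Template lengths can be negative for the reverse mate; use magnitudes
    # and skip zeros, which mark pairs without a usable insert size.
    sizes = [abs(s) for s in insert_sizes if s != 0]
    if len(sizes) < 2:
        raise ValueError("need at least two insert sizes to estimate a spread")

    mean_fragment = statistics.mean(sizes)
    std_dev = statistics.stdev(sizes)  # sample standard deviation

    mate_inner_dist = mean_fragment - 2 * read_length
    return round(mate_inner_dist), round(std_dev)


if __name__ == "__main__":
    # Made-up insert sizes for 50 bp paired-end reads.
    inner, std = estimate_mate_params([220, 230, 240], read_length=50)
    print(f"--mate-inner-dist {inner} --mate-std-dev {std}")
```

Aligning to a transcriptome rather than a genome matters here, because introns would otherwise inflate the apparent insert sizes.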

written 7.5 years ago by DG7.1k
