What Are The Effects Of The Tophat --Mate-Std-Dev Parameter On Finding Fusion Genes
11.2 years ago
samsara ▴ 630

I am new to TopHat. I need to find fusion genes in RNA-seq data, and I am using --mate-inner-dist 130. I am unsure whether to use the default value for --mate-std-dev or to calculate it somehow. If the default is not advisable, how can I calculate a value for --mate-std-dev? And how is the output (the fusion genes found) affected by the values of these parameters?

tophat rna-seq fusion • 4.7k views
11.2 years ago

TopHat-Fusion makes use of the interval mate_inner_dist ± mate_std_dev when it tries to find fusions.
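To make the interval concrete, here is a minimal sketch of the arithmetic only. It is not TopHat-Fusion's actual code, and the function name `inner_distance_window` is made up for illustration. With the asker's --mate-inner-dist 130 and a --mate-std-dev of 20 (TopHat's documented default, as far as I know), the window is 110 to 150 bp.

```python
def inner_distance_window(mate_inner_dist, mate_std_dev):
    """Return the (low, high) bounds of mate_inner_dist +/- mate_std_dev.

    Hypothetical helper for illustration; it only shows the interval
    described above, not how TopHat-Fusion uses it internally.
    """
    if mate_std_dev < 0:
        raise ValueError("mate_std_dev must be non-negative")
    return (mate_inner_dist - mate_std_dev, mate_inner_dist + mate_std_dev)


# The asker's inner distance with a std dev of 20, then a larger value of 80
narrow = inner_distance_window(130, 20)
wide = inner_distance_window(130, 80)
print(narrow, wide)
```

A larger --mate-std-dev simply widens this window, so more mate pairs count as consistent with the expected spacing.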

It is probably not something that you can calculate directly, but rather something you estimate from the sample preparation. It should correspond to the expected width of the size distribution (histogram) of your DNA fragments.

In general I would not fret too much about it: set it fairly large at first (80 or 100) and see what happens.

11.2 years ago
DG 7.3k

One typical way to estimate these from your data is to take a subset of your reads and align them to a reference transcriptome using BWA. You can then use Picard to calculate the insert size metrics, which give you the mean insert size and its standard deviation.
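If you want to see what that estimate involves, here is a rough sketch in Python that does the same kind of calculation directly from SAM records. It is not Picard's method (Picard also trims outliers, which this does not), and the function name `estimate_mate_params` and the `read_length` argument are made up for illustration. It assumes the insert size is the SAM TLEN field (column 9), which spans the whole fragment including both reads, so the inner distance is taken as the mean fragment length minus twice the read length.

```python
import statistics


def estimate_mate_params(sam_lines, read_length):
    """Estimate (mate inner distance, std dev) from SAM text lines.

    Hypothetical helper: assumes fixed-length reads and alignments to a
    transcriptome, so TLEN is not inflated by introns.
    """
    fragment_lengths = []
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        flag = int(fields[1])
        tlen = int(fields[8])
        # Keep properly paired records (flag bit 0x2); using only the mate
        # with positive TLEN counts each pair once.
        if flag & 0x2 and tlen > 0:
            fragment_lengths.append(tlen)

    if len(fragment_lengths) < 2:
        raise ValueError("need at least two properly paired fragments")

    mean_fragment = statistics.mean(fragment_lengths)
    std_dev = statistics.stdev(fragment_lengths)  # sample standard deviation
    inner_dist = mean_fragment - 2 * read_length
    return round(inner_dist), round(std_dev)


# Tiny made-up example: three pairs with 100 bp reads
example = [
    "@HD\tVN:1.6",
    "r1\t99\ttx1\t100\t60\t100M\t=\t320\t320\t*\t*",
    "r1\t147\ttx1\t320\t60\t100M\t=\t100\t-320\t*\t*",
    "r2\t99\ttx1\t200\t60\t100M\t=\t430\t330\t*\t*",
    "r3\t99\ttx2\t50\t60\t100M\t=\t290\t340\t*\t*",
]
inner, sd = estimate_mate_params(example, read_length=100)
print(f"--mate-inner-dist {inner} --mate-std-dev {sd}")
```

In practice you would feed it the alignment of a read subset rather than a hand-written list, and a few badly mapped pairs can inflate the standard deviation, so check the histogram before trusting the number.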
