Question: Tophat Parameters Misunderstanding
7.3 years ago, Nicolas Rosewick (Belgium, Brussels) wrote:

Hi,

I am having trouble understanding several TopHat parameters:

-r/--mate-inner-dist <int> This is the expected (mean) inner distance between mate pairs. For example, for paired end runs with fragments selected at 300bp, where each end is 50bp, you should set -r to be 200. There is no default, and this parameter is required for paired end runs.

--mate-std-dev <int> The standard deviation for the distribution on inner distances between mate pairs. The default is 20bp.

Are these the mean and standard deviation of the distance between the two paired-end reads on the cDNA?

pair 1 ------------->
cDNA   ----------------------------------------
pair 2                         <---------------
                     <-------->
                 inter-pair distance

Or is it the size of the cDNA between the sequencing adapters?

adapter 5' -------------
cDNA                    -----------------------
adapter 3'                                     ---------------
Library    ---------------------------------------------------

Can anyone also explain these two parameters to me:

--closure-search Enables the mate pair closure-based search for junctions. Closure-based search should only be used when the expected inner distance between mates is small (<= 50bp)

--coverage-search Enables the coverage based search for junctions. Use when coverage search is disabled by default (such as for reads 75bp or longer), for maximum sensitivity.

Thanks a lot !

N.


As of TopHat 1.3.2, -r is no longer required. The release notes say: "Deprecated -r as a required parameter (defaults to 50)". I think the manual is out of date.

Written 7.3 years ago by David Quigley
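The manual's worked example for -r quoted in the question (300bp fragments, 50bp reads, so -r 200) suggests that the inner distance is the gap between the two mates on the cDNA fragment, adapters excluded, which matches the first diagram in the question. Here is a minimal Python sketch of that arithmetic, assuming the fragment size is measured without adapters. The function names and the sample fragment sizes are made up for illustration and are not part of TopHat.

```python
import statistics


def mate_inner_distance(fragment_size, read_length):
    """Gap between the two mates: fragment size minus both read lengths.

    Assumes fragment_size excludes the sequencing adapters.
    A negative result means the two mates overlap.
    """
    return fragment_size - 2 * read_length


def inner_distance_stats(fragment_sizes, read_length):
    """Mean and sample standard deviation of the inner distance.

    fragment_sizes is a hypothetical list of observed fragment sizes,
    e.g. taken from a library size profile.
    """
    inner = [mate_inner_distance(size, read_length) for size in fragment_sizes]
    return statistics.mean(inner), statistics.stdev(inner)


# The manual's example: 300bp fragments with 50bp reads.
print(mate_inner_distance(300, 50))

# Illustrative fragment sizes, not real data.
mean_inner, sd_inner = inner_distance_stats([280, 300, 320], 50)
print(mean_inner, sd_inner)
```

The two values from `inner_distance_stats` are what you would pass, rounded to integers, as -r and --mate-std-dev.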

Regarding the --coverage-search parameter: in eukaryotes, mRNAs undergo splicing. The regions of RNA that are included in the mature mRNA are exons. When a transcriptome is sequenced, some reads arise from a single exon and some arise from more than one exon. Reads arising from more than one exon are junction reads. The --coverage-search option searches for these junction reads. It is on by default, although the manual text quoted in the question notes that it is disabled by default in some cases (such as for reads 75bp or longer). Because it can be time consuming, it can be disabled using --no-coverage-search.

Written 3.0 years ago by Satyajeet Khare
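To show how the options discussed in this thread fit together, here is a hedged Python sketch that only assembles a TopHat command line and does not run it. The flags are the ones quoted in the thread. The index name and FASTQ file names are placeholders, and the positional order (index, then the two read files) is assumed from the usual TopHat usage line. The check on --closure-search is this sketch's own guard, based on the manual's "<= 50bp" advice, and is not something TopHat itself enforces as far as I know.

```python
def build_tophat_command(index, reads_1, reads_2, inner_dist,
                         std_dev=20, coverage_search=None,
                         closure_search=False):
    """Return a TopHat argument list for a paired-end run.

    coverage_search: True adds --coverage-search, False adds
    --no-coverage-search, None leaves TopHat's default untouched.
    """
    if closure_search and inner_dist > 50:
        # Guard added by this sketch, following the manual's advice.
        raise ValueError("closure search is meant for inner distances <= 50bp")

    cmd = ["tophat", "-r", str(inner_dist), "--mate-std-dev", str(std_dev)]
    if coverage_search is True:
        cmd.append("--coverage-search")
    elif coverage_search is False:
        cmd.append("--no-coverage-search")
    if closure_search:
        cmd.append("--closure-search")
    # Positional arguments: index prefix, then the two mate files.
    cmd += [index, reads_1, reads_2]
    return cmd


# Placeholder index and file names, for illustration only.
cmd = build_tophat_command("genome_index", "reads_1.fastq", "reads_2.fastq",
                           inner_dist=200, coverage_search=False)
print(" ".join(cmd))
```

Building the command as a list keeps each argument separate, so it can be passed to `subprocess.run` without shell quoting issues if you do decide to run it.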