Question: Optimal parameters for TopHat on Drosophila data
4.1 years ago by
France / Toulouse / GeT-Plage
Rox1.2k wrote:

Hi everyone!

This will be my first time using TopHat. I'm working on a Drosophila genome, and while reading the manual before starting, I saw this recommendation:

Please note: TopHat has a number of parameters and options, and their default values are tuned for processing mammalian RNA-Seq reads. If you would like to use TopHat for another class of organism, we recommend setting some of the parameters with more strict, conservative values than their defaults. Usually, setting the maximum intron size to 4 or 5 Kb is sufficient to discover most junctions while keeping the number of false positives low.

As Drosophila genomes are very gene-dense, maybe there is an optimal set of parameters. Does anyone have an idea of what I should modify to improve my results?

PS: The Drosophila species I'm working on also has a high level of polymorphism, which could be a problem. My main goal is to annotate an assembly I've made of this species. To do that, I need to produce RNA-seq evidence from my data, which is why I'm using TopHat at this stage.

Thanks for your help!


rna-seq assembly • 1.3k views

Please note that TopHat has entered a low maintenance, low support stage as it is now largely superseded by HISAT2 which provides the same core functionality (i.e. spliced alignment of RNA-Seq reads), in a more accurate and much more efficient way.

written 4.1 years ago by Medhat8.7k

You just need to use any splice-aware aligner. Both STAR and BBMap are fast. BBMap is simple to use.

written 4.1 years ago by genomax89k

So, you are all suggesting that TopHat2 is not appropriate for what I'm trying to do?

modified 4.1 years ago • written 4.1 years ago by Rox1.2k

It's not inappropriate. But it's not the best either, certainly not since it's deprecated.

written 4.1 years ago by WouterDeCoster44k

I also heard that HISAT2 is going to be the new top-notch tool for RNA-seq. Try it.

written 4.1 years ago by liartom20

Well, I'm actually downloading it, and I'm going to try it. But maybe my question stays the same, right? The HISAT2 paper discusses human genomes, so maybe its default parameters are likewise tuned for genomes that are not gene-dense. Drosophila genomes are very gene-dense (which to me means that the genes lie very close to each other), so I was looking for advice on parameters that could improve the analysis with that in mind. I know this can sometimes be a problem, with neighboring genes being reported as a single gene, and I was looking for a way to avoid it. Anyway, I'm trying HISAT2 right now and reading the manual as well.

written 4.1 years ago by Rox1.2k

The first thing you should always do is try the default settings. Then, as suggested by the TopHat manual, you could limit the maximum intron size. I think this will make the main difference between a gene-dense and a gene-sparse genome.

written 4.1 years ago by WouterDeCoster44k
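
To make the "limit the intron size" suggestion concrete, here is a minimal sketch in Python that only builds the command lines and does not run anything. It assumes paired-end reads and uses the 5 kb figure from the TopHat manual note quoted in the question as a starting value, not as a tuned setting for Drosophila. The index name, read file names and output names are hypothetical placeholders. The option names (`--max-intron-length` for TopHat, `--max-intronlen` for HISAT2) are the ones documented in the respective manuals, where both default to 500000.

```python
import shlex

# Assumed starting value, taken from the "4 or 5 Kb" hint in the TopHat manual.
MAX_INTRON = 5000


def tophat2_command(index, reads, out_dir, max_intron=MAX_INTRON, threads=4):
    """Return a tophat2 argument list with a reduced maximum intron length."""
    return [
        "tophat2",
        "--num-threads", str(threads),
        "--max-intron-length", str(max_intron),  # TopHat default is 500000
        "--output-dir", out_dir,
        index,
        *reads,  # one file for single-end, two files for paired-end
    ]


def hisat2_command(index, mate1, mate2, out_sam, max_intron=MAX_INTRON, threads=4):
    """Return the equivalent hisat2 argument list for paired-end reads."""
    return [
        "hisat2",
        "-p", str(threads),
        "--max-intronlen", str(max_intron),  # HISAT2 default is 500000
        "-x", index,
        "-1", mate1,
        "-2", mate2,
        "-S", out_sam,
    ]


if __name__ == "__main__":
    # Hypothetical index and read file names, for illustration only.
    reads = ["reads_1.fastq", "reads_2.fastq"]
    print(shlex.join(tophat2_command("dmel_index", reads, "tophat_out")))
    print(shlex.join(hisat2_command("dmel_index", *reads, "hisat2_out.sam")))
```

Running a default-settings alignment first and then one with the reduced limit, and comparing the reported junctions, would be one way to check whether the change actually helps on a given assembly.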