I have an RNA-seq experiment and I would like to use STAR as the aligner. I took an RNA-seq course, and they told us that passing the length of the longest intron in the genome as a parameter will save time. But they forgot to tell us how to obtain this information, and we forgot to ask. Could you tell me how I can do that?
Specifying the maximum intron length helps because it limits the search space for the "other end" of a read when it is being aligned to the genome. If the second half of your read maps several Mb away, it is unlikely that this represents a valid, biologically relevant splice junction; it is probably the result of a misalignment. In that case, it makes no sense to spend time looking Mbs away for the mapping position of the second half of a split read.
It is also the case that some reference genomes contain gene models with unreasonably long introns, often ones that merge two genes together (that is, one half of the junction is in one gene and the other half is in a different gene, usually a different member of the same protein family).
A little bit of knowledge about your genome of interest can help here. In humans we use 2 Mb as our maximum intron length because there is a gene with an intron that long that we are pretty confident is real (I don't remember which one right now).
Otherwise you could trust the reference annotation and use the method outlined by Medhat.
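For reference, the STAR option that takes this value is `--alignIntronMax`. Below is a minimal sketch of assembling such a command line from Python. The function name `star_command`, the paths, and the thread count are placeholders made up for illustration, and the 2,000,000 value is just the human figure mentioned above; adjust everything to your own setup.

```python
import shlex


def star_command(genome_dir, reads, max_intron, threads=4):
    """Build a STAR argument list with an explicit maximum intron length.

    genome_dir : directory holding a STAR genome index (placeholder path)
    reads      : list of one (single-end) or two (paired-end) FASTQ files
    max_intron : largest intron, in bases, that STAR should consider
    """
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,
        "--alignIntronMax", str(max_intron),
    ]


# Hypothetical paths; this only prints the command, it does not run STAR.
cmd = star_command("star_index", ["sample_R1.fastq", "sample_R2.fastq"], 2_000_000)
print(shlex.join(cmd))
```

To actually launch the alignment you could hand the list to `subprocess.run(cmd, check=True)`, assuming STAR is on your `PATH`.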
I don't know how providing the longest intron length will help the aligner, but there are a lot of things I don't know. Either way, you can find some nice transcriptome summary statistics at http://genomewiki.ucsc.edu/index.php/Gene_Set_Summary_Statistics.
If you know which genome was sequenced, you can write a script that takes your annotation file (GFF) as input and looks for the longest intron.
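A script along those lines could look like the sketch below. It assumes the annotation has `exon` features in column 3 and that each exon names its transcript either GTF-style (`transcript_id "..."`) or GFF3-style (`Parent=...`); an exon with several comma-separated GFF3 parents is treated as one group, which is a simplification. Introns are taken to be the gaps between consecutive exons of the same transcript. The function names are my own, not part of any library.

```python
import collections
import re

# Attribute patterns for the two common flavours of column 9.
_GTF_ID = re.compile(r'transcript_id "([^"]+)"')
_GFF3_PARENT = re.compile(r"(?:^|;)\s*Parent=([^;]+)")


def transcript_of(attributes):
    """Return the transcript identifier from column 9, or None if absent."""
    match = _GTF_ID.search(attributes) or _GFF3_PARENT.search(attributes)
    return match.group(1) if match else None


def longest_intron(lines):
    """Return (length, transcript_id) for the longest intron found.

    `lines` is any iterable of GFF/GTF lines, e.g. an open file handle.
    Returns (0, None) if no transcript has two or more exons.
    """
    exons = collections.defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        tid = transcript_of(fields[8])
        if tid is None:
            continue
        # Key on (sequence name, transcript) so ids reused across contigs do not mix.
        exons[(fields[0], tid)].append((int(fields[3]), int(fields[4])))

    best = (0, None)
    for (_, tid), spans in exons.items():
        spans.sort()
        # An intron is the gap between one exon's end and the next exon's start.
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            length = next_start - prev_end - 1
            if length > best[0]:
                best = (length, tid)
    return best
```

Usage would be something like `with open("annotation.gtf") as fh: print(longest_intron(fh))`, where the file name is a placeholder. Keep in mind the caveat raised above: if the annotation contains a few implausibly long gene models, the raw maximum may be larger than what you actually want to give the aligner, so it is worth looking at the transcript that is reported.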