Question: choosing a suitable max intron size for STAR (plant alignment)
18 months ago by
Biogeek260 wrote:

Hey guys,

Any tips? I know the STAR aligner is optimised for mammalian genomes. I have a plant reference genome with a GFF3 file that annotates exons, CDS and UTRs, but not introns. In addition, the assembly consists of scaffolds only, not chromosomes. STAR's default settings allow introns up to around 500,000 nt, which is huge. Can anyone suggest a suitable maximum intron value (--alignIntronMax), or point me to literature on the matter? It seems this goes unreported in alignment methods most of the time.


star plants intron size • 792 views
modified 6 months ago by claudiorivero920 • written 18 months ago by Biogeek260
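
Since the GFF3 lists exons but no introns, one rough way to see what intron sizes the annotation itself implies is to take the gaps between consecutive exons of each transcript. Below is a minimal Python sketch of that idea. It assumes exon rows carry a Parent attribute naming their transcript; the function names and the demo records are made up for illustration and are not part of STAR or any other tool.

```python
import math
from collections import defaultdict


def intron_lengths_from_gff3(lines):
    """Return a sorted list of intron lengths implied by the exon features."""
    exons = defaultdict(list)  # (seqid, parent transcript) -> [(start, end), ...]
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        # An exon may list several parents, comma-separated.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[(cols[0], parent)].append((int(cols[3]), int(cols[4])))

    lengths = []
    for spans in exons.values():
        spans.sort()
        # The gap between one exon's end and the next exon's start is an intron.
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            gap = next_start - prev_end - 1
            if gap > 0:
                lengths.append(gap)
    return sorted(lengths)


def percentile(sorted_values, fraction):
    """Value at the given fraction (0 < fraction <= 1) of a sorted, non-empty list."""
    index = max(0, math.ceil(fraction * len(sorted_values)) - 1)
    return sorted_values[index]


# Hypothetical demo records: three exons of one transcript plus a CDS row.
demo = [
    "##gff-version 3",
    "scaffold_1\tsrc\texon\t100\t200\t.\t+\t.\tID=e1;Parent=t1",
    "scaffold_1\tsrc\texon\t301\t400\t.\t+\t.\tID=e2;Parent=t1",
    "scaffold_1\tsrc\texon\t1401\t1500\t.\t+\t.\tID=e3;Parent=t1",
    "scaffold_1\tsrc\tCDS\t120\t200\t.\t+\t0\tID=c1;Parent=t1",
]
lengths = intron_lengths_from_gff3(demo)
print(lengths, percentile(lengths, 1.0))
```

On a real annotation you would pass an open file handle instead of the demo list and look at a high percentile rather than the maximum, since a handful of mis-annotated genes can inflate the longest value.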

If you map some reads with BBMap, you can produce a histogram of indel lengths with the "indelhist" flag and use that to inform your decision. The distribution varies by plant species.

    bbmap.sh in=reads.fq ref=genome.fa maxindel=500000 indelhist=ihist.txt reads=1m

That will just map the first million reads and stop.

If you already have a mapped SAM/BAM file, you can instead generate the indel length histogram with Reformat:

    reformat.sh in=mapped.sam indelhist=ihist.txt
written 18 months ago by Brian Bushnell15k
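
To turn such a histogram into a single number, one option is to pick the length below which most of the long deletions fall. The Python sketch below does that under stated assumptions: that the histogram file is tab-separated with '#' header lines, the length in the first column and the deletion count in the second, and that spliced reads appear as long deletions. Check the actual ihist.txt before relying on this layout. The function name, the 20 nt floor for ignoring ordinary short deletions, and the demo rows are all hypothetical.

```python
def deletion_length_cutoff(rows, fraction=0.99, min_length=20):
    """Smallest length L such that `fraction` of deletions >= min_length are <= L."""
    counts = []
    for row in rows:
        if not row.strip() or row.startswith("#"):
            continue
        cols = row.split("\t")
        length, count = int(cols[0]), int(cols[1])
        # Skip short deletions, which are unlikely to be introns.
        if length >= min_length and count > 0:
            counts.append((length, count))
    counts.sort()

    total = sum(count for _, count in counts)
    if total == 0:
        return None

    running = 0
    for length, count in counts:
        running += count
        if running >= fraction * total:
            return length
    return counts[-1][0]


# Hypothetical histogram rows in the assumed layout (length, deletions, insertions).
demo_rows = [
    "#Length\tDeletions\tInsertions",
    "1\t500\t400",
    "80\t50\t0",
    "150\t40\t0",
    "2000\t9\t0",
    "90000\t1\t0",
]
cutoff = deletion_length_cutoff(demo_rows, fraction=0.95)
print(cutoff)
```

A cutoff chosen this way deliberately ignores the rare very long events, which are more likely to be misalignments than real introns.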

Thanks Brian, seems like a handy little tool :-)

written 18 months ago by Biogeek260

I'm assuming around 10,000 would be appropriate?

written 18 months ago by Biogeek260

Hi, we ran an RNA-seq analysis and, to adapt the parameters for plant genomes, set the minimum and maximum intron lengths to 60 and 6000 nt, following what was described for splicing in Arabidopsis (Márquez et al., 2010).

written 6 months ago by claudiorivero920
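
For reference, intron bounds like these map onto STAR's --alignIntronMin and --alignIntronMax options. The Python sketch below only assembles the command line so it can be inspected before running. The index directory and read file names are placeholders, and setting --alignMatesGapMax to the same value as the maximum intron is a choice made for this sketch, not something stated in the thread.

```python
import shlex


def star_command(genome_dir, reads, intron_min=60, intron_max=6000, threads=4):
    """Build a STAR alignment command with explicit intron size bounds."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,
        "--alignIntronMin", str(intron_min),
        "--alignIntronMax", str(intron_max),
        # Assumption for this sketch: cap the mate gap at the same distance.
        "--alignMatesGapMax", str(intron_max),
        "--outSAMtype", "BAM", "SortedByCoordinate",
    ]


# Placeholder paths, for illustration only.
cmd = star_command("star_index", ["reads_1.fq", "reads_2.fq"])
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) once the paths point at a real index and real reads.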