Question: STAR parameters in OmicsBox
phamh wrote:

Hi,

I want to run STAR in OmicsBox and I have some questions about its parameters. There isn't much reference material on these parameters, and not every one of them should be left at its default, so I'm really struggling with them. I'd really appreciate it if someone could help me.


1) 'Maximum distance between mates'

This info is from https://wikibits.ugent.be/index.php/Parameters_of_STAR

  • = maximum distance between reads from a pair when mapped to the genome. If reads map to the genome farther apart the fragment is considered to be chimeric. The default value of 500000 is fine-tuned to mammalian genomes, for plant and yeast genomes you will have to decrease it.
  • STAR maps the reads to the genome, this is why the max distance between reads of a pair is equal to the intron size. For organisms with small introns you should take intron size + max fragment length

Is this info correct? My research involves Physcomitrella patens, so I'll have to decrease the input value for this parameter, but I don't know by how much. Where does that default value (500000) come from? Can I use the suggestion mentioned above (intron size + max fragment length)?
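The "intron size + max fragment length" suggestion quoted above can be worked through as a small calculation. This is only a sketch: the exon coordinates and the 600 bp maximum fragment length are made-up illustrative values, not Physcomitrella patens data, and the helper names `intron_lengths` and `max_mate_distance` are hypothetical. In practice the exon intervals would come from the annotation (GTF/GFF) of the genome being used.

```python
from collections import defaultdict


def intron_lengths(exons):
    """Return intron lengths from (transcript_id, start, end) exon records.

    Coordinates are 1-based and inclusive, as in GTF files.
    """
    by_transcript = defaultdict(list)
    for transcript_id, start, end in exons:
        by_transcript[transcript_id].append((start, end))

    lengths = []
    for spans in by_transcript.values():
        spans.sort()
        # An intron is the gap between two consecutive exons of one transcript.
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            gap = next_start - prev_end - 1
            if gap > 0:
                lengths.append(gap)
    return lengths


def max_mate_distance(introns, max_fragment_length):
    """Apply the quoted rule: largest intron plus largest expected fragment."""
    return max(introns) + max_fragment_length


# Made-up exon records for illustration only.
exons = [
    ("tx1", 100, 200),
    ("tx1", 501, 600),
    ("tx1", 2601, 2700),
    ("tx2", 50, 150),
    ("tx2", 1151, 1300),
]

introns = intron_lengths(exons)
print(sorted(introns))                                      # [300, 1000, 2000]
print(max_mate_distance(introns, max_fragment_length=600))  # 2600
```

Using the single largest annotated intron is the most permissive reading of the rule; a high percentile of the intron length distribution would be a stricter alternative if a few annotated introns are implausibly long.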


2) ‘Include Chimeric Alignments’ checkbox

This info is from OmicsBox Manual http://manual.omicsbox.biobam.com/user-manual/module-transcriptomics/rna-seq-alignment/#RNA-SeqAlignment-RunRNA-SeqAlignment(STAR)

  • This option allows to include the chimeric alignments together with normal alignments in the main BAM file. The format of chimeric alignments follows the latest SAM/BAM specifications.

Is there a reason why one should or should not separate these two kinds of alignment?


3) 'Maximum Number of Mismatches'

This info is from https://wikibits.ugent.be/index.php/Parameters_of_STAR

  • = maximum number of mismatches for a read (single-end) or a pair of reads (paired-end). Default is 10. The value you should choose is dependent on the read length. For short quality trimmed reads you typically allow 5% mismatches.

The default value in STAR in OmicsBox is 999, which is confusing to me. My reads are 150bp, which is not short, right? I'm not sure what to do with this parameter. Should I leave it as default (10)?
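For reference, the "5% mismatches" rule of thumb quoted above translates into absolute numbers as follows. This is a minimal sketch; the function name `mismatch_cap` is hypothetical, and the 5% figure is the wiki's suggestion rather than a STAR or OmicsBox default.

```python
def mismatch_cap(read_length, percent=5, paired=False):
    """Absolute mismatch limit implied by a percentage of the read length.

    For paired-end data the limit applies to the pair, so both mates count.
    Integer arithmetic is used so the result is rounded down.
    """
    total_length = read_length * (2 if paired else 1)
    return total_length * percent // 100


print(mismatch_cap(150))               # 7
print(mismatch_cap(150, paired=True))  # 15
```

The STAR command-line tool also has ratio-based filters such as `--outFilterMismatchNoverReadLmax`, so a very large absolute limit like 999 may simply mean that the absolute cap is switched off and a ratio does the filtering. Whether OmicsBox sets or exposes that ratio is an assumption worth checking in its manual.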


Thank you.

omicsbox star parameters • 302 views
modified 4 months ago by swbarnes2 • written 4 months ago by phamh

You should try a run leaving all parameters at default. The only thing I would change is the "max distance between mates". If you know the average intron length in your organism, you can use that number instead of 500K, which is appropriate for human/mammalian genomes.

written 4 months ago by GenoMax
swbarnes2 wrote:

I think optimizing those three parameters is going to have an incredibly small effect on your mapping results. One exception: showing 999 mismatched positions might bloat your BAM a lot; if something maps that many times in the genome, there's a good chance you aren't going to do much with it anyway.

written 4 months ago by swbarnes2

Can you please tell me why you think it would have minimal effect on my mapping results? In one source I found, they said STAR was optimized for mammalian genomes and suggested changing parameters if using plant genomes.

modified 4 months ago • written 4 months ago by phamh

It can be true both that it's optimized for mammals, and that the current settings are only slightly non-optimal for other kinds of eukaryotes. Are you expecting a huge number of chimeric reads? How many wrong alignments do you think you will get by having a too-generous max pair distance allowance?

written 4 months ago by swbarnes2

I honestly don't know how to answer those questions, and neither does my research faculty. Is there a way to figure them out?

written 4 months ago by phamh
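One way to approach both questions empirically is to run the alignment once with default settings and inspect the output. The sketch below assumes the alignments have been exported as SAM text (for example with `samtools view -h`) and relies only on fields defined in the SAM specification: the FLAG in column 2 and the observed template length (TLEN) in column 9. The function name `summarize_pairs`, the 3000 bp threshold, and the example records are hypothetical and for illustration only.

```python
def summarize_pairs(sam_lines, distance_threshold):
    """Count mapped pairs, pairs spanning more than a threshold, and
    supplementary records (how SAM represents chimeric alignments)."""
    pairs = 0
    far_pairs = 0
    supplementary = 0
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        tlen = int(fields[8])
        if flag & 0x800:  # supplementary alignment
            supplementary += 1
            continue
        if flag & 0x100:  # secondary alignment, skip
            continue
        paired = flag & 0x1
        first_mate = flag & 0x40
        both_mapped = not (flag & 0x4) and not (flag & 0x8)
        # Count each pair once via its first mate; TLEN is 0 when the
        # template length is not available (e.g. mates on different contigs).
        if paired and first_mate and both_mapped and tlen != 0:
            pairs += 1
            if abs(tlen) > distance_threshold:
                far_pairs += 1
    return {"pairs": pairs, "far_pairs": far_pairs, "supplementary": supplementary}


# Made-up SAM records for illustration only.
example = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t255\t150M\t=\t400\t450\t*\t*",
    "r1\t147\tchr1\t400\t255\t150M\t=\t100\t-450\t*\t*",
    "r2\t99\tchr1\t1000\t255\t150M\t=\t9000\t8150\t*\t*",
    "r2\t147\tchr1\t9000\t255\t150M\t=\t1000\t-8150\t*\t*",
    "r3\t2113\tchr2\t500\t255\t60M90S\t=\t500\t0\t*\t*",
]

print(summarize_pairs(example, distance_threshold=3000))
# {'pairs': 2, 'far_pairs': 1, 'supplementary': 1}
```

If only a tiny fraction of pairs exceeds a biologically plausible distance, and few supplementary records appear, then tightening the mate-distance and chimeric settings is unlikely to change the results much.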