Question

STAR reasonable parameters

6.3 years ago

Sharon ▴ 600

Hi everyone

I am using STAR for alignment with the goal of variant calling. I feel the default parameters are too loose: too many mismatches, too much soft clipping, etc.

I would like some feedback from your experience on what more reasonable parameters would be; this is my first time using STAR. This is what I use so far:

  ${STAR}/STAR --runMode genomeGenerate --genomeDir ${WHERE} --genomeFastaFiles ${WHERE}/genome.fa --sjdbFileChrStartEnd ${WHERE}/SJ.out.tab --sjdbOverhang 75 --runThreadN 4 --outFileNamePrefix ${WHERE} --alignEndsType EndToEnd --outFilterMismatchNmax 4

Thanks

RNA-Seq • 4.4k views

Especially when it's the first time I'm using a piece of software, I don't consider myself smarter than the author. I go through the manual to see which options are recommended, but mostly stick to the defaults because those should be sensible.

The GATK best practices for variant calling in RNA-seq also use STAR, so that would also be a logical place to look for optimal parameters.
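For orientation, here is a rough dry-run sketch of how index generation and alignment can be kept as two separate STAR calls. It only prints the commands. The variable names and paths (`GENOME_DIR`, `FASTA`, `SJ_TAB`, `R1`, `R2`, `OUT_PREFIX`) are placeholders of my own, and the `EndToEnd` and `4` values are simply the ones from the question, not a recommendation. As far as I know, options such as `--alignEndsType` and `--outFilterMismatchNmax` are mapping-stage options, so the sketch assumes they belong on the alignment call rather than on `genomeGenerate`.

```shell
# Dry run: print the two STAR commands instead of executing them.
# All paths below are hypothetical placeholders.
GENOME_DIR="star_index"
FASTA="genome.fa"
SJ_TAB="first_pass/SJ.out.tab"
R1="sample_R1.fastq"
R2="sample_R2.fastq"
OUT_PREFIX="second_pass/sample_"   # used as a literal string prefix for output file names

# Step 1: build the index, including first-pass junctions.
# Only index-related options are listed here.
genome_generate_cmd() {
  echo STAR --runMode genomeGenerate \
    --genomeDir "$GENOME_DIR" \
    --genomeFastaFiles "$FASTA" \
    --sjdbFileChrStartEnd "$SJ_TAB" \
    --sjdbOverhang 75 \
    --runThreadN 4
}

# Step 2: align reads against that index.
# Mapping filters (values taken from the question) go on this call.
align_cmd() {
  echo STAR --runMode alignReads \
    --genomeDir "$GENOME_DIR" \
    --readFilesIn "$R1" "$R2" \
    --runThreadN 4 \
    --outFileNamePrefix "$OUT_PREFIX" \
    --outSAMtype BAM SortedByCoordinate \
    --alignEndsType EndToEnd --outFilterMismatchNmax 4
}

genome_generate_cmd
align_cmd
```

Dropping the `echo` in each function would run the real commands. Leaving the last line of `align_cmd` off gives the default behaviour, which makes it easy to compare the two outputs on the same sample.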

ADD REPLY • link 6.3 years ago by WouterDeCoster 47k

This is what I am already following; the reason for my question is what they say at the top of the pipeline:

"You should always make sure you understand what is being done at each step and whether the values are appropriate for your data."

ADD REPLY • link 6.3 years ago by Sharon ▴ 600

At that point you might as well use a different aligner, since you're basically just decreasing the alignment rate.

ADD REPLY • link 6.3 years ago by Devon Ryan 104k

Like what? Would Salmon work for variant calling? And yes, it decreases the alignment rate, but I don't want alignments from reads that have too many mismatches or too much soft clipping; the default parameters are kind of high in this respect. What do you think?

ADD REPLY • link 6.3 years ago by Sharon ▴ 600

I mean, in variant calling, would mismatches later be considered SNPs?

ADD REPLY • link 6.3 years ago by Sharon ▴ 600

Mismatches are only considered SNPs if it makes sense to do so given the totality of the data. It's not like variant callers are defining every mismatch in a read as a variant. Given that, while it might make sense to tamp down on soft-clipping (or just trim your adapters), doing much more than that and filtering by alignment score is just going to bias against regions with multiple variants.
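One way to see what a mismatch cap would actually throw away, before changing any parameter, is to tally the per-read mismatch counts in an existing alignment. The following is a minimal sketch, assuming the file was written with STAR's default SAM attributes, which include the `nM` tag (mismatches per alignment, or per pair for paired-end data). The function names are made up for illustration.

```shell
# Histogram of STAR's nM tag from SAM text on stdin.
# Output: one line per mismatch count, "<nM><TAB><number of records>".
# For paired-end data both mates carry the same per-pair nM value,
# so each pair contributes two records.
mismatch_histogram() {
  awk -F '\t' '!/^@/ {
    for (i = 12; i <= NF; i++)
      if ($i ~ /^nM:i:/) { split($i, t, ":"); count[t[3]]++ }
  }
  END { for (n in count) print n "\t" count[n] }' | sort -n
}

# Number of records whose nM value is greater than the cap given as $1.
reads_over_cap() {
  awk -F '\t' -v cap="$1" '!/^@/ {
    for (i = 12; i <= NF; i++)
      if ($i ~ /^nM:i:/) { split($i, t, ":"); if (t[3] + 0 > cap + 0) over++ }
  }
  END { print over + 0 }'
}

# Example usage (file name is STAR's default for sorted BAM output):
#   samtools view Aligned.sortedByCoord.out.bam | mismatch_histogram
#   samtools view Aligned.sortedByCoord.out.bam | reads_over_cap 4
```

If the records above the cap cluster in a few loci rather than being spread evenly, that is a hint they come from variant-dense regions, which is exactly where a stricter filter would introduce the bias described above.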

ADD REPLY • link 6.3 years ago by Devon Ryan 104k

How much to tamp down was my question. So you think the default parameters of STAR are still fine and not aggressive? Thanks so much, Devon!

ADD REPLY • link 6.3 years ago by Sharon ▴ 600

Generally I think the defaults for STAR are pretty good, though if the GATK best practices have some different suggestions then definitely follow them.