Question

STAR: do I need --twopassMode?

6.1 years ago

caggtaagtat ★ 1.9k

Hi there,

while searching the STAR manual for default arguments, I stumbled upon --twopassMode. As I understand it, with --twopassMode Basic STAR takes all exon junctions found in a first mapping pass of a given sample and uses them as annotated junctions in a second mapping pass of the same sample. This apparently increases the number of reads mapped to unannotated exon junctions, while the number of novel junctions found stays the same. I have RNA-Seq data from several biological groups with four replicates each, and I am analysing it for alternative splicing. I would therefore like annotated and unannotated splice sites to be treated with as little bias as possible. Would you recommend --twopassMode Basic or --twopassMode None, and what is the default value in STAR 2.5.4b?

Edit: The default setting in STAR is --twopassMode None
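
For reference, here is a minimal sketch of what a per-sample two-pass call might look like. The helper name star_twopass_cmd is hypothetical and the paths are placeholders borrowed from this thread; the function only prints the command so it can be inspected before running. The flags themselves are standard STAR options.

```shell
# Hypothetical helper: print (do not run) a per-sample two-pass STAR command.
# Arguments: <index dir> <fastq> <output prefix>; all paths are placeholders.
star_twopass_cmd() {
    printf 'STAR --genomeDir %s --readFilesIn %s --twopassMode Basic --outSAMtype BAM SortedByCoordinate --outFileNamePrefix %s\n' "$1" "$2" "$3"
}

# Print the command for one sample; run it once it looks right.
star_twopass_cmd /STAR_folder/star_index/hg38/ Input_folder/single_end.fastq /Output_folder/sample1_
```

Leaving out --twopassMode Basic (or passing --twopassMode None) gives the default single-pass behaviour mentioned in the edit above.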

RNA-Seq STAR 2-pass mapping • 9.7k views

GenoMax · Accepted Answer · 2018-03-05

6.1 years ago

Amitm ★ 2.2k

Hi, I haven't done a comparative analysis with and without that param, but it definitely helps if you are interested in splicing analysis, especially if you are going to do transcriptome assembly (with tools like Cufflinks or StringTie). I analyzed CRISPR-treated samples and was able to catch the aberrant splicing that CRISPR caused in the targeted gene. Of course the transcript assembler also played its part, but I would rather use --twopassMode Basic and increase my chances.

Ok, thank you. I'm also doing transcriptome assembly, and I think this could improve the fidelity of my results.

Reply • 6.1 years ago by caggtaagtat ★ 1.9k

On top of that, Alex Dobin (the developer of STAR) usually recommends using junctions from all samples, not just one. So you first map all your samples normally, collect the junctions from all of them, and supply them to the second mapping pass of each sample. Here are the relevant threads:
https://groups.google.com/forum/#!topic/rna-star/VTX9TfapSfQ
https://groups.google.com/forum/#!msg/rna-star/9C3W_BMfGXM/-rg7C6HURHsJ
https://groups.google.com/forum/#!msg/rna-star/yvJ6C3h7OMk/CB5QdWBL41IJ
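
The multi-sample approach described above could be sketched roughly as follows. This is an outline under stated assumptions, not necessarily the exact procedure from the linked threads: the helper name merge_novel_junctions, the read-support threshold, and all file names are hypothetical. It relies on the SJ.out.tab layout documented in the STAR manual (column 6 is 0 for unannotated junctions, column 7 is the number of uniquely mapped reads crossing the junction) and on --sjdbFileChrStartEnd, which takes junction files to insert before mapping.

```shell
# Hypothetical helper: pool first-pass SJ.out.tab files and keep unannotated
# junctions (column 6 == 0) supported by at least <min_reads> uniquely mapped
# reads (column 7) in at least one sample. Output: chr, start, end, strand.
merge_novel_junctions() {
    min_reads=$1; shift
    cat "$@" \
        | awk -F'\t' -v OFS='\t' -v min="$min_reads" '$6 == 0 && $7 >= min { print $1, $2, $3, $4 }' \
        | sort -u
}

# Usage sketch (commented out; paths are placeholders):
# merge_novel_junctions 3 pass1/*SJ.out.tab > SJ.pooled.tab
# STAR <your usual options> --sjdbFileChrStartEnd SJ.pooled.tab ...
```

Filtering by read support is optional; the unfiltered SJ.out.tab files from all samples can also be listed directly after --sjdbFileChrStartEnd.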

Reply • updated 6.1 years ago by GenoMax 141k • written 6.1 years ago by grant.hovhannisyan ★ 2.6k

I read about that too and will try to redo the steps as described in the Google Groups threads.

Alex does the first pass this way:

 STAR --genomeDir Genome1/ --genomeLoad LoadAndKeep --readFilesIn SampleTest_R1_trimmed.1M.fastq.gz SampleTest_R2_trimmed.1M.fastq.gz --readFilesCommand zcat

Could I use my standard STAR command instead?

/STAR_folder/STAR --chimSegmentMin 8 --outFilterMismatchNmax 10 --outFilterMismatchNoverLmax 0.05 --alignEndsType EndToEnd --runThreadN 64 --outSAMtype BAM SortedByCoordinate --alignSJDBoverhangMin 4 --alignIntronMax 300000 --limitBAMsortRAM 30943606211 --genomeDir /STAR_folder/star_index/hg38/ --sjdbOverhang 149 --quantMode GeneCounts --sjdbGTFfile /GTF_file_folder/Homo_sapiens.GRCh38.91.gtf --outFileNamePrefix /Output_folder/ --readFilesIn Input_folder/single_end.fastq

Reply • 6.1 years ago by caggtaagtat ★ 1.9k

--outFileNamePrefix /Output_folder/: I guess with this option you should specify a prefix for your output files. I don't have experience with chimera discovery, so I can't say anything about --chimSegmentMin 8. The rest seems OK.

Reply • 6.1 years ago by grant.hovhannisyan ★ 2.6k

Ok, thank you, I hope it works ;)

Reply • 6.1 years ago by caggtaagtat ★ 1.9k

Hi again, are you looking for chimera discovery? If so, I would recommend following the guidelines of the STAR-Fusion tool. This page lists the params that enable more sensitive chimera detection.

If you just want to increase sensitivity for splice detection, page 7 of the current STAR manual lists the options used for long RNA-seq. Try those unless you know what the non-default settings above are going to do, especially this param you have written:

--alignEndsType EndToEnd

The default for that param is Local. I think it would be naive to assume that all reads align end-to-end to their target, unless you have some reason to suppress soft-clipping. In my experience, leaving soft-clipping on helps, as there can be genuine indels in your data.
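
One way to see how much soft-clipping actually occurs under the default Local setting is to count alignments whose CIGAR string contains an S operation. The sketch below is only an illustration and not part of STAR; the function name softclip_fraction is hypothetical, and it assumes SAM text on standard input (for example from samtools view).

```shell
# Hypothetical helper: read SAM records on stdin and report what fraction of
# mapped alignments carry a soft-clip (an "S" operation in the CIGAR, column 6).
softclip_fraction() {
    awk -F'\t' '
        /^@/ { next }                 # skip header lines
        $6 == "*" { next }            # skip records without a CIGAR (unmapped)
        { total++; if ($6 ~ /S/) clipped++ }
        END {
            if (total == 0) { print "no mapped alignments"; exit }
            printf "%d/%d alignments soft-clipped (%.1f%%)\n", clipped, total, 100 * clipped / total
        }'
}

# Usage sketch (commented out; the BAM name is a placeholder):
# samtools view Aligned.sortedByCoord.out.bam | softclip_fraction
```

With --alignEndsType EndToEnd this fraction should be zero, so the number from a Local run gives a rough idea of how many alignments that setting would change or lose.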

Reply • 6.1 years ago by Amitm ★ 2.2k

Hi, no, I am just interested in exon junctions and alternative splicing. I read that rMATS, which also looks for alternative splicing, uses the parameter --alignEndsType EndToEnd. In my understanding, this would increase exon junction detection. Would you rather recommend enabling soft-clipping for better exon junction analysis?

Reply • 6.1 years ago by caggtaagtat ★ 1.9k

Hi, I am not sure what the full effect of that param on the BAM would be. Besides, I have not used rMATS myself, so I don't have anything specific to add. Time permitting, you could run one sample with and without that param and compare the alignment percentage and the count data. I haven't come across the need to suppress soft-clipping during the STAR step.
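
The with/without comparison suggested above could start from the Log.final.out file that STAR writes for each run. A minimal sketch, assuming the usual layout of that file (a label, a "|", then a tab and the value); the function name uniq_mapped_pct and the file names are hypothetical.

```shell
# Hypothetical helper: pull the "Uniquely mapped reads %" value out of a
# STAR Log.final.out file (lines look like "<label> |<TAB><value>").
uniq_mapped_pct() {
    awk -F'|' '/Uniquely mapped reads %/ { gsub(/[ \t]/, "", $2); print $2 }' "$1"
}

# Usage sketch (commented out; the two files are placeholders for runs
# with the default Local setting and with --alignEndsType EndToEnd):
# echo "Local:    $(uniq_mapped_pct local_Log.final.out)"
# echo "EndToEnd: $(uniq_mapped_pct endtoend_Log.final.out)"
```

The count data from --quantMode GeneCounts (ReadsPerGene.out.tab) can then be compared between the two runs in the same spirit.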

Reply • 6.1 years ago by Amitm ★ 2.2k