Question: STAR: do I need --twopassMode?
caggtaagtat430 wrote:

Hi there,

While searching the STAR manual for default arguments, I stumbled upon --twopassMode. As I understand it, with --twopassMode Basic STAR collects all exon junctions found in a first mapping pass of a given sample and uses them as annotated junctions in a second mapping pass of the same sample. This apparently increases the number of reads mapped to unannotated exon junctions, while the number of novel junctions found stays the same. I have RNA-Seq data from several biological groups with four replicates each and am analysing them for alternative splicing, so I would like to treat annotated and unannotated splice sites with as little bias as possible. Would you recommend --twopassMode Basic or --twopassMode None, and what is the default value in STAR 2.5.4b?

Edit: The default setting in STAR is --twopassMode None
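
For reference, a minimal sketch of a single-sample run with per-sample 2-pass mapping switched on. The index path, read file, thread count and output prefix are placeholders, not values from this thread, and `run` is a hypothetical helper that only prints the command unless `DRY_RUN=0` is set.

```shell
# Hypothetical dry-run helper: prints the command by default,
# executes it only when DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Placeholder paths; adjust to your own index and reads.
GENOME_DIR="/path/to/star_index"
READS="sample1.fastq.gz"

twopass_basic() {
    # --twopassMode Basic: junctions found in pass 1 are inserted
    # into the index before the same sample is mapped again.
    run STAR \
        --runThreadN 8 \
        --genomeDir "$GENOME_DIR" \
        --readFilesIn "$READS" \
        --readFilesCommand zcat \
        --outSAMtype BAM SortedByCoordinate \
        --twopassMode Basic \
        --outFileNamePrefix "sample1_"
}

twopass_basic
```

Leaving out `--twopassMode Basic` (or passing `--twopassMode None`) gives the default single-pass behaviour.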

2-pass mapping rna-seq star • 1.4k views
modified 11 months ago • written 11 months ago by caggtaagtat430
Amitm1.6k wrote:

Hi, I haven't done a comparative analysis with and without that param, but it definitely helps if you are interested in splicing analysis, especially if you are going to do transcriptome assembly (with tools like Cufflinks or StringTie). I analyzed CRISPR-treated samples and was able to catch the aberrant splicing caused by the effect of CRISPR on the particular gene. Of course the transcript assembler also played its role, but I would rather use --twopassMode Basic and increase my chances.

written 11 months ago by Amitm1.6k

Ok, thank you. I'm also doing transcriptome assembly and think this could improve its fidelity.

modified 11 months ago • written 11 months ago by caggtaagtat430

On top of that, Alex Dobin (the developer of STAR) usually recommends using junctions from all samples, not only from one. So you first do a normal mapping of all your samples, collect all junctions, and insert them in the second mapping step for each sample. Here are the relevant threads:
https://groups.google.com/forum/#!topic/rna-star/VTX9TfapSfQ,
https://groups.google.com/forum/#!msg/rna-star/9C3W_BMfGXM/-rg7C6HURHsJ,
https://groups.google.com/forum/#!msg/rna-star/yvJ6C3h7OMk/CB5QdWBL41IJ
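
The multi-sample procedure described above can be sketched roughly as follows. This is an outline under assumptions, not Alex's exact commands: the sample names, paths, thread count and the `run` dry-run helper are hypothetical. It relies on STAR writing detected junctions to `<prefix>SJ.out.tab` and on `--sjdbFileChrStartEnd` accepting several junction files.

```shell
# Hypothetical dry-run helper: prints commands unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "$@"; else "$@"; fi; }

GENOME_DIR="/path/to/star_index"          # placeholder
SAMPLES="ctrl_1 ctrl_2 treat_1 treat_2"   # hypothetical sample names

# Pass 1: map every sample once; only the SJ.out.tab files are needed,
# so alignment output is switched off.
first_pass() {
    for s in $SAMPLES; do
        run STAR --runThreadN 8 --genomeDir "$GENOME_DIR" \
            --readFilesIn "${s}.fastq.gz" --readFilesCommand zcat \
            --outSAMtype None \
            --outFileNamePrefix "pass1/${s}_"
    done
}

# Pass 2: map every sample again with the junctions of ALL samples
# inserted on the fly.
second_pass() {
    sj_files=""
    for s in $SAMPLES; do
        sj_files="$sj_files pass1/${s}_SJ.out.tab"
    done
    for s in $SAMPLES; do
        # $sj_files is deliberately unquoted: one argument per file.
        run STAR --runThreadN 8 --genomeDir "$GENOME_DIR" \
            --readFilesIn "${s}.fastq.gz" --readFilesCommand zcat \
            --sjdbFileChrStartEnd $sj_files \
            --outSAMtype BAM SortedByCoordinate \
            --outFileNamePrefix "pass2/${s}_"
    done
}

run mkdir -p pass1 pass2
first_pass
second_pass
```

Any other mapping parameters you normally use would be added to both passes in the same way.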

modified 11 months ago by genomax62k • written 11 months ago by grant.hovhannisyan1.4k

I read about that too and will try to redo the steps as described in the Google Groups threads...

Alex does the first pass this way:

 STAR --genomeDir Genome1/ --genomeLoad LoadAndKeep --readFilesIn SampleTest_R1_trimmed.1M.fastq.gz SampleTest_R2_trimmed.1M.fastq.gz --readFilesCommand zcat

Could I use my standard STAR command instead?

/STAR_folder/STAR --chimSegmentMin 8 --outFilterMismatchNmax 10 --outFilterMismatchNoverLmax 0.05 --alignEndsType EndToEnd --runThreadN 64 --outSAMtype BAM SortedByCoordinate --alignSJDBoverhangMin 4 --alignIntronMax 300000 --limitBAMsortRAM 30943606211 --genomeDir /STAR_folder/star_index/hg38/ --sjdbOverhang 149 --quantMode GeneCounts --sjdbGTFfile /GTF_file_folder/Homo_sapiens.GRCh38.91.gtf --outFileNamePrefix /Output_folder/ --readFilesIn Input_folder/single_end.fastq
modified 11 months ago • written 11 months ago by caggtaagtat430

--outFileNamePrefix /Output_folder/ - I guess with this option you should specify a prefix for your output files. I don't have experience with chimera discovery, so I can't say anything about --chimSegmentMin 8. The rest seems OK.
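
To illustrate the point about the prefix: STAR prepends the value of `--outFileNamePrefix` to its fixed output names such as `Aligned.sortedByCoord.out.bam`, `SJ.out.tab` and `Log.final.out`. A small sketch with a made-up sample name shows what a per-sample prefix would produce; `expected_outputs` is a hypothetical helper, not part of STAR.

```shell
OUT_DIR="/Output_folder"

# Print the file names STAR would write for a given sample when
# --outFileNamePrefix is set to "<OUT_DIR>/<sample>_".
expected_outputs() {
    # $1 = sample name (hypothetical)
    prefix="${OUT_DIR}/${1}_"
    for f in Aligned.sortedByCoord.out.bam SJ.out.tab Log.final.out; do
        echo "${prefix}${f}"
    done
}

expected_outputs sampleA
```

With a bare directory as prefix, every sample writes files with the same names into that directory, so a per-sample prefix avoids overwriting earlier results.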

written 11 months ago by grant.hovhannisyan1.4k

Ok, thank you, I hope it works ;)

modified 11 months ago • written 11 months ago by caggtaagtat430

Hi again, are you looking for chimera discovery? Then I would recommend following the guidelines of the STAR-Fusion tool. This page lists the params that enable more sensitive chimera detection.

If you just want to increase sensitivity for splice detection, the options used for long RNA-seq are on page 7 of the current STAR manual. Try them, unless you know what the non-default settings above are going to do, especially this param from your command:

--alignEndsType EndToEnd

The default setting for that param is Local. I think it would be naive to assume that all reads align end-to-end to their targets, unless you have some reason to suppress soft-clipping. In my experience, leaving soft-clipping on helps, as there can be genuine indels in your data.

written 11 months ago by Amitm1.6k

Hi, no, I am just interested in exon junctions and alternative splicing. I read that rMATS, which also looks for alternative splicing, uses the parameter --alignEndsType EndToEnd. As I understand it, this would increase exon junction detection. Would you rather recommend enabling soft-clipping for better exon junction analysis?

written 11 months ago by caggtaagtat430

Hi, I am not sure what the full effect of that param on the BAM would be. Besides, I have not used rMATS myself, so I don't have anything specific to comment. Time permitting, you could run one sample with and without that param and compare the alignment % and the count data. I haven't come across the need to suppress soft-clipping during the STAR step.
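
The comparison suggested above could look roughly like this: map one sample twice, once with `--alignEndsType EndToEnd` and once with the default `Local`, then pull the unique mapping rate from each `Log.final.out`. Paths, the `cmp_` prefixes and the helper functions are hypothetical; the awk pattern assumes the `Uniquely mapped reads %` line of STAR's `Log.final.out`.

```shell
# Hypothetical dry-run helper: prints commands unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "$@"; else "$@"; fi; }

GENOME_DIR="/path/to/star_index"   # placeholder
READS="single_end.fastq"           # placeholder

map_with_ends_type() {
    # $1 = EndToEnd or Local
    run STAR --runThreadN 8 --genomeDir "$GENOME_DIR" \
        --readFilesIn "$READS" \
        --alignEndsType "$1" \
        --outSAMtype BAM SortedByCoordinate \
        --outFileNamePrefix "cmp_${1}_"
}

unique_rate() {
    # $1 = path to a Log.final.out file; prints e.g. "89.50%"
    awk -F'|' '/Uniquely mapped reads %/ { gsub(/[ \t]/, "", $2); print $2 }' "$1"
}

for mode in EndToEnd Local; do
    map_with_ends_type "$mode"
    log="cmp_${mode}_Log.final.out"
    # The log only exists after a real (DRY_RUN=0) run.
    if [ -f "$log" ]; then
        echo "${mode}: $(unique_rate "$log")"
    fi
done
```

If `--quantMode GeneCounts` is added to both runs, the resulting `ReadsPerGene.out.tab` files can be compared in the same spirit.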

written 11 months ago by Amitm1.6k