Question

STAR mapping pipeline with 2-pass for multiple samples?

4

4.9 years ago

Rajesh Detroja ▴ 200

Dear All,

Based on my understanding of the STAR manual, I am about to run a STAR 2.7.0f mapping pipeline in 2-pass mode on multiple samples from patients and healthy individuals, using the steps below.

Could you please check whether the commands are correct, or suggest any improvements?

1) Indexing genome with annotations

STAR --runMode genomeGenerate --genomeDir ~/db/hg38/ --genomeFastaFiles ~/db/hg38/hg38.fa --sjdbGTFfile ~/db/hg38/hg38.gtf --runThreadN 30 --sjdbOverhang 89

Note:

The index is built for a maximum read length of 90 bp, so --sjdbOverhang is set to read length - 1 = 89.

2) First-pass mapping against the indexed genome

STAR --genomeDir ~/db/hg38/ --readFilesIn sample1.R1.fastq.gz sample1.R2.fastq.gz --readFilesCommand zcat --outSAMunmapped Within --outFileNamePrefix sample1. --runThreadN 30

Notes:

The same command is run for every sample in a for loop, so it generates one SJ.out.tab file per sample.
Next, I copied the SJ.out.tab files of all samples into a single folder, "SJ_out".
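The loop described in these notes might look like the sketch below. It is not taken from the original post: the `<sample>.R1.fastq.gz` / `<sample>.R2.fastq.gz` naming is assumed from the sample1 example, and `first_pass` and the `STAR` variable are hypothetical helpers introduced only for illustration.

```shell
# Sketch only: run the first-pass command above for every sample and
# collect the resulting SJ.out.tab files in SJ_out/.
# Assumption: reads are named <sample>.R1.fastq.gz / <sample>.R2.fastq.gz
# and sit in the current directory.
STAR="${STAR:-STAR}"   # hypothetical override for the STAR executable

# first_pass <sample>: same options as the single-sample command above
first_pass() {
    "$STAR" --genomeDir ~/db/hg38/ \
        --readFilesIn "$1.R1.fastq.gz" "$1.R2.fastq.gz" \
        --readFilesCommand zcat \
        --outSAMunmapped Within \
        --outFileNamePrefix "$1." \
        --runThreadN 30
}

mkdir -p SJ_out
for r1 in *.R1.fastq.gz; do
    [ -e "$r1" ] || continue          # nothing matched the pattern
    sample="${r1%.R1.fastq.gz}"       # strip the suffix to get the sample name
    first_pass "$sample" && cp "$sample.SJ.out.tab" SJ_out/
done
```

Because the output prefix is "<sample>.", each run writes <sample>.SJ.out.tab, which matches the SJ_out/*.SJ.out.tab pattern used in step 3.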

3) Indexing genome with annotations and SJ.out.tab files

STAR --runMode genomeGenerate --genomeDir ~/db/hg38/SJ_Index/ --genomeFastaFiles ~/db/hg38/SJ_Index/hg38.fa --sjdbGTFfile ~/db/hg38/SJ_Index/hg38.gtf --runThreadN 30 --sjdbOverhang 89 --sjdbFileChrStartEnd SJ_out/*.SJ.out.tab

Note:

Again, the index is built for a maximum read length of 90 bp.

4) Second-pass mapping against the new index built with annotations and SJ.out.tab files

STAR --genomeDir ~/db/hg38/SJ_Index/ --readFilesIn sample1.R1.fastq.gz sample1.R2.fastq.gz --readFilesCommand zcat --outSAMunmapped Within --outFileNamePrefix sample1. --runThreadN 30

Notes:

Again, the same command is run for every sample in a for loop, so it generates mapping files for each sample.

RNA-Seq STAR 2-pass Multiple samples Expression • 12k views


2

4.9 years ago

vin.darb ▴ 300

I refer you to the post I created on the Google group for the STAR tool. I had the same question about the commands for 2-pass mode, and the author of STAR answered it:

https://groups.google.com/forum/#!msg/rna-star/4dhcEGFMiK0/XoMh6rB7CwAJ


3

In one of his latest posts, Alex suggested the following criteria for filtering the junctions:

1. Filter out the junctions on chrM, those are most likely to be false.

2. Filter out non-canonical junctions (column5 == 0).

3. Filter out junctions supported by multi-mappers only (column7 == 0).

4. Filter out junctions supported by too few reads (e.g. column7 <= 2).
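As a rough illustration, the four criteria can be applied to SJ.out.tab files with a one-line awk filter. This is a sketch, not Alex's own command: `filter_sj` is a hypothetical helper name, and the `chrM` label assumes UCSC-style chromosome names (other annotations may use `MT`). In SJ.out.tab, column 5 is the intron motif (0 = non-canonical) and column 7 is the number of uniquely mapping reads crossing the junction.

```shell
# Sketch only: keep junctions that pass all four criteria listed above.
# Reads tab-separated SJ.out.tab rows from the given files, or from stdin.
filter_sj() {
    # $1 != "chrM" : drop mitochondrial junctions (assumes UCSC naming)
    # $5 > 0       : drop non-canonical junctions
    # $7 > 2       : drop junctions with <= 2 unique reads; this also
    #                covers the multi-mapper-only case ($7 == 0)
    awk -F'\t' '$1 != "chrM" && $5 > 0 && $7 > 2' "$@"
}
```

For example, `cat SJ_out/*.SJ.out.tab | filter_sj > SJ_filtered.tab` would write the surviving junctions to a single file (the output name is arbitrary) that can then be passed to --sjdbFileChrStartEnd. The threshold of 2 reads is the example value from the list and can be adjusted.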

ADD REPLY • link 4.9 years ago by Rajesh Detroja ▴ 200

score 8 · Accepted Answer · 2019-06-07

8

4.9 years ago

Rajesh Detroja ▴ 200

SOLVED! Please look at my recent conversations with Alex regarding the issue.
