Question: STAR mapping pipeline with 2-pass for multiple samples?
Rajesh Detroja (India) wrote 15 months ago:

Dear All,

Based on my understanding of the STAR manual, I am about to run a STAR 2.7.0f mapping pipeline in 2-pass mode for multiple samples from patients and healthy individuals.

Could you please check whether the commands below are correct, or suggest any improvements?

1) Indexing genome with annotations

STAR --runMode genomeGenerate --genomeDir ~/db/hg38/ --genomeFastaFiles ~/db/hg38/hg38.fa --sjdbGTFfile ~/db/hg38/hg38.gtf --runThreadN 30 --sjdbOverhang 89

Note:

  • Indexing for maximum read length 90 bp.
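
The value 89 follows the STAR manual's recommendation of maximum read length minus one (90 - 1). If the maximum read length is not known in advance, it could be measured from the FASTQ files. A minimal sketch, assuming standard 4-line FASTQ records; the function names `max_read_length` and `sjdb_overhang` are hypothetical helpers, not part of STAR:

```shell
# Hypothetical helper: print the longest read length in a gzipped FASTQ file.
# Assumes 4-line records, so sequence lines are lines 2, 6, 10, ...
max_read_length() {
    gzip -dc "$1" | awk 'NR % 4 == 2 { if (length($0) > max) max = length($0) }
                         END { print max + 0 }'
}

# Hypothetical helper: value for --sjdbOverhang (maximum read length minus one).
sjdb_overhang() {
    echo $(( $(max_read_length "$1") - 1 ))
}
```

For 90 bp reads, `sjdb_overhang sample1.R1.fastq.gz` would print 89. Scanning a whole file can be slow, so checking only the first few thousand reads may be enough if all reads come from the same run.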

2) 1-pass mapping with indexed genome

STAR --genomeDir ~/db/hg38/ --readFilesIn sample1.R1.fastq.gz sample1.R2.fastq.gz --readFilesCommand zcat --outSAMunmapped Within --outFileNamePrefix sample1. --runThreadN 30

Notes:

  • The same command is run for each sample in a for loop, so it generates an SJ.out.tab file per sample.

  • Next, I copied the SJ.out.tab files of all samples into a single folder, "SJ_out".
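
The loop and copy step described in these notes could look roughly like the sketch below. The sample names, the `DRY_RUN` switch, and the `run` and `first_pass` helpers are assumptions added for illustration; the STAR options are the ones from the command above. By default it only prints the commands so they can be checked before anything is executed.

```shell
# Hypothetical sample names; replace with your own list.
SAMPLES="sample1 sample2"
# Set DRY_RUN=0 to execute the commands instead of printing them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# First-pass mapping for every sample, then collect the junction files.
first_pass() {
    run mkdir -p SJ_out
    for s in $SAMPLES; do
        run STAR --genomeDir "$HOME/db/hg38/" \
            --readFilesIn "$s.R1.fastq.gz" "$s.R2.fastq.gz" \
            --readFilesCommand zcat --outSAMunmapped Within \
            --outFileNamePrefix "$s." --runThreadN 30
        # With the prefix "sample1." STAR writes sample1.SJ.out.tab
        run cp "$s.SJ.out.tab" SJ_out/
    done
}

first_pass
```

Running it with `DRY_RUN=0` would execute STAR for real, one sample after another.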

3) Indexing genome with annotations and SJ.out.tab files

STAR --runMode genomeGenerate --genomeDir ~/db/hg38/SJ_Index/ --genomeFastaFiles ~/db/hg38/SJ_Index/hg38.fa --sjdbGTFfile ~/db/hg38/SJ_Index/hg38.gtf --runThreadN 30 --sjdbOverhang 89 --sjdbFileChrStartEnd SJ_out/*.SJ.out.tab

Note:

  • Again indexing for maximum read length 90 bp.

4) 2-pass mapping with new indexed genome with annotations and SJ.out.tab files

STAR --genomeDir ~/db/hg38/SJ_Index/ --readFilesIn sample1.R1.fastq.gz sample1.R2.fastq.gz --readFilesCommand zcat --outSAMunmapped Within --outFileNamePrefix sample1. --runThreadN 30

Notes:

  • Again, the same command is run for each sample in a for loop, so it generates mapping files for each sample.

Rajesh Detroja (India) wrote 15 months ago:

SOLVED! Please look at my recent conversations with Alex regarding the issue.

darbinator wrote 15 months ago:

I refer you to the post I created on the Google group for the STAR tool. I had the same question as you about the commands for 2-pass mode, and the author of STAR answered me:

https://groups.google.com/forum/#!msg/rna-star/4dhcEGFMiK0/XoMh6rB7CwAJ


According to one of Alex's latest posts, he suggested the following criteria for filtering the junctions:

1. Filter out the junctions on chrM, those are most likely to be false.

2. Filter out non-canonical junctions (column5 == 0).

3. Filter out junctions supported by multi mappers only (column7==0)

4. Filter out junctions supported by too few reads (e.g. column7<=2)
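
The four criteria above can be sketched as a single awk filter on an SJ.out.tab file. This is only an illustration: the function name `filter_sj` is hypothetical, the chromosome name "chrM" assumes UCSC-style naming, and the threshold of 2 is just the example value from criterion 4. Criterion 3 (column7==0) is already covered by criterion 4 (column7<=2), so one test on column 7 handles both.

```shell
# Hypothetical helper: keep only junctions that pass all four criteria.
# SJ.out.tab columns used here:
#   $1 = chromosome, $5 = intron motif (0 = non-canonical),
#   $7 = number of uniquely mapping reads crossing the junction
filter_sj() {
    awk -F '\t' '$1 != "chrM" && $5 > 0 && $7 > 2' "$@"
}
```

For example, `filter_sj SJ_out/sample1.SJ.out.tab > sample1.SJ.filtered.tab` writes the surviving junctions to a new file, which can then be given to --sjdbFileChrStartEnd in place of the unfiltered one.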

Reply written 15 months ago by Rajesh Detroja