Question: STAR mapping pipeline with 2-pass for multiple samples?
Rajesh Detroja wrote:

Dear All,

Based on my reading of the STAR manual, I am about to run a STAR 2.7.0f mapping pipeline in 2-pass mode on multiple samples from patients and healthy individuals. The steps are listed below.

Could you please check whether the commands are correct, or suggest any improvements?

1) Indexing genome with annotations

STAR --runMode genomeGenerate --genomeDir ~/db/hg38/ --genomeFastaFiles ~/db/hg38/hg38.fa --sjdbGTFfile ~/db/hg38/hg38.gtf --runThreadN 30 --sjdbOverhang 89

Note:

  • The index is built for a maximum read length of 90 bp, hence --sjdbOverhang 89.
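
A side note on where 89 comes from: the STAR manual gives the ideal --sjdbOverhang as max(ReadLength)-1. Below is a minimal sketch for checking that value against the data. It assumes standard 4-line FASTQ records (no wrapped sequence lines), and the function names max_read_length and sjdb_overhang are made up for illustration; they are not part of STAR.

```shell
# Hypothetical helpers, not part of STAR.
# Reads FASTQ text on stdin and prints the longest read length.
# Assumes 4 lines per record, so the sequence is every 2nd line of 4.
max_read_length() {
    awk 'NR % 4 == 2 { if (length($0) > max) max = length($0) }
         END { print max + 0 }'
}

# Prints max(ReadLength) - 1, the value to pass to --sjdbOverhang.
# Empty input yields -1, so check the result before using it.
sjdb_overhang() {
    len=$(max_read_length)
    echo $(( len - 1 ))
}

# Example (not run here):
#   zcat sample1.R1.fastq.gz sample1.R2.fastq.gz | sjdb_overhang
```

For paired-end data both mates should be checked, since the longer mate determines the value.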

2) First-pass mapping against the indexed genome

STAR --genomeDir ~/db/hg38/ --readFilesIn sample1.R1.fastq.gz sample1.R2.fastq.gz --readFilesCommand zcat --outSAMunmapped Within --outFileNamePrefix sample1. --runThreadN 30

Notes:

  • I ran the same command for every sample in a for loop, so each sample gets its own SJ.out.tab file.

  • Next, I copied the SJ.out.tab files of all samples into a single folder, "SJ_out".
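
For reference, the per-sample loop described in the notes above could look roughly like the sketch below. It assumes paired files named <sample>.R1.fastq.gz and <sample>.R2.fastq.gz in one directory. The function name run_pass1 and the STAR_BIN variable are hypothetical, introduced only so the loop can be dry-run with STAR_BIN=echo; the STAR options are the ones from the command above.

```shell
# Hypothetical wrapper around the first-pass command shown above.
# Usage: run_pass1 <fastq_dir> <genome_dir> [threads]
# Set STAR_BIN=echo to print the commands instead of running STAR.
run_pass1() {
    fastq_dir=$1
    genome_dir=$2
    threads=${3:-30}
    for r1 in "$fastq_dir"/*.R1.fastq.gz; do
        [ -e "$r1" ] || continue               # glob matched nothing
        r2=${r1%.R1.fastq.gz}.R2.fastq.gz      # expected mate file
        sample=$(basename "$r1" .R1.fastq.gz)
        if [ ! -e "$r2" ]; then
            echo "skipping $sample: missing $r2" >&2
            continue
        fi
        "${STAR_BIN:-STAR}" --genomeDir "$genome_dir" \
            --readFilesIn "$r1" "$r2" \
            --readFilesCommand zcat \
            --outSAMunmapped Within \
            --outFileNamePrefix "$sample." \
            --runThreadN "$threads"
    done
}

# Example (not run here):
#   run_pass1 . ~/db/hg38/
#   mkdir -p SJ_out && cp ./*.SJ.out.tab SJ_out/
```

Deriving the output prefix from the file name keeps each sample's SJ.out.tab distinct, which matters once they are all copied into one folder.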

3) Indexing genome with annotations and SJ.out.tab files

STAR --runMode genomeGenerate --genomeDir ~/db/hg38/SJ_Index/ --genomeFastaFiles ~/db/hg38/SJ_Index/hg38.fa --sjdbGTFfile ~/db/hg38/SJ_Index/hg38.gtf --runThreadN 30 --sjdbOverhang 89 --sjdbFileChrStartEnd SJ_out/*.SJ.out.tab

Note:

  • Again, the index is built for a maximum read length of 90 bp.

4) Second-pass mapping against the new index built with annotations and SJ.out.tab files

STAR --genomeDir ~/db/hg38/SJ_Index/ --readFilesIn sample1.R1.fastq.gz sample1.R2.fastq.gz --readFilesCommand zcat --outSAMunmapped Within --outFileNamePrefix sample1. --runThreadN 30

Notes:

  • Again, I ran the same command for every sample in a for loop, so it generates mapping files for each sample.
Rajesh Detroja wrote:

SOLVED! Please look at my recent conversations with Alex regarding the issue.

darbinator wrote:

I refer you to a post I created on the STAR Google group. I had the same question about the commands for 2-pass mode, and the author of STAR answered it:

https://groups.google.com/forum/#!msg/rna-star/4dhcEGFMiK0/XoMh6rB7CwAJ


In one of his latest posts, Alex suggested the following criteria for filtering the junctions:

1. Filter out the junctions on chrM, those are most likely to be false.

2. Filter out non-canonical junctions (column5 == 0).

3. Filter out junctions supported by multi mappers only (column7==0)

4. Filter out junctions supported by too few reads (e.g. column7<=2)
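
A minimal sketch of how these four criteria could be applied to SJ.out.tab files with awk. It assumes the mitochondrial chromosome is named "chrM" (as in UCSC-style hg38; adjust for other naming) and takes "too few reads" to mean fewer than 3 uniquely mapped reads, following the example threshold above. The function name filter_sj and the MIN_UNIQUE variable are made up for illustration.

```shell
# Hypothetical helper, not part of STAR.
# Keeps a junction only if it is:
#   - not on chrM                          (criterion 1)
#   - canonical, i.e. column 5 > 0         (criterion 2)
#   - backed by >= MIN_UNIQUE unique reads (criteria 3 and 4;
#     column 7 == 0 is already excluded by the column 7 <= 2 cut)
# Reads the given files, or stdin when no files are given.
filter_sj() {
    min_unique=${MIN_UNIQUE:-3}
    awk -F '\t' -v min="$min_unique" \
        '$1 != "chrM" && $5 > 0 && $7 >= min' "$@"
}

# Example (not run here):
#   filter_sj SJ_out/*.SJ.out.tab > SJ_filtered.tab
```

The output keeps the SJ.out.tab column layout, so the filtered file can be given to --sjdbFileChrStartEnd in place of the unfiltered ones.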

Reply by Rajesh Detroja.