Question: How to run STAR with multiple files
Manoj (United States) wrote, 28 days ago:

Hi, I have a total of 197 paired-end (PE) samples (R1 and R2 files). I am trying to run the STAR aligner on all of these files with the following command. However, something seems to be wrong with the script. Any recommendations? Thanks very much.

for i in $(ls raw_data); do echo /DataAnalysis/STAR-2.7.5a/bin/Linux_x86_64/./STAR --genomeDir /DataAnalysis/test-star/SAindex \
--readFilesIn raw_data/${i}_R1.fastq,raw_data/${i}_R2.fastq \
--runThreadN 8 --outFileNamePrefix aligned/$i. \
--outSAMtype BAM SortedByCoordinate \
--quantMode GeneCounts; done
modified 28 days ago by h.mon • written 28 days ago by Manoj

There should be a space between the two file names, not a comma.

modified 28 days ago • written 28 days ago by rpolicastro

I tried that, but it is still not working.

written 27 days ago by Manoj
h.mon (Brazil) wrote, 28 days ago:

A couple of remarks:

  1. Using ls to feed a loop or an array is not a good idea; it is better to use globbing or find (yes, I know the irony: I have advocated using ls in exactly this manner).
  2. As already noted by rpolicastro, STAR expects the input file names to be separated by a space.
  3. The output of ls raw_data includes both the R1 and R2 files, so the file names you are building will be wrong. They will be something like

    raw_data/file01_R1.fastq_R1.fastq,raw_data/file01_R1.fastq_R2.fastq raw_data/file01_R2.fastq_R1.fastq,raw_data/file01_R2.fastq_R2.fastq raw_data/file02_R1.fastq_R1.fastq,raw_data/file02_R1.fastq_R2.fastq

    and so on.

  4. You have an echo in front of your STAR command, so nothing is run; the command is only printed to the screen. This is useful for troubleshooting the command, not for running it.

  5. You are missing the --genomeDir argument preceding the index.

Once you fix these issues, try again, and if something goes wrong, please post the error message as well, because "it seems something wrong with this script" is not informative at all.

written 28 days ago by h.mon
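
The remarks above could be put together roughly as follows. This is only a sketch: it assumes every sample has files named <sample>_R1.fastq and <sample>_R2.fastq in raw_data, it reuses the STAR and index paths from the question, and the helper name sample_name is made up for illustration. It loops over a glob of the R1 files instead of the output of ls, strips the suffix to get the sample name, and passes the two read files separated by a space.

```shell
#!/usr/bin/env bash
# Sketch only: both paths are copied from the question; adjust them to your setup.
STAR=${STAR:-/DataAnalysis/STAR-2.7.5a/bin/Linux_x86_64/STAR}
GENOME_DIR=${GENOME_DIR:-/DataAnalysis/test-star/SAindex}

# Hypothetical helper: raw_data/270_R1.fastq -> 270
sample_name() {
    basename "$1" _R1.fastq
}

# Glob over the R1 files only, so each sample is visited exactly once.
for r1 in raw_data/*_R1.fastq; do
    [ -e "$r1" ] || continue            # the glob matched nothing
    sample=$(sample_name "$r1")
    r2="raw_data/${sample}_R2.fastq"
    if [ ! -e "$r2" ]; then
        echo "Skipping ${sample}: ${r2} not found" >&2
        continue
    fi
    mkdir -p aligned                    # make sure the output directory exists
    # The two mates are separate arguments (space, not comma).
    "$STAR" --genomeDir "$GENOME_DIR" \
        --readFilesIn "$r1" "$r2" \
        --runThreadN 8 --outFileNamePrefix "aligned/${sample}." \
        --outSAMtype BAM SortedByCoordinate \
        --quantMode GeneCounts
done
```

While testing, you can put echo in front of "$STAR" to print the commands instead of running them, then remove it for the real run.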

I have improved the script. However, it is still showing the following error.

for i in $(raw_data/270_R1.fastq,raw_data/270_R2.fastq raw_data/272_R1.fastq,raw_data/272_R2.fastq 
raw_data/274_R1.fastq,raw_data/274_R2.fastq raw_data/278C_R1.fastq,raw_data/278C_R2.fastq 
raw_data/284C_R1.fastq,raw_data/284C_R2.fastq); 
do 
/DataAnalysis/STAR-2.7.5a/bin/Linux_x86_64/./STAR --genomeDir /DataAnalysis/Manoj-data/test-star/SAindex \
--readFilesIn raw_data/${i}_R1.fastq,raw_data/${i}_R2.fastq \
--runThreadN 8 --outFileNamePrefix aligned/$i. \
--outSAMtype BAM SortedByCoordinate \
--quantMode GeneCounts; done

error:

./star.sh: line 7: raw_data/270_R1.fastq,raw_data/270_R2.fastq: No such file or directory
written 27 days ago by Manoj

Two people have already mentioned above that:

STAR expects the input file names separated by a space

yet you are still using a comma in between file names.

--readFilesIn raw_data/${i}_R1.fastq,raw_data/${i}_R2.fastq
modified 27 days ago • written 27 days ago by genomax

I tried both with and without a comma, but it shows the same error. I also tried putting ls before raw_data, and that shows the following error:

ls: cannot access raw_data/270_R1.fastq,raw_data/270_R2.fastq: No such file or directory
ls: cannot access raw_data/272_R1.fastq,raw_data/272_R2.fastq: No such file or directory
ls: cannot access raw_data/274_R1.fastq,raw_data/274_R2.fastq: No such file or directory
ls: cannot access raw_data/278C_R1.fastq,raw_data/278C_R2.fastq: No such file or directory
ls: cannot access raw_data/284C_R1.fastq,raw_data/284C_R2.fastq: No such file or directory
written 27 days ago by Manoj
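
Both error messages appear to have the same cause. $( ... ) is command substitution: the shell tries to run whatever is inside the parentheses as a command, so raw_data/270_R1.fastq,raw_data/270_R2.fastq is treated as a program to execute. With ls in front, the same string is treated as a single file name containing a comma. Neither exists, hence "No such file or directory". The loop only needs a plain list of sample names. A minimal sketch, using the sample names from the post (the function name print_pairs is made up for illustration):

```shell
# Print the two read files that belong to each sample name given as an argument.
print_pairs() {
    for i in "$@"; do
        echo "raw_data/${i}_R1.fastq raw_data/${i}_R2.fastq"
    done
}

# A plain word list: no $( ), no commas, no file extensions.
print_pairs 270 272 274 278C 284C
```

Once the printed names look right, the same list of sample names can drive the STAR command, with --readFilesIn given the two files separated by a space.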

I did not understand what file01_R1.fastq_R1.fastq is. Could you please clarify that?

written 27 days ago by Manoj
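
To illustrate where a name like file01_R1.fastq_R1.fastq comes from: ls raw_data prints complete file names, so $i already ends in _R1.fastq (or _R2.fastq), and the script then appends _R1.fastq a second time. A small sketch, assuming a file called file01_R1.fastq as in the example above:

```shell
i=file01_R1.fastq                    # one of the names that ls raw_data hands to the loop
echo "raw_data/${i}_R1.fastq"        # prints raw_data/file01_R1.fastq_R1.fastq

sample=${i%_R1.fastq}                # strip the suffix to get the bare sample name
echo "raw_data/${sample}_R1.fastq"   # prints raw_data/file01_R1.fastq
```

The ${i%_R1.fastq} expansion removes the trailing _R1.fastq, which leaves the sample name (file01 here) that the rest of the command expects.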