Error with STAR
13 months ago
Chris ▴ 260

Hi bioinformaticians,

I am still learning to use STAR and could not get past this error, even after googling it:

STAR --runThreadN 8 \
  --runMode alignReads \
  --genomeDir /home/groups/user/genomeDir \
  --readFilesIn /home/groups/user/bulk_RNA-seq/sample1/sample1_1.fq \
  --outSAMtype BAM Unsorted

EXITING because of fatal input ERROR: could not open readFilesIn=Read1

Could you please tell me what I am missing here? For paired-end reads, does the path to read 2 follow read 1 separated by a space or by a comma? I appreciate your help!

Update: I can read the FASTQ file now, but I am still not sure about the input for paired-end reads.
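
One way to narrow down a "could not open readFilesIn" error is to confirm that the shell can read each path before STAR is started. The sketch below is only an illustration: `check_fastq` is a hypothetical helper, not part of STAR, and the thread does not say what the actual cause was in this case.

```shell
# check_fastq: hypothetical helper (not part of STAR) that reports whether
# each given path is a regular file readable by the current user.
check_fastq() {
  status=0
  for fq in "$@"; do
    if [ -f "$fq" ] && [ -r "$fq" ]; then
      echo "ok: $fq"
    else
      echo "cannot read: $fq" >&2
      status=1
    fi
  done
  return "$status"
}

# Path taken from the command in the question
check_fastq /home/groups/user/bulk_RNA-seq/sample1/sample1_1.fq \
  || echo "fix the path before running STAR"
```

A non-zero return status means at least one path is wrong or unreadable, which is cheaper to find out here than after STAR has loaded the genome.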


Hi! Separate the read 1 and read 2 files with a space; see the STAR manual.


Thank you! The other manual I read uses [] around read 2, which confused me.

https://physiology.med.cornell.edu/faculty/skrabanek/lab/angsd/lecture_notes/STARmanual.pdf


[] marks an optional parameter; it is not a literal []. Command-line tools will almost never ask you to type a literal [], () or {}, since these characters all have special meaning to the shell.

What the manual is saying is that you give a single FASTQ file if the data are single-end (SE), and two FASTQ files separated by a space if the data are paired-end (PE).
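
To make the space-separated form concrete, here is a minimal sketch that only prints the STAR command for one or two FASTQ files instead of running it. `star_align_cmd` is a hypothetical helper, and `sample1_2.fq` is an assumed name for read 2; the other flags are copied from the command in the question.

```shell
# star_align_cmd: hypothetical helper that prints (does not run) a STAR command.
# Usage: star_align_cmd GENOME_DIR READ1.fq [READ2.fq]
# The [READ2.fq] in the usage line means "optional", not literal brackets.
star_align_cmd() {
  genome_dir=$1
  shift
  if [ "$#" -lt 1 ] || [ "$#" -gt 2 ]; then
    echo "usage: star_align_cmd GENOME_DIR READ1.fq [READ2.fq]" >&2
    return 1
  fi
  # "$*" joins the remaining arguments with single spaces, which is how
  # --readFilesIn takes read 1 and read 2 for paired-end data.
  echo "STAR --runThreadN 8 --runMode alignReads --genomeDir $genome_dir --readFilesIn $* --outSAMtype BAM Unsorted"
}

# Paired-end: read 2 follows read 1 after a space
star_align_cmd /home/groups/user/genomeDir sample1_1.fq sample1_2.fq

# Single-end: only read 1 is given
star_align_cmd /home/groups/user/genomeDir sample1_1.fq
```

Printing the command first makes it easy to check the paths by eye before pasting it into the terminal.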


Thank you for the explanation! If I provide only read 1, STAR creates a lot of files. If I provide both reads, STAR creates fewer files. Could you comment on this?


STAR creates a lot of files

Results or temporary files? What about directories? (Look for directory name patterns: *_STARgenome, *_STARtmp, *_STARpass1.)
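
A quick way to check for those directories is a `find` restricted to the output folder. This is a sketch under the assumption that the STAR output was written to the current directory; `list_star_dirs` is a hypothetical helper name.

```shell
# list_star_dirs: hypothetical helper that lists directories whose names end in
# _STARgenome, _STARtmp or _STARpass1 directly under the given directory
# (defaults to the current directory).
list_star_dirs() {
  find "${1:-.}" -maxdepth 1 -type d \
    \( -name '*_STARgenome' -o -name '*_STARtmp' -o -name '*_STARpass1' \) | sort
}

list_star_dirs .
```

If nothing is printed, no directories matching those patterns are present in that folder.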


I ran it again with only read 1, and the number of files created is the same as when providing both reads. What happens if I use only the read 1 output for the next step with featureCounts?

I have these files:
testAligned.out.bam
testLog.final.out
testLog.out
testLog.progress.out
testSJ.out.tab but not as you wrote in red.


I'm using code formatting to present content better. See how it looks when I format your content:


I ran it again with only read 1, and the number of files created is the same as when providing both reads. What happens if I use only the read 1 output for the next step with featureCounts?

I have these files:

testAligned.out.bam
testLog.final.out
testLog.out
testLog.progress.out
testSJ.out.tab

but not as you wrote in red.


Thank you for your reply. I thought it was output, so I didn't use code formatting. Is this the post you meant as an answer to my question about using SE and PE?

mapping PE or SE?


I thought it was output, so I didn't use code formatting.

Anything you copy and paste from a terminal/console is better off in code formatting. If the text was generated by a program (think of output or an error produced by a command), code formatting will present it best.

There might be multiple posts on this forum and elsewhere addressing SE vs PE mapping - all I'm saying is that it's a different topic and should not be discussed in this thread.


Your question is now about the difference between using SE and PE reads in RNA-seq quantification, which is sufficiently different from your top-level question to warrant its own post (after you search the forum for existing posts that already address it).


Hi Ram,

I ran it again and got 10 output files instead of 5 when providing only read 1:

testsampleAligned.out.bam
testsampleLog.out
testsampleSJ.out.tab
testLog.final.out
testLog.progress.out
testsampleLog.final.out
testsampleLog.progress.out
testAligned.out.bam
testLog.out
testSJ.out.tab

Would you have a comment on this?


All the Log*.out files are logs, so exclude them from the outputs category. The only actual outputs are the BAM file and the SJ tab file.

Also, correct me if I'm wrong but this looks like a cumulative set of files. As in, you ran again and got 5 more files so with the 5 previous files, that makes 10. I am guessing this because one of your runs has the prefix test and the other has the prefix testsample and the rest of the file name components are the same:

test[sample]Log.out
test[sample]Log.progress.out
test[sample]Log.final.out
test[sample]SJ.out.tab        # splice junction file
test[sample]Aligned.out.bam   # alignment file
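
To check the "two runs, two prefixes" guess, you can strip the fixed STAR suffix from each file name and count what is left. This is a rough sketch: `star_prefix` is a hypothetical helper, and it assumes the prefix was set with `--outFileNamePrefix` (the thread only calls it a prefix).

```shell
# star_prefix: hypothetical helper that removes the fixed STAR suffix from a
# file name and prints the remaining run prefix. Returns 1 for other files.
star_prefix() {
  case $1 in
    *Log.progress.out) echo "${1%Log.progress.out}" ;;
    *Log.final.out)    echo "${1%Log.final.out}" ;;
    *Log.out)          echo "${1%Log.out}" ;;
    *SJ.out.tab)       echo "${1%SJ.out.tab}" ;;
    *Aligned.out.bam)  echo "${1%Aligned.out.bam}" ;;
    *) return 1 ;;
  esac
}

# Count how many files share each prefix (file names taken from the post above)
for f in testsampleAligned.out.bam testsampleLog.out testsampleSJ.out.tab \
         testLog.final.out testLog.progress.out testsampleLog.final.out \
         testsampleLog.progress.out testAligned.out.bam testLog.out testSJ.out.tab; do
  star_prefix "$f"
done | sort | uniq -c
```

If each prefix shows up five times, the ten files are most likely two separate runs sitting in the same folder rather than one run producing extra output.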

I remember I made a temp folder to contain the output files but let me run again to check. Thank you.
