Question: Can STAR work with compressed .gz FASTQ input?
cristian wrote (3.0 years ago):

Dear all,

I would like to align some reads to a reference using STAR. The following command works perfectly:

STAR --genomeDir output/spiking/index/star --readFilesIn reads.fastq --outFileNamePrefix outputFolder --runThreadN 8 > message.txt

However, the following command does not work:

STAR --genomeDir output/spiking/index/star --readFilesIn reads.fastq.gz --outFileNamePrefix outputFolder --runThreadN 8 > message.txt

Error: the read ID should start with @ or >

Presumably, STAR expects a FASTQ file and not a fastq.gz file. Does anybody know how to get round this in an efficient way?

Thanks.

C.

Santosh Anand wrote (3.0 years ago):

Add --readFilesCommand zcat to your command line.


Thanks. That didn't work for me, probably because I am on Mac OS X. But I found the thread below, and using 'gunzip -c' instead of 'zcat' worked. https://groups.google.com/forum/#!topic/rna-star/NY56kU3mC64

-- cristian, 3.0 years ago
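For reference, here is a sketch of what the complete command might look like with the decompression option added, reusing the paths from the question. It uses 'gunzip -c' because that was reported to work on the Mac; on Linux, 'zcat' should do the same job. The variable name star_cmd and the "print the command if STAR is not installed" fallback are only there for illustration and are not part of STAR.

```shell
#!/usr/bin/env bash
# Arguments from the original question, plus --readFilesCommand.
# "gunzip -c" writes the decompressed reads to standard output,
# which STAR then reads instead of the raw .gz file.
set -- \
  --genomeDir output/spiking/index/star \
  --readFilesIn reads.fastq.gz \
  --readFilesCommand gunzip -c \
  --outFileNamePrefix outputFolder \
  --runThreadN 8

# Human-readable copy of the full command (hypothetical helper variable).
star_cmd="STAR $*"

if command -v STAR >/dev/null 2>&1; then
  # STAR is on the PATH: run it and capture its messages as in the question.
  STAR "$@" > message.txt
else
  # STAR is not installed here: only show the command that would be run.
  echo "$star_cmd"
fi
```

The only change relative to the failing command in the question is the extra --readFilesCommand option; everything else is unchanged.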

Actually, you can use the bash construct <(gunzip -c filename.gz) to pass a gzipped file (or, similarly, any other kind of compressed file) to a program that doesn't have a built-in mechanism for reading compressed files directly (STAR is awesome in providing one :). This uses a shell feature called process substitution. In essence, the command inside <() is run in a subshell, and its output is made available to the main command (your main program) as if it were a file.

-- Santosh Anand, 3.0 years ago

Hi Santosh, thanks for the reply. I am afraid this sounds a bit too technical for me. Do you know what the whole command would look like using this bash shell hack? C.

-- cristian, 3.0 years ago

Suppose STAR did not support gzipped files such as reads.fastq.gz. You would then run STAR like this:

STAR --genomeDir output/spiking/index/star --readFilesIn <(gunzip -c reads.fastq.gz) --outFileNamePrefix outputFolder --runThreadN 8 > message.txt

Essentially, you put the decompression command, together with the compressed file, inside <().

The same mechanism works for any other program that doesn't support compressed files natively.

-- Santosh Anand, 3.0 years ago
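To try the <() trick without STAR, a small self-contained sketch may help. It builds a tiny gzipped FASTQ file with made-up contents and lets head, standing in for any program that only reads plain files, read it through process substitution. Note that this is a feature of bash (and zsh/ksh), not of plain sh; the file and variable names are invented for the demo.

```shell
#!/usr/bin/env bash
# Build a tiny gzipped FASTQ file to experiment with (made-up demo data).
demo_dir=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' | gzip -c > "$demo_dir/demo.fastq.gz"

# head knows nothing about gzip. <(...) runs gunzip in a subshell and
# replaces itself with a path from which head can read the decompressed
# output, so head sees ordinary FASTQ text.
first_id=$(head -n 1 <(gunzip -c "$demo_dir/demo.fastq.gz"))
echo "First read ID: $first_id"

# Remove the temporary demo files.
rm -rf "$demo_dir"
```

Swapping head for another program, and the demo file for your own .gz file, gives the general pattern described above.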

Thanks, I didn't know about this!

-- cristian, 3.0 years ago