Question: Can STAR work with compressed .gz FASTQ input?
cristian220 wrote (2.1 years ago):

Dear all,

I would like to align some reads to a reference using STAR. The following command works perfectly:

STAR --genomeDir output/spiking/index/star --readFilesIn reads.fastq --outFileNamePrefix outputFolder --runThreadN 8 > message.txt

However, the following command does not work:

STAR --genomeDir output/spiking/index/star --readFilesIn reads.fastq.gz --outFileNamePrefix outputFolder --runThreadN 8 > message.txt

Error: the read ID should start with @ or >

Presumably, STAR expects a FASTQ file and not a fastq.gz file. Does anybody know how to get round this in an efficient way?

Thanks.

C.

Tags: alignment, star, gz, fastq, compressed
Santosh Anand wrote (2.1 years ago):

Add --readFilesCommand zcat to your command line.

cristian220 replied (2.1 years ago):

Thanks. That didn't work for me, probably because I am on Mac OS X. However, I found the thread below, and using 'gunzip -c' instead of 'zcat' worked. https://groups.google.com/forum/#!topic/rna-star/NY56kU3mC64
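
For readers who hit the same platform difference, here is a minimal sketch (not from the thread) that probes which decompression command can stream a .gz file to standard output on the current machine, then prints the matching STAR command line. The probe file, the candidate list, and the variable names are illustrative assumptions; the STAR options are the ones used in the question.

```shell
#!/usr/bin/env bash
# Build a tiny gzipped FASTQ file to probe with (hypothetical test data).
tmpdir=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' | gzip -c > "$tmpdir/probe.fastq.gz"

# Try each candidate and keep the first one that really emits the FASTQ header.
decompress=""
for candidate in "zcat" "gunzip -c" "gzip -dc"; do
    # $candidate is left unquoted on purpose so "gunzip -c" splits into command and flag.
    first_line=$($candidate "$tmpdir/probe.fastq.gz" 2>/dev/null | head -n 1)
    if [ "$first_line" = "@read1" ]; then
        decompress=$candidate
        break
    fi
done
rm -rf "$tmpdir"

# Print the command to run; STAR itself is not invoked here.
echo "Working decompressor: $decompress"
echo "STAR --genomeDir output/spiking/index/star --readFilesIn reads.fastq.gz --readFilesCommand $decompress --outFileNamePrefix outputFolder --runThreadN 8 > message.txt"
```

If no candidate works, the variable stays empty, which signals that none of the three commands could decompress the probe file on that system.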
Santosh Anand replied (2.1 years ago):

Actually, you can use the bash shell hack <(gunzip -c filename.gz) to pass a gzipped file (or, similarly, any other kind of compressed file) to a program that doesn't have a built-in mechanism for reading compressed files directly (STAR is awesome in providing the built-in mechanism :). It uses a shell feature called process substitution. In essence, the command inside <() is run in a subshell, and its output is made available to the main command (your main program) as if it were an input file.

cristian220 replied (2.1 years ago):

Hi Santosh, thanks for the reply. I am afraid this sounds a bit too technical for me. Do you know what the whole command would look like using this bash shell hack? C.
Santosh Anand replied (2.1 years ago):

Suppose STAR did not support gzipped input (reads.fastq.gz). You would then run it like this:

STAR --genomeDir output/spiking/index/star --readFilesIn <(gunzip -c reads.fastq.gz) --outFileNamePrefix outputFolder --runThreadN 8 > message.txt

Essentially, you put the decompression command, together with the compressed file name, inside <().

The same mechanism works for any other program that doesn't support compressed files natively.
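
The process substitution described above can be tried without STAR. The following is a minimal sketch, assuming bash (process substitution is not available in plain POSIX sh) and using awk and head as stand-ins for "a program that only reads plain files"; the sample FASTQ record and variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Create a small gzipped FASTQ file to experiment with (hypothetical data).
workdir=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' | gzip -c > "$workdir/reads.fastq.gz"

# The shell replaces <(...) with a file path (often something like /dev/fd/63;
# the exact value varies), so the program receives an ordinary file argument.
echo "Path seen by the program:" <(true)

# awk is given that path as its input file and reads the decompressed text from it.
line_count=$(awk 'END { print NR }' <(gunzip -c "$workdir/reads.fastq.gz"))

# Same idea with head: the first line is the read ID, which starts with @.
first_id=$(head -n 1 <(gunzip -c "$workdir/reads.fastq.gz"))

echo "Lines: $line_count, first read ID: $first_id"
rm -rf "$workdir"
```

Because the program reads from a pipe rather than a real file, this approach suits tools that read their input once from start to finish; a tool that needs to seek within the file may not work this way.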

cristian220 replied (2.1 years ago):

Thanks, I didn't know about this!