Can bowtie2-build index a gzipped file?
7.9 years ago
shl198 ▴ 420

Hi all,

When I use bowtie2-build to build an index, I cannot give it a .fa.gz file. I also tried gunzip file.fa.gz | bowtie2-build - name, but that didn't work either. Does bowtie2-build require an uncompressed FASTA file as input? Thanks.

bowtie2 bowtie2-build gzip gunzip • 7.0k views
7.9 years ago
Neilfws 49k

In your second attempt, you need the -c flag for gunzip; this outputs the file content to STDOUT so you can pipe it to other commands:

gunzip -c file.fa.gz | bowtie2-build [options]

Looking at the bowtie2 manual, I think bowtie2-build also requires a -c flag in this case. Haven't used it myself so no guarantee that any of this works, apart from the gunzip -c part.

I tried this, didn't work either...

7.9 years ago
xb ▴ 420

It seems bowtie2-build won't accept input piped with "|", though bowtie2 does. (Bowtie 2 version 2.2.3)

Try the command below. It works with small .fa.gz files; I haven't tested it with large files yet.

bowtie2-build -c $(zcat filename.fa.gz | awk '/^>/&&NR>1{printf ","}{ printf "%s",/^>/ ? "":$0 }') filename.bowtie2
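
To see what the one-liner above actually hands to bowtie2-build, here is a small sketch of the awk step on its own. With -c, bowtie2-build reads the reference sequences from the command line as a comma-separated list, so the awk program drops the FASTA headers and joins the sequences with commas. The function name fasta_to_csv and the toy input are made up for illustration, and the ternary is parenthesized for portability across awk implementations.

```shell
# Hypothetical helper (not part of bowtie2): strip FASTA header lines and
# join the sequences of successive records with commas, with no trailing newline.
# First rule: print a comma before every record except the first.
# Second rule: print sequence lines as-is and print nothing for header lines.
fasta_to_csv() {
    awk '/^>/ && NR > 1 { printf "," }
         { printf "%s", (/^>/ ? "" : $0) }'
}

# Toy two-record FASTA input; prints: ACGT,TTTT
printf '>a\nAC\nGT\n>b\nTTTT\n' | fasta_to_csv
```

Because the result is substituted into the command line via $( ... ), the entire reference ends up as a single shell argument. That is fine for a few short sequences, and it is a plausible reason the approach breaks down on genome-sized input.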

That one-liner didn't work for large files. When I tried to build the index for the mouse genome, it failed with:

bash: xrealloc: ../bash/subst.c:5179: cannot allocate 18446744071562067968 bytes...

They should really get around to making this a feature.
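
Since both piping and command-line expansion ran into trouble in this thread, a fallback that avoids both is to decompress to a temporary file and point bowtie2-build at that file. The sketch below assumes bash, gunzip and mktemp are available and that there is enough disk space for the uncompressed FASTA. The function name build_index_from_gz and the BUILD_CMD variable are hypothetical names introduced here (BUILD_CMD is only there so the indexer can be swapped out, for example for a dry run).

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: decompress a gzipped FASTA to a temporary file,
# index it, then remove the temporary copy.
# Arguments: <reference.fa.gz> <index_basename>
build_index_from_gz() {
    local gz_fasta="$1" index_base="$2" tmp_fasta rc

    # Create a uniquely named temporary file for the uncompressed FASTA.
    tmp_fasta="$(mktemp "${TMPDIR:-/tmp}/ref.XXXXXX")" || return 1

    # gunzip -c writes to stdout and leaves the original .gz untouched.
    if ! gunzip -c "$gz_fasta" > "$tmp_fasta"; then
        rm -f "$tmp_fasta"
        return 1
    fi

    # BUILD_CMD is an assumption of this sketch; it defaults to bowtie2-build,
    # which is called as: bowtie2-build <reference_in> <bt2_base>
    "${BUILD_CMD:-bowtie2-build}" "$tmp_fasta" "$index_base"
    rc=$?

    rm -f "$tmp_fasta"
    return "$rc"
}

# Usage (file names are placeholders):
#   build_index_from_gz file.fa.gz name
```

The temporary file is deleted whether or not indexing succeeds, and the function returns the indexer's exit status so a calling script can check it.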
