Database for local BLAST and BLAT for 200-1000bp query sequences in human and primates
6.8 years ago
rmartson ▴ 30

I've installed BLAST and BLAT on my computer (OS X), but I'm confused by how many different files there are to download and by how to set up a database locally. Can I use one database with both BLAST and BLAT? Should I get the partially non-redundant database?

blast blat • 2.6k views
6.8 years ago

Once you have installed blat, you can download a FASTA file for your genome of interest from UCSC. If you wanted hg38, for instance:

$ for chr in $(seq 1 22) X Y; do echo "$chr"; wget -qO- "http://hgdownload.cse.ucsc.edu/goldenpath/hg38/chromosomes/chr$chr.fa.gz" | gunzip -c >> hg38.fa; done

You can use this with blat as described in my answer here.
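
A quick sanity check on the concatenated file may be worthwhile before using it with blat or makeblastdb. The download loop fetches 22 autosomes plus X and Y, so hg38.fa should hold 24 FASTA records. The helper below is only a sketch, and the function name `check_chrom_count` is made up for illustration:

```shell
# Hypothetical helper: compare the number of FASTA records in a file
# against an expected count (default 24 = chr1..chr22 + chrX + chrY).
check_chrom_count() {
  fasta=$1
  expected=${2:-24}
  # grep -c prints 0 and exits non-zero when nothing matches; tolerate that.
  found=$(grep -c '^>' "$fasta" || true)
  if [ "$found" -eq "$expected" ]; then
    echo "ok: $found sequences"
  else
    echo "mismatch: found $found, expected $expected" >&2
    return 1
  fi
}

# Usage (assumes hg38.fa is in the current directory):
#   check_chrom_count hg38.fa
```

A mismatch usually means a download failed partway, or the loop was run twice and appended a second copy to the same file.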

You could run makeblastdb to make BLAST databases:

$ makeblastdb -in hg38.fa -parse_seqids -dbtype nucl

Or download nucleotide databases from NCBI.


Hey, I appreciate your help on my other question. I have the hg38.fa database file and can run BLAT searches on it now. However, the psl2bed script you recommended fails with a "segmentation fault" error:

/usr/local/bin/psl2bed: line 140:  7344 Segmentation fault: 11  ${cmd} ${options} - 0<&0

As far as I can tell, I'm not doing anything wrong on my end.

Is there any other way I can work with psl data? Maybe some way I could retrieve sequences from pslx data and convert it into FASTA somehow?
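
One possible route for the pslx-to-FASTA idea, sketched with awk. It assumes blat was run with `-out=pslx`, so that each row carries the usual 21 PSL columns followed by the per-block query and target sequences in columns 22 and 23 (comma-separated). The function name `pslx_to_fasta` and the FASTA header layout are illustrative choices, not anything blat or BEDOPS provides:

```shell
# Hypothetical helper: read pslx on stdin, write the aligned target
# sequence of each hit as a FASTA record on stdout.
pslx_to_fasta() {
  awk -F'\t' '
    # Data rows start with an integer (matches); header lines do not.
    $1 ~ /^[0-9]+$/ && NF >= 23 {
      seq = $23              # column 23: target sequence, one chunk per block
      gsub(/,/, "", seq)     # join the blocks by dropping the commas
      # Header: tName:tStart-tEnd(strand) qName
      printf(">%s:%s-%s(%s) %s\n%s\n", $14, $16, $17, $9, $10, seq)
    }'
}

# Usage (file names are placeholders):
#   pslx_to_fasta < output.pslx > hits.fa
```

Joining the blocks drops any unaligned bases between them, so each record is the aligned target sequence rather than one contiguous genomic interval. Swapping `$23` for `$22` would give the query-side sequence instead.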


If you are using pre-built binaries, it may be easier to build them from source instead.


I'm on Mac OS X and just used the .pkg installer, so there shouldn't have been any issues.


Hmm, well, if you can put your PSL file somewhere I can get at it, so that I can debug on my end, that would be handy. Now is a good time, as we're getting close to a 2.4.27 release. Let me know if you'd like me to take a look at your file.


Oh wow, I didn't know you were a developer. That's happened to me twice now.

Here's a temporary pastebin, if that works: https://pastebin.com/6QxK1mqG

I just downloaded and installed BEDOPS 2.1.2 for Mac from this site: https://bedops.readthedocs.io/en/latest/content/installation.html#mac-os-x

And then got this:

psl2bed < output.psl > output.bed
/usr/local/bin/psl2bed: line 140:  7336 Segmentation fault: 11  ${cmd} ${options} - 0<&0

Thanks, I'll take a look soon. You may have picked up an older version of the BEDOPS pkg installer; we're up to v2.4.26 at the moment. You might already have the right pkg, but it's worth double-checking, just in case.


Sorry, I installed an older version from that page.

Just found the latest installer on GitHub, thank you.

Edit: I updated and got the same error again, though, with only a slightly different message:

/usr/local/bin/psl2bed: line 140:  8282 Segmentation fault: 11  ${cmd} ${options} - 0<&0

Thanks. Looks like I can repeat the problem on my end. I'll follow up when I know more.


I have patched psl2bed to fix this issue. It may still be a day or two before a new release is out. If you need something now, you could build this from source.

If you need to install a compiler on your OS X computer, you could do the following:

$ xcode-select --install

Once that's out of the way, or if you already have a compiler toolkit installed, you can do the following:

$ cd /tmp
$ git clone https://github.com/bedops/bedops.git
$ cd bedops
$ git checkout v2p4p27_mergedPoolMemory 
$ make
$ make install
$ cp bin/* /usr/local/bin

Then run psl2bed, as described above.
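
If rebuilding is not convenient, a rough awk stand-in can pull BED-like records out of a PSL file in the meantime. This is a sketch under assumptions, not a replacement for psl2bed: the function name `psl_to_bed6` is hypothetical, only six columns are emitted (tName, tStart, tEnd, qName, matches, strand), and the exact column layout psl2bed writes should be checked against the BEDOPS documentation.

```shell
# Hypothetical fallback: read PSL on stdin, write sorted six-column
# BED-like records on stdout.
psl_to_bed6() {
  awk 'BEGIN { FS = OFS = "\t" }
       # Data rows start with an integer (matches); header lines do not.
       # PSL tStart/tEnd are zero-based, half-open, the same as BED.
       $1 ~ /^[0-9]+$/ && NF >= 21 { print $14, $16, $17, $10, $1, $9 }' |
    LC_ALL=C sort -k1,1 -k2,2n -k3,3n
}

# Usage (same redirection pattern as psl2bed):
#   psl_to_bed6 < output.psl > output.bed
```

The sort step orders records by chromosome name and then numerically by start and end, which approximates the ordering BEDOPS tools expect; running the result through sort-bed would be the safer choice before using it with other BEDOPS commands.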
