getfasta won't read my database
6.8 years ago

I have built a local NCBI database on my computer, and it works. I am now trying to extract sequences from the database after I BLAST against it. I am using bedtools version 2.26 and the getfasta command. When I run it, I get: "The requested fasta database file (C:UsersOwnerDesktopLocalDBdatabase.fasta) could not be opened. Exiting!" I am on a Windows computer and using Bash on Ubuntu as the terminal.

$ bedtools getfasta -fo test -tab -fi C:\Users\Owner\Desktop\LocalDB\database.fasta -bed C:\Users\Owner\Desktop\LocalDB\Book1.txt

That is the command I originally ran. I have a BED file of Book1, but I heard getfasta will read txt files just fine. For now the issue is reading the database. Thanks!

Bedtools getfasta NCBI database • 2.9k views

I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in the toolbar whenever you compose or edit a post.


If it is an NCBI BLAST+ database, I don't think bedtools is going to work. Are you trying to retrieve sequences for intervals in a BED file? You will need to use the blastdbcmd command to extract the FASTA sequences you want from the BLAST database into a new file. Once you have them in a new file, you can use bedtools getfasta to get the intervals you need.


So I was toying around with it and found the first problem: Bash on Ubuntu doesn't read Windows paths with backslashes, and I have to start the path with /mnt/ before it can reach any files. I now get: index file /mnt/c/Users/Owner/Desktop/LocalDB/database.fasta.fai not found, generating...

ERROR: embedded newline at line 11347113 within sequence lcl|CP006468.1_cds_H780_YJM996L05288_455 [gene=GAB1] [protein=Gab1p] [protein_id=AJV86731.1] [location=2110728..2111912]
File not suitable for fasta index generation.
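
The two problems described above (Windows-style paths and the "embedded newline" indexing error) can be sketched as two small shell helpers. This is only a sketch under assumptions: the function names `win_to_wsl` and `strip_blank_lines` are made up for illustration, the drive is assumed to be mounted at /mnt/<lowercase drive letter> as in the path above, and the indexing error is assumed to come from an empty line inside a FASTA record, which is the usual cause but has not been confirmed for this file.

```shell
# Hypothetical helper: convert a Windows path such as
# C:\Users\Owner\file.fasta into /mnt/c/Users/Owner/file.fasta.
# Assumes drives are mounted under /mnt/<lowercase drive letter>.
win_to_wsl() {
    # Drive letter is everything before the first colon, lowercased.
    drive=$(printf '%s' "${1%%:*}" | tr '[:upper:]' '[:lower:]')
    # Remainder is everything after the first colon, with \ turned into /.
    rest=$(printf '%s' "${1#*:}" | tr '\\' '/')
    printf '/mnt/%s%s\n' "$drive" "$rest"
}

# Hypothetical helper: write a copy of a FASTA file ($1) to $2 with
# carriage returns and empty/whitespace-only lines removed, so that
# no blank line remains inside a sequence record.
strip_blank_lines() {
    tr -d '\r' < "$1" | awk 'NF' > "$2"
}
```

A possible use would be `strip_blank_lines database.fasta database.clean.fasta`, then pointing `bedtools getfasta -fi` at the cleaned copy (quoting any Windows path in single quotes when passing it to `win_to_wsl`, so Bash does not swallow the backslashes).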

The idea is that I BLAST a sequence against my database and pull out all the sequences that match it for downstream analysis. Can I use a BLAST command to do that?

6.8 years ago
GenoMax 141k

Once you have your blast hits you can retrieve those sequences from the blast database using blastdbcmd. Look at the inline help for that command to see how to use it.
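
As a rough sketch of that retrieval step: the helpers below collect the unique subject IDs from tabular BLAST output and pass them to `blastdbcmd -entry_batch`. The function names and the file and database names in the usage comment (`hits.tsv`, `LocalDB`, `hits.fasta`) are hypothetical, and the sketch assumes the hits were written with `-outfmt 6` (subject ID in column 2) and that the database was built with `-parse_seqids` so entries can be looked up by ID.

```shell
# Hypothetical helper: print the unique subject IDs (column 2) from
# tab-separated BLAST output produced with -outfmt 6.
unique_hit_ids() {
    cut -f 2 "$1" | sort -u
}

# Hypothetical helper: fetch every hit sequence from a BLAST database.
#   $1 = tabular BLAST output, $2 = database name, $3 = output FASTA
# Assumes the database was created with makeblastdb -parse_seqids.
fetch_hits() {
    unique_hit_ids "$1" > "$3.ids"
    blastdbcmd -db "$2" -entry_batch "$3.ids" -out "$3"
}

# Example call (names are placeholders):
#   fetch_hits hits.tsv LocalDB hits.fasta
```

Running `blastdbcmd -help` lists the remaining options, for example `-range` for pulling out only part of a sequence.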
