Problem In Making Blast Database (Makeblastdb)
10.1 years ago
Zealseeker • 0

Hello, I am confused by BLAST. Here is the problem: I have made a FASTA file as follows:

>1|DNA (cytosine-5)-methyltransferase 3A
MPAMPSSGPGDTSSSAAEREEDRKDGEEQEEPRGKEERQEPSTTARKVGRPGRKRKHPPV
ESGDTPKDPAVISKSPSMAQDSGASELLPNGDLEKRSEPQPEEGSPAGGQKGGAPAEGEG
AAETLPEASRAVENGCCTPKEGRGAPAEAGKEQKETNIESMKMEGSRGRLRGGLGWESSL
RQRPMPRLTFQAGDPYYISKRKRDEWLARWKREAEKKAKVIAGMNAVEENQGPGESQKVE
EASPPAVQQPTDPASPTVATTPEPVGSDAGDKNATKAGDDEPEYEDGRGFGIGELVWGKL
RGFSWWPGRIVSWWMTGRSRAAEGTRWVMWFGDGKFSVVCVEKLMPLSSFCSAFHQATYN
KQPMYRKAIYEVLQVASSRAGKLFPVCHDSDESDTAKAVEVQNKPMIEWALGGFQPSGPK
GLEPPEEEKNPYKEVYTDMWVEPEAAAYAPPPPAKKPRKSTAEKPKVKEIIDERTRERLV
YEVRQKCRNIEDICISCGSLNVTLEHPLFVGGMCQNCKNCFLECAYQYDDDGYQSYCTIC
CGGREVLMCGNNNCCRCFCVECVDLLVGPGAAQAAIKEDPWNCYMCGHKGTYGLLRRRED
WPSRLQMFFANNHDQEFDPPKVYPPVPAEKRKPIRVLSLFDGIATGLLVLKDLGIQVDRY
IASEVCEDSITVGMVRHQGKIMYVGDVRSVTQKHIQEWGPFDLVIGGSPCNDLSIVNPAR
KGLYEGTGRLFFEFYRLLHDARPKEGDDRPFFWLFENVVAMGVSDKRDISRFLESNPVMI
DAKEVSAAHRARYFWGNLPGMNRPLASTVNDKLELQECLEHGRIAKFSKVRTITTRSNSI
KQGKDQHFPVFMNEKEDILWCTEMERVFGFPVHYTDVSNMSRLARQRLLGRSWSVPVIRH
LFAPLKEYFACV

>2|DNA (cytosine-5)-methyltransferase 3B
MKGDTRHLNGEEDAGGREDSILVNGACSDQSSDSPPILEAIRTPEIRGRRSSSRLSKREV
SSLLSYTQDLTGDGDGEDGDGSDTPVMPKLFRETRTRSESPAVRTRNNNSVSSRERHRPS
PRSTRGRQGRNHVDESPVEFPATRSLRRRATASAGTPWPSPPSSYLTIDLTDDTEDTHGT
PQSSSTPYARLAQDSQQGGMESPQVEADSGDGDSSEYQDGKEFGIGDLVWGKIKGFSWWP
AMVVSWKATSKRQAMSGMRWVQWFGDGKFSEVSADKLVALGLFSQHFNLATFNKLVSYRK
AMYHALEKARVRAGKTFPSSPGDSLEDQLKPMLEWAHGGFKPTGIEGLKPNNTQPVVNKS
KVRRAGSRKLESRKYENKTRRRTADDSATSDYCPAPKRLKTNCYNNGKDRGDEDQSREQM
ASDVANNKSSLEDGCLSCGRKNPVSFHPLFEGGLCQTCRDRFLELFYMYDDDGYQSYCTV
CCEGRELLLCSNTSCCRCFCVECLEVLVGTGTAAEAKLQEPWSCYMCLPQRCHGVLRRRK
DWNVRLQAFFTSDTGLEYEAPKLYPAIPAARRRPIRVLSLFDGIATGYLVLKELGIKVGK
YVASEVCEESIAVGTVKHEGNIKYVNDVRNITKKNIEEWGPFDLVIGGSPCNDLSNVNPA
RKGLYEGTGRLFFEFYHLLNYSRPKEGDDRPFFWMFENVVAMKVGDKRDISRFLECNPVM
IDAIKVSAAHRARYFWGNLPGMNRPVIASKNDKLELQDCLEYNRIAKLKKVQTITTKSNS
IKQGKNQLFPVVMNGKEDVLWCTELERIFGFPVHYTDVSNMGRGARQKLLGRSWSVPVIR
HLFAPLKDYFACE
......

The number before '|' is the protein ID, and the string after '|' is the name of the protein. Both are important information.
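For reference, the header convention described above (ID before '|', name after it) can be split with a few lines of Python. This is only a minimal sketch; `parse_header` is a hypothetical helper name, not part of BLAST, and it assumes the first '|' on the line is the separator.

```python
def parse_header(line):
    """Split a header such as '>1|Some protein name' into (id, name)."""
    if not line.startswith(">"):
        raise ValueError("not a FASTA header: %r" % line)
    # Assumption: the first '|' separates the ID from the name.
    protein_id, _, name = line[1:].strip().partition("|")
    return protein_id, name


if __name__ == "__main__":
    print(parse_header(">1|DNA (cytosine-5)-methyltransferase 3A"))
```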

But when I execute

makeblastdb -in targets.fasta -out targets -dbtype prot

It seems to get stuck. The disk activity light on my computer blinks constantly or stays lit, and the computer becomes very slow. After several minutes the CMD window (Windows OS) closes by itself, and nothing has changed.

This FASTA file is smaller than 1 MB. Normally makeblastdb takes less than 1 second, even for a file larger than 1 MB.

blast • 3.2k views

Is there a gap between two sequences?


I use '\r\n' as the line break. >1|DNA...[name]\r\nMKG...[seq]\r\n>2...


This works for me without any error:

makeblastdb -in in.fasta -dbtype prot -out blast.out -parse_seqids

Could you try with a small set of sequences first and check whether it still throws an error?
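One way to build such a test set is to copy only the first few records into a new file. The sketch below is an illustration, not a BLAST tool; `first_records` is a hypothetical helper that works on text already read from the FASTA file.

```python
def first_records(fasta_text, n):
    """Return the first n FASTA records of fasta_text, one line per '\\n'."""
    kept = []
    seen = 0
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            seen += 1
            if seen > n:
                break
        # Skip blank lines and anything that precedes the first header.
        if seen and line.strip():
            kept.append(line)
    return "".join(line + "\n" for line in kept)


if __name__ == "__main__":
    sample = ">1|first\nMPAMPSS\nGPGDTSS\n\n>2|second\nMKGDTRH\n>3|third\nAAAA\n"
    print(first_records(sample, 2), end="")
```

The returned text can be written to a new file (for example a hypothetical subset.fasta) and passed to makeblastdb with -in to see whether the small case finishes.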


Thank you for your advice. It works after I copy a set of sequences into a new file, and I found that it still works even if I copy all of the sequences into a new text file. The file that does not work was created with Python. Maybe it is a problem with my code? I just changed the encoding of the "wrong file" to Unicode, and now it works. Amazing! (Sorry for my poor English; I hope you can understand.)
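Since re-saving the file with a different encoding made the problem go away, it may help to look at the raw bytes of the failing file before and after. The sketch below only reports what is in the file; the thread does not establish which property, if any, caused makeblastdb to stall. `inspect_fasta_bytes` is a hypothetical helper name.

```python
import codecs


def inspect_fasta_bytes(data):
    """Report byte-level properties of a FASTA file's raw contents."""
    # UTF-32 marks are checked before UTF-16 because the UTF-32 LE mark
    # begins with the same two bytes as the UTF-16 LE mark.
    boms = [
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF32_LE, "utf-32-le"),
        (codecs.BOM_UTF32_BE, "utf-32-be"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    ]
    bom = next((name for mark, name in boms if data.startswith(mark)), None)
    crlf = data.count(b"\r\n")
    return {
        "bom": bom,                            # byte order mark, if any
        "crlf": crlf,                          # Windows line endings
        "bare_cr": data.count(b"\r") - crlf,   # '\r' not followed by '\n'
        "nul_bytes": data.count(b"\x00"),      # common in UTF-16/UTF-32 text
        "non_ascii": sum(1 for b in data if b > 127),
        "ends_with_newline": data.endswith((b"\n", b"\r")),
    }


if __name__ == "__main__":
    # targets.fasta is the file name used in the question.
    with open("targets.fasta", "rb") as handle:
        print(inspect_fasta_bytes(handle.read()))
```

Comparing the report for the failing file with the report for the re-saved copy should show what actually changed between them.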


Yes, this should work, and there is no obvious problem in what you show. Which version are you using? Have you tried with a single sequence first?
