Question: BLAST options error (input format type error) in makeblastdb
3.2 years ago, 2013302630028 wrote:
When I use the `makeblastdb` command to build a protein database from a single FASTA file, it works. But when I combine two such FASTA files into one, `makeblastdb` reports a BLAST options error.
My BLAST+ version is 2.4.0+, the operating system is Ubuntu 16.04, and the detailed information is below.
sqreb@sqreb-Vostro-1450:~/ORF_culster/blastdb$ makeblastdb -in progbbct1.fasta -input_type fasta -dbtype prot -parse_seqids -title 'bct_protein data in genbank' 


Building a new DB, current time: 07/21/2016 16:05:51
New DB name:   /home/sqreb/ORF_culster/blastdb/progbbct1.fasta
New DB title:  bct_protein data in genbank
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 50272 sequences in 1.98013 seconds.
makeblastdb -in progbbct2.fasta -input_type fasta -dbtype prot -parse_seqids -title 'bct_protein data in genbank' 


Building a new DB, current time: 07/21/2016 16:06:30
New DB name:   /home/sqreb/ORF_culster/blastdb/progbbct2.fasta
New DB title:  bct_protein data in genbank
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 70668 sequences in 2.765 seconds.
sqreb@sqreb-Vostro-1450:~/ORF_culster/blastdb$ cat -b progbbct1.fasta progbbct2.fasta >> progbbct_12.fasta
sqreb@sqreb-Vostro-1450:~/ORF_culster/blastdb$ makeblastdb -in progbbct_12.fasta -input_type fasta -dbtype prot -parse_seqids -title 'bct_protein data in genbank'


Building a new DB, current time: 07/21/2016 16:08:07
New DB name:   /home/sqreb/ORF_culster/blastdb/progbbct_12.fasta
New DB title:  bct_protein data in genbank
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B
BLAST options error: progbbct_12.fasta does not match input format type, default input type is FASTA

This is my FASTA file:

>BAA21794.1|AB000100.1 intrinsic membrane protein|121..912
MVRTPVPLYLRWAVSILSVLAFLAIWQIAAASGFLGKTFPGSLRTLQDLFGWLSDPFFDN
GPNDLGIGWNLLISLRRVAIGYLLATVVAIPLGIAIGMSALASSIFSPFVQLLKPVSPLA
WLPIGLFLFRDSELTGVFVILISSLWPTLINTAFGVANVNPDFLKVSQSLGASRWRTILK
VILPAALPSIIAGMRISMGIAWLVIVAAEMLLGTGIGYFIWNEWNNLSLPNIFSAIIIIG
IVGILLDQGFRFLENQFSYAGNR
>BAA21795.1|AB000100.1 malK-like protein|916..1785
MISEAVPAKEETGQAQLLIEQVGKVFTVNSPSLLDRLRQRSPKRYVALEDVNLTIASNTF
VSIIGPSGCGKSTLLNLIAGLDLPTSGQILLDGQRIRSPGPDRGIVFQNYALMPWMTALE
NVIFAVETARPNLSKSQAREVAREHLELVGLTKAADRYPGQISGGMKQRVAIARALSIRP
KLLLMDEPFGALDALTRGYLQEEVLRIWEANKLSVVLITHSIDEALLLSDRIVVMSRGPR
ATIREVIDLPAVRPRQRSVIEEDERFVKIKLRLEEHLFNETRAVEEASV
...
>BAL47787.1|AB648215.1 isocitrate dehydrogenase|<1..>518
DAAVEKAYKGERKISWMEIYTGEKSTQVYGQDVWLPAETLDLIREYRVAIKGPLTTPVGG
GIRSLNVALRQELDLYICLRPVRYYQGTPSPVKHPELTDMVIFRENSEDIYAGIEWKADS
ADAEKVINFLREEMGVKKIRFPEHCGIGIKPCSEEGTKRLVRAAIEYAIAND
>BAL47788.1|AB648216.1 isocitrate dehydrogenase|<1..>518
DAAVEKAYKGERKISWMEIYTGEKSTQVYGQDVWLPAETLDLIREYRVAIKGPLTTPVGG
GIRSLNVALRQELDLYICLRPVRYYQGTPSPVKHPELTDMVIFRENSEDIYAGIEWKADS
ADAEKVIKFLREEMGVKKIRFPEHCGIGIKPCSEEGTKRLVRAAIEYAIAND
(END)
3.2 years ago, Sej Modha (Glasgow, UK) wrote:

I tested your command on the sample file you provided, using BLAST+ version 2.4.0+, and it works.
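
One thing worth checking, going only by the commands shown in the question: the combined file was built with `cat -b`, and the `-b` option of `cat` numbers every non-blank output line. If that is what happened here, each line of `progbbct_12.fasta` would begin with a line number instead of `>` or residues, which would be consistent with the "does not match input format type" error. A minimal sketch with two tiny made-up files (`a.fasta` and `b.fasta` are placeholders, not the real data):

```shell
# Scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)

# Two tiny placeholder FASTA files (hypothetical records).
printf '>seq1 demo\nMVRTPVPLYL\n' > "$workdir/a.fasta"
printf '>seq2 demo\nMISEAVPAKE\n' > "$workdir/b.fasta"

# cat -b prefixes each non-blank line with a line number.
cat -b "$workdir/a.fasta" "$workdir/b.fasta" > "$workdir/numbered.fasta"
# Plain cat concatenates the files unchanged.
cat "$workdir/a.fasta" "$workdir/b.fasta" > "$workdir/plain.fasta"

# Count header lines that really start with '>'.
# grep -c exits non-zero on a zero count, hence the '|| true'.
numbered_headers=$(grep -c '^>' "$workdir/numbered.fasta" || true)
plain_headers=$(grep -c '^>' "$workdir/plain.fasta" || true)
echo "numbered: $numbered_headers headers, plain: $plain_headers headers"
```

If the numbered file shows no lines starting with `>`, rebuilding the combined file with plain `cat` would be the first thing to try. Note also that `>>` appends to an existing file, so `>` is the safer redirect when the output should be created fresh.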

I combined the GenBank files first and then converted the result to FASTA format, and the database builds. But makeblastdb now prints the warnings below. What is wrong with my BLAST+? Thanks.
sqreb@sqreb-Vostro-1450:~/ORF_culster/blastdb$ makeblastdb -in bct_12.fasta -input_type fasta -dbtype prot -parse_seqids -title 'bct_protein data in genbank'


Building a new DB, current time: 07/21/2016 18:03:18
New DB name:   /home/sqreb/ORF_culster/blastdb/bct_12.fasta
New DB title:  bct_protein data in genbank
Sequence type: Protein
Keep MBits: T
Maximum file size: 1000000000B
FASTA-Reader: Ignoring invalid residues at position(s): On line 2: 45-46
FASTA-Reader: Ignoring invalid residues at position(s): On line 3: 45-46
FASTA-Reader: Ignoring invalid residues at position(s): On line 4: 45-46
...
FASTA-Reader: Ignoring invalid residues at position(s): On line 834524: 50-56
FASTA-Reader: Ignoring invalid residues at position(s): On line 834527: 45-51
Adding sequences from FASTA; added 120940 sequences in 20.5835 seconds.
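
The `FASTA-Reader: Ignoring invalid residues` warnings give a line number and a column range, so the offending characters can be inspected directly. A rough sketch, assuming that anything other than ASCII letters, `*` and `-` on a sequence line is suspicious (that character set is an assumption for illustration, not the exact rule the BLAST+ FASTA reader applies), and using a made-up demo file:

```shell
# Placeholder file with one deliberately bad sequence line (hypothetical data).
demo=$(mktemp)
printf '>seq1 demo\nMVRTPVPLYL\nMISEA12PAKE\n>seq2 demo\nGPNDLGIGWN\n' > "$demo"

# Report non-header lines that contain characters outside the assumed
# residue alphabet, printing the line number and the line itself.
suspicious=$(awk '!/^>/ && /[^A-Za-z*-]/ { print NR ": " $0 }' "$demo")
echo "$suspicious"
```

To look at the exact columns named in a warning such as `On line 2: 45-46`, something like `sed -n '2p' bct_12.fasta | cut -c45-46 | od -c` shows the raw characters at those positions.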

Reply written 3.2 years ago by 2013302630028

There was a bug in my parser, and I have fixed it. Thank you!

Reply written 3.2 years ago by 2013302630028