How do I resolve this error when using tblastn -db?
5.5 years ago
jaqx008 ▴ 110

Hey all, I downloaded an 88 MB genome from NCBI, built a BLAST database for the organism, and tried to use tblastn to find a protein homolog. I ran into the error below and am looking for ways to overcome it. I have done this before and it worked fine, but not this time. I checked the line mentioned in the error and it looked fine. I assumed there might be an error in my download, so I repeated the download twice and still got the same error.

command

tblastn -db genome -query protein.txt -out file2.txt

Error

Error: NCBI C++ Exception: T0 "/private/tmp/blast-20170511-49816-1rw4ib1/ncbi-blast-2.6.0+-src/c++/src/objtools/readers/fasta.cpp", line 2428: Error: CFastaReader: Near line 1, there's a line that doesn't look like plausible data, but it's not marked as defline or comment. (m_Pos = 1)

mapping tblastn -db • 9.1k views

I think the error is more likely triggered by your input FASTA file. Does that one look OK?


Yes, the FASTA looks fine to me. Is there any way to check other than scanning through it visually?

5.5 years ago

what is the output of

file protein.txt

and

head protein.txt
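
The two checks above can also be scripted. This is a minimal sketch, assuming a POSIX-style shell where head -c is available (true on GNU and BSD systems); looks_like_fasta is a hypothetical helper name, not part of BLAST. It counts bytes that are neither printable ASCII nor whitespace, then checks that the file starts with the '>' of a FASTA defline.

```shell
#!/bin/sh
# Hypothetical helper (not a BLAST tool): returns 0 if the file looks
# like plain-text FASTA, 1 otherwise, and prints the reason.
looks_like_fasta() {
    f=$1
    # Count bytes that are neither printable ASCII nor whitespace.
    bad=$(LC_ALL=C tr -d '[:print:][:space:]' < "$f" | wc -c | tr -d ' ')
    # First byte of the file; tr strips NULs so the shell does not warn.
    first=$(head -c 1 "$f" | tr -d '\0')
    if [ "$bad" -ne 0 ]; then
        echo "$f: $bad non-text byte(s) found, not a plain-text file"
        return 1
    fi
    if [ "$first" != ">" ]; then
        echo "$f: first character is not '>', so line 1 is not a defline"
        return 1
    fi
    echo "$f: looks like plain-text FASTA"
}
```

Calling looks_like_fasta protein.txt before running tblastn would flag a binary query file without having to scroll through it. The check is deliberately strict: a file whose first line is indented, or that uses non-ASCII characters in the defline, is also rejected.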

file protein.txt gave "protein.txt: data", and head protein.txt gave binary-looking output (just gibberish inside, I guess).


you have your answer: this is not a fasta file.


The content of the text file looks like this. I believe it's a FASTA file:

 >Cel-Ego1
    MGDEGYRGWIKLEIPCSLPERQMGPIVKCHVAKLEPALNEYNIKVLTKGQVQVVEEQDCEPFYETNYEVATSRFSHDLIA
    AIQTYLKDLSTDHLMPFQRGNLVLHSSDFWSSELTCHLVDIPLAAVFFGNIQGGTFINHWEVSFWDDVRRRKSARTRNTE
    PTQADKIGMNQIKVEFEFDKIDFMTVHFKHFENDFEVADKDAKRTKQTVTMYYQITVRRTSIRRIIVDPVVQDCNGSDRI
    RVHFELNCPVLIRRAYRTAKQESENRHSVPHYRRYLVINRGRSANQYPTAKAITDSPVFTIEFDQSVGLNEIYRLLSRLR
    IRTGVSIEFADIPSIDCLIWRENPYNRWTFLNNQHLSPTHFSAPIYRDFITTAFPKKHEVCGSREVDTNRERKFAITYLL

It's not; file protein.txt would have returned something else.

$ echo -e ">Cel-Ego1\nMGDEGYRGWIKLEIPCSLPERQMGPIVKCHVAKLEPALNEYNIKVLTKGQVQVVEEQDCEPFYETNYEVATSRFSHDLIA\nAIQTYLKDLSTDHLMPFQRGNLVLHSSDFWSSELTCHLVDIPLAAVFFGNIQGGTFINHWEVSFWDDVRRRKSARTRNTE" | file -

/dev/stdin: ASCII text

It's a binary file. Was it edited with an exotic text editor?
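
To see what the file actually starts with, without dumping raw binary to the terminal, the first bytes can be printed as characters and escapes. This is a rough sketch; peek_bytes is a hypothetical helper name and the default of 32 bytes is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: show the first N bytes (default 32) of a file
# as characters and backslash escapes instead of raw binary.
peek_bytes() {
    head -c "${2:-32}" "$1" | od -c
}
```

For example, peek_bytes protein.txt 16 shows the first 16 bytes. A plain FASTA file would begin with '>' followed by the sequence name; an RTF document would begin with {\rtf, and a UTF-16 file would show \0 between letters. The thread never established which format this particular file was in, so these are only examples of what to look for.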


You were right. I copied the content into a new text file and got results. Thanks for your suggestion.
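
A pasted copy can still carry indentation (as in the snippet quoted earlier in this thread) or Windows line endings. A minimal sketch for tidying such a file, assuming it is already plain text; clean_fasta and the output file name are hypothetical, and this does not repair a binary file.

```shell
#!/bin/sh
# Hypothetical helper: strip carriage returns, leading and trailing
# whitespace, and empty lines from a plain-text FASTA file.
clean_fasta() {
    tr -d '\r' < "$1" |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}
```

Usage would look like clean_fasta protein.txt > protein.clean.fa, followed by file protein.clean.fa to confirm that the result is reported as ASCII text before passing it to tblastn with -query.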


If an answer was helpful you should upvote it, if the answer resolved your question you should mark it as accepted.


I did upvote it.


Is the indentation of the sequence a formatting issue in your post?

How did you get to that content? cat, tail, head?


head returned a binary stream...


Exactly, I just wanted to know how the OP then got any content out of that file, and to hint at the fact that the OP likely edited the file with something other than Linux-compatible software.

