How To Determine Whether Proteins Are Contained In Assembled Contigs
10.5 years ago
HG ★ 1.2k

Hi everyone, I want to check whether a set of proteins is contained in my assembled contigs.

After de novo assembly, I made a local BLAST database from the contig file, which is in FASTA format:

makeblastdb -in contigsH131.fasta -dbtype nucl -out ecolidb.db

In the next step I tried tblastn and got an error message:

 ./tblastn -num_threads 2 -comp_based_stats F -query refence.fasta -db ecoli.db -out report.bls

Error message:

BLAST Database error: No alias or index file found for nucleotide database [ecoli.db] in search path [/home/hiren/Desktop/test/ncbi-blast-2.2.28+/bin::

Can anyone please help me out?

Thank you so much.


Have you tried to set the complete db path? It's like the program is looking it up in the blast bin directory.


No, I did not set it, because my working directory is the bin directory and the database is also in bin. Do you think I should set the complete path?


If I were you, I'd first predict proteins, and then create a protein db to blast against. Anyway, in your makeblastdb command you call the db "ecolidb.db" and then you try to blast against "ecoli.db". There's your problem.
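The name mismatch can be avoided by keeping the database name in a single variable. This is a minimal sketch, assuming BLAST+ is on the PATH and the file names from the question; `blast_nucl_db_exists` is a hypothetical helper that only checks for the `.nin` index or `.nal` alias file the error message refers to.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a nucleotide BLAST database with this
# base name has an index (.nin) or alias (.nal) file on disk.
blast_nucl_db_exists() {
    [ -f "$1.nin" ] || [ -f "$1.nal" ]
}

DB=ecolidb.db   # one variable, so makeblastdb and tblastn cannot disagree

# Only run the real tools when they are installed and the input exists.
if command -v makeblastdb >/dev/null 2>&1 && [ -f contigsH131.fasta ]; then
    makeblastdb -in contigsH131.fasta -dbtype nucl -out "$DB"
    if blast_nucl_db_exists "$DB"; then
        tblastn -num_threads 2 -comp_based_stats F \
            -query refence.fasta -db "$DB" -out report.bls
    else
        echo "no index files found for $DB" >&2
    fi
fi
```

With the names from the question, the check would pass for `ecolidb.db` and fail for `ecoli.db`, which matches the "No alias or index file found" error.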


OK, I appreciate the idea. If I understand you correctly: I will first predict the proteins of the reference genome and make a local protein db, then blast my contig db against it. Please correct me if I am wrong.


Sorry, what I meant to write is that you first predict proteins from your contigs with e.g. FragGeneScan, and then blast them against a protein db, e.g. nr or refseq_protein, or maybe some more specific db that serves your goals better. The upside of blasting against nr or refseq_protein is that you can link your hits to KO numbers (if you have access to the KEGG FTP) and then also see which modules/pathways are present.
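The second half of that workflow might look roughly like the sketch below. It assumes the predicted proteins are already in a FASTA file (`predicted.faa` is a hypothetical name), that a local `refseq_protein` BLAST database is installed, and that an E-value cutoff of 1e-5 is acceptable; none of these come from the thread.

```shell
#!/bin/sh
# Count FASTA records (header lines starting with '>') in a file.
count_fasta_records() {
    grep -c '^>' "$1"
}

PROTEINS=predicted.faa     # hypothetical: proteins predicted from the contigs
DB=refseq_protein          # assumed to be a local protein BLAST database

# Run only when blastp is installed and the query file exists.
if command -v blastp >/dev/null 2>&1 && [ -f "$PROTEINS" ]; then
    echo "query proteins: $(count_fasta_records "$PROTEINS")"
    # Tabular output (-outfmt 6) is easy to filter and join to annotations.
    blastp -query "$PROTEINS" -db "$DB" \
        -evalue 1e-5 -outfmt 6 -num_threads 2 -out predicted_vs_refseq.tsv
fi
```

Counting the records first is a cheap sanity check that gene prediction actually produced sequences before a long search is started.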


I tried your suggestion but got an error. Any idea?

./run_FragGeneScan.pl -genome=./mydata/contigsH131.fasta -out=./mydata/contigsH131.test  -complete=1  -train=illumina_10
no. of seqs: 191

Use of uninitialized value $sff in split at ./post_process.pl line 141, <SEQ> line 88230.

No clue, I'm using this version of FGS. If you're predicting from contigs, the command is: ./FragGeneScan -s contigs.fasta -o FGS -w 1 -t complete


OK, let me check and I'll let you know.


The error looks like this. Any idea?

./FragGeneScan -s contigsH131.fasta -o FGS -w 1 -t illumina_1

no. of seqs: 189

Segmentation fault (core dumped)

Is that with the "mg-rast" version I linked, or the one from http://omics.informatics.indiana.edu/FragGeneScan? I also had segmentation faults with the latter.
