I am trying to use blastx to compare my Trinity.fasta file against the Swiss-Prot (sp) and UniRef90 databases on Linux. To do that, I have to download Swiss-Prot (sp) and UniRef90 as FASTA files from http://www.uniprot.org/downloads
After the download I got uniref90.fasta.gz.part, and I cannot extract this file. How can I extract it?
If you have any guidance on functional annotation using a Trinity FASTA file, please let me know.
Thank you so much for your information
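A note on the .part file: that suffix is what some browsers (Firefox, for example) give to a download that has not finished, so the archive is most likely incomplete and needs to be downloaded again. Once you have a complete .gz file, a minimal sketch for checking and decompressing it could look like this. The function name extract_gz is only illustrative, and I am assuming the standard gzip tool is installed:

```shell
# Hypothetical helper (the name is illustrative): verify a gzip archive,
# then decompress it next to the original while keeping the .gz file.
extract_gz() {
    archive="$1"
    # gzip -t tests archive integrity; a truncated download fails here.
    if gzip -t "$archive" 2>/dev/null; then
        # Write the decompressed data to the same name without ".gz".
        gzip -dc "$archive" > "${archive%.gz}"
    else
        echo "Archive is incomplete or corrupt, download it again: $archive" >&2
        return 1
    fi
}

# Example call (assumed file name):
# extract_gz uniref90.fasta.gz
```

If the integrity check fails, the file has to be fetched again; nothing useful can be extracted from a truncated archive.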
Dear toralmanvar, may I ask again, this time about uniprot_sprot.fasta (sp)? I downloaded this FASTA file from the same website as above, but when I run the following commands
FORMAT="6 qseqid sseqid evalue stitle"
EVALUE=1.0e-5
QUERY_CODE=1
MAX_TARGET_SEQ=1
NCPU=4
Home_blastx=/usr/local/bin/blastx
OUTF=

blastx -query $QUERY \
    -db $DB \
    -evalue $EVALUE \
    -query_gencode $QUERY_CODE \
    -max_target_seqs $MAX_TARGET_SEQ \
    -num_threads $NCPU \
    -outfmt "$FORMAT" \
    -out $OUTF
I get this error: BLAST Database error: No alias or index file found for protein database [/run/media/hscience/DATA_CentOS/DATABASE/db/uniPROT/uniprot_sprot.fasta] in search path [/run/media/hscience/DATA_CentOS/Bluberry/RNAseqblueberry/4_RSEM_edgeR/edgeR.genes.dir/P1e-10_C8/PickIsoform_Echota_UP/Swissprot::]
Do you think the problem is with uniprot_sprot.fasta, or is something else wrong?
Thank you Kan
Have you formatted your database as I instructed in my previous answer? You are getting this error because BLAST cannot find a formatted database. Please format the database using the makeblastdb program:
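A minimal sketch of that formatting step, assuming the decompressed FASTA file is called uniref90.fasta and that makeblastdb from BLAST+ is on your PATH. The wrapper function format_protein_db is only an illustration; the line that matters is the makeblastdb call inside it:

```shell
# Hypothetical wrapper (the name is illustrative) around makeblastdb.
format_protein_db() {
    fasta="$1"
    if [ ! -f "$fasta" ]; then
        echo "FASTA file not found: $fasta" >&2
        return 1
    fi
    if ! command -v makeblastdb >/dev/null 2>&1; then
        echo "makeblastdb is not on PATH (it ships with BLAST+)" >&2
        return 1
    fi
    # -dbtype prot builds a protein database, which blastx searches against.
    # Without -out, the database name defaults to the input file name.
    makeblastdb -in "$fasta" -dbtype prot
}

# Example call (assumed file name); use uniprot_sprot.fasta for Swiss-Prot:
# format_protein_db uniref90.fasta
```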
This generates three files: uniref90.fasta.phr, uniref90.fasta.pin and uniref90.fasta.psq.
Once they are generated, you can use the formatted database with BLAST. Remember to use the database name you get after formatting. In the example above, it will be uniref90.fasta (i.e. the name before the .phr, .pin and .psq extensions).
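To confirm that formatting worked before launching a long blastx run, a quick check along these lines may help. It is only a sketch: db_is_formatted is a made-up helper name, the DB value is an assumed path, and I am relying on the fact that a protein database has a .pin index file (or a .pal alias file when a large database is split into several volumes):

```shell
# Hypothetical helper (the name is illustrative): succeed if a protein BLAST
# index (.pin) or a multi-volume alias file (.pal) exists for the given name.
db_is_formatted() {
    db="$1"
    [ -f "$db.pin" ] || [ -f "$db.pal" ]
}

# Assumed path; point this at your own database name (without .phr/.pin/.psq).
DB="uniprot_sprot.fasta"

if db_is_formatted "$DB"; then
    echo "Found BLAST index for $DB"
else
    echo "No BLAST index for $DB; run makeblastdb first" >&2
fi
```

If the check passes, the same $DB value is what goes after -db in the blastx command.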