I need to identify long non-coding RNA transcripts in Arabidopsis thaliana using expressed sequence tags (ESTs). I have downloaded the ESTs from NCBI dbEST. Next I need to run blastx with the ESTs against the nr database, but this cannot be done through the online service because the dataset is too large, so I am using a standalone database. I need to find the query ESTs that have no NR annotations, not the ones that have them. I could not find any blastx command option for this.

I found the following command, but it does not report query sequences with no hits:

blastx -query sequence.fasta -db nr -out blastx_results -num_threads 40

Please suggest how to find the query IDs or sequences that do not align to the nr database (i.e. have no nr annotation).


I am not able to understand what exactly you are trying to do.

Arabidopsis thaliana lncRNAs are well annotated and can be found/downloaded from here:

Thanks for the information, GenoMax. I only need to run the pipeline on a plant species, so I will choose a different species. Can you guide me on how to get the no-hit queries using standalone BLAST? I found the -outfmt parameter of the blastx command, which has the following alignment view options:

 0 = Pairwise,
 1 = Query-anchored showing identities,
 2 = Query-anchored no identities,
 3 = Flat query-anchored showing identities,
 4 = Flat query-anchored no identities,
 6 = Tabular,
 7 = Tabular with comment lines,
 8 = Seqalign (Text ASN.1),
 9 = Seqalign (Binary ASN.1),
10 = Comma-separated values,
11 = BLAST archive (ASN.1),
12 = Seqalign (JSON),
13 = Multiple-file BLAST JSON,
14 = Multiple-file BLAST XML2,
15 = Single-file BLAST JSON,
16 = Single-file BLAST XML2,
18 = Organism Report

I suspect that options 2 and 4 may be usable to find query sequences without hits.

Please confirm whether this is the right approach to find query sequences without a match.
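
As far as I know, blastx has no option that reports only the queries without hits, and "no identities" in formats 2 and 4 describes how aligned residues are displayed, not queries that failed to match. One possible workaround is to run the search with tabular output (for example by adding -outfmt 6 to the command above) and then subtract the query IDs that appear in the results from the IDs in the input FASTA file. The following is a minimal sketch of that idea, not a tested pipeline. It assumes that the first column of the tabular output is the first whitespace-delimited word of each FASTA header, and the file names in the usage line are only placeholders. Lines starting with "#" are skipped, so -outfmt 7 output should work as well.

```python
import sys


def fasta_ids(fasta_path):
    """Return the query IDs (first word of each header) in file order."""
    ids = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                header = line[1:].strip()
                if header:
                    ids.append(header.split()[0])
    return ids


def hit_ids(tabular_path):
    """Return the set of query IDs found in column 1 of tabular BLAST output."""
    hits = set()
    with open(tabular_path) as handle:
        for line in handle:
            # Skip blank lines and the comment lines written by -outfmt 7.
            if line.strip() and not line.startswith("#"):
                hits.add(line.split("\t")[0].strip())
    return hits


def no_hit_ids(fasta_path, tabular_path):
    """Return the query IDs from the FASTA file that have no row in the results."""
    hits = hit_ids(tabular_path)
    return [qid for qid in fasta_ids(fasta_path) if qid not in hits]


# Usage (placeholder file names): python no_hits.py sequence.fasta blastx_results
if __name__ == "__main__" and len(sys.argv) == 3:
    for qid in no_hit_ids(sys.argv[1], sys.argv[2]):
        print(qid)
```

The printed IDs can then be used to pull the corresponding sequences out of the original FASTA file. Which queries end up with "no hit" depends on the e-value cutoff used in the search, so it is worth setting -evalue explicitly rather than relying on the default.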


