Hi all, I am trying to retrieve trnL sequences using the Entrez esearch utility. I ran a script over a long list of organisms for which I have the scientific names, and it worked for most of them.
For many organisms, however, it returns nothing, so I tried the command on a single organism. It still returns an empty FASTA file. What could be the problem?
The command I used (for one species only) is the following:
esearch -db nuccore -query "(trnl[gene]) AND (Abutilon theophrasti[orgn])" | efetch -format fasta >> output.fa
Any ideas? Thanks!
Thank you. I am not sure I understood your answer, because I already checked on the website and I can find a trnL sequence for many of the plants for which the command returns nothing.
As I said, there is no gene named trnL for this species in the databases. If you found something for this species you definitely used a different query.
This is possibly the query you used:
Abutilon theophrasti[orgn] AND trnl
Note how that differs from the original query: trnl is no longer restricted to the [gene] field. It gives results, but these are not gene sequences in the strict sense. For example, https://www.ncbi.nlm.nih.gov/nuccore/HQ696727.1 is the first hit, but look at its annotation:
Abutilon theophrasti tRNA-Leu (trnL) gene, partial sequence; trnL-trnF intergenic spacer, complete sequence; and tRNA-Phe (trnF) gene, partial sequence; chloroplast.
So this is a composite sequence from the chloroplast genome, consisting of part of trnL, the complete intergenic spacer between trnL and trnF, and part of the trnF gene. This is not comparable to what you get from the other searches. I don't know what you are aiming at, but I would rather ignore the cases where you don't get anything, because these non-gene sequences are sort of "broken" sequences. You can of course look through your script's log output to check whether an error occurred during some of the queries.
Thank you very much, I understand now! :)
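For anyone running into the same thing with a long species list, here is a minimal sketch of how the two queries discussed above could be compared per species, so that empty results show up in a log instead of silently producing an empty file. It assumes NCBI EDirect (esearch, efetch, xtract) is installed. The function names and the files species.txt, search.log and output.fa are made up for illustration, and the script is untested against the live service.

```shell
#!/usr/bin/env bash
# Sketch only: assumes NCBI EDirect (esearch, efetch, xtract) is on PATH.
# All function and file names below are hypothetical.

# Build the field-restricted query from the original post.
strict_query() {
    printf '(trnL[gene]) AND (%s[orgn])' "$1"
}

# Build the looser query: organism-restricted, trnL matched in any field.
broad_query() {
    printf '%s[orgn] AND trnL' "$1"
}

# Print the number of nuccore records matching the query in $1.
# stdin is redirected from /dev/null as a precaution, so esearch cannot
# consume the species list when called inside a "while read" loop.
count_hits() {
    esearch -db nuccore -query "$1" < /dev/null |
        xtract -pattern ENTREZ_DIRECT -element Count
}

# Read one species name per line from the file in $1, log both hit
# counts, and fetch FASTA only when the strict query has hits.
run_all() {
    local species strict broad
    while IFS= read -r species; do
        [ -z "$species" ] && continue
        strict=$(count_hits "$(strict_query "$species")")
        broad=$(count_hits "$(broad_query "$species")")
        printf '%s\tstrict=%s\tbroad=%s\n' \
            "$species" "${strict:-NA}" "${broad:-NA}" >> search.log
        if [ "${strict:-0}" -gt 0 ]; then
            esearch -db nuccore -query "$(strict_query "$species")" < /dev/null |
                efetch -format fasta >> output.fa
        fi
    done < "$1"
}
```

After sourcing the script, something like run_all species.txt would do the work. Species logged with strict=0 but a non-zero broad count are probably the cases described in the answer, where trnL only appears in the record title or annotation rather than as an indexed gene name. A count of NA suggests the query itself failed and is worth rerunning.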