Default placeholder for missing fields doesn't work with block in xtract tool
14 months ago
Crunk • 0

I am using the xtract tool from NCBI's Entrez Direct (EDirect) package. I have a list of accession numbers:

U47804, U47803, U47802, U47801, U47800, U47799, U47798, X92938

and I am trying to retrieve the following for each record:

  1. Name of organism
  2. Product of gene
  3. Publication year

I executed the following command:

for i in U47804 U47803 U47802 U47801 U47800 U47799 U47798 X92938; do \
    efetch -db "nucleotide" -id "$i" -format full; done |
    xtract -pattern Seq-entry -element Org-ref_taxname \
              -block Prot-ref_name -def "-" -element Prot-ref_name_E \
              -block Date-std -position first -element Date-std_year

The result I got:

Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1  1995

Record X92938 has no <Prot-ref_name> element, so I expected the -def argument of xtract to fill the missing field with "-". My desired result is:

Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1  gp120   1996
Human immunodeficiency virus 1    -     1995
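
To make the desired output concrete, here is a minimal sketch of a post-processing step that produces it from the output shown above. It does not use xtract at all and does not explain why -def is ignored inside the -block. The function name pad_missing is hypothetical, and the sketch assumes the xtract output is tab-separated, that exactly three columns are expected, and that the product is the only column that can be missing.

```shell
# Hypothetical helper (not part of EDirect): pad rows that lack the product column.
# Assumes tab-separated input with three expected columns:
# organism, product, year.
pad_missing() {
    awk -F'\t' 'BEGIN { OFS = "\t" }
        NF == 2 { print $1, "-", $2; next }   # product missing: insert "-"
        { print }                             # other rows pass through unchanged
    '
}
```

It would be appended to the pipeline after the xtract command, for example `... | xtract ... | pad_missing`.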

Could someone with experience using the NCBI tools help me? Thanks a lot.

bash nbci-tools xtract • 389 views