How to convert an abstract retrieved from pubmed (NCBI) into NlProt format?
5.1 years ago
olima.marx • 0

I am trying to use a text mining tool called NlProt, but to do so I need to convert a set of PubMed abstracts into NlProt format, which looks like this:

plain natural language text (each line = one abstract/paper)
lines have to start with number followed by ">" and then the text
e.g. 0001>abstract1 abstract1 abstract1 ...
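
To illustrate the target format, here is a minimal shell sketch. It assumes the abstracts are already available one per line on standard input. The function name `nlprot_number` is hypothetical, and the four-digit zero padding is only inferred from the `0001>` example, not taken from NlProt documentation.

```shell
# Prefix each non-empty input line with a zero-padded counter and ">".
# Assumptions: one abstract per input line; 4-digit padding mirrors "0001>".
nlprot_number() {
  awk 'NF { printf "%04d>%s\n", ++n, $0 }'
}
```

It could be called as `nlprot_number < abstracts.txt > abstracts.nlprot`, where both file names are placeholders. This does not address the harder part, which is getting each PubMed abstract onto a single line in the first place.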


Can anyone help me?

PS: This is an example of the PubMed abstract format:

1. Biotechnol Prog. 2017 May 27. doi: 10.1002/btpr.2508. [Epub ahead of print]

Enhanced expression of cysteine-rich antimicrobial peptide snakin-1 in
Escherichia coli using an aggregation-prone protein coexpression system.

Kuddus MR(1)(2), Yamano M(1), Rumi F(1), Kikukawa T(1)(3), Demura M(1)(3), Aizawa
T(1)(3).

Author information:
(1)Graduate School of Life Science, Hokkaido University, Sapporo, Hokkaido,
060-0810, Japan.
(2)Dept. of Pharmaceutical Chemistry, Faculty of Pharmacy, University of Dhaka,
(3)Global Station for Soft Matter, Global Inst. for Collaborative Research and
Education, Hokkaido University, Sapporo, Japan.

Snakin-1 (SN-1) is a cysteine-rich plant antimicrobial peptide and the first
purified member of the snakin family. SN-1 shows potent activity against a wide
range of microorganisms, and thus has great biotechnological potential as an
antimicrobial agent. Here, we produced recombinant SN-1 in Escherichia coli by a
previously developed coexpression method using an aggregation-prone partner
protein. Our goal was to increase the productivity of SN-1 via the enhanced
formation of insoluble inclusion bodies in E. coli cells. The yield of SN-1 by
the coexpression method was better than that by direct expression in E. coli
cells. After refolding and purification, we obtained several milligrams of
functionally active SN-1, the identity of which was verified by MALDI-TOF MS and
NMR studies. The purified recombinant SN-1 showed effective antimicrobial
activity against test organisms. Our studies indicate that the coexpression
method using an aggregation-prone partner protein can serve as a suitable
expression system for the efficient production of functionally active SN-1. ©
2017 American Institute of Chemical Engineers Biotechnol. Prog., 2017.

© 2017 American Institute of Chemical Engineers.

DOI: 10.1002/btpr.2508
PMID: 28556600

text mining • 997 views

Hello olima.marx!

It appears that your post has been cross-posted to another site: https://stackoverflow.com/questions/44800109

This is typically not recommended as it runs the risk of annoying people in both communities.

5.1 years ago
$ for PMID in 28556600 28556601 28556602 28556604; do echo -en "${PMID}> " && curl -s "https://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=${PMID}&retmode=xml" | xmllint --format --xpath "//AbstractText/text()" - && echo; done

28556600> Snakin-1 (SN-1) is a cysteine-rich plant antimicrobial peptide and the first purified member of the snakin family. SN-1 shows potent activity against a wide range of microorganisms, and thus has great biotechnological potential as an antimicrobial agent. Here, we produced recombinant SN-1 in Escherichia coli by a previously developed coexpression method using an aggregation-prone partner protein. Our goal was to increase the productivity of SN-1 via the enhanced formation of insoluble inclusion bodies in E. coli cells. The yield of SN-1 by the coexpression method was better than that by direct expression in E. coli cells. After refolding and purification, we obtained several milligrams of functionally active SN-1, the identity of which was verified by MALDI-TOF MS and NMR studies. The purified recombinant SN-1 showed effective antimicrobial activity against test organisms. Our studies indicate that the coexpression method using an aggregation-prone partner protein can serve as a suitable expression system for the efficient production of functionally active SN-1. © 2017 American Institute of Chemical Engineers Biotechnol. Prog., 2017.
28556601> HLA-A*02:683 is most similar to 4 different HLA-A*02 subtypes with a single nucleotide difference.
28556602> The hydrogen-abstraction/acetylene-addition mechanism has been fundamental to unravelling the synthesis of polycyclic aromatic hydrocarbons (PAHs) detected in combustion flames and carbonaceous meteorites like Orgueil and Murchison. However, the fundamental reaction pathways accounting for the synthesis of complex PAHs, such as the tricyclic anthracene and phenanthrene along with their dihydrogenated counterparts, remain elusive to date. By investigating the hitherto unknown chemistry of the 1-naphthyl radical with 1,3-butadiene, we reveal a facile barrierless synthesis of dihydrophenanthrene adaptable to low temperatures. These aryl-type radical additions to conjugated hydrocarbons via resonantly stabilized free-radical intermediates defy conventional wisdom that PAH growth is predominantly a high-temperature phenomenon and thus may represent an overlooked path to PAHs as complex as coronene and corannulene in cold regions of the interstellar medium like in the Taurus Molecular Cloud.
28556604> The problem of technical complications in implants and implant-supported restorations has existed for decades. The most frequent complication is the loosening of the fixing screw, which although is not catastrophic, if it occurs repeatedly, it may affect the success of the implant therapy and the patient satisfaction. Factors that affect the frequency of prosthetic complications include: the implant-abutment connection, para-functional habits, cantilevers, and the type of restoration. Regarding the implant-abutment connection, the first systems were those with an external hexagon. Because of their small height and the disadvantages that this entails, other connection types were developed, such as those of hexagonal and conical connection, which decreased the complication rates, including the loosening of the fixing screw. On the dilemma "cement- or screw-retained restoration", the choice depends on biological, technical, and aesthetic factors. Cement-retained restorations are simpler in construction with lower cost and clinicians are more familiar with the clinical procedure. On the other side, if the fixing screw of the abutment is loosened in a cement-retained restoration, it may be a difficult and demanding clinical task to fix this prosthetic complication. Screw-retained restorations are more prone to loosening of the fixing screw, but allow easy retrievability and repair. Their use however, is often restricted because of diverting or unfavorable inclination of the alveolar ridge and the implant. The aim of this article was to present clinical solutions for the complication of screw-loosening through clinical examples and discuss the factors that may predispose to its occurrence.The problem of technical complications in implants and implant-supported restorations has existed for decades. 
The most frequent complication is the loosening of the fixing screw which, although is not catastrophic, if it occurs repeatedly, it may affect the success of the implant therapy and the patient satisfaction. Factors that affect the frequency of prosthetic complications include: the implant-abutment connection, para-functional habits, cantilevers and the type of restoration. The treatment options for the clinician are limited but certain preventing measures during construction of the restoration may be helpful to overcome this clinical problem.
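
Note that the output for 28556604 above wraps onto a second line, which would not satisfy the one-abstract-per-line requirement. A minimal sketch of a filter that could be placed after the `xmllint` stage to collapse each abstract onto a single line follows. The name `flatten` is hypothetical, and the filter only normalises whitespace: it does not remove repeated abstract text, and it cannot insert a space where two `AbstractText` nodes were joined without one.

```shell
# Join every whitespace-separated word on stdin with single spaces and
# emit exactly one line (newlines and tabs inside the input are collapsed).
flatten() {
  awk '{ for (i = 1; i <= NF; i++) { printf "%s%s", sep, $i; sep = " " } }
       END { print "" }'
}

# Possible use inside the loop above (not run here; needs network access).
# printf writes the ID and ">" with no trailing space, as in "0001>abstract1":
#   printf '%s>' "$PMID"
#   curl -s "https://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=${PMID}&retmode=xml" \
#     | xmllint --xpath "//AbstractText/text()" - | flatten
```

Because `flatten` prints its own trailing newline, the `&& echo` at the end of the original loop body would no longer be needed.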