Question: Remove Tags In Sbml File
8.6 years ago by
Cambridge, UK
Michael Schubert6.9k wrote:

I'm trying to remove <annotation> tags in an SBML file and combine the start and end tags of the surrounding element if it is empty as a consequence of this removal.


<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" metaid="metaid_0000001" version="1">
<listOfSpecies>
    <species compartment="compartment" id="II_f" initialConcentration="1400" metaid="metaid_0000115" name="Fluid phase Factor II">
        <annotation><content>something</content></annotation>
    </species>
</listOfSpecies>
</sbml>

should transform to:


<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" metaid="metaid_0000001" version="1">
<listOfSpecies>
    <species compartment="compartment" id="II_f" initialConcentration="1400" metaid="metaid_0000115" name="Fluid phase Factor II"/>
</listOfSpecies>
</sbml>

Note that the <annotation> tag can be in listOfReactions/reaction as well. I made a rather basic attempt at an XSLT stylesheet but so far the namespaces confuse me ;-)

Any suggestions on how to do this?

edit: removed RDF namespace as I did not add it properly before

8.6 years ago by
Cambridge, UK
Michael Schubert6.9k wrote:

Solution using Python and libSBML:

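import libsbml

# Read the SBML document ("filename.xml" is a placeholder path).
doc = libsbml.SBMLReader().readSBMLFromFile("filename.xml")
model = doc.getModel()

# Drop the annotation on the model itself, then on every species,
# reaction, and parameter.
model.unsetAnnotation()
for species in model.getListOfSpecies():
    species.unsetAnnotation()
for reaction in model.getListOfReactions():
    reaction.unsetAnnotation()
for param in model.getListOfParameters():
    param.unsetAnnotation()

# Write the cleaned document to a new file.
libsbml.writeSBMLToFile(doc, "new_filename.xml")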
8.6 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum124k wrote:

First, your XML document is missing a namespace declaration:

<sbml xmlns="http://www.sbml.org/sbml/level2"
   xmlns:RDF="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   level="2" metaid="metaid_0000001" version="1"
   >
...

Second, as far as I understand, you just want to copy the tag 'species' in the sbml namespace with all its attributes but you want to skip the child nodes.


<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:s="http://www.sbml.org/sbml/level2"
    xmlns:RDF="http://www.w3.org/1999/02/22-rdf-syntax-ns#"

        version='1.0'
        >
<xsl:output method="xml"/>

<xsl:template match="@*|node()">
   <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
   </xsl:copy>
</xsl:template>

<xsl:template match="s:species|s:parameter|s:reaction">
<xsl:element name="{local-name()}" namespace="http://www.sbml.org/sbml/level2">
<xsl:apply-templates select="@*"/>
</xsl:element>
</xsl:template>

</xsl:stylesheet>

xsltproc ~/file.xsl file.xml


<sbml xmlns="http://www.sbml.org/sbml/level2" xmlns:RDF="http://www.w3.org/1999/02/22-rdf-syntax-ns#" level="2" metaid="metaid_0000001" version="1">
<listOfSpecies>
    <species compartment="compartment" id="II_f" initialConcentration="1400" metaid="metaid_0000115" name="Fluid phase Factor II"/>
</listOfSpecies>
</sbml>
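
For comparison, the same idea (keep an element's attributes, skip its child nodes) can be sketched in Python with only the standard library. This is a rough illustration and not part of the answer above: the function name `drop_children` and the shortened sample document are made up for the example. Like the stylesheet, it removes all children of the listed elements, so a reaction would also lose children such as its kineticLaw.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2"
# Serialize SBML elements with a default namespace instead of an "ns0:" prefix.
ET.register_namespace("", SBML_NS)


def drop_children(root, local_names=("species", "parameter", "reaction")):
    """Keep the attributes of the named SBML elements but remove their children."""
    for name in local_names:
        # list(...) so the tree is not modified while it is being traversed.
        for elem in list(root.iter(f"{{{SBML_NS}}}{name}")):
            for child in list(elem):
                elem.remove(child)
            # No leftover whitespace text, so an empty-element tag is written.
            elem.text = None
    return root


# Hypothetical, shortened version of the document from the question.
sample = """<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
<listOfSpecies>
    <species id="II_f" name="Fluid phase Factor II">
        <annotation><content>something</content></annotation>
    </species>
</listOfSpecies>
</sbml>"""

root = drop_children(ET.fromstring(sample))
result = ET.tostring(root, encoding="unicode")
print(result)
```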

Thanks for the answer. When I try "xalan your.xslt my.xml" (xsltproc fails with a library error) no annotations are deleted. Another thing is that the [?] tag can occur inside [?], [?], and [?] tags, is there a way to match all of them?

written 8.6 years ago by Michael Schubert

Thanks for the answer. When I try "xalan your.xslt my.xml" (xsltproc fails with a library error) no annotations are deleted. Another thing is that the [?] tag can occur inside [?], [?], and [?] tags (and there can be other nested tags that should not be deleted), is there an easy way to account for this as well?

written 8.6 years ago by Michael Schubert

Michael, for the second problem I've updated the stylesheet according to your needs.

written 8.6 years ago by Pierre Lindenbaum

For xalan, as far as I can see, the command line requires some options: http://xml.apache.org/xalan-j/commandline.html (-IN, -XSL, -OUT ...)

modified 11 weeks ago by RamRS • written 8.6 years ago by Pierre Lindenbaum
$ xalan --help

prints

Usage: Xalan [options] source stylesheet

So I think my usage should be fine. I'm going to give it a closer look when I'm on my own machine again, thanks.

modified 11 weeks ago by RamRS • written 8.6 years ago by Michael Schubert
8.6 years ago by
Jukka Matilainen10 wrote:

Here's a generic XSLT solution for removing XML elements. The stylesheet below assumes you want to delete sbml:annotation elements, but that can be easily changed.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:sbml="http://www.sbml.org/sbml/level2">

  <xsl:output method="xml" indent="yes"/>

  
  <xsl:strip-space elements="*"/>

  
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  
  <xsl:template match="sbml:annotation"/>

</xsl:stylesheet>

This is based on the identity transformation which copies everything as it is. This is a useful starting point for stylesheets where you want the output to be almost like the input and to only make a "small" change, e.g. removing something specific or adding something specific.

The identity transform is overridden by the empty template matching sbml:annotation, producing nothing to the output, therefore removing the elements.

The xsl:strip-space top-level element is used so that whitespace-only nodes get stripped from the whole document, allowing the empty-element tag (combined start and end tag) to be used where an element would only contain whitespace after removing elements. If you want to preserve whitespace-only text nodes somewhere else in the document, you may want to use something more specific than * to specify from which elements whitespace-only nodes should be stripped.
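
The same two steps (remove the matching elements, then drop whitespace-only text so emptied elements collapse to an empty-element tag) can be sketched in Python with the standard library's ElementTree. This is only an illustration under a few assumptions: the function name `strip_annotations` and the sample document are invented for the example, and the whitespace handling is a rough analogue of xsl:strip-space that only touches childless elements.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2"
# Serialize SBML elements with a default namespace instead of an "ns0:" prefix.
ET.register_namespace("", SBML_NS)


def strip_annotations(root):
    """Remove every sbml:annotation element below root, in place."""
    target = f"{{{SBML_NS}}}annotation"
    # list(...) so the tree is not modified while it is being traversed.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == target:
                parent.remove(child)
        # Any childless element whose text is only whitespace gets its text
        # cleared, so it is written as an empty-element tag.
        if len(parent) == 0 and parent.text and not parent.text.strip():
            parent.text = None
    return root


# Hypothetical sample: one annotation is an only child, the other has a sibling.
sample = """<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
<listOfSpecies>
    <species id="II_f">
        <annotation><content>something</content></annotation>
    </species>
</listOfSpecies>
<listOfReactions>
    <reaction id="r1">
        <annotation><content>something</content></annotation>
        <kineticLaw/>
    </reaction>
</listOfReactions>
</sbml>"""

root = strip_annotations(ET.fromstring(sample))
result = ET.tostring(root, encoding="unicode")
print(result)
```

Unlike a template that matches species, parameter, and reaction by name, this leaves other nested elements (here the kineticLaw) in place.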

8.6 years ago by
Heikki360 wrote:

A solution using Perl and XML::Twig. I find XML::Twig usually easier to write and understand than XSLT (although the code above by Jukka is an example in clarity). Run as 'perl code.pl file.sbml > newfile.sbml':

use Modern::Perl;
use XML::Twig;

my $twig_handlers = {
    # remove a tag conditionally
    'annotation'     => sub { $_->delete if $_->text =~ /something/ },
    # output and free memory
    'listOfSpecies'  => sub { $_[0]->flush } 
};

my $twig = XML::Twig->new(
    TwigHandlers => $twig_handlers,
    KeepEncoding => 1,
    pretty_print => 'indented'
);

my $file = shift;
$twig->parsefile($file);
8.6 years ago by
Ludo Cottret0 wrote:

Hi !

On Linux, using sed, it's quite easy :

sed /^.annotation.$/d file.xml > newfile.xml

Ludo


It's about removing everything between the start and end tags, not only the tags ;)

written 8.6 years ago by Michael Schubert

OK, there is a small error in my code. The correct command line is:

sed /^.*\<annotation\>.*$/d test > newfile.xml

This will remove all the lines containing the tag "<annotation>". However, this won't work if there are several lines of annotation...

written 8.2 years ago by Ludo Cottret
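
To illustrate the multi-line limitation of the line-based sed command above, here is a minimal Python sketch that matches across line breaks with a regular expression. The names `ANNOTATION_BLOCK` and `remove_annotation_blocks` and the sample text are hypothetical. It is a text-level shortcut with known gaps: it does not handle a self-closing <annotation/> or nested annotation elements, and it leaves the surrounding start and end tags separate, so an XML-aware approach is likely more robust.

```python
import re

# Non-greedy match from an opening <annotation ...> tag to the nearest closing
# tag; re.DOTALL lets "." cross line breaks, which a per-line sed rule cannot.
ANNOTATION_BLOCK = re.compile(
    r"[ \t]*<annotation\b[^>]*>.*?</annotation>[ \t]*\n?", re.DOTALL
)


def remove_annotation_blocks(text):
    """Delete each <annotation>...</annotation> block, even across lines."""
    return ANNOTATION_BLOCK.sub("", text)


# Hypothetical fragment with an annotation spread over several lines.
sample = """<species id="II_f">
    <annotation>
        <content>something</content>
    </annotation>
</species>
"""

cleaned = remove_annotation_blocks(sample)
print(cleaned)
```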