SAX2: Filters

The SAX interface assumes two basic streams:

  1. a stream of requests flowing from the application to the SAX driver; and
  2. a stream of events (and other information) flowing from the SAX driver to the application.

With SAX1, programmers quickly realized that it was possible to extend this model to support a processing chain, where requests could flow through several different components, or filters, before arriving at the original SAX driver, and events could flow through the same filters before arriving at the application. Each filter can make changes to the stream of events as it passes through, but the whole chain of filters still appears to be a single SAX driver to the application.

SAX2 formalizes this design technique by adding a new interface, org.xml.sax.XMLFilter, and a new helper class, org.xml.sax.helpers.XMLFilterImpl.

The XMLFilter interface itself is very simple, extending the basic XMLReader interface with two additional methods:

public interface XMLFilter extends XMLReader
{
  public abstract void setParent (XMLReader parent);
  public abstract XMLReader getParent ();
}

In other words, a SAX2 filter is simply an XMLReader that has another XMLReader as its parent.
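As a rough illustration of the parent relationship, the sketch below attaches a filter to a parser obtained through JAXP. The class name ParentWiring and its two helper methods are hypothetical, and the sketch assumes that the JAXP SAXParserFactory is available to supply the underlying XMLReader.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical helper class, used only to illustrate setParent/getParent.
class ParentWiring {

  // Obtain a namespace-aware XMLReader from JAXP (an assumption of this sketch).
  static XMLReader newParser() {
    try {
      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setNamespaceAware(true);
      return factory.newSAXParser().getXMLReader();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  // Create a pass-through filter and make the given reader its parent.
  static XMLFilter wrap(XMLReader parser) {
    XMLFilter filter = new XMLFilterImpl();
    filter.setParent(parser);
    return filter;
  }
}
```

Because XMLFilter extends XMLReader, the object returned by wrap can be handed to any code that expects an ordinary XMLReader.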

In normal use, a filter will implement not only the XMLFilter interface but also some or all of the various resolver and handler interfaces (EntityResolver, DTDHandler, ContentHandler, and ErrorHandler). To the parent XML reader, the filter is the client application receiving the events; to the client application, the filter is the SAX driver producing the events.

The XMLFilterImpl helper class provides a convenient base for deriving SAX2 filters. This class implements the XMLFilter, EntityResolver, DTDHandler, ContentHandler, and ErrorHandler interfaces. By default, it passes all events on unmodified, but the derived filter can override specific methods.
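To make the "override specific methods" idea concrete, here is a minimal sketch of a derived filter that counts start-element events and otherwise passes everything through unchanged. The names CountingFilter, getCount, and countElements are hypothetical, and the parser is assumed to come from JAXP.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: counts elements while passing all events through.
class CountingFilter extends XMLFilterImpl {
  private int count = 0;

  CountingFilter(XMLReader parent) {
    super(parent);
  }

  @Override
  public void startElement(String uri, String localName,
                           String qName, Attributes atts)
      throws SAXException {
    count++;
    // Forward the unmodified event to the client's ContentHandler, if any.
    super.startElement(uri, localName, qName, atts);
  }

  int getCount() {
    return count;
  }

  // Hypothetical convenience method: parse a string and report the count.
  static int countElements(String xml) {
    try {
      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setNamespaceAware(true);
      CountingFilter filter =
          new CountingFilter(factory.newSAXParser().getXMLReader());
      filter.parse(new InputSource(new StringReader(xml)));
      return filter.getCount();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Only startElement is overridden here; every other event is handled by the inherited pass-through methods.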

Here's an example of a very simple filter that changes the Namespace URI http://www.foo.com/ns/ to http://www.bar.com/ns/ wherever it appears in an element name (but not an attribute name):

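import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

public class FooFilter extends XMLFilterImpl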
{
  public FooFilter ()
  {
  }

  public FooFilter (XMLReader parent)
  {
    super(parent);
  }


  /**
   * Filter the Namespace URI for start-element events.
   */
  public void startElement (String uri, String localName,
                            String qName, Attributes atts)
    throws SAXException
  {
    if (uri.equals("http://www.foo.com/ns/")) {
      uri = "http://www.bar.com/ns/";
    }
    super.startElement(uri, localName, qName, atts);
  }


  /**
   * Filter the Namespace URI for end-element events.
   */
  public void endElement (String uri, String localName, String qName)
    throws SAXException
  {
    if (uri.equals("http://www.foo.com/ns/")) {
      uri = "http://www.bar.com/ns/";
    }
    super.endElement(uri, localName, qName);
  }

}

Note the use of super.startElement and super.endElement to send the event on to the client. In a real filter, it would be good to override the ContentHandler.startPrefixMapping and ContentHandler.endPrefixMapping methods as well, so that the Namespace declarations reported to the client stay consistent with the rewritten element names.
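One possible shape for the prefix-mapping side is sketched below. It is a separate, hypothetical filter (FooPrefixFilter, with a hypothetical mappedUris helper), not part of the example above, and it assumes a namespace-aware JAXP parser. Since endPrefixMapping carries only the prefix and no URI, this sketch leaves the inherited pass-through version in place.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: rewrites the Namespace URI in prefix-mapping events.
class FooPrefixFilter extends XMLFilterImpl {
  static final String OLD_NS = "http://www.foo.com/ns/";
  static final String NEW_NS = "http://www.bar.com/ns/";

  FooPrefixFilter(XMLReader parent) {
    super(parent);
  }

  @Override
  public void startPrefixMapping(String prefix, String uri)
      throws SAXException {
    // Replace the old URI; pass any other URI through unchanged.
    super.startPrefixMapping(prefix, OLD_NS.equals(uri) ? NEW_NS : uri);
  }

  // Hypothetical helper: collect the URIs the client sees in mapping events.
  static List<String> mappedUris(String xml) {
    List<String> seen = new ArrayList<>();
    try {
      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setNamespaceAware(true);
      FooPrefixFilter filter =
          new FooPrefixFilter(factory.newSAXParser().getXMLReader());
      filter.setContentHandler(new DefaultHandler() {
        @Override
        public void startPrefixMapping(String prefix, String uri) {
          seen.add(uri);
        }
      });
      filter.parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
    return seen;
  }
}
```

The anonymous DefaultHandler plays the role of the client application: it sees only what the filter chooses to forward.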

Long filter chains are not always the best approach, but you will find that it is sometimes easier to build complex XML applications if you can break them down into a collection of simple SAX filters, each one reading the events from its parent.
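The following sketch shows one way such a chain might be assembled: two instances of a small, hypothetical RenameFilter are stacked on a JAXP parser, each reading events from its parent. The class, its constructor parameters, and the namesSeen helper are all illustrative assumptions rather than part of SAX itself.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: renames elements called `from` to `to`.
class RenameFilter extends XMLFilterImpl {
  private final String from;
  private final String to;

  RenameFilter(XMLReader parent, String from, String to) {
    super(parent);
    this.from = from;
    this.to = to;
  }

  private String rename(String name) {
    return from.equals(name) ? to : name;
  }

  @Override
  public void startElement(String uri, String localName,
                           String qName, Attributes atts)
      throws SAXException {
    super.startElement(uri, rename(localName), rename(qName), atts);
  }

  @Override
  public void endElement(String uri, String localName, String qName)
      throws SAXException {
    super.endElement(uri, rename(localName), rename(qName));
  }

  // Hypothetical helper: run a two-filter chain and list the element
  // names that finally reach the client.
  static List<String> namesSeen(String xml) {
    List<String> names = new ArrayList<>();
    try {
      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setNamespaceAware(true);
      XMLReader parser = factory.newSAXParser().getXMLReader();
      // Events flow parser -> first -> second -> client.
      XMLReader first = new RenameFilter(parser, "a", "b");
      XMLReader second = new RenameFilter(first, "b", "c");
      second.setContentHandler(new DefaultHandler() {
        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
          names.add(localName);
        }
      });
      second.parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
    return names;
  }
}
```

The client talks only to the outermost filter, so the whole chain still looks like a single XMLReader. The filter nearest the parser sees each event first, which is why an element named a ends up renamed twice in this arrangement.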

