XML Stream PAttern Matcher - concise, regexp-like pattern matching on streaming XML.

Cl-xmlspam, an "XML Stream PAttern Matcher" (also known as xspam), is a library that works alongside cxml to allow simple on-the-fly parsing of streaming XML, based on a non-backtracking, regular-expression-like parser. It provides a compact syntax for matching, which borrows terminology from RELAX-NG, and also supports regular-expression-based parsing of text elements, inspired by Rob Pike's structural regular expressions.

Here is a simple example of its use:

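* (with-xspam-source "<?xml version=\"1.0\"?> <a> <b>hello there</b> <b>goodbye</b><e/> <p>345x675 234x754 786x532</p> <q when=\"23-11-2008\"/></a>"
    (element :a
      ;; repeat over consecutive <b> elements (the one-or-more wrapper is assumed here)
      (one-or-more
        (element :b (text (matches "[^ ]+" (format t "word ~s~%" _)))))
      ;; each WIDTHxHEIGHT pair in the text of <p>; (_ n) is the n-th register group
      (element :p
        (text
          (matches "([0-9]+)x([0-9]+)"
            (format t "got point [~s ~s]~%" (_ 0) (_ 1)))))
      ;; a single match against the value of the "when" attribute
      (element :q
        (attribute :when
          (match "([0-9]+)-([0-9]+)-([0-9]+)"
            (format t "got date ~s/~s/~s~%" (_ 0) (_ 1) (_ 2)))))))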
word "hello"
word "there"
word "goodbye"
got point ["345" "675"]
got point ["234" "754"]
got point ["786" "532"]
got date "23"/"11"/"2008"
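
For readers more familiar with plain regular expressions, the text-matching clauses in the example can be approximated with ordinary cl-ppcre calls. This is only a rough sketch of the behaviour the output above suggests (every occurrence for matches, a single occurrence for match), not a description of how xspam is implemented. The helpers print-points and print-date are hypothetical names, and cl-ppcre is used here purely for illustration.

```lisp
;; Assumes the cl-ppcre library is loaded, e.g. (ql:quickload :cl-ppcre).

(defun print-points (text)
  "Print one line for every WIDTHxHEIGHT pair found in TEXT.
Roughly what the matches clause on <p> appears to do."
  (cl-ppcre:do-register-groups (x y) ("([0-9]+)x([0-9]+)" text)
    (format t "got point [~s ~s]~%" x y)))

(defun print-date (text)
  "Print the first DD-MM-YYYY date found in TEXT, if any.
Roughly what the match clause on the when attribute appears to do."
  (cl-ppcre:register-groups-bind (day month year)
      ("([0-9]+)-([0-9]+)-([0-9]+)" text)
    (format t "got date ~s/~s/~s~%" day month year)))

;; Same strings as in the XML document above.
(print-points "345x675 234x754 786x532")
(print-date "23-11-2008")
```

The difference with xspam is that these calls need the whole string in hand, whereas the pattern above is applied to the XML as it streams past.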

Some preliminary documentation and one or two examples are also available.

Cl-xmlspam is still young. Although it has been used to do useful work on real-world XML data, it has so far been tested only under SBCL, and the interface is subject to change. Feedback is much appreciated.

An unofficial mirror is available on GitHub.