A Python script for searching XML files for patterns specified in terms of XPath expressions.




xgrep.py is a Python 3 script for searching for elements in XML files, using XPath 1.0 expressions.

The script is released 'as is' with no warranty under the GNU General Public License, version 2.0.


It requires the following Python 3 packages: blessings and lxml.

On Debian-based systems, the prerequisites can be installed as follows:

sudo apt-get install python3-blessings python3-lxml


The script is to be used as follows:

usage: xgrep.py [-h] [-a] [-c] [-C] [-i] [-l] [-L] [-m] [-M]
                [-n] [-N] [-p] [-P] [-q] [-r ns] [-s] [-v]
                expr file [file ...]

positional arguments:
  expr                  XPath 1.0 expression
  file                  XML file

optional arguments:
  -h, --help            show this help message and exit
  -a, --abbreviate      abbreviate matches
  -c, --count           count matches
  -C, --force-color     preserve color and formatting when piping output
  -i, --indent          indent matches
  -l, --files-with-matches
                        output list of matching files
  -L, --files-without-match
                        output list of non-matching files
  -m, --matches         output list of matches
  -M, --files-and-matches
                        output list of files and matches
  -n, --line-number     output line number of match start
  -N, --declare-ns      declare namespaces in matches
  -p, --pis             preserve processing-instructions in output
  -P, --comments        preserve comments in output
  -q, --quiet           only return exit status
  -r ns, --regex ns     namespace prefix for EXSLT regular expressions
  -s, --spaces          normalize whitespace to spaces
  -v, --version         show program's version number and exit

By default, xgrep.py outputs the matching parts of the XML files together with their file names and the XPath expression. The option -m outputs only the matching parts, without file names or XPath expressions; with -M, each matching part is prefixed with the corresponding file name. If -r <ns> is set, the EXSLT function <ns>:test() can be used in the XPath expression to match regular expressions. The option -i indents the matching parts, and the option -N includes namespace declarations in them. The option -a abbreviates matching parts to their first line. The option -s normalises whitespace to spaces in the output. Processing instructions and comments in the XML files are ignored unless the options -p and -P, respectively, are used. The -C option preserves color and formatting codes when piping output through GNU less or similar programs.
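
To illustrate the kind of expression that -r <ns> makes possible, the following sketch evaluates an XPath 1.0 expression with an EXSLT regular-expression test directly in lxml. It is not taken from xgrep.py: the helper name regex_search, the sample document, and the way the prefix is bound are assumptions for illustration only. The sketch relies on lxml's built-in support for the EXSLT regular-expressions namespace.

```python
from lxml import etree

# Namespace URI of the EXSLT regular-expressions module.
EXSLT_RE_NS = "http://exslt.org/regular-expressions"

# Hypothetical sample document, used only for this illustration.
SAMPLE = b"""<inventory>
  <item>foo1</item>
  <item>bar</item>
  <item>foo22</item>
</inventory>"""


def regex_search(xml_bytes, expr, prefix="re"):
    """Evaluate expr with `prefix` bound to the EXSLT regex namespace.

    Returns the text content of each matching element.
    """
    root = etree.fromstring(xml_bytes)
    matches = root.xpath(expr, namespaces={prefix: EXSLT_RE_NS})
    return [element.text for element in matches]


if __name__ == "__main__":
    # Select <item> elements whose text is "foo" followed by digits.
    print(regex_search(SAMPLE, "//item[re:test(., '^foo[0-9]+$')]"))
```

Assuming the same data were stored in a file, the corresponding command line would presumably be: xgrep.py -r re "//item[re:test(., '^foo[0-9]+$')]" file.xml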

The options -c, -l, -L, -n, and -q mimic the behaviour of GNU grep. The -q option suppresses all output, but still returns the exit status (0 if there are matches, 1 if there are none, and 2 for errors).
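
The exit status makes -q convenient for use from other programs. The following is a minimal sketch of how a Python caller might interpret it. The function names describe_status and quiet_search are hypothetical, and the sketch assumes that xgrep.py is executable and on the PATH.

```python
import subprocess

# Exit statuses as documented above.
STATUS_MEANING = {0: "match", 1: "no match", 2: "error"}


def describe_status(returncode):
    """Map an xgrep.py exit status to a short description."""
    return STATUS_MEANING.get(returncode, "unknown")


def quiet_search(expr, files, command="xgrep.py"):
    """Run xgrep.py -q and describe the result.

    Assumes `command` can be found on the PATH (an assumption of this sketch).
    """
    completed = subprocess.run([command, "-q", expr, *files])
    return describe_status(completed.returncode)
```

A caller could then branch on the returned description, for example by treating "error" as a reason to abort and "no match" as a normal negative result.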

Andreas Nolda (andreas@nolda.org)