A Python script for searching XML files for patterns specified in terms of XPath expressions.

Clone (read-only): https://hg.sr.ht/~nolda/xgrep

Clone (read/write): ssh://hg@hg.sr.ht/~nolda/xgrep

# XGrep

xgrep.py is a Python 3 script for searching for elements in XML files, using XPath 1.0 expressions.

The script is released 'as is' with no warranty under the GNU General Public License, version 2.0.

# Requirements

The script requires the Python 3 packages blessings and lxml.

On Debian-based systems, the prerequisites can be installed as follows:

    sudo apt-get install python3-blessings python3-lxml

# Usage

The script is to be used as follows:

    usage: xgrep.py [-h] [-a] [-c] [-C] [-i] [-l] [-L] [-m] [-M]
                    [-n] [-N] [-p] [-P] [-q] [-r ns] [-s] [-v]
                    expr file [file ...]

    positional arguments:
      expr                  XPath 1.0 expression
      file                  XML file

    optional arguments:
      -h, --help            show this help message and exit
      -a, --abbreviate      abbreviate matches
      -c, --count           count matches
      -C, --force-color     preserve color and formatting when piping output
      -i, --indent          indent matches
      -l, --files-with-matches
                            output list of matching files
      -L, --files-without-match
                            output list of non-matching files
      -m, --matches         output list of matches
      -M, --files-and-matches
                            output list of files and matches
      -n, --line-number     output line number of match start
      -N, --declare-ns      declare namespaces in matches
      -p, --pis             preserve processing-instructions in output
      -P, --comments        preserve comments in output
      -q, --quiet           only return exit status
      -r ns, --regex ns     namespace prefix for EXSLT regular expressions
      -s, --spaces          normalize whitespace to spaces
      -v, --version         show program's version number and exit

Normally, xgrep.py outputs the matching parts of the XML files together with their file names and the XPath expression. The option -m outputs only the matching parts, without file names or XPath expressions; with -M, the matching parts are prefixed with the corresponding file name. If -r <ns> is set, the EXSLT function <ns>:test() can be used in the XPath expression for matching regular expressions. The option -i indents the matching parts, and the option -N includes namespace declarations. Matching parts can be abbreviated to their first line by means of the option -a. The option -s normalises whitespace to spaces in the output. Processing instructions and comments in the XML files are ignored unless the options -p and -P are used. The -C option preserves color and formatting codes when piping output through GNU less or similar programs.
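
To illustrate what the -r option makes possible, here is a minimal sketch of evaluating an XPath expression that uses the EXSLT test() function with lxml directly. It is not taken from xgrep.py: the helper name find_regex_matches, the sample document, and the assumption that the chosen prefix is bound to the EXSLT regular-expressions namespace are illustrative only. An invocation along the lines of `xgrep.py -r re "//title[re:test(., 'xslt', 'i')]" library.xml` (with a hypothetical library.xml) would presumably select the same element.

```python
from lxml import etree

# Namespace URI defined by EXSLT for its regular-expression functions.
EXSLT_RE_NS = "http://exslt.org/regular-expressions"


def find_regex_matches(xml_bytes, expr, prefix="re"):
    """Evaluate expr with `prefix` bound to the EXSLT regex namespace.

    Hypothetical helper for illustration; xgrep.py may work differently.
    """
    root = etree.fromstring(xml_bytes)
    xpath = etree.XPath(expr, namespaces={prefix: EXSLT_RE_NS})
    return xpath(root)


# Illustrative sample document (not from the xgrep repository).
SAMPLE = b"""<library>
  <book><title>XPath Basics</title></book>
  <book><title>Learning XSLT</title></book>
  <book><title>Cooking</title></book>
</library>"""

# Case-insensitive regex match on the text of each title element.
matches = find_regex_matches(SAMPLE, "//title[re:test(., 'xslt', 'i')]")
titles = [m.text for m in matches]
print(titles)
```

The prefix is arbitrary: passing prefix="x" and writing x:test() in the expression works the same way, which mirrors how -r takes the prefix as its argument.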

The options -c, -l, -L, -n, and -q mimic the behaviour of GNU grep. The -q option suppresses all output but still returns the exit status: 0 if there are matches, 1 if there are none, and 2 for errors.
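
The grep-style exit status convention can be sketched in a few lines of Python. This is an illustration only, not the code of xgrep.py: the function name grep_exit_status is hypothetical, and to stay dependency-free the sketch uses the standard library's xml.etree.ElementTree, which supports only a limited subset of XPath rather than full XPath 1.0.

```python
import xml.etree.ElementTree as ET


def grep_exit_status(xml_text, path):
    """Return 0 if path matches, 1 if it does not, 2 on error.

    Hypothetical helper mirroring the GNU grep convention.
    """
    try:
        root = ET.fromstring(xml_text)
        matches = root.findall(path)
    except (ET.ParseError, SyntaxError):
        # Malformed XML, or a path ElementTree cannot handle.
        return 2
    return 0 if matches else 1


# Illustrative sample document.
DOC = "<library><book><title>Cooking</title></book></library>"

status_match = grep_exit_status(DOC, ".//title")          # element exists
status_none = grep_exit_status(DOC, ".//author")          # no such element
status_error = grep_exit_status("<library>", ".//title")  # malformed XML
print(status_match, status_none, status_error)
```

In a command-line script, the returned value would typically be handed to sys.exit() so that shell constructs such as `&&` and `||` can branch on it.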

Andreas Nolda (andreas@nolda.org)