# HG changeset patch
# User Nolan Prescott
# Date 1618682309 14400
#      Sat Apr 17 13:58:29 2021 -0400
# Node ID 27c8b0338e720325794a3974e0c6047532f79dec
# Parent  bab81689835add867953da75523b55796aab3a48
improve readme

diff --git a/README b/README
--- a/README
+++ b/README
@@ -1,12 +1,36 @@
 Porter-2 Stemmer for English
 ════════════════════════════
 
-  Intended to be an implementation of the [Porter Stemming Algorithm].
+  Intended to be an implementation of the [Porter Stemming Algorithm]
+  for English.
 
 
 [Porter Stemming Algorithm]
 http://snowball.tartarus.org/algorithms/english/stemmer.html
 
+Usage
+─────
+
+  This package exports a single function `stem' which will reduce
+  inflectional forms and affixes for English words by a heuristic
+  process.
+
+
+Example
+╌╌╌╌╌╌╌
+
+  ┌────
+  │ STEMMER> (stem "adverserial")
+  │ "adverseri"
+  │
+  │ STEMMER> (stem "disjointed")
+  │ "disjoint"
+  │
+  │ STEMMER> (stem "hangings")
+  │ "hang"
+  └────
+
+
 Origin
 ──────
 
@@ -19,7 +43,7 @@
 
 
 Caveats
-───────
+═══════
 
   Based on the published vocabularies this implementation produces the
   following discrepancies:
diff --git a/README.org b/README.org
--- a/README.org
+++ b/README.org
@@ -2,29 +2,42 @@
 * Porter-2 Stemmer for English
   Intended to be an implementation of the
   [[http://snowball.tartarus.org/algorithms/english/stemmer.html][Porter
-  Stemming Algorithm]].
+  Stemming Algorithm]] for English.
+** Usage
+   This package exports a single function ~stem~ which will reduce
+   inflectional forms and affixes for English words by a heuristic
+   process.
+*** Example
+    #+BEGIN_EXAMPLE
+    STEMMER> (stem "adverserial")
+    "adverseri"
 
+    STEMMER> (stem "disjointed")
+    "disjoint"
+
+    STEMMER> (stem "hangings")
+    "hang"
+    #+END_EXAMPLE
 ** Origin
    This originated as a port of the
    [[https://github.com/kljensen/snowball][Snowball Go module]] (MIT
    licensed). It has been trimmed down and modified to the point that
    it might be recognizable if you squint.
-
-** Caveats
-   Based on the published vocabularies this implementation produces
-   the following discrepancies:
+* Caveats
+  Based on the published vocabularies this implementation produces
+  the following discrepancies:
 
-   | Input | Output | Canonical |
-   |-------+--------+-----------|
-   | "'"   | ""     | "'"       |
-   | "''"  | ""     | "''"      |
-   | "'a"  | "a"    | "'a"      |
-   | "'s"  | "s"    | "'s"      |
-   | "a'"  | "a"    | "a'"      |
+  | Input | Output | Canonical |
+  |-------+--------+-----------|
+  | "'"   | ""     | "'"       |
+  | "''"  | ""     | "''"      |
+  | "'a"  | "a"    | "'a"      |
+  | "'s"  | "s"    | "'s"      |
+  | "a'"  | "a"    | "a'"      |
 
-   This results from the (perceived) ambiguity in the handling of
-   apostrophes between other implementations and the written
-   descriptions within
-   [[http://snowball.tartarus.org/texts/apostrophe.html][the
-   documentation]]. These discrepancies represent 0.016997% error in
-   the included test corpus.
+  This results from the (perceived) ambiguity in the handling of
+  apostrophes between other implementations and the written
+  descriptions within
+  [[http://snowball.tartarus.org/texts/apostrophe.html][the
+  documentation]]. These discrepancies represent 0.016997% error in
+  the included test corpus.