M README +26 -2
@@ 1,12 1,36 @@
Porter-2 Stemmer for English
════════════════════════════
- Intended to be an implementation of the [Porter Stemming Algorithm].
+ Intended to be an implementation of the [Porter Stemming Algorithm]
+ for English.
[Porter Stemming Algorithm]
http://snowball.tartarus.org/algorithms/english/stemmer.html
+Usage
+─────
+
+ This package exports a single function, `stem', which reduces an
+ English word to its stem by heuristically stripping inflectional
+ endings and other affixes.
+
+
+Example
+╌╌╌╌╌╌╌
+
+ ┌────
+ │ STEMMER> (stem "adverserial")
+ │ "adverseri"
+ │
+ │ STEMMER> (stem "disjointed")
+ │ "disjoint"
+ │
+ │ STEMMER> (stem "hangings")
+ │ "hang"
+ └────
+
+
Origin
──────
@@ 19,7 43,7 @@ Origin
Caveats
-───────
+═══════
Based on the published vocabularies this implementation produces the
following discrepancies:
M README.org +31 -18
@@ 2,29 2,42 @@
* Porter-2 Stemmer for English
Intended to be an implementation of the
[[http://snowball.tartarus.org/algorithms/english/stemmer.html][Porter
- Stemming Algorithm]].
+ Stemming Algorithm]] for English.
+** Usage
+ This package exports a single function, ~stem~, which reduces an
+ English word to its stem by heuristically stripping inflectional
+ endings and other affixes.
+*** Example
+ #+BEGIN_EXAMPLE
+ STEMMER> (stem "adverserial")
+ "adverseri"
+ STEMMER> (stem "disjointed")
+ "disjoint"
+
+ STEMMER> (stem "hangings")
+ "hang"
+ #+END_EXAMPLE
** Origin
This originated as a port of the
[[https://github.com/kljensen/snowball][Snowball Go module]] (MIT
licensed). It has been trimmed down and modified to the point that
it might be recognizable if you squint.
-
-** Caveats
- Based on the published vocabularies this implementation produces
- the following discrepancies:
+* Caveats
+ Based on the published vocabularies this implementation produces
+ the following discrepancies:
- | Input | Output | Canonical |
- |-------+--------+-----------|
- | "'" | "" | "'" |
- | "''" | "" | "''" |
- | "'a" | "a" | "'a" |
- | "'s" | "s" | "'s" |
- | "a'" | "a" | "a'" |
+ | Input | Output | Canonical |
+ |-------+--------+-----------|
+ | "'" | "" | "'" |
+ | "''" | "" | "''" |
+ | "'a" | "a" | "'a" |
+ | "'s" | "s" | "'s" |
+ | "a'" | "a" | "a'" |
- This results from the (perceived) ambiguity in the handling of
- apostrophes between other implementations and the written
- descriptions within
- [[http://snowball.tartarus.org/texts/apostrophe.html][the
- documentation]]. These discrepancies represent 0.016997% error in
- the included test corpus.
+ This results from the (perceived) ambiguity in the handling of
+ apostrophes between other implementations and the written
+ descriptions within
+ [[http://snowball.tartarus.org/texts/apostrophe.html][the
+ documentation]]. These discrepancies represent a 0.016997% error
+ rate in the included test corpus.