Porter 2 stemmer in Common Lisp

Clone (read-only):  https://hg.sr.ht/~nprescott/stemmer
Clone (read/write): ssh://hg@hg.sr.ht/~nprescott/stemmer

Porter-2 Stemmer for English
════════════════════════════

  Intended to be an implementation of the English (Porter2) variant of
  the [Porter Stemming Algorithm].


[Porter Stemming Algorithm]
http://snowball.tartarus.org/algorithms/english/stemmer.html

Origin
──────

  This originated as a port of the [Snowball Go module] (MIT
  licensed). It has since been trimmed down and modified to the point
  that the original is recognizable only if you squint.


[Snowball Go module] https://github.com/kljensen/snowball


Caveats
───────

  Checked against the published vocabularies, this implementation
  produces the following discrepancies ("Output" is this implementation's
  result, "Canonical" is the published stem):

  ━━━━━━━━━━━━━━━━━━━━━━━━━━
   Input  Output  Canonical 
  ──────────────────────────
   "'"    ""      "'"       
   "''"   ""      "''"      
   "'a"   "a"     "'a"      
   "'s"   "s"     "'s"      
   "a'"   "a"     "a'"      
  ━━━━━━━━━━━━━━━━━━━━━━━━━━

  These result from a (perceived) ambiguity in the handling of
  apostrophes: other implementations and the written description within
  [the documentation] do not obviously agree. The five discrepancies
  amount to a 0.016997% error rate over the included test corpus.
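
  To make the table concrete, here is a minimal sketch that reproduces
  its Output column. It is not this stemmer's actual code. It assumes
  one reading of the Snowball description: drop a leading apostrophe,
  then drop the longest of the suffixes ', 's, 's', without first
  applying the rule that words of two letters or fewer are left
  unchanged. The names `strip-apostrophes', `strip-suffix' and
  `mismatch-percentage' are hypothetical. The corpus size of 29417 used
  in the note below is likewise an assumption, inferred only from the
  quoted percentage and the five rows.

```lisp
;;; Hypothetical sketch, NOT the stemmer's real implementation.
;;; Assumption: apostrophes are stripped before any short-word check.

(defun strip-suffix (word suffix)
  "Return WORD without SUFFIX if WORD ends with SUFFIX, otherwise NIL."
  (let ((start (- (length word) (length suffix))))
    (when (and (>= start 0)
               (string= suffix word :start2 start))
      (subseq word 0 start))))

(defun strip-apostrophes (word)
  "Drop one leading apostrophe, then the longest matching apostrophe suffix."
  (let ((w (if (and (plusp (length word))
                    (char= (char word 0) #\'))
               (subseq word 1)
               word)))
    ;; Suffixes are tried longest first; an empty string is still a
    ;; true value in Common Lisp, so SOME stops at the first match.
    (or (some (lambda (suffix) (strip-suffix w suffix))
              '("'s'" "'s" "'"))
        w)))

;; Rows copied from the table above: (input output canonical).
(defparameter *table-rows*
  '(("'"  ""  "'")
    ("''" ""  "''")
    ("'a" "a" "'a")
    ("'s" "s" "'s")
    ("a'" "a" "a'")))

(defun mismatch-percentage (mismatches corpus-size)
  "Return MISMATCHES as a percentage of CORPUS-SIZE, as a double-float."
  (* 100d0 (/ mismatches corpus-size)))
```

  Under these assumptions, `(strip-apostrophes "a'")' returns "a",
  matching the Output column, while the canonical result is the input
  itself. `(mismatch-percentage 5 29417)' rounds to 0.016997, which
  agrees with the figure quoted above if the corpus does hold 29417
  words.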


[the documentation] http://snowball.tartarus.org/texts/apostrophe.html