Comments

Comment by Wayne (tee-weile-wayne) on SolidGoldMagikarp II: technical details and more recent findings · 2023-02-09T03:40:05.092Z · LW · GW

' newcom', 'slaught', 'senal' and 'volunte'

I think these could be the result of a simple stemming algorithm that strips common suffixes:

  • newcomer → newcom
  • volunteer → volunte
  • senaling → senal
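To make the idea concrete, here is a minimal sketch of a naive suffix-stripping stemmer. It is not any particular published algorithm (the Porter stemmer, for example, applies more rules and may produce different stems); the suffix list, the `min_stem` threshold, and the function name `naive_stem` are all assumptions chosen for illustration. The input "slaughter" is likewise my guess at a source word for 'slaught'.

```python
# Hypothetical suffix list: just enough to reproduce the examples above.
SUFFIXES = ("ing", "er")


def naive_stem(word: str, min_stem: int = 3) -> str:
    """Strip the first matching suffix, keeping at least `min_stem` characters.

    This is an illustrative toy, not a real stemming algorithm.
    """
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    # "slaughter" is an assumed source word, not one given in the list above.
    for w in ("newcomer", "volunteer", "senaling", "slaughter"):
        print(f"{w} -> {naive_stem(w)}")
```

Even a rule set this crude yields fragments like 'newcom' and 'volunte', which is the kind of output that would be consistent with the hypothesis.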

Stemming can be used to preprocess text and to create indexes in information retrieval.

Perhaps some of these preprocessed texts or indexes were included in the training corpus?