Stemming is a natural language processing technique that reduces words to their root form, usually by removing suffixes.
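As a rough illustration of suffix stripping, the sketch below removes the first matching suffix from a small hand-picked list. The function name `naive_stem`, the suffix list, and the minimum stem length are assumptions made for this example; established stemmers such as Porter or Snowball apply far more careful rules.

```python
# Illustrative only: a naive suffix-stripping stemmer for English.
# The suffix list is hypothetical and ordered so longer suffixes are tried first.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= min_stem_len:
            return lowered[: -len(suffix)]
    return lowered


for w in ["walking", "walked", "walks", "walk"]:
    print(w, "->", naive_stem(w))  # each of these prints "... -> walk"
```

A stemmer this simple over-stems and under-stems frequently (for example, it leaves "running" as "runn"), which is why language-specific rule sets exist.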
Questions tagged [stemming]
5 questions
7 votes, 3 answers
Is there a good German Stemmer?
What I tried:
# -*- coding: utf-8 -*-
from nltk.stem.snowball import GermanStemmer
st = GermanStemmer()
token_groups = [(["experte", "Experte", "Experten", "Expertin", "Expertinnen"], []),
(["geh", "gehe", "gehst", "geht", "gehen",…
Martin Thoma
6 votes, 2 answers
Python stemmer for Georgian
I am currently working on Georgian text processing. Does anybody know of any stemmers, lemmatizers, or other NLP tools for Georgian that I could use with Python?
Thanks in advance!
Евгения Рубанова
1 vote, 1 answer
How does Snowball Stemmer work?
I have been reading about Snowball Stemmer.
I wonder how it works.
Does it use rules to stem words, or does it use machine learning?
I checked snowballstem.org but could not find the answer!
asmgx
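For context on the question above: Snowball stemmers are rule-based. Each one is a hand-written algorithm expressed in the Snowball string-processing language, not a trained model. Many of their rules only fire when a suffix lies inside a region called R1, which starts after the first non-vowel that follows a vowel. The sketch below illustrates that single idea in plain Python; the function names are hypothetical, and the vowel set and rule are simplified assumptions, not the actual English Snowball algorithm.

```python
# Simplified illustration of a region-conditioned suffix rule.
# Assumption: vowels are a, e, i, o, u, y with no special handling of "y".
VOWELS = set("aeiouy")


def r1_start(word: str) -> int:
    """Return the index where R1 begins: just after the first non-vowel following a vowel."""
    for i in range(1, len(word)):
        if word[i] not in VOWELS and word[i - 1] in VOWELS:
            return i + 1
    return len(word)  # no such position, so R1 is empty


def strip_if_in_r1(word: str, suffix: str) -> str:
    """Remove the suffix only when it lies entirely inside R1."""
    if word.endswith(suffix) and len(word) - len(suffix) >= r1_start(word):
        return word[: -len(suffix)]
    return word


print(strip_if_in_r1("beautiful", "ful"))  # beauti (R1 is "iful", so "ful" is inside it)
print(strip_if_in_r1("sing", "ing"))       # sing (R1 is "g", so the rule does not fire)
```

The region check is what stops a rule from stripping short words down to nothing, which a plain "remove the suffix" approach cannot guarantee.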
1 vote, 2 answers
How to resolve country and nationality entities?
I've tried stemming and lemmatization on this but nothing has quite worked so far.
How can I resolve a country name and its nationality to a single entity?
For example:
Canada and Canadian should just be one entity: Canada
Uganda and Ugandan should…
Learning stats by example
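One possible approach to the question above, offered as a sketch and not as the accepted answer: stemmers and lemmatizers generally do not map a demonym onto its country name, so an explicit lookup table is a common fallback. The table contents and the function name `normalize_entities` below are hypothetical; a real table would need to be built or sourced separately and would also have to cover plurals such as "Canadians".

```python
import re

# Hypothetical, hand-built lookup from lowercase surface form to canonical entity.
ENTITY_LOOKUP = {
    "canada": "Canada",
    "canadian": "Canada",
    "uganda": "Uganda",
    "ugandan": "Uganda",
}


def normalize_entities(text: str) -> str:
    """Replace each known country or nationality word with its canonical country name."""

    def replace(match: re.Match) -> str:
        token = match.group(0)
        # Unknown words are returned unchanged.
        return ENTITY_LOOKUP.get(token.lower(), token)

    return re.sub(r"[A-Za-z]+", replace, text)


print(normalize_entities("A Canadian and a Ugandan visited Canada."))
# A Canada and a Uganda visited Canada.
```

In practice the replacement would usually be applied to entity labels produced by a tagger and not to raw text, so that the surrounding sentence stays grammatical.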
0 votes, 1 answer
Faster preprocessing for Arabic texts
Background
I'm analyzing a relatively large text-based Arabic dataset using Python (50,000 to 70,000 text files; total size ~5 GB).
I want to segment, stem, and POS tag the dataset. I am aware of two Python libraries that can do these 3 tasks:…
Alaa