Highest Voted 'stemming' Questions - Data Science Stack Exchange

7 votes, 3 answers

Is there a good German Stemmer?

What I tried:

    # -*- coding: utf-8 -*-
    from nltk.stem.snowball import GermanStemmer
    st = GermanStemmer()
    token_groups = [(["experte", "Experte", "Experten", "Expertin", "Expertinnen"], []), (["geh", "gehe", "gehst", "geht", "gehen",…

nlp nltk stemming

asked Aug 08 '19 at 06:31

Martin Thoma


6 votes, 2 answers

Python stemmer for Georgian

I am currently working on Georgian text processing. Does anybody know of any stemmers/lemmatizers (or other NLP tools) for Georgian that I could use with Python? Thanks in advance!

python nlp python-3.x stemming

asked Feb 05 '21 at 07:06

Евгения Рубанова


1 vote, 1 answer

How does Snowball Stemmer work?

I have been reading about the Snowball Stemmer and wonder how it works. Does it use rules to stem words, or does it use machine learning? I checked snowballstem.org but could not find the answer!

stemming

asked Jan 12 '21 at 03:02

asmgx

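On the rules-versus-machine-learning question: Snowball stemmers are rule-based. Each language's algorithm is a hand-written set of suffix-stripping rules expressed in the Snowball language, with no trained model behind it. The sketch below is only a toy illustration of that style of rule, not the actual Snowball algorithm for any language; the rule list, the minimum stem length, and the `toy_stem` name are all invented for this example.

```python
# Toy rule-based suffix stripper. NOT the real Snowball algorithm:
# the rules and threshold below are invented for illustration only.

# (suffix, replacement) pairs, checked in order; the first match wins.
TOY_RULES = [
    ("sses", "ss"),
    ("ies", "i"),
    ("ing", ""),
    ("ed", ""),
    ("s", ""),
]

# Assumption: refuse to strip a suffix if fewer than 3 characters would remain.
MIN_STEM_LENGTH = 3


def toy_stem(word: str) -> str:
    """Lowercase the word and apply the first matching suffix rule."""
    lowered = word.lower()
    for suffix, replacement in TOY_RULES:
        if lowered.endswith(suffix):
            stem = lowered[: len(lowered) - len(suffix)]
            if len(stem) >= MIN_STEM_LENGTH:
                return stem + replacement
            # The remaining stem would be too short, so leave the word alone.
            return lowered
    return lowered


if __name__ == "__main__":
    for w in ["caresses", "jumping", "jumped", "cats", "is"]:
        print(w, "->", toy_stem(w))
```

Real Snowball algorithms add conditions on where in the word a suffix may be removed and run several ordered rule steps, but the principle is the same: deterministic string rules rather than learned parameters.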

1 vote, 2 answers

How to resolve country and nationality entities?

I've tried stemming and lemmatization on this, but nothing has quite worked so far. How can I resolve a country name and a nationality to a single entity? For example: Canada and Canadian should just be one entity: Canada. Uganda and Ugandan should…

named-entity-recognition stemming

asked Oct 14 '19 at 19:15

Learning stats by example

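One approach that sidesteps stemming for this kind of problem is a plain lookup table from each surface form to a canonical entity. The sketch below assumes a small hand-written mapping; `DEMONYM_TO_COUNTRY` and `resolve_country` are hypothetical names, and the two entries are just the examples from the question. A real system would need a much fuller list.

```python
from typing import Optional

# Hypothetical lookup table: nationality adjective -> canonical country name.
# Only the two examples from the question are included.
DEMONYM_TO_COUNTRY = {
    "canadian": "Canada",
    "ugandan": "Uganda",
}

# Country names map to themselves, so both forms resolve to one entity.
CANONICAL = {country.lower(): country for country in DEMONYM_TO_COUNTRY.values()}
CANONICAL.update(DEMONYM_TO_COUNTRY)


def resolve_country(token: str) -> Optional[str]:
    """Return the canonical country for a token, or None if it is unknown."""
    return CANONICAL.get(token.strip().lower())


if __name__ == "__main__":
    for t in ["Canada", "Canadian", "Ugandan", "France"]:
        print(t, "->", resolve_country(t))
```

A table is used here because stemmers strip suffixes by general rules and do not know that a nationality adjective and a country name refer to the same entity.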

0 votes, 1 answer

Faster preprocessing for Arabic texts

Background: I'm analyzing a relatively large text-based Arabic dataset using Python (50,000 to 70,000 text files; total size ~5 GB). I want to segment, stem, and POS-tag the dataset. I am aware of two Python libraries that can do these three tasks:…

python nlp dataset preprocessing stemming

asked Jul 30 '24 at 21:01

Alaa

