2

What aspects of linguistics are necessary or good to know for natural language processing? What references do you recommend for studying those aspects? Thanks!

Tim

4 Answers

3

NLP is a big place; you might want to be more specific.

Within information retrieval, stemming is a linguistic idea that has become useful as a heuristic means of reducing vocabulary size. As a practitioner I learned about it from An Introduction to Information Retrieval.
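To make the vocabulary-reduction point concrete, here is a minimal sketch of the idea. It is a toy suffix stripper, not Porter's algorithm or any stemmer from the book; the suffix list and the names `toy_stem` and `vocabulary` are made up for illustration.

```python
import re

# Hypothetical, deliberately tiny suffix list (an assumption for illustration).
# Real stemmers use ordered rule sets with extra conditions on the stem.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def vocabulary(text, stem=False):
    """Return the set of distinct (optionally stemmed) word forms in text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {toy_stem(t) if stem else t for t in tokens}


text = ("The index indexes indexed documents; indexing a document "
        "helps searches and searching.")
raw = vocabulary(text)
stemmed = vocabulary(text, stem=True)
print(len(raw), "distinct forms before stemming,", len(stemmed), "after")
```

Forms such as "indexes", "indexed" and "indexing" collapse onto "index", so a query for one matches documents containing the others. A crude rule list like this will also conflate unrelated words, which is why stemming is better seen as a heuristic than as linguistic analysis.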

phs
2

It depends. If you want to use or implement well-known approaches or systems for a specific application, you can generally get by without any linguistics. If you want to improve on state-of-the-art solutions, you need at least a broad overview of general linguistics (things that apply to human languages as such), (non-CS) syntax theories, the reasons POS tags and phrase/dependency structures are designed the way they are, and so on. All the necessary fragments tend to be taught as part of 'NLP courses', though, so you can get by with a single source and rely on whoever teaches that course to have gathered the material from the various domains.

Ah, and proper knowledge of your target languages: you can do quite a lot with foreign text you don't understand, but having a linguist available for every target language is very useful.

Peteris
1

There is no good answer to your question. Much depends on the kind of NLP you want to do. Do you want to do man-machine interfaces, information retrieval, syntax checkers, machine translation, data extraction from corpora? Do you want to process text or speech? Are you interested in ill-formed sentences? Are you concerned with syntactic or semantic processing? And so on ...

Now, your question is about "aspects of linguistics [...] necessary or good to know for natural language processing". Is statistics an aspect of linguistics? Is formal language theory an aspect of linguistics? Or simply, is NLP an aspect of linguistics? If not, where is the border? Recall that formal language theory started with linguists such as Chomsky or Bar-Hillel.

My suggestion would be to study some of the systems that have been developed for NLP, at various stages of language processing; doing so will force you to extend your knowledge of linguistics as you go along. Use call by need when you learn, especially if you do not know what is essential.

Another interesting kind of system to study (also an NLP topic in its own right) is the kind used to extract linguistic data from corpora, whether lexical, syntactic or semantic. This data is then fed to your NLP systems to process actual texts for whatever purpose.
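As a rough illustration of the lexical end of that spectrum, the sketch below counts adjacent word pairs (bigrams) in a tiny corpus. The corpus and the function name `bigram_counts` are invented for the example; real extraction systems work on far larger corpora and need proper tokenization.

```python
import re
from collections import Counter


def bigram_counts(corpus):
    """Count adjacent word pairs within each sentence of the corpus."""
    counts = Counter()
    for sentence in corpus:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # zip pairs each token with its successor inside the same sentence.
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Made-up two-sentence corpus, for illustration only.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
]
counts = bigram_counts(corpus)
for pair, n in counts.most_common(3):
    print(pair, n)
```

Counts like these are the kind of raw lexical data that can later feed a statistical language model or a collocation lexicon.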

Statistics and formal languages are important for structuring your understanding, but they have to be motivated by linguistic considerations too. If you start with linguistics, however, you may get bogged down in countless very specific studies.

babou
1

The answer depends on what you want to do:

  1. For speech recognition: phonetics and phonology.
  2. For tagging and parsing: morphology and syntax.
  3. For language understanding: semantics and pragmatics.
ASDF