I am working on natural language processing, particularly mining Chinese texts. How can I customize my MATLAB order?

Dear MATLAB experts: I am considering buying a MATLAB and Simulink suite for student use. As I am working on natural language processing, I am unsure which add-ons to select. Could you tell me which toolboxes I need for NLP? Here are some of the models NLP depends on.
"Among the most important models are state machines, rule systems, logic, probabilistic models, and vector-space models. These models, in turn, lend themselves to a small number of algorithms, among the most important of which are state space search algorithms such as dynamic programming, and machine learning algorithms such as classifiers and EM and other learning algorithms.
Probabilistic models are crucial for capturing every kind of linguistic knowledge. Each of the other models (state machines, formal rule systems, and logic) can be augmented with probabilities. For example the state machine can be augmented with probabilities to become *the weighted automaton or Markov model.* We will spend a significant amount of time on hidden Markov models or HMMs, which are used everywhere in the field, in part-of-speech tagging, speech recognition, dialogue understanding, text-to-speech, and machine translation. *The key advantage of probabilistic models is their ability to solve the many kinds of ambiguity problems that we discussed earlier; almost any speech and language processing problem can be recast as: “given N choices for some ambiguous input, choose the most probable one”.*
For many language tasks, we rely on machine learning tools like classifiers and sequence models. Classifiers like decision trees, support vector machines, Gaussian Mixture Models and logistic regression are very commonly used. A hidden Markov model is one kind of sequence model; others are Maximum Entropy Markov Models or Conditional Random Fields.
Another tool that is related to machine learning is methodological; the use of distinct training and test sets, statistical techniques like cross-validation, and careful evaluation of our trained systems." From Speech and Language Processing by Daniel Jurafsky and James H. Martin.

Answers (0)
