Questions tagged [hidden-markov-models]

42 questions
9 votes · 2 answers

Why are HMMs appropriate for speech recognition when the problem doesn't seem to satisfy the Markov property?

I'm learning about HMMs and their applications and trying to understand their uses. My knowledge is a bit spotty, so please correct any incorrect assumptions I'm making. The specific example I'm wondering about is using HMMs for speech…
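
For context, the two conditional-independence assumptions behind an HMM can be written out explicitly; questions like this one usually turn on whether speech really satisfies them:

$$P(q_{1:T}, o_{1:T}) = P(q_1)\prod_{t=2}^{T} P(q_t \mid q_{t-1}) \prod_{t=1}^{T} P(o_t \mid q_t)$$

In practice, speech recognizers soften the first-order state assumption by enlarging the state space, e.g. with context-dependent triphone states, so longer-range dependence is folded into the state itself.
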
5 votes · 2 answers

Combining multiple HMMs

Is there any way to combine multiple Hidden Markov Models trained from different sets of data? For example, I want to detect the phases of a sequential activity. I collect two sets of data by using two types of sensors: an accelerometer and video…
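
One common way to combine per-sensor HMMs is late fusion: train one model per modality and, assuming the streams are conditionally independent given the activity phase, add their log-likelihoods at decision time. A minimal sketch using the hmmlearn library with made-up data (the feature arrays, component count, and variable names are all illustrative, not from the question):

```python
import numpy as np
from hmmlearn import hmm  # assumption: hmmlearn is installed

rng = np.random.default_rng(0)
accel = rng.standard_normal((300, 3))  # stand-in accelerometer features
video = rng.standard_normal((300, 5))  # stand-in video features

# one HMM per modality, same number of hidden activity phases
hmm_accel = hmm.GaussianHMM(n_components=4, random_state=0).fit(accel)
hmm_video = hmm.GaussianHMM(n_components=4, random_state=0).fit(video)

# late fusion: score a recording under each model and sum the logs
combined_ll = hmm_accel.score(accel) + hmm_video.score(video)
print(combined_ll)
```

This fuses scores rather than parameters; merging two trained HMMs into a single model is a different (and harder) problem.
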
4 votes · 1 answer

Viterbi training vs. Baum-Welch algorithm

I'm trying to find the most probable path (i.e., sequence of states) in a hidden Markov model (HMM) using the Viterbi algorithm. However, I don't know the transition and emission matrices, which I need to estimate from the observations (data). To…
dx_mrt
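
The distinction in a nutshell: Baum-Welch re-estimates parameters from soft posterior counts computed by forward-backward, while Viterbi training (hard EM) decodes the single best path and re-estimates by counting along it. A minimal NumPy sketch of the Viterbi-training variant for a discrete-emission HMM (all names and the smoothing constant are illustrative):

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely state path for a discrete-emission HMM, in log space."""
    n_states, T = A.shape[0], len(obs)
    logd = np.log(pi) + np.log(B[:, obs[0]])
    back = np.zeros((T, n_states), dtype=int)
    for t in range(1, T):
        scores = logd[:, None] + np.log(A)   # scores[i, j]: enter j from i
        back[t] = scores.argmax(axis=0)
        logd = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):            # follow back-pointers
        path.append(back[t, path[-1]])
    return path[::-1]

def viterbi_training(obs, n_states, n_symbols, n_iter=20, seed=0):
    """Hard EM: decode with Viterbi, then re-estimate by counting."""
    rng = np.random.default_rng(seed)
    pi = rng.dirichlet(np.ones(n_states))    # pi left fixed for brevity
    A = rng.dirichlet(np.ones(n_states), size=n_states)
    B = rng.dirichlet(np.ones(n_symbols), size=n_states)
    for _ in range(n_iter):
        path = viterbi(obs, pi, A, B)
        A_cnt = np.full((n_states, n_states), 1e-3)  # smoothing: no zero rows
        B_cnt = np.full((n_states, n_symbols), 1e-3)
        for t in range(len(obs) - 1):
            A_cnt[path[t], path[t + 1]] += 1
        for t, o in enumerate(obs):
            B_cnt[path[t], o] += 1
        A = A_cnt / A_cnt.sum(axis=1, keepdims=True)
        B = B_cnt / B_cnt.sum(axis=1, keepdims=True)
    return pi, A, B
```

Viterbi training is cheaper per iteration but optimizes a cruder objective, so it can converge to a lower likelihood than full Baum-Welch.
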
4 votes · 0 answers

Next-Word Prediction, Language Models, N-grams

I was looking into how a next-word prediction engine like SwiftKey or XT9 can be implemented. Here's what I did: I read about n-grams at en.wikipedia.org/wiki/N-gram and aicat.inf.ed.ac.uk/entry.php?id=663, and I read about language models/Markov…
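
As a toy illustration of the n-gram idea (not how SwiftKey or XT9 actually work), a bigram next-word predictor needs nothing more than conditional counts:

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()  # stand-in corpus

# count which word follows which
bigrams = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    bigrams[w1][w2] += 1

def predict(word, k=3):
    """Return up to k most frequent continuations seen after `word`."""
    return [w for w, _ in bigrams[word].most_common(k)]

print(predict("the"))  # ['cat', 'mat'] on this corpus
```

A real engine would back off to lower-order n-grams and smooth unseen pairs; this sketch skips both.
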
3 votes · 1 answer

HMM tagger - Baum-Welch training

I am trying to implement a trigram HMM tagger for a language that has over 1000 tags. In my training data I have 459 tags. Now, if we consider that the states of the HMM are all possible bigrams of tags, that would leave us with $459^2$ states and…
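
To make the scale concrete:

$$459^2 = 210{,}681 \ \text{bigram states}, \qquad 459^3 = 96{,}702{,}579 \ \text{possible tag-trigram transitions.}$$
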
3 votes · 1 answer

Hidden Markov Model initial probability reestimate: Why $\pi^*_i = \gamma_i(1)$ instead of $\pi^*_i = \frac{\gamma_i(1)}{\sum_{j = 1}^N \gamma_j(1)}$

The sources I consulted state that in the Baum-Welch algorithm the re-estimate of the initial probability of state $i$ of the HMM is $\pi^*_i = \gamma_i(1)$. But $\gamma_i(t)$ is the probability of being in state $i$ at time…
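
The short answer is that the denominator is redundant: $\gamma_i(t)$ is already a posterior distribution over states at each time step, so it sums to one,

$$\sum_{j=1}^{N} \gamma_j(1) = \sum_{j=1}^{N} P(q_1 = j \mid O, \lambda) = 1, \qquad\text{hence}\qquad \frac{\gamma_i(1)}{\sum_{j=1}^{N} \gamma_j(1)} = \gamma_i(1).$$
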
3 votes · 1 answer

How do POMDPs and Dynamic Influence Diagrams differ?

To give some perspective, first consider the following diagram comparing Markov Chains, HMMs, MDPs, and POMDPs (I'm not sure who to credit for it), laid out along a fully observable vs. partially observable axis…
3 votes · 1 answer

Viterbi algorithm recursive justification

I have a question regarding the recursion in the Viterbi algorithm. Define $\pi(k, u, v)$ as the maximum probability of any sequence of length $k$ ending in the tag bigram $(u, v)$. The base case is obvious: $\pi(0, *, *) = 1$. The general…
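
For reference, the general step of this trigram formulation (standard in HMM-tagger notes, with $q$ the transition and $e$ the emission probability) is

$$\pi(k, u, v) = \max_{w}\,\bigl[\pi(k-1, w, u)\; q(v \mid w, u)\; e(x_k \mid v)\bigr].$$
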
3 votes · 1 answer

Is it viable to use an HMM to evaluate how well a catalogue is used?

I was interested in evaluating a catalogue that students would be using, to observe probabilistically how it is being used. The catalogue works by choosing cells in a temporal sequence, so for example: Student A has:…
2 votes · 1 answer

Viterbi Algorithm: initial state with ONE probability

The Viterbi Algorithm can be used to calculate the most likely path, based on observations in a Hidden Markov Model. Using the same notation as Wikipedia, "each element T1[i, j] of T1 stores the probability of the most likely path so far X = (x[1]…
Attilio
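
In that notation the initialization is $T_1[i, 1] = \pi_i \, B_{i, y_1}$; with a degenerate start distribution,

$$\pi_k = 1 \ \Longrightarrow\ T_1[i, 1] = 0 \ \text{for all } i \neq k,$$

so every surviving path is forced to begin in state $k$.
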
2 votes · 1 answer

Examples of differences between Hidden Markov Models and Bayesian Networks?

I am trying to understand more deeply the difference between Hidden Markov Models and Bayesian Networks. The general idea is that HMMs have a single variable which has probabilities of entering different states, known and unknown, whereas the…
2 votes · 1 answer

Counteracting numerical instability in HMM training

I am training an HMM with Baum-Welch for part-of-speech tagging. I am training the model with 79 hidden variables (part-of-speech tags) and 80,000 observed variables (words). I am working with log probabilities. To give you an idea, I defined the…
lo tolmencre
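
The standard fix for summing probabilities that are stored as logs is the log-sum-exp trick: shift by the maximum so the exponentials cannot underflow or overflow. A minimal sketch (the function name is mine):

```python
import numpy as np

def logsumexp(log_probs):
    """Stable log(sum(exp(x))), as needed in the forward-backward sums."""
    m = np.max(log_probs)
    if np.isneginf(m):                # every input is log(0)
        return -np.inf
    return m + np.log(np.sum(np.exp(log_probs - m)))

# naive exp() of these would underflow to 0 and give log(0) = -inf
print(logsumexp(np.array([-1000.0, -1000.0])))  # about -999.307
```
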
2 votes · 0 answers

Machine Learning Algorithm Recommendation For Sensor Data

I would like to classify data coming from a sensor. In the literature, Hidden Markov Models and SVMs are used, but I would like to improve results with other methods. The picture of how the data and classes look is as follows: x is the time axis and y is the sensor…
2 votes · 1 answer

Convergence of Markov model

I was learning about Hidden Markov Models and encountered this result about the convergence of a Markov model. For example, consider a weather model where on the first day the probability of the weather being sunny was 0.9 and of it being rainy was 0.1. The transition…
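
The convergence in question is to the stationary distribution of the transition matrix, which does not depend on the day-1 distribution. The matrix in the excerpt is cut off, so the one below is made up for illustration:

```python
import numpy as np

# hypothetical 2-state weather transition matrix: rows/cols = [sunny, rainy]
P = np.array([[0.8, 0.2],
              [0.4, 0.6]])

dist = np.array([0.9, 0.1])   # day-1 distribution from the question
for _ in range(30):
    dist = dist @ P            # propagate one day forward
print(dist)                    # approaches [2/3, 1/3]

# the limit solves pi = pi @ P, i.e. a left eigenvector of P for eigenvalue 1
vals, vecs = np.linalg.eig(P.T)
stat = np.real(vecs[:, np.argmax(np.real(vals))])
print(stat / stat.sum())       # same answer, [0.6667, 0.3333]
```
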
2 votes · 1 answer

Training an HMM with Baum-Welch gives different results across runs

I am running a Baum-Welch HMM algorithm (in R). The sequence vector contains a series of observations gathered from a dataset in which the data has 17 states. I can successfully run the HMM algorithm and it converges without any…
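
This is expected: Baum-Welch only climbs to a local optimum of the likelihood, so different random initializations land in different places. The usual remedies are fixing the seed or running several restarts and keeping the best-scoring model. A sketch of the restart idea in Python (the question used R; hmmlearn, its CategoricalHMM class, and the toy data here are assumptions):

```python
import numpy as np
from hmmlearn import hmm  # assumption: recent hmmlearn with CategoricalHMM

rng = np.random.default_rng(0)
obs = rng.integers(0, 17, size=500).reshape(-1, 1)  # stand-in sequence, 17 symbols

best_model, best_ll = None, -np.inf
for seed in range(10):                    # ten random restarts
    model = hmm.CategoricalHMM(n_components=17, n_iter=100, random_state=seed)
    model.fit(obs)
    ll = model.score(obs)                 # final log-likelihood of this run
    if ll > best_ll:
        best_model, best_ll = model, ll
print(best_ll)
```
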