Supervised learning

In supervised learning, we use data generated by sampling the process that we are trying to model, with both the hidden states and the outputs recorded. If we parameterize our HMM with simple discrete distributions, we can apply maximum likelihood estimation (MLE) directly: the transition distribution is obtained by counting the number of transitions from each state to every other state, and the emission distribution by counting the outputs produced in each hidden state. The transition and emission probabilities can therefore be computed as follows:
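A sketch of these counting estimates in LaTeX. The count symbols are notation assumed here, not taken from the original: N(i → j) is the number of times state i is immediately followed by state j, and N(i, s) is the number of times output s is observed while in state i.

```latex
\[
T(i,j) = \frac{N(i \to j)}{\sum_{k} N(i \to k)},
\qquad
E(i,s) = \frac{N(i,s)}{\sum_{s'} N(i,s')}
\]
```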

Here, T(i,j) is the transition probability from state i to state j, and E(i,s) is the emission probability of observing output s in state i.

Let's take a very simple example to make this clearer. We want to model the weather, and whether or not it rains, over a period of time. We assume that the weather can take three possible states:

Sunny (S)
Cloudy (C)
Windy (W)

The Rain variable can take two possible states: it rained (R) or it didn't rain (NR). An HMM for this setup would look something like this:

Let's say we have some observed data, D = {(S,NR), (S,NR), (C,NR), (C,R), (C,R), (W,NR), (S,NR), (W,R), (C,NR)}, listed in order over consecutive days. Here, the first element of each data point represents the weather on that day and the second element represents whether or not it rained that day. Now, using the formulas given earlier, we can compute the transition and emission probabilities. We will start by computing the transition probability from S to S:
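Worked out from D, assuming the data points are consecutive days: the sunny days that have a successor are days 1, 2, and 7, and they are followed by S, C, and W respectively. With N(i → j) as assumed notation for the number of observed i-to-j transitions, the calculation would be:

```latex
\[
T(S,S) = \frac{N(S \to S)}{N(S \to S) + N(S \to C) + N(S \to W)}
       = \frac{1}{1 + 1 + 1}
       = \frac{1}{3} \approx 0.33
\]
```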

Similarly, we can compute the transition probabilities for all the other combinations of states:

Hence, we have the complete set of transition probabilities over all the possible states of the weather. We can represent them in tabular form:

	Sunny(S)	Cloudy(C)	Windy(W)
Sunny(S)	0.33	0.33	0.33
Cloudy(C)	0	0.67	0.33
Windy(W)	0.5	0.5	0

Table 1: Transition probabilities for the weather model (rows: current state; columns: next state)
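The counting procedure behind the transition table can be sketched in a few lines of Python. This is a minimal illustration, not the original text's code: the function name `estimate_transitions` and the dictionary layout are assumptions made here, and the data points are assumed to be consecutive days.

```python
from collections import Counter

# Observed (weather, rain) pairs from the text, assumed to be in day order.
data = [("S", "NR"), ("S", "NR"), ("C", "NR"), ("C", "R"), ("C", "R"),
        ("W", "NR"), ("S", "NR"), ("W", "R"), ("C", "NR")]
states = ["S", "C", "W"]


def estimate_transitions(data, states):
    """MLE of T(i, j): count i -> j transitions and normalise per source state."""
    weather = [w for w, _ in data]
    # Count each pair of consecutive weather states.
    pair_counts = Counter(zip(weather, weather[1:]))
    transitions = {}
    for i in states:
        total = sum(pair_counts[(i, j)] for j in states)
        # A state with no outgoing transitions gets an all-zero row here.
        transitions[i] = {
            j: (pair_counts[(i, j)] / total if total else 0.0) for j in states
        }
    return transitions


T = estimate_transitions(data, states)
for i in states:
    print(i, [round(T[i][j], 2) for j in states])
```

The printed rows should agree with the transition table up to rounding. Note that the last day has no successor, so it contributes to no transition count; the denominator for each state is the number of transitions leaving it, not the number of days spent in it.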

Next, to compute the emission probabilities, we can again follow the formula given previously:
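As one illustrative entry (the choice of the Cloudy/Rain pair is made here for illustration, and N(i, s) is assumed notation for the number of days with weather i and output s), D contains four cloudy days, two of which had rain:

```latex
\[
E(C,R) = \frac{N(C,R)}{N(C,R) + N(C,NR)}
       = \frac{2}{2 + 2}
       = 0.5
\]
```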

Similarly, we can compute all the other values in the distribution:

Hence, our emission probabilities can be written in tabular form as follows:

	Sunny(S)	Cloudy(C)	Windy(W)
Rain (R)	0	0.5	0.5
No Rain (NR)	1	0.5	0.5

Table 2: Emission probabilities for the weather model (columns: weather state; rows: output)
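The emission table can be reproduced with the same counting idea. The sketch below is illustrative only; `estimate_emissions` and the variable names are assumptions introduced here, not part of the original text.

```python
from collections import Counter

# Observed (weather, rain) pairs from the text.
data = [("S", "NR"), ("S", "NR"), ("C", "NR"), ("C", "R"), ("C", "R"),
        ("W", "NR"), ("S", "NR"), ("W", "R"), ("C", "NR")]
states = ["S", "C", "W"]
symbols = ["R", "NR"]


def estimate_emissions(data, states, symbols):
    """MLE of E(i, s): count (state, output) pairs and normalise per state."""
    pair_counts = Counter(data)
    emissions = {}
    for i in states:
        total = sum(pair_counts[(i, s)] for s in symbols)
        # A state that never appears in the data gets an all-zero row here.
        emissions[i] = {
            s: (pair_counts[(i, s)] / total if total else 0.0) for s in symbols
        }
    return emissions


E = estimate_emissions(data, states, symbols)
for s in symbols:
    print(s, [E[i][s] for i in states])
```

Unlike the transition counts, the emission counts do not depend on the order of the data points, so every day in D contributes, including the last one.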

In the previous example, we saw how to compute the parameters of an HMM using MLE and some simple counting. The computation was easy because we assumed that the transition and emission probabilities were simple discrete conditional distributions. In more complex cases, we will need to estimate more parameters than we did in the previous section for the normal distribution.