Using the HmmDecoder class with NBest tags

The tagging process considers multiple combinations of tags. The HmmDecoder class's tagNBest method returns an iterator over ScoredTagging objects, each of which pairs one candidate tag sequence with a score reflecting the decoder's confidence in it. The method takes a token list and a number specifying the maximum number of results desired.
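
To make the idea of ranking tag combinations concrete, here is a minimal, self-contained sketch that does not use LingPipe at all. The ToyNBest class, its two-token vocabulary, the three tags, and every score in it are made up for illustration. It simply enumerates every possible tag sequence, sums hypothetical log scores, and keeps the best few; a real decoder would use a far more efficient search than brute-force enumeration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy illustration of n-best tagging. Nothing here is LingPipe code.
class ToyNBest {
    static final String[] TAGS = {"at", "nn", "vb"};

    // Hypothetical holder for one tag sequence and its total log score.
    static final class Scored {
        final List<String> tags;
        final double score;
        Scored(List<String> tags, double score) {
            this.tags = tags;
            this.score = score;
        }
    }

    // Made-up log score for a token being emitted by a tag.
    static double emit(String token, String tag) {
        if (token.equals("the")) return tag.equals("at") ? -0.1 : -9.0;
        if (token.equals("force")) {
            if (tag.equals("nn")) return -1.0;
            if (tag.equals("vb")) return -2.0;
        }
        return -9.0;
    }

    // Made-up log score for one tag following another.
    static double transition(String prev, String next) {
        if (prev.equals("at") && next.equals("nn")) return -0.5;
        if (prev.equals("at") && next.equals("vb")) return -3.0;
        return -2.0;
    }

    // Recursively extends a partial tag sequence with every possible tag.
    static void extend(List<String> tokens, List<String> tags,
            double score, List<Scored> out) {
        int i = tags.size();
        if (i == tokens.size()) {
            out.add(new Scored(new ArrayList<>(tags), score));
            return;
        }
        for (String tag : TAGS) {
            double step = emit(tokens.get(i), tag);
            if (i > 0) step += transition(tags.get(i - 1), tag);
            tags.add(tag);
            extend(tokens, tags, score + step, out);
            tags.remove(tags.size() - 1);
        }
    }

    // Returns at most maxResults taggings, highest score first.
    static List<Scored> nBest(List<String> tokens, int maxResults) {
        List<Scored> all = new ArrayList<>();
        extend(tokens, new ArrayList<>(), 0.0, all);
        all.sort(Comparator.comparingDouble((Scored s) -> s.score).reversed());
        return all.subList(0, Math.min(maxResults, all.size()));
    }

    public static void main(String[] args) {
        for (Scored s : nBest(Arrays.asList("the", "force"), 2)) {
            System.out.printf("Score: %7.3f   Tags: %s%n", s.score, s.tags);
        }
    }
}
```

As with tagNBest, the caller supplies a token list and an upper bound on the number of results, and receives candidate taggings ordered from most to least likely.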

The previous sentence is not ambiguous enough to demonstrate the combination of tags. Instead, we will use the following sentence:

String[] sentence = {"Bill", "used", "the", "force",
    "to", "force", "the", "manager", "to", "tear", "the", "bill",
    "in", "two."};
List<String> tokenList = Arrays.asList(sentence);

An example of using this method is shown here, starting with a declaration for the maximum number of results:

int maxResults = 5;

Using the decoder object created in the previous section, we apply the tagNBest method to it as follows:

Iterator<ScoredTagging<String>> iterator =  
    decoder.tagNBest(tokenList, maxResults); 

The iterator allows us to access each of the five scored taggings. The ScoredTagging class possesses a score method that returns a value reflecting how likely the decoder considers that tagging to be. In the following code sequence, a printf statement displays this score. This is followed by a loop where each token and its tag are displayed.

The result is a score, followed by the word sequence with the tag attached:

while (iterator.hasNext()) {
    ScoredTagging<String> scoredTagging = iterator.next();
    System.out.printf("Score: %7.3f   Sequence: ",
        scoredTagging.score());
    for (int i = 0; i < tokenList.size(); ++i) {
        System.out.print(scoredTagging.token(i) + "/"
            + scoredTagging.tag(i) + " ");
    }
    System.out.println();
}

The output is as follows. Notice that the word "force" can have a tag of nn, jj, or vb:

    Score: -148.796   Sequence: Bill/np used/vbd the/at force/nn to/to force/vb the/at manager/nn to/to tear/vb the/at bill/nn in/in two./nn 
    Score: -154.434   Sequence: Bill/np used/vbn the/at force/nn to/to force/vb the/at manager/nn to/to tear/vb the/at bill/nn in/in two./nn 
    Score: -154.781   Sequence: Bill/np used/vbd the/at force/nn to/in force/nn the/at manager/nn to/to tear/vb the/at bill/nn in/in two./nn 
    Score: -157.126   Sequence: Bill/np used/vbd the/at force/nn to/to force/vb the/at manager/jj to/to tear/vb the/at bill/nn in/in two./nn 
    Score: -157.340   Sequence: Bill/np used/vbd the/at force/jj to/to force/vb the/at manager/nn to/to tear/vb the/at bill/nn in/in two./nn
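
The scores are negative, and values closer to zero indicate better taggings. One way to compare them is to convert a difference in scores into a likelihood ratio. The following sketch assumes the scores are base-2 log probabilities, which we believe is LingPipe's usual convention but which is not stated above; the ScoreComparison class and its likelihoodRatio method are hypothetical helpers, not part of LingPipe.

```java
// Hypothetical helper for comparing n-best scores. Not part of LingPipe.
class ScoreComparison {

    // Assumption: both scores are log probabilities in base 2, so the
    // difference in scores is the base-2 log of the likelihood ratio.
    static double likelihoodRatio(double betterLog2, double worseLog2) {
        return Math.pow(2.0, betterLog2 - worseLog2);
    }

    public static void main(String[] args) {
        // Scores copied from the output shown above.
        double[] scores = {-148.796, -154.434, -154.781, -157.126, -157.340};
        for (int i = 1; i < scores.length; i++) {
            System.out.printf(
                "Tagging 1 is about %.1f times as likely as tagging %d%n",
                likelihoodRatio(scores[0], scores[i]), i + 1);
        }
    }
}
```

Under that assumption, a gap of several points between the first and second scores means the decoder strongly prefers the first tagging, while the small gap between the second and third means it regards those two as nearly equally plausible.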