The tagging process considers multiple combinations of tags. The HmmDecoder class's tagNBest method returns an iterator of ScoredTagging objects, each of which pairs one candidate tag sequence with a score reflecting the tagger's confidence in that sequence. This method takes a token list and a number specifying the maximum number of results desired.
The previous sentence is not ambiguous enough to demonstrate the combination of tags. Instead, we will use the following sentence:
String[] sentence = {"Bill", "used", "the", "force",
    "to", "force", "the", "manager", "to", "tear", "the", "bill", "in", "two."};
List<String> tokenList = Arrays.asList(sentence);
An example of using this method is shown here, starting with a declaration for the maximum number of results:
int maxResults = 5;
Using the decoder object created in the previous section, we apply the tagNBest method to it as follows:
Iterator<ScoredTagging<String>> iterator = decoder.tagNBest(tokenList, maxResults);
The iterator allows us to access each of the scored taggings, at most five in this case. The ScoredTagging class possesses a score method that returns the score the model assigned to that tagging. The scores are log probabilities, so they are negative, and a value closer to zero indicates a more likely tagging. In the following code sequence, a printf statement displays this score. This is followed by a loop where each token and its tag are displayed.
The result is a score, followed by the word sequence with the tag attached:
while (iterator.hasNext()) {
    ScoredTagging<String> scoredTagging = iterator.next();
    System.out.printf("Score: %7.3f Sequence: ",
        scoredTagging.score());
    for (int i = 0; i < tokenList.size(); ++i) {
        System.out.print(scoredTagging.token(i) + "/"
            + scoredTagging.tag(i) + " ");
    }
    System.out.println();
}
The output is as follows. Notice that the word "force" can have a tag of nn, jj, or vb:
Score: -148.796 Sequence: Bill/np used/vbd the/at force/nn to/to force/vb the/at manager/nn to/to tear/vb the/at bill/nn in/in two./nn
Score: -154.434 Sequence: Bill/np used/vbn the/at force/nn to/to force/vb the/at manager/nn to/to tear/vb the/at bill/nn in/in two./nn
Score: -154.781 Sequence: Bill/np used/vbd the/at force/nn to/in force/nn the/at manager/nn to/to tear/vb the/at bill/nn in/in two./nn
Score: -157.126 Sequence: Bill/np used/vbd the/at force/nn to/to force/vb the/at manager/jj to/to tear/vb the/at bill/nn in/in two./nn
Score: -157.340 Sequence: Bill/np used/vbd the/at force/jj to/to force/vb the/at manager/nn to/to tear/vb the/at bill/nn in/in two./nn
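To see at a glance which words were tagged differently across the alternatives, the token/tag pairs can be grouped by token. The following is a minimal sketch that works on the printed sequences rather than on the ScoredTagging objects, so it does not depend on LingPipe. The TagVariants class and its collectTags method are hypothetical names introduced only for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TagVariants {

    // Hypothetical helper: maps each distinct token to the sorted set of
    // tags it received across all of the supplied "token/tag ..." sequences.
    static Map<String, Set<String>> collectTags(List<String> sequences) {
        Map<String, Set<String>> tagsByToken = new LinkedHashMap<>();
        for (String sequence : sequences) {
            for (String pair : sequence.trim().split("\\s+")) {
                // Split on the last slash so tokens containing "/" survive.
                int slash = pair.lastIndexOf('/');
                String token = pair.substring(0, slash);
                String tag = pair.substring(slash + 1);
                tagsByToken.computeIfAbsent(token, k -> new TreeSet<>()).add(tag);
            }
        }
        return tagsByToken;
    }

    public static void main(String[] args) {
        // The five tag sequences from the output above, without the scores.
        List<String> sequences = Arrays.asList(
            "Bill/np used/vbd the/at force/nn to/to force/vb the/at manager/nn to/to tear/vb the/at bill/nn in/in two./nn",
            "Bill/np used/vbn the/at force/nn to/to force/vb the/at manager/nn to/to tear/vb the/at bill/nn in/in two./nn",
            "Bill/np used/vbd the/at force/nn to/in force/nn the/at manager/nn to/to tear/vb the/at bill/nn in/in two./nn",
            "Bill/np used/vbd the/at force/nn to/to force/vb the/at manager/jj to/to tear/vb the/at bill/nn in/in two./nn",
            "Bill/np used/vbd the/at force/jj to/to force/vb the/at manager/nn to/to tear/vb the/at bill/nn in/in two./nn");

        // Print only the tokens that received more than one tag.
        collectTags(sequences).forEach((token, tags) -> {
            if (tags.size() > 1) {
                System.out.println(token + " -> " + tags);
            }
        });
    }
}
```

For the five sequences shown, this reports "force" with the tags jj, nn, and vb, and also shows that "used", "to", and "manager" each received two different tags. Note that the grouping is by token text, so the two occurrences of "force" are merged, while "Bill" and "bill" are treated as separate tokens.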