The SentenceDetectorME class provides a sentPosDetect method that returns a Span object for each sentence. Use the same code as in the previous section, with two changes: call the sentPosDetect method instead of the sentDetect method, and replace the for-each statement with the one shown here:
Span[] spans = detector.sentPosDetect(paragraph);
for (Span span : spans) {
    System.out.println(span);
}
The output that follows is for the original paragraph. Each line is the string returned by the Span object's toString method, which shows the start offset (inclusive) and the end offset (exclusive) of a sentence:
[0..74)
[75..116)
[117..145)
[146..317)
The Span class possesses a number of methods. The following code sequence demonstrates the use of the getStart and getEnd methods to clearly show the text represented by those spans:
for (Span span : spans) {
    System.out.println(span + "["
        + paragraph.substring(span.getStart(), span.getEnd()) + "]");
}
The output shows the sentences identified:
[0..74)[When determining the end of sentences we need to consider several factors.]
[75..116)[Sentences may end with exclamation marks!]
[117..145)[Or possibly questions marks?]
[146..317)[Within sentences we may find numbers like 3.14159, abbreviations such as found in Mr. Smith, and possibly ellipses either within a sentence ..., or at the end of a sentence...]
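The offsets in this output are half-open ranges: the start index is included and the end index is not. This is why they can be passed directly to the substring method, and why the space at index 74 belongs to neither of the first two sentences. The following minimal sketch checks this using only the first two sentences and their offsets from the output. The class name SpanOffsetsDemo and the helper textOf are hypothetical, and plain int pairs stand in for Span objects so that the example runs without the OpenNLP library:

```java
public class SpanOffsetsDemo {
    // The first two sentences of the sample paragraph, separated by one space.
    static final String TEXT =
        "When determining the end of sentences we need to consider several factors. "
        + "Sentences may end with exclamation marks!";

    // Hypothetical helper: returns the text covered by the half-open range [start, end).
    static String textOf(String text, int start, int end) {
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        // Offsets copied from the first two lines of the output.
        int[][] offsets = { {0, 74}, {75, 116} };
        for (int[] o : offsets) {
            int length = o[1] - o[0]; // end is exclusive, so no "+ 1" is needed
            System.out.println("[" + o[0] + ".." + o[1] + ") length=" + length
                + " [" + textOf(TEXT, o[0], o[1]) + "]");
        }
        // Index 74 is the separating space; it lies outside both ranges.
        System.out.println("char at 74 is a space: " + (TEXT.charAt(74) == ' '));
    }
}
```

Because the end offset is exclusive, the length of a span is simply its end offset minus its start offset.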
There are a number of other Span methods that can be valuable. These are listed in the following table:
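Method | Meaning
contains | An overloaded method that determines whether another Span object or an index is contained within the target span
crosses | Determines whether two spans partially overlap, with neither span containing the other
length | The length of the span, that is, its end offset minus its start offset
startsWith | Determines whether the specified span begins at the start of the target span and is contained within it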
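To make these relationships concrete, the following sketch implements them on a small stand-in class. SimpleSpan and SpanRelationsDemo are hypothetical names, not part of OpenNLP, and the logic reflects an assumed reading of the half-open [start..end) semantics; the Span Javadoc remains the authority on the library's actual behavior:

```java
public class SpanRelationsDemo {
    // Hypothetical stand-in for a span over the half-open range [start, end).
    static final class SimpleSpan {
        final int start;
        final int end;

        SimpleSpan(int start, int end) {
            this.start = start;
            this.end = end;
        }

        // Number of characters covered; end is exclusive.
        int length() {
            return end - start;
        }

        // True if the index falls inside [start, end).
        boolean contains(int index) {
            return start <= index && index < end;
        }

        // True if the other span lies entirely inside this one.
        boolean contains(SimpleSpan other) {
            return start <= other.start && other.end <= end;
        }

        // True if the spans overlap but neither contains the other (assumed semantics).
        boolean crosses(SimpleSpan other) {
            boolean overlap = (start <= other.start && other.start < end)
                || (other.start <= start && start < other.end);
            return !contains(other) && !other.contains(this) && overlap;
        }

        // True if the other span begins where this one begins and fits inside it.
        boolean startsWith(SimpleSpan other) {
            return start == other.start && contains(other);
        }
    }

    public static void main(String[] args) {
        SimpleSpan first = new SimpleSpan(0, 74);    // first sentence in the output
        SimpleSpan second = new SimpleSpan(75, 116); // second sentence in the output
        SimpleSpan word = new SimpleSpan(0, 4);      // the word "When"
        SimpleSpan straddle = new SimpleSpan(70, 80); // made-up span across the boundary

        System.out.println("length: " + first.length());
        System.out.println("contains(word): " + first.contains(word));
        System.out.println("contains(74): " + first.contains(74));
        System.out.println("crosses(straddle): " + first.crosses(straddle));
        System.out.println("crosses(second): " + first.crosses(second));
        System.out.println("startsWith(word): " + first.startsWith(word));
    }
}
```

With the real Span class, the equivalent calls would be made on the objects returned by sentPosDetect rather than on hand-built ranges.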