Chapter 5. Text Handling

Cocoa’s text system contains a rich set of features such as text input, layout, display, editing, copying and pasting, and font management. It also includes support for advanced typesetting features such as kerning and ligature, multilingual support with Unicode, and sophisticated layout capabilities.

This chapter discusses the primary classes of Cocoa’s text handling system and how they relate to one another. Figure 5-1 shows the hierarchy of classes related to the text system.

Hierarchy of text system class

Figure 5-1. Hierarchy of text system class

The following four classes make up the core architecture of Cocoa’s text handling system:

The relationship between these core classes is based on the same Model-View-Controller (MVC) pattern used throughout the Application Kit (and discussed in Chapter 3). Figure 5-2 shows the division of responsibilities in these four classes using the MVC pattern.

Figure 5-2 shows the relationship between the four classes, but doesn’t show the one-to-many relationship that may exist between instances of these classes. Instances of NSTextStorage own and manage one or more NSLayoutManager objects. Similarly, each instance of NSLayoutManager owns one or more NSTextContainer objects, while each text container object is paired with an NSTextView object. The nature of these relationships is what gives Cocoa’s text handling system much of its flexibility and power.

NSTextStorage makes up the data storage layer for the text system. NSTextStorage’s data is stored as a sequence of Unicode characters, which makes the text system capable of localizing an application in any language. Unicode also contains character sets for mathematics and other technical fields. To see the huge number of characters that Unicode can represent,[4] launch Mac OS X’s Character Palette from the Input menu, as shown in Figure 5-3.

NSTextStorage is a subclass of NSMutableAttributedString. Every character in the text storage, therefore, is associated with a set of attributes that define appearance characteristics such as font and color (a single attributes dictionary will probably be applicable to a range of characters, but it might have a different set of attributes for each character). Cocoa defines a standard set of attributes, which were enumerated in Table 4-2. Additionally, developers may choose to assign their own application specific attributes to text, which could support features such as syntax coloring.

As mentioned earlier, NSTextView contains action methods that let controls change the appearance and layout of a selected region of text. These action methods let controls in the user interface (such as a bold-italic-underline button group, or the font and color panels) interact with the contents of the text view. However, using NSTextView’s API to effect these attribute changes programmatically is inefficient since those methods are intended for use as user interface actions; it is preferable to use the API provided by NSMutableAttributedString. For example, the method setAttributes:range: takes a dictionary with attribute key-value pairs and a range to which these attributes should be applied. Chapter 2 discusses attributed strings in more detail.

The job of NSLayoutManager is to accurately map characters and glyphs and lay out the resulting glyphs in text containers managed by the layout manager. Figure 5-4 shows a ligature for “Th” in the font Snell Roundhand and illustrates the mapping of characters into glyphs.

The distinction between characters and glyphs is important, as it represents the intersection between the text-system’s data and view layers. Glyphs, unlike Unicode character codes, can take on many forms, the visual appearance of which depends on the attributes of a particular character such as its font, the other characters around that character, and how ligatures are handled in the font being rendered. For example, the glyph for the letter “T” in the Times font is quite different for the glyph the Zapfino font defines for the same letter. Moreover, multiple characters in a sequence may actually define a single glyph. This is especially true in nonwestern alphabets and in fonts that define ligatures for certain pairs of letters.

The flow of information with NSLayoutManager goes in two directions. You just read about the flow from the data model to the view; however, experience shows that information must flow from the view to the data layer whenever you alter the content by typing, making selections, or changing formatting. To facilitate this, NSLayoutManager must be able reconcile the position of selections and the cursor in the glyph stream with character ranges in the storage layer.

The NSLayoutManager class has nearly 100 instance methods. Most of these methods are responsible for mapping characters to glyphs, setting attributes of glyphs, and controlling how they are laid out in the view. The API discussed here are the methods that control the text containers that define where text is laid out.

When laying out text, NSLayoutManager first converts a run of characters into a mapped sequence of glyphs. Once the layout manager knows exactly what needs to be laid out within a text container, it can check with the text container object for guidance in this layout. To do this, NSLayoutManager determines the bounding rectangle of the line of glyphs and passes it to the current text container as a proposed layout rectangle. The text container looks at this proposed rectangle and compares it to its own bounding rectangle . For example, if the proposed rectangle is too long, the text container returns the largest available rectangle for the current line in the text container to the layout manager. Additionally, the text container returns a remainder rectangle, which is the difference between the proposed rectangle and the accepted rectangle. NSLayoutManager repeats the proposal process with the remainder rectangle, and each successive remainder rectangle until the layout is complete.

When determining how to modify the proposed rectangle, NSTextContainer takes into account the direction in which the glyphs are sequenced in a line, and the direction lines are placed relative to their preceding lines. These directions are referred to as the line sweep direction and line movement direction , respectively. When a text container modifies the proposed rectangle, the text container can shorten the rectangle from the direction of the line sweep, and it is allowed to shift the rectangle in the direction of the line movement. By adhering to these rules, NSTextContainer and NSLayoutManager can break up a continuous line of glyphs into an arranged set of lines that can be rendered in a view. There is a clear division of responsibility here:

The method in NSTextContainer that performs these functions is:

lineFragmentRectForProposedRect:sweepDirection:
              movementDirection:remainingRect:

The sweepDirection: argument is of type NSLineSweepDirection, and the movementDirection: argument is of type NSLineMovementDirection. NSTextContainer returns the remainder rectangle to the sender through the remainingRect: argument, which is a pointer to an NSRect structure. Subclasses override this method to perform custom layout. If the text container object determines that the proposed rectangle cannot fit into the container, then the constant NSZeroRect is returned.



[4] You can find out more information about Unicode, including the characters that can be represented, at http://www.unicode.org.