Using Strings

So far in this course, we have only covered the Swift Standard Library, but when it comes to strings we must also include the Foundation framework, as it contains a lot of both basic and advanced text functionality that is missing from the Swift Standard Library.

Foundation is available on all Apple platforms and has been around for a long time (there is also a version for other platforms, re-implemented in Swift; see: https://github.com/apple/swift-corelibs-foundation). It is written in and for Objective-C, but a lot of its API has been updated to be easier to work with from Swift. Not all of it has been though, and as we'll see, you might run into some problems when converting Foundation types to Swift types.

Foundation's string type is NSString, and it works directly with UTF-16 encoded text. It does not know what the Character type is, and does not necessarily handle Unicode text correctly like Swift does. NSString can be used as Swift String and vice versa as they can share the same underlying storage.
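
The difference in how the two types count is easy to see in a small sketch. The sample strings below are our own choice, not from the course material; the first one is "é" written as a plain "e" followed by a combining accent:

```swift
import Foundation

// "é" written as "e" followed by U+0301 COMBINING ACUTE ACCENT.
let decomposed = "e\u{301}"
// 💔 lies outside the Basic Multilingual Plane, so UTF-16 needs two code units for it.
let heart = "\u{1F494}"

// Swift counts user-perceived characters (extended grapheme clusters).
let swiftCounts = [decomposed.count, heart.count]            // [1, 1]

// NSString counts UTF-16 code units.
let nsStringLengths = [NSString(string: decomposed).length,
                       NSString(string: heart).length]       // [2, 2]

// The same numbers are visible from Swift through the utf16 view.
let utf16Counts = [decomposed.utf16.count, heart.utf16.count] // [2, 2]
```

This is why an integer position taken from an NSString does not necessarily point at the start of a Character in the corresponding Swift String.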

Foundation also has CharacterSet, which, despite the name, is a set of UnicodeScalar values, not of Character values. It has several useful predefined sets, like CharacterSet.alphanumerics, .whitespaces, .decimalDigits, and more. You can only use them directly on a Character if you're lucky enough to have characters consisting of only one UnicodeScalar:

CharacterSet.alphanumerics.contains(character.unicodeScalars.first!)
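
For characters that may consist of several scalars, one possible policy is to require every scalar to be in the set. The following is a minimal sketch of that idea; isMember(of:) is a hypothetical helper of our own, not part of Foundation:

```swift
import Foundation

extension Character {
  /// Hypothetical helper: true only if every Unicode scalar
  /// of this character is a member of `set`.
  func isMember(of set: CharacterSet) -> Bool {
    return unicodeScalars.allSatisfy { set.contains($0) }
  }
}

let letter: Character = "a"
let space: Character = " "
let letterIsAlphanumeric = letter.isMember(of: .alphanumerics) // true
let spaceIsAlphanumeric = space.isMember(of: .alphanumerics)   // false
```

Unlike the version that force-unwraps the first scalar, this checks the whole character. Whether "all scalars" is the right rule depends on the set and on the text you expect.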

Foundation's range type is NSRange, and it uses integers to refer to positions in an NSString. It can do this efficiently because each element of NSString takes up the same amount of space. We can always convert a Swift Range to NSRange with NSRange(range, in: string), but we can't necessarily go the other way, as we will see later on.
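
As a small illustration of both directions, here is a sketch using a sample string of our own. The failable initializer Range(_:in:) is what Foundation offers for the way back:

```swift
import Foundation

let text = "💔 heart"

// Range -> NSRange always succeeds. NSRange counts UTF-16 code units,
// so the emoji (2 units) and the space put "heart" at location 3.
let swiftRange = text.range(of: "heart")!
let nsRange = NSRange(swiftRange, in: text)   // location 3, length 5

// NSRange -> Range is failable.
let roundTrip = Range(nsRange, in: text)      // equal to swiftRange

// This range lies completely outside the string.
let outOfBounds = Range(NSRange(location: 100, length: 1), in: text) // nil

// Location 1 is in the middle of the emoji's surrogate pair.
let insideEmoji = Range(NSRange(location: 1, length: 1), in: text)   // nil
```

Because the result is optional, code that receives an NSRange from a Foundation API should be prepared for the conversion to fail.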

Creating Strings

Let's look at creating strings by following these steps:

There are many ways of creating strings. You've already seen the string literal:
```
let literal = "string from literal"
```
There are also multi-line literals:
```
let multilineLiteral = """
  line 1
  line 2
    line 3 indented

  """
```
The result is "line 1\nline 2\n  line 3 indented\n". The closing three quotes must be on a line of their own, and any indentation that precedes them is removed from the beginning of every line in the string. Indentation beyond that, like the two extra spaces before line 3, is kept.
Use backslash to insert special characters like \\ (backslash), \t (horizontal tab), \n (line feed), \r (carriage return), \" (double quotation mark), and \' (single quotation mark).
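
To make the indentation rule concrete, here is a short sketch with a literal of our own, compared against the equivalent single-line literal written with escapes:

```swift
let multilineLiteral = """
    line 1
      line 2, indented by two spaces
    """

// The four spaces before the closing quotes are stripped from every line.
// There is no trailing line feed, because there is no empty line
// before the closing quotes.
let sameText = "line 1\n  line 2, indented by two spaces"

// A few of the escapes in an ordinary literal: tab, double quote, backslash.
let escaped = "tab:\t| quote:\" | backslash:\\"
```

Here multilineLiteral and sameText hold the same text.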
We can create characters directly from their hexadecimal Unicode code points, like this:
```
let blackDiamond = "\u{2666}" // ♦
let brokenHeart = "\u{1F494}" // 💔
```

To include variables in the text, we use string interpolation, like this:

let array = [1,2,3]
let stringInterpolation = "The array \(array) has \(array.count) items."
// "The array [1, 2, 3] has 3 items."

String(describing:) can produce a textual description of a value of absolutely any type, as shown here:

struct CustomType {
  let value: Int
  let otherValue: Bool
}

let customType = CustomType(value: 5, otherValue: false)
String(describing: customType) // "CustomType(value: 5, otherValue: false)"

We can customize the description, like this:

extension CustomType: CustomStringConvertible {
  var description: String {
    return "\(value) and \(otherValue)"
  }
}

String(describing: customType) // "5 and false"
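
The custom description is also picked up by string interpolation. A minimal sketch, using a hypothetical Temperature type of our own:

```swift
// Hypothetical type, used only for this illustration.
struct Temperature: CustomStringConvertible {
  let celsius: Double
  var description: String {
    return "\(celsius) °C"
  }
}

let reading = Temperature(celsius: 21.5)
let viaInit = String(describing: reading)   // "21.5 °C"
let viaInterpolation = "Now: \(reading)"    // "Now: 21.5 °C"
```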

Text can be repeated, as shown here:
```
String(repeating: "la", count: 5)
```

We can read text files, like this:

import Foundation

do {
  let fileContents = try String(contentsOfFile: "file.txt", encoding: .utf8)
} catch { /* ... */ }
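
A slightly fuller sketch of reading a file is shown below. So that it can run on its own, it first writes a small file to the temporary directory. The file name and contents are hypothetical, and UTF-8 is an assumption about the encoding:

```swift
import Foundation

// Hypothetical file location, chosen only for this example.
let fileURL = FileManager.default.temporaryDirectory
  .appendingPathComponent("strings-example.txt")

var fileContents = ""
do {
  // Write a file first, so there is something to read back.
  try "first line\nsecond line".write(to: fileURL, atomically: true, encoding: .utf8)
  // Naming the encoding avoids relying on Foundation to guess it.
  fileContents = try String(contentsOf: fileURL, encoding: .utf8)
} catch {
  print("Could not write or read \(fileURL.path): \(error)")
}
```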

Common Operations

Follow these steps to look at how to implement common operations on a string:

Many of the common sequence and collection methods are useful on strings too, as shown here:

let string = """
           Line 1
           line 2
           """
let range1 = ..<string.firstIndex(of: "1")!

// return the substring over range1
string[range1]

// return true if the string begins with "Line"
string.hasPrefix("Line")
// return true if the string ends with "2"
string.hasSuffix("2")

These mutate the string:

var mutablestring = string

// remove the characters in range1, and insert "line up" there.
mutablestring.replaceSubrange(range1, with: "line up")
// remove the characters in range1.
mutablestring.removeSubrange(range1)
// remove the first character.
mutablestring.removeFirst()
// remove the first 2 characters.
mutablestring.removeFirst(2)
// remove the last character.
mutablestring.removeLast()
// remove the last 2 characters.
mutablestring.removeLast(2)
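
The sketch below runs a few of these mutations in sequence on a sample string of our own and shows the value after each step. Mutating a string invalidates indices obtained earlier, so the range is computed from the current value just before it is used:

```swift
var text = "Hello, world"

text.removeFirst(7)   // text is now "world"
text.removeLast()     // text is now "worl"

// Compute the range from the current value of `text`,
// rather than reusing an index from before the mutations.
let upToR = ..<text.firstIndex(of: "r")!
text.replaceSubrange(upToR, with: "ca")   // text is now "carl"
```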

There aren't many operations specifically made for strings:

// return a new string in uppercase.
string.uppercased()
// return a new string in lowercase.
string.lowercased()

We get a lot more if we import Foundation, like this simple test for the existence of a substring:
```
string.contains(" 1")
```

All of the following methods return a new string with the changes; the original string is left intact:

// new string with all the words capitalised (ignoring language)
string.capitalized
// new string with all the words capitalised, using the rules of the language from the provided locale
string.capitalized(with: Locale.current)
// new string with all occurrences of one substring replaced with another
string.replacingOccurrences(of: "Line", with: "line")
// new string with all occurrences of a substring removed
string.replacingOccurrences(of: "Line", with: "")
// new string with all occurrences of a substring in the provided range replaced, using the provided options
string.replacingOccurrences(of: "line", with: "triangle", options: .caseInsensitive, range: string.startIndex..<string.firstIndex(of: "\n")!)

// the range of the first character that belongs to the provided CharacterSet
string.rangeOfCharacter(from: .decimalDigits)
// the range of the first occurrence of the substring
let range = string.range(of: "Line")!
// the substring over this range
string[range]
// the range of the line or lines containing the provided range
string.lineRange(for: range)
// new string with the characters in the provided CharacterSet removed from the beginning and the end
" \t  trim  \n ".trimmingCharacters(in: .whitespacesAndNewlines)
// a new string of the given length, by either removing characters from the end or adding 'withPad' to the end
"Padded".padding(toLength: 10, withPad: " ", startingAt: 0)
"Pad".padding(toLength: 10, withPad: "_ ", startingAt: 1)

The following methods return an array of strings:

// an array of strings, from splitting the original string over the provided substring
string.components(separatedBy: ". ")
// an array of strings, from splitting the original string over characters in the provided CharacterSet
string.components(separatedBy: .newlines)
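
To show what some of these Foundation methods return, here is a sketch with the results in comments. The sample strings other than the trimming and padding calls are our own. The last two lines compare components(separatedBy:) with the standard library's split(separator:):

```swift
import Foundation

let trimmed = " \t  trim  \n ".trimmingCharacters(in: .whitespacesAndNewlines)
// "trim"
let padded = "Padded".padding(toLength: 10, withPad: " ", startingAt: 0)
// "Padded    " (four spaces appended)
let patterned = "Pad".padding(toLength: 10, withPad: "_ ", startingAt: 1)
// "Pad _ _ _ " (padding starts at index 1 of "_ ", which is the space)

let lines = "Line 1\nline 2".components(separatedBy: .newlines)
// ["Line 1", "line 2"]

// Adjacent separators give empty strings with components(separatedBy:),
// while split(separator:) drops empty pieces by default.
let withEmpty = "a,,b".components(separatedBy: ",")
// ["a", "", "b"]
let withoutEmpty = "a,,b".split(separator: ",").map { String($0) }
// ["a", "b"]
```

Note that split(separator:) returns Substring values, which is why they are converted with String($0) here.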

Implementing Extra Text Operations on a String

Follow this step to implement extra text operations on a string:

Open Strings.playground on the Common string operations page and see if you can find more text operations on string, using autocomplete and the documentation in Xcode.

This section is focused on how we can use strings and the various operations on strings that are allowed in Swift. Next, we'll look at substrings in detail.

Activity B-1: All Ranges of a Substring

There is already a method on String for finding the first range of a substring. The method we write in this activity will find all of the ranges of a substring.

The aim is to use Xcode to create a method on String that finds all ranges of a substring.

Open the StringsExtra Xcode project, and go to the StringsExtra.swift file.

Enter the following code:

import Foundation

extension String {

The method has the same parameters as String.range:

  public func allRanges(of aString: String,
    options: String.CompareOptions = [],
    range searchRange: Range<String.Index>? = nil,
    locale: Locale? = nil) -> [Range<String.Index>] {

If no search range is given, we search the entire string:

    var searchRange = searchRange ?? startIndex..<endIndex
    var ranges = [Range<String.Index>]()

while let is a very useful combination of loop and optionals. It continues until self.range returns nil:

    while let foundRange = self.range(of: aString, options: options, range: searchRange, locale: locale) {
      ranges.append(foundRange)

If we are searching backwards, we need to narrow the search range from the right instead of from the left. We only narrow it by one character so we can find repeating substrings (like the five occurrences of lala in lalalalalala):

      searchRange = options.contains(.backwards) ?
        searchRange.lowerBound..<self.index(before: foundRange.upperBound) :
        self.index(after: foundRange.lowerBound)..<searchRange.upperBound
    }
    return ranges
  }
}
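
For reference, here is the method assembled in one piece, followed by a usage sketch on the lalalalalala example. The usage part and its variable names are our own. If you paste this into the StringsExtra project, leave out the extension, since the method already exists there:

```swift
import Foundation

extension String {
  // The method from this activity, in one piece.
  func allRanges(of aString: String,
                 options: String.CompareOptions = [],
                 range searchRange: Range<String.Index>? = nil,
                 locale: Locale? = nil) -> [Range<String.Index>] {
    var searchRange = searchRange ?? startIndex..<endIndex
    var ranges = [Range<String.Index>]()
    while let foundRange = self.range(of: aString, options: options,
                                      range: searchRange, locale: locale) {
      ranges.append(foundRange)
      searchRange = options.contains(.backwards) ?
        searchRange.lowerBound..<self.index(before: foundRange.upperBound) :
        self.index(after: foundRange.lowerBound)..<searchRange.upperBound
    }
    return ranges
  }
}

let song = String(repeating: "la", count: 6)   // "lalalalalala"
let forward = song.allRanges(of: "lala")
let backward = song.allRanges(of: "lala", options: .backwards)

// Offset of each match from the start of the string.
let forwardOffsets = forward.map {
  song.distance(from: song.startIndex, to: $0.lowerBound)
}
// [0, 2, 4, 6, 8]
let backwardOffsets = backward.map {
  song.distance(from: song.startIndex, to: $0.lowerBound)
}
// [8, 6, 4, 2, 0]
```

Both directions find the same five overlapping matches, in opposite order.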

Go to the unit tests in StringsExtraTests.swift.
Uncomment the first comment block, so these become active:
```
  let string = """
  func testAllRanges()
```
Run all unit tests and verify that they pass.

Activity B-2: Counting Words, Sentences, and Paragraphs

Perhaps the most straightforward way of counting the number of words in a string is to count the number of spaces and add one. But, even if you only have text using the Latin alphabet, this will often be wrong (there could be two spaces in a row, and doesn't is technically two words). Foundation has NSLinguisticTagger, which handles these things and other alphabets. Not all of its APIs have been updated for Swift yet, so it can be a bit cumbersome to use, but the method that we will use here is fairly straightforward.
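
The problem with counting spaces is easy to reproduce. The following sketch uses a sample sentence of our own, with two spaces in a row:

```swift
let sample = "Two  spaces in a row"   // note the double space

// Naive count: the number of spaces plus one.
let naiveCount = sample.filter { $0 == " " }.count + 1
// 6, although there are only 5 words

// Slightly better: split on whitespace and drop the empty pieces.
let splitCount = sample.split(whereSeparator: { $0.isWhitespace }).count
// 5
```

Splitting on whitespace fixes the double space, but it still treats doesn't as one word and cannot help with languages that do not separate words with spaces.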

The aim is to use Xcode to create a method on String that can count words, sentences, and paragraphs.

Open the StringsExtra Xcode project, and go to the StringsExtra.swift file.
Enter the following code:
```
extension String {
```
The method, countLinguisticTokens(ofType:), receives the linguistic unit to count as unit and the enumeration options as options.

NSLinguisticTagger can do a lot of advanced text analysis, such as detecting nouns, verbs, and so on, and finding the stem of words, but in this case we are only interested in linguistic tokens:

    let tagger = NSLinguisticTagger(tagSchemes: [.tokenType], options: 0)
    tagger.string = self

Like everything in Foundation, this class works on NSString, which sometimes uses NSRange instead of Range. Luckily, converting from Range to NSRange is no problem:

    let range = NSRange(startIndex..<endIndex, in: self)
    var result = 0

The closure passed to enumerateTags has parameters for a tag type, an NSRange, and a Boolean for whether or not it should stop, but in this case we are only interested in how many times it is called:

    tagger.enumerateTags(in: range, unit: unit, scheme: .tokenType, options: options, using: { _, _, _ in result += 1 })
    return result
  }
}

You can call it like this:

string.countLinguisticTokens(ofType: .paragraph)
string.countLinguisticTokens(ofType: .sentence)
string.countLinguisticTokens(ofType: .word)
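
The steps above use unit and options without showing where they are declared. The following is a sketch of what the complete method could look like. The parameter types follow from how enumerateTags is called, but the default value for options is our assumption. NSLinguisticTagger is, as far as we know, only available on Apple platforms, so the sketch is wrapped in a platform check:

```swift
#if canImport(Darwin)
import Foundation

extension String {
  // Assumed signature: the names `ofType`, `unit`, and `options` come from
  // the activity, but the default options are a guess.
  func countLinguisticTokens(
    ofType unit: NSLinguisticTaggerUnit,
    options: NSLinguisticTagger.Options = [.omitPunctuation, .omitWhitespace, .omitOther]
  ) -> Int {
    let tagger = NSLinguisticTagger(tagSchemes: [.tokenType], options: 0)
    tagger.string = self
    let range = NSRange(startIndex..<endIndex, in: self)
    var result = 0
    // The closure is called once per token of the requested unit.
    tagger.enumerateTags(in: range, unit: unit, scheme: .tokenType,
                         options: options) { _, _, _ in result += 1 }
    return result
  }
}

let wordCount = "Hello world".countLinguisticTokens(ofType: .word)
#endif
```

With different options, for example without .omitPunctuation, the tagger also reports punctuation tokens, so the counts depend on the options you pass.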

Go to the unit tests in StringsExtraTests.swift.

Uncomment the next comment block, so these become active:

let english = """
func testCountLinguisticTokens_English() {
let internationalText = """
func testCountLinguisticTokens_International() {

Run all unit tests and verify that they pass.