Chapter 9. LINQ and Lambdas: Get Control of Your Data

Images

It’s a data-driven world…it’s good to know how to live in it. Gone are the days when you could program for days, even weeks, without dealing with loads of data. Today, everything is about data. And that’s where LINQ comes in. LINQ not only lets you query data in a simple, intuitive way, but it also lets you group data and merge data from different data sources. And once you’ve got the hang of wrangling your data into manageable chunks, you can use lambda expressions to refactor your C# code and make it even more expressive (and impressive!).

Jimmy’s a Captain Amazing super-fan...

Meet Jimmy, one of the most prolific collectors of Captain Amazing comics, graphic novels, and paraphernalia. He knows all the Captain trivia, he’s got props from all the movies, and he’s got a comic collection that can only be described as, well, amazing.

Images

…but his collection’s all over the place

Jimmy may be passionate, but he’s not exactly organized. He’s trying to keep track of the most prized “crown jewel” comics of his collection, but he needs help. Can you build Jimmy an app to manage his comics?

Images

Use LINQ to query your collections

In this chapter we’ll learn about LINQ (or Language-Integrated Query). LINQ combines some really useful classes and methods with some powerful features built directly into C#, all created to help you work with sequences of data.

Let’s use Visual Studio to start exploring LINQ. Create a new Console App (.NET Core) project and give it the name LinqTest. Add this code, and when you get to the last line add a period and look at the IntelliSense window:

Images

Let’s use some of those new methods to finish your console app:

            IEnumerable<int> firstAndLastFive = numbers.Take(5).Concat(numbers.TakeLast(5));
            foreach (int i in firstAndLastFive)
            {
                Console.Write($"{i} ");
            }
        }
    }
}

Now run your app. It prints this line of text to the console:

1 2 3 4 5 95 96 97 98 99
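The opening lines of that program appeared only in a screenshot, so here is a minimal sketch of the whole app. It assumes the list was filled with the numbers 1 through 99 by a for loop, which is consistent with the output above; the loop itself is a reconstruction, not the original listing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace LinqTest
{
    class Program
    {
        static void Main(string[] args)
        {
            // Assumed setup: a list holding the numbers 1 through 99
            List<int> numbers = new List<int>();
            for (int i = 1; i <= 99; i++)
            {
                numbers.Add(i);
            }

            // First five elements followed by the last five
            IEnumerable<int> firstAndLastFive = numbers.Take(5).Concat(numbers.TakeLast(5));
            foreach (int i in firstAndLastFive)
            {
                Console.Write($"{i} ");
            }
            // Prints: 1 2 3 4 5 95 96 97 98 99
        }
    }
}
```

Without the using System.Linq; directive at the top, Take, TakeLast, and Concat would not show up on the list at all.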

So what did you just do?

LINQ (or Language INtegrated Query) is a combination of C# features and .NET classes that together help you work with sequences of data.

Let’s take a closer look at how you’re using the LINQ methods Take, TakeLast, and Concat.

numbers

This is the original List<int> that you created with a for loop.

Images

numbers.Take(5)

The Take method takes the first elements from a sequence.

Images

numbers.TakeLast(5)

The TakeLast method takes the last elements from a sequence.

Images

numbers.Take(5).Concat(numbers.TakeLast(5))

The Concat method concatenates two sequences together.

Images

LINQ works with any IEnumerable<T>

When you added the using System.Linq; directive to your code, your List of numbers suddenly got “superpowered”—a bunch of LINQ methods suddenly appeared on it. And you can do the same thing for any class that implements IEnumerable<T>.

When a class implements IEnumerable<T>, any instance of that class is a sequence:

  • That list of numbers from 1 to 99 was a sequence.

  • When you called its Take method, it returned a reference to a sequence that contained five elements.

  • When you called its TakeLast method, it returned another 5-element sequence.

  • And when you used Concat to combine the two 5-element sequences, it created a new sequence with 10 elements and returned a reference to that new sequence.

Images

Any time you have an object that implements the IEnumerable interface, you have a sequence that you can use with LINQ. Doing an operation on that sequence in order is called enumerating the sequence.

LINQ methods enumerate your sequences

You already know that foreach loops work with IEnumerable objects. Think about what a foreach loop does: it starts at the first element in the sequence, then does an operation on each element in the sequence in order. When a method goes through each item in a sequence in order, that’s called enumerating the sequence. And that’s how LINQ methods work.

Objects that implement IEnumerable can be enumerated.

e-nu-mer-ate, verb.

mention a number of things one by one. Suzy enumerated the toy cars in her collection for her dad, telling him each car’s make and model.

Here’s a really useful method that comes with LINQ. The static Enumerable.Range method generates a sequence of integers. Calling Enumerable.Range(8, 5) returns a sequence of 5 numbers starting with 8: 8, 9, 10, 11, 12.
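Here is a small sketch of Enumerable.Range in action. The variable name is just for illustration. Note that the second argument is a count, not an end value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        // Enumerable.Range(start, count): 5 numbers, starting at 8
        IEnumerable<int> fiveNumbers = Enumerable.Range(8, 5);
        Console.WriteLine(String.Join(", ", fiveNumbers));    // 8, 9, 10, 11, 12

        // The second argument is how many numbers to generate, not where to stop
        Console.WriteLine(Enumerable.Range(1, 99).Count());   // 99
    }
}
```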

Images

The LINQ methods in this exercise have names that make it obvious what they do. Some LINQ methods, like Sum, Average, Min, Max, Count, First, and Last, return a single value. The Sum method adds up the values in the sequence. The Average method returns their average value. The Min and Max methods return the smallest and largest values in the sequence. The Count method returns the number of elements. And the First and Last methods do exactly what it sounds like they do.

Other LINQ methods like Take, TakeLast, Concat, Reverse (which reverses the order in a sequence), and Skip (which skips the first elements in a sequence) return another sequence.
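To make the two families of methods concrete, here is a short sketch that calls each of them on the numbers 1 through 10. The sample sequence is an arbitrary choice for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        IEnumerable<int> numbers = Enumerable.Range(1, 10);   // 1 through 10

        // Methods that return a single value
        Console.WriteLine(numbers.Sum());       // 55
        Console.WriteLine(numbers.Min());       // 1
        Console.WriteLine(numbers.Max());       // 10
        Console.WriteLine(numbers.Count());     // 10
        Console.WriteLine(numbers.First());     // 1
        Console.WriteLine(numbers.Last());      // 10
        Console.WriteLine(numbers.Average());   // 5.5 (the decimal separator depends on your culture settings)

        // Methods that return another sequence
        Console.WriteLine(String.Join(" ", numbers.Skip(7)));             // 8 9 10
        Console.WriteLine(String.Join(" ", numbers.Take(3).Reverse()));   // 3 2 1
    }
}
```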

Images

LINQ isn’t just for numbers. It works with objects, too.

When Jimmy looks at stacks and stacks of disorganized comics, he might see paper, ink, and a jumbled mess. But when we developers look at them, we see something else: lots and lots of data just waiting to be organized. And how do we organize comic book data in C#? The same way we organize playing cards, bees, or items on Sloppy Joe’s menu: we create a class, then we use a collection to manage that class. So all we need to help Jimmy out is a Comic class, and code to help us bring some sanity to his collection. And LINQ will help!

LINQ works with objects

Do this!

Jimmy wanted to know how much some of his prize comics are worth, so he hired a professional comic book appraiser to give him prices for each of them. It turns out that some of his comics are worth a lot of money! Let’s use collections to manage that data for him.

  1. Create a new console app and add a Comic class.

    Use two automatic properties for the name and issue number.

    Images
  2. Add a method to build Jimmy’s catalog.

    Add this static Catalog field to the Comic class. It returns a sequence with Jimmy’s prized comics.

    Images
  3. Use a Dictionary to manage the prices.

    Add a static Comic.Prices field. It’s a Dictionary<int, decimal> that lets you look up the price of each comic using its issue number (using the collection initializer syntax for dictionaries that we learned in Chapter 8). Note that we’re using the IReadOnlyDictionary interface for encapsulation: it’s an interface that includes only the methods to read values (so we don’t accidentally change the prices):

    Images
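The three steps above can be sketched roughly like this. The ToString override is the one-liner this chapter asks you to add to the Comic class, and the comic names and issue numbers appear later in the chapter, but the prices are made-up values for illustration and the catalog is much shorter than Jimmy's real one.

```csharp
using System;
using System.Collections.Generic;

class Comic
{
    // Two automatic properties for the name and issue number
    public string Name { get; set; }
    public int Issue { get; set; }

    public override string ToString() => $"{Name} (Issue #{Issue})";

    // A short stand-in catalog (the real one has more comics)
    public static readonly IEnumerable<Comic> Catalog = new List<Comic>
    {
        new Comic { Name = "Woman's Work", Issue = 36 },
        new Comic { Name = "Black Monday", Issue = 74 },
        new Comic { Name = "The Death of the Object", Issue = 97 },
    };

    // Issue number -> price. These prices are hypothetical.
    // Exposing the dictionary as IReadOnlyDictionary hides Add and Remove from callers.
    public static readonly IReadOnlyDictionary<int, decimal> Prices =
        new Dictionary<int, decimal>
        {
            { 36, 650M },
            { 74, 75M },
            { 97, 3600M },
        };
}

class Program
{
    static void Main()
    {
        foreach (Comic comic in Comic.Catalog)
        {
            Console.WriteLine($"{comic}: {Comic.Prices[comic.Issue]}");
        }
    }
}
```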

We used a Dictionary to store the prices for the comics. We could have included a property called Price. We decided to keep information about the comic and price separate. We did this because prices for collectors’ items change all the time, but the name and issue number will always be the same. Do you think we made the right choice?

LINQ’s query syntax

So far you’ve seen LINQ’s methods. But they’re not quite enough on their own to answer the kinds of questions about data that we might have—or the questions that Jimmy has about his comic collection.

And that’s where the LINQ declarative query syntax comes in. It uses special keywords (including where, select, group, and join) to build queries directly into your code.

Let’s build a simple query now. It selects all the numbers in an int array that are under 37 and puts those numbers in ascending order. It does that using four clauses that tell it what object to query, what criteria to use to determine which of its members to select, how to sort the results, and how the results should be returned.

LINQ queries work on sequences (objects that implement IEnumerable<T>). They start with the from keyword:

from (variable) in (sequence)

This tells the query what sequence to execute against.
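Here is a minimal sketch of the simple query just described, with one comment per clause. The contents of the array are sample values chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        int[] values = { 0, 12, 44, 36, 92, 54, 13, 8 };   // sample data (an assumption)

        IEnumerable<int> result =
            from v in values     // what object to query
            where v < 37         // which of its members to select
            orderby v            // how to sort the results (ascending by default)
            select v;            // how the results should be returned

        Console.WriteLine(String.Join(" ", result));   // 0 8 12 13 36
    }
}
```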

Images
Images

Anatomy of a query

Images

Let’s explore how LINQ works by making a couple of small changes to the query:

  • That minus in the orderby clause is easy to miss. We’ll change that clause to use the descending keyword—that should make it easier to read.

  • The select clause you just wrote selected the comic, so the result of the query was a sequence of Comic references. Let’s replace it with an interpolated string that uses the comic range variable—now the result of the query is a sequence of strings.

Here’s the updated LINQ query. Each clause in the query produces a sequence that feeds into the next clause—we’ve added a table under each clause that shows its result.

Changing the select clause causes the query to return a sequence of strings.

Images
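A sketch of what the updated query might look like is below. The select clause and the mostExpensive variable name come from this chapter; the where threshold of 500, the short catalog, and the prices are assumptions made so the example can run on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Comic
{
    public string Name { get; set; }
    public int Issue { get; set; }
    public override string ToString() => $"{Name} (Issue #{Issue})";

    public static readonly IEnumerable<Comic> Catalog = new List<Comic>
    {
        new Comic { Name = "Woman's Work", Issue = 36 },
        new Comic { Name = "Black Monday", Issue = 74 },
        new Comic { Name = "The Death of the Object", Issue = 97 },
    };

    // Hypothetical prices
    public static readonly IReadOnlyDictionary<int, decimal> Prices =
        new Dictionary<int, decimal> { { 36, 650M }, { 74, 75M }, { 97, 3600M } };
}

class Program
{
    static void Main()
    {
        // Because the select clause builds a string, the result is IEnumerable<string>
        IEnumerable<string> mostExpensive =
            from comic in Comic.Catalog
            where Comic.Prices[comic.Issue] > 500
            orderby Comic.Prices[comic.Issue] descending
            select $"{comic} is worth {Comic.Prices[comic.Issue]:c}";

        foreach (string line in mostExpensive)
        {
            Console.WriteLine(line);
        }
    }
}
```

The :c format specifier prints the price as currency, so the exact symbol and separators depend on the culture settings of the machine running the code.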

The var keyword lets C# figure out variable types for you

We just saw that when we made the small change to the select clause, the type of sequence that the query returned changed: when it was select comic; the return type was IEnumerable<Comic>. When we changed it to select $"{comic} is worth {Comic.Prices[comic.Issue]:c}"; the return type changed to IEnumerable<string>. When you’re working with LINQ, that happens all the time, because you’re constantly tweaking your queries. It’s not always obvious exactly what type they return, and going back and updating all of your declarations can get annoying.

Luckily, C# gives us a really useful tool to help keep variable declarations simple and readable. You can replace the type in any local variable declaration with the var keyword, as long as the declaration also initializes the variable. So you can replace any of these declarations:

  IEnumerable<int> numbers = Enumerable.Range(1, 10);
  string s = $"The count is {numbers.Count()}";
  IEnumerable<Comic> comics = new List<Comic>();
  IReadOnlyDictionary<int, decimal> prices = Comic.Prices;

These declarations do exactly the same thing:

  var numbers = Enumerable.Range(1, 10);
  var s = $"The count is {numbers.Count()}";
  var comics = new List<Comic>();
  var prices = Comic.Prices;

When you use the var keyword, you’re telling C# to use an implicitly typed variable. We saw that same word—implicit—back in Chapter 8 when we talked about covariance. It means that C# figures out the types on its own.

And you don’t have to change any of your code. Just replace the types with var and everything works.

When you use var, C# figures out the variable’s type automatically

Go ahead and try it right now. Comment out the first line of the LINQ query you just wrote, then replace IEnumerable<Comic> with var:

Images

Now hover your mouse cursor over the variable name in the foreach loop to see its type:

Images

The IDE figured out the mostExpensive variable’s type—and it’s a type we haven’t even seen before. Remember how we talked in Chapter 7 about how interfaces can extend other interfaces? The IOrderedEnumerable interface is part of LINQ—it’s used to represent a sorted sequence—and it extends the IEnumerable<T> interface. Try commenting out the orderby clause and hover over the mostExpensive variable—you’ll find that it turns into an IEnumerable<Comic>. That’s because C# looks at the code to figure out the type of any variable you declare with var.

LINQ magnets

Images

We had a nice LINQ query that used the var keyword arranged with magnets on the refrigerator—but someone slammed the door and the magnets fell off! Rearrange the magnets so they produce the output at the bottom of the page.

Images

Output:

Get your kicks on route 66

LINQ Magnets Solution

Images

Rearrange the magnets so they produce the output at the bottom of the page.

Images

Output:

Get your kicks on route 66

Images

You really can use var in your variable declarations.

And yes, it really is that simple. A lot of C# developers declare local variables using var almost all the time, and include the type only when it makes the code easier to read. As long as you’re declaring the variable and initializing it in the same statement, you can use var.

There are a couple of rules: for example, you can only declare one variable at a time with var, you can’t use the variable you’re declaring in the declaration, and you can’t declare it equal to null. And if you create a variable named var, you won’t be able to use it as a keyword anymore. And you definitely can’t use var to declare a field or a property—you can only use it as a local variable inside a method. But if you stick to those ground rules, you can use var pretty much anywhere.
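Those ground rules can be sketched in a few lines. The declarations that break the rules are commented out so the example compiles; uncomment any one of them to see the compiler error. The variable names are only for illustration.

```csharp
using System;
using System.Collections.Generic;

class Program
{
    // static var random = new Random();   // won't compile: var can't declare a field

    static void Main()
    {
        var hours = 24;                   // inferred as int
        var initial = 'S';                // inferred as char
        var names = new List<string>();   // inferred as List<string>
        long radius = 3;                  // kept explicit: var radius = 3 would be an int

        // var a = 1, b = 2;     // won't compile: only one variable per var declaration
        // var nothing = null;   // won't compile: null gives C# no type to infer
        // var unset;            // won't compile: var needs an initializer

        names.Add("Jimmy");
        Console.WriteLine(hours.GetType().Name);     // Int32
        Console.WriteLine(initial.GetType().Name);   // Char
        Console.WriteLine(radius.GetType().Name);    // Int64
        Console.WriteLine(names.Count);              // 1
    }
}
```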

So when you did this in Chapter 4:

int hours = 24;
short RPM = 33;
long radius = 3;
char initial = 'S';
int balance = 345667 - 567;

Or this in Chapter 6:

SwordDamage swordDamage = new SwordDamage(RollDice(3));
ArrowDamage arrowDamage = new ArrowDamage(RollDice(1));

Or this in Chapter 8:

List<Card> cards = new List<Card>();

You could have done this:

var hours = 24;
var RPM = 33;
var radius = 3;
var initial = 'S';
var balance = 345667 - 567;

Or this:

var swordDamage = new SwordDamage(RollDice(3));
var arrowDamage = new ArrowDamage(RollDice(1));

Or this:

var cards = new List<Card>();

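... and your code would have worked the same way, with one subtlety: var infers int from the literals 33 and 3, so RPM and radius would be ints instead of a short and a long.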

But you can’t use var to declare a field or property.

class Program
{
   static var random = new Random(); // this will cause a compiler error

   static void Main(string[] args)
   {
      // ...
   }
}

LINQ is versatile

You can do a lot more than just pull a few items out of a collection. You can modify the items before you return them. And once you’ve generated a set of result sequences, LINQ gives you a bunch of methods that work with them. Top to bottom, LINQ gives you the tools you need to manage your data. Let’s do a quick review of some of the LINQ that we’ve already seen.

  • Modify every item returned from the query.

    This code will add a string onto the end of each string in an array. It doesn’t change the array itself—it creates a new sequence of modified strings.

    Images
  • Perform calculations on sequences.

    You can use the LINQ methods on their own to get statistics about a sequence of numbers.

    var random = new Random();
    var numbers = new List<int>();
    int length = random.Next(50, 150);
    for (int i = 0; i < length; i++)
        numbers.Add(random.Next(100));
    
    Console.WriteLine($@"Stats for these {numbers.Count()} numbers:
    The first 5 numbers: {String.Join(", ", numbers.Take(5))}
    The last 5 numbers: {String.Join(", ", numbers.TakeLast(5))}
    The first is {numbers.First()} and the last is {numbers.Last()}
    The smallest is {numbers.Min()}, and the biggest is {numbers.Max()}
    The sum is {numbers.Sum()}
    The average is {numbers.Average():F2}");

    The static String.Join method concatenates all of the items in a sequence into a string, specifying the separator to use between them.

    Images
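The first bullet above (modifying every item returned from the query) can be sketched like this. The sandwich names and the " on rye" suffix are made-up sample data.

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        string[] sandwiches = { "ham and cheese", "salami with mayo", "turkey and swiss" };

        // The select clause builds a new string for each item in the array
        var sandwichesOnRye =
            from sandwich in sandwiches
            select $"{sandwich} on rye";

        foreach (var sandwich in sandwichesOnRye)
        {
            Console.WriteLine(sandwich);
        }

        // The original array is untouched
        Console.WriteLine(sandwiches[0]);   // ham and cheese
    }
}
```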

LINQ queries aren’t run until you access their results

Do this!

When you include a LINQ query in your code, it uses deferred evaluation (sometimes called lazy evaluation). That means the LINQ query doesn’t actually do any enumerating or looping until your code executes a statement that uses the results of the query. That sounds a little weird, but it makes a lot more sense when you see it in action. Create a new Console App and add this code:

Images

Now run your app. Notice how the Console.WriteLine that prints "Set up the query" runs before the get accessor ever executes. That’s because the LINQ query won’t get executed until the foreach loop.
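The listing for this step was an image, so here is a minimal sketch that behaves the way the text describes. The class name PrintWhenGetting, its InstanceNumber property, and the listOfObjects variable are assumptions; the messages match the ones mentioned in this section.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class PrintWhenGetting
{
    private int instanceNumber;

    public int InstanceNumber
    {
        set { instanceNumber = value; }
        get
        {
            // Announce every read so we can see exactly when the query runs
            Console.WriteLine($"Getting #{instanceNumber}");
            return instanceNumber;
        }
    }
}

class Program
{
    static void Main(string[] args)
    {
        var listOfObjects = new List<PrintWhenGetting>();
        for (int i = 1; i < 5; i++)
        {
            listOfObjects.Add(new PrintWhenGetting() { InstanceNumber = i });
        }

        Console.WriteLine("Set up the query");
        var result =
            from o in listOfObjects
            select o.InstanceNumber;   // nothing is read yet

        Console.WriteLine("Run the foreach");
        foreach (var number in result)
        {
            Console.WriteLine($"Writing #{number}");
        }
        // Prints "Set up the query", then "Run the foreach", and only then
        // "Getting #1", "Writing #1", "Getting #2", "Writing #2", and so on
    }
}
```

Each "Getting" line shows up right before its matching "Writing" line because the query pulls one element at a time as the foreach loop asks for it.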

If you need the query to execute right now, you can force immediate execution by calling a LINQ method that needs to enumerate the entire list. One easy way is to call its ToList method, which turns it into a List<T>. Add this line, and change the foreach to use the new List:

    var immediate = result.ToList();

    Console.WriteLine("Run the foreach");
    foreach (var number in immediate)
        Console.WriteLine($"Writing #{number}");

Now run the app. This time you’ll see the get accessors called before the foreach loop starts executing—which makes sense, because ToList needs to access every element in the sequence to convert it to a list. Methods like Sum, Min, and Max also need to access every element in the sequence, so when you use them you’ll see immediate execution as well.

When you call ToList or another LINQ method that needs to access every element in the sequence, you’ll get immediate evaluation.

Set up the query
Getting #1
Getting #2
Getting #3
Getting #4
Run the foreach
Writing #1
Writing #2
Writing #3
Writing #4

Use a group query to separate your sequence into groups

Sometimes you really want to slice and dice your data. For example, Jimmy might want to group his comics by the decade they were published. Or maybe he wants to separate them by price (cheap ones in one collection, expensive ones in another). There are lots of reasons you’d want to group your data together. And that’s where the LINQ group query comes in handy.

Group this!

Images
  1. Create a new console app and add the Card classes and enums.

    Create a new .NET Core console app named CardLinq. Then go to the Solution Explorer panel, right-click on the project name, and choose Add >> Existing Items (or Add >> Existing Files on a Mac). Navigate to the folder where you saved the Two Decks project from Chapter 8. Add the files with the Suit and Value enums, then add the Deck, Card, and CardComparerBySuitThenValue classes.

    Make sure you modify the namespace in each file you added to match the namespace in Program.cs so your Main method can access the classes you added.

  2. Make your card comparable.

    We’ll be using a LINQ orderby clause to sort groups, so we need the Card class to be sortable. Luckily, this works exactly like the List.Sort method, which you learned about in Chapter 7. Modify your Card class to implement the IComparable interface.

    Images
  3. Modify the Deck.Shuffle method to support method chaining.

    The Shuffle method shuffles the deck. So all you need to do to make it support method chaining is to modify it to return a reference to the Deck instance that just got shuffled.

    Images
  4. Use a LINQ query with a group clause to group the cards by suit.

    The Main method will get 16 random cards by shuffling the deck, then using the LINQ Take method to pull the first 16 cards. Then it will use a LINQ query with a group clause to separate the deck into smaller sequences, with one sequence for each suit in the 16 cards.

    Images
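The steps above depend on classes from an earlier project, so here is a self-contained sketch of the same idea. The Card class, the enums, and the hand-rolled shuffle are simplified stand-ins (assumptions) for the Two Decks classes and Deck.Shuffle; the group query at the end is the part to focus on.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

enum Suit { Diamonds, Clubs, Hearts, Spades }
enum Value { Ace = 1, Two, Three, Four, Five, Six, Seven, Eight, Nine, Ten, Jack, Queen, King }

class Card : IComparable<Card>
{
    public Suit Suit { get; }
    public Value Value { get; }

    public Card(Value value, Suit suit)
    {
        Value = value;
        Suit = suit;
    }

    public override string ToString() => $"{Value} of {Suit}";

    // Compare by suit first, then by value, so Min, Max, and orderby can sort cards
    public int CompareTo(Card other)
    {
        int bySuit = Suit.CompareTo(other.Suit);
        return bySuit != 0 ? bySuit : Value.CompareTo(other.Value);
    }
}

class Program
{
    static void Main()
    {
        // Build a 52-card deck
        var deck = new List<Card>();
        for (int suit = 0; suit <= 3; suit++)
            for (int value = 1; value <= 13; value++)
                deck.Add(new Card((Value)value, (Suit)suit));

        // Shuffle in place by swapping each card with a randomly chosen earlier one
        var random = new Random();
        for (int i = deck.Count - 1; i > 0; i--)
        {
            int j = random.Next(i + 1);
            Card temp = deck[i];
            deck[i] = deck[j];
            deck[j] = temp;
        }

        // Group the first 16 cards by suit, one group per suit that appears
        var grouped =
            from card in deck.Take(16)
            group card by card.Suit into suitGroup
            orderby suitGroup.Key descending
            select suitGroup;

        foreach (var cardsInSuit in grouped)
        {
            Console.WriteLine($"Group: {cardsInSuit.Key}, Count: {cardsInSuit.Count()}, " +
                $"Minimum: {cardsInSuit.Min()}, Maximum: {cardsInSuit.Max()}");
        }
    }
}
```

Each group is itself a sequence, which is why Count, Min, and Max work on it. The cards are random, so the output changes from run to run.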

Anatomy of a group query

Let’s take a closer look at how that group query works.

Images
Images

Use join queries to merge data from two sequences

Every good collector knows that critical reviews can have a big impact on values. Jimmy’s been keeping track of reviewer scores from two big comic review aggregators, MuddyCritic and Rotten Tornadoes. Now he needs to match them up to his collection. How’s he going to do that?

LINQ to the rescue! Its join keyword lets you combine data from two sources using a single query. It does it by comparing items in one sequence with their matching items in a second sequence. (LINQ is smart enough to do this efficiently—it doesn’t actually compare every pair of items unless it has to.) The final result combines every pair that matches.

  1. Start off your query with the usual from clause. But instead of following it up with the criteria to determine what goes into the results, you add:

      join name in collection

    The join clause tells LINQ to enumerate both sequences to match up pairs with one member from each. It assigns name to the member it’ll pull out of the joined collection in each iteration. You’ll use that name in the where clause.

  2. Next you’ll add the on clause, which tells LINQ how to match the two collections together. You’ll follow it with the name of the member of the first collection you’re matching, followed by equals and the name of the member of the second collection to match it up with.

  3. You’ll continue the LINQ query with where and orderby clauses as usual. You could finish it with a normal select clause, but you usually want to return results that pull some data from one collection and other data from the other.

    Images
    { Name = "Woman's Work", Issue = 36, Critic = MuddyCritic, Score = 37.6 }
    { Name = "Black Monday", Issue = 74, Critic = RottenTornadoes, Score = 22.8 }
    { Name = "Black Monday", Issue = 74, Critic = MuddyCritic, Score = 84.2 }
    { Name = "The Death of the Object", Issue = 97, Critic = MuddyCritic, Score = 98.1 }
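The three steps above can be sketched as a small console app. The review data matches the results listed above; the class shapes are simplified assumptions, and the select clause builds an object without naming a class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

enum Critics { MuddyCritic, RottenTornadoes }

class Comic
{
    public string Name { get; set; }
    public int Issue { get; set; }
}

class Review
{
    public int Issue { get; set; }
    public Critics Critic { get; set; }
    public double Score { get; set; }
}

class Program
{
    static void Main()
    {
        var comics = new List<Comic>
        {
            new Comic { Name = "Woman's Work", Issue = 36 },
            new Comic { Name = "Black Monday", Issue = 74 },
            new Comic { Name = "The Death of the Object", Issue = 97 },
        };

        var reviews = new List<Review>
        {
            new Review { Issue = 36, Critic = Critics.MuddyCritic, Score = 37.6 },
            new Review { Issue = 74, Critic = Critics.RottenTornadoes, Score = 22.8 },
            new Review { Issue = 74, Critic = Critics.MuddyCritic, Score = 84.2 },
            new Review { Issue = 97, Critic = Critics.MuddyCritic, Score = 98.1 },
        };

        var matched =
            from comic in comics                  // step 1: the usual from clause...
            join review in reviews                // ...followed by join name in collection
            on comic.Issue equals review.Issue    // step 2: how to match the two sequences
            orderby comic.Issue                   // step 3: continue the query as usual
            select new { comic.Name, comic.Issue, review.Critic, review.Score };

        foreach (var item in matched)
        {
            Console.WriteLine(item);
        }
    }
}
```

A comic with two reviews shows up twice in the results, once per matching pair. A comic with no reviews would not show up at all.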

The result is a sequence of objects that have Name and Issue properties from the Comic, and Critic and Score properties from the Review. It can’t be a sequence of Comic objects, and it also can’t be a sequence of Review objects, because neither class has all of those properties. So what’s the type of the sequence generated by the query?

Use the new keyword to create anonymous types

You’ve been using the new keyword since Chapter 3 to create instances of objects. Every time you use it, you include a type (so the statement new Guy() creates an instance of the type Guy). But you can also use the new keyword without a type to create an anonymous type. That’s a perfectly valid type that has read-only properties, but doesn’t have a name. You can add properties to your anonymous type by using an object initializer.

Here’s what that looks like:

using System;

public class Program
{
    public static void Main()
    {
        var whatAmI = new { Color = "Blue", Flavor = "Tasty", Height = 37 };
        Console.WriteLine(whatAmI);
    }
}

Try pasting that into a new console app and running it. You’ll see this output:

{ Color = Blue, Flavor = Tasty, Height = 37 }

Now hover over whatAmI in the IDE and have a look at the IntelliSense window:

a-non-y-mous, adjective.

not identified by name. Secret Agent Dash Martin used his alias to become anonymous and keep the enemy spies from recognizing him.

Images

The whatAmI variable holds a reference, just like any other reference variable. It points to an object on the heap, and you can use it to access that object’s members (in this case, its three properties).

  Console.WriteLine($"My color is {whatAmI.Color} and I'm {whatAmI.Flavor}");

Besides the fact that they don’t have names, anonymous types are just like any other types.

Images
Images

Now what if we need to get their shirt size? Suppose we have a sequence called jerseys whose items have a Number property and a Size property. A join would work really well for combining the data:

     var doubleDigitShirtSizes =
        from player in players
        where player.Number > 10
        join shirt in jerseys
        on player.Number equals shirt.Number
        select shirt;

Q: That query will just give me a bunch of objects. What if I want to connect each player to his shirt size, and I don’t care about the number at all?

A: That’s what anonymous types are for—you can construct an anonymous type that only has the data you want in it. And it lets you pick and choose from the various collections that you’re joining together, too.

So you can select the player’s name and the shirt’s size, and nothing else:

     var doubleDigitShirtSizes =
         from player in players
         where player.Number > 10
         join shirt in jerseys
         on player.Number equals shirt.Number
         select new {
                player.Name,
                shirt.Size
     };
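To try that query out, you need some data to run it against. Here is a self-contained sketch; the Player and Jersey classes and all of the sample names, numbers, and sizes are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Player
{
    public string Name { get; set; }
    public int Number { get; set; }
}

class Jersey
{
    public int Number { get; set; }
    public string Size { get; set; }
}

class Program
{
    static void Main()
    {
        // Made-up sample data
        var players = new List<Player>
        {
            new Player { Name = "Ana", Number = 7 },
            new Player { Name = "Bo", Number = 11 },
            new Player { Name = "Cy", Number = 23 },
        };
        var jerseys = new List<Jersey>
        {
            new Jersey { Number = 7, Size = "Small" },
            new Jersey { Number = 11, Size = "Large" },
            new Jersey { Number = 23, Size = "Medium" },
        };

        var doubleDigitShirtSizes =
            from player in players
            where player.Number > 10
            join shirt in jerseys
            on player.Number equals shirt.Number
            select new { player.Name, shirt.Size };

        // Each item has exactly two properties, Name and Size
        foreach (var item in doubleDigitShirtSizes)
        {
            Console.WriteLine($"{item.Name} wears size {item.Size}");
        }
    }
}
```

The anonymous type picks up its property names (Name and Size) from the members used in the select clause, so there is no need to write Name = player.Name.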

The IDE is smart enough to figure out exactly what results you’ll be creating with your query. If you create a loop to enumerate through the results, as soon as you type the variable name the IDE will pop up an IntelliSense list.

Images

Notice how the list has Name and Size in it. If you added more items to the select clause, they’d show up in the list too, because the query would create a different anonymous type with different members.

Q: How do I write a method that returns an anonymous type?

A: You don’t. Methods cannot return anonymous types. C# doesn’t give you a way to do that. You can’t declare a field or a property with an anonymous type, either. And you also can’t use an anonymous type for a parameter in a method or a constructor—that’s why you can’t use the var keyword with any of those things.

And when you think about it, these things make sense. Whenever you use var in a variable declaration, you always have to include a value, which the C# compiler or IDE uses to figure out the type of the variable. But if you’re declaring a field or a method parameter, there’s no way to specify that value, which means there’s no way for C# to figure out the type. (Yes, you can specify a value for a property, but that’s not really the same thing: technically, the value is set just before the constructor is called.)

You can only use the var keyword when you’re declaring a variable. You can’t write a method that returns an anonymous type, or which takes one as a parameter, or use one with a field or a property.

Images

Unit tests help you make sure your code works

You’ve sleuthed out a lot of bugs in the first 8 chapters of this book, so you know how easy it is to write code that doesn’t do exactly what you intended it to do. Luckily, there’s a way for us to help find bugs so we can fix them. Unit tests are automated tests that help you make sure your code does what it’s supposed to do. Each unit test is a method that makes sure that a specific part of the code (the “unit” being tested) works. If the method runs without throwing an exception, it passes. If it throws an exception, it fails. And most large programs have a suite of tests that cover most or all of the code.

Visual Studio has built-in unit testing tools to help you write your tests and track which ones pass or fail. The unit tests in this book will use MSTest, a unit test framework (which means that it’s a set of classes that give you the tools to write unit tests) developed by Microsoft.

Visual Studio also supports unit tests written in NUnit and xUnit, two popular open source unit test frameworks for C# and .NET code.

Visual Studio for Windows has the Test Explorer window

Open the Test Explorer window by choosing View >> Test Explorer from the menu. It shows you the unit tests on the left, and the results of the most recent run on the right. The toolbar has buttons to run all tests, run a single test and repeat the last run.

Images

When you add unit tests to your solution, you can run your tests by clicking the Run All Tests button. You can debug your unit tests on Windows by choosing Test >> Debug All Tests, and on Mac by clicking Debug all tests in the Unit Test pad.

Images

Visual Studio for Mac has the Unit Test pad

Open the Unit Test pad by choosing View >> Pads >> Unit Tests from the menu. It has buttons to run or debug your tests. When you run the unit tests, the IDE displays the results in a Test Results pad (usually at the bottom of the IDE window).

Images

Add a unit test project to your solution

  1. Add a new MSTest (.NET Core) project.

    Right-click on the solution name in the Solution Explorer, then choose Add >> New Project... from the menu. Name your project JimmyLinqUnitTests.

    Images
  2. Add a dependency on your existing project.

    You’ll be building unit tests for the ComicAnalyzer class. When you have two different projects in the same solution, they’re independent—by default, the classes in one project can’t use classes in another project—so we’ll need to set up a dependency to let your unit tests use ComicAnalyzer.

    Expand the JimmyLinqUnitTests project in the Solution Explorer, then right-click on Dependencies and choose Add Reference... from the menu. Check the JimmyLinq project that you created for the exercise.

    Images
  3. Make your ComicAnalyzer class public.

    When Visual Studio added the unit test project, it created a class called UnitTest1 in a file named UnitTest1.cs. Edit that file and try adding the using JimmyLinq; directive inside the namespace:

    Images

    Hmm, something’s wrong—the IDE won’t let you add the directive. The reason is that the JimmyLinq project has no public classes, enums, or other members. Try modifying the Critics enum to make it public: public enum Critics – then go back and try adding the using directive. Now you can add it! The IDE saw that the JimmyLinq namespace has public members, and added it to the pop-up window.

    Now change the ComicAnalyzer declaration to make it public: public static class ComicAnalyzer

    Uh-oh—something’s wrong. Did you get a bunch of “Inconsistent accessibility” compiler errors?

    Images

    The problem is that ComicAnalyzer is public, but it exposes members that have no access modifiers, which makes them internal—so other projects in the solution can’t see them. Add the public access modifier to every class and enum in the JimmyLinq project. Now your solution will build again.

Write your first unit test

The IDE added a class called UnitTest1 to your new MSTest project. Rename the class (and the file) ComicAnalyzerTests. The class contains a test method called TestMethod1. Next, give it a very descriptive name: rename the method ComicAnalyzer_Should_Group_Comics. Here’s the code for your unit test class:

Images

Now run your test by choosing Test >> Run All Tests (Windows) or Run >> Run Unit Tests (Mac) from the menu. The IDE will pop up a Test Explorer window (Windows) or Test Results panel (Mac) with the test results.

Test method JimmyLinqUnitTests.ComicAnalyzerTests.ComicAnalyzer_Should_Group_Comics threw exception: System.Collections.Generic.KeyNotFoundException: The given key '2' was not present in the dictionary.

This is the result of a failed unit test. Look for this icon in Windows: Images or this message at the bottom of the IDE window in Visual Studio for Mac: Images – that’s how you see a count of your failed unit tests.

Did you expect that unit test to fail? Can you figure out what went wrong with the test?

Write a unit test for the GetReviews method

The unit test for the GroupComicsByPrice method used MSTest’s static Assert.AreEqual method to check expected values against actual ones. But the GetReviews method returns a sequence of strings, not an individual value. We could use Assert.AreEqual to compare individual elements in that sequence, just like we did with the last two assertions, using LINQ methods like First to get specific elements. But that would take a LOT of code.

Luckily, MSTest has a better way to compare collections: the CollectionAssert class has static methods for comparing expected versus actual collection results. So if you have a collection with expected results and a collection with actual results, you can compare them like this:

 CollectionAssert.AreEqual(expectedResults, actualResults);

If the expected and actual results don’t match, the test will fail. Go ahead and add this test to validate the ComicAnalyzer.GetReviews method:

[TestMethod]
public void ComicAnalyzer_Should_Generate_A_List_Of_Reviews()
{
     var testReviews = new[]
     {
         new Review() { Issue = 1, Critic = Critics.MuddyCritic, Score = 14.5},
         new Review() { Issue = 1, Critic = Critics.RottenTornadoes, Score = 59.93},
         new Review() { Issue = 2, Critic = Critics.MuddyCritic, Score = 40.3},
         new Review() { Issue = 2, Critic = Critics.RottenTornadoes, Score = 95.11},
     };

     var expectedResults = new[]
     {
         "MuddyCritic rated #1 'Issue 1' 14.50",
         "RottenTornadoes rated #1 'Issue 1' 59.93",
         "MuddyCritic rated #2 'Issue 2' 40.30",
         "RottenTornadoes rated #2 'Issue 2' 95.11",
     };

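     // testComics is assumed to be declared elsewhere in this test class; it isn't shown in this listing
     var actualResults = ComicAnalyzer.GetReviews(testComics, testReviews).ToList();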
     CollectionAssert.AreEqual(expectedResults, actualResults);
}
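If you want to see what that test is checking without setting up a test project, here is a console sketch. The GetReviews body is only one possible implementation (an assumption, not the solution to the exercise), and the two test comics are assumed to be named "Issue 1" and "Issue 2" based on the expected strings above. It uses the LINQ SequenceEqual method as a stand-in for CollectionAssert.AreEqual.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public enum Critics { MuddyCritic, RottenTornadoes }

public class Comic
{
    public string Name { get; set; }
    public int Issue { get; set; }
}

public class Review
{
    public int Issue { get; set; }
    public Critics Critic { get; set; }
    public double Score { get; set; }
}

public static class ComicAnalyzer
{
    // One possible implementation: join each comic to its reviews and format a line for each pair
    public static IEnumerable<string> GetReviews(IEnumerable<Comic> comics, IEnumerable<Review> reviews)
    {
        return
            from comic in comics
            join review in reviews on comic.Issue equals review.Issue
            orderby comic.Issue
            select $"{review.Critic} rated #{comic.Issue} '{comic.Name}' " +
                   review.Score.ToString("0.00", CultureInfo.InvariantCulture);
    }
}

class Program
{
    static void Main()
    {
        var testComics = new[]
        {
            new Comic { Issue = 1, Name = "Issue 1" },
            new Comic { Issue = 2, Name = "Issue 2" },
        };
        var testReviews = new[]
        {
            new Review { Issue = 1, Critic = Critics.MuddyCritic, Score = 14.5 },
            new Review { Issue = 1, Critic = Critics.RottenTornadoes, Score = 59.93 },
            new Review { Issue = 2, Critic = Critics.MuddyCritic, Score = 40.3 },
            new Review { Issue = 2, Critic = Critics.RottenTornadoes, Score = 95.11 },
        };
        var expectedResults = new[]
        {
            "MuddyCritic rated #1 'Issue 1' 14.50",
            "RottenTornadoes rated #1 'Issue 1' 59.93",
            "MuddyCritic rated #2 'Issue 2' 40.30",
            "RottenTornadoes rated #2 'Issue 2' 95.11",
        };

        var actualResults = ComicAnalyzer.GetReviews(testComics, testReviews).ToList();

        // True only when both sequences hold the same strings in the same order
        Console.WriteLine(expectedResults.SequenceEqual(actualResults));   // True
    }
}
```

Formatting the score with CultureInfo.InvariantCulture keeps the decimal point the same on every machine, which matters when a test compares exact strings.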

Now run your tests again. You should see two unit tests pass.

Write unit tests to handle edge cases and weird data

In the real world, data is messy. For example, we never really told you exactly what review data is supposed to look like. You’ve seen review scores between 0 and 100. Did you assume those were the only values allowed? That’s definitely the way some review websites in the real world operate. But what if we get some weird review scores—like negative ones, or really big ones, or zero? And what if we get more than one score from a reviewer for an issue? Even if these things aren’t supposed to happen, they might happen.

We want our code to be robust, which means that it handles problems, failures, and especially bad input data well. So let’s build a unit test that passes some weird data to GetReviews and makes sure it doesn’t break.

Images

Adding unit tests that handle edge cases and weird data can help you spot problems in your code that you wouldn’t find otherwise.

Images

Your projects actually go faster when you write unit tests.

We’re serious! It may seem counterintuitive that it takes less time to write more code, but if you’re in the habit of writing unit tests, your projects go a lot more smoothly because you find and fix bugs early. You’ve written a lot of code so far in the first eight and a half chapters in this book, which means you’ve almost certainly had to track down and fix bugs in your code. When you fixed those bugs, did you have to fix other code in your project too? When we find an unexpected bug, we often have to stop what we’re doing to track it down and fix it, and switching back and forth like that—losing our train of thought, having to interrupt our flow—can really slow things down. Unit tests help you find those bugs early, before they have a chance to interrupt your work.

Use the => operator to create lambda expressions

We left you hanging back at the beginning of the chapter. Remember that mysterious line we asked you to add to the Comic class? Here it is again:

  public override string ToString() => $"{Name} (Issue #{Issue})";

You’ve been using that ToString method throughout the chapter—you know it works. What would you do if we asked you to rewrite that method the way you’ve been writing methods so far? Would you write something like this:

  public override string ToString() {
      return $"{Name} (Issue #{Issue})";
  }

And you’d basically be right. So what’s going on? What, exactly, is that => operator?

The => operator that you used in the ToString method is the lambda operator. You can use => to define a lambda expression, or an anonymous function defined within a single statement. Lambda expressions look like this:

 (input-parameters) => expression;

There are two parts to a lambda expression:

  • The input-parameters is a list of parameters, just like you’d use when you declare a method. If there’s only one parameter, you can leave off the parentheses.

  • The expression is any C# expression: it can be an interpolated string, a statement that uses an operator, a method call—pretty much anything you would put in a statement.

Lambda expressions may look a little weird at first, but they’re just another way of using the same familiar C# expressions that you’ve been using throughout the book—just like the Comic.ToString method, which works the same way whether or not you use a lambda expression.

Images

Yes! You can use lambda expressions to refactor many methods and properties.

You’ve written a lot of methods and properties throughout this book that just contain a single statement. You could refactor most of them to use lambda expressions instead. In many cases, that could make your code easier to read and understand. Lambdas give you options: you can decide when using them improves your code.

A Lambda Test Drive

Images

Let’s kick the tires on lambda expressions, which give us a whole new way to write methods, including ones that return values or take parameters.

Create a new console app. Add this Program class with the Main method:

class Program
{
    static Random random = new Random(); 

    static double GetRandomDouble(int max)
    {
        return max * random.NextDouble();
    }

    static void PrintValue(double d)
    {
        Console.WriteLine($"The value is {d:0.0000}");
    }

    static void Main(string[] args)
    {
        var value = Program.GetRandomDouble(100);
        Program.PrintValue(value);
    }
}

Run it a few times—it prints output each time with a different random number: The value is 37.8709

Images

Now refactor the GetRandomDouble and PrintValue methods using the => operator:

Images

Run your program again—it should print a different random number, just like before.
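Here’s a sketch of what the refactored Program class might look like, assuming the same code you just typed in. Each method body collapses to a single expression after the => operator. The methods are marked public here only so other code can call them; that modifier is our addition, not part of the original exercise.

```csharp
using System;

class Program
{
    static Random random = new Random();

    // The => replaces the braces and the return statement:
    // the method returns the value of the expression.
    public static double GetRandomDouble(int max) => max * random.NextDouble();

    // A void method works too: the expression is just executed.
    public static void PrintValue(double d) => Console.WriteLine($"The value is {d:0.0000}");

    static void Main(string[] args)
    {
        var value = Program.GetRandomDouble(100);
        Program.PrintValue(value);
    }
}
```

Nothing about the way you call the methods changes. Only the way they’re written is different.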

Before we do one more refactoring, hover over the random field and look at the IntelliSense pop-up:

Images

Now modify the random field to use a lambda expression:

  static Random random => new Random();

The program still runs the same way. But hover over the random field again:

Images

Wait a minute—random isn’t a field anymore. Changing it into a lambda turned it into a property! That’s because lambda expressions always work like methods. So when random was a field, it got instantiated once when the class was constructed. But when you changed the = to a => and converted it to a lambda, it became a method—which means a new instance of Random is created every time the property is accessed.
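You can see the difference for yourself with a small experiment. This is only a sketch: the FieldVersusProperty class and its member names are made up for illustration. It reads the field twice and the property twice, and checks whether each pair of reads returned the same object.

```csharp
using System;

class FieldVersusProperty
{
    // A field: the initializer after = runs once, so every read
    // returns a reference to the same Random object.
    public static Random RandomField = new Random();

    // A get-only property: the expression after => runs on every read,
    // so each access creates a brand-new Random object.
    public static Random RandomProperty => new Random();
}

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine(ReferenceEquals(
            FieldVersusProperty.RandomField,
            FieldVersusProperty.RandomField));      // True

        Console.WriteLine(ReferenceEquals(
            FieldVersusProperty.RandomProperty,
            FieldVersusProperty.RandomProperty));   // False
    }
}
```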

Images

Refactor a clown with lambdas

Do this!

Back in Chapter 7, you created an IClown interface with two members:

Images

And you modified this class to implement that interface:

class TallGuy {
    public string Name;
    public int Height;

    public void TalkAboutYourself() {
        Console.WriteLine($"My name is {Name} and I'm {Height} inches tall.");
    }
}

So let’s do that same thing again, but this time we’ll use lambdas. Create a new Console App project and add the IClown interface and TallGuy class. Then modify TallGuy to implement IClown:

class TallGuy : IClown {

Now open the Quick Fix menu and choose Implement interface. The IDE fills in all of the interface members, having them throw NotImplementedExceptions just like it does when you use Generate Method.

Images

Let’s refactor these methods so they do the same thing as before, but now use lambda expressions:

Images

The FunnyThingIHave property and Honk method work exactly like they did in Chapter 7. Flip back and find the Main method you used before; your new implementation will work exactly the same way. But now that we’ve refactored them to use lambda expressions, they’re much more compact.

We think the new and improved TallGuy class is easier to read. Do you?
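In case you don’t have your Chapter 7 code handy, here’s a sketch of the whole thing in one place. We’re assuming IClown has a get-only FunnyThingIHave property and a Honk method; the strings they use and the Main method are placeholders, so swap in whatever you wrote before.

```csharp
using System;

interface IClown
{
    string FunnyThingIHave { get; }
    void Honk();
}

class TallGuy : IClown
{
    public string Name;
    public int Height;

    // An existing single-statement method can use => too.
    public void TalkAboutYourself() =>
        Console.WriteLine($"My name is {Name} and I'm {Height} inches tall.");

    // A get-only property whose getter returns the expression's value.
    public string FunnyThingIHave => "big red shoes";

    // A void method that executes the expression.
    public void Honk() => Console.WriteLine("Honk honk!");
}

class Program
{
    static void Main(string[] args)
    {
        TallGuy tallGuy = new TallGuy() { Name = "Jimmy", Height = 76 };
        tallGuy.TalkAboutYourself();
        Console.WriteLine($"The tall guy has {tallGuy.FunnyThingIHave}");
        tallGuy.Honk();
    }
}
```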

Images
Images

The IDE is telling you that random is now a property by showing you that it has a get accessor { get; }.

You can use the => operator to turn a field into a property with a get accessor that executes a lambda expression.

Use the ?: operator to make your lambdas make choices

What if you want your lambdas to do... more? It would be great if they could make decisions... and that’s where the conditional operator (which some people call the ternary operator) comes in. It works like this:

condition ? consequent : alternative;

which may look a little weird at first, so let’s have a look at an example. First of all, the ?: operator isn’t unique to lambdas—you can use it anywhere. So take this if statement from the AbilityScoreCalculator class in Chapter 4:

Images

Notice how we set Score equal to the results of the ?: expression. The ?: expression returns a value: it checks the condition (added < Minimum), and then it either returns the consequent (Minimum) or the alternative (added).

When you have a method that looks like that if/else statement, you can use ?: to refactor it as a lambda. For example, take this method from the MachineGun class in Chapter 5:

Images

Notice the slight change: in the if/else version, the BulletsLoaded property was set inside the then- and else-statements. We changed this to use a conditional operator that checked bullets against MAGAZINE_SIZE and returned the correct value, and used that return value to set the BulletsLoaded property.
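Here’s the same idea as a small standalone sketch. The ScoreClamper class, its Minimum value, and its method names are invented for this example (they aren’t the actual Chapter 4 or Chapter 5 code), but the two methods show the same refactoring: an if/else that sets a value, and a lambda that uses ?: to return it.

```csharp
using System;

class ScoreClamper
{
    public const int Minimum = 3;

    // The if/else version: each branch assigns the result.
    public static int WithIfElse(int added)
    {
        int score;
        if (added < Minimum)
            score = Minimum;
        else
            score = added;
        return score;
    }

    // The ?: version: the whole expression evaluates to either
    // the consequent (Minimum) or the alternative (added).
    public static int WithConditional(int added) =>
        (added < Minimum) ? Minimum : added;
}

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine(ScoreClamper.WithIfElse(1));        // 3
        Console.WriteLine(ScoreClamper.WithConditional(1));   // 3
        Console.WriteLine(ScoreClamper.WithConditional(10));  // 10
    }
}
```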

Lambda expressions and LINQ

Add this simple LINQ query to any C# app, then hover over the select keyword in the code:

Images

The IDE pops up a tooltip window just like it does when you hover over a method. Let’s take a closer look at the first line, which shows the method declaration:

Images

We can learn a few things from that method declaration:

  • The IEnumerable<int>.Select method returns an IEnumerable<int>

  • It takes a single parameter of type Func<int, int>

We’ll learn a lot more about Func (and its friend, Action) in Chapter 13. But for now, it means that there’s a LINQ method called Select that takes a Func<int, int> parameter. So what does that mean?

Use lambda expressions with methods that take a Func parameter

When a method takes a Func<int, int> parameter, you can call it with a lambda expression that takes an int parameter and returns an int. So you could refactor the select query like this:

  var array = new[] { 1, 2, 3, 4 };
  var result = array.Select(i => i * 2);

Go ahead—try that yourself in a console app. Add a foreach statement to print the output:

foreach (var i in result) Console.WriteLine(i);

When you print the results of the refactored query, you’ll get the sequence { 2, 4, 6, 8 }, exactly the same result as you got with the LINQ query syntax before you refactored it.
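If you’d like to compare the two forms side by side, here’s a sketch that does it. We’re assuming the original query looked something like from i in array select i * 2; the method names DoubleWithQuery and DoubleWithMethod are just for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    // LINQ query syntax
    public static IEnumerable<int> DoubleWithQuery(int[] array) =>
        from i in array
        select i * 2;

    // The same query written as a call to Select with a lambda
    public static IEnumerable<int> DoubleWithMethod(int[] array) =>
        array.Select(i => i * 2);

    static void Main(string[] args)
    {
        var array = new[] { 1, 2, 3, 4 };
        Console.WriteLine(string.Join(", ", DoubleWithQuery(array)));   // 2, 4, 6, 8
        Console.WriteLine(string.Join(", ", DoubleWithMethod(array)));  // 2, 4, 6, 8
    }
}
```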

LINQ queries can be written as chained LINQ methods

Take this LINQ query from earlier and add it to an app so we can explore a few more LINQ methods:

     int[] values = new int[] { 0, 12, 44, 36, 92, 54, 13, 8 };
     IEnumerable<int> result =
                from v in values
                where v < 37
                orderby -v
                select v;

The OrderBy LINQ method sorts a sequence

Hover over the orderby keyword and take a look at its parameter:

Images

When you use an orderby clause in a LINQ query, it calls a LINQ OrderBy method that sorts the sequence. In this case, we can pass it a lambda expression with an int parameter that returns the sort key: any value (of a type that implements IComparable) that it can use to sort the results.

The Where LINQ method pulls out a subset of a sequence

Now hover over the where keyword in the LINQ query:

Images

The where clause in a LINQ query calls a LINQ Where method that can use a lambda that returns a Boolean. The Where method calls that lambda for each element in the sequence. If the lambda returns true, the element is included in the results. If the lambda returns false, the element is removed.
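Putting those two methods together, the query above can be sketched as a chain of LINQ method calls. The FilterAndSort method name is ours; the lambdas are the same expressions that appear in the where and orderby clauses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    public static IEnumerable<int> FilterAndSort(int[] values) =>
        values
            .Where(v => v < 37)    // keep only the elements less than 37
            .OrderBy(v => -v);     // sort by the negated value, so largest comes first

    static void Main(string[] args)
    {
        int[] values = new int[] { 0, 12, 44, 36, 92, 54, 13, 8 };
        foreach (int v in FilterAndSort(values))
            Console.Write($"{v} ");    // 36 13 12 8 0
    }
}
```

There’s no Select call at the end of the chain. The query’s select v clause just passes each element through unchanged, so the chained version doesn’t need one.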

Use the => operator to create switch expressions

You’ve been using switch statements since Chapter 6 to check a variable against several options. It’s a really useful tool... but have you noticed its limitations? For example, try adding a case that tests against a variable:

  case myVariable:

You’ll get a C# compiler error: A constant value is expected. That’s because you can only use constant values—like literals and variables defined with the const keyword—in the switch statements that you’ve been using.

But that all changes with the => operator, which lets you create switch expressions. They’re similar to the switch statements that you’ve been using, but they’re expressions that return a value. Here’s how they work:

Images

Let’s say you’re working on a card game that needs to assign a certain score based on suit, where spades are worth 6, hearts are worth 4, and other cards are worth 2. You could write a switch statement like this:

The whole goal of this switch statement is to use the cases to set the score variable—and a lot of our switch statements work that way. We can use the => operator to create a switch expression that does the same thing:

Images
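Here’s a sketch of both versions of the suit-scoring code so you can compare them. The Suit enum and the method names are assumptions made for this example, and switch expressions need C# 8 or later.

```csharp
using System;

enum Suit { Diamonds, Clubs, Hearts, Spades }

class Program
{
    // The switch statement: every case exists only to set score.
    public static int ScoreWithStatement(Suit suit)
    {
        int score;
        switch (suit)
        {
            case Suit.Spades:
                score = 6;
                break;
            case Suit.Hearts:
                score = 4;
                break;
            default:
                score = 2;
                break;
        }
        return score;
    }

    // The switch expression: each arm uses => to give the value
    // for that case, and _ matches anything not handled above it.
    public static int ScoreWithExpression(Suit suit) => suit switch
    {
        Suit.Spades => 6,
        Suit.Hearts => 4,
        _ => 2,
    };

    static void Main(string[] args)
    {
        Console.WriteLine(ScoreWithExpression(Suit.Spades));  // 6
        Console.WriteLine(ScoreWithExpression(Suit.Hearts));  // 4
        Console.WriteLine(ScoreWithExpression(Suit.Clubs));   // 2
    }
}
```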

Explore the Enumerable class

Before we finish this chapter, let’s take a closer look at enumerable sequences. We’ll start with the Enumerable class, specifically with three of its static methods: Range, Empty, and Repeat. You already saw the Enumerable.Range method earlier in the chapter. Let’s use the IDE to discover how the other two methods work. Type Enumerable. and then hover over Range, Empty, and Repeat in the IntelliSense pop-up to see their declarations and comments.

Images

Enumerable.Empty creates an empty sequence of any type

Sometimes you need to pass an empty sequence to a method that takes an IEnumerable<T> (for example, in a unit test). When you do, the Enumerable.Empty method comes in handy.

var emptyInts = Enumerable.Empty<int>(); // an empty sequence of ints
var emptyComics = Enumerable.Empty<Comic>(); // an empty sequence of Comic references

Enumerable.Repeat repeats a value a number of times

Let’s say you need a sequence of 100 3’s, or 12 "yes" strings, or 83 identical anonymous objects. You’d be surprised at how often that happens! And that’s what the Enumerable.Repeat method does—it returns a sequence of repeated values.

var oneHundredThrees = Enumerable.Repeat(3, 100);
var twelveYesStrings = Enumerable.Repeat("yes", 12);
var eightyThreeObjects = Enumerable.Repeat(
    new { cost = 12.94M, sign = "ONE WAY", isTall = false }, 83);

So what exactly is an IEnumerable<T>?

We’ve been using IEnumerable<T> for a while now. But we haven’t really answered the question of what an enumerable sequence actually is. So let’s finish the chapter by building some sequences ourselves. But before we do, take a minute or two to ponder the question of how you would design the IEnumerable<T> interface if we asked you to.

Create an enumerable sequence by hand

Let’s say we have some sports:

enum Sport { Football, Baseball, Basketball, Hockey, Boxing, Rugby, Fencing }

Obviously, we could create a new List<Sport> and use a collection initializer to populate it. But just for the sake of exploring how sequences work, let’s build one manually. Let’s create a new class called ManualSportSequence and make it implement the IEnumerable<Sport> interface. It just has two members that return an IEnumerator:

Images

Okay, so what’s an IEnumerator? It’s an interface that lets you enumerate a sequence, moving through each item in the sequence one after another. It has a property, Current, which returns the current item being enumerated. Its MoveNext method moves to the next element in the sequence, returning false if the sequence has run out. After MoveNext is called, Current returns that next element. Finally, the Reset method resets the sequence back to the beginning. Once you have those methods, you have an enumerable sequence.

Images
Images

And that’s all we need to create our own IEnumerable. Go ahead—give it a try. Create a new console app, add ManualSportSequence and ManualSportEnumerator, and then enumerate the sequence in a foreach loop:

     var sports = new ManualSportSequence();
     foreach (var sport in sports)
        Console.WriteLine(sport);
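If you want something to type in, here’s one way the two classes could be written. Treat it as a sketch rather than the definitive version: the class names come from the text above, but the bodies (including the private current field) are our own guess at a straightforward implementation.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

enum Sport { Football, Baseball, Basketball, Hockey, Boxing, Rugby, Fencing }

class ManualSportSequence : IEnumerable<Sport>
{
    // The two members of IEnumerable<Sport>: both return an enumerator.
    public IEnumerator<Sport> GetEnumerator() => new ManualSportEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

class ManualSportEnumerator : IEnumerator<Sport>
{
    // Start just before the first element; MoveNext advances to it.
    int current = -1;

    public Sport Current => (Sport)current;
    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        int lastIndex = Enum.GetValues(typeof(Sport)).Length - 1;
        if (current >= lastIndex)
            return false;   // the sequence has run out
        current++;
        return true;
    }

    public void Reset() => current = -1;

    // IEnumerator<T> also requires Dispose; there's nothing to clean up here.
    public void Dispose() { }
}

class Program
{
    static void Main(string[] args)
    {
        var sports = new ManualSportSequence();
        foreach (var sport in sports)
            Console.WriteLine(sport);
    }
}
```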

Use yield return to create your own sequences

C# gives you a much easier way to create enumerable sequences: the yield return statement. The yield return statement is a kind of all-in-one automatic enumerator creator, and a good way to understand it is to see a simple example. Let’s use a multi-project solution, just to give you a little more practice with that.

Add a new Console App project to your solution—this is just like what you did when you added the MSTest project earlier in the chapter, except this time instead of choosing the project type MSTest choose the same Console App project type that you’ve been creating for most of the projects in the book. Then right-click on the project under the solution and choose Set as startup project. Now when you launch the debugger in the IDE, it will run the new project. You can also right-click on any project in the solution and run or debug it.

Here’s the code for the new console app:

Images

Run the app—it prints four lines: apples, oranges, bananas, and unicorns. So how does that work?
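Here’s a sketch of an app that behaves this way, assuming an iterator method called SimpleEnumerable with four yield return statements (one per line of output). It’s marked public only so other code can call it.

```csharp
using System;
using System.Collections.Generic;

class Program
{
    // Each yield return hands one element back to the caller, then
    // pauses here until the caller asks for the next element.
    public static IEnumerable<string> SimpleEnumerable()
    {
        yield return "apples";
        yield return "oranges";
        yield return "bananas";
        yield return "unicorns";
    }

    static void Main(string[] args)
    {
        foreach (var s in SimpleEnumerable())
            Console.WriteLine(s);
    }
}
```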

Use the debugger to explore yield return

Set a breakpoint on the first line of the Main method and launch the debugger. Then use Step Into (F11) to debug the code line by line, right into the iterator:

  • Step into the code, and keep stepping into it until you reach the first line of the SimpleEnumerable method.

  • Step into that line again. It acts just like a return statement, returning control back to the statement that called it—in this case, back to the foreach statement, which calls Console.WriteLine to write apples.

  • Step two more times. Your app will jump back into the SimpleEnumerable method. But it skips the first statement in the method and goes right to the second line.

    Images
  • Keep stepping. The app returns to the foreach loop, then back to the third line of the method, then returns to the foreach loop, and back to the fourth line of the method.

So yield return makes a method return an enumerable sequence by returning the next element in the sequence each time it’s called, and keeping track of where it returned from so it can pick up where it left off.

Use yield return to refactor ManualSportSequence

You can create your own IEnumerable<T> by using yield return to implement the GetEnumerator method. For example, here’s a BetterSportSequence class that does exactly the same thing as ManualSportSequence did. This version is much more compact because it uses yield return in its GetEnumerator implementation:

Images

Go ahead and add a new Console App project to your solution. Add this new BetterSportSequence class, and modify the Main method to create an instance of it and enumerate the sequence.
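A minimal sketch of such a class might look like this. The class name comes from the text above; the loop inside GetEnumerator is our assumption about one reasonable way to write it.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

enum Sport { Football, Baseball, Basketball, Hockey, Boxing, Rugby, Fencing }

class BetterSportSequence : IEnumerable<Sport>
{
    // yield return builds the enumerator for us, so there's no
    // separate enumerator class to write.
    public IEnumerator<Sport> GetEnumerator()
    {
        int lastIndex = Enum.GetValues(typeof(Sport)).Length - 1;
        for (int i = 0; i <= lastIndex; i++)
            yield return (Sport)i;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

class Program
{
    static void Main(string[] args)
    {
        var sports = new BetterSportSequence();
        foreach (var sport in sports)
            Console.WriteLine(sport);
    }
}
```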

Add an indexer to BetterSportSequence

We’ve seen that you can use yield return in a method to create an IEnumerator<T>, and you can also use it to create a class that implements IEnumerable<T>. One advantage of creating a separate class for your sequence is that you can add an indexer. You’ve already used indexers: any time you use brackets [] to retrieve an object from a list, array, or dictionary (like myList[3] or myDictionary["Steve"]), you’re using an indexer. An indexer is just a method. It looks a lot like a property, except it’s got a single named parameter.

The IDE has an especially useful code snippet to help you add your indexer. Type indexer followed by two tabs, and the IDE will add the skeleton of an indexer for you automatically.

Here’s an indexer for the BetterSportSequence class:

     public Sport this[int index] {
         get => (Sport)index;
     }

Calling the indexer with [3] returns the value Hockey:

     var sequence = new BetterSportSequence();
     Console.WriteLine(sequence[3]);

Take a close look when you use the snippet to create the indexer—it lets you set the type. You can define an indexer that takes different types, including strings and even objects. And while our indexer only has a getter, you can also include a setter (just like the ones you’ve used to set items in a List).
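To make that concrete, here’s a sketch of a class with two indexers: one that takes a Sport and has both a getter and a setter, and one that takes a string and only has a getter. The SportNicknames class is hypothetical. It isn’t part of the chapter’s project, and we made it up just to show the syntax.

```csharp
using System;
using System.Collections.Generic;

enum Sport { Football, Baseball, Basketball, Hockey, Boxing, Rugby, Fencing }

class SportNicknames
{
    private readonly Dictionary<Sport, string> nicknames = new Dictionary<Sport, string>();

    // An indexer that takes a Sport. The setter stores a nickname;
    // the getter falls back to the sport's own name if none was set.
    public string this[Sport sport]
    {
        get => nicknames.TryGetValue(sport, out string name) ? name : sport.ToString();
        set => nicknames[sport] = value;
    }

    // An indexer that takes a string. It parses the name into a Sport
    // (Enum.Parse throws if the name isn't a valid Sport) and reuses
    // the indexer above.
    public string this[string sportName] =>
        this[(Sport)Enum.Parse(typeof(Sport), sportName)];
}

class Program
{
    static void Main(string[] args)
    {
        var nicknames = new SportNicknames();
        nicknames[Sport.Baseball] = "the national pastime";   // calls the setter

        Console.WriteLine(nicknames[Sport.Baseball]);  // the national pastime
        Console.WriteLine(nicknames["Baseball"]);      // the national pastime
        Console.WriteLine(nicknames["Hockey"]);        // Hockey
    }
}
```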

Collectioncross

Images

Across

1. Use the var keyword to declare an _____ typed variable

7. A collection _____ combines the declaration with items to add

9. What you’re trying to make your code when you have lots of tests for weird data and edge cases

11. LINQ method to return the last elements in a sequence

12. A last-in first-out (LIFO) collection

18. LINQ method to return the first elements in a sequence

19. A method that has multiple constructors with different parameters

20. The type of parameter that tells you that you can use a lambda

21. What you take advantage of when you upcast an entire list

22. What you’re using when you call myArray[3]

25. What T gets replaced with when you see <T> in a class or interface definition

32. The keyword you use to create an anonymous object

33. A data type that only allows certain values

34. The kind of collection that can store any type

35. The interface that all sequences implement

36. Another name for the ?: conditional operator

Down

1. If you want to sort a List, its members need to implement this

2. A collection class for storing items in order

3. A collection that stores keys and values

4. What you pass to List.Sort to tell it how to sort its items

5. What goes in the parentheses: ( _____ ) => expression;

6. You can’t use the var keyword to declare one of these

8. The access modifier for a class that can’t be accessed by another project in a multi-project solution

10. The kind of expression the => operator creates

13. LINQ method to append the elements from one sequence to the end of another

14. Every collection has this method to put a new element into it

15. What you can do with methods in a class that return the type of that class

16. What kind of type you’re looking at when the IDE tells you this: 'a is new { string Color, int Height }

17. An object’s namespace followed by a period followed by the class is a fully _____ class name

23. The kind of evaluation that means a LINQ query isn’t run until its results are accessed

24. The clause in a LINQ query that sorts the results

26. Type of variable created by the from clause in a LINQ query

27. The Enumerable method that returns a sequence with many copies of the same element

28. The clause in a LINQ query that determines which elements in the input to use

29. A LINQ query that merges data from two sequences

30. A first-in first-out (FIFO) collection

31. The keyword a switch statement has that a switch expression doesn’t

Collectioncross solution

Images
Images
Images