As you saw in Chapter 20, SQL is a powerful tool for retrieving and filtering information from a database. Once you become accustomed to the syntax, with its selects and froms and joins, it's somewhat intuitive as well. However, SQL commands don't integrate well with C#, as you saw. You need the bridge of DataAdapter and DataSet objects to connect the database query with your application. Language Integrated Query (LINQ) is the solution to that problem. LINQ is a new feature of .NET that C# 3.0 takes advantage of, and it makes working with data easier, as you'll see in the second part of the chapter.
Another useful feature of LINQ is that you can address a number of different data sources using similar syntax. In this chapter we'll show you how to use LINQ with SQL, but you don't need to use LINQ with a traditional database—it can retrieve data from XML files and other data sources equally well.
Perhaps the most interesting feature of LINQ is that you can query more than just data stored in other files. You can use LINQ to query collections that are held in-memory, that is, collection classes within your own code. So, for example, if you have a collection of Book objects, you can use LINQ to query for all the books by a single author, or published after a certain date. You could certainly write C# code to accomplish that, but the query syntax is arguably more natural and certainly briefer. Because this use of LINQ is easy to understand and is potentially useful, we'll start with that, and then move on to using it with a SQL database.
As you've seen elsewhere in this book, C# allows you to create classes that are complex, with many different properties, which sometimes are objects of other classes as well. You've also seen how to create collections of objects that you can manipulate in different ways. Sometimes that complexity works against you, though. Suppose you have a class that represents shipping orders for a warehouse. You could keep a ton of data in such an object, which would make it very versatile, but what if you just wanted to extract a list of the zip codes where your customers live, for demographic purposes? You could write some code to go through the entire collection of objects and pull out just the zip codes. It wouldn't be terribly difficult, but it might be time-consuming. If that information were in a database, you could issue a simple SQL query, like you learned about in Chapter 20, but collections can't be queried like a database…until now. Using LINQ, you can issue a SQL-like query against a collection in your code to get another collection containing just the data you want. An example will help make this clear.
Before you can start, you'll need a collection to work with, so we'll define a quick and simple Book class, like so:
public class Book
{
    public string Title { get; set; }
    public string Author { get; set; }
    public string Publisher { get; set; }
    public int PublicationYear { get; set; }
}
This is a very basic class, with three string properties and one int property.
Next, we'll define a generic List<Book>, and fill it with a handful of Book objects. This is a relatively short list, and it wouldn't be that hard to sort through by hand, if you needed to. We're keeping the List short for demonstration purposes; in other cases, it might be a list of hundreds of items read in from a file or someplace else:
List<Book> bookList = new List<Book>
{
    new Book { Title = "Learning C# 3.0",
               Author = "Jesse Liberty",
               Publisher = "O'Reilly",
               PublicationYear = 2008 },
    new Book { Title = "Programming C# 3.0",
               Author = "Jesse Liberty",
               Publisher = "O'Reilly",
               PublicationYear = 2008 },
    new Book { Title = "C# 3.0 Cookbook",
               Author = "Jay Hilyard",
               Publisher = "O'Reilly",
               PublicationYear = 2007 },
    new Book { Title = "C# 3.0 in a Nutshell",
               Author = "Ben Albahari",
               Publisher = "O'Reilly",
               PublicationYear = 2007 },
    new Book { Title = "Head First C#",
               Author = "Andrew Stellman",
               Publisher = "O'Reilly",
               PublicationYear = 2007 },
    new Book { Title = "Programming C#, fourth edition",
               Author = "Jesse Liberty",
               Publisher = "O'Reilly",
               PublicationYear = 2005 }
};
Now you need to issue a query. Suppose you want to find all the books in the list that were authored by Jesse Liberty. You'd use a query like this:
IEnumerable<Book> resultsAuthor =
    from testBook in bookList
    where testBook.Author == "Jesse Liberty"
    select testBook;
Let's take this apart. The query returns an enumerable collection of Book objects, or to put it another way, it'll return an instance of IEnumerable<Book>. A LINQ data source must implement IEnumerable, and the result of the query must as well.
The rest of the query resembles a SQL query. You use a range variable, in this case, testBook, in the same way you would the iteration variable in a foreach loop. Because your query is operating on bookList, which was previously defined as a List<Book>, the compiler automatically infers that testBook is of type Book.
As with the SQL query you saw in the previous chapter, the from clause defines the range variable, and the in keyword that follows it identifies the source. The where clause is used to filter the data. In this case, you're testing a condition with a Boolean expression, as you would with any C# object.
The select
clause returns the results of the query, as an enumerable collection. This is called projection in database terminology. In this example, we returned the entire Book
object, but you can return just some of the fields instead, like this:
select testBook.Title;
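Note that selecting a single string property changes the type of the result: the query now yields an IEnumerable<string> rather than an IEnumerable<Book>. The following is a minimal sketch of that idea, not part of Example 21-1; the ProjectionSketch class and its SampleBooks and TitlesBy methods are hypothetical names, and the Book class is trimmed to two properties to keep the listing short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down Book class for this sketch only
public class Book
{
    public string Title { get; set; }
    public string Author { get; set; }
}

public static class ProjectionSketch
{
    // Hypothetical helper: a short list to query
    public static List<Book> SampleBooks()
    {
        return new List<Book>
        {
            new Book { Title = "Learning C# 3.0", Author = "Jesse Liberty" },
            new Book { Title = "Programming C# 3.0", Author = "Jesse Liberty" },
            new Book { Title = "C# 3.0 Cookbook", Author = "Jay Hilyard" }
        };
    }

    // Hypothetical helper: the select clause projects each Book to its
    // Title, so the result is a sequence of strings, not of Book objects
    public static IEnumerable<string> TitlesBy(List<Book> books, string author)
    {
        return from testBook in books
               where testBook.Author == author
               select testBook.Title;
    }

    public static void Main()
    {
        IEnumerable<string> titles = TitlesBy(SampleBooks(), "Jesse Liberty");

        // The iteration variable is a string here, not a Book
        foreach (string title in titles)
        {
            Console.WriteLine(title);
        }
    }
}
```

With the three sample books above, this sketch prints the two titles by Jesse Liberty. Because each result is only a string, you can no longer reach the Author property from it; if you need both, return the whole Book as in the original query.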
Now that you have a collection of Book objects, you can use a foreach loop to process them; in this case, outputting them to the console:
foreach (Book testBook in resultsAuthor)
{
    Console.WriteLine("{0}, by {1}", testBook.Title, testBook.Author);
}
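For comparison, here is roughly what the author query saves you from writing by hand. This is a minimal sketch rather than code from Example 21-1: the LoopComparison class and its SampleBooks, FilterWithLoop, and FilterWithQuery methods are hypothetical names, the Book class is trimmed to two properties, and the call to ToList() is there only so that both methods return the same type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down Book class for this sketch only
public class Book
{
    public string Title { get; set; }
    public string Author { get; set; }
}

public static class LoopComparison
{
    // Hypothetical helper: a short list to filter
    public static List<Book> SampleBooks()
    {
        return new List<Book>
        {
            new Book { Title = "Learning C# 3.0", Author = "Jesse Liberty" },
            new Book { Title = "Programming C# 3.0", Author = "Jesse Liberty" },
            new Book { Title = "C# 3.0 Cookbook", Author = "Jay Hilyard" }
        };
    }

    // Hand-written filter: test each book and copy the matches
    public static List<Book> FilterWithLoop(List<Book> books, string author)
    {
        List<Book> matches = new List<Book>();
        foreach (Book testBook in books)
        {
            if (testBook.Author == author)
            {
                matches.Add(testBook);
            }
        }
        return matches;
    }

    // The same filter written as a LINQ query
    public static List<Book> FilterWithQuery(List<Book> books, string author)
    {
        IEnumerable<Book> results =
            from testBook in books
            where testBook.Author == author
            select testBook;

        // Copy the query results into a List so both methods match
        return results.ToList();
    }

    public static void Main()
    {
        List<Book> books = SampleBooks();
        Console.WriteLine("Loop found {0} books",
            FilterWithLoop(books, "Jesse Liberty").Count);
        Console.WriteLine("Query found {0} books",
            FilterWithQuery(books, "Jesse Liberty").Count);
    }
}
```

Both methods find the same two books in this sample list. The query version states what you want (books where the author matches) and leaves the looping to LINQ.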
You can use any legal Boolean expression in your where clause; for example, you could return all the books published before 2008, like this:
IEnumerable<Book> resultsDate =
    from testBook in bookList
    where testBook.PublicationYear < 2008
    select testBook;
This simple example, with both queries, is shown in Example 21-1.
Example 21-1. You can use LINQ to query the contents of collections; this collection is very simple, but for large collections, this technique is powerful
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Example_21_1____Querying_Collections
{
    // simple book class
    public class Book
    {
        public string Title { get; set; }
        public string Author { get; set; }
        public string Publisher { get; set; }
        public int PublicationYear { get; set; }
    }

    class Program
    {
        static void Main(string[] args)
        {
            List<Book> bookList = new List<Book>
            {
                new Book { Title = "Learning C# 3.0",
                           Author = "Jesse Liberty",
                           Publisher = "O'Reilly",
                           PublicationYear = 2008 },
                new Book { Title = "Programming C# 3.0",
                           Author = "Jesse Liberty",
                           Publisher = "O'Reilly",
                           PublicationYear = 2008 },
                new Book { Title = "C# 3.0 Cookbook",
                           Author = "Jay Hilyard",
                           Publisher = "O'Reilly",
                           PublicationYear = 2007 },
                new Book { Title = "C# 3.0 in a Nutshell",
                           Author = "Ben Albahari",
                           Publisher = "O'Reilly",
                           PublicationYear = 2007 },
                new Book { Title = "Head First C#",
                           Author = "Andrew Stellman",
                           Publisher = "O'Reilly",
                           PublicationYear = 2007 },
                new Book { Title = "Programming C#, fourth edition",
                           Author = "Jesse Liberty",
                           Publisher = "O'Reilly",
                           PublicationYear = 2005 }
            };

            // find books by Jesse Liberty
            IEnumerable<Book> resultsAuthor =
                from testBook in bookList
                where testBook.Author == "Jesse Liberty"
                select testBook;

            Console.WriteLine("Books by Jesse Liberty:");
            foreach (Book testBook in resultsAuthor)
            {
                Console.WriteLine("{0}, by {1}",
                    testBook.Title, testBook.Author);
            }

            // find books published before 2008
            IEnumerable<Book> resultsDate =
                from testBook in bookList
                where testBook.PublicationYear < 2008
                select testBook;

            Console.WriteLine("\nBooks published before 2008:");
            foreach (Book testBook in resultsDate)
            {
                Console.WriteLine("{0}, by {1}, {2}",
                    testBook.Title, testBook.Author,
                    testBook.PublicationYear);
            }
        }
    }
}
The output looks like this:
Books by Jesse Liberty:
Learning C# 3.0, by Jesse Liberty
Programming C# 3.0, by Jesse Liberty
Programming C#, fourth edition, by Jesse Liberty

Books published before 2008:
C# 3.0 Cookbook, by Jay Hilyard, 2007
C# 3.0 in a Nutshell, by Ben Albahari, 2007
Head First C#, by Andrew Stellman, 2007
Programming C#, fourth edition, by Jesse Liberty, 2005
You might expect that the data would be retrieved from the data source when you create the IEnumerable<T> instance to hold the results. In fact, the data isn't retrieved until you try to do something with the data in the IEnumerable<T>. In this case, that's when you output the contents in the foreach statement. This behavior is helpful because databases with many connections may be changing all the time; LINQ doesn't retrieve the data until the last possible moment, right before you're going to use it.
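You can see this late retrieval for yourself with an in-memory collection. The following is a minimal sketch, not part of Example 21-1: the DeferredSketch class and its CountBeforeAndAfter method are hypothetical names, and the Book class is trimmed to two properties. It defines a query, copies the results once with ToList() (which runs the query immediately), adds a book to the source list, and only then enumerates the original query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down Book class for this sketch only
public class Book
{
    public string Title { get; set; }
    public string Author { get; set; }
}

public static class DeferredSketch
{
    // Hypothetical helper: returns { snapshot count, live query count }
    public static int[] CountBeforeAndAfter()
    {
        List<Book> bookList = new List<Book>
        {
            new Book { Title = "Learning C# 3.0", Author = "Jesse Liberty" },
            new Book { Title = "C# 3.0 Cookbook", Author = "Jay Hilyard" }
        };

        // Defining the query does not read bookList yet
        IEnumerable<Book> query =
            from testBook in bookList
            where testBook.Author == "Jesse Liberty"
            select testBook;

        // ToList() runs the query right away and copies the results
        List<Book> snapshot = query.ToList();

        // Change the source after the query was defined
        bookList.Add(new Book { Title = "Programming C# 3.0",
                                Author = "Jesse Liberty" });

        // Count() enumerates the query now, so it sees the new book
        return new int[] { snapshot.Count, query.Count() };
    }

    public static void Main()
    {
        int[] counts = CountBeforeAndAfter();
        Console.WriteLine("Snapshot count: {0}", counts[0]);
        Console.WriteLine("Live query count: {0}", counts[1]);
    }
}
```

The snapshot holds one book, because it was taken before the list changed, while the query finds two, because it reads the list only when it is enumerated. If you need results that won't change underneath you, copying them with ToList() is one way to get that.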