i-think Twenty-Two

Now with more coherency.

Getting Started with LINQ

| Comments

I really like LINQ. It’s one of my favourite .NET features. When I first heard about it I was doing most of my programming in Visual Basic 6 (or worse, Visual Basic for Applications). Working now with C# and the .NET Framework has blessed me with full O-O, strong types, an excellent base class library, Visual Studio 2008 (and IntelliSense), Generics (I love generics) and LINQ.

So what is LINQ and why is it so important to add to your arsenal of .NET skills?

LINQ is so many things

At its core LINQ is exactly what its acronym suggests: Language Integrated Query. But what does this actually mean? Is it just some marketing hype designed to confuse the masses and look good on your resume. Probably. But the value of expressing a query concisely in the language of your choice becomes more apparent with each LINQ query you write. (Yes, I know LINQ Query would stand for Language Integrated Query query. Just go with it, it reads better.)

Importantly a LINQ query separates defining what you are looking for from how to find it. This means that a LINQ query could potentially be executed across multiple CPU cores and in the case of LINQ to SQL can be turned into an efficient SQL query so the hard work can be done by your database server.

But when are you going to actually use LINQ? Chances are good that you already have some code that could benefit from a bit of LINQ.

Take this example:

1
2
3
4
5
6
7
8
var itemsUnderOneDollar = new List<Item>();
foreach (var item in items)
{
   if (item.Price < 1)
   {
      itemsUnderOneDollar.Add(item);
   }
}

In this case we want to find all items that are under one dollar. The same in LINQ would be:

1
2
3
var itemsUnderOneDollar = (from item in items
                           where item.Price < 1
                           select item).ToList();

We can ignore the variable declaration for now (and the call to ToList()) so let’s break it down to just the core LINQ query.

1
2
3
from item in items
where item.Price < 1
select item

The LINQ query describes exactly what you want and nothing more. When we used the foreach construct we were resigned to the fact that we had to look at each and every item. We are also doing all this in a single thread. In fact, we spend more time describing how we want to find the items than saying what it is that we want. By describing what we want using LINQ we don’t bother with the implementation details resulting in cleaner code and improved flexibility for how the query should be implemented. In the case of a database query, the ideal implementation would be to generate a SQL query, execute the SQL query against the database and return the results. LINQ to SQL does just that with essentially the same code (I’ll be discussing LINQ to SQL in depth in another post).

From Where Select vs. Select From Where

If you are already familiar with SQL you may be a little confused by the syntax of a LINQ query. Indeed this is a major stumbling block most people encounter when they start to use LINQ. In SQL we have the ‘select’ statement upfront but in LINQ we save it for the end. Why? The primary reason for this choice was to enable great IntelliSense support in Visual Studio.

I’d like to argue that the syntax in LINQ actually makes more sense. Rather than starting with what we want at the end we start with the subject of our query. The reason this seems so foreign is that we are so used to it because of SQL. When you write code you typically say where you want to look before you say what you want to do when you’ve found it. In natural language it is like saying “From the store find a computer with 2GiB RAM and get me the price of the computer”. In SQL speak that would be “Get the price of the computer in the store where that computer has 2GiB RAM”. You tell me which form you’d be more likely to use.

The magic of type inference

Another convenient way to remember that from comes first is to think of the old foreach implementation. You’ll find that they have a lot in common. The biggest difference is that in the foreach loop we have to explicitly specify a type. In the example above I’ve used var to let the compiler infer the type. In the LINQ query the type is inferred automatically unless you specify it explicitly.

Type inference is used throughout most LINQ usage to simplify code and to improve maintainability. Queries return an object that implements the IEnumerable<T> interface. More advanced queries can return objects that implement a more complex interface (which is also an IEnumerable<T>). By using var to let the compiler infer the type of object returned by the query it saves the programmer from having to explicitly work out what type of object is returned. The full significance of this will become apparent in future posts.

How to get started

The best way to get started working with LINQ is to read up about it on MSDN. Then download the great tool LINQPad. LINQPad has some great sample LINQ queries and lets you play with LINQ outside of Visual Studio. It’s great for writing short snippets of code and is an ideal sandbox to try out bits of code. LINQPad is free, but Auto Completion is a paid feature (but well worth it). It also lets you run LINQ to SQL queries on a SQL database (and now SQL Compact Edition).

Once you have started familiarising yourself with LINQ you should start using it in your projects. There are two key requirements to using LINQ:

  1. Your project must target version 3.5 of the .NET Framework.
  2. You must include using System.Linq; to reference the LINQ namespace in all code files where you want to use LINQ.

If you don’t have Visual Studio 2008 you can download one of the free express editions from http://www.microsoft.com/express/. Once installed you might also want to turn on line numbers in Visual C# Express.

More to come

There’s plenty of stuff to talk about with LINQ. In my next few posts I’ll cover Extensions methods, Lambda expressions, LINQ over objects and much more.

Comments