Introduction to LINQ

Most applications need to integrate data of some sort. Often that data comes from multiple sources, such as in-memory collections, relational databases, and XML files. Each data source has its own means of querying: SQL for databases, XQuery for XML, LDAP queries for Active Directory, and so on. This is where LINQ, a new feature in C# 3.0, comes in, and it is what we are going to discuss today.

LINQ stands for Language INtegrated Query. One of the nice things about LINQ is that it integrates seamlessly with the existing .NET languages such as C# and VB.NET, because the underlying LINQ API is nothing but a set of .NET classes that operate like any other .NET class. In addition, the query functionality is not restricted to SQL or XML data; you can apply LINQ to query any class as long as that class implements the IEnumerable<T> interface.
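To make the IEnumerable<T> point concrete, here is a minimal console sketch. The Countdown class and the CountdownDemo helper are hypothetical names invented for this illustration, and the code assumes the released .NET namespaces (System.Linq rather than the preview-era System.Query).

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration: yields start, start-1, ..., 1.
public class Countdown : IEnumerable<int>
{
    private readonly int start;

    public Countdown(int start) { this.start = start; }

    public IEnumerator<int> GetEnumerator()
    {
        for (int i = start; i >= 1; i--)
            yield return i;
    }

    // The non-generic interface member simply forwards to the generic one.
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class CountdownDemo
{
    public static List<int> EvenValues(int start)
    {
        // The query expression works because Countdown implements IEnumerable<int>.
        var evens = from n in new Countdown(start)
                    where n % 2 == 0
                    select n;
        return evens.ToList();
    }

    public static void Main()
    {
        foreach (int n in EvenValues(6))
            Console.WriteLine(n);   // prints 6, 4, 2 on separate lines
    }
}
```

Nothing in Countdown knows about LINQ; implementing the interface is enough for the query operators to apply.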

Simple LINQ Example:

Open a LINQ Windows application project.

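Namespaces required for LINQ (shown with their released .NET Framework 3.5 names; the early LINQ preview called them System.Query, System.Data.DLinq, and System.Xml.XLinq):

using System.Linq;

using System.Data.Linq;

using System.Xml.Linq;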

Method:

private void btnLoopThroughStrings_Click(object sender, EventArgs e)
{
    string[] names = {"John", "Peter", "Ravi", "Scott", "Donald", "Eric"};
    // Keep only the names that are shorter than five characters.
    IEnumerable<string> namesUnderFiveCharacters =
                                from name in names
                                where name.Length < 5
                                select name;
    lstResults.Items.Clear();
    foreach(var name in namesUnderFiveCharacters)
        lstResults.Items.Add(name);
}

To really understand LINQ, you also need to understand the new language features of C# 3.0. Specifically, it would be beneficial to discuss LINQ in the context of the below features of C# 3.0:

  • Type Inference
  • Lambda Expressions
  • Extension Methods
  • Anonymous Types

Type Inference

To understand type inference, let us look at a couple of lines of code.

var count = 1;
var output = "This is a string";
var employees = new EmployeesCollection();

In the above lines of code, the compiler sees the var keyword, looks at the value assigned to count, and determines that count should be an Int32. When it sees that you assign a string to the output variable, it determines that output should be of type System.String. The same goes for the employees collection object. As you would have guessed by now, var is a new keyword introduced in C# 3.0 that has a special meaning: it signals to the compiler that you are using the new Local Variable Type Inference feature. The variable is still statically typed; the type is simply inferred at compile time.

As an example, let us modify our string query example to use the var keyword.

private void btnLoopThroughStrings_Click(object sender, EventArgs e)
{
    string[] names = {"John", "Peter", "Joe", "Patrick", "Donald", "Eric"};
    // var lets the compiler infer IEnumerable<string> from the query.
    var namesUnderFiveCharacters =
                            from name in names
                            where name.Length < 5
                            select name;
    lstResults.Items.Clear();
    foreach(var name in namesUnderFiveCharacters)
        lstResults.Items.Add(name);
}
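Outside a Windows Forms project, the same idea can be checked with a small console sketch. VarDemo and its members are hypothetical names used only here; the point is that var fixes the type at compile time rather than making the variable dynamic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VarDemo
{
    // Returns the inferred types as text so they are easy to inspect.
    public static string[] InferredTypeNames()
    {
        var count = 1;                      // inferred as int (System.Int32)
        var output = "This is a string";    // inferred as string (System.String)
        // count = "two";                   // would not compile: count is an int

        return new[] { count.GetType().ToString(), output.GetType().ToString() };
    }

    public static void Main()
    {
        foreach (var typeName in InferredTypeNames())
            Console.WriteLine(typeName);

        string[] names = { "John", "Peter", "Joe" };
        var shortNames = from name in names
                         where name.Length < 5
                         select name;

        // The query variable is an IEnumerable<string>, so this assignment compiles.
        IEnumerable<string> sameQuery = shortNames;
        Console.WriteLine(sameQuery.Count());   // John and Joe are shorter than five characters
    }
}
```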

Lambda Expressions

C# 2.0 introduced a new feature, anonymous methods, that allows you to declare your method code inline instead of in a separate named method. Lambda expressions, a new feature in C# 3.0, have a more concise syntax to achieve the same goal. Let us take a closer look at anonymous methods before discussing lambda expressions. Suppose you want to create a button that displays a message box when you click it. In C# 2.0, you would do it as follows:

public SimpleForm()
{
    addButton = new Button(...);
    addButton.Click += delegate
    {
        MessageBox.Show ("Button clicked");
    };
}

As the above code shows, you can use anonymous methods to declare the function logic inline. However, C# 3.0 introduces an even simpler syntax, lambda expressions, which you write as a parameter list followed by the “=>” token, followed by an expression or a statement block. Lambda expressions are the natural evolution of C# 2.0’s anonymous methods. Essentially, a lambda expression is a convenient syntax for assigning a chunk of code (the anonymous method) to a variable (the delegate). As an example,

employee => employee.StartsWith("D");

In this case, the delegate types that such a lambda can be assigned to are defined in the System namespace (System.Query in the early LINQ preview) as follows:

public delegate T Func<T>();
public delegate T Func<A0, T>(A0 arg0);

So the lambda expression above could be written with an anonymous method as:

Func<string, bool> person = delegate (string s) {
                        return s.StartsWith("D"); };

Examples :

(int count) => count + 1    //explicitly typed parameter
(y, z) => y * z             //implicitly typed parameters
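The forms discussed above can be compared side by side in a short console sketch. LambdaDemo and its field names are hypothetical; Func<string, bool> and Func<int, int, int> are the standard delegate types from the System namespace.

```csharp
using System;
using System.Linq;

public static class LambdaDemo
{
    // C# 2.0 style: an anonymous method assigned to a delegate.
    public static readonly Func<string, bool> StartsWithDOld =
        delegate (string s) { return s.StartsWith("D"); };

    // C# 3.0 style: the same logic as an expression-bodied lambda.
    public static readonly Func<string, bool> StartsWithD =
        s => s.StartsWith("D");

    // A statement-bodied lambda needs braces and an explicit return.
    public static readonly Func<int, int, int> Multiply =
        (int y, int z) => { return y * z; };

    public static void Main()
    {
        string[] names = { "John", "Donald", "Dave", "Eric" };

        // The delegate can be handed straight to a query operator.
        foreach (var name in names.Where(StartsWithD))
            Console.WriteLine(name);              // Donald, then Dave

        Console.WriteLine(StartsWithDOld("Don")); // True
        Console.WriteLine(Multiply(3, 4));        // 12
    }
}
```

Note that the return keyword is only valid inside a statement block; an expression-bodied lambda returns the value of its expression implicitly.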

Extension Methods:

C# 3.0 also makes it possible to add methods to existing classes, including classes that are defined in other assemblies. Extension methods must be declared static, and they behave much like ordinary static methods. Note that you can declare them only in static classes. To declare an extension method, you put the keyword “this” before the type of the first parameter of the method, for example:

public static class StringExtension
{
    public static void Echo(this string s)
    {
        Console.WriteLine("Supplied string : " + s);
    }
}

string s = "Hello world";
s.Echo();

Based on the above code, here are the key characteristics of extension methods.

  1. Extension methods have the keyword this before the first parameter.
  2. When an extension method is called, the argument for the parameter declared with the keyword this is not passed explicitly. In the above code, note the invocation of the Echo() method without any arguments.
  3. Extension methods can be defined only in a static class.
  4. With the extension syntax, they can be called only on instances. Trying to call them on the class itself (for example, string.Echo()) will result in a compilation error. The type of the instances on which they can be called is determined by the first parameter in the declaration, the one having the keyword this.
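The characteristics listed above can be tried out with the following sketch. The Shout method and the ExtensionDemo class are hypothetical and exist only for this illustration; Enumerable.Where is the real LINQ operator, which is itself an extension method on IEnumerable<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StringExtension
{
    // Hypothetical helper: returns the text instead of printing it,
    // so the result is easy to check.
    public static string Shout(this string s)
    {
        return s.ToUpperInvariant() + "!";
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        string s = "Hello world";
        Console.WriteLine(s.Shout());                 // extension syntax
        Console.WriteLine(StringExtension.Shout(s));  // ordinary static call to the same method

        // The LINQ query operators are extension methods declared in
        // the static class System.Linq.Enumerable.
        string[] names = { "John", "Donald", "Eric" };
        IEnumerable<string> viaExtension = names.Where(n => n.Length < 5);
        IEnumerable<string> viaStatic = Enumerable.Where(names, n => n.Length < 5);
        Console.WriteLine(viaExtension.SequenceEqual(viaStatic));   // True
    }
}
```

This is why adding the LINQ namespace to a file is enough to make Where, OrderBy, and Select appear on every IEnumerable<T>.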

Using Collections in LINQ:

private void btnLoopThroughObjects_Click(object sender, EventArgs e)
{
    List<Person> persons = new List<Person>
        {new Person{FirstName = "Joe", LastName = "Adams", Address = "Chandler"},
        new Person{FirstName = "Don", LastName = "Alexander", Address = "Washington"},
        new Person{FirstName = "Dave", LastName = "Ashton", Address = "Texas"},
        new Person{FirstName = "Bill", LastName = "Pierce", Address = "Sacramento"},
        new Person{FirstName = "Bill", LastName = "Giard", Address = "Camphill"}};
    var personsNotInTexas = from person in persons
                            where person.Address != "Texas"
                            orderby person.FirstName
                            select person;
    lstResults.Items.Clear();
    foreach (var person in personsNotInTexas)
    {
        lstResults.Items.Add(person.FirstName + " " + person.LastName +
            " - " + person.Address);
    }
}

After creating the collection, the query below selects all the persons that are not in Texas and orders the results by the first name of the persons.

var personsNotInTexas = from person in persons
                        where person.Address != "Texas"
                        orderby person.FirstName
                        select person;
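The query expression above is shorthand for a chain of extension method calls. The sketch below shows an equivalent form as a console program; the Person class shape is an assumption (the listing uses it without showing its definition), and MethodSyntaxDemo is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the Person class used by the listings.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Address { get; set; }
}

public static class MethodSyntaxDemo
{
    public static List<Person> NotInTexas(IEnumerable<Person> persons)
    {
        // Equivalent to: from person in persons
        //                where ... orderby ... select person
        return persons
            .Where(person => person.Address != "Texas")
            .OrderBy(person => person.FirstName)
            .ToList();
    }

    public static void Main()
    {
        var persons = new List<Person>
        {
            new Person { FirstName = "Joe", LastName = "Adams", Address = "Chandler" },
            new Person { FirstName = "Dave", LastName = "Ashton", Address = "Texas" },
            new Person { FirstName = "Bill", LastName = "Pierce", Address = "Sacramento" }
        };

        foreach (var person in NotInTexas(persons))
            Console.WriteLine(person.FirstName + " " + person.LastName + " - " + person.Address);
    }
}
```

With this sample data, Dave is excluded by the filter and the remaining two persons come back ordered by first name.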

Anonymous Types

In the previous section, the output from the query is a sequence of persons. In the query, you specify that you only want those persons that are not in Texas. In that case, the query returned a sequence of Person objects, with each Person object containing FirstName, LastName, and Address properties.

Let us say, for example, that you want all the persons that meet the criteria, but only with the FirstName and Address properties. This means that you need to be able to create a new class with these two properties on the fly. This is exactly what anonymous types in C# 3.0 allow you to accomplish. Although these types are called anonymous types, the compiler does assign a name to them; it is just not available to us in source code.

For example, the below snippet of code shows the Click event handler of a command button, in which a LINQ query returns a sequence of a new type.

private void btnLoopThroughAnonymous_Click(object sender, EventArgs e)
{
    List<Person> persons = new List<Person>
        {new Person{FirstName = "Joe", LastName = "Adams", Address = "Chandler"},
        new Person{FirstName = "Don", LastName ="Alexander", Address = "Washington"},
        new Person{FirstName = "Dave", LastName = "Ashton", Address = "Seattle"},
        new Person{FirstName = "Bill", LastName = "Pierce", Address = "Sacramento"},
        new Person{FirstName = "Bill", LastName ="Giard", Address = "Camphill"}};
    var personsNotInSeattle = from person in persons
                              where person.Address != "Seattle"
                              orderby person.FirstName
                              // Project each match into a new anonymous type with two properties.
                              select new {person.FirstName,
                                          person.Address};
    lstResults.Items.Clear();
    foreach (var person in personsNotInSeattle)
    {
        lstResults.Items.Add(person.FirstName + " --- " + person.Address );
    }
}

The above snippet of code is very similar to the previous example in that here also you examine all persons with a query expression and return only those persons that are not in Seattle. However, one key difference is that it does not return the entire person. It returns a new type that contains two public properties: FirstName and Address. This new type was created by the compiler. Here is roughly the definition the compiler creates (note that the properties of a C# anonymous type are read-only):

public class ?????
{
    private readonly string firstName;
    private readonly string address;

    public string FirstName
    {
        get { return firstName; }
    }

    public string Address
    {
        get { return address; }
    }
}

As you can see, the above class is very similar to the Person class except that it is nameless, meaning that it is an anonymous type. So the return value from the query is IEnumerable<?????>, and what goes in place of the question marks is determined by the compiler. To be able to capture this sequence of anonymous objects, you need a way to declare a variable without writing out the name of its type. This is exactly why you need the var keyword. As mentioned before, var asks the compiler to infer the type of a local variable, which is the only option when you cannot know the name of the anonymous type that the compiler created for you. Variables declared with var must be initialized at the point they are declared, because that is the only way the compiler knows what type they should be.
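The behaviour described above can be observed in a small console sketch. AnonymousTypeDemo and its sample data are made up for illustration; the reflection call is only there to show that the generated property has no setter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnonymousTypeDemo
{
    public static List<string> Describe()
    {
        // Made-up sample data, itself built from anonymous types.
        var people = new[]
        {
            new { FirstName = "Joe", LastName = "Adams", Address = "Chandler" },
            new { FirstName = "Dave", LastName = "Ashton", Address = "Seattle" }
        };

        // var is required here: the element type has no name we can write.
        var projected = from p in people
                        where p.Address != "Seattle"
                        select new { p.FirstName, p.Address };

        var lines = new List<string>();
        foreach (var item in projected)
        {
            // item.FirstName = "x";   // would not compile: the property has no setter
            bool writable = item.GetType().GetProperty("FirstName").CanWrite;
            lines.Add(item.FirstName + " --- " + item.Address + " (writable: " + writable + ")");
        }
        return lines;
    }

    public static void Main()
    {
        foreach (var line in Describe())
            Console.WriteLine(line);
    }
}
```

Printing item.GetType().Name would reveal the compiler-generated name, but that name is an implementation detail and should not be relied on.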

DLinq and XLinq:

Of course, data does not just exist in .NET memory. Two other important places where you will find data are databases and XML documents. This is where DLinq (LINQ to SQL) and XLinq (LINQ to XML) shine. DLinq provides special objects in addition to the standard LINQ objects that allow querying straight from the database. DLinq allows for data mapping through a simple class based mechanism. All query expressions are then dealt with as an “expression tree”, which allows any LINQ expression to be converted into a different equivalent expression such as T-SQL. Using this technique, the following C# code snippet can actually perform a native query in SQL Server:

from c in customers
    where c.LastName.StartsWith("A")
    select c

By way of an expression tree, this C# query gets translated into a valid T-SQL query, which then executes on the server. For queries like this one, C# developers do not need to write the native T-SQL by hand. Also, think of the possibilities this opens up for CLR stored procedures!
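To give a feel for what an expression tree is, the sketch below builds one from a lambda and inspects it, without touching a database. ExpressionTreeDemo is a hypothetical name; the types come from the standard System.Linq.Expressions namespace. It illustrates only the general mechanism, not how DLinq itself generates T-SQL.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Assigning a lambda to Expression<Func<...>> makes the compiler build
    // a data structure describing the code instead of compiling it to IL.
    public static readonly Expression<Func<string, bool>> StartsWithA =
        lastName => lastName.StartsWith("A");

    public static void Main()
    {
        // The body is a method call node that a provider could translate.
        Console.WriteLine(StartsWithA.Body.NodeType);      // Call
        var call = (MethodCallExpression)StartsWithA.Body;
        Console.WriteLine(call.Method.Name);               // StartsWith

        // The tree can also be compiled into an ordinary delegate.
        Func<string, bool> compiled = StartsWithA.Compile();
        Console.WriteLine(compiled("Adams"));              // True
    }
}
```

A query provider walks nodes like these and emits the equivalent query text for its own back end.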

As mentioned before, XLinq enables you to query XML data. Using XLinq, you can query all customers whose last name starts with an “A” in the following fashion:

from c in customerXml.Descendants("Customer")
    where c.Element("LastName").Value.StartsWith("A")
    select c
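The query above can be run end to end with the released LINQ to XML classes in System.Xml.Linq. In this sketch the XML document, the XmlQueryDemo class, and the helper method are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlQueryDemo
{
    public static List<string> LastNamesStartingWithA(XDocument customerXml)
    {
        var matches = from c in customerXml.Descendants("Customer")
                      where c.Element("LastName").Value.StartsWith("A")
                      select c.Element("LastName").Value;
        return matches.ToList();
    }

    public static void Main()
    {
        // Made-up sample document.
        XDocument customerXml = XDocument.Parse(
            "<Customers>" +
            "<Customer><LastName>Adams</LastName></Customer>" +
            "<Customer><LastName>Pierce</LastName></Customer>" +
            "<Customer><LastName>Ashton</LastName></Customer>" +
            "</Customers>");

        foreach (var lastName in LastNamesStartingWithA(customerXml))
            Console.WriteLine(lastName);   // Adams, then Ashton
    }
}
```

The sketch assumes every Customer element has a LastName child; a document where it may be missing would need a null check before reading Value.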

 
