Skip navigation

Demystifying Entity Framework 4

A hands-on introduction to working with ADO.NET Entity Framework 4


Many people ask: "What exactly is the ADO.NET Entity Framework?" and "How do I use it?" I will answer these two basic questions in this article. I also will provide a clear picture of the architecture, as well as support your knowledge by providing sample code and external references to additional resources.

What Is Entity Framework?
The ADO.NET Entity Framework is an Object-Relational Mapping (ORM) tool from Microsoft that's meant to make a developer’s life easier. The primary purpose of an ORM is to create a representation of the back-end database that can be easily consumed in an application. Typically, each table in the database maps to a class in the ORM framework (usually created by a code generator). The Entity Framework takes this concept a step further by loosely coupling the entity containing the data with the back-end table. This means that one entity can represent one or many tables, and one or many entities can represent one table, with some restrictions.

To manage the looser coupling of tables and relationships, the framework employs three definitions to represent the model. The Storage Schema Definition Language (SSDL) represents the database storage attributes of an entity. The Conceptual Schema Definition Language (CSDL) defines the entities and their relationships to each other (the drill-through API you're about to see in this article uses these relationships). The Mapping Specification Language (MSL) maps the entities defined in the conceptual model to the storage model definition.

These "languages" are used to translate the LINQ query against the conceptual model into a database query, and support other pertinent database operations. The metadata defined here is also imported into the model at runtime.

Our First Model
Let’s start by creating a sample model. Open up Visual Studio 2010 and select File, New, ADO.NET Entity Data Model. Select the Generate from database option. The second step in the Entity Data Model Wizard requires a Data Connection for setting up the model, as Figure 1 shows.

Figure 1: Setting up the database connection
Figure 1: Setting up the database connection

Next, as Figure 2 shows, the wizard identifies the tables, views, and procedures to import into the model.

Figure 2: Importing database objects
Figure 2: Importing database objects

Clicking Finish creates the model representation that Figure 3 shows.


Figure 3: Generated model
Figure 3: Generated model

By default, each table represents an entity, with all the inter-relationships depicted in the same way as a database model would appear. It's possible to customize the mappings as mentioned earlier; however, that rarely is necessary.

Generated Code
Stepping into the generated designer file, the following list represents the various components used within an application:
•    ObjectContext: the main component that all queries originate from. It also provides benefits like model metadata, change tracking, transaction support, and batch updating to the database, to note a few. An application never uses the ObjectContext directly, but a generated class that inherits from it.
•    ObjectSet: Queries and updates are performed against the ObjectSet, which works with the ObjectContext. The ObjectSet is generic in form (i.e., ObjectSet) and essentially represents a database table conceptually.
•    EntityObject: the base class for an entity; a DataRow-like object that represents the data of the table logically.
•    EntityReference: Each entity object has a pointer to other objects that serve as the entity’s parent. This object represents logically the associated table’s primary key reference.
•    EntityCollection: Each entity object has a pointer to other objects that serve as the entity’s foreign key. This object represents logically the associated table’s foreign key reference.

In the context of ADO.NET DataSets, an EntityObject is equivalent to a strongly typed DataRow object, an ObjectSet is equivalent to a strongly typed DataTable and TableAdapter combined, and the ObjectContext is similar to a strongly typed DataSet. Even though similar, the ADO.NET Entity Framework functions much differently than its predecessor.

Querying the Database
The first type of query supported is to use Language Integrated Query (LINQ), which looks a lot like ANSI SQL. For instance, look at the following query:

Var ctx = new AdventureWorksObjectContext();
From c in ctx.Customers
where c.TerritoryID

select c;

Dissecting this query yields the following parts:
•    Ctx: the ObjectContext instance of type AdventureWorksObjectContext (code-generated)
•    Customers: an ObjectSet reference, where Customer is an EntityObject. Each object set is generated as a property of the object context.
•    TerritoryID: a property on the Customer object
•    C: an alias pointer to Customer

In many ways, these queries look a lot like ANSI SQL, with the exception of the select statement at the end. LINQ also supports many of the other types of operators. For instance, to order the results, you can use the orderby clause, which would appear after the where clause:

orderby c.CustomerID descending

Joins against multiple tables are also supported using the join clause:

From a in ctx.Addresses
join ca in ctx.CustomerAddresses
on ca.AddressKey equals a.AddressKey

Note that a join uses the word "equals" instead of the equals operator. You also can group and aggregate results in many ways:

var query = from c in context.Customers
group c by c.CustomerType into g
select new \\{ g.Key, Customers = g, CustomerCount = g.Count() \\};

For a detailed overview of the types of queries supported, check out the online MSDN samples.

Using Extension Methods
In addition to structured queries, the .NET Framework comes with a wide range of extension methods that also perform the same type of database queries. You can chain these extension methods together to perform more complicated queries. To do so, you would rewrite the previous query, as follows:

var custs = ctx.Customers.Where(i => i.TerritoryID  3)
    .OrderByDescending(i => i.CustomerID);

It’s possible to perform most operations using extension methods, as you can in LINQ. In this example, each extension method isn’t actually performing the database operation at that exact second. A database query is only executed when iterated through.

Stored Procedures/Function Imports
Let’s bring stored procedures into the model. Right-click in the background of the model, selecting the Update Model from Database option. Select the Stored Procedures option in the tree, as was previously illustrated. Once stored procedures are in the model, they appear in the model browser, as Figure 4 shows.

Figure 4: Setting up stored procedures in the model browser
Figure 4: Setting up stored procedures in the model browser

Note the two sections of the model browser. EFSamples.Data contains the conceptual entities and the Function Imports (methods that represent stored procedures). The EFSamples.Data.Store region is the storage model that contains the reference to the actual stored procedure.

Stored procedures are not immediately exposed through the model; the model requires the creation of a function import that represents the proc and its inputs/outputs. To import one, right click the Function Imports folder and select Add Function Import. Use the screen that Figure 5 shows to configure its parameters.

Figure 5: Setting up a stored procedure
Figure 5: Setting up a stored procedure

You can map the function import to any CRUD procedure with any name; you don't need to match the stored procedure name. But any stored procedure returning data must be mapped to an existing entity in the conceptual model, or to a custom entity. To create a custom type, select the Create New Complex Type button.

In a few rare occasions, the Entity Framework designer can have trouble importing the stored procedure. A few known causes and workarounds are available for this, such as adding a one-time configuration option to disable actual execution of the proc using the SET FMTONLY option.

Once the proc is generated, the new function GetUsedProducts is available, using the signature of the procedure (int? productID, DateTime? checkDate) and returning a custom result as shown here:

var ctx = new AdventureWorksObjectContext();
var products = ctx.GetUsedProducts(null, DateTime.Now);

This function import executes the back-end database stored procedure to perform the query; no LINQ is involved. A stored procedure can be executed and read only once; otherwise, an exception may be thrown stating that the data reader can only be read once. Calling ToList() before using the results can fix this issue.

Using Entity SQL

Another alternative for querying data is to use Entity SQL, which is an SQL-like string that performs some querying against data in the database. This approach is beyond the scope of this article, but you can find more information about it on the MSDN site.

ADO.NET Entity Framework vs. LINQ to Objects
Microsoft created varying flavors of LINQ, including the ability to query anything that’s enumerable. This is called LINQ to Objects. It works like an ADO.NET Entity Framework query, but actually processes the results in memory instead of through the database. It’s important to understand at what point LINQ queries and extension methods against an ADO.NET Entity Framework context translate into a database call versus an in-memory processing of objects. Let’s examine the code in Figure 6.

With LINQ, the shape and form of a database query can be transformed outside of the initial LINQ statement. In this sample, the Where and Select extension methods transform the underlying database query, which has not been executed at this point. The last Where statement returning data to the highPrices variable is a LINQ to Objects query.

The important point to understand is that the shape of a query is mutable until either iterated through or the ToList extension method is called. This process of query modification is both a powerful and very dangerous feature of ADO.NET Entity Framework.

LINQ Result Sets
Every LINQ query, extension method, or stored procedure proc result that returns more than one record returns an IQueryable result to the caller. The IQueryable interface may wrap an ObjectQuery class that represents the current LINQ query, or it could simply represent a LINQ to Objects result set.

Most LINQ constructs or extension methods return the IQueryable interface, but there are several exceptions. Extension methods like First, FIrstOrDefault, Single, or SingleOrDefault return only one record and thus return the object directly. The Any or All extensions return a Boolean. The OrderBy and OrderByDescending extensions return an IOrderedEnumerable interface and the GroupBy and GroupByDescending extensions return an IEnumerable> result.

Queries Under the Covers
Under the covers, ADO.NET Entity Framework queries return an ObjectQuery instance. This class contains a ToTraceString method, a method returning the live SQL of the query being executed. This feature is advantageous for performance tuning, or for figuring out how a certain LINQ query may translate to SQL. While it might seem that the LINQ query should execute the method you expect, in reality it might not. For instance, look at the query that Figure 7 shows.

The ToTraceString method returns the lengthy SQL result that Figure 8 shows. The LINQ query in Figure 7 translates in SQL to two subqueries, two WHERE conditions (matching the existing LINQ query where conditions), and one INNER JOIN. The ultimate point about this example: Be careful of the types of joins that you make in your LINQ query, and make sure you understand how the query is translated to SQL.

Drill-Through Model
Because Entity Framework is an ORM tool that maps entities to tables in your database, one of the goals of the framework is to provide a drill-through capability using Navigation Properties, or pointers to the relationships of the entities. For instance, a Customer may have many orders. In LINQ, when a Customer is retrieved from the database, it has a property that looks like this:

public EntityCollection Orders \\{ .. \\}

Conversely, an order has a pointer to the Customer parent. Parent references utilize three properties (depending on the configuration of your model generation), as in the following:

Public int CustomerKey \\{ get; set; \\}
Public Customer Customer \\{ get; set; \\}
Public EntityReference CustomerReference \\{ get; set; \\}

It might seem like overkill to have three properties managing one entity, but there is some significance to this. The first property is the actual primary key value that represents the customer. The second is a reference to the Customer entity, representing that CustomerKey. The EntityReference class maintains the metadata about the relationship, whether it’s loaded properly, and other important details.

What does this mean for a developer? It means that the Customer entity can drill into its Orders collection (of type EntityCollection), then into its Products collections, to get the grand total on all the orders related to the customer. Conversely, an Order can read information about its Customer who ordered products (through the Customer property that’s managed by the EntityReference class). This drilling into related objects can happen within the context of a LINQ query, or even after the LINQ query is executed against the database.

Previously, Entity Framework didn’t automatically load this information for you; it required an explicit call to the Load method on the EntityReference or EntityCollection classes. If the new Entity Framework 4.0 LazyLoadingEnabled property is set on the loaded context, this is no longer needed.

Additionally, the drill-through model concept has some interesting conundrums related to LINQ queries. Rather than having to explicitly define joins, it’s possible to simply use the drill-through model to specify the related entities in the WHERE or SELECT clauses, something that a database query is forced to do.

Inserts, Updates, Deletes
ADO.NET Entity Framework makes CRUD operations a snap. Let’s look at Figure 9, which shows an example of an insert. When adding new objects, the AddObject method of the ObjectSet or one of the various Add methods of the ObjectContext queues the insert operation. These methods don’t actually insert into the database; all changes are sent to the database once the ObjectContext’s SaveChanges method is called.

Sometimes you might get an exception thrown when SaveChanges is called. One reason for this happening might be that a primary key reference (either the key property or the object reference) has not been supplied. Other reasons might be that a value is out of the valid range of values (typically with dates), the object already exists within the database (in case of duplicate keys), or any database exceptions that occur.

Delete operations are even simpler to perform; simply calling the DeleteObject method on the ObjectSet queues up the removal of the object from the database, as shown below. By default, this does not cascade to delete the relationships, and as such they must be explicitly deleted.


Last, updates are the easiest. As soon as an object’s property is updated, an event fires, signaling to the ObjectContext that the object is dirty. The EntityState of this object is set to Modified until flushed to the database. Note that an UpdateObject method doesn’t exist anywhere; it really is as simple as the ObjectContext listening for a change event on every object.

You do have to be aware of some nuances. For instance, consider the following code:

var customer = ctx.Customers.First();
var order = new Order \\{ … \\};
order.Customer = customer;

Normally, any created object would not be queued up to the database for an insert operation. However, because the Customer object is "context-aware," meaning that it was queried by the ObjectContext, the new order is queued up to the database as inserted. In the sample code, you will see the entity is marked as "Added" to the database.

If this is a feature you don't want to have happen, rather than setting the Order object reference directly, set the OrderKey integer property instead (if you turned on foreign key properties upon generating the model).

Internally, the ObjectContext implements the unit of work pattern, using the ObjectStateManager component to track the objects that were queried, changed, or deleted. Because of this, the ObjectContext knows of the changes made to an object being updated. This only works for objects where change tracking is enabled, or objects where the Detach method wasn’t called on it.

Advanced Object Context Features
The object context provides built-in features to work with objects abstractly through various methods, as Figure 10 shows. These two options retrieve an object by an entity key. The entity key represents the unique identifier for the entity (AdventureWorksObjectContext.Customers), the primary key field name, and the ID used to query with. The primary difference is that one allows for a null to be returned; GetObjectByKey throws an exception when the object doesn’t exist with a given key.

Second, the ObjectSet represents everything the application can do with an entity. Using an object set, a query can be generated abstractly to perform some query operation against the database source. For instance, the following query creates the ObjectSet used to get a customer by the key:

ctx.CreateObjectSet().FirstOrDefault(i => i.CustomerID == key);

Code Generation
Code generation is an excellent feature to add to an application. I highly recommend using this capability, because it lets you customize the output to your own needs. Three options are available for generation: POCOs, Self-Tracking Entities, and Entity Objects, two of which Figure 11 shows.
Figure 11: Available code-generation templates
Figure 11: Available code-generation templates

I’m not going to go into the specifics of each; rather, let’s look at choosing one option and see how the code generation works.

The generated T4 template generates the ObjectContext and each entity representing the table. Altering this template can be advantageous. For instance, the template could add an interface whenever it detects certain columns that exist in the table. The possibilities are endless (sort of).

Once generated, the code-behind of the EDMX file is now empty, and all code exists within the AdventureWorksCodeGen.cs file. Any time the *.edmx file changes, the T4 template runs and re-creates the generated code. Depending on the template selected, the API for the object context and its entities still works the same.

Entity Framework Rocks
Entity Framework is a full-fledged ORM tool that generates code representing the application database. The framework uses LINQ queries or extension methods to query data against the database and even perform database manipulations. Additionally, the API supports a drill-through hierarchy to allow an application to easily access related data to an entity.

The infrastructure is flexible enough to track changes made to its objects (through the unit of work pattern), get objects dynamically through a unique identifier, generate queries dynamically, and much more.

Additionally, new developments in ADO.NET Entity Framework make the infrastructure even more flexible, such as support for generation of code and code customization. Yes, Entity Framework has some limitations, but all these features make it a great product.

Brian Mains ([email protected]) is a Microsoft MVP and consultant with Computer Aid, where he works with nonprofit and state government organizations.

Hide comments


  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.