Common Language Runtime from the DBA Point of View

When the Common Language Runtime (CLR) was introduced in SQL Server 2005, it was enthusiastically received by developers. CLR gave developers a powerful new way to write database code by using the same modern object-oriented (OO) languages that they used to develop other applications.

From the DBA point of view, the introduction of CLR blurred the distinction between compiled application code and database code. It raised questions about the roles of the developer and the DBA in the deployment, management, and security of CLR code modules. To some extent, it also blurred the distinction between the business logic tier and the data access tier in n-tier development models.

For many organizations, implementing CLR requires making decisions about the development process and about the roles of those involved. If you’re preparing to implement CLR, you need enough information to facilitate those decisions without getting immersed in too much technical detail.

To that end, I’ll explain how CLR is implemented and discuss the pros and cons of CLR code and the new wrinkles that this technology has brought to SQL Server management and security. I’ll also describe situations in which CLR might be helpful and situations in which it might not be appropriate.

Creating and Deploying CLR

CLR is a programming model that allows developers to use compiled OO languages to write code and to have the code executed as if it were a T-SQL procedure, function, or trigger. Basically, the process is as follows:

1. A developer creates code in a Visual Studio (VS) project. The code can be written in either C# or Visual Basic .NET.

2. The code is compiled and deployed from VS directly into the target database. The compilation process creates a DLL. The DLL is an executable file that contains the compiled code. Deployment also inserts information about the assembly into the sys.assemblies view and other related views in the target database.

3. One or more T-SQL objects are created in the target database by using the EXTERNAL NAME option to point to a code module in the assembly. These objects are referred to as prototypes. A prototype can be defined as a stored procedure, user-defined function (UDF), trigger, user-defined aggregate, or user-defined type. The prototype contains no code; it’s simply an entry point that can be called from other database code. The compiled code in the DLL is executed when the code is called.

Usually, all these steps are executed when you compile and deploy the code from a VS project. But you can also execute each step manually.

Modern OO languages have functionality that’s difficult or impossible to reproduce through T-SQL. CLR offers a way to use the rich feature set of these languages in database code. CLR is often better suited for computational or iterative operations than is T-SQL. CLR code executes in the same memory context as SQL Server, so it runs very efficiently.

CLR code is managed code. This means that it runs inside a CLR virtual machine (VM) that prevents execution of any code that has the potential to destabilize the server. Although this is a good thing, any form of virtualization has some degree of overhead. For example, CLR can be overused in ways that affect performance. Although compiled code normally executes efficiently within the SQL memory space, T-SQL is still the best option for heavy lifting. If a procedure accesses or returns large amounts of data, it probably isn’t a good candidate for CLR.

An oversimplified rule of thumb is that T-SQL handles set-based operations more efficiently, and CLR handles iterative, procedural, or computational operations more efficiently. There are many exceptions to this rule, however.

Managing CLR

Sometimes the first contact a DBA has with CLR is a request to enable CLR on the server. Doing this is as simple as enabling the clr enabled server configuration option. But there are many issues to consider before you flip the switch.

Deploying code into the database has traditionally been the responsibility of the DBA, who should make sure that the code works correctly and won’t affect performance or cause other problems on the server. However, CLR code is ordinarily deployed into the database directly from the VS environment in which it was developed. Traditionally, VS hasn’t been a tool used by the DBA. This suggests that the developer might be in a better position than the DBA to deploy CLR code.

This responsibility has to be worked out in each individual shop. Many DBAs would be uncomfortable knowing that a developer has an unfettered ability to create compiled code modules that run in the same memory space as SQL Server itself. However, the DBA still has granular control over who can load CLR assemblies and at which security level the assemblies can function.

Securing CLR

By default, only members of the sysadmin server role, the db_owner role, and the ddl_admin database role have permission to execute assembly-related Data Definition Language (DDL) statements. This permission can be granted to other users and roles. In addition, the following CLR permission sets let you assign three different levels of permissions to the assembly itself:

SAFE. The SAFE permission set allows only local data access and internal computation. Access to external system resources, such as files, networks, environment variables, or the registry, isn’t permitted. Anyone who has the permission to create an assembly can create a SAFE assembly. This is the default set if the permission level isn’t explicitly stated in the CREATE ASSEMBLY statement.
EXTERNAL ACCESS. The EXTERNAL ACCESS permission set grants the right to access external system resources. The user must have EXTERNAL ACCESS permission to create an assembly that has access to external system resources.
UNSAFE. Think twice before you permit UNSAFE assemblies on your server. UNSAFE assemblies have unrestricted access to resources within or outside of SQL Server. Code within an UNSAFE assembly can also call unmanaged code, which has the potential to destabilize the server. Only a member of the sysadmins group can grant UNSAFE permissions to an assembly.

By default, CLR code executes in the security context of the SQL Server service account. This can be a problem when you access resources that are external to SQL Server. A best practice is to have a minimal set of permissions for the service account. However, CLR code can be executed by impersonating another Windows account. Assemblies that have EXTERNAL ACCESS and UNSAFE permission sets have this ability. People who have the ability to create these assemblies must be well trusted by the organization.

Understanding CLR Metadata

In a CLR-enabled environment, the DBA should be familiar with the following five system views that contain metadata for the CLR objects in the database:

sys.assemblies. This view contains one record for every assembly in the database. It contains information about each assembly, such as its name, security level, and creation date.
sys.assembly_files. For each assembly in the database, this view should contain a row for every source code file, the AssemblyInfo file, and the DLL file. If the assembly is deployed from a VS project, all the files will be added automatically. If the assembly is deployed manually, the source code and AssemblyInfo files will have to be added manually. The sys.assembly_files view can provide a measure of source code control. The view’s "content" (varbinary) column contains the CLR source code for all the code modules and the binary code of the assembly DLL. This means that your CLR source code and the binary contents of the compiled DLL are backed up each time the database is backed up. You can easily retrieve the contents of the source code files in text format from this view. To do this, first set your query window to return results as text. Then, execute the query in Listing 1 to return all the C# source code for the specified assembly. (You can download the code in the listings by going to www.sqlmag.com, entering 129006 in the InstantDoc ID text box, clicking Go, then clicking the Download the Code Here button.)

USE MyDatabase;
SET NOCOUNT ON
SELECT CAST(content AS varchar(max))
  FROM sys.assembly_files f
JOIN sys.assemblies a
  ON a.assembly_id = f.assembly_id
  WHERE a.name = 'MyAssembly'
  AND f.name LIKE '%.cs';

sys.assembly_modules. An assembly can include multiple classes, such as procedures and functions. Classes within the assembly can have multiple code modules (methods) that implement different functionality. This view maps the IDs of the individual modules to the object IDs of the T-SQL prototypes that reference them. The query in Listing 2 retrieves this data and resolves the IDs to names. In the sys.assembly_modules view, a module can be mapped to more than one prototype. For example, a single code module might be implemented as a stored procedure and a function.

SELECT OBJECT_NAME(m.object_id) AS db_object,a.name AS assembly
  ,m.assembly_class
  ,m.assembly_method
  FROM sys.assembly_modules m
JOIN sys.assemblies a
  ON a.assembly_id = m.assembly_id

sys.assembly_references. An assembly can reference another assembly. This view contains a row for each pair of assemblies in which one is directly referencing another. You can use this view to find any dependencies before you drop an assembly.
sys.module_assembly_usages. This view maps assembly names to the object IDs of the T-SQL prototypes that reference them. The query in Listing 3 retrieves this data and resolves the object IDs to object names.

USE MyDB:
SELECT OBJECT_NAME(object_id)
  AS db_object,a.name AS assembly
  FROM sys.module_assembly_usages u
JOIN sys.assemblies a
  ON a.assembly_id = u.assembly_id

Removing CLR Assemblies

VS handles the details of creating and deploying a CLR assembly to the SQL Server database, but there’s no equivalent tool to uninstall a CLR assembly. You must manually drop objects in the opposite order in which they were created.

First, all T-SQL prototypes that were created must be dropped. You can use sys.module_assembly_usages to find all the T-SQL prototypes that are associated with an assembly. Before you drop a T-SQL prototype, you should make sure that no other T-SQL objects are dependent on it. The simplest way to do this is to right-click the prototype in SQL Server Management Studio’s (SSMS’s) object browser and click View Dependencies. Alternatively, you can use sp_depends. For example, if you want to find the objects that depend on the sp_MyProc stored procedure, you’d run the command

EXEC sp_depends @objname = N'sp_MyProc';

After you drop all prototypes, use the DROP ASSEMBLY command to drop the assembly itself. This will clean up all references to the assembly in the assembly views.

A Change for the Better

The T-SQL-only era of database programming is changing. CLR is just one example of this trend. Although crusty old SQL guys like me might grumble a bit, change is what makes life as a DBA interesting. CLR offers real advantages in areas in which T-SQL has limitations. The ability to mix the two programming models is a powerful tool, and DBAs and developers need to acquire the necessary skills for both if they don’t already have them.

Comments

Plain text