What makes developing Windows Azure applications unique is that instead of relying directly on the OS-provided infrastructure or on dedicated, local servers that provide similar functionality, your solutions are now highly dependent on services for fundamental aspects of their architecture, such as database storage, messaging, caching, and security. Developing on Windows Azure does not mean having to learn a new "Azure" language, become familiar with other development environments, or even learn a completely different API stack for a new OS. In practice, developing on Azure means programming in the .NET language of your choice within Visual Studio 2010, with your code executing on virtual machines (VMs) running a guest OS that's a variant of Windows Server 2008. Azure even supports development using open-source platforms.
My intention here is to provide an architectural overview of the various services Azure offers you, the developer, so that you are aware of your options and therefore are better equipped to select the right service for your solution. We will approach the architecture by asking and answering some basic topical questions about execution, storage, security and the like. Let's get started!
Where Do You Run Your Application?
When you have executable code to run (such as a website, web services, automated processing routines, or server applications), you deploy it to and run it within Windows Azure Compute. Compute provides three different role types that are designed to address needs specific to particular types of applications. All these roles run as VMs hosted in Azure data centers:
- Web role -- This role is targeted to running web server applications and provides out-of-the-box support for Microsoft IIS 7 and ASP.NET (including MVC 3). The Web role can also be used to host web services that rely on IIS.
- Worker role -- This role is targeted to broader, more general development of applications. It is used particularly for executing background processing logic.
- VM role -- The VM role enables you to upload your own VM image of Server 2008 containing the application components you want to execute on Azure infrastructure. This role is often used for porting legacy applications to Azure.
Understanding the Azure Service Model
Your usage of Windows Azure resources is grouped into a logical unit of billing called a subscription. Each subscription contains one or more hosted services, wherein each hosted service indicates which data center to host your application in as well as the DNS prefix that will be used to access your application.
A hosted service can contain up to two deployments: a staging deployment and a production deployment. Each deployment (staging or production) provides a DNS name used to access your service. Production deployments are fixed to the DNS prefix provided when you create the parent hosted service, whereas staging deployments are assigned an auto-generated DNS name.
Each deployment describes one or more roles (of the role types previously mentioned), wherein each role is responsible for, among other settings, specifying the size and number of VM instances dedicated to supporting that role. Typically each role represents a function of the application, such as the website or the background workers. Each VM instance allocated to a role runs a guest OS that is substantially compatible with Server 2008.
Roles describe not only the number of instances but also their size. Generally speaking, the larger the instance size, the more dedicated cores (as shown in Figure 1), memory, disk space, and bandwidth each instance gets.
Figure 1: Mapping Azure roles to physical hosts
For example, an x-small instance shares a single CPU core with other x-small instances and has the smallest bandwidth (5Mbps peak), memory (768MB), and disk space (20GB), whereas an x-large instance gets 8 dedicated cores, 14GB of memory, 2040GB of disk space, and 800Mbps peak bandwidth. To find the detailed specs for each size, see "How to Configure Virtual Machine Sizes."
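To make the size trade-off concrete, here is a minimal sketch that picks the smallest instance size meeting a set of requirements. It models only the two sizes quoted above; the InstanceSize type and the smallest_fit helper are hypothetical names of my own and are not part of any Azure SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceSize:
    name: str
    dedicated_cores: int      # 0 means the core is shared with other instances
    memory_mb: int
    disk_gb: int
    peak_bandwidth_mbps: int


# Only the two sizes quoted in the text, ordered smallest to largest.
# The intermediate sizes are deliberately left out of this sketch.
SIZES = [
    InstanceSize("x-small", 0, 768, 20, 5),
    InstanceSize("x-large", 8, 14 * 1024, 2040, 800),
]


def smallest_fit(min_cores: int, min_memory_mb: int,
                 min_disk_gb: int) -> Optional[InstanceSize]:
    """Return the smallest listed size that meets every requirement, or None."""
    for size in SIZES:
        if (size.dedicated_cores >= min_cores
                and size.memory_mb >= min_memory_mb
                and size.disk_gb >= min_disk_gb):
            return size
    return None
```

Calling smallest_fit(0, 512, 10) selects the x-small entry, while any request for a dedicated core falls through to x-large in this two-entry table.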
Communicating with Your Deployments
When you deploy your solution to Azure as a production deployment, you will get a DNS name of the form [dnsprefix].cloudapp.net (see Figure 1). You can use that DNS name to access your Azure-hosted applications directly. Alternatively, you may configure the CNAME entries in DNS for a domain name you own, so that they point to that [dnsprefix].cloudapp.net domain. By doing so, you can access your Azure-hosted application via your own domain name (e.g., mybigbadapp.com instead of mybigbadapp.cloudapp.net).
You generally do not communicate with a specific role instance. Instead, you target a role by specifying a port number in addition to the DNS name and rely on Azure to route your request to one of the instances for that role. For example, in Figure 1, Role B could be hosting an ASP.NET website, so it could be accessed via app.cloudapp.net within a web browser (which makes the request using port 80). Azure would ensure that the single large instance created for that role gets the request.
The communication endpoints (actual ports and protocols upon which your roles listen) are part of the role configuration that is known as the service definition. In addition to providing endpoints that respond to requests from outside the data center, Azure roles can define internal endpoints that allow communication only between roles of the same hosted service and do not allow external input.
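The routing idea described above can be sketched in a few lines. This is an illustrative model only: the class names and the instance naming scheme are hypothetical, and the round-robin choice is my assumption, since the text says only that Azure routes the request to one of the role's instances.

```python
from itertools import cycle


class Role:
    """A role listening on one port, backed by one or more instances."""

    def __init__(self, name, port, instance_count, internal=False):
        self.name = name
        self.port = port
        self.internal = internal  # internal endpoints reject outside callers
        self.instances = [f"{name}_IN_{i}" for i in range(instance_count)]
        self._next = cycle(self.instances)  # assumed round-robin policy

    def pick_instance(self):
        return next(self._next)


class HostedService:
    """Maps (DNS name, port) to a role, then to one of its instances."""

    def __init__(self, dns_prefix, roles):
        self.dns_name = f"{dns_prefix}.cloudapp.net"
        self._by_port = {role.port: role for role in roles}

    def route(self, host, port, from_outside=True):
        if host != self.dns_name:
            raise LookupError(f"unknown host {host}")
        role = self._by_port.get(port)
        if role is None:
            raise LookupError(f"no endpoint on port {port}")
        if role.internal and from_outside:
            raise PermissionError("internal endpoint is not reachable externally")
        return role.pick_instance()
```

A caller names only the host and port; which instance answers is the platform's decision, and that is what lets you change the instance count without touching clients.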
How Do You Develop for Azure Compute?
.NET development on Windows Azure is done using Visual Studio 2010, where you have the Windows Azure SDK for .NET installed. This SDK includes the tools for Visual Studio, client libraries, and emulators that simulate Azure Compute and Azure Storage on your local development machine.
Figure 2: Sample development flow
I will not go into the details of each step, so check out the links in the Additional Resources box for detailed guidance. You start by creating a new cloud project within Visual Studio 2010, you add in additional roles and supporting projects, and you are then able to debug locally by pressing F5. This will launch the compute emulator, run your cloud project, and attach Visual Studio to the appropriate processes, so that you can step through your code as desired. The end goal is to build a package (which is a specially structured .zip file) that contains everything Azure needs to run your solution and a copy of your solution's Service Configuration file that contains settings for your roles. Visual Studio has a Publish context-menu option for all cloud projects that will both build the package and deploy it to Azure for you. Alternatively, you can build a package and then, using a browser, navigate to the Windows Azure Portal and deploy your package outside of Visual Studio.
It is worth noting that beyond package deployment, you can also use Web Deploy for much quicker, iterative deployment, which is typical of web application development. For details about doing so, see "Publishing a Windows Azure Application using the Windows Azure Tools."
Does Azure Support Open-Source Platforms?
You can build Azure-hosted applications on non-.NET platforms. In fact, Azure provides SDKs for building Azure-hosted applications using Node.js, PHP, and Java. You are not limited to these official SDKs, however, as you can build apps upon other languages and platforms, such as Ruby on Rails, Python, and Drupal. Currently, the only restriction is that the platform must run on Windows Server 2008, as Windows Azure does not support other OSs (e.g., Linux).
Where Can You Store Your Data?
Azure offers four different service options for storing your data, each with a particular purpose. As you can see in Figure 3, with the exception of Caching, all Azure storage options are supported for use from either Azure-hosted roles or from non-Azure hosted applications, such as on-premises applications or applications hosted in other third-party data centers.
Figure 3: Azure Storage services
Local storage is exactly what you expect: each compute instance has a local drive accessible to your application. This means you can use it for storing any data you want, but be forewarned that you should use local storage only for transient data. If the VM instance fails (e.g., if the hardware on which it is running fails) or if the Azure infrastructure decides to move your VM to another physical host, the data stored in local storage is lost.
How can you keep non-transient data for your Azure role instances? One way is to mount an Azure drive, which is a blob stored by Azure Blob storage but mounted as a logical drive within your instance. This provides persistent storage, even in the face of VM failure. The main caveat to using an Azure drive is that you can mount it for read-write access from only a single instance (you can, however, mount it in read-only mode across multiple instances).
Which brings us to Azure Blob storage, a component of Windows Azure storage, which is capable of storing binary data in two different ways: as page blobs and block blobs. Page blobs are ideal for storing virtual hard disk (VHD) data (e.g., Azure drives) because they have better performance for random writes; they don't store pages of data that are all zeros, which avoids charges for empty space (such as might be typical in a disk image); and they can store up to 1TB of data. Block blobs, on the other hand, make it easier to manipulate individual blocks within a blob but only support storing up to 200GB of data per blob. In short, Blob storage is best for files. (For more information about Azure Blob storage, see "Expert Tips for Working with Windows Azure Blob Storage and Silverlight" and "How to Migrate a Web Application to Windows Azure and SQL Azure.")
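The point about page blobs not storing all-zero pages is easier to see in a small model. The sketch below is an in-memory illustration, not the Blob storage API: SparsePageStore is a hypothetical class, and the 512-byte page size is the only detail taken from the real service.

```python
PAGE_SIZE = 512  # page blobs are addressed in 512-byte pages


class SparsePageStore:
    """Keeps only pages that contain at least one non-zero byte."""

    def __init__(self):
        self._pages = {}  # page index -> bytes

    def write_page(self, index, data):
        if len(data) != PAGE_SIZE:
            raise ValueError("a page must be exactly PAGE_SIZE bytes")
        if any(data):
            self._pages[index] = bytes(data)
        else:
            # An all-zero page is represented by its absence.
            self._pages.pop(index, None)

    def read_page(self, index):
        # Pages never written (or zeroed out) read back as zeros.
        return self._pages.get(index, bytes(PAGE_SIZE))

    def stored_bytes(self):
        return len(self._pages) * PAGE_SIZE
```

A mostly empty disk image written through this store occupies space only for the pages that hold data, which mirrors why an empty VHD costs little to keep.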
Another component of Windows Azure storage is Table storage. Table storage enables you to store data in a keyed table structure, much like a property bag. Outside of the requirement that every entity within a table has a partition key column and a row key column, the tables themselves are schema-less and can be jagged, with properties varying from one entity to the next. Tables have indexes only on the partition key and row key and do not support secondary indexes for other properties stored in the table. With some planning, you can build the desired indexes between the partition key and row key, which will provide efficiently performing tables that can cost-effectively scale up to 100TB.
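The keyed, schema-less shape of a table can be sketched with a plain dictionary. This is a toy in-memory model with hypothetical names, not the Table storage client library. It shows three access paths: a point lookup by both keys, a query within one partition by row-key prefix, and a full scan on any other property.

```python
class Table:
    """Entities are property bags addressed by (partition key, row key)."""

    def __init__(self):
        self._entities = {}  # (partition_key, row_key) -> dict of properties

    def insert(self, partition_key, row_key, **properties):
        key = (partition_key, row_key)
        if key in self._entities:
            raise KeyError(f"entity {key} already exists")
        # No schema: each entity may carry a different set of properties.
        self._entities[key] = dict(properties)

    def get(self, partition_key, row_key):
        """Point lookup on both keys, the cheapest access path."""
        return self._entities[(partition_key, row_key)]

    def query(self, partition_key, row_key_prefix=""):
        """Entities in one partition whose row key starts with the prefix."""
        matches = sorted(
            (key, props) for key, props in self._entities.items()
            if key[0] == partition_key and key[1].startswith(row_key_prefix)
        )
        return [props for _, props in matches]

    def scan(self, predicate):
        """Filtering on a non-key property has to touch every entity."""
        return [props for props in self._entities.values() if predicate(props)]
```

Putting a value you often filter on (a last name, say) at the front of the row key is one way to get index-like behavior from the keys alone, which is the kind of planning the paragraph above refers to.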
The Windows Azure Caching service provides a distributed, in-memory cache as a service for Azure-hosted applications. Because it provides support for caching any managed object into cache sizes ranging from 128MB to 4GB, the Caching service is a great solution for easily adding state, session, and output caching to Web role-hosted websites or Worker role-hosted web services or processes, particularly for caching the results of complex calculations or data retrieved from SQL Azure. Owing to the inherent network latency in accessing the Caching service, it really only makes sense to use it from within Azure data centers that are near (in a network latency sense) the Caching service. Although it is possible to use the Caching service from on-premises or off-premises applications, Microsoft does not recommend or support this scenario.
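A common way to use a cache for expensive calculations or database reads is the cache-aside pattern, sketched below. The TtlCache class is a local, in-process stand-in that I made up for illustration; it is not the Caching service client, and the 60-second default lifetime is an arbitrary assumption.

```python
import time


class TtlCache:
    """A tiny in-process cache whose entries expire after a time-to-live."""

    def __init__(self, clock=time.monotonic):
        self._items = {}   # key -> (value, expiry time)
        self._clock = clock

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired, so treat it as a miss
            return None
        return value

    def put(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)


def get_or_compute(cache, key, compute, ttl_seconds=60.0):
    """Cache-aside: try the cache first, compute and store on a miss.

    A result of None is never cached in this sketch, because None is
    also how a miss is reported.
    """
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.put(key, value, ttl_seconds)
    return value
```

With this shape, the code that needs the data never decides whether to hit the database; it asks get_or_compute and the slow path runs only on a miss or after expiry.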
SQL Azure is the natural fit if your application uses a relational database, as SQL Azure extends the capabilities of SQL Server to the cloud. If you know how to build applications that use SQL Server, using SQL Azure is largely just a matter of making a change in your connection string. It uses the same TDS protocol for communication and supports most of the SQL Server feature set. SQL Azure databases can scale from 1GB to 150GB, and when you use SQL Azure Federations (which enables you to spread your data across multiple SQL Azure databases), the maximum size is practically limitless. In addition, SQL Azure includes related services that offer reporting (SQL Azure Reporting) and data synchronization with on-premises databases (SQL Azure Data Sync).
How Do Your Application Components Communicate?
Applications within Windows Azure may require various approaches for communication with each other and with external parties, and Azure offers technologies for each.
Queuing approaches. The basic idea behind a queue service is to enable clients to send messages by pushing messages onto a queue. Another set of clients can receive the messages by pulling them off the queue in a first-in-first-out order, as Figure 4 illustrates.
Figure 4: A basic queue
Windows Azure queues and Service Bus Brokered Messaging both provide this type of queuing functionality. However, Service Bus Brokered Messaging provides support for more complex scenarios, such as the subscriptions, filters, and topics shown in Figure 5.
Figure 5: Service Bus Brokered Messaging
In these scenarios, the use of Service Bus Brokered Messaging allows message-receiving clients of the queue (referred to as the topic) to see a filtered copy of the messages appropriate to them (via a subscription). (For more information about Service Bus Brokered Messaging and Azure queues, see "Comparing Windows Azure Queues to Azure AppFabric Durable Message Buffers," "How to Use Windows Azure AppFabric Service Bus Brokered Messaging," and "How to Use the Windows Azure AppFabric Service Bus Brokered Messaging APIs.")
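The relationship between a topic, its subscriptions, and their filters can be modeled in memory as follows. The Topic and Subscription classes are hypothetical teaching aids rather than the Service Bus API, and the filter is an ordinary Python predicate instead of a real filter expression.

```python
from collections import deque


class Subscription:
    """A first-in-first-out queue that keeps only messages passing its filter."""

    def __init__(self, message_filter=None):
        self._filter = message_filter or (lambda message: True)
        self._messages = deque()

    def offer(self, message):
        if self._filter(message):
            self._messages.append(message)

    def receive(self):
        """Return the oldest pending message, or None when empty."""
        return self._messages.popleft() if self._messages else None


class Topic:
    """Senders publish once; every subscription sees its own filtered copy."""

    def __init__(self):
        self._subscriptions = {}

    def subscribe(self, name, message_filter=None):
        subscription = Subscription(message_filter)
        self._subscriptions[name] = subscription
        return subscription

    def send(self, message):
        for subscription in self._subscriptions.values():
            subscription.offer(message)
```

A topic with a single unfiltered subscription behaves like the basic queue in Figure 4; adding filtered subscriptions gives each receiver its own view of the same message stream.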
Hybrid approaches. Windows Azure offers two services, Service Bus Relay and Windows Azure Connect, for hybrid solutions: those in which on-premises components need to communicate with cloud-hosted components and vice versa.
The Service Bus Relay enables clients and services to communicate independent of network address translation (NAT) and firewall obstacles via a cloud-hosted relay service. Service Bus Relay is ideal for hybrid communication where services are involved (e.g., between clients and a Windows Communication Foundation web service).
Windows Azure Connect is more general-purpose. It is akin to an on-demand virtual private network (VPN) between on-premises computers and Azure role instances. Windows Azure Connect is well-suited to the task of enabling Azure role instances to securely access on-premises file shares, Active Directory, or even SQL Server.
Two further technologies help your solution manage load against your Azure roles. The Content Delivery Network (CDN) caches your solution's blobs from Windows Azure Blob storage, or the output of your Azure role instances (such as web pages), at strategically placed locations so that clients retrieve their data with minimal network latency. As an added benefit, CDN caching takes load off your blob storage or role instances.
The Windows Azure Traffic Manager lets you take a similar concept -- network proximity -- and apply it to the problem of load balancing across multiple hosted services. For example, you might have your hosted services in multiple geographic regions (say in the U.S. and in Europe). Windows Azure Traffic Manager enables you to route traffic to the correct region based on factors such as network proximity and service availability.
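As a rough illustration of routing on proximity and availability, the sketch below picks the healthy region with the lowest measured latency. The function name, the region labels, and the latency figures are all hypothetical, and the real Traffic Manager applies its own policies that this does not attempt to reproduce.

```python
def choose_region(latencies_ms, healthy):
    """Return the available region with the lowest latency to the caller.

    latencies_ms maps region name -> measured latency in milliseconds.
    healthy maps region name -> whether the hosted service there is up.
    Regions missing from healthy are treated as unavailable.
    """
    candidates = {
        region: latency
        for region, latency in latencies_ms.items()
        if healthy.get(region, False)
    }
    if not candidates:
        raise RuntimeError("no hosted service is currently available")
    return min(candidates, key=candidates.get)
```

For a caller measuring 40 ms to a U.S. deployment and 120 ms to a European one, the U.S. region wins while it is healthy, and traffic shifts to Europe if the U.S. deployment becomes unavailable.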
How Can You Control Access?
You can use the Access Control Service (ACS) to accept login credentials from a variety of identity providers, ranging from Facebook and Windows Live ID to the Active Directory domain credentials you might have in your enterprise, as shown in Figure 6.
Figure 6: Access Control Service
Think of the authentication flow this way: ACS handles checking the username and password, then gives the client a token, which the client passes on to your application and which says with certainty that the caller is who they claim to be. With the ACS token received, your application can then perform its authorization checks against the claims (roles, facets, other properties) contained within the token and decide whether access should be granted, and if so, to what degree.
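The authorization half of that flow, where your application inspects the claims and decides how much access to grant, might look like the sketch below. Everything here is an assumption for illustration: the Token type, the issuer name, and the "roles" claim with its "admin" and "reader" values are hypothetical. A real application would also validate the token's signature and expiry, which is omitted.

```python
from dataclasses import dataclass, field

# Hypothetical issuer name; a real check would also verify the signature.
TRUSTED_ISSUER = "acs.example"


@dataclass
class Token:
    issuer: str
    claims: dict = field(default_factory=dict)


def access_level(token):
    """Map the claims in a token to a degree of access."""
    if token.issuer != TRUSTED_ISSUER:
        return "denied"  # not issued by the service we trust
    roles = set(token.claims.get("roles", []))
    if "admin" in roles:
        return "full"
    if "reader" in roles:
        return "read-only"
    return "denied"
```

Note that the application never sees a password; it trusts the issuer and reasons only about the claims it was handed.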
How Do You Manage Your Azure Services?
So far, we've covered much of the function of Azure's services, but it is worth briefly mentioning how you actually manage them. The primary entry point to managing all Azure services is via the Windows Azure Management Portal, which is found at windows.azure.com. From here you can manage all the services and access Account Management, which enables you to view estimates of your accrued usage charges and review detailed breakdowns of past bills. From the Windows Azure Management Portal, you can also access PowerShell cmdlets and APIs that empower you to manage the services from within your own applications.
How Can You Promote Your Azure Applications?
Once you've built your application on Azure, why not use Azure to help you market it? To meet this need, Azure provides the Windows Azure Marketplace, with the Data Market, which can help you sell access to your OData-ready data, and the Applications Market, where you can market access to your Software as a Service (SaaS) applications.
Think Elastic Scale
Now that you have had an architectural overview of Azure, you should take away one key theme as you consider how to apply the various services: Think elastic scale. On Azure the services are, by and large, priced so that you pay for what you use, and they are structured not only to scale up but also to scale back down. To give you an example, consider that you can scale your compute instance counts according to demand using the Azure Management APIs or by using the Enterprise Library's new Autoscaling Block, which will auto-scale according to configuration you define. (For more information about using Azure's auto-scaling, see "Auto-Scaling Windows Azure Roles: The Basics.") Regardless, plan for elastic scale in your solution from the outset, so that you can optimize your costs, and choose the right Azure service for the size and frequency of scaling you expect.
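To show what a scaling rule can look like, here is a minimal sketch that sizes a Worker role from the backlog in a queue. It is a hypothetical rule of my own, not the Autoscaling Block's configuration format or the Management API, and the floor of two and ceiling of ten instances are arbitrary assumptions.

```python
import math


def target_instance_count(queue_length, messages_per_instance,
                          min_instances=2, max_instances=10):
    """Instances needed to work off the backlog, clamped to a safe range.

    The floor keeps some capacity running when the queue is empty; the
    ceiling caps cost when demand spikes.
    """
    if messages_per_instance <= 0:
        raise ValueError("messages_per_instance must be positive")
    needed = math.ceil(queue_length / messages_per_instance)
    return max(min_instances, min(max_instances, needed))
```

The same function scales down as readily as it scales up: when the backlog drains, the target falls back to the floor, which is where the pay-for-what-you-use pricing pays off.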