In my 15 years as a DBA and Data Consultant I've come to compile a formal questionnaire around all the questions I require in order to on-board a new SQL Server database implementation. Taking ownership over a purchased solution is always trickier than one developed internally because you're removed by at least one degree from the development team. Furthermore the solution was developed for a generic range of customers - agnostic of your environment and its idiosyncrasies. Therefore it's even more critical to get all the information you need up front - before a single server is built or hosting decision is made - in order to ensure you craft the appropriate environment to host the solution's database(s). Over-architecting is a waste of resources and can lead to supporting a dedicated instance of SQL Server when one is not required. Under-architecting the solution causes pain points that can last the lifespan of the application.
Here are the top 20 questions I ask when it comes to taking ownership over integrating a new SQL Server-based solution:
1. What is the contact information for the technical representative from the software vendor?
If something goes bump in the night I want to ensure my client(s) have a route towards resolution if it's not in the database layer and therefore not my responsibility. Additionally, if I encounter an issue with performance or structure and need to get a developer or technical support involved I want that information too.
2. What is the internal level of support / level of service / expected uptime / RTO for this application?
I need to know what the expectations are from the end users and my clients for this application in order to architect the hosting plan accordingly. Answering this question leads me to make decisions around the storage platform, SQL-centric high-availability options as well as virtualization decisions and perhaps implementation of technology such as Site Recovery Manager if supported in the environment I'm working with for this particular engagement.
3. Are there any dependencies on enterprise edition features of SQL Server?
Certain features exist only in Enterprise Edition and those features fluctuate between versions of SQL Server. Furthermore there are resource limitations in lesser editions of SQL Server that may come into play depending on the requirements from the product being implemented. For example, you're not going to be able to take full advantage of that 256GB of RAM on that proposed server if you install Standard Edition SQL Server since there is a cap at 128GB of RAM for that edition (currently.) Ultimately I need to know if we can implement a lesser edition of SQL Server or are we locked into an Enterprise Edition install. Depending on the existing Software Assurance arrangements within the organization I'm working with this may not be an expensive endeavor. If I'm working with a SQL-dedicated VMware cluster where I've licensed the physical hosts for SQL Enterprise Edition then I technically have access to an all-the-Enterprise-VMs-You-Can-Eat situation until those per-core licenses are consumed. In that case SQL Enterprise licensing is a fixed cost for the organization and those licenses sit idle awaiting implementation. A current list of feature differences between editions can be found here.
4. Which versions of Microsoft SQL Server are supported?
While we're talking SQL Server edition we should also be talking about SQL Server version. Many, if not most healthcare concerns for instance are locked into SQL 2008 or 2008R2 due to FDA regulatory concerns. It's important to understand what version of SQL Server is supported and just as critical to take feature sets for that/those version(s) supported when it comes to high availability or edition limitations for that particular version/edition combination.
5. Is virtualization supported?
It's surprising that the answer to this is "no" anymore but it still comes up as a limitation occasionally. In some of those cases it's possible to turn this into an educational opportunity with the software development company and sway their opinion. Sometimes they are unswayed and have their reasons they stick to. In some cases virtualization is not the best fit but I want to know if it is at least an option if the solution fits.
6. Are there any other SQL-centric products we need to include in the final architecture?
Does the implementation require SSAS, SSIS, SSRS, Change Data Capture (CDC), Auditing, Replication, Full Text Search, CLR or Filestream? If so I want to know that up front. Perhaps all the information I've collected tells me this database can be hosted on a shared instance of SQL. Then I find out after the fact that it needed a local install of SSIS because of a limitiation in the development process and need to alter either the existing shared environment or provision a new server. I want to understand the landscape for all the components required on the local SQL Server as well as if I need to employ any of these products elsewhere in the data center.
7. Is there anything from a security perspective that would not allow this solution to play friendly in a shared instance of SQL Server?
Speaking of shared v. dedicated SQL instances, security plays as strong as a role in that determination as does resource allocation. I specifically ask all of the following questions around security when determining that facet of the install:
- Distribution- Can the application components be separated from the database layer or is there a dependency on the application and database being co-hosted on the same server?
- Exclusivity Requirements - Can the database(s) for this solution be hosted on a shared/consolidated SQL instance?
- Logins/Users - What are the server role requirements for the end users and for vendor technical support? If there are any server role requirements it's a cue to me solution requires a dedicated SQL instance. I don't want any user of Database X having the ability to affect Database Y (other than through resource consumption which can be controlled via Resource Governor.) Additionally, I want to know what rights are to be extended to the users in the database(s). Ultimately I want to follow the model of least privilege and only assign the minimum rights required. I will settle for assigning db_datareader, db_datawriter and execute permissions to any stored procedures and functions deemed necessary as a maximum. I will always fight dbo or db_owner as it allows for the ability to perform tasks that could conceivably take down the SQL instance for everyone.
8. Is this database storing (or processing) privileged data?
Patient Health Information (PHI), credit or financial information, salary data, Social Security Numbers.. these are valuable bits of information to have and there are barbarians at the firewall trying to get it. Depending on the type of data, your geographical location and industry you have standards to maintain for hosting this information and those standards impact how the data is protected and hosted and that even has different levels of requirements when dealing with virtualized environments. The answer to this question impacts how and where your data is stored, whether encryption is employed and even comes into play if the application is just processing the data temporarily - whether the data is stored inside the SQL database or not.
9. Will there be interfaces to or from other systems in the environment?
If the answer to this is "yes" then I want to know what those processes are that are making that happen: ETL, HL7, APIs; are we expected to build SSIS packages to make these systems communicate? Are we being provided the necessary code to build these bridges? What security is required for these processes? All these questions and many more come into play when we have to make systems talk to eachother.
10. What are the vendor's CPU requirements for the implementation?
When it comes time to look at vendor resource guidelines I'm already prepared to be fed a bucket of lies and guesses. When I ask about CPU I also want to know what chipset the resource recommendations are based upon because the vendor may state they need 4 CPUs allocated but this is based upon tests conducted four versions of the product ago against a Core i5 Lynnfield chip and my client's environment is standardized on Core i7 Sandy Bridge. I need to be able to translate their requirements to the correct microprocessor we'll be implementing against rather than old specifications so as to not over or under allocate CPU. Over-allocating CPU is doubly critical when considering core licensing for SQL Server. I also need to know if this figure includes the overhead for CDC or compression if applicable.
11. What are the RAM requirements for this database implementation?
I need to know what the vendor requires for RAM allocation for this product and whether that figure includes overhead for an application footprint if the application needs to be co-hosted with the database(s). Likewise if SSAS, SSIS, etc. are involved.
12. What are the expected/necessary iOPS to allow the application/database to keep up with read/write load?
I don't want to get calls in the middle of the night because the "database is slow." Is this database a "heavy hitter" when it comes to iOPS requirements? If so we have options up front in the architectural design phase - tier 1 storage, creative RAID strategies, dedicated LUNs and SSDs are hardware options that are easier to implement before going live. Splitting IO onto different files and file groups is also an option. If compression is supported that may also help with IO pressure. We need to know before we go live though - rather than find out afterwards.
13. Speaking of storage, how much storage do you think we're going to need to provision?
This means I need to know a breakdown of the following:
- Initial data file(s) size and projected annual growth based upon the planned usage that has been (hopefully) a topic of discussion with the Project Manager.
- Same question, but in regards to the transaction log for each database.
- Amount of tempdb consumption we should expect from this implementation.
- Should we expect paging-out to the page file? (I'd hope not.)
14. Are there any custom SQL Server Agent jobs to be integrated?
The answer to this question impacts possible security decisions to allow for user rights assignments to msdb. It also could impact performance and load and we'd want to ensure we have a good look at the associated code as well as audit to ensure it does not try to overstep the bounds of the database(s) being installed for this solution.
15. Is there a requirement for a non-default collation?
Collation settings determine the code page that is used to store non-Unicode data in SQL Server and how SQL Server sorts and compares characters that are stored in non-Unicode data types. Regional settings also come into play when setting a collation. While a database can have a different collation than the master database the wisest thing is to set up the database with the correct collation required from the beginning. 9 out of 10 times the database will most likely match the default collation but you need to know up front to avoid a potential rebuild after the fact.
16. What environments need to be created for this solution?
It's assumed we will need to build and support a production environment but what about test, stage, Q/A, development and such? Furthermore do any of these need to be managed permanently or will they be destroyed after go-live? Each needs to be included in the resource tally and each comes with a price tag for the server, storage, licensing and my time to build the environments.
17. What are the expectations around disaster recovery?
Do we need to back up each environment? I would assume production is a given but what about the others? I would also expect we'd want to run in Full Recovery to allow for point-in-time recovery but perhaps the data is only loaded once daily and not modified after? Simple Recovery would be fitting there. The DBA is typically going to be responsible for the recovery strategy and backups are critical to the success of recovery. Understanding what is to be required is vital
Just like any situation questions always raise more questions once answered but these are the critical conversation starters I employ whenever facing a new project involving SQL Server.