When I started to learn English Query, I often knew the question I wanted English Query to answer, but I wasn't sure how to make English Query understand. Understanding the various relationships that you can define among entities was the key to giving English Query the proper information to answer my questions. This article looks at the relationships that you can define within English Query, how to define them, and the benefits you derive from their use. English Query supports the following relationships: name/ID, trait, preposition, adjective, subset, verb, and command. "English Query from the Beginning," June 2000, uses the SQL Server Northwind database to explain the basics of how to build and test an English Query model. This article also uses the Northwind database and the sample English Query model for Northwind that we built in the earlier article.
Although your initial goal in an English Query project might be to answer the most common questions your users will ask, the ultimate goal is to identify and model all the relationships between entities in your database. You want to have a semantic model that represents the business you run. During the final keynote address at the Professional Association for SQL Server (PASS) conference in October 2000, the public got a peek at some of the future benefits that can result from having a semantic model of your business. Using an English Query semantic model, a Microsoft demo automatically generated Web pages to display all the data the model represented and exposed the data as a Microsoft Outlook Web part. These HTML pages included such relationships as the parent-child relationship between the Orders and Order Details tables in the Northwind database. Developing a semantic model of your business has powerful implications. Imagine all the things you might be able to generate automatically if you had such a model. You might be able to automatically generate the physical data model, Visual Basic (VB) forms to implement data entry, and (as the previously mentioned demo shows) Web-based forms. Right now, English Query is the way you make a semantic model, which can be the basis for more than just English Query.
To create the complete semantic model, you must identify all the relationships among all the entities in your database. You can expose these relationships by asking questions that your users might ask. Every question you ask is actually a proxy—an example that represents a class of questions that users might ask. When you model the relationship or phrasing that lets English Query answer your specific question, English Query will probably be able to use that new relationship or phrasing to answer an entire class of related questions.
After you've defined the tables and entities to English Query (as we did in "English Query from the Beginning"), you must define relationships. Relationships describe how entities relate to one another. Although you might have defined the customers and products entities, English Query has no inherent understanding of how these two entities relate to one another. For example, English Query doesn't know that customers buy products until you define the relationship stating that fact.
As I mentioned in "English Query from the Beginning," the English Query 2000 Project Wizard automatically generates many of the relationships that exist among database entities. The Project Wizard can generate an even greater number of relationships for OLAP databases because an OLAP database schema's hierarchical design gives clues to many of the relationships. You can also add new relationships that the wizard doesn't automatically generate. In addition, you can change or delete any relationship regardless of whether the wizard created it automatically or you created it manually.
An entity is almost always involved in at least one relationship. A phrasing describes a relationship between entities; because two entities can be related in more than one way, each set of entities might have several phrasings. You might prefer to model all the possible relationships with all possible phrasings, but that task might be too large. To limit the scope of the project, think about the most likely questions your intended audience might ask. Consider the ways that your users might ask these questions, then make a list of the questions. Your model should include the relationships and phrasings necessary to answer the target questions. Let's look at English Query's different phrasing types.
Almost every entity has a name or ID. You use the name/ID phrasing to let English Query know which entity (column) contains the name or ID of the target entity. The Project Wizard will discover almost all of the relationships between entities and their names.
If you are going to use your own English Query model to follow along with this article, I recommend that you begin by creating a temporary directory on your hard drive. Then, copy all the files from the C:\Program Files\Microsoft English Query\Samples\Models\Northwind directory into your temporary directory. Now you can experiment with the Northwind model without affecting the original sample model.
Now, run English Query and open the Northwind model (northwind.eqp) in your temporary directory. In the Project Explorer window, open Northwind and double-click northwind.eqm. To see all entities and relationships, select the Semantics tab of the resulting window. To see the details of a relationship, double-click it to open the Relationship dialog box, select the phrasing of the same name, and click edit.
Here are some examples of name/ID relationship phrasings.
- "Employee names are the names of employees." This phrasing defines how the entity employee_name is related to the entity employee. In this case, the employee_name entity refers to the firstname and lastname columns of the Northwind database's Employees table.
- "Employee IDs are the IDs of employees." As with the employee name, this relationship tells English Query that the employee_ID entity is related to the employee entity.
These phrasings help English Query respond to the requests "List the employee names" and "Show the customer names." If you ask English Query the latter question and get the response I don't understand the word "customer" in the phrase "customer name," you're probably missing a name/ID phrasing that tells English Query that "customer names are the names of customers." To add a new relationship, you can right-click an entity involved in the relationship and select Add Relationship from the context-sensitive menu, or you can select Add Relationship from the Model menu item.
You can also ask about a specific entity—for example, "What do you know about Around the Horn?" (Around the Horn is a company name in the Northwind database's Customers table.) To enter a test question, select Start from the Debug menu to enter debug mode. Type your request into the Query window, and press Enter.
The Analysis tab in the same test dialog box, which Figure 1 shows, reveals that English Query uses the name/ID phrase to answer the question. The Analysis tab displays all the entities and relationships that English Query used to resolve your query.
You use a trait phrasing to describe an entity's attributes. For example, you could tell English Query that:
- Employees have birthdates
- Employees have Social Security numbers
- Employees have phone numbers
- Employees have names
These trait phrasings let English Query successfully answer such requests for information as:
- "List the employees and their birthdates."
- "List the employees' phone numbers and birthdates."
- "Show Mary Smith's phone number."
- "What is the phone number of Mary Smith?"
- "What is John Jones' Social Security number?"
- "What is the Social Security number of John Jones?"
- "What Social Security number does John Jones have?"
Traits are simple to understand and use. In fact, if you use other phrasings for a group of entities, English Query automatically infers the trait phrasing. For example, if you define the name phrasing "Employee names are the names of employees," English Query understands that employees have names. So you need to use trait phrasing only if you've used no other phrasings for the entities involved.
According to SQL Server Books Online (BOL), "Words commonly used as prepositions include about, above, across, before, below, concerning, down, for, from, in, like, of, on, over, past, regarding, since, through, till, to, toward, under, until, with, and without.
Phrasal prepositions include according to, along with, as to, because of, due to, in case of, in place of, instead of, up to, and with regard to."
You write a prepositional phrasing in the format Subjects are preposition object—for example, "Books are about subjects," "Employees are in departments," and "Patients are on medications." You can add as many as three additional prepositional phrases as well; for example, in "Employees are on projects (for customers)(at locations)(on contracts)," the parenthetical phrases are the additional prepositional phrases. You add these phrases one at a time as you would any other phrasing. Adding the phrase "Employees are on projects (for customers)(at locations)(on contracts)" lets English Query answer the following questions:
- "Who is on project X?"
- "Which employees are on project X?"
- "Who is on a project for customer Y?"
- "List the employees on a project on contract Z."
- "Who is on project X at location Y?"
Using additional prepositions lets you further qualify a question that you ask English Query. You might develop a hierarchy of prepositions when dealing with locations, for example. A hierarchy such as "Customers are in cities, cities are in states, states are in countries, countries are in continents" lets you create questions or requests such as:
- "How many customers are in England?"
- "List the customers in South Carolina."
- "Show the customers in North America."
The Project Wizard would probably discover this kind of hierarchy automatically for an OLAP database because the hierarchy is part of the data model that the Project Wizard uses.
We use adjectives all the time, but many of us never notice them, even when we talk about old books, good customers, or lazy employees. If you provide the right phrasings, English Query allows questions that use adjectives. The three types of adjective phrasings are Single adjective, Entity contains adjectives, and Measurements.
Single adjective phrasing. The first and easiest way to add a phrasing is simply to name an adjective. On the Semantics tab, right-click the employees entity and choose Add Relationship. To use the adjective old as it relates to the employees entity, from the GUI, select Single adjective, and enter the word old. Select OK. If you leave it at that, English Query will apply the adjective old to all employees. Unless you restrict the circumstances under which a relationship is true, the relationship always applies, so all employees would be old employees.
Now is a good time to examine the Relationship dialog box's Database tab, which lets you add a phrasing. When you select the check box This relationship is true only when the following SQL condition is true, you can add a Boolean expression which, when true, signifies that an employee is old. Add birthdate < 'Jan 1, 1960' as the condition to define old employees as those employees whose birthdate is before January 1, 1960. You can now ask such questions as "Who is old?" and "Which employees are old?" (After you experiment with this relationship, delete it from the model.)
Entity contains adjectives. Because many adjectives might describe an entity, the adjectives will likely be attributes stored in the database. Perhaps gender, race, education level, and so on are all codes in the Employees table. The gender column contains the values M and F, but people might refer to gender by the words man, male, woman, or female. If the database has a table that maps the gender codes to the gender names, English Query lets you include these adjectives in the model.
In the Northwind example, let's add a table called genders and add a gender column to the Employees table. To make the changes to the Northwind database, run Listing 1 from Query Analyzer. With your temporary project open in English Query, select Refresh Tables from Model. On the SQL tab, notice the new field gender under the dbo.Employees table and the new table Genders with its two fields. To add new entities and relationships that incorporate the new Genders table and the new gender field into the model, select Create Semantics from the Model menu item to run the Project Wizard. Clear all the check boxes in the dialog box except those that apply to the new items you added to the database, as Figure 2 shows. Because the Genders table doesn't have a primary key constraint in the SQL Server database, you need to tell English Query that the lookup key is the gender_cd field. On the SQL tab, right-click the table dbo.genders, and set the key by right-clicking the empty block to the left of the gender_cd field and selecting Primary key.
Now your database has an entity called genders. The attribute gender of the Employees table contains the value M or F, denoting the employee's gender. Your users will more likely want to "List the employees who are male" than "List the employees whose gender is M." English Query needs to use the lookup entity named genders to discover that M = Male and F = Female; the table (entity) called genders has two columns (entities) called gender_cd and gender_name. Gender_cd contains the codes M and F, and gender_name contains the words male and female.
Now, you need to add an adjective phrasing to the relationship "Employees have employee genders" so that English Query's semantic model will be accurate and responsive to employee gender questions. On the Semantics tab, double-click the appropriate relationship. Select Add, then select Adjective Phrasing, OK. In this adjective phrasing, the adjective type should be Entity contains adjectives, with a subject of Employees, and employee genders is the entity that contains the adjectives, as Figure 3 shows. Because the employee genders entity merely contains the codes, and you want to answer questions that include the words man and woman, a code conversion needs to occur. On the right side of the dialog box, click Associate English Values.
Select the Read English from table check box that Figure 4 shows. Specify that the Table is the genders table, the Word field is the entity that contains the English word—in this case, gender_name—and the Value field specifies the matching codes, gender_cd. Now English Query can successfully answer questions or requests such as
- "Which employees are men?"
- "List the female employees."
Measurements. Measurements phrasing, another type of adjective phrasing, fits well with the employee age example. The problem with using Single adjective phrasing to describe old employees is that the date is hard-coded. To be considered old, employees must have a birthdate earlier than January 1, 1960, so in 2030, the only "old" employees by this definition would be older than 70—probably no employees would fit into this category. To keep up with the passage of time, instead of hard-coding a date, you can choose an age limit that differentiates between young and old employees. As time progresses, employees will move from the young category to the old category automatically. This approach is more flexible than comparing against a specific date.
The Northwind sample data model includes this entire example; the relationship is named Employee ages indicate how old employees are and the entity is employee ages. To use a measurements adjective phrasing, you need a measure. In this case, the measure is the employee age entity, which maps back to the Age field of the Employees table on the SQL tab. However, the Northwind Employees table has no "age" column. On the SQL tab under dbo.Employees, right-click the Age field and select Edit. Then select the Advanced tab. As Figure 5 shows, the Age field is a computed field whose value is datediff(yy,birthdate,getdate()).
Figure 6 shows the adjective relationship Employee ages indicate how old employees are. To view this relationship, double-click it on the Semantics tab. Adding the concept of employee age and oldness to the model lets you enter such questions or requests as
- "Which employees are old?"
- "Who are the old employees?"
- "List the young employees."
- "Which employees are older than Anne Dodsworth?"
- "Who are the 3 oldest employees?"
- "What is the average age of employees?"
- "Who is older than the average age?"
- "Which employees are older than the average age of young employees?"
- "Who is young?"
English Query will use the employee age values of 60 and 35 for every questioner because the relationship has the age hard-coded, as Figure 6 shows. This means employees whose age is 60 or more are "old," and employees whose age is 35 or less are "young." If you want to let each individual define what "old" and "young" mean, leave the numeric threshold values blank, and English Query will ask each questioner for a clarification about the values it should use, as Figure 7 shows. You can go back and edit this relationship to leave the values blank and test again.
The subset relationship phrasing lets you create and refer to subsets of entities. For example, if you have an entity named mountains, a subset phrasing might tell English Query that "Some mountains are volcanoes." Other examples of subset phrasings are
- Some employees are programmers
- Some employees are contractors
- Some books are bestsellers
If you have difficulty deciding whether a phrasing should be an adjective or subset (noun) phrasing, think about how you want to refer to the object of the relationship. For example, consider the relationship that states "Some employees are old." Referring to "old employees" and asking, "Which employees are old?" uses "old" as an adjective to describe the employee. However, referring to old employees as "olds" (as in "list the olds") is a noun usage, which necessitates a subset phrasing. Now consider "Some employees are contractors." If you want to refer to contractors as a noun without the parent entity name, as in "List the contractors and their addresses," you need a subset phrasing.
With subset phrasing, you can add as many as two additional prepositional phrases, as Figure 8, page 60, shows. Some examples are
- Some books are novels (about politics)
- Some televisions are portables (with remote)
- Some employees are contractors (for customers)(with projects)
When defining a subset, you can specify one subset word or you can use an entity that contains either the subset word or a code. For example, you might have an employee_type_cd field that contains the letter M for managers. If you want English Query to list the managers, you would select the employee_type_cd as the "entity that contains category values" and select Associate English values to map the code M to the word "manager" (as in the "Employees have genders" adjective phrasing example).
The Pubs Sample model that comes with English Query contains an example of how to implement "bestsellers" as a subset phrasing. If you use a subset phrasing to identify some books as bestsellers, English Query can answer the following requests:
- "List the bestsellers."
- "Which books are bestsellers?"
- "Which bestsellers are about business?"
As another example, let's look at "Some orders are late." You need to decide how to quantify late orders, which you've defined as orders that have a shipped date that is later than their required date. Your business simply refers to these orders as "lates." Go into debug mode, and ask English Query to "List the lates." English Query's response is "Sorry, I didn't understand that." Now enter "List the late orders." English Query interprets that request incorrectly as "List the orders that have a good purchase date." You can fix both problems by adding the appropriate relationships.
To add a relationship for late orders, right-click the Order entity, and select Add a New Relationship. Let's add an adjective phrasing first so that English Query can correctly respond to "List the late orders." Select Add, then select Adjective Phrasing. Select orders as the subject and late as the single adjective that describes the subject, then click OK. Now add two subsets, both with orders as the subject. Give the first subset late as the subset word, and give the second subset overdue as the subset word, as Figure 8 shows. Remember to always use the singular form for the subset word.
Before you exit this dialog box, look at the Database tab that Figure 9 shows, which describes the orders that are considered late. Because not all orders are late, you need to add a WHERE clause to select the late orders. Select This relationship is true only when the following SQL condition is true, and add
getdate() > RequiredDate AND ShippedDate IS NULL
to the field. When this condition is true, an order is late. Note that the object names (RequiredDate and ShippedDate) are the SQL Server column names, not the English Query entity names. After recompiling, English Query can correctly respond to "List the late orders," "List the lates," and "List the overdues."
If you can use verbs to describe a relationship between entities, a verb phrasing will let English Query properly understand the relationship. You'll probably use verb phrasings frequently. Such phrasings as
- Employees sell products (to customers)
- Students study courses
let English Query answer the following questions:
- "Who sells Chai to Belk's?"
- "What does Bill Mason sell?"
- "What is sold to The General Store?"
- "What does Mary study?"
- "Who studied English?"
As with other phrasings, you can add as many as three prepositional phrases with verb phrasings. Verb phrasings can be of five types, as Figure 10 shows. The Northwind demo model contains many good examples of verb phrasings.
Here's a tip: Use the active voice instead of the passive voice when defining the verb phrase. If you're entering a verb phrasing that has one or more prepositions and is missing a subject or object, you might be writing (or thinking) in the passive voice. For example, you could better word the passive phrasing "Products are sold (to customers) (by employees)" actively as "Employees sell products (to customers)." When you use active voice to implement verb phrasings, English Query can answer questions issued in either the active or passive voice.
Command Relationships and Command Phrasings
Usually, English Query converts English statements into SQL statements that it uses to select information from the database. However, you might need to let users make requests other than requests to display information. These requests might include any of the normal transactional operations your system allows, such as changing data or requesting a form to allow data additions or changes. Command relationships let you create such special instructions as:
- Raise Bob Smith's salary by $5000.
- Run the nightly batch at 10 p.m.
- Add an employee. (This request might return the Add employee dialog box.)
A command is a type of phrasing, so you can add it the same way you add any other phrasing. You must give command relationships names, which English Query will return when the command relationships are invoked. Your application can then respond however you want it to—for example, by calling a stored procedure or running a program. Let's say you want to let a user give the command "Send 10 boxes of Alice Mutton to ALFKI." English Query needs to understand this command and return the information necessary for your application to send the product Alice Mutton to the customer whose customer ID is ALFKI.
To create the appropriate command relationship, open the Add relationship dialog box by right-clicking the Product entity, then add the Customer entity to the relationship. Next, select Add Phrasing, and complete the Command Phrasing dialog box, which Figure 11 shows. Notice the Sentence type drop-down list box. Figure 12 lists the three sentence types English Query allows as Command Phrasings.
Now you need to account for the units—in this case, 10. Double-clicking the entity you want to provide a quantity for—in this case, products—produces the role dialog box that Figure 13 shows. Selecting the first check box requires you to provide the product being purchased. Selecting the second check box lets you specify the quantity of product to send to the customer. You can also enter a prompt for English Query to display if the user's request doesn't specify an amount.
Finally, in the Command tab that Figure 14, page 62, shows, you can give the command a name and see the parameter names and IDs. Parameters come in two types: fields and values. Complete the form as Figure 14 shows, and click OK. In this example, parameter IDs 1 and 2 are fields. English Query returns a SQL query for each field. Executing query 1 returns the product ID, and query 2 returns the customer ID. Parameter ID 0 in this example is a value and is the quantity associated with the user's command. If a user requests "Send 10 of Alice Mutton to ALFKI," the value of parameter ID 0 is 10. As Figure 15, page 62, shows, your application now has the name of the command (purchase), the quantity parameter, and two SQL statements that will return the product and customer ID, which is enough information to call a stored procedure or business object to complete the sale.
Command phrasings let you add custom functionality, but the English Query Help file contains a rather long list of verbs to avoid when using command relationships. Before you add a command relationship, make sure you check this list. (Search for "verbs to avoid" to find the information.)
The Suggestion Wizard
Now that you know how to do things the hard way, you'll be pleased to know that English Query gives you an easier way to add new relationships while testing questions. When you're testing an English Query model, Microsoft gives you access to a tool—the Suggestion Wizard—that can help you add relationships. In the screen in Figure 15, the fifth icon on the toolbar looks like a magician's wand. If you ask a question that the English Query semantic model doesn't understand, select this icon. The Suggestion Wizard prompts you for enough information to create the proper phrasing. The wizard isn't 100 percent accurate, but it is a great help when you're getting started.
For example, I asked, "Who bought Chai?" in a small test model. When English Query couldn't answer the question, the Suggestion Wizard displayed the dialog box that Figure 16 shows. After I added the appropriate relationships in response to the wizard's suggestions, English Query could answer my question.
You can even pass questions to the Suggestion Wizard in batches. You might capture all unanswerable questions in your production environment, pass them to the Suggestion Wizard, answer the questions it asks, and constantly improve your model over time.
English Query is an amazingly flexible technology that can bring great power and potential to your applications. With a small amount of work, you can get wonderful results. Often, when English Query doesn't understand a question, you merely need to add a missing relationship to remedy the problem. Both inhouse and vertical applications can benefit from this technology, which will grow in functionality and popularity as time progresses.
Let me offer some advice: Start using English Query now. Don't wait another week. I believe English Query might be one of those important technologies that could change the landscape. English Query is just turning the corner to big usability. Soon, lots of companies will be jumping on the English Query bandwagon—why don't you be the leader? Get on now, and when the rest show up, give them a helping hand! If you're already using English Query, please share your experiences. You can contact me at [email protected] I look forward to hearing from you.