Pondering the connections made by Office Delve

On September 11, I reported that I consider Office Delve “to be possibly the most interesting new technology introduced into Office 365 since its launch.” You might consider that Delve has not had much competition for such an elevated status, but that’s beside the point. The thing is that most Office 365 tenants now have Delve and many are probably wondering what to do with it.

The first thing to realize about Delve is that it is simply a pretty interface based on queries executed against the Office Graph, which is the technology that constructs and understands the connections that link people and information. Delve stores no real data of its own. It depends on the results returned by the Office Graph, which in turn depend on data held in sources such as SharePoint or OneDrive for Business. Hopefully, some day that list will include Exchange.

Behind the scenes, Office Graph depends on the indexes created and maintained by the Search Foundation technology bought by Microsoft in the "FAST" acquisition in 2008. According to Microsoft, 95% of new information (adds or modifications) shows up in the index within an hour, which then makes it available for Office Graph and Delve. You can think of Office Graph plugging in to analyze the search indexes along with other "signals" (like important correspondents maintained by Exchange as People View and documents that have been worked on in the last three months) to construct its connections using machine learning algorithms.

Proof that Delve is a way of presenting information based on searches can be seen in the interesting work described by Richard diZerega to build the “Office Bubbles” application using Graph Query Language (GQL) queries executed via the SharePoint REST API to access the Office Graph. Another approach, which uses Office Graph queries to incorporate Delve-like functionality into the SharePoint search center, is described in this post.

Examining GQL a little closer, we learn some of the thinking behind Office Graph. “Actors” such as people have “Edges” or connections with objects like documents. In graph terminology, the Actors are source nodes and the objects are target nodes. Edges can represent a single action, such as someone viewing a document, or multiple actions ("trending"), which might indicate that an object is of interest to a particular user.

Edges have properties that define how the connections are made, such as the organizational relationships described below. The properties also include timestamps and weighting; the weighting is interesting because it indicates how strong the Office Graph considers the connection to be. Queries can sort returned results by these properties so that, for instance, the most important documents or the most recent documents are shown first.
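To make the model concrete, here is a minimal sketch of actors, edges, and sorting by edge properties. It is my own illustration, not how the Office Graph is implemented: the Edge class, its field names, the rank function, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Edge:
    """Hypothetical edge linking an actor to a target object."""
    actor: str       # source node, e.g. a person
    target: str      # target node, e.g. a document
    action: str      # kind of connection, e.g. "Viewed" or "Modified"
    weight: int      # how strong the graph considers the connection
    time: datetime   # when the action happened


def rank(edges, by="weight"):
    """Return edges ordered strongest-first ("weight") or newest-first ("time")."""
    if by not in ("weight", "time"):
        raise ValueError("by must be 'weight' or 'time'")
    return sorted(edges, key=lambda e: getattr(e, by), reverse=True)


# Made-up sample data for illustration only.
edges = [
    Edge("kim", "budget.xlsx", "Modified", 12, datetime(2014, 9, 1)),
    Edge("kim", "roadmap.pptx", "Viewed", 3, datetime(2014, 9, 20)),
    Edge("kim", "offsite.docx", "Viewed", 7, datetime(2014, 9, 10)),
]

by_weight = [e.target for e in rank(edges, by="weight")]
by_time = [e.target for e in rank(edges, by="time")]
```

Sorting by weight puts the document the graph considers most important first, while sorting by time puts the most recently touched document first.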

GQL calls specify the Actor by using a unique identifier for the person or persons that you’re interested in (or the special “ME” identifier for the logged-in and authenticated user). Together with the Actor, you pass an Action to state the kind of connection you want to examine, such as the documents seen by the actors in the last three months, or indeed the documents that should show up on the user’s Delve home page.
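As a rough sketch of what such a call looks like, the snippet below builds a GQL expression and wraps it in a SharePoint search REST URL. The numeric action identifiers are quoted from memory of Microsoft's GQL documentation and should be treated as assumptions to verify. The helper names (gql, search_url) and the contoso tenant are hypothetical, and nothing here sends a request or handles authentication.

```python
from urllib.parse import quote

# Assumed action identifiers, recalled from Microsoft's GQL documentation.
# Verify them against the current documentation before relying on them.
ACTIONS = {
    "PersonalFeed": 1021,  # items for the user's Delve home page
    "Modified": 1003,
    "Viewed": 1001,
}


def gql(actor="ME", action=None):
    """Build a GQL expression such as ACTOR(ME, action:1021)."""
    if action is None:
        return f"ACTOR({actor})"
    return f"ACTOR({actor}, action:{ACTIONS[action]})"


def search_url(site, graph_query, querytext="*"):
    """Build a search REST URL that carries the GQL expression as a property."""
    properties = quote(f"GraphQuery:{graph_query}", safe="")
    text = quote(querytext, safe="*")
    return (
        f"{site.rstrip('/')}/_api/search/query"
        f"?Querytext='{text}'&Properties='{properties}'"
    )


# Hypothetical tenant URL; the query asks for the signed-in user's home page items.
url = search_url("https://contoso.sharepoint.com", gql(action="PersonalFeed"))
```

The search query text stays a plain wildcard here; the GQL expression travels separately in the Properties parameter, so the graph supplies the connections and the search index supplies the items.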

Interestingly, some of the action types are linked to organizational constructs and, I assume, depend on a well-populated Active Directory (in the case of Office 365, Azure AD). These types are OrgColleague (someone who works with the user), OrgDirect (anyone who works for the user), OrgManager (the user’s direct manager), and OrgSkipLevelManager. The last seems a curious name but the term “skip level manager” is often used in the U.S. to refer to upper management rather than direct line management. In Active Directory terms, I assume this would be the manager of your manager.
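Here is a small sketch of how these four relationships could be derived from nothing more than a manager attribute on each directory entry. This is my guess at the logic rather than anything Microsoft has documented: the MANAGER_OF table and the function names are hypothetical, and I am reading OrgColleague as "shares the same manager".

```python
# Hypothetical directory: each person maps to their direct manager.
MANAGER_OF = {
    "ana": "ben",
    "carl": "ben",
    "ben": "dee",
    "dee": None,
}


def org_manager(user):
    """The user's direct manager, or None at the top of the tree."""
    return MANAGER_OF.get(user)


def org_skip_level_manager(user):
    """The manager of the user's manager, if there is one."""
    manager = org_manager(user)
    return org_manager(manager) if manager else None


def org_directs(user):
    """Everyone whose manager is the user."""
    return sorted(p for p, m in MANAGER_OF.items() if m == user)


def org_colleagues(user):
    """People who share the user's manager (an assumed reading of 'works with')."""
    manager = org_manager(user)
    if manager is None:
        return []
    return [p for p in org_directs(manager) if p != user]
```

If the manager attribute is missing or stale for some users, every one of these lookups comes back empty or wrong, which is why a well-populated directory matters.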

All of the searches performed by Delve are constrained by permissions. In other words, if you don’t have access to a document, it will not show up in Delve. Inadvertent disclosure can still occur, because users might forget to protect a document against public view. In this case, Delve might reveal the document to other users. Of course, this is similar to leaving a confidential document unprotected on a file share or other repository. If you do that, then you can’t complain if someone finds it.
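The idea of permission trimming can be sketched in a few lines. This is only an illustration of the principle, not how SharePoint search enforces it: the visible_results function, the acl dictionary, and the "Everyone" marker for an unprotected document are all hypothetical.

```python
def visible_results(user, candidates, acl):
    """Keep only the candidate documents that the user is allowed to read.

    acl maps a document name to the set of principals who can read it.
    "Everyone" stands in for a document left open to public view.
    """
    results = []
    for doc in candidates:
        readers = acl.get(doc, set())  # unknown documents are hidden
        if user in readers or "Everyone" in readers:
            results.append(doc)
    return results


# Made-up permissions for illustration only.
acl = {
    "salaries.xlsx": {"hr"},
    "plan.docx": {"Everyone"},   # owner forgot to restrict access
    "notes.docx": {"kim", "lee"},
}

shown_to_lee = visible_results(
    "lee", ["salaries.xlsx", "plan.docx", "notes.docx"], acl
)
```

The protected spreadsheet never reaches the user, but the document left open to everyone does, which is exactly the inadvertent disclosure described above.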

While everyone can understand how human error can lead to exposure of an unsecured document in a file share, some wonder whether the “trending” behavior of Delve might lead to more cases where information is revealed without its owner realizing it.

For instance, in the support article explaining how documents show up in Delve, Microsoft explains that when a user makes frequent updates to a document, that document can be recommended to their manager. This is because the Office Graph observes the frequent updates and regards this activity as “trending” and because the manager is deemed to have a strong relationship with the user. Putting two and two together, Delve assumes that the manager might be interested in the document – and if that document is unprotected, the graph query will return it and the manager will be able to see it. The lesson here is clear – take explicit steps to protect sensitive information. But you’ve been doing that anyway – right?

At this point I think Delve is more interesting to large companies as these are the home of classic information silos. One part of the company doesn't know what another part knows and no one really wants to share. Delve won't help in sharing but it might just make information more accessible to all.

From the ongoing discussions in the Delve technical forum, I can see that people who have programmed SharePoint in the past can easily work with GQL queries and that the creative juices have started flowing. For instance, among the applications created by Mavention, a small Polish company, we find the first Windows Phone application (the "Document Finder") which displays the results of Office Graph Delve-like queries. It will be interesting to see how people use the Office Graph in their own applications and how Delve develops to accommodate different kinds of information (like Yammer and some types of Exchange mailboxes) and different ways of presenting information to users.

Follow Tony @12Knocksinna
