Tim Huckaby on the Microsoft Kinect for Windows SDK: Good News/Bad News for Windows Developers

Editor's Note: Welcome to .NET Rocks Conversations, excerpts from the .NET Rocks! weekly Internet audio talk show. This month's excerpt is from show 748, with Tim Huckaby, CEO and founder of Actus Interactive Software, creator of a suite of interactive, multi-touch kiosk and digital signage products, and InterKnowlogy. Richard Campbell and Carl Franklin chat with Tim about the possibilities of using Microsoft Kinect for Windows in building commercial applications.

Carl Franklin: Let us introduce Tim Huckaby. Called a .NET and smart-client industry pioneer by the press, Tim Huckaby is CEO of InterKnowlogy. Are you still CEO of InterKnowlogy?

Tim Huckaby: No. I thought I stepped down from the CEO role at InterKnowlogy about a year ago because the CEO role is really a crappy job. You know, I still own the company and do that strategic technical stuff and marketing. But I wanted to get back to my product roots, so I started this interactive software company whose first suite of products is interactive kiosk and digital signage software.

CF: The kiosk stuff is all WPF and services. What exactly are you using in that?

TH: We run in WPF and Silverlight, and the world demands iOS solutions these days. I don't know if you noticed.

CF: Yeah, I heard that.

TH: Yeah. So even InterKnowlogy builds software in iOS these days, which, of course, is challenging. And God help us, we'll have to build an Android client. But we believe the kiosk is more than just a 42-inch thing on the wall. It's also a smartphone and a variety of other form factors.

Richard Campbell: Is that really a kiosk at that point?

TH: No. You think of a creative name for it, and I'm all over it.

CF: It's a sign, or it's a video feed. I was in the bank the other day, and they have one of these in the bank. [It goes] from quick news bites about celebrities and things like that to local weather and traffic advisories, and then they switched to the rates on mortgages at that bank...all this while you're waiting in line to make a deposit.

TH: Right. So that's us, and I would call what we do an interactive, content-manageable, cloud-based assortment of digital and web-based assets. The short name for that is interactive kiosk.

CF: Okay. It's a very Microsoft-sounding title there. Digital signage, I think, works really well.

TH: Unfortunately, the kiosk thing is hard. It's a hard sell. When you just think of a kiosk, you think about a piece of hardware. But our real differentiator and, I think, the reason you asked me to come on is Kinect.

CF: Yeah.

RC: Well, and you said this before, most of the places where you put one of these interactive gizmos is not a place where I'm willing to touch anything.

TH: Yeah. So the use case I use most often, especially when I'm on stage because it resonates so well, is this: I live in Carlsbad, California, so I have to travel through LAX.

RC: Right.

TH: If you've been to LAX before, you know that it's one of the most disgusting places in the world. You do not want to touch anything when you're in LAX, let alone a computer screen that everyone else has touched.

RC: Right.

TH: But walking up to it and yelling at it, saying, "United Airlines Denver," and having your flight come up, or waving at it to browse the local restaurants in town, is a legitimate use case that everyone can get their arms around.

CF: Yeah. I think the problem with talking to it is just the noise problem, which, indeed, is a problem, not just in a noisy airport but even in a doctor's office where there's chatter in the background or music or something like that. It becomes very difficult to recognize things.

TH: You know, Carl, I might have agreed with you three months ago. But in the new bit, in the new device, the fidelity of that -- this is your world, not mine -- multi-spectrum microphone...

CF: Yeah, the microphone array.

TH: Dramatically better.

CF: OK, good. And this is exactly what I want to talk about -- GesturePak [for Kinect]. I'm counting on you to fill me in about what's in there.

TH: OK. And Richard and I have been strategically talking about how to deliver the good news and bad news here. So as you well know, your demo, your GesturePak, is such a great demo for me. It just resonates so well with developers.

CF: That's great.

TH: You show a little bit of code, and you can say, "Hey, look how easy this is." And then the Kinect team came out with their production SDK and broke every interface they have.

RC: Oh, man.

TH: And your app no longer runs. In fact, it's going to take a significant amount of work for you or for us, if you give us the blessing...

CF: Yeah.

TH: To put the damn thing back together.

CF: But do they still have an event that essentially updates 30 times a second and fires you all that data?

TH: Sure. But think of it like this. And I'm a good news/bad news guy on this. I mean, there's some just fantastic stuff that surprised us in the production bit, but it also broke everything, literally everything. So they introduced the -- I guess you'd call it -- peer-to-peer support for Kinect. That's mostly what broke every API.

CF: OK.

TH: Meaning you have to iterate through the Kinect devices that your software is talking to now.

CF: Right. You still had to do that before. I mean there [were] all the devices that were available, and you find the device.
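The enumeration pattern Tim and Carl are talking about -- walk the list of attached Kinect devices and pick one to talk to -- can be sketched outside the SDK. The real API is .NET, so this is only an illustrative Python sketch; the `Sensor` class, the `status` strings, and `first_connected` are made-up stand-ins for the idea, not actual SDK names.

```python
# Illustrative sketch of device enumeration (NOT the actual Kinect SDK,
# which is a .NET API): walk the attached sensors and pick the first
# one that reports itself as connected.

class Sensor:
    def __init__(self, device_id, status):
        self.device_id = device_id
        self.status = status  # e.g. "Connected" or "Disconnected"

def first_connected(sensors):
    """Return the first connected sensor, or None if nothing usable is attached."""
    for sensor in sensors:
        if sensor.status == "Connected":
            return sensor
    return None

sensors = [Sensor("A", "Disconnected"), Sensor("B", "Connected")]
chosen = first_connected(sensors)
print(chosen.device_id)  # "B"
```

The point of the breaking change, as Tim describes it, is that code written against a single implicit device now has to go through a step like this for every sensor it uses.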

TH: OK. Well, they broke the API to it, which means everything you do is now changed. So I was joking with them the other day. I shouldn't say this on your show, but you know, we're all a tight-knit community of 10 gazillion .NET programmers, right?

RC: Right.

TH: I was joking with the Kinect team, and the guy, I can't say his name out loud, but he goes, "Oh, why don't you just do a global search and replace and a semicolon?" That's not the case.

CF: Funny.

TH: So suffice it to say, InterKnowlogy is retrofitting as we speak, and in some cases, it's really painful.

CF: So, one other question. Because of the good news/bad news thing, is there gesture recognition built into it? Because I've seen some stuff on the web now that this is enabling gestures.

TH: Yeah. Well, it certainly is a lot harder to do that. It was really easy to do body tracking and do simple "Hello World" applications, and that is no longer the case.

CF: But there's no gesture recognition built into the SDK?

TH: Yeah. Sure there is. Well, what do you mean by that?

CF: Well, the SDK, the version that I had just passed through the data, and I had to write the stuff to recognize gestures. Is there a gesture recognition part to the SDK now?

TH: OK. I think we're getting into semantics here because what you're calling a gesture -- is it not true that you're just saying you measure one point, and then you measure another...?

CF: No, that's doing it by hand. That's what I'm doing. But is there a higher-level kind of "pause and we'll record it"?

TH: No, there's none. So, like the physical therapy stuff we do where a gesture is 25,000 points, and it's got to be absolutely accurate, that is a tremendous amount of work.
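The hand-rolled approach Carl describes -- measure one point, then another, and decide whether the live skeleton reproduces a recorded gesture -- boils down to comparing joint positions frame by frame within a tolerance. This is a minimal Python sketch of that idea, assuming joints arrive as (x, y, z) tuples in meters; the function names and the 0.1 m tolerance are illustrative, not anything from the SDK or GesturePak.

```python
import math

def joints_match(recorded, live, tolerance=0.1):
    """True if every live joint is within `tolerance` meters of the
    corresponding recorded joint -- the naive per-point comparison."""
    return all(
        math.dist(r, l) <= tolerance
        for r, l in zip(recorded, live)
    )

def gesture_matched(recorded_frames, live_frames, tolerance=0.1):
    """A gesture 'hits' when the live stream reproduces every recorded
    key frame, in order, each within tolerance."""
    it = iter(live_frames)  # shared iterator enforces in-order matching
    return all(
        any(joints_match(frame, live, tolerance) for live in it)
        for frame in recorded_frames
    )
```

At 30 frames a second and a skeleton's worth of joints per frame, Tim's "25,000 points" for a physical-therapy gesture is easy to believe -- and so is the tuning work, since the tolerance has to be tight enough to be clinically accurate but loose enough that real bodies still match.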

RC: Tim, I'm presuming that Microsoft did not break all the APIs because it's fun.

CF: Yeah, there's got to be a reason.

TH: In a way. Remember, we used to have an RD manager that used to tell us constantly, "Please don't speculate about what Microsoft is doing." And we would always speculate anyway.

CF: And we were always right.

TH: That's our job. One might speculate. Well, there are some things that appeared that you could draw some conclusions from. This macro mode is a total surprise. So maybe I should tell the audience that there are now two different Kinects. There's a Kinect for the Xbox. And then there's this new Kinect for Windows that is specifically designed for us .NET folks to build applications that run on Windows.

CF: And it's a new device.

TH: Yeah. I mean it looks the same, but it is hardware different, and it's most certainly software different. But we went through this whole TAP process and this beta process for months, maybe a year, and then they ship a production device just magically. We had no idea they were going to swap out the device. We also had no idea that we'd have some new features like macro mode. So try and think of a use case where your face would be six inches in front of the Kinect, or something would be six inches in front of the Kinect.

CF: Is that what macro mode is, a close-up?

TH: Yeah, up close. If you walked too close to the Kinect -- the Xbox one, excuse me -- it would lose you.

CF: Yeah, it freaks out.

RC: And too close was, like, three feet?

TH: Yeah, three feet. I believe that is what it is. So now you can get within three feet, and it can track you.

CF: Wow.

TH: Now three feet, you can't be doing any arm waving. Right?

CF: Right.

TH: So if you want to draw a conclusion, one might also speculate that there's some API support for facial recognition.

CF: Yeah, there's got to be. And fingers.

TH: Yeah. So if we're not supposed to speculate, then I'm doomed because to me that says this is going to Windows, and they're going to do the Kinect-based authentication, and Kinect is going to be in every computer, every Windows computer maybe a year from now. That's my speculation. I mean, why else would they do that? I cannot think of an up-close use case other than facial recognition.

RC: Still I'm wrestling with why they broke everything. Did they add new features in it?

TH: So the three of us have been working for and with Microsoft for over 20 years, and I cannot remember a beta program and an early adopter program where they threw so many surprises and broke so many things from the final beta to the production bit. I've been around a long time.

CF: Well, back to Richard's idea, they didn't do it for fun. They must be setting themselves up for another version that has features that are going to rely on this new architecture.

TH: Yeah, yeah.

CF: So you might even be able to speculate further by looking at the architecture that they're laying out -- which I'm at a disadvantage because I have not seen it yet.

RC: The other thing that's exciting to me is this idea of support for more than one Kinect per machine.

CF: Yeah.

RC: Let's just ponder that for a moment.

CF: One in each corner of the room.

TH: Well, you know, one of the great use cases we're working on right now for retail is this: you know how hard it is to triangulate the location of a person, especially indoors. We've done it here at InterKnowlogy with wireless access points, but that only pins you down to within about 10 meters. But if we get the fidelity out of the Kinect that we think we're getting, well, you're not even triangulating at that point. You are physically looking at people and pinning them down to, I would think, a two-inch resolution of where they are in a store. And if you knew exactly where they were, in retail at least, and knew what they were looking at, and then maybe noticed that they looked at the same thing at the same store in a different city, on the back of this thing you could upsell these people.

[Say] you've been looking at the Fender guitar in Seattle, and now you're in Connecticut looking at it in a Guitar Center. You want 20 percent off. That is a realistic future-of-retail scenario that, in fact, we are really hoping to build.
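The positioning idea Tim sketches -- skip triangulation entirely and read location straight off a depth camera -- reduces to simple geometry: a sensor at a known spot on the floor plan, facing a known direction, reports a person at some depth and horizontal angle, and that projects to a floor coordinate. This Python sketch is purely illustrative (the function names and the averaging-based fusion are assumptions, not anything InterKnowlogy or the SDK provides):

```python
import math

def person_position(sensor_xy, sensor_heading_deg, angle_deg, depth_m):
    """Project one depth-camera reading onto store-floor coordinates.
    sensor_heading_deg: direction the sensor faces on the floor plan;
    angle_deg: the person's horizontal offset within the field of view;
    depth_m: distance reported by the depth camera."""
    bearing = math.radians(sensor_heading_deg + angle_deg)
    sx, sy = sensor_xy
    return (sx + depth_m * math.cos(bearing),
            sy + depth_m * math.sin(bearing))

def fuse(estimates):
    """Average independent per-sensor estimates into one position fix."""
    xs, ys = zip(*estimates)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

# Two sensors facing each other, both seeing the same person 2 m away:
a = person_position((0.0, 0.0), 0.0, 0.0, 2.0)    # sensor at origin, facing +x
b = person_position((4.0, 0.0), 180.0, 0.0, 2.0)  # sensor 4 m away, facing back
print(fuse([a, b]))
```

With per-reading error from a depth camera measured in centimeters rather than the ~10 meters of Wi-Fi positioning, the two-inch resolution Tim mentions is at least plausible on paper; the hard part is calibrating each sensor's position and heading in the first place.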

CF: That scares me, dude.

There's much more! You can find the full interview at dotnetrocks.com/default.aspx?showNum=748.
