This is where I will explain the design decisions I make in the process of building things for the TiLAR project.
There are two main libraries that will interface with the Kinect and provide skeletal data. The first is the OpenNI/NITE stack created by PrimeSense, and the second is the Microsoft Kinect SDK.
When this project was started, OpenNI was chosen, if only because it was the only such library available at the time. There were rumors of an official SDK from Microsoft, but they went unconfirmed for several months. Since the Kinect SDK was released several months ago, I've taken a few looks at it to see whether it would work for our purposes, and I believe it could. However, since no pressing feature would yet make us switch, I haven't focused on it.
OpenNI is an open source stack for interfacing with generic Natural Interaction devices and protocols, originally created and released by PrimeSense. NITE is the closed source OpenNI module that handles skeletal tracking, gesture detection, and other related features. This stack is complemented by avin2's fork of the PrimeSense sensor driver, which is compatible with the Kinect.
The Kinect SDK is Microsoft's official library for developing Kinect applications on Windows.
This model is defined by a base entity that contains a set of other components. The base entity in this case is the Kinect class, which provides the low level OpenNI interfaces to the Kinect by way of the depth image, RGB image, user tracking, and skeleton data. The remaining components are KinectImage and KinectSkeleton, which provide the high level abstractions that external classes can use.
By breaking this functionality out into components, adding functionality is simplified: a clean new component can be created with the desired abilities. Each component has direct access to the raw data provided by OpenNI and NITE through the Kinect entity, but doesn't need to maintain this data itself. This allows multiple components to work off the same data set without repeating calculations or duplicating storage.
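The entity/component split described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class names Kinect, KinectImage, and KinectSkeleton come from the text, but the method names, data layout, and update flow are assumptions.

```python
class Kinect:
    """Base entity: owns the raw per-frame data the OpenNI/NITE stack supplies."""

    def __init__(self):
        self.depth_frame = None
        self.rgb_frame = None
        self.skeleton_joints = {}
        self.components = []

    def register(self, component):
        self.components.append(component)

    def update(self, depth, rgb, joints):
        # Store the raw data once; every component reads from here,
        # so no component duplicates storage or recomputes shared state.
        self.depth_frame = depth
        self.rgb_frame = rgb
        self.skeleton_joints = joints
        for component in self.components:
            component.on_frame(self)


class KinectImage:
    """Component: builds a display image from the shared depth/RGB frames."""

    def __init__(self):
        self.last_image = None

    def on_frame(self, kinect):
        self.last_image = ("image", kinect.depth_frame, kinect.rgb_frame)


class KinectSkeleton:
    """Component: exposes high-level joint positions to external classes."""

    def __init__(self):
        self.joints = {}

    def on_frame(self, kinect):
        self.joints = dict(kinect.skeleton_joints)
```

A new ability (say, pose detection) would just be one more component registered on the same Kinect entity, reading the same shared frame data.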
Originally, I had one large object that handled skeleton tracking, pose detection, hand tracking, image generation, and everything else. It quickly became too unwieldy to extend. Things that needed to modify the tracking state, store the last hand seen, or enable and disable pose watching all seemed to conflict with each other. But I didn't want every object that needed access to the Kinect to have to create its own set of Kinect callbacks and storage objects.
The current version of the Kinect library is designed to make the workflow fluid.
Calibration is now done online, rather than at a dedicated time as it was originally. The user can still calibrate when needed, but the workflow feels more fluid when tracking is lost and found repeatedly in a short time, and it better conveys what is going on during the process. The large drawback is that it doesn't provide as easy a way to indicate the status of calibration, or to alert the user to what they need to do to calibrate.
A new feature is saved calibrations. On the very first run of the program, the first calibration data is saved to disk; all subsequent calibrations use this base to calibrate much more quickly. The change was extremely noticeable: not only does it recalibrate faster, but when calibrating from a file it no longer requires the user to hold the 'Y' pose for the calibration to detect. The downside is that in the rare case when a calibration file is not present, the user must know to make the 'Y' pose, and must wait longer for the calibration to complete.
Another design addition was the ability to reset the targeting if it is ever lost. This was added because, during initial testing, there were many times when it was unclear who it was trying to target, or it was difficult to target a second person when one person had already been located, or it would begin tracking with the skeleton in a disjointed manner. By enabling a 'kill' switch in the targeting, it is easy to reset things when they aren't as expected. Combined with the online calibration and saved calibrations, this greatly improves usability.
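The targeting kill switch amounts to very little code. This sketch is hypothetical: the state names and methods are assumptions, shown only to make the reset behavior concrete.

```python
class Targeting:
    """Tracks at most one target user; the kill switch clears a bad lock."""

    def __init__(self):
        self.target_user = None
        self.watching = True

    def acquire(self, user_id):
        # Only lock onto a user while watching and no one is targeted yet.
        if self.watching and self.target_user is None:
            self.target_user = user_id

    def reset(self):
        """Kill switch: drop the current target and start watching again,
        so a wrong or disjointed track can be cleared immediately."""
        self.target_user = None
        self.watching = True
```

Once a wrong person is locked, `acquire` ignores everyone else; `reset` is the one escape hatch that returns targeting to a clean state.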
In the original incarnation, calibration occurred before anything else could happen. This, however, caused bugs when a target was lost, or when it tried to recalibrate while already in the calibration state. Its advantage was that a dedicated section could explain how to calibrate, along with a status bar showing the calibration's progress.
The imagined workflow for recording involves the Wiimote. With the Wiimote in hand, the user can quickly start and stop recording at any moment. At any point while preparing to record, or even mid-recording (though it would cause some jumpiness), the user can reset the tracking state and quickly fix many problems that might have arisen.
The overall workflow would hopefully go something like this: