Beyond casual: thoughts about gesture gaming: Part 1: Opening and 'On-Body' Items

Microsoft Kinect did exceptionally well in bringing a full-body gesture gaming platform to the masses. The system consists of a 3D sensing solution from PrimeSense, together with a four-microphone beamforming array, speech recognition algorithms and top-notch computer vision algorithms running on the Xbox. It’s not only hardware and algorithms: the Microsoft Studios division also did a great job of bringing new experiences, such as those found in Kinect Adventures, and newer interaction models found in some of the Kinect Fun Labs gadgets (check out Air Band!).

But putting aside several success stories, such as Kinect Sports, The Gunstringer, Halfbrick’s Fruit Ninja Kinect and Harmonix’s Dance Central, the majority of the available Kinect games leave much to be desired. It looks like gesture gaming creates ultra-casual experiences at best. In some cases it even seems like a step backwards from the Nintendo Wii catalog.

Of course, the current generation of the technology has its own limitations, such as a limited field of view (FOV) and a resolution too coarse to track individual fingers. However, the fidelity of the available input is still light years ahead of anything we had in the past.

Until recently, the degrees of freedom available in the controller were predefined and outside the game designer’s control (Harmonix’s Guitar Hero and Rock Band are, of course, exceptions). The controller was fixed by the console maker, the PC hardware vendor, or the OS (e.g., on a Windows machine you can assume a keyboard and a two-button wheel mouse). On the other hand, endless degrees of freedom are not only a blessing: it takes a tremendous amount of design and research to define a successful new control scheme (think of how first-person shooters on the PC converged on a nearly identical keyboard and mouse mapping).

ATARI2600 Joystick

XBOX360 Controller

Back in the 80s, the Atari 2600 featured a single-button, 8-direction joystick and a paddle. The input was well defined and accessible to both gamers and the rest of us. The NES added more buttons but, answering gamers’ endless demand for power, the latest generations of controllers found in the Xbox and PlayStation feature 2 analog joysticks, 1 D-pad, 4 action buttons, and 4 triggers. They are masterpieces of design and ergonomics. If you manage to get the hang of one, you get an exceptional level of control, most probably exceeding that found in many professional markets and even military devices. The F-16 joystick is years behind what the average teen uses to fly over his virtual battlefield!

PS Move

XBOX Kinect

The downside of this evolution is that it left behind the majority of humanity. Those who are not that interested in spending the time necessary to master all the buttons and sticks have split off from those who are. We now call the two groups ‘casual gamers’ and ‘gamers’. After my first play experience with the Sony PS Move controller, I tried to compare it to Kinect and came to two main conclusions: “It’s better in that you have a button under your pointing finger. But it’s worse due to all the rest of the buttons there … WTF!?”

Yes, I am old now. I don’t have the patience to learn what to press; it’s too much. I just keep pressing buttons until something happens. But I certainly do not agree to be cast out of the gamers club!

One aspect of the GUI revolution was making the display ‘soft’. This means the application can alter the display according to its state, to show the most useful controls at any given time. The contrast of an analog mechanical gauge might be better than that of its LCD representation, but the benefits of being able to change it dynamically make it worth the sacrifice. Gesture interfaces, the natural user interface (NUI) direction, are ‘softening’ the input device in the same way. You can create a ‘virtual’ controller that best suits the current application. You have a wheel when you need to drive and a sword when you need to cut (only fruits, of course! Make salad, not war!).

Going back to the main topic: I believe it is entirely possible to create responsive and challenging gesture games. Games that are really fun, and not only in ‘party mode’.

Many people imagine gesture control as an interface in which users need to memorize complicated gestures that, once detected, trigger some kind of operation. So gestures are simply a complicated full-body encoding of the controller buttons? That does not seem fun to me. I am much more in favor of 1:1 gesture mapping, where the user’s motions are mapped continuously, like an analog input, to aspects of the control.
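To make the 1:1 idea concrete, here is a minimal sketch of a ‘virtual’ steering wheel: the tilt of the line between the two tracked hands is mapped continuously to a steering value. The function name, the coordinate convention (x to the user’s right, y up) and the 90-degree full-lock angle are all assumptions made for illustration, not part of any PrimeSense or Kinect API.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]  # (x, y) in the sensor's image plane; x right, y up (assumed)


def steering_from_hands(left_hand: Vec2, right_hand: Vec2,
                        max_angle_deg: float = 90.0) -> float:
    """Map the tilt of the line between the hands to a steering value in [-1, 1].

    Positive values mean the right hand is higher than the left one,
    i.e. the imaginary wheel is turned counter-clockwise.
    """
    dx = right_hand[0] - left_hand[0]
    dy = right_hand[1] - left_hand[1]
    if dx == 0.0 and dy == 0.0:
        return 0.0  # hands coincide: no meaningful tilt
    angle_deg = math.degrees(math.atan2(dy, dx))
    # Clamp so that tilting past full lock does not overshoot.
    return max(-1.0, min(1.0, angle_deg / max_angle_deg))


# Hands level: wheel is centered.
print(steering_from_hands((-0.3, 1.0), (0.3, 1.0)))
# Right hand raised, left hand lowered: wheel turned halfway.
print(steering_from_hands((-0.3, 0.7), (0.3, 1.3)))
```

There is no gesture to memorize here: every small motion of the hands changes the output, which is the property that makes this kind of mapping feel responsive.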

PrimeSense, as the provider of the 3D technology, has created several concept gaming interactions over the years. Thankfully, many of those demos are available on the OpenNI Arena website.

Initial prototypes demonstrated a boxing experience, as well as basic body mapping: moving an avatar around a boxed game area.

CKJ

Boxing

Items on Body

Once the skeleton tracking algorithms were stable enough, we also demonstrated full-body tracking. Mapping the motion to Ogre3D’s Sinbad character was an instant success, and not only due to the extremely good job done by the original modeler or the talented computer vision researchers. You could draw the swords Sinbad carries on his back merely by bringing both hands behind your neck. Drawing the swords was an extremely pleasing experience. You could easily imagine battling enemies attacking Sinbad (even though this demo does not have any enemies). This was the first demonstration of a concept in gesture gaming that I call ‘Items on Body’.

Sinbad

SinbadNI: http://arena.openni.org/OpenNIArena/Applications/ViewApp.aspx?app_id=466

There are several benefits to this approach:

· You don’t need to memorize gestures – as the interactions are defined by the visible items on your avatar. It’s natural to touch what you see.
· Since the interactions are mapped to your body, you don’t encounter the depth perception problems often experienced when trying to touch items in the virtual world. You can even benefit from ‘muscle memory’ once you know how to operate your items. The swords, for example, are always on your back. In the heat of battle you can keep your eyes on the enemies, just as you would with real weapons you have mastered.

This all sounds good and easy, but the reality is always more challenging. To actually make it work, you will also need to deal with the issues of retargeting and false activations.

If the body proportions of the avatar model are significantly different from those of the user, naively mapping the user’s skeleton joints to the avatar will have undesired results. Imagine the user touching his head when the avatar’s arms are much shorter. If you implement the detection with a collision box on the avatar, it will never be triggered. You can implement a collision box on the user’s skeleton instead, but then you make the interaction harder to learn, because what the user has to do no longer matches the visible interaction of the avatar’s hand with the item. This can be solved by a smarter retargeting algorithm that takes the differences in proportions into account, or by scaling the model’s joints to match the user’s dimensions.
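One simple way to take proportions into account can be sketched as follows: keep the direction from shoulder to hand, but express the reach as a fraction of the user’s arm length and re-apply it to the avatar’s arm length. This is only an illustration under stated assumptions; the function name, the tuple-based vectors and the idea of measuring arm lengths once during calibration are hypothetical, and it is not the algorithm used in the PrimeSense demos.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def retarget_hand(user_shoulder: Vec3, user_hand: Vec3, user_arm_length: float,
                  avatar_shoulder: Vec3, avatar_arm_length: float) -> Vec3:
    """Place the avatar's hand at the same direction and relative reach as the user's.

    Arm lengths are assumed to be measured once (e.g. during a calibration pose).
    """
    if user_arm_length <= 0.0:
        raise ValueError("user_arm_length must be positive")
    offset = tuple(h - s for h, s in zip(user_hand, user_shoulder))
    distance = math.sqrt(sum(c * c for c in offset))
    if distance == 0.0:
        return avatar_shoulder  # hand on the shoulder: no direction to preserve
    # Fraction of full extension, capped at 1 to absorb tracking noise.
    reach = min(distance / user_arm_length, 1.0)
    scale = reach * avatar_arm_length / distance
    return tuple(s + c * scale for s, c in zip(avatar_shoulder, offset))


# A user with 0.6 m arms fully extends sideways; the avatar's 0.3 m arm
# ends up fully extended in the same direction.
print(retarget_hand((0.0, 1.4, 0.0), (0.6, 1.4, 0.0), 0.6,
                    (0.0, 1.0, 0.0), 0.3))
```

With a mapping like this, a fully extended user arm always yields a fully extended avatar arm, so a collision box placed within the avatar’s reach stays reachable. A production retargeter would usually work per bone rather than per limb.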

False activations are commonly related to the limitations of the tracking algorithm, and its behavior on occlusion, complicated poses, or motion blur. You can surely expect that the user’s hand will reach the item’s hot-zone unintentionally.

False triggering can be compensated for by adding activation requirements:

· Temporal requirement: a short pause before triggering the item (during the pause there should be some animation)
· Require two-handed operation (like Sinbad's swords)
· Require touching and then moving in a certain direction
  o Examples:
    ▪ To remove the crown you need to touch it and then raise your hand
    ▪ To remove a sword, dagger or arrow you need to touch it and then move in a sensible direction
  o If, after touching, the hand moves in another direction, the operation is canceled
  o One simple implementation is to define two collision boxes for each item, and to require the user to pass through both in the right order to actually activate the item

As always, it is highly recommended to add as much visual feedback as possible to correct operations (popping the sword out a bit, glowing, and playing a sound).
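The two-box scheme can be sketched as a small state machine that is fed the hand position once per frame. Everything here is an illustrative assumption: the class names, the axis-aligned boxes, and the `cancel_after` frame budget that cancels the operation when the hand wanders off instead of reaching the second box.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Box:
    """Axis-aligned collision box given by its min and max corners."""
    lo: Vec3
    hi: Vec3

    def contains(self, p: Vec3) -> bool:
        return all(l <= c <= h for l, c, h in zip(self.lo, p, self.hi))


class TwoBoxItem:
    """Activates only when the hand touches `touch_box` and then reaches `pull_box`."""

    def __init__(self, touch_box: Box, pull_box: Box, cancel_after: int = 15):
        self.touch_box = touch_box
        self.pull_box = pull_box
        self.cancel_after = cancel_after  # frames allowed outside both boxes
        self.armed = False
        self.frames_outside = 0

    def update(self, hand: Vec3) -> bool:
        """Feed one hand position per frame; True on the frame the item activates."""
        if not self.armed:
            if self.touch_box.contains(hand):
                self.armed = True  # a good moment to start the glow / pop-out feedback
                self.frames_outside = 0
            return False
        if self.pull_box.contains(hand):
            self.armed = False
            return True
        if self.touch_box.contains(hand):
            self.frames_outside = 0
            return False
        self.frames_outside += 1
        if self.frames_outside > self.cancel_after:
            self.armed = False  # hand moved elsewhere: cancel the operation
        return False


# Crown example: touch the crown, then raise the hand above it.
crown = TwoBoxItem(touch_box=Box((-0.1, 1.6, -0.1), (0.1, 1.8, 0.1)),
                   pull_box=Box((-0.2, 1.85, -0.2), (0.2, 2.1, 0.2)))
for hand in [(0.0, 1.2, 0.0), (0.0, 1.7, 0.0), (0.0, 1.9, 0.0)]:
    print(hand, crown.update(hand))
```

A hand that merely brushes through the second box, or that touches the item and then drifts away, never activates it, which filters out much of the unintentional contact described above. The temporal and two-hand requirements could be layered on top of the same `update` loop.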

Some of those ideas are demonstrated in the new Unity3D integration example: AngryBotsNI

AngryBotsNI

AngryBotsNI: http://arena.openni.org/OpenNIArena/Applications/ViewApp.aspx?app_id=586

Coming up next: Part 2 - POV and scene

4 comments:

Unknown, February 6, 2012 at 11:47 PM
One problem I've always had with serious gesture gaming is that there's no intuitive way to control which direction your character walks in. Do you think controllers might still be necessary for that? How could you get around that problem?

Patrick Jarnfelt, February 13, 2012 at 4:16 PM
Hi Micha

Nice blogpost. We at Copenhagen Game Collective also have a take on the limitations and challenges developing for the Microsoft Kinect. Check out our blogpost and video on the subject here: http://www.copenhagengamecollective.org/2011/10/19/prototyping-for-kinect/

Unknown, November 18, 2013 at 6:41 PM
Hi. I can't find AngryBotsNI anywhere. The OpenNI Arena link doesn't work. Could you please provide an alternate download link?
