A reflection on our experience with designing a NUI

For the NUI class project, we decided to use the MYO armband as our mid-air input technology for symbolic input. We imagined the scenario as symbolic input for wearable technologies where a physical keyboard cannot be used, screen space is limited, and voice commands might be socially awkward.

The MYO seemed to be a good fit for input in such a scenario because it is minimally intrusive and can be used anywhere. Since it is worn on the arm and reads muscle activity, its sensing ability is not confined to a physical workspace.

It came with its own set of issues, as listed below.
There are very few simple poses that the MYO can recognize.
The MYO is no exception to the live-mic problem that affects other mid-air input technologies. We needed to design a reserved gesture to work around the live mic.
When using the MYO, there is also no physical frame of reference.
The MYO also seems to be very sensitive to the initial calibration.
For a subset of the poses (which happen to stress the muscles quite a bit less than the other poses), the MYO produces a lot of false negatives.

We had to overcome or compensate for all of these issues to make it a more usable interface. For instance, we combined the accelerometer and gyroscope data with the pose information to come up with a bigger vocabulary of gestures to support more interactions.
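A minimal sketch of how a pose and motion data might be combined into a larger vocabulary is shown here. It is only an illustration under stated assumptions: the names Sample, classify and ROTATION_THRESHOLD are hypothetical and not part of the MYO SDK, the pose labels are made up, and only the gyroscope's roll rate is used (accelerometer data could be folded in the same way).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Sample:
    """One hypothetical reading: a recognized pose plus gyroscope data."""
    pose: str                         # illustrative labels, e.g. "fist", "rest"
    gyro: Tuple[float, float, float]  # angular velocity in deg/s about x, y, z


# Assumed threshold (deg/s) above which an arm roll counts as deliberate.
ROTATION_THRESHOLD = 60.0


def classify(sample: Sample) -> Optional[str]:
    """Combine the pose with the roll direction into one gesture label."""
    if sample.pose == "rest":
        return None  # no pose held, so motion alone is ignored
    roll_rate = sample.gyro[0]
    if roll_rate > ROTATION_THRESHOLD:
        motion = "rotate_cw"
    elif roll_rate < -ROTATION_THRESHOLD:
        motion = "rotate_ccw"
    else:
        motion = "hold"
    return f"{sample.pose}+{motion}"
```

With three motion classes per pose, each recognized pose yields three distinct gestures instead of one, which is the kind of vocabulary growth described above.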

In one of our initial designs we considered using just one pose: once that pose was registered, any movement until the pose was lost was interpreted as input. But we realized that this design would not allow for continuation. So in our subsequent designs we decided to include rotation to increase the power of the gesture and reduce ambiguity.

Also, for the more frequent tasks, we had to design our system to avoid the poses that the MYO most often failed to recognize, even though these were the poses that were fairly simple and less fatiguing (the pinky pose).

We tried to use redundant gestures to enhance primitive recognition. We designed the system to interpret each gesture as serving the same purpose at all times. For example, an open fist always means one level up from whatever state the system is currently in.

To deal with the lack of a frame of reference, we had to define the gestures relative to one another rather than in absolute terms. The MYO conveniently lends itself to this, since it is not limited by physical space.
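As a rough illustration of what "relative rather than absolute" can mean in practice, the sketch below measures arm roll as an offset from wherever the arm was when the gesture started. The names angle_delta and RelativeRoll are hypothetical, and the roll angle is assumed to arrive in degrees from some orientation source.

```python
from typing import Optional


def angle_delta(current_deg: float, reference_deg: float) -> float:
    """Signed smallest difference current - reference, in (-180, 180]."""
    delta = (current_deg - reference_deg) % 360.0
    if delta > 180.0:
        delta -= 360.0
    return delta


class RelativeRoll:
    """Reports roll relative to the orientation captured at gesture start."""

    def __init__(self) -> None:
        self._reference: Optional[float] = None

    def start(self, roll_deg: float) -> None:
        # Capture the current roll as the zero point for this gesture.
        self._reference = roll_deg

    def stop(self) -> None:
        self._reference = None

    def offset(self, roll_deg: float) -> Optional[float]:
        # No reference means no gesture is in progress.
        if self._reference is None:
            return None
        return angle_delta(roll_deg, self._reference)
```

Because only the offset matters, the same gesture works whether the arm starts out hanging by the side or resting on a table.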

From a high-level design perspective, we understood the importance of interaction flow. We had to make sure to use gestures that went well with each other, so that the interface was usable rather than cumbersome and did not slow the user down.

Specific to symbolic input, we wanted the system to be easily learnable and to transition the user from novice to expert quickly, so we designed the sequence of gestures to always be the same so that it can be memorized and easily performed.

The system aims for visual independence over time. So initially, for scaffolding purposes, the mapping between the gesture and the visual feedback showing which symbol is being selected for input is carefully designed.

From designing this NUI, it has become very clear that there is quite a delicate balance between leveraging what the input technology has to offer and compensating for its limitations and constraints by designing around them, to keep the interface effective and usable at the same time.

MYO – Initial thoughts and concerns

In an attempt to get our hands dirty by trying the MYO for the very first time, we realized that we share a love-hate relationship with it (I shall quote you on this, Wallace). We were sincerely impressed with the simple design and hassle-free set-up. A simple Bluetooth Low Energy dongle and a connecting application, along with a very simple but powerful getting started guide, were all it took to discover all of its functional capabilities. We were very much driven to validate our design ideas by exploring the gestures further.

As impressed as we were, we wrote our own sample application to understand the actual troubles. This was when we found out that calibrating the MYO is a crucial step in getting it to work for any further application, and that this was not in any way easy or obvious. The getting started guide was so well presented as to (deceptively?) make it all look easy.

Once we had the sample application, we tried to see if our gesture mappings still made sense and if the design would still work. False positives and false negatives, with the pose being misread or simply going undetected, are a concern, although we believe that calibrating the MYO very carefully should address this issue significantly, if not solve it completely.

The idea of using three of the five poses recognized by the MYO for the core activity of symbol selection, and the other two poses for committing the symbol selection and committing the word selection respectively, seems very much feasible, but we still need to address the question of auto-completion at the word level and, more importantly, the visual design of the layout. This is because the MYO is on the higher end of sensitivity, which essentially means that users will need to depend (heavily) on the visual cues, at least initially, in order to get accustomed to the sensitivity and to avoid erroneous symbolic input.

So we plan on making the visual representation more effective while also trying to compensate for the sensitivity of the MYO, and perhaps also allowing a different correction mechanism (rotating the arm in the intended direction) without having to use backspace (the spread-fingers gesture).

Analysis of Prior work for in-air symbolic input

Text input, and by extension symbolic input, is an essential form of interaction with almost any type of computer system, ranging from PCs to PDAs to perhaps even wearable technology. Given the significance of symbolic input, our attempt at designing a gesture-based natural user interface is understandably not the first of its kind. This blog post is an attempt at analyzing prior work in this area.

Gesture-based symbolic input lends itself as a very natural alternative in scenarios where traditional keyboard-based systems cannot be employed and/or are difficult to use. Examples of such scenarios include small-screen devices (typically wearables), large collaborative display systems, etc. The literature suggests some gesture-based symbolic input solutions for these scenarios, such as the use of pinch gloves, where a pinch between the thumb and a finger along with the orientation of the hand (inward or outward) is mapped to a particular character, or tracking a finger’s movement in space (position) with a Leap Motion device and mapping it to a particular character.

The pinch glove technique [1] draws on the user’s familiarity with the QWERTY keyboard layout, and uses only a few arbitrary gestures that are not too hard for users to memorize. The physical feedback of the thumb pressing against a finger also resembles pressing a key on a physical keyboard. This technique proved to have a low enough error rate to be used for text input in an immersive VE.

Thumbcode [2] is a similar approach that makes full use of the distinct states that the human hand can form, based on the three phalanges on each finger other than the thumb and the closure states (whether the fingers are apart or held together), but the performance of such a system might largely depend on the effectiveness of the tracking technologies.

MotionInput [3] simply attempts to extend the 2D touch interface, as seen on mobile screens, to 3D gestural input, where the user’s finger is tracked in space and its position is mapped to a character. The authors found this system to be easy to learn but also tedious and slow, and very much dependent on visual feedback.

The work by Scott Frees and others [4] demonstrates the use of drawing letters to input characters using a stylus and a hard surface. Although not gestural, this system is relevant because it leverages the user’s knowledge of the shapes of the characters. Hence, while designing a gesture-based system, keeping the user’s already acquired knowledge in mind can help in developing gestures that are extremely intuitive.

The Dasher technique [5] uses continuous input: it takes the user’s intent to type a letter (perhaps by tracking the user’s eye, mouse input or hand, as shown in the video: https://www.youtube.com/watch?v=YSSADq6rCu0) and combines it with a predictive model.

The ring (https://d2pq0u4uni88oo.cloudfront.net/projects/841093/video-350596-h264_high.mp4) maps handwriting directly to gesture input.

It is evident that in order to make the design usable, error-free and fast, we need to design gestures that are quick and easy to perform, remember and recognize; make use of a good predictive model that frees the user from having to type out all of the characters; and take into consideration the dependence on visual feedback during input.


[1] Bowman, Doug A., Chadwick A. Wingrave, J. M. Campbell, V. Q. Ly, and C. J. Rhoton. “Novel uses of Pinch Gloves™ for virtual environment interaction techniques.” Virtual Reality 6, no. 3 (2002): 122-129.

[2] Pratt, Vaughan R. “Thumbcode: A Device-Independent Digital Sign Language.” In Proceedings of the 13th Annual IEEE Symposium on Logic in Computer Science, Brunswick, NJ. 1998.

[3] Qiu, Shuo, Kyle Rego, Lei Zhang, Feifei Zhong, and Michael Zhong. “MotionInput: Gestural Text Entry in the Air.”

[4] Frees, Scott, Rami Khouri, and G. Drew Kessler. “Connecting the Dots: Simple Text Input in Immersive Environments.” In VR, pp. 265-268. 2006.

[5] Ward, David J., Alan F. Blackwell, and David JC MacKay. “Dasher—a data entry interface using continuous gestures and language models.” In Proceedings of the 13th annual ACM symposium on User interface software and technology, pp. 129-137. ACM, 2000.

Design studio – feedback

The first design studio helped us present our initial design ideas with respect to using the MYO. The idea is to use hand gestures to achieve symbolic text input that can be used in combination with any wearable technology.

We discussed basing our ideas on the Minuum keyboard for touch surfaces, which takes advantage of the user’s previous knowledge of the QWERTY keyboard layout and is designed so that, owing to good prediction algorithms, the user need not be accurate about the letter he or she intends to type. The first question that came up was how we could allow that sloppiness if we had a similar keyboard for typing with in-air gestures. We realized that we are going to need good predictive algorithms to suggest words based on the current selection, and auto-complete-like features that will make typing faster.
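One simple way to tolerate imprecise selection is to let the user pick only a group of letters and have the system disambiguate against a dictionary. The sketch below is an assumption-laden illustration of that idea, not the Minuum algorithm or our final design: the letter groups in GROUPS, the toy vocabulary and its frequency counts in VOCAB, and the function names are all hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical letter groups, loosely following QWERTY order.
GROUPS: List[str] = ["qwert", "yuiop", "asdfg", "hjkl", "zxcvbnm"]

# Toy vocabulary with made-up frequency counts, for illustration only.
VOCAB: Dict[str, int] = {
    "the": 500, "they": 200, "what": 150, "then": 120, "eye": 40, "toy": 30,
}


def group_of(letter: str) -> int:
    """Return the index of the group that contains the letter."""
    for index, letters in enumerate(GROUPS):
        if letter in letters:
            return index
    raise ValueError(f"letter not in any group: {letter!r}")


def suggest(selected_groups: Sequence[int], limit: int = 3) -> List[str]:
    """Suggest words whose leading letters fall in the selected groups."""
    matches = [
        word for word in VOCAB
        if len(word) >= len(selected_groups)
        and all(group_of(ch) == g for ch, g in zip(word, selected_groups))
    ]
    # Most frequent first; ties broken alphabetically for stable output.
    return sorted(matches, key=lambda w: (-VOCAB[w], w))[:limit]
```

For example, selecting group 0 and then group 3 matches every word that starts with a letter from "qwert" followed by one from "hjkl", so the user only needs to be accurate to within a group while the frequency ranking does the rest.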

We had questions about how the MYO detects gestures, how many gestures it can uniquely identify, and what raw data is available that could help us define gestures whose action depends on, for example, the speed of the arm movement. These questions helped us realize that we needed to know more about how the MYO processes its data before settling on the gesture vocabulary.

We also received suggestions to look up techniques such as Dasher to identify further possible design ideas. We had a very simple gesture set for our idea, such as a sideways slide of the hand for letter-group selection, a closed fist for choosing the letter, an arm slide for word completion, etc. While identifying potential gestures that would be fast and at the same time not fatiguing, we realized that we might need a different layout of the letters, as well as gestures for capital letters, numerical characters, punctuation, etc.

So our next steps would be to consider all of this feedback, identify more issues, and find workarounds for them.

Design Ideas for Project

The main idea for the project is to come up with a natural and intuitive way of performing symbolic input in scenarios where a physical keyboard is impossible or inconvenient to use, for instance inside a CAVE-like environment, or in a collaborative setup where multiple users sharing a single visual display might need to interact with the system simultaneously. The interaction technique needs to be intuitive in the sense that a novice user manages to understand or relate to how it functions, and it also needs to be learnable in the sense that, with a little familiarity and practice, the technique demands close to no cognitive effort.

This can be achieved somewhat more easily if we base the design of the technique on an already existing and known input technique, as compared to building a whole new technique. So the question at hand becomes: “How do we design an interaction technique that takes advantage of what users are generally familiar with, and at the same time realize this technique without the devices that are normally used to implement it?”

While trying to answer this question, we quickly realized that there are two kinds of user knowledge we can draw on to achieve this: either take advantage of the layout of physical keyboards (owing to user familiarity), or use the natural shapes of the letters, numbers and symbols to define gestures that correspond to them.

In addition to using this knowledge, we also want the system to perform well in terms of speed and accuracy, so incorporating gaze direction input along with gestures came up as a thought. One very rudimentary design idea is to allow the user to use their gaze and one quick and unique gesture to indicate the start of input and the intended character respectively.

Keeping in mind that the input technique should be well defined for a particular context (for instance, the idea mentioned above requires visual feedback because of its dependence on gaze input, which cannot be taken for granted in all application contexts) and limited by that context’s constraints, we intend to come up with a technique that users pick up with ease and are able to master without much effort, without compromising on the speed or accuracy of the input task.

Project Proposal Motivation

The motivation to design a novel text input technique for the class project came from trying to observe common trends in newer NUI applications such as Google Glass. As we started talking about what kinds of problems are often encountered in NUI scenarios, we realized that text input still depends very much on traditional methods. For instance, when a physical keyboard cannot be supported due to application constraints, the closest form of text input is most likely a virtual keyboard operated using touch sensing or gestural selection by pointing. These techniques suffer from inherent difficulties such as a lack of force feedback, or a lack of visual feedback due to spatial constraints in displaying the virtual keypad. Even where these are not an issue, the technique is generally difficult to understand and master and demands considerable cognitive effort.

We wanted to come up with a novel design that strikes a balance between performance, in terms of accuracy and speed of typing, and usability, in terms of ease of use and the cognitive demands associated with the technique.

Our brainstorming process initially started out with us discussing the following questions:

What are the very general problems in gestural interfaces or touch interfaces in a very broad sense?

What are the specific tasks that need to be supported in gesture- or touch-based systems?

What are the problems associated with each of the techniques?

What work has already been done in terms of addressing these issues?

At the end of all of these discussions, we decided that text input is something that could be applicable in a wide variety of scenarios, and that it would be very useful if we could design a natural user interface that addresses at least some of the open issues.

Some of the resources that we looked at in the process:

YJVLC_463inPress.pdf





The influence of fictional user interfaces on future user interfaces

The video found at http://blog.leapmotion.com/fictional-uis-influence-todays-motion-controls/ talks about the fictional user interfaces that people see in the arts or in fiction, and how they influence future user interfaces. The video bases its arguments on modern fiction where motion controls, voice-based systems, holographic displays, etc. are understood to depict advanced technologies. The author of the blog claims that the attraction to motion control (among other technologies) is due to the fact that such an interaction technique provides the user with a sense of “power and mastery”. The author also makes a statement about how these fictional media convey more about the user than about the future of the technologies themselves. So, when such UIs are used by characters that are supposedly superhuman, it creates an expectation: people believe that natural user interfaces should provide them with this power and make them feel more like a superhuman by augmenting their natural capabilities. For example, a wave of the hand can do something extremely complicated in the real world.

The author then goes on to discuss how this expectation fits into “immersion and flow”. Immersion is a sense of being in a higher level of reality, and the sense of flow is achieved when the user is pushed to extend his or her skills to meet a rising challenge. With too little challenge, boredom creeps in; with too tough a challenge, the user becomes anxious. Both break the flow.

The video does a good job of analyzing various examples that depict different design principles, and states how these differences suggest that no single principle is always better and that it all depends on context. The author talks about how some interfaces cleverly subvert the expectations in order to overcome the difficulty of realizing them, by drawing the user’s attention to some other interesting, fun, playful interaction and having them enjoy the experience.

So, according to the author, the expectations are very much influenced by these fictional UIs, and this in a sense implies that these interfaces need not be realistic, as long as they are still easy to learn, use and master and make the user feel more powerful. It is interesting how much the video suggests users expect (given that these technologies are still only maturing), and also how it is very much possible for designers to subvert the expectation by providing a new and engaging experience.



What are Natural user interfaces all about?

A user interface is anything that sets the platform for communication between a human and a computer system. Given that a computer system behaves in the very specific way it is meant to, it makes the most sense to make it understand and respond to the human, while the human is given the option of using the easiest, most natural way to interact with it.

In a command line interface (CLI), the user types in commands that the program understands. The focus is on what the system understands, so the user has to become accustomed to communicating with the system in a language that it understands. To flatten the user’s steep learning curve, the graphical user interface (GUI) was introduced, with more visual indicators that users could relate to: icons that stood as visual representations of things that users understood more easily than text commands in a CLI. A natural user interface is a step further in the sense that the focus is entirely on the user: it is an attempt at making the interface more suited to humans, so that the computer tries to understand the human and not vice versa.

So, in essence, a natural user interface attempts to make human-computer interaction a more human-centered approach. Current natural user interfaces hence use technologies (or a combination of technologies) such as multi-touch, gesture recognition, speech recognition, motion sensing, body tracking, etc.



These technologies are used in designing the interface such that the interaction style becomes more direct, natural and intuitive to the user. An ideal NUI would be flexible enough to accommodate and work well with a way of interacting that is effortless (in terms of cognitive effort) and extremely direct for the user.