The Evolution of User Interfaces


by Stephen Atwood

I am not sure whether it happened after reading Daniel Wigdor's Enabling Technology article for this month, "The Breadth–Depth Dichotomy: Opportunities and Crises in Expanding Sensing Capabilities," or merely as the result of seeing so many new ideas for human–machine interfaces (HMIs), but at some recent point I realized that hardware will no longer define the scope of interaction between humans and computers. In his article, Daniel says, "A simple touch is not simple. What we think of as 'touch' actually includes a variety of object-sensing technologies and an even wider variety of information that can be detected about the sensed objects." I am willing to say that this interaction is going to go beyond that to speech, facial expressions, tone of voice, and even, some day, to mood interpretation.

IBM's new Watson computer project recently demonstrated for the record that computers running artificial-intelligence algorithms are capable of interpreting complex human speech and answering very abstract questions¹ – effectively performing vast queries of data based on clues as well as context. No, this is not the same as human thought, but it is a giant leap forward in the realm of interaction between humans and machines. The developers of Watson picked the TV quiz show Jeopardy! to show off their achievement because "The game of Jeopardy! makes great demands on its players – from the range of topical knowledge covered to the nuances in language employed in the clues," according to IBM's Web site and promotional materials.² In effect, what I think they really achieved was to prove the viability of a real speech input system for computers.

Touch screens and related interfaces have always represented a great advancement over keyboard or text-based computer interaction. Originally, we had to type the exact commands to the machine each and every time, even carrying the commands around in shoe boxes of punch cards. Later, with the aid of terminal displays, we were able to type commands in real time. The next big break came with the introduction of the graphical user interface (GUI) that is mainly credited to Xerox PARC and was adopted by both Apple and Microsoft. Now the computer could effectively give us a palette of options and could remember its own underlying commands, hidden behind icons and controls. Before this point, a touch screen would have had limited value. However, with the GUI, we could now design a wide array of choices and actions that could be intuitively selected by users simply by pointing at them. It is hard to overstate this monumental step and how much it bridged the gap between machines and the people who needed to use them.

As touch screens evolved, much of the focus was on the technology of the screens themselves and on improving rather small details of the interaction, such as whether the user could wear gloves or use a stylus. The interaction itself was still primarily a matter of pointing at and selecting predetermined icons. To me, the next small leap appears to have come with multi-touch interfaces, in which users could express their intentions with gestures instead of just choosing menus and icons. And now, a machine can interpret shades of gray in these gesture commands and respond in a similar, measured way. From here it is not much of a reach to see where you can go by recognizing hand and body movements with cameras (in the case of Microsoft's Kinect) or by measuring the momentum of a handheld wand and attempting to determine real intent (in the case of the Wii).

Suddenly the next big leap does not seem that far-fetched. First, drop the qualifier "Graphical" from GUI because the interface no longer needs to rely solely on either touch-screen gestures or on graphically predetermined options. Next, add the ability to interpret speech (as Watson can do now) and face-recognition technology to establish mood (as has been demonstrated in several academic settings), and the UI of tomorrow could really be a conversation with the machine that incorporates all the nuances of gesture, mood, spoken idea, and maybe even tone of voice that we use with each other as human beings.

All of the basic hardware building blocks to achieve this exist today, in many and various forms. Digital cameras can be used for recording faces and bodies. Sensors mounted on a person (or held in hands) can determine all the required states of motion as well as body temperature. Microphones can capture audio speech and speakers can allow the machine to talk back. It is no longer the hardware that is holding us back. It is now a matter of how much functionality we can envision and how much artificial intelligence the computer science community can bring to bear on the task. We already have handheld devices that can make calls on command, surf the Web, and even write messages with speech commands. Imagine being able to ask your iPhone to survey the local restaurants, recommend a place with good seafood, and speculate based on the fishing seasons and the migration patterns whether the salmon will be available fresh or frozen that day. Whimsical for sure, but no more unreasonable than Captain Kirk asking his computer to speculate on the likelihood of some complex astrophysics effects contributing to the dilemma du jour he is facing in deep space.

So, I brought you through this train of thought culminating in a Star Trek reference because my goal was to illustrate that the relatively basic embodiment of touch, in my view, is one of the cornerstones on the journey to a free-expression UI, and still extremely relevant to the future of computing devices. The vast array of touch or body motion interface technologies available today are building blocks in the critical hardware foundation needed to support the next generation of UI capabilities I am so easily suggesting. That is why, more than ever before, keeping our eyes and hands around innovation in the touch space is a critical part of understanding the future of the display industry.

To keep us up to date and focused on the latest trends, we continue to rely on this month's Guest Editor and one of our most ardent supporters, Mr. Geoff Walker, whose official title at NextWindow is Marketing Evangelist and Industry Guru. Geoff has done an outstanding job assembling this month's array of articles and you can read his great introductions in his Guest Editor's note. Geoff is also a frequent seminar speaker at SID and I hope you have the chance to experience one of his seminars if you are coming to Display Week in LA this year.

Every year, the March Touch Technology issue of ID is one of our most popular issues. We receive many requests for extra copies, our advertisers provide us with very generous support, and the articles are always in-depth and fun to read. People just naturally understand touch paradigms and all seem to have strongly formed opinions on how the technology should perform. That leads to lively discussions I always look forward to. Next year, I suspect we will be calling this the User Interfaces issue and expanding our reach even further, based on where the industry appears to be going and on my own logic discussed above.

I would like to once again acknowledge the very generous and enabling support being given to us by Avnet. As a strong backer of the display industry through its many activities, which include application-engineering support, customer education, and supply-chain management, as well as its support for SID and Information Display magazine, Avnet helps us all move the world of displays forward in new and innovative ways. We really appreciate the company coming on board and co-sponsoring ID this month.

One final note: As we were going to press we learned of the dreadful circumstances following the earthquake and tsunami in Japan. I can only imagine the scope of the tragedy that will slowly be revealed to us in the coming days. Our thoughts and prayers go out to everyone involved with the sincere hope that recovery comes fast. What we do as technologists is only a small part of who we are as human beings and in these times, the real measure of our spirit is how we reach out to help each other and convert our compassion to actions that truly heal. The whole world will be working and praying for those involved.


¹ Watson is an artificial intelligence computer system capable of answering questions posed in natural language, developed in IBM's DeepQA project by a research team led by principal investigator David Ferrucci. In 2011, as a test of its abilities, Watson competed on the quiz show Jeopardy! in the show's only human vs. machine match-up. In a two-game combined-point match, broadcast in three Jeopardy! episodes February 14–16, Watson bested Brad Rutter, the biggest all-time money winner on Jeopardy!, and Ken Jennings, the record holder for the longest championship streak.