Dmitry Gorodnichy has probably spent more time on the phone than in the lab in the last week.
The researcher with the computational video group of the National Research Council’s Institute for Information Technology – henceforth referred to as “CVG,” “NRC” and “IIT,” respectively, so I can actually file this editorial on time – has had a deluge of media, industry and user inquiries since New Scientist published a story on the nose-operated mouse – or nouse – he’s developed.
Before you conjure up images of someone face down on his mouse pad, chicken-pecking away – admit it, some of you did – it doesn’t work that way. A garden-variety USB Web cam mounted on top of the computer monitor supplies a signal to the software, which tracks the movement of the user’s nose and translates it into cursor movement.
The nose is the easiest feature to track. However, the word “feature” means something different to Gorodnichy and the CVG than to you and me. To us, a feature is “that thing thar in the middle of your face.”
What the nouse is actually tracking is – and please correct me if I’m wrong, Dr. Gorodnichy – the extremum of the convex shape of the nose tip. This is the point on the surface of your face that is closest to the camera. Gorodnichy points out that this point changes as you turn your head. It’s a sort of virtual nose tip.
Not only does the nouse track this, it does so on a sub-pixel level. Consider the resolution of the aforementioned garden-variety Web cam – let’s say, for the sake of argument, 640 by 480 pixels. Then consider the number of pixels on your run-of-the-mill 17-inch monitor, which we can accurately count as “lots more than that.” It’s no good to describe the extremum as moving from Pixel 5 to Pixel 6. It has to be described as going from Pixel 5.55 to Pixel 5.56.
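For the curious, here is a rough sketch of why fractional pixels matter. It is not the CVG’s method, which this column doesn’t describe; it swaps in a generic technique, the intensity-weighted centroid, to place a peak between pixel centres. The function names, the toy score grid and the 1280-by-1024 screen size are all assumptions made up for illustration.

```python
def subpixel_peak(response):
    """Return the (x, y) intensity-weighted centroid of a 2-D grid of scores.

    `response` is a hypothetical patch of non-negative "how nose-tip-like
    is this pixel" scores; the result may fall between pixel centres.
    """
    total = sum(sum(row) for row in response)
    if total == 0:
        raise ValueError("response patch contains no signal")
    x = sum(col * v for row in response for col, v in enumerate(row)) / total
    y = sum(r * v for r, row in enumerate(response) for v in row) / total
    return x, y


def camera_to_screen(x, y, cam=(640, 480), screen=(1280, 1024)):
    """Scale a camera coordinate to a screen coordinate (sizes are assumed)."""
    return x * screen[0] / cam[0], y * screen[1] / cam[1]


# Toy patch: most of the weight sits on column 2, some on column 1.
patch = [
    [0, 0, 0],
    [0, 1, 3],
    [0, 0, 0],
]
peak_x, peak_y = subpixel_peak(patch)        # x lands between columns 1 and 2
cursor = camera_to_screen(320, 240)          # centre of the camera image
```

With these assumed sizes, one screen pixel corresponds to half a camera pixel horizontally, so a tracker that could only report whole camera pixels would make the cursor jump two screen pixels at a time.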
The nouse is one element of a suite of perceptual visual technologies the CVG is developing. Rehab hospitals have been contacting Gorodnichy about another. “It’s more blink-detection that they’re looking for,” says Gorodnichy – a nose-following mouse wouldn’t be much use to a patient with a restricted range of movement. The system can distinguish between intentional and unintentional blinks, and use the former to execute commands. “We can build a blink-based lexicon,” says Gorodnichy – two blinks switch between windows, three blinks call up a dialogue box.
Think of it as a mouse click.
Or rather, don’t. This is where some media reports have morphed the technology to correspond to the conventional mouse more closely than it actually does. A Reuters report suggested that left winks could correspond to left clicks and right winks to right clicks, which would make a user look uncannily like Herbert Lom playing the tic-ridden Inspector Dreyfus, Clouseau’s boss in the Pink Panther movies. (Apropos of nothing, in one of those movies, Dreyfus accidentally shoots his own nose off, which is close enough to irony for the purposes of this article.) This led one analyst to smirk in the report at “the high silliness factor of the nouse … People balk at doing things that make them look silly, and there is ample room for looking silly here.”
That simply isn’t the case – the software detects series of blinks only, and is customized to execute a task on the basis of that detection – which might or might not correspond to a mouse click. Don’t take the nouse too literally.
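To make the series-of-blinks idea concrete, here is a minimal sketch of a blink lexicon. It assumes a detector has already produced a list of blink timestamps in seconds, that blinks less than half a second apart belong to the same series, and that a lone blink counts as unintentional. None of those details comes from the CVG; the names, the threshold and the command labels are invented for illustration.

```python
# Hypothetical lexicon: number of blinks in a series -> command label.
LEXICON = {2: "switch_window", 3: "open_dialogue_box"}


def group_blinks(timestamps, max_gap=0.5):
    """Split blink times into series; a gap above max_gap starts a new one."""
    series = []
    for t in sorted(timestamps):
        if series and t - series[-1][-1] <= max_gap:
            series[-1].append(t)
        else:
            series.append([t])
    return series


def blinks_to_commands(timestamps, max_gap=0.5):
    """Map each blink series to a command, ignoring counts not in LEXICON.

    A single blink has no entry, so it is treated as unintentional.
    """
    return [
        LEXICON[len(s)]
        for s in group_blinks(timestamps, max_gap)
        if len(s) in LEXICON
    ]


# Two quick blinks, one stray blink, then three quick blinks.
result = blinks_to_commands([0.0, 0.3, 2.0, 5.0, 5.3, 5.6])
```

Because the table maps a count to whatever task you like, nothing ties a series of blinks to a left or right click, which is the point of the paragraph above.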
“Speech recognition is not meant to replace a keyboard. Visual technology is not meant to replace a mouse,” says Gorodnichy. In fact, it’s easy to imagine how the technology, in conjunction with the usual keyboard-and-mouse package, might speed text-intensive applications or add an element of finesse to graphics packages, once you’re used to it. Your nose knows no boundaries to the applications. Sorry, couldn’t resist.
Dave Webb reminds you that the plural of “extremum” is “extrema.” You can email Mr. Webb at dwebb@itbusiness.ca.