Right now, the question of what you need in a mobile computing platform is most often phrased as “Do you need a netbook or a full laptop? Or will one of the new high-end smartphones manage?” I think the real question isn’t one of capabilities so much as one of how we access those capabilities.
For some people, the iPhone’s lack of a physical keyboard is a deal-breaker. For me, the smaller-than-standard keyboard on the average netbook is a powerful disincentive: If I had to use one, it would slow down my interaction with the netbook — and if I learned to be fluent and productive with the small keyboard, it might mess up my muscle memory for dealing with full-size keyboards on my “real” computers. It’s not a trade-off I’m willing to make.
The Palm Prē’s physical keyboard is tiny. I can only key it with my thumbs, so there’s no risk of interference with my pre-existing keyboarding skills. Inputting data with it is achingly slow, but that’s offset by the device’s wonderful portability (it fits into a pocket even more easily than an iPhone does). But I can’t really edit text with it, because there’s no D-pad for precise cursor positioning. Even the Orange+finger-movement trick is balky and awkward, in my experience; if I want to correct a single-letter typo, getting the cursor after the incorrect character so I can backspace over it is such an ordeal that it’s often quicker and easier to use Shift+Backspace to delete the entire word and retype the whole thing.
In effect, even though the phone has the ability to edit text, the interface makes it so difficult that I can’t use the capability. It might as well not be there. What would a better interface mechanism look like?
In Charles Stross’ Accelerando, the protagonist starts off with a set of glasses that provide him with a constant Net connection and heads-up displays of whatever he desires: maps, email, people’s vCards, and so on. But Stross (perhaps wisely) doesn’t give much detail about the glasses’ input mechanism. “He glances up and grabs a pigeon, crops the shot and squirts it at his weblog to show he’s arrived.” How? That part’s left to the reader’s imagination. (A very crafty trick on Stross’ part, and one that writers can pull off and user-interface engineers cannot.)
If I want to do with my phone what Stross’ character did, I have to yank it out of my pocket, press the power switch, then make a swiping gesture that tells the phone its attention has been requested by a real human (rather than simply being jostled in a pocket or handbag). But Stross’ protagonist’s glasses were already powered up and in use, so suppose I were already using my phone and decided I wanted to take a picture of something?
Tap a physical button to escape from whatever app I was already using, then press an on-screen button for the main “launcher” feature. Find the “camera” icon, tap it, wait for the camera to load. Then I can aim and press another on-screen button to capture the image.
Cropping is pretty much out of the question, although someone could write an app for it. And I actually can update my blog from my phone; it has a WebKit-based browser and enough screen real estate to make writing and posting an entry possible, albeit painful.
Stross’ interface has the luxury of not having to be real, of course. But something that already works as a real-life prototype is the Sixth Sense system, built by Pranav Mistry of MIT’s Media Lab. It senses the user’s hands, and you can take a picture simply by framing whatever-it-is you want to capture with your fingers and thumbs. (It does a whole lot of other things, too, and I highly recommend the entire video.)
2 Comments
Even back when the iPhone only allowed you to enter and delete text, and not select and copy/paste it, I always wanted simple left and right CURSOR KEYS. If I make a typo three characters back, it’d be so much easier to just tap the left arrow twice and delete, then hit the correct key, instead of putting my finger down, waiting a moment for the loupe to show up, and dragging the cursor to the right spot. To finish off, I could either tap the right arrow twice or just tap once at the end of the line.
It might be of interest that the on-screen keyboard in the Newton OS included left and right cursor keys, but not up and down. The external hardware keyboard included a full set of up/down/left/right keys. When I was using my Newton MP 2000 regularly, I made frequent use of the cursor keys while editing text. Select/cut/copy/paste in Newton OS was also straightforward and usable, even more so after enabling the hidden system pref that allows multiple “clipboards”: cut or copied text, images, etc. “stick” to the edges of the screen as a “clipboard,” so it’s easy to see which clipping you want to drag off to “paste,” and using the same double-tap “copy” gesture on a clipboard object leaves a copy behind. Even though I own two of the external hardware keyboards, I never used them, since the on-device text entry was actually really good. I did miss those up and down arrow keys, though. I eventually wrote a little floating “D-pad” of cursor keys to get them, but rarely used it.
Also of note is how the external keyboard dock for the iPad includes a full set of cursor keys, while the on-screen keyboard still lacks any (http://www.apple.com/ipad/design/). Plus the iPad will work with any Bluetooth keyboard.
I think the message here from all of these devices is that while you’re on the go the device is mainly intended to display information, rather than enter it. When you want to do significant entry of information you’re expected to use some “old school” device, like a computer that you synch with, or an add-on hardware keyboard.
I think there’s room for an interface that’s in between. I personally did a tremendous amount of text entry on my Newton, but that was with a (ewww) stylus.
My hope and expectation is that improvements in mobile processing power will allow more sensing and predictive capabilities, so the context of where you are and what you’re doing can be factored in, and you get supplied with a good guess at a minimal set of options for “assembling” the information you want to record. E.g., via image recognition your device can tell that you just took a photo of a cat. You automatically get presented with options to send that photo to the people you have sent photos of cats to before, upload it to your blog or personal gallery, or add it to a photo gallery of images taken in that location by other people.

These options are presented to you as a heads-up display projected by your glasses/frames/contacts. Blinking your eyes makes the HUD disappear. To select one of the options, you focus your eyes on it and squint. You can elect to add text to what you send before it gets sent, or after, based on a toggle control in the HUD that you focus and squint on [or not] before choosing the main action. Entering the text requires a separate interface, which you’ll probably pull out of your pocket at some point.
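Here’s a rough sketch in Python of the kind of context-driven ranking I’m imagining. Everything in it is invented for illustration: the tags, the action “history,” and the scoring are stand-ins for real image recognition and real behavioral data.

```python
from collections import Counter

# Past sharing actions, keyed by recognized subject tag (hypothetical data).
action_history = {
    "cat": ["send_to:alice", "send_to:bob", "post_to:blog", "send_to:alice"],
    "sunset": ["post_to:gallery", "post_to:blog"],
}

def suggest_actions(tags, history, limit=3):
    """Rank candidate actions for a new photo by how often each action
    followed the same recognized tags in the past."""
    scores = Counter()
    for tag in tags:
        scores.update(history.get(tag, []))
    return [action for action, _ in scores.most_common(limit)]

# The recognizer has just tagged a new photo as a cat:
print(suggest_actions(["cat"], action_history))
# -> ['send_to:alice', 'send_to:bob', 'post_to:blog']
```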
Similarly, when typing an email you could just start typing the message, and the device could analyze the content and present its best guesses at the likely recipient(s), letting you choose one or more before sending.
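The recipient-guessing idea can be sketched the same way. Again, the contacts and message history here are made up, and a real system would use something smarter than word overlap against your actual sent mail.

```python
def tokenize(text):
    return set(text.lower().split())

# Hypothetical history of mail previously sent to each contact.
sent_mail = {
    "alice@example.com": "here are the cat photos from the weekend",
    "bob@example.com": "quarterly budget spreadsheet attached for review",
}

def suggest_recipients(draft, history, limit=2):
    """Rank contacts by word overlap between the draft and past messages."""
    draft_words = tokenize(draft)
    scored = sorted(
        history,
        key=lambda contact: len(draft_words & tokenize(history[contact])),
        reverse=True,
    )
    return scored[:limit]

print(suggest_recipients("more cat photos for you", sent_mail))
# -> ['alice@example.com', 'bob@example.com']
```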
@Lunatic:
Sounds like we both agree about the usefulness of cursor keys. It’s funny you mention using “a (ewww) stylus” with the Newton; I’ve been coming to the conclusion that ditching the stylus was one of Palm’s many mistakes with the Prē. The stylus they used from the original PalmPilot all the way through the Centro meant that the user could tap pretty precisely on any screen element that was at least, say, 10 pixels in size.
The webOS developer guidelines, by contrast, advise that touchable targets should be at least 48 pixels on a side: nearly 5 times as large in each dimension, or roughly 20 times the surface area.
That has horrible consequences for the amount of information the OS can get on the screen at one time. Even though the screen resolution is, IIRC, slightly better than my Trēo’s was, I’m lucky if I can see half as many items at a time in my calendar or to-do list.
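To put rough numbers on it (a back-of-the-envelope sketch: the 10- and 48-pixel targets are the figures from the discussion above, 320×480 is the Prē’s resolution, and the row heights are idealized):

```python
stylus_target = 10   # px, rough minimum for a precise stylus tap
finger_target = 48   # px, webOS guideline for touchable targets

linear = finger_target / stylus_target    # ~4.8x per dimension
area = linear ** 2                        # ~23x the surface area
print(f"{linear:.1f}x linear, {area:.0f}x area")

screen_height = 480  # px, the Prē's portrait resolution
print(screen_height // stylus_target, "stylus-size rows vs.",
      screen_height // finger_target, "finger-size rows")
# 48 rows vs. 10 rows: the list can show only a fraction of what it used to.
```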
Your point about “while you’re on the go the device is mainly intended to display information, rather than enter it” is also interesting. It leads to a paradigm where entering information is something you only do when sitting down somewhere. Which I suppose is reasonable, but… I seem to do some of my best thinking when I’m walking or pacing. (Maybe it’s the increased circulation.) I’d really like to be able to jot down information, ideas, thoughts and notes while I’m walking around. (Of course, even old-school pen-and-paper can’t handle that very well.)