Wednesday, January 13, 2010

Managed vs Unmanaged Libraries

In the olden days, our music libraries were primarily unmanaged. It was essentially the only option. If you wanted to listen to a song, you had "Eagle Eye Cherry - Save Tonight.mp3" or "01 - save tonight - eagle eye cherry - [desireless].wav" or something along those lines. You interacted with the file within a file explorer, such as Windows Explorer, and kept track largely by file name. Some people used metadata in one form or another, such as MP3 ID3 tags, but just as many of us ignored or actively removed those tags and organized by carefully renamed files instead. Even more people just took the files as is, creating messes like "~ALLSTAR1.mp3" or the like.

Programs like Winamp, Windows Media Player, and RealPlayer provided the features necessary to organize libraries heavily on metadata, but they generally didn't create much of an abstraction layer between the file system and the music library. Your audio files were linked and their tags read, but the files were otherwise left alone where they were.

The Music Library: iTunes and Winamp

iTunes took a different default approach, and that made it a polarizing music library application. Unless the settings were changed, it organized your files for you by creating a folder for each artist under a single parent directory, and sorting each song into nested folders named after the album. The song files themselves were renamed to the track number, a separator, and the song name.

So on Windows, you might get:
C:\Documents and Settings\Gordon\My Documents\My Music\iTunes Library\Eagle Eye Cherry\Desireless\01 - Save Tonight.mp3
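As a rough illustration of the folder scheme described above, here is a minimal Python sketch that builds a managed path from a song's metadata. The function name `managed_path` and its parameters are made up for this example, and the " - " separator simply follows the sample path above; this is not how iTunes itself is implemented, and a real library manager would also have to deal with characters that are not allowed in file names.

```python
from pathlib import PureWindowsPath


def managed_path(library_root, artist, album, track_number, title, extension=".mp3"):
    """Build an artist/album/track path from metadata (hypothetical helper)."""
    # Zero-pad the track number so files sort in album order.
    file_name = f"{track_number:02d} - {title}{extension}"
    return PureWindowsPath(library_root) / artist / album / file_name


library = r"C:\Documents and Settings\Gordon\My Documents\My Music\iTunes Library"
print(managed_path(library, "Eagle Eye Cherry", "Desireless", 1, "Save Tonight"))
```

The point of the sketch is that the path is derived entirely from the tags, which is why a library stripped of its tags is a poor fit for this kind of management.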

People either hated it or loved it.

This was managed music.

When the Windows port of iTunes came out in 2003, I began using it in parallel with Winamp, my incumbent music player of choice at the time. Winamp was light, simple, and unmanaged. I had also stripped my entire library of all ID3 tags over the years, so it wasn't optimized for iTunes.

I was a hater of managed music.

I wanted to organize these files myself. I had my own folder structure and naming conventions. Yet over time, I grew to like iTunes managing my music for me. With every new file, I could drag it into the library, and it would take care of the rest (making a copy, renaming the file based on the metadata, and creating appropriate folders). Then I could delete that original copy or do whatever I wanted with it.

The thing was that, as much fun as it may be to have full control and say over how to manage your own music, it just wasn't scalable to me past a thousand songs, much less several thousand. If the computer is so much better at automation than a human, I figured we should be offloading this tedious work to these machines.

The Photo Library: iPhoto and Picasa

If iTunes is to managed as Winamp is to unmanaged, then iPhoto and Picasa are the respective analogues for the photo library.

Like iTunes, iPhoto by default created copies of any photo you dragged into the application, but also like iTunes, you had the option of disabling copying to the photo library.

With the latter, iPhoto would simply link to wherever that file sat, and wouldn't make a renamed copy. With the former, it would place a copy in a bundle "iPhoto Library" in the user Pictures directory, like this:

/Users/gordon/Pictures/iPhoto Library/Originals/2007/October 22, 2007/IMG_0001.jpg

This is nearly identical to iTunes behavior, with the one distinction that iTunes still involves folders that can be navigated through via UI the traditional way, instead of bundles that require "Show Package Contents" or a terminal.

/Users/gordon/Music/iTunes/iTunes Music/Coldplay/Viva La Vida - Prospekt's March Edition/1-07 Viva La Vida.mp3

Picasa, on the other hand, offers iPhoto's link-to-file option as its only option, but goes beyond it by monitoring those watched locations for changes to the files (renamed, deleted, etc.).

As a Picasa user, I haven't embraced iPhoto in the same way that I had with iTunes. I tried to determine why that might be, considering that I have a much more overwhelming number of photos than audio files to manage. It could be that the metadata in music files played a much bigger role in determining how to organize them. Many of us were already manually organizing by artist or album, perfect for metadata. Sorting and filtering by genre, beats per minute, length, and other attributes were perks, but the point was that a lot of us were already manually managing the same way auto-managing worked.

Whereas with photos, my guess would be that there is more fragmentation in the way these files are organized from person to person. Some might group by events (Graduation 2003, Italy 2002, Birthday 2008). Some might group by year, or month, or months nested within years. Or some might group by year, with the photos within each grouped by events. Some might not group at all.

iPhoto does allow you to create smart albums, a la smart playlists in iTunes, based on criteria, so you can do anything from pulling out all photos taken with a specific type of camera to more usual things like grouping by year.

But these aren't folders. These are like, again, smart playlists in iTunes or labels in Gmail. And this may not work for everyone.

Ultimately though, I would venture to conclude that perhaps not as many people see a need for managed photo libraries, as useful as it is, because photos are much more visual by nature, and can be spotted in a grid of thumbnails. In time, with the ongoing release of new tools like facial recognition and geotagging maps, perhaps managed photo libraries will be compelling enough for more of us to alter our ways.

Tuesday, January 5, 2010

Keyboard-based File Renaming

There are subtle differences in how a file navigator handles keyboard-based file manipulation that are easy to overlook. They appear minor or trivial at a glance, but the nuisances add up and can be detrimental to usability, especially when manipulating files in batches by hand.

There is a lot to focus on, so I will keep this one about file renaming.

Renaming in the Early Days

Back in the days up until and including Windows XP, if a user renamed a file (F2 by keyboard shortcut, or by "Rename" via right-click contextual menu), it would highlight the entire file name and its file extension. (This is, of course, unless known file extensions were hidden.)

This was a problem for a couple of reasons. If you didn't know what a file extension was, you would likely rename the file extension by accident. If you did, it was an extra three or four keystrokes for each file to place the cursor at the end of the actual file name. It doesn't sound like much, but you pride yourself on using keyboard shortcuts for the purpose of being fast.

Mom and Dad skiing.jpg (the entire name, extension included, is highlighted)

Fortunately, some years back, I noticed on Debian and Mac OS X that triggering a file rename highlights the file name only, with the option to move the cursor to the right into the file extension area.

Mom and Dad skiing.jpg (only "Mom and Dad skiing" is highlighted)

It was a huge improvement, but considering that most of the world was on Windows, it was important that this make it over to the Windows side.

Renaming Today: Cursor Placement

Starting with Windows Vista, this behavior was available in Windows Explorer. The only issue is that if you rename (F2 or by mouse) to highlight the file name only, and realize you want to append or delete characters from the end of the file name, you hit the right-arrow instinctively. That should place the cursor at the end of the name and before the dot of the file extension, right? This is the case in Linux and OS X, but not so in Vista or 7.

Before:
Mom and Dad skiing.jpg

After (Linux, OS X): Right-arrow places cursor before the dot:
Mom and Dad skiing|.jpg

After (Windows Vista, 7): Right-arrow places cursor after the dot:
Mom and Dad skiing.|jpg

What are the implications in this difference in behavior? Let's compare this to the way you would normally rename a file with the file extension hidden.

Before:
Mom and Dad skiing

After (All OS's): Right-arrow places cursor at the end of the file name:
Mom and Dad skiing|
Mom and Dad skiin|
Mom and Dad skii|

Without file extensions, the keys are F2, arrow-right, and then delete or append right away, in the case of Windows. So if we apply this with the incorrect behavior of placing the cursor right of the dot, we get this.

Before:
Mom and Dad skiing.jpg

After (Windows Vista, 7): Right-arrow places cursor after the dot, so the first backspace deletes the dot:
Mom and Dad skiing.|jpg
Mom and Dad skiing|jpg
Mom and Dad skiin|jpg
Mom and Dad skii|jpg


Renaming Today: Cursor Jumps

What if I want to jump the text cursor to the beginning of the name? In OS X, hitting the "up" and "down" keys moves the cursor to the start and end of the file name, respectively. In Windows, it simply doesn't register at all, so there's some further improvement that could be made there.

DOWN ARROW ↓: Mom and Dad skiing|
UP ARROW ↑: |Mom and Dad skiing

Renaming Today: Shift-Selection

If there's anything everyone seems to have implemented, it's holding the "shift" key as you add or subtract characters from a selection of text.

Mom and Dad ski|ing.jpg

Batch Renaming

Unless you know what regular expressions are, batch renaming options leave a bit to be desired. On the one hand, OS X has nothing really built into Finder, and leaves this work to AppleScript in the form of a handful of pre-written Automator scripts. On the other hand, Windows does provide a basic sequential batch renaming solution by appending numbers in parentheses to an otherwise identical base name. This has been available since Windows XP.

Ski trip.jpg
Ski trip (1).jpg
Ski trip (2).jpg
Ski trip (3).jpg
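For the curious, both approaches mentioned above can be sketched in a few lines of Python. The helper names `sequential_names` and `regex_rename` are invented for this example, and the sketch only computes new names; it deliberately never touches files on disk.

```python
import re


def sequential_names(base, extension, count):
    """Mimic the numbering shown above: the first file keeps the bare name."""
    names = [f"{base}{extension}"]
    names += [f"{base} ({index}){extension}" for index in range(1, count)]
    return names


def regex_rename(names, pattern, replacement):
    """Apply one regular-expression substitution to every name."""
    return [re.sub(pattern, replacement, name) for name in names]


print(sequential_names("Ski trip", ".jpg", 4))
# Camera-style names such as IMG_0001.jpg are an assumed input format.
print(regex_rename(["IMG_0001.jpg", "IMG_0002.jpg"], r"^IMG_(\d+)", r"Ski trip \1"))
```

The regular-expression version keeps the original numbers instead of renumbering, which is the kind of control the built-in file navigators don't offer.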

Otherwise, not much interesting is going on in the built-in UI file navigators.

Improvement and Progress

But OS parity aside, honestly, there have been some interesting solutions in third party file renamers such as batch renamer utilities. It pains me that there have been so few improvements implemented into the UI side for this tedious task. It could be something a little more inventive like predictive text entry or something as simple as the live spell check that exists everywhere else. It's all in the details, folks.

Monday, December 14, 2009

Of Physical and Virtual Mobile Keyboards

Nearly three years ago, I watched as Steve Jobs presented a slide of images of existing smartphone physical keyboards, from the Blackberry to the Treo, and then reasoned afterwards that a touch keyboard held the advantage of being adaptable to any situation. This reduces clutter, saves space, and allows for a larger screen without the need to add physical bulk (even if it's just millimeters) required to produce a slide-out keyboard.

Yet, a few years onward, there are people who still insist that a hardware keyboard has an advantage over a soft keyboard like the touch keyboard.

Tiny Plastic QWERTY buttons

I'm no stranger to the tiny plastic buttons comprising the QWERTY keyboards on phones. I expressed real interest in a Blackberry or Treo in 2005, and I composed emails on my father's Blackberry Curve throughout 2008, one of them being 476 words / 1941 characters long without spaces. These keys hurt my thumbs during any extended typing, and this is speaking as someone with relatively slender fingers compared to the average person.

The iPhone keyboard, on the other hand, has never given my fingers any pain or strain, even with extended typing sessions. I don't have to press down hard on each little plastic button far smaller than the size of a thumb or fingertip. Now, how hard you have to press down on a key varies from physical keyboard to physical keyboard on a phone, but in order to maintain slim form factors, they typically won't be anywhere near as easy to press as a keyboard on the desktop (where physical keyboards do make absolute sense).

Large Touch Key Sizes and Dynamically Resizing Landing Areas for Keys

Besides, the area for a touch key for an iPhone, at least, is larger to begin with. And beyond that, apparently the iPhone's touch keyboard guesses what the next likely characters are, and enlarges the invisible touchable area beneath the visible key to give the user a more forgiving margin of error for that key. For example, if I'm typing "G", the letter "I" will have a slightly larger tappable area because it is statistically likely to follow, whereas the letter "Q" will not.
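I can only guess at how this works internally, but the general idea of a probability-weighted landing area can be sketched as follows. Everything here is an assumption made for illustration: the key coordinates, the next-letter weights, and the scoring rule are invented, and none of it is taken from the iPhone itself.

```python
import math

# Hypothetical key centers in arbitrary layout units (three neighbors on one row).
KEY_CENTERS = {"u": (6.0, 0.0), "i": (7.0, 0.0), "o": (8.0, 0.0)}

# Made-up likelihoods of the next letter given the previous one.
NEXT_LETTER_WEIGHT = {"g": {"i": 0.6, "o": 0.3, "u": 0.1}}


def resolve_tap(x, y, previous_letter=None):
    """Pick the key for a tap, letting likely keys claim a larger area."""
    weights = NEXT_LETTER_WEIGHT.get(previous_letter, {})

    def score(key):
        center_x, center_y = KEY_CENTERS[key]
        distance = math.hypot(x - center_x, y - center_y)
        # A likelier key divides its distance by a bigger factor,
        # which effectively enlarges its invisible tappable area.
        return distance / (1.0 + weights.get(key, 0.0))

    return min(KEY_CENTERS, key=score)


# The same slightly-off tap resolves differently depending on context.
print(resolve_tap(6.45, 0.0))
print(resolve_tap(6.45, 0.0, previous_letter="g"))
```

Without context the tap goes to the nearest key; after a "g", the same tap is pulled toward the statistically likelier neighbor. The visible keys never change size, only the scoring does.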

Auto-Correction

In fact, the guesswork itself of the next probable character is nothing new, and has existed in the form of predictive text (such as T9). I could press 7(pqrs) 3(def) 2(abc) 3(def) to get "r-e-a-d" instead of pressing 777(pqRs) 33(dEf) 2(Abc) 3(Def) with the delays between each numeric pad key. This was a form of auto-correction, which also wasn't new when the iPhone came onto the scene, but it's easy for people to forget the important role auto-correction plays when it comes to touch keyboards. This isn't your typical shopping mall kiosk where the touch keyboard is just a literal software port of the physical keyboard.
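The difference between the two keypad methods is easy to show in code. This is a simplified sketch of the idea rather than the actual T9 implementation; the function names and the tiny word lists are my own, and the keypad is the standard phone letter layout.

```python
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {letter: digit for digit, letters in KEYPAD.items() for letter in letters}


def multitap_keys(word):
    """Multi-tap: press a key once per position of the letter on that key."""
    presses = []
    for letter in word.lower():
        digit = LETTER_TO_DIGIT[letter]
        presses.append(digit * (KEYPAD[digit].index(letter) + 1))
    return presses


def predictive_keys(word):
    """Predictive text: one press per letter, whatever its position on the key."""
    return "".join(LETTER_TO_DIGIT[letter] for letter in word.lower())


def candidates(digits, dictionary):
    """Words from a dictionary that share the same digit sequence."""
    return [word for word in dictionary if predictive_keys(word) == digits]


print(multitap_keys("read"))
print(predictive_keys("read"))
print(candidates("4663", ["good", "home", "gone", "read"]))
```

"read" drops from seven presses to four, but the last line shows the catch: several words can share one digit sequence, so the software has to guess which one you meant and let you override it.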



It saves you time by automating capitalization, word completion, and punctuation. This all factors in when talking about the speed aspect of the keyboard.

Insertion Points, Selection, and Cut/Copy/Paste

But it doesn't end there. If a user wants to insert a text cursor in a random location of a long line or paragraph, and select and cut/copy and paste a portion of the content, the touch-based selection holds a huge margin in ease and speed versus a D-pad or trackball. On a Blackberry Curve with a trackball, this requires moving the cursor incrementally character by character, and line by line. The physical setup on current phones doesn't allow you to quickly jump across huge blocks of text with a tap, or maintain a simple selection above and below the fold of the current viewport.

The touch-based setup also allows the developers of the operating system to write and add aids that assist with text selection. For example, in the case of the iPhone, tapping and holding down on an area of text places the cursor there and displays a magnifying glass over the selection. This is immensely helpful with accuracy at high speeds of movement across lines of text, and more so with small text.

Foreign Languages and Special Characters

With physical keyboards, for the most part, what you see is what you get. If you want special characters, you had better hope they're hidden inside a symbols modifier key, but otherwise, you're out of luck. The iPhone's touch keyboard appears to also have special characters behind a numbers/symbol button at first glance, but you can additionally hold down particular letters to place special characters without ever leaving the QWERTY view. For example, you can access "ó" by holding down the letter "o", "¿" by holding down the "?" key, or "€" by holding down the "$" key. The list goes on, and this is only for the U.S. English keyboard. Each of the international keyboards has its own version of this too, which brings me to the other special character advantage - foreign languages.



I've just illustrated how quickly accented letters could be added, which comes in useful for Romance languages. But what about non-Latin-based languages? What about, say, Traditional Chinese?

This is where essentially all physical phone keyboards fall short entirely. Again using the iPhone as an example, you can use international keyboards exclusively, in dual mode with another, or in even higher multiples. In my case, I have the U.S. English keyboard in dual mode with the Traditional Chinese keyboard, which uses gestures to allow the user to literally write out the entire character on a virtual pad. The same guides apply - there's its own form of auto-correction with its guesswork of what character you're about to write. It can usually guess the character before you've drawn all the strokes, and it uses the best guess by default unless you tell it otherwise. The predictive text is also there, as it guesses the next likely character to follow the one you've just input.

You could even write your strokes in a relative mess, or even in simplified Chinese, and it would still figure it out. And the appropriate punctuation and everything else you'd want is waiting there right where you need them.

Adaptive and Future-Proof

I can easily concede that the lack of tactile feedback on a touch keyboard like the iPhone's is no minor drawback, and it has appeared that manufacturers have attempted various ways to compensate, ranging from the iPhone's audio click cue to the Blackberry Storm's click screen (though a bit confusing with the flat/depressed split mode) to the Android phones' vibrational tap feedback. From what I have seen in the patents, such as the pop-out surface bumps, the progress of the research looks promising, and I believe that it's simply a matter of time for tactile feedback to be perfected on a touch key.

Yet, even when considering all these features, the characteristic of virtual keyboards that is most appealing is how future-proof they are. Not only does the touch keyboard adapt to any situation, but when there comes a need for something new in the future, perhaps a new currency symbol or an entirely new method of input, it's ultimately just a software update away with the phone you already have.



* Footnote: There are phones that sport physical keyboards with touch screens, though typically a touch keyboard has either been nonexistent or hacked in, or tossed in as a poorly executed afterthought.

Tuesday, December 8, 2009

In-Browser Search Engine Switching

When I first used Firefox under the Firebird name, one of my favorite features was the ability to quickly add and switch search plugins for other sites. In the case of Firefox, you could type one query, and any other search engine or site search was just a click away. Or for keyboard shortcut aficionados, ctrl+k/cmd+k > ctrl/cmd + up/down > enter.

Safari didn't offer this feature, but years back, I discovered a third party Safari plugin called Inquisitor, at the time the work of an independent developer. Among the features it offered, it allowed users to add and switch between search engines with a single query.

But what I loved most was how easy he made it to add search plugins. You see, for Firefox, I wrote several search plugins starting at the end of 2004 and beginning of 2005, using the Sherlock format. Some of these (Yahoo! Movies, Yahoo! Widgets) have been replaced by OpenSearch versions uploaded by other people, but some of the early ones remain in case you want to see what I'm talking about (Cal Berkeley plugin from February 15, 2005).

With Inquisitor, on the other hand, we could simply use a variable representing the query within the URL parameter used in any given site search. For example, if I searched IMDB for "Memento", the URL ended up looking like this: "http://www.imdb.com/find?q=memento;s=all". At that point, I would be able to replace the "memento" search query with a variable in the Inquisitor settings to get this: "http://www.imdb.com/find?q=%@;s=all", where %@ just happened to be the variable used by Inquisitor.
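The substitution itself is trivial, which is exactly why it was so pleasant. Here is a minimal Python sketch of the idea; the function names are mine, and the use of `quote_plus` to escape the query is an assumption, since I don't know how Inquisitor encoded queries internally.

```python
from urllib.parse import quote_plus

PLACEHOLDER = "%@"  # the variable Inquisitor used, per the example above


def make_template(example_url, example_query):
    """Turn a real search URL into a template by swapping out the query."""
    if example_query not in example_url:
        raise ValueError("query not found in URL")
    return example_url.replace(example_query, PLACEHOLDER, 1)


def build_search_url(template, query):
    """Fill the placeholder with an escaped query (escaping scheme is assumed)."""
    return template.replace(PLACEHOLDER, quote_plus(query))


template = make_template("http://www.imdb.com/find?q=memento;s=all", "memento")
print(template)
print(build_search_url(template, "save tonight"))
```

Compare that to writing a full search plugin file for every site, and the appeal is obvious.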

Suddenly, I could add just about any site within seconds, from finance quote searches to torrent sites to corporate intranet searches.

It didn't cross my mind that someone could easily top this, but Google did just that with Chrome. When typing a domain like imdb.com into the hybrid URL-search bar, the right side of the bar hints that you can hit the "tab" key to type a search query for a search within that site (in this case, imdb.com).


Most major dedicated search engines try to facilitate site-specific searches these days, but for times when you want to perform a site search, the browser has evolved to help get you there.

Friday, August 7, 2009

Password Masking

Caught in the ongoing tug between ease-of-use and security is password masking, a point of contention in the past with some of my colleagues working more closely with security related issues. Whether security and usability necessarily have to be inverses of each other is something to leave for another post, but what's clear is that it's certainly the case with our current form of masking typed text strings in password fields. There are three major types of password masking I've seen.
  1. Full masking: Every alphanumeric character or symbol is represented by an asterisk or dot, effectively masking them.
  2. Partial masking: All characters are masked, except for the last typed key. (Example: iPhone OS)
  3. Invisible: No asterisks, dots, or replacement characters of any kind will display. (Example: Unix environments)
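The three types above can be summarized in a short Python sketch of what each one would display for a given typed string. The function name and mode labels are mine, and the sketch ignores timing (on iPhone OS the last character is only revealed for a couple of seconds).

```python
def mask_password(typed, mode, mask_char="*"):
    """Return what the password field would display for each masking type."""
    if mode == "full":
        # Every character is replaced, so only the length leaks.
        return mask_char * len(typed)
    if mode == "partial":
        # Everything is masked except the most recently typed character.
        if not typed:
            return ""
        return mask_char * (len(typed) - 1) + typed[-1]
    if mode == "invisible":
        # Nothing is shown at all, not even the length.
        return ""
    raise ValueError(f"unknown masking mode: {mode}")


for mode in ("full", "partial", "invisible"):
    print(mode, repr(mask_password("hunter2", mode)))
```

Laid out this way, the trade-off is plain: each step down the list leaks less to an onlooker and gives less feedback to the person typing.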

Full Masking

Full masking is the most common technique, and while it tells you where you are in the password you've typed so far, it gives observers that information too. With this technique, the only way an onlooker can grab those passwords is by a combination of physically watching the keys typed and educated guesses based on what the asterisks hint about the password (its length, whether the typist slowed down to enter numbers or symbols, and so on). Remember, I'm talking about what can be attained visually and audibly, as that is the point of password masking. Areas like keylogging, plaintext passwords, and such are another area of concern entirely. Now, when it comes to full masking, it generally works fine until something is causing typos, which masking will hide. This includes common mistakes like leaving the caps lock key on, or missing a shift modifier key, or general typos with commonly misspelled words or lengthy randomized text strings. It's worth noting that the caps lock issue is sometimes addressed by detecting that it's on, and subsequently warning the user when it is.

To deal with this issue, it seems that sometimes entirely masked passwords come with an option to toggle the asterisks on and off, such as with the WEP/WPA key fields in OS X. In other words, it's an override option to temporarily remove masking at the discretion of the user.

Partial Masking

But what happens when full masking carries over to a device where typos are far more frequent, such as a mobile device? The user could slow down immensely, or type at regular speed and hope that the login won't lock down or throw a CAPTCHA form after a couple invalid attempts. The iPhone OS addresses this by masking all characters in dots, except for the last typed character (for a couple seconds). It's an improvement, but at the expense of anyone peering over your shoulder seeing each last character. Anyone keeping an eye the entire time can thus see your entire password in the clear, and at a readable pace considering that even the fastest typists on mobile keyboards are a huge margin from the fastest on the desktop keyboards. I have mixed feelings about this, but then again, even fully masked, typed keys on touch keyboards display their character in a tab above the area obscured by the tapping finger. So any watchful person can still catch on that way, regardless of whether the password field itself is fully or partially masked. This is evident on such touch keyboards as the ones on iPhone OS and Android.

Invisible Masking

In a UNIX environment, you'll notice that password prompts give no feedback for what you're typing or what you've already typed, ironic given that this environment is where strong complex passwords are common. I've seen this confuse many, many users, and it's a commonly asked question that won't go away. Eventually, most people get accustomed to this, and it becomes just about as easy to use as full masking - for most cases, that is. But when it comes to lengthy randomized passwords, entering passwords becomes a snail-paced task, during which keystrokes become easier to observe and follow. (This is unless you happen to be a god at rapidly typing 40-character alphanumeric, mixed-cased passwords interjected with symbols with and without modifier keys.)

The Locks at Every Gate

As we raise the number and complexity of locks on a gate, we construct higher and higher hurdles for intruders to overcome, but also for the people who have to encounter these security measures every day, every hour, or even every few minutes. These measures typically work fine, but serve users poorly in extreme cases, to the point of inducing people to find less secure workarounds ranging from writing passwords down to copy-pasting a password in the clear. (Try typing a 30-character alphanumeric, mixed-case Wi-Fi WPA2 key on a mobile physical or touch keyboard, and see if you aren't tempted to copy-paste too.) The fewer visual and acoustic cues there are, the more it slows everything down. You'll get used to it, but we just need to vary where to draw the line on a use-case basis because ultimately the most inconvenienced person is not the sinister character - it is you.

Sunday, August 2, 2009

Blast from the Past: SDI and MDI

This is an entry I wrote on March 20, 2006 touching upon SDI and MDI:

Warning: This is a usability and interface topic. You may quietly exit through the back doors. No hard feelings. Otherwise...

I know that Adobe Acrobat 7 has been out for quite a while now, but I figure that I need to get the word out wherever I can. The following is a problem that's been bothering me since Acrobat 6.

Notice this. Earlier versions of Adobe Acrobat used a multiple document interface (MDI), where all documents resided within a single parent window. The problem was that they forgot to add "tabs" for easy navigation between the documents in this multiple document interface.

I wrote a complaint in the official forums a while back, and in version 7, it seems that they finally tried to solve the problem by switching to a single document interface (SDI), where each document has its own window on the Windows Taskbar. But the Adobe Acrobat team forgot something again. If you exit any given document with the Microsoft Windows [X] button (the red one in Windows XP), every single document closes. The expected behavior, based on other applications written for Windows, is that only that one document should close (not all of them).

Or perhaps the Acrobat team has a good explanation for this behavior? (I certainly can't think of one.)

My original entry: http://gordeonbleu.livejournal.com/20578.html

Friday, July 24, 2009

Inline Autocompletion

Inline autocompletion is a common part of search bars, but for the longest time, autocompletion was anchored to the beginning of the URL in a web browser address bar. In the middle of last year, Firefox included an "awesome bar" in version 3, which allowed us to type "lunar" to bring up a past history entry or bookmark for "http://en.wikipedia.org/wiki/Penumbral_lunar_eclipse", whereas other browsers required typing "en.wiki..." (not even flexible enough to allow "wikipedia" to yield results). Over a year onward, and this still hasn't spread to other browsers.

Inline autocompletion


Firefox: en.wikipedia...
Firefox: wikipedia...
Firefox: lunar

Anchored autocompletion


Firefox: en.wikipedia...
Firefox: wikipedia...
Firefox: lunar...
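The difference between the two boils down to a prefix match versus a substring match. Here is a minimal Python sketch under that assumption; the function names and the two-entry history are made up, and a real address bar does more than this (the awesome bar also looks at page titles and ranks its results, which is not modeled here).

```python
HISTORY = [
    "http://en.wikipedia.org/wiki/Penumbral_lunar_eclipse",
    "http://www.imdb.com/find?q=memento;s=all",
]


def strip_scheme(url):
    """Drop the leading "http://" so matching starts at the host name."""
    return url.split("://", 1)[-1]


def anchored_matches(typed, urls):
    """Match only when the typed text starts the URL."""
    typed = typed.lower()
    return [url for url in urls if strip_scheme(url).lower().startswith(typed)]


def unanchored_matches(typed, urls):
    """Match when the typed text appears anywhere in the URL."""
    typed = typed.lower()
    return [url for url in urls if typed in url.lower()]


for typed in ("en.wiki", "wikipedia", "lunar"):
    print(typed, anchored_matches(typed, HISTORY), unanchored_matches(typed, HISTORY))
```

Only "en.wiki" finds the page with the anchored version, while all three find it with the unanchored one.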

Seeing as most web browsers haven't integrated their search and URL bars entirely as Chrome has, this is one handicap of most browsers that maintain discrete address bars, as they miss out on one of the top usability benefits of unanchored autocompletion - lessening the requirement on the user to remember URLs.