2012-02-21

Mr. Invalid Character

Introduction


I finally received a new laptop at work. It's a brand new MacBook Pro. It should not be very surprising that one of the first things I wanted to install (and try out) was Xcode. The only problem is that Xcode is available only through the App Store. Therefore I had to create an Apple ID.

To be honest, I wasn't expecting so many internationalization defects in such a simple application. I thought it was not that hard to create a simple registration form... Well, I was wrong.

Birth Date


The first thing that caught my eye was the birth date selection. I don't know why Apple needs to know my birth date in the first place, but that's not the worst part.

The worst part is that they assume everybody uses the Gregorian calendar. I am not so sure I would know my Gregorian birth date if I were Russian (for example).

Another issue is that, rather than letting me simply type the day, they force me to choose it from a drop-down. With 31 items, it is not too easy to select the right one and, to be honest, I really dislike it. Apple is known for its usability, but to me this is clearly a mistake.

Billing Address


For no apparent reason, Apple requires a billing address, even if I select "None" as the payment method.


I wouldn't care so much, but their registration form is full of i18n defects.

The first thing you might notice is the title. It is required. You won't see it in the screenshot, but there are only a few titles you can select. This is wrong, as titles surely depend on the culture. Personally, I prefer no title at all and I feel a bit offended if somebody addresses me this way.

Another thing: it seems that only ASCII characters are accepted in the Personal Name and Address fields. Dear Apple, my name is Paweł. Like it or not, this is my name. And I don't want it stripped down to plain ASCII.
The name of the street where I work also contains Polish diacritics. It is a very interesting idea to expect that street names all over the world would be written in unaccented Latin letters.
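
For what it is worth, accepting such names is not hard. Below is a minimal Java sketch of a name check that allows letters from any script. The class name, the method name and the exact set of permitted punctuation are my own assumptions for illustration; I have no idea how Apple's form is actually validated.

```java
import java.util.regex.Pattern;

final class NameValidator {
    // Hypothetical rule: start with a letter from any script, then allow
    // letters, combining marks, spaces, apostrophes, periods and hyphens.
    private static final Pattern NAME =
        Pattern.compile("\\p{L}[\\p{L}\\p{M} '.\\-]*");

    static boolean isValidName(String input) {
        return input != null && NAME.matcher(input.trim()).matches();
    }

    public static void main(String[] args) {
        // "Pawel" written with the Polish stroked l (U+0142)
        System.out.println(isValidName("Pawe\u0142"));          // true
        // The city of Lodz written with its Polish diacritics
        System.out.println(isValidName("\u0141\u00f3d\u017a")); // true
        System.out.println(isValidName("1234"));                // false
    }
}
```

Even this is too strict for some real names, so treat it as a starting point rather than a rule.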

Last, but not least: Apple expects everyone to have a first name and a last name. The problem is, this also depends on the culture.

That's all folks.

Thank you for registering, Mr. Invalid Character!

2012-02-17

Language switcher drop-down anti-pattern

Note: This post is related to User Experience (UX) and Internationalization. I am focusing specifically on web applications. Please keep in mind that there are specific use cases where a Language Switcher makes sense (this especially applies to static web pages).

What is a Language Switcher?


By "Language Switcher", I mean the facility that lets you change the current language of the application. By "Language Switcher drop-down", I mean a drop-down menu (aka combo box, aka pull-down menu) that lets you choose and change the active User Interface language and is placed on every page (as opposed to being placed on the preferences page only).

Why I decided to write this post


It seems that more and more people are thinking of using a Language Switcher drop-down nowadays. To make it worse, its usage no longer seems to be limited to static web pages (the original use case for this mechanism); it is increasingly common to see a Language Switcher in desktop or mobile applications.
I am not going to dive into the details of why it is not a good idea to even think of a Language Switcher in the latter case, but please answer this question: "I have already chosen the User Interface language for my device, why do you bother me to make a choice again?".

Please let me know if you have a good answer to the aforementioned question. However, please be warned that I am going to accept objective answers only. "Because I sometimes want to see an application in a different language" is very subjective and does not add anything to the discussion.

Language Switcher drop-down pros and cons


Let me list specific pros and cons of using Language Switcher drop-down. If you can think of more examples, do not hesitate to post a comment.

Pros of using Language Switcher


1. It is very easy to change the language.

That's really indisputable: even a rather inexperienced user should be able to switch the language, provided that (s)he is able to locate the Switcher and understand its meaning (see cons below).

2. It is fairly easy for technical support engineers to troubleshoot foreign-language users' problems.

Let's imagine the following problem. A Chinese-speaking user has a technical problem and needs the help of a non-Chinese-speaking support technician. The support engineer logs in to the Chinese user's machine remotely and navigates to your application. As (s)he does not speak Chinese, there is a need to switch the language in order to troubleshoot the problem.
In such a case the Language Switcher drop-down (see prior point) is a really great solution. Of course other solutions to the problem might exist, but none of them is as easy to use.

3. Sometimes users log in from a public computer (configured for another language).

This use case is usually listed as number one when we talk about static web pages. However, in the context of a web application, it is not necessarily as strong an argument.

Of course it is true that if you do not preserve the language selection in the right way (that is, somewhere other than a cookie or Local Storage), a Language Switcher is probably a must.

But then again, you really should persist this setting. If you do, the only area where you might face a language selection problem is new user registration.

However, it is a very bad idea for users to register from a public device.

For starters, the device might be full of spyware, trojans and keyloggers, and you surely don't want your users to take that risk. Therefore, it might actually be better for users not to understand what's going on - in this case they will probably give up before disclosing sensitive information about themselves.

Cons of using Language Switcher


1. Language Switcher is sometimes used as an excuse for not implementing proper Language Negotiation.

Some people think that detecting and selecting the appropriate language from the HTTP Accept-Language header (which any modern web browser sends along with the HTTP request) is too hard.
They might say: we implemented a Language Switcher, so what's the problem if the application is in English?
The problem is, most of the world's population does not speak English. According to Wikipedia, only around 1.5 billion people out of 7 billion total speak this language.

The very fact that you decided to localize your software (otherwise implementing a Language Switcher would be pointless) means that you somehow care about these people.
Therefore you really need to detect the language from the web browser (you at least need to set some reasonable default), and having a Language Switcher (in any form, not just a drop-down) is no excuse.

I list this as an argument against because it is raised surprisingly often. OK, it is not as important as the others, so please read on.
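
To show that Language Negotiation really is not that hard, here is a minimal sketch in Java. It relies on Locale.LanguageRange.parse() and Locale.lookup(), which are only available from Java 8 on, so take that as an assumption about your platform. The class name, the list of supported languages and the English fallback are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class LanguageNegotiator {
    // Hypothetical set of languages the application is localized into.
    static final List<Locale> SUPPORTED = Arrays.asList(
        Locale.forLanguageTag("en"),
        Locale.forLanguageTag("pl"),
        Locale.forLanguageTag("zh-CN"));

    // Hypothetical default, used when nothing matches.
    static final Locale FALLBACK = Locale.forLanguageTag("en");

    // Picks the best supported language for an Accept-Language header value.
    static Locale negotiate(String acceptLanguage) {
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return FALLBACK;
        }
        try {
            // Ranges come back sorted by their q-weights, highest first.
            List<Locale.LanguageRange> ranges =
                Locale.LanguageRange.parse(acceptLanguage);
            Locale match = Locale.lookup(ranges, SUPPORTED);
            return match != null ? match : FALLBACK;
        } catch (IllegalArgumentException malformedHeader) {
            return FALLBACK;
        }
    }

    public static void main(String[] args) {
        // A Polish browser asking for pl-PL is served the "pl" translation.
        System.out.println(negotiate("pl-PL,pl;q=0.8,en;q=0.5").toLanguageTag());
        // No German translation exists, so the fallback is used.
        System.out.println(negotiate("de-DE,de;q=0.9").toLanguageTag());
    }
}
```

In a servlet you would feed this with the value of the Accept-Language request header; the matching itself stays the same.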

2. Web browser's default language is valid for most of the users.

Most of the time, the web browser's language will be set to the user's default language taken from the Operating System. That's a reasonable default.
Firefox uses a different approach - its language will be set to the Firefox distribution language, that is, the Chinese Simplified version of Firefox will send "zh-CN" as the user's most appropriate language.

Therefore if you are a Chinese user living in the US and you are using an English OS, you have two choices: either switch your default web browser's language or install the Chinese version of Firefox (localized builds are available for any language and system).

On the other hand, most web browsers let you change your preferences (Safari is an unfortunate exception). It is really not that hard. Of course that would be a problem for the computer illiterate, but I don't really think a Language Switcher would resolve that issue (see below).

3. Implementation of the Language Switcher raises many technical problems.

On the technical side of things, implementing a Language Switcher is not the easiest (and not the best) thing to do. That's because of a few facts:

  • Most Web Frameworks are not designed to handle language switching
    • That means problems with refreshing the User Interface with new language values - this applies to translations obtained from resource files, but... See below;
    • If you have some controls in your application UI that are bound to a database or a web service, chances are high that you will have to close the connection and re-open it with the new Locale (not just the language). This will require a considerable amount of coding effort;
    • In most cases you would need to create logic for reading the translations manually as opposed to using the default framework mechanism (in the case of ASP.NET that means that you'll be forced to use the Explicit Localization mechanism as opposed to the Implicit Localization one). That's another few (hundred) lines of code that could possibly contain defects;
  • Language is rarely part of the Domain Model
    • If you use the Model View Controller or Model View Presenter design pattern, language will either pollute your domain, or you will need to violate the Single Responsibility principle if you let the Controller or Presenter take care of language switching;
    • If you use the Model View ViewModel pattern everything should be fine, but you might run into problems like not being able to switch language (that really depends on the framework used).
  • It's not too easy to persist a language change
    • The easiest way to persist a language change is to store it in a cookie or to save it to Local Storage;
      • That method works well for single-device users only
      • Cookies might expire or be deleted
    • Storing it in a database poses problems
      • As language is not part of the domain model, you would need to tightly couple your business classes to user preferences (i.e. put the language preference into the Model)
      • Alternatively you would need to maintain a separate Model (possibly with a separate database connection) to store only the language preference and tightly couple it to each controller (does not really make sense, does it?)

I hope you will agree that it means a lot of work and a considerable amount of code which might fail. At the same time, the quality of the code will be lower than desired (due to unavoidable tight coupling).
It might be a problem to maintain the code should your application evolve.
And as we know, applications do evolve. Unless your application is useless, people will request new features, or the business problem it solves will evolve requiring you to make changes.

4. Implementing Language Switcher drop-down means a lot of testing.

There is no way to get away from this: put a Language Switcher on every page of your application and you will need to test it. On each and every page. Of course, you can implement automated tests (using Selenium for example), but this will be quite an effort.

5. It costs a lot of money.

Because of the technical difficulties with implementation, as well as the serious amount of testing required, it will cost you a small fortune to implement the Language Switcher drop-down feature.
The sheer amount of code required to implement it will cause defects. That's because people are not robots and do make mistakes.
As I said, maintaining the code will be problematic (due to tight coupling). Problematic also means more costly.

6. Language Switcher might result in cluttered User Interface.

From the User Interface design perspective, the best UI is a simple one. You should really avoid placing too many controls.
Simple designs are usually much easier to use and often result in more successful products. Simplicity sells.

I hope your company's application really looks closer to the Apple or Google products:



With a Language Switcher (especially one that is not a drop-down) the interface of your application might get crowded.

And because of the basic fact that translated texts are usually longer than English ones, it might be very hard to maintain the desired look and feel.

Do not overcrowd your Main User Interface with Language Switchers and other unnecessary stuff.

7. There is no placement pattern.

Continuing on the UX front, there is no really meaningful design pattern which dictates where you should put your Language Switcher drop-down.

I would expect it somewhere in the upper right corner of the page (for languages written from left to right) or in the upper left corner (for languages written from right to left).

For languages written vertically, I really have no idea where to put this control.

You decide:


Which is better: a), b) or c)?

This problem is not just hypothetical; if there is no placement pattern, chances are high that your users will have trouble locating and using the Language Switcher.

By the way: on YouTube, Google placed the Language Switcher drop-down, together with the country selector, in the footer of the page. Personally I don't think that many people will even notice it.

8. There is no reasonable way to highlight the purpose of the Language Switcher.

Although the Language Switcher drop-down is getting more common, it is still uncommon enough that you need to make its purpose clear to users.
This is especially the case when the user sees the User Interface in a foreign language.
Many users won't be able to understand the meaning of the language name and the purpose of the Language Switcher.

That's exactly the reason why many designers tend to highlight the language switcher using flags correlated with a language (the example shows a simple switcher, but people use flags for drop-downs too):


I am sorry to let you know, but this is a design mistake on its own.

For starters, a flag identifies a territory, not a language.
So what would you do: use the flag of the largest country, or of the country the language originates from?
If you do that, quite a few people will be offended. That's because there are quite a few countries whose official language is that of their former occupants (or which have some other complicated history that makes them unhappy with the flag selection).

On the other hand, there is a surprisingly large number of countries with more than one language. What would you do if your application were to support Hindi, Bengali and Telugu? Would you put an Indian flag next to each one of them?

Also, please keep in mind that there are several countries that use the same flag.

How do you handle language variants, e.g. Norwegian Bokmål and Norwegian Nynorsk?
Reusing the flag does not sound like a very good idea to me.

Last, but not least, flags will not be helpful for color-blind people.

These were the reasons why Microsoft used language identifiers to highlight the purpose of the Keyboard Switcher in their MS Windows OS. But then again, language identifiers like "de", "en-US", "pl" and "zh-Hans" are meaningless to most people.

Without a proper highlight (and there is no way to do it), the Language Switcher will be ignored by the people who need it most - the ones who don't really know how to change the web browser's preferences.

9. It is hard to organize the menu with a large number of languages.

With a large number of supported languages, the classic drop-down design is too crowded to be useful. You need to somehow split it into groups (for example by geographic regions).
However, that poses other issues, the most prominent one being how to split it. If you decide on splitting it geographically, you might need to put one language into several groups.

Choosing the grouping scheme will give you a headache. Below is how Google did it on YouTube:


Apart from a few bugs (like writing the name of the Polish language with a capital letter, or incorrect alignment of languages written right to left) it is designed pretty well. Just imagine having all those languages in a single list...

10. The Language Switcher doesn't really belong in the Main UI.

The language is a preference. Just as with any other preference modifier, the place of the Language Switcher is in the user profile (along with other preference mutators).

It is hard to dispute this argument; you won't put email settings on each and every page, you won't put the time zone or date preference in the Main UI, and so on.

User preferences have their place in the user profile, and the desired language is not really an exception.

Summary of pros and cons


As you can probably see, the cons outweigh the pros. That's exactly why the Language Switcher drop-down is an anti-pattern.

However, please keep in mind that, as with any other design pattern, there are cases where a pattern becomes an anti-pattern and vice versa. In most cases you really should not bother with a Language Switcher implementation, but there might be a specific business case where it is needed. Just as with any other UX pattern, you need to conduct user studies to better understand how users interact with your application.

I focused on web applications, but in the case of desktop and mobile applications the reasoning would be very similar. In most cases, you should not even think of implementing Language Switchers - Operating Systems have Locale settings for a reason.

At the same time, please keep in mind that a Language Switcher might make a lot of sense in the context of static web pages. You won't usually find user profiles in such a case, and the contents of the web page might differ depending on the language.
Again, knowing how your users use your web page might be an important factor in the decision-making process.

What to do instead?


Here is what I would do:

1. Move language switcher to user profile (preferences page), where it really belongs.

I don't think that we need a special exemption for the Language Switcher; this is a preference and it should be placed along with other Regional Preferences such as Date & Time format, Number Format, Time Zone and Sorting Order.

How to design a Regional Preferences page is another topic, which I hope to cover soon.

2. Redirect user to preferences page on first log on.

In the case of Enterprise Applications, the user account will usually be created by a System Administrator (either manually or with the aid of some script). That's why I suggest redirecting the user to preferences on first log on.
That way, the user will be able (or even forced) to set valid preferences.

Of course, you need to provide a valid default, that is, guess the language settings from the HTTP Accept-Language header and guess the time zone from the web browser (the UTC offset might be part of the login form as a hidden field set up by a simple client-side script).
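
Here is a rough sketch of the time zone part in Java. I am assuming that the hidden field carries the value of JavaScript's Date.getTimezoneOffset(), which is expressed in minutes and is positive for zones behind UTC. The class and method names are made up for this example.

```java
import java.util.Locale;
import java.util.SimpleTimeZone;
import java.util.TimeZone;

final class TimeZoneGuess {
    // offsetMinutes: assumed to be the browser's getTimezoneOffset() value,
    // i.e. minutes to ADD to local time to get UTC (so the sign is inverted).
    static TimeZone fromBrowserOffset(int offsetMinutes) {
        int minutesEastOfUtc = -offsetMinutes;
        char sign = minutesEastOfUtc < 0 ? '-' : '+';
        int absolute = Math.abs(minutesEastOfUtc);
        String id = String.format(Locale.ROOT, "GMT%c%02d:%02d",
            sign, absolute / 60, absolute % 60);
        // A fixed-offset zone: it knows nothing about daylight saving time.
        return new SimpleTimeZone(minutesEastOfUtc * 60 * 1000, id);
    }

    public static void main(String[] args) {
        System.out.println(fromBrowserOffset(-330).getID()); // GMT+05:30
        System.out.println(fromBrowserOffset(480).getID());  // GMT-08:00
    }
}
```

A fixed offset says nothing about daylight saving time rules, which is exactly why this should only be the default that the user then corrects on the preferences page.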

3. Implement URL parameter language override.

To help poor Technical Support Engineers switch the language when they really have to, one could implement an HTTP Filter that reads the URL and overrides the language settings when a certain parameter is present. For example, this URL:

http://your.application.example.com/something?lang=en-us

should switch language to English (United States).
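
The core of such a filter could look more or less like the sketch below. To keep it self-contained I work on the raw query string instead of the Servlet API, and the class and method names are hypothetical; in a real filter you would read the parameter from the request object and store the result for the rest of the request.

```java
import java.util.Locale;

final class LanguageOverride {
    // Returns the locale named by the "lang" query parameter,
    // or the current locale when the parameter is absent or unusable.
    // Note: no URL-decoding is done here; language tags do not need it.
    static Locale fromQuery(String query, Locale current) {
        if (query == null) {
            return current;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals("lang")) {
                Locale requested = Locale.forLanguageTag(pair.substring(eq + 1));
                // An ill-formed or empty tag yields a locale without a language.
                if (!requested.getLanguage().isEmpty()) {
                    return requested;
                }
            }
        }
        return current;
    }

    public static void main(String[] args) {
        Locale polish = Locale.forLanguageTag("pl-PL");
        System.out.println(fromQuery("lang=en-us", polish).toLanguageTag()); // en-US
        System.out.println(fromQuery("page=2", polish).toLanguageTag());     // pl-PL
    }
}
```

You would probably also want to check the requested language against the list of languages your application actually supports.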

4. Have global language preferences for all your applications.

I would suggest having a global user profile, so that users do not have to modify their settings over and over again. However, that poses another design challenge: what if some of your applications are localized into a smaller set of languages than others?

The only way to resolve this would be to have a list-box of preferred languages, just like web browsers do.

Your turn...


...to make a point. Post a comment. Objective arguments only, pretty please!

2011-10-02

Representing Date and Time in computer programs. Part 2 (Java)

In this series of posts I will focus on how to do common date and time operations in the Java world. Basically, I could just link to the Java Internationalization Trail, but it doesn't actually cover many Java-related technologies.

Part 1 – A brief look at the history: java.util.Date

This was Sun's first (failed) attempt to represent Date and Time. The class is as old as the JDK itself - it has been there since JDK 1.0. Most of the methods in this class are deprecated, therefore I am not going to spend too much time on this topic.

Creating Date instances

To create a Date instance and initialize it with the current time one would have used:

// Current time
Date now = new Date();

Although this constructor is still valid (as of JDK 7), I would strongly recommend using Calendar instead.
Another example of a constructor that is not deprecated would be:

// Unix time of the epoch - number of milliseconds
// since January 1st, 1970 in relation to GMT
Date epoch = new Date(1311211111011L);

Please note that this constructor takes a parameter in relation to GMT and not UTC. There is a subtle but important difference between them. If you have the choice, please always use UTC.

All the other constructors are deprecated:

// Sun Apr 24 2011 15:21:33 local time zone  
Date then = new Date(111, 3, 24, 15, 21, 33);

Please note that months are zero-indexed, so 0 is January, 1 is February and so on. This is very confusing, especially for beginners. I agree it does not make any sense but who am I to judge Sun developers?
Also, years need to be provided in relation to 1900, so for 2011 you need to pass 111. Now, that really sucks.
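
For comparison, this is roughly how the same point in time could be created with Calendar, which is what I recommended above. The class and method names are just placeholders for this sketch.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

final class CalendarInsteadOfDate {
    // April 24th, 2011, 15:21:33 in the JVM's default time zone.
    static Date april24th2011() {
        // The year is given as-is (no 1900 offset); the month is still
        // zero-based, but the named constant hides that detail.
        Calendar calendar =
            new GregorianCalendar(2011, Calendar.APRIL, 24, 15, 21, 33);
        return calendar.getTime();
    }

    public static void main(String[] args) {
        System.out.println(april24th2011());
    }
}
```

Using Calendar.APRIL instead of a bare 3 at least makes the intent readable.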

Formatting and parsing

The Date class has a built-in formatting method called toLocaleString(). That is the one you should never, ever use. It is deprecated for a reason, and the reason is that it is simply invalid.
The Date class also contains a deprecated parse(String) method and you should not use it for the same reason. Instead, you should use the DateFormat class and its derivatives.

DateFormat, SimpleDateFormat and FastDateFormat

DateFormat is an abstract class and its concrete implementation is SimpleDateFormat. To obtain the default formatting/parsing style for the default locale (the one you can get via Locale.getDefault(), which is not valid for web applications), you can use:

  • DateFormat.getDateInstance() – to format just date part
  • DateFormat.getTimeInstance() – to format just time part
  • DateFormat.getDateTimeInstance() – to format both date and time

For web applications, you somehow need to know the end user's preferred Locale. I will talk about this in depth in a separate post someday, as it is really hard to do correctly, but to cut a long story short you need something along the lines of the web browser's Accept-Language. When you have it, you can obtain a formatter object by calling the DateFormat.getDateInstance(int, Locale) method (or another similar method). Now, about the first parameter (int): it is related to the date style. The DateFormat class has several built-in style constants:

  • DateFormat.FULL – style containing all possible entries (for some languages it often resolves to the long format)
  • DateFormat.LONG – long style, full month names, full hours, etc.
  • DateFormat.MEDIUM – medium style, could contain abbreviated month names, 2 digit year, etc.
  • DateFormat.SHORT – shortest style form, usually the least information that allows one to identify date and time
  • DateFormat.DEFAULT – default format for the given Locale (in the JDK this constant is simply an alias for MEDIUM) and the one that should always be used

It might sound like a pretty strong claim that you should always use the default format, but I wrote that for a reason. First of all, default is... well, default. This usually resolves to some kind of standardized date format for the given Locale. Therefore it is something that an international user should be able to understand naturally, without using additional brain cycles. Another thing is the fact that other (non-default) format definitions are often invalid (for example the long format for the Polish and Russian Locales) and the default format should always be correct (unfortunately, should does not mean is). OK, let me give you a coding example:

Date now = new Date();
// it is quite important to pass a country as format might differ
Locale polishLocale = new Locale("pl", "PL");
DateFormat dateFormatter = DateFormat.getDateInstance(
            DateFormat.DEFAULT, polishLocale);
// prints something like 2011-04-23
System.out.println(dateFormatter.format(now));
DateFormat timeFormatter = DateFormat.getTimeInstance(
            DateFormat.DEFAULT, polishLocale);
// prints something like 17:05:32
System.out.println(timeFormatter.format(now));
DateFormat dateTimeFormatter = DateFormat.getDateTimeInstance(
            DateFormat.DEFAULT, DateFormat.DEFAULT, polishLocale);
// prints something like 2011-04-25 18:58:20
System.out.println(dateTimeFormatter.format(now));
try {      
    Date parsedDate = dateFormatter.parse("2011-04-25");
    // Mon Apr 25 00:00:00 CEST 2011
    System.out.println(parsedDate);
    Date parsedTime = timeFormatter.parse("09:11:55");
    // Thu Jan 01 09:11:55 CET 1970
    System.out.println(parsedTime);
    Date parsedDateTime = dateTimeFormatter.parse("11-04-25 18:33:44");
    // Sat Apr 25 18:33:44 CET 11
    System.out.println(parsedDateTime);            
}
catch (ParseException pe) {
    pe.printStackTrace();
}

Notice how easy it is to get wrong results while parsing. Although for time parsing it actually makes sense that the result is relative to the epoch (this way you can perform time-related calculations using Date's getTime() and setTime() methods), it is just plain wrong to parse "11-04-25" as year 11. No exception will be thrown, mind you. Quite a painful gotcha. Of course it makes sense, but it could still result in a programming error.
BTW, more experienced programmers know that DateFormat contains a setLenient() method which allows you to control parsing behavior: in strict (non-lenient) mode it throws an exception when the input does not form a valid date. The catch is that the default value is true, which means lenient parsing, so out-of-range values are silently adjusted instead of being rejected. Even strict mode would not help here, though: when the year pattern has more than two letters the year is interpreted literally, so "11" is a perfectly valid year 11. It still makes perfect sense, but...
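
To make the lenient versus strict difference tangible, here is a small sketch. I use an explicit "yyyy-MM-dd" pattern (which is what the Polish default resolved to in my example above) and Locale.ROOT so that the result does not depend on your machine's settings; the class and method names are made up.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

final class StrictParsing {
    private static SimpleDateFormat formatter(boolean lenient) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        format.setLenient(lenient);
        return format;
    }

    // True when the text is accepted in the given mode.
    static boolean parses(String text, boolean lenient) {
        try {
            formatter(lenient).parse(text);
            return true;
        } catch (ParseException rejected) {
            return false;
        }
    }

    // The year that strict parsing produces for the text.
    static int parsedYear(String text) {
        try {
            Calendar calendar = new GregorianCalendar();
            calendar.setTime(formatter(false).parse(text));
            return calendar.get(Calendar.YEAR);
        } catch (ParseException rejected) {
            throw new IllegalArgumentException(text, rejected);
        }
    }

    public static void main(String[] args) {
        // February 30th: lenient mode quietly rolls it over into March.
        System.out.println(parses("2011-02-30", true));   // true
        System.out.println(parses("2011-02-30", false));  // false
        // A two-digit year is still accepted, even in strict mode.
        System.out.println(parsedYear("11-04-25"));       // 11
    }
}
```

So strict mode is worth turning on, but it will not save you from two-digit years; you need to validate the range of the parsed date yourself.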

OK, now you know how to (more or less) correctly handle built-in Date and Time formats. But what to do when you need an arbitrary format, for example ISO 8601?
You can use either the built-in SimpleDateFormat class or Apache Commons Lang's FastDateFormat. Personally I would recommend the latter (unlike SimpleDateFormat, it is thread-safe):

// FastDateFormat lives in org.apache.commons.lang.time
// (org.apache.commons.lang3.time in Commons Lang 3)
String iso8601Pattern = "yyyy-MM-dd'T'HH:mm:ss'Z'";
SimpleDateFormat iso8601Formatter =
    new SimpleDateFormat(iso8601Pattern);
// 'Z' is a literal in this pattern, so the formatter must be pinned to UTC
TimeZone utcTimeZone = TimeZone.getTimeZone("UTC");
iso8601Formatter.setTimeZone(utcTimeZone);

Date now = new Date();
// prints something like 2011-04-28T18:27:08Z
System.out.println(iso8601Formatter.format(now));

FastDateFormat fastDateFormat =
    FastDateFormat.getInstance(iso8601Pattern, utcTimeZone);
// again prints out valid ISO8601 Date/Time String
System.out.println(fastDateFormat.format(now));


Converting time zones

In the previous section, you probably noticed that I have been using java.util.TimeZone as a date formatter parameter. Frankly, if you just want to display the correct time to the end user, you do not need anything else. The following example shows the current time (in the Tokyo time zone) formatted for a Japanese user:

DateFormat dateFormat =
    DateFormat.getDateTimeInstance(
        DateFormat.DEFAULT,
        DateFormat.DEFAULT,
        Locale.JAPAN);
dateFormat.setTimeZone(TimeZone.getTimeZone("Asia/Tokyo"));
System.out.println(dateFormat.format(new Date()));

OK, but what can you do if you need to perform Date calculations, for example when you need to know the current Unix time shifted into some arbitrary time zone (as opposed to GMT)? You can use getTime() to obtain the GMT-based time and then use the TimeZone class to obtain the target time zone's offset:

TimeZone pacificTimeZone = TimeZone.getTimeZone("America/Los_Angeles");
long currentTime = new Date().getTime();
long convertedTime = currentTime +
    pacificTimeZone.getOffset(currentTime);

Performing date calculations like this literally sucks. How would you approach adding an arbitrary number of days? By using getTime() and adding milliseconds? I will write a wrapper to do that - I can almost hear you saying it. Well, it turns out somebody already did.

Apache Commons Lang's DateUtils

Generally, Apache Commons Lang is always worth having and I usually have it on my projects' class path. If you need to perform many Date and Time related calculations, the DateUtils class is the one that could save your day.
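
If you cannot add the library, the same idea can be sketched with the plain JDK Calendar, which (as far as I know) is also what DateUtils.addDays() does under the hood for the default time zone. The class below and its explicit time zone parameter are my own additions for illustration.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

final class DateArithmetic {
    // Adds calendar days, keeping the wall-clock time in the given zone.
    static Date addDays(Date date, int days, TimeZone zone) {
        Calendar calendar = new GregorianCalendar(zone);
        calendar.setTime(date);
        calendar.add(Calendar.DAY_OF_MONTH, days);
        return calendar.getTime();
    }

    public static void main(String[] args) {
        TimeZone warsaw = TimeZone.getTimeZone("Europe/Warsaw");
        Calendar start = new GregorianCalendar(warsaw);
        start.clear();
        // Noon on the day before daylight saving time started in 2011.
        start.set(2011, Calendar.MARCH, 26, 12, 0, 0);

        Date nextDay = addDays(start.getTime(), 1, warsaw);
        long hours = (nextDay.getTime() - start.getTimeInMillis()) / 3600000L;
        // Noon to noon across the DST switch is 23 hours, not 24.
        System.out.println(hours); // 23
    }
}
```

This is exactly the case where adding 24 * 60 * 60 * 1000 milliseconds by hand would land you at 13:00 instead of noon.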

That concludes the first part of Java Date and Time formatting; stay tuned for more (I trick myself into thinking I will actually finish these articles).



2010-10-10

Representing Date and Time in computer programs. Part 1 (theory)

Disclaimer

This post contains no code, just plain theory. If you feel comfortable with various calendars and date & time formats, you can safely skip it.

Background

People started to measure time... Stop. If you're like me, you don't have time to read such crap. I will try to present just the important information.
According to Wikipedia:
Artifacts from the Palaeolithic suggest that the moon was used to calculate time as early as 12,000, and possibly even 30,000 BC. Lunar calendars were among the first to appear, either 12 or 13 lunar months (either 354 or 384 days).
Another article says:
The Julian calendar, a reform of the Roman calendar, was introduced by Julius Caesar in 46 BC, and came into force in 45 BC (709 ab urbe condita).
Finally:
The Gregorian calendar, also known as (...) the Christian calendar, is the internationally accepted civil calendar. It was introduced by Pope Gregory XIII, after whom the calendar was named, by a decree signed on 24 February 1582, a papal bull known by its opening words Inter gravissimas.
Why did I quote this? Because I am trying to make a point. Throughout history people have used different calendars:
  • based on Moon phases or Solar year (or both, actually)
  • forced by political rulers
  • dependent on religious beliefs
In other words, time measurement and its representation were strictly tied to Culture.
And you know what? Turns out, it still is.

Date formats
10/10/10
Q: What date does this represent?
A: That one was obvious, wasn't it? I meant Sunday, October 10, 2010 A.D.

However, the following date is not so obvious:
10/11/12
Is it October 11, 2012? Well, yes if you live in the United States. However, if you live in Great Britain, it actually represents November 10, 2012. And if you live in Japan, it surely must be November 12, 2010. Since I live in Poland, these are simply three unrelated integers.
Tip: People interpret date formats according to their cultural background.
Q: Will representing the year with four digits help?
A: Not so much.

Take a look at this date:
11/12/2010
Is it any better? Well, at least you could easily guess which year it refers to. However, month and day interpretation will still vary. What about other short date formats? Here are some examples:
08.10.2010 г. (Bulgaria)
2010/10/8 (Taiwan)
8.10.2010 (Czech Republic)
08-10-2010 (Denmark)
08.10.2010 (Germany)
2010. 10. 08. (Hungary)
2010-10-08 (Korea)
8-10-2010 (The Netherlands)
10/8/2010 (Kenya)
8/10/2010 (Australia)
08-10-10 (Bangladesh)
2010-10-8 Uyghur (People's Republic of China)
As you can see, we people tend to be very creative when it comes to formatting dates. We tend to create our own formats instead of adopting just one common to all mankind.
Tip: Software should display dates formatted the way the current user expects.
I will explain in detail how to do that in the future, so stay tuned.

Q: So maybe I could use long date format to avoid misrepresentation?
A: Of course you can, but let me show you some examples:
9/شوال/1431 Arabic (Saudi Arabia)
08 Октомври 2010 г. Bulgarian (Bulgaria)
divendres, 8 / octubre / 2010 Catalan (Catalan)
2010年10月8日 Chinese (Taiwan)
8. října 2010 Czech (Czech Republic)
8. oktober 2010 Danish (Denmark)
Freitag, 8. Oktober 2010 German (Germany)
These dates have clearly one interpretation. (Un)fortunately, they need to be expressed in user's language.
Tip: No matter what date format (short, medium, long) you decided to use in your application, it should respect user's locale settings.
If you happen to understand Czech or Bulgarian language, you will notice something. This is apparently specific for Slavic languages and I don't know if it holds true for other language groups. Month name is expressed using genitive case as oppose to nominative case.
Tip: Never assume that the target language has the same grammatical properties as English.
Unfortunately, this is exactly the rule that the JDK designers violated, so if you happen to use standard Java to format dates, avoid using the long date format.
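A small Python sketch of what a formatter has to know in order to get this right. The MONTHS table is a tiny hand-written excerpt (only the months used below), and the helper names month_heading and long_date are invented for this illustration.

```python
from datetime import date

# Hand-written excerpt: each language needs two forms of every month name.
MONTHS = {
    "pl": {
        "nominative": {1: "styczeń", 10: "październik"},
        "genitive": {1: "stycznia", 10: "października"},
    },
    "cs": {
        "nominative": {10: "říjen"},
        "genitive": {10: "října"},
    },
}


def month_heading(value, lang):
    """Standalone month, as in a calendar header: nominative case."""
    return f"{MONTHS[lang]['nominative'][value.month]} {value.year}"


def long_date(value, lang):
    """Day plus month: the month name takes the genitive case."""
    separator = ". " if lang == "cs" else " "
    month = MONTHS[lang]["genitive"][value.month]
    return f"{value.day}{separator}{month} {value.year}"


day = date(2010, 10, 8)
print(month_heading(day, "cs"), "|", long_date(day, "cs"))
print(month_heading(day, "pl"), "|", long_date(day, "pl"))
```

A formatter that stores only one name per month can produce either the heading or the long date correctly, but not both.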

Time formats

As for time formats, we humans still have plenty of room for improvement.
09:08 م (Saudi Arabia)
下午 09:08 (Taiwan)
9:08 μμ (Greece)
9:08 PM (United States)
21:08 (France)
오후 9:08 (Korea)
9:08.MD (Albania)
09:08 ب.ظ (Iran)
ਸ਼ਾਮ 09:08 (India)
09:08 ܒ.ܛ (Syria)
PM 9:08 (Singapore)
Not so many differences after all. It seems that mankind has actually developed only two kinds of time format:
  • 12-hour time format (the PM symbol and its placement differ)
  • 24-hour time format
In both cases leading zeros will or will not be displayed, depending on culture. Although it seems that the 24-hour time format is recognizable by everyone, many people have strong preferences.
Tip: Format time with respect to the current user's locale settings.
That way it will be easier for the user to understand.
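The handful of knobs that distinguish the examples above can be sketched in Python like this. The format_time function and its parameters are hypothetical; in real software these settings would come from the user's locale data rather than from the caller.

```python
from datetime import time


def format_time(value, use_24h, pad_hour, marker_first=False, am="AM", pm="PM"):
    """Format hours and minutes using per-locale settings passed in by the caller."""
    if use_24h:
        hour, marker = value.hour, ""
    else:
        hour = value.hour % 12 or 12  # 0 and 12 are both shown as 12
        marker = pm if value.hour >= 12 else am
    hour_text = f"{hour:02d}" if pad_hour else str(hour)
    clock = f"{hour_text}:{value.minute:02d}"
    if not marker:
        return clock
    return f"{marker} {clock}" if marker_first else f"{clock} {marker}"


evening = time(21, 8)
print(format_time(evening, use_24h=True, pad_hour=True))
print(format_time(evening, use_24h=False, pad_hour=False))
print(format_time(evening, use_24h=False, pad_hour=True, marker_first=True, pm="下午"))
```

Even with only two underlying formats, there are four independent settings here: the clock type, the zero padding, the marker text and the marker position.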

Time zones and related issues

Up until now, we have assumed that we know which time zone our date & time lives in. Under that assumption, the following date-time string refers to exactly one point in time, the same all over the world:
2010-05-20 19:54
Well, not necessarily. The real problem here is that we have no reference to the actual time zone. Therefore people living in India will interpret it differently than people living in New Zealand, and definitely differently than people living in Central Europe. These differences range from 4 hours and 30 minutes to 11 hours and 45 minutes.
Tip: Present date & time in the user's local time zone. If that is not possible, add a reference to the actual time zone.
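A minimal Python sketch of the first half of this tip, assuming (purely for illustration) that the timestamp above was recorded in UTC. It relies on the standard zoneinfo module (Python 3.9 or newer, with a time zone database available on the machine); the to_user_time helper is a made-up name.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+, needs a tz database on the system


def to_user_time(instant_utc, zone_name):
    """Convert an unambiguous UTC instant to the wall-clock time of one user."""
    return instant_utc.astimezone(ZoneInfo(zone_name))


# Assumption for this example: the stamp was recorded in UTC.
instant = datetime(2010, 5, 20, 19, 54, tzinfo=timezone.utc)

for zone in ("Asia/Kolkata", "Pacific/Auckland", "Europe/Warsaw"):
    print(zone, to_user_time(instant, zone).strftime("%Y-%m-%d %H:%M"))
```

For two of the three users the calendar day changes as well, so the date and the time have to be converted together, never separately.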
This leads us to another issue:
1978-12-31 09:31 (Central European Standard Time)
How comprehensible is that? It depends. If you happen to live in California, it may be totally incomprehensible unless you know that the usual difference between CET and your own time zone is +9 hours. Usually, we don't memorize such values. And if we do, it is easier to remember the name of a city than the actual time zone name.
Tip: Avoid using standard time zone names. Use the UTC offset and a short list of cities instead.
That said, it is much more readable this way:
1978-12-31 09:31 (UTC+01:00) Sarajevo, Skopje, Warsaw, Zagreb
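Producing such a label is straightforward. Here is a short Python sketch; the function names are invented for this example, and the city list is passed in by hand because the standard library does not supply one.

```python
from datetime import datetime, timedelta, timezone


def utc_offset_label(value):
    """Render the offset of an aware datetime as UTC+hh:mm or UTC-hh:mm."""
    offset = value.utcoffset()
    if offset is None:
        raise ValueError("a naive datetime carries no UTC offset")
    total_minutes = int(offset.total_seconds() // 60)
    sign = "+" if total_minutes >= 0 else "-"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"UTC{sign}{hours:02d}:{minutes:02d}"


def describe(value, cities):
    """Local time, followed by its UTC offset and a few familiar city names."""
    return f"{value:%Y-%m-%d %H:%M} ({utc_offset_label(value)}) {', '.join(cities)}"


stamp = datetime(1978, 12, 31, 9, 31, tzinfo=timezone(timedelta(hours=1)))
print(describe(stamp, ["Sarajevo", "Skopje", "Warsaw", "Zagreb"]))
```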
Q: What the heck is UTC and what do we need it for?
A: UTC stands for Coordinated Universal Time. A timestamp expressed in UTC specifies exactly one, unique point in time. We need it in order to avoid time misrepresentation. If you're in doubt, take a look at this time:
2010-03-28 02:11 (UTC+01:00) Sarajevo, Skopje, Warsaw, Zagreb
Looks good, right? The only problem is that it does not exist. This is related to Daylight Saving Time. Some local times are simply invalid in a given time zone. Some occur twice. Without UTC there would be no way to resolve the ambiguity of this date:
2010-10-31 02:16 (UTC+01:00) Sarajevo, Skopje, Warsaw, Zagreb
This date refers to one of two points in time:
2010-10-31 00:16 (UTC)
or
2010-10-31 01:16 (UTC)
That is because on this particular date we change the time from 03:00 AM back to 02:00 AM here in Europe.
Tip: To avoid ambiguity, always instantiate and store Date & Time objects in UTC. Convert them to the user's local time before displaying them.
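The gap and the overlap described above can be checked with a short Python sketch. It uses the standard zoneinfo module (Python 3.9 or newer, with a time zone database installed) and picks Europe/Warsaw as a stand-in for the zone in the examples; utc_candidates is a name invented here.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+, needs a tz database on the system

WARSAW = ZoneInfo("Europe/Warsaw")


def utc_candidates(naive_local, zone):
    """Return the distinct UTC instants a wall-clock time can mean: 0, 1 or 2."""
    found = []
    for fold in (0, 1):  # fold selects the first or second occurrence
        aware = naive_local.replace(tzinfo=zone, fold=fold)
        as_utc = aware.astimezone(timezone.utc)
        # A wall-clock time inside a DST gap does not survive the round trip.
        if as_utc.astimezone(zone).replace(tzinfo=None) != naive_local:
            continue
        if as_utc not in found:
            found.append(as_utc)
    return found


for local in (datetime(2010, 3, 28, 2, 11), datetime(2010, 10, 31, 2, 16)):
    instants = [i.strftime("%H:%M UTC") for i in utc_candidates(local, WARSAW)]
    print(local, "->", instants or "does not exist")
```

A value stored in UTC never needs this kind of guessing, which is the whole point of the tip.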
Now it's time to talk about interchangeability.

ISO 8601, serializing and exchanging date & time values

You probably noticed that I used a specific date format in my examples. This has something to do with ISO 8601. ISO 8601 is a standard that describes interchangeable date & time formats. It thoroughly describes how dates, times, periods and durations should be formatted in order to make them easily exchangeable.
Tip: If you need to serialize date & time values to strings in order to store them or exchange them over a network, always use one of the ISO 8601 formats.
That said, I need to give you an example of a valid ISO 8601 timestamp:
2010-10-09T11:22Z
This points to Saturday, October 9, 2010 11:22 UTC.
Tip: Allegedly, YYYY-MM-DDThh:mmZ is the most widely recognized pattern for interchanging date & time values. If you cannot use strongly typed DateTime objects to store or exchange information, use this format instead.
I will explain it further in future posts; however, I cannot do that without specific code examples.
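Until then, a minimal Python sketch of the round trip for the minute-precision UTC pattern shown above. The helper names to_iso_utc and from_iso_utc are made up for this example, and other ISO 8601 variants (seconds, offsets other than Z) are deliberately not handled.

```python
from datetime import datetime, timedelta, timezone

ISO_MINUTES_UTC = "%Y-%m-%dT%H:%MZ"  # YYYY-MM-DDThh:mmZ


def to_iso_utc(value):
    """Serialize an aware datetime as an ISO 8601 UTC timestamp."""
    if value.tzinfo is None:
        raise ValueError("refusing to guess the time zone of a naive datetime")
    return value.astimezone(timezone.utc).strftime(ISO_MINUTES_UTC)


def from_iso_utc(text):
    """Parse the same pattern back into an aware UTC datetime."""
    return datetime.strptime(text, ISO_MINUTES_UTC).replace(tzinfo=timezone.utc)


# 13:22 at UTC+02:00 is the same instant as 11:22 UTC.
local = datetime(2010, 10, 9, 13, 22, tzinfo=timezone(timedelta(hours=2)))
wire = to_iso_utc(local)
print(wire, from_iso_utc(wire) == local)
```

Rejecting naive values at serialization time is a deliberate choice: a timestamp without a zone cannot be converted to UTC without guessing.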

Calendars

Till now, we have assumed that there is only one calendar, the Gregorian one. Unfortunately, that assumption is not correct. Have you noticed something strange about this:
9/شوال/1431 Arabic (Saudi Arabia)
example?

The year seems somewhat strange, doesn't it? That's because the default calendar in Saudi Arabia is the Islamic calendar.
Tip: Do not assume that Gregorian is the default calendar for the entire planet. Always present date & time values in accordance with the user's local calendar.
There are also a few other countries that default to non-Gregorian calendars. One of them is Thailand, which defaults to the Thai solar calendar (to some extent an equivalent of the Gregorian calendar, but with years counted in a different way). Another one is Israel, which defaults to the Hebrew calendar (it may not be the official calendar of Israel, but that's what you get by default when you install the Hebrew version of Windows 2003, thus it is what the user expects to see). There are a few problems with the Hebrew calendar. Apart from a totally different year number (according to Wikipedia, Hebrew year 5771 has just begun), a Hebrew year can have 12 or 13 months (and a different number of days as well).
Tip: Never assume that a year has 12 months. This doesn't hold true for all calendars.
I could elaborate further about week numbering, the start of the year and so on, but this seems quite obvious.
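Two of the differences mentioned above are simple enough to sketch in Python. The function names are invented for this illustration; the Thai year offset of 543 applies to modern dates, and the leap-year test is the usual arithmetic form of the 19-year cycle of the Hebrew calendar. Full calendar conversion is much more involved and is not attempted here.

```python
def thai_solar_year(gregorian_year):
    """Year number in the Thai solar calendar for a modern Gregorian year."""
    return gregorian_year + 543


def hebrew_months_in_year(hebrew_year):
    """A Hebrew leap year has 13 months; 7 of every 19 years are leap years."""
    is_leap = (7 * hebrew_year + 1) % 19 < 7
    return 13 if is_leap else 12


print(thai_solar_year(2010))
for year in (5770, 5771, 5772):
    print(year, hebrew_months_in_year(year))
```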

Pretty time

Q: What's pretty time?
A: Nothing, actually. I named this section after a Java library which allegedly allows for "pretty" timestamp formatting.

Sometimes, instead of this:
2010-09-09 11:44
you want this:
1 month ago
or
3 minutes ago
or
in 10 minutes
or
next year
or
in January
Et cetera.
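For English only, and for minutes only, a naive version fits in a few lines of Python. This is a deliberately simplistic sketch with a made-up function name, not the approach of any particular library.

```python
from datetime import datetime, timedelta


def english_relative(moment, now):
    """Naive, English-only relative time with minute granularity."""
    seconds = int((moment - now).total_seconds())
    minutes = abs(seconds) // 60
    unit = "minute" if minutes == 1 else "minutes"  # English plural rule only
    if seconds < 0:
        return f"{minutes} {unit} ago"
    return f"in {minutes} {unit}"


now = datetime(2010, 9, 9, 11, 44)
print(english_relative(now - timedelta(minutes=3), now))
print(english_relative(now + timedelta(minutes=10), now))
```

Note how much English grammar is baked into it: the word order, the "ago" and "in" particles, and a single plural form.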
This is nowhere near an easy task. I have already mentioned target language properties. There is more than one problem here, actually. Apart from declension (e.g. "in January" would be "w styczniu" in Polish, which uses the locative case), there is a problem with plural forms. In English it is quite easy:
1 minute ago
2 minutes ago
5 minutes ago
However, if you translate it into Polish (or many other languages for that matter):
1 minutę temu
2 minuty temu
5 minut temu
Have you noticed something strange? There is more than one plural form of the word "minute" when translated into Polish.
Tip: Never assume how many plural forms the target language has.
That is a pretty strong statement, isn't it? Well, that is just because we (the programmers) are not linguists and therefore should not make any assumptions. And yes, fortunately it is possible to write software without them.
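For the Polish example above, the selection rule can be sketched in Python as follows. The category names one, few and many and the two helper functions are labels chosen for this illustration; the word forms come from the examples above.

```python
def polish_plural_category(n):
    """Pick the plural category for a non-negative integer in Polish."""
    if n == 1:
        return "one"
    if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
        return "few"
    return "many"


# The forms of "minute" used in the "... ago" phrase, keyed by category.
MINUTE_FORMS = {"one": "minutę", "few": "minuty", "many": "minut"}


def minutes_ago_pl(n):
    """Polish equivalent of 'n minutes ago'."""
    return f"{n} {MINUTE_FORMS[polish_plural_category(n)]} temu"


for n in (1, 2, 5, 12, 22):
    print(minutes_ago_pl(n))
```

The rule depends on the last one or two digits of the number, so it cannot be reduced to a "singular versus plural" switch. Other languages need a different function and a different number of forms, which is why the rule belongs in per-language data and not in application code.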

Summary

In this post I wrote about cultural differences in date and time representations. I tried to be thorough and concise at the same time.
Due to these cultural implications, parsing and formatting date & time values is nowhere near a simple problem. It is no wonder that almost nobody gets it right (unfortunately, Blogspot is one of the examples: it gave me plenty of choices for how my articles and comments should be timestamped, but among them there was no correct option; they should be formatted according to the browser locale).

What's next?

In the future, I will try to explain how to correctly parse and format Date & Time values using Java, C#/.Net and C++. There are tons of i18n issues built into these languages (or supporting libraries), so stay tuned if you're interested.