I've generally been using the JSTL fmt tags to format dates. I hadn't had any issues doing this until recently, when I stumbled upon something quite bizarre: the date format string that I was specifying was being ignored in certain situations. Thus, instead of a nice date string like "Tues 1st June 2010", I'd end up with "Tue Jun 01 16:44:40 EST 2010".
I first noticed this when viewing a site in Google's "cache" (just do any Google search and you'll find an option to view the last "cached" version of any search result). When looking at the cached version of the site, all the dates were formatted incorrectly. This was strange, because when I visited the site normally it displayed the dates in the correct format. Below is a quick screen grab of what the page looks like when it is formatted properly, and when it isn't formatted properly:
The explanation is quite simple, although rather strange. It seems that the class responsible for formatting dates (org.apache.taglibs.standard.tag.rt.fmt.FormatDateTag) will attempt to get the locale from a variety of sources, as summarised by this email chain in the Apache mailing list archives. If it doesn't find a locale then it simply outputs the toString() value of the java.util.Date instance! I'd definitely agree with "Flavio Tordini" in that email chain when he suggested that the default behaviour should be to fall back to the default JVM locale instead of simply doing a toString() on the Date instance.
I don't know for sure if this is the reason, but I'm guessing that the Googlebot does not specify a locale when it crawls webpages. Your browser, on the other hand, does specify a locale via the "Accept-Language" HTTP header in each request. The result is that the FormatDateTag class finds no locale as part of the request and falls back to the toString() method.
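To make the idea concrete, here is a rough sketch of how a missing "Accept-Language" header can leave you with no locale at all. This is not the JSTL lookup code; the class name AcceptLanguageSketch and the helper resolve() are made up for illustration, and it leans on the language-range matching built into java.util.Locale (Java 8+). Note that it uses the hyphenated form ("en-AU") that browsers send.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class AcceptLanguageSketch {

    // Hypothetical helper (not part of JSTL): returns the best supported
    // locale for the header value, or null when there is no header at all.
    static Locale resolve(String acceptLanguage, List<Locale> supported) {
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return null; // e.g. a crawler that sends no language preference
        }
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
        return Locale.lookup(ranges, supported); // null if nothing matches
    }

    public static void main(String[] args) {
        List<Locale> supported = Arrays.asList(
                Locale.forLanguageTag("en-AU"), Locale.forLanguageTag("en-US"));
        // A browser-style request with a language preference
        System.out.println(resolve("en-AU,en;q=0.8", supported));
        // A request with no Accept-Language header
        System.out.println(resolve(null, supported));
    }
}
```

The second call is the interesting one: with no header there is nothing to match against, so the caller gets null back and has to decide what to do next.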
A simple test using wget confirmed this. Here is the result of two runs that I simply ran on my local development server:
# run 1:
> wget http://localhost:8080/home
# run 2:
> wget --header "Accept-Language: en_US" http://localhost:8080/home
The result was as expected. The HTML returned as part of "run 1" contained date strings that were not formatted. The HTML from "run 2" contained date strings formatted correctly. I'm guessing most developers won't see this problem as they will usually be viewing their web pages in a browser or via a tool which specifies the appropriate header parameters ...
The offending line of code in the FormatDateTag class starts on line 116:
// Create formatter
Locale locale = SetLocaleSupport.getFormattingLocale(
    pageContext,
    this,
    true,
    DateFormat.getAvailableLocales());
if (locale != null) {
    ...
} else {
    // no formatting locale available, use Date.toString()
    formatted = value.toString();
}
Note that the lookup never falls back to the JVM's default locale. When the request supplies no usable locale, the "locale" variable ends up being null and the code in the "else" block is invoked...
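Here is a small standalone sketch of that branch, next to the fallback suggested in the email chain. It is a simplification for illustration, not the taglib source: the class FormatDateFallbackSketch and both method names are invented, and SimpleDateFormat stands in for whatever formatter the tag builds internally.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class FormatDateFallbackSketch {

    // Mimics the branch quoted above: with no locale, the pattern is
    // ignored entirely and Date.toString() is used instead.
    static String formatLikeTag(Date value, String pattern, Locale locale) {
        if (locale != null) {
            return new SimpleDateFormat(pattern, locale).format(value);
        }
        return value.toString();
    }

    // Hypothetical alternative: fall back to the JVM default locale so
    // the pattern is still honoured when the request offers no locale.
    static String formatWithJvmFallback(Date value, String pattern, Locale locale) {
        Locale effective = (locale != null) ? locale : Locale.getDefault();
        return new SimpleDateFormat(pattern, effective).format(value);
    }

    public static void main(String[] args) {
        Date now = new Date();
        String pattern = "EEE d MMMM yyyy";
        // Locale available: pattern is applied
        System.out.println(formatLikeTag(now, pattern, Locale.forLanguageTag("en-AU")));
        // No locale: raw Date.toString() output
        System.out.println(formatLikeTag(now, pattern, null));
        // No locale, but with the JVM-default fallback: pattern is applied
        System.out.println(formatWithJvmFallback(now, pattern, null));
    }
}
```

Running it should show the middle line in the long "Tue Jun 01 ..." style while the other two follow the pattern, which is the same split I saw between the cached and the live pages.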
Clearly we cannot rely on the request to specify a locale that the formatDate tag recognises, as the Googlebot apparently doesn't send one (and thus dates on the cached Google pages come out unformatted). The fix I put in was to use the "setLocale" tag at the top of my JSPs to specify a default locale. I'm not doing any internationalization, so I didn't see any problem with hardcoding it to "English Australia" like so:
<fmt:setLocale value="en_AU" />
The result was that dates were now correctly formatted regardless of whether the HTTP request specified an "Accept-Language" header...