FEB

12th

HTML5 History API: Sure it's good, but...

A few days ago, Joe Hewitt of Firefox and Firebug fame tweeted about HTML5 History APIs: “history.pushState, such a great new web api, but so terribly broken in iOS Safari. Is iOS 5 here yet?”. Since I spent quite some time working with HTML5 history while developing Rhizosphere, his tweet prompted me to share my opinions on the HTML5 History API. This article discusses what I think are the main limitations of the new API.

Rhizosphere is a JavaScript library for interactive visualizations and data exploration. Among its features, you can embed it in any web page by just reserving a <div /> for it. Rhizosphere uses HTML5 History to let users replay their interactions with the data the library manages. In the article, I’ll refer to some bits of code which are part of the Rhizosphere library (if you’re interested, you can find the code the library uses for history management here).

History so far

Let’s spend a minute describing what the HTML5 History API is, for those unfamiliar with it. In short, it’s a JavaScript API you can use to associate complex state information with browser history entries and react to history events (such as the user clicking the back and forward buttons).

Before HTML5 (that is, up to HTML 4.01) your JavaScript code only had a few ways to interact with browser history. Basically you’d only be able to use window.history.back(), window.history.forward() and window.history.go() to instruct the browser to navigate through the history stack. There was no formal way of associating specific state information with a given history frame.

The usual solution for state manipulation was to resort to URL hash parameters, such as http://www.yoursite.com/page.html#state. They are bookmarkable and accessible via JavaScript, and changing them makes the browser add a new history frame, so http://www.yoursite.com/page.html#state and http://www.yoursite.com/page.html#otherState are technically 2 different history states.

This approach has several limitations. It is painful to encode state information in hash parameters. You have size limitations. And, more importantly, you have no easy way of being notified when the user navigates through history (which you need in order to update your page state accordingly), forcing developers to resort to all sorts of dirty tricks, from using timeouts to poll document.location periodically, up to the absurdly amazing usage of onscroll events on nested iframes.
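To make the polling trick concrete, here is a minimal sketch of what such code typically looked like. The helper name createHashWatcher and the 100 ms interval are assumptions made for illustration; they are not taken from any particular library.

```javascript
// Hypothetical helper illustrating the pre-HTML5 polling workaround.
// `readHash` returns the current hash; `onChange` is called only when it differs
// from the value seen on the previous check.
function createHashWatcher(readHash, onChange) {
  var lastHash = readHash();
  return function check() {
    var currentHash = readHash();
    if (currentHash !== lastHash) {
      var previousHash = lastHash;
      lastHash = currentHash;
      onChange(currentHash, previousHash);
    }
  };
}

// Browser wiring (skipped when no window object is available).
if (typeof window !== 'undefined') {
  var watchHash = createHashWatcher(
    function () { return window.location.hash; },
    function (hash) { console.log('hash changed to', hash); }
  );
  window.setInterval(watchHash, 100); // the interval is an arbitrary choice
}
```

The obvious drawbacks are the constant timer and the fact that two quick changes between checks collapse into one notification.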

The rise of ajax-y web applications, which are less tied to the traditional concept of a web page (or not tied to it at all), made the situation annoying enough to require a new solution to the problem of tracking a web application’s state and having it play nicely with browser history (that is, you want the back and forward buttons to continue working).

History gets a revamp

Fast forward to recent times, and HTML5 gave us some new APIs that should solve the problem: window.history.pushState() (and its companion replaceState()) and the window.onpopstate event handler.

I won’t go into the details. For this discussion it’s enough to know that pushState() lets us push arbitrary JavaScript objects onto the history stack (along with changing the document location, and you’re no longer limited to just changing the hash), and that you register an onpopstate handler to be notified whenever a history state is popped (that is, the user clicked the back/forward controls).
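As a rough illustration of the two pieces working together, the sketch below pushes a state object on each user action and restores it when the popstate event fires. The function createPhotoNavigator, the state shape { photoId: ... } and the ?photo= URL are hypothetical names chosen for the example.

```javascript
// Hypothetical example: a photo viewer that records each photo as a history entry.
// `win` is the window object; `render` draws a given state on screen.
function createPhotoNavigator(win, render) {
  // Fired when the user moves back/forward to an entry created by pushState().
  win.addEventListener('popstate', function (event) {
    if (event.state) {
      render(event.state);
    }
  });

  return function showPhoto(id) {
    var state = { photoId: id };
    // Arguments: state object, title (largely ignored by browsers), new URL.
    win.history.pushState(state, '', '?photo=' + encodeURIComponent(id));
    render(state);
  };
}

// Browser wiring (skipped when no window object is available).
if (typeof window !== 'undefined') {
  var showPhoto = createPhotoNavigator(window, function (state) {
    document.title = 'Photo ' + state.photoId;
  });
  // Calling showPhoto(42) would add a history entry carrying { photoId: 42 }.
}
```

Note that pushState() itself never triggers the popstate event, which is why the sketch calls render explicitly after pushing.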

You can find a nice overview here at developer.mozilla.org, or you can also read the official specs (whatwg, w3c).

What’s wrong?

While the new API makes life a lot easier, I think there are still some rough edges and uncovered territory before webapp state management integrates nicely with browser history. Most notably:

  • history is a global object at the window level,
  • You are still stuck with hash tag serialization,
  • No native indication whether you moved back or forward, only that you changed history state,
  • Conflicts about what state to use during page loads,
  • and, as usual, browser inconsistencies.

Let’s cover them one at a time.

history is a global object at the window level

The main problem here is that if you have multiple separate participants in the same web page, they’ll have to be careful not to step on each other’s toes while interacting with browser history.

Say you have a web application which renders multiple widgets in a single web page (like a portal), each one potentially interacting with history for its own needs. For example, you could have an image gallery widget or an interactive visualization, like Rhizosphere, embedded into the webpage.

Each entity will push its own states onto the history with pushState() and register for history notifications via onpopstate. But then every entity will receive every change notification, including those for states that were originally pushed by other entities.

A workaround is for each participant to tag its history states in some unique way so that, when a state is popped, the participant can recognize whether the state belongs to it or was pushed by someone else living in the same page.

Rhizosphere does so by adding a custom property to each pushed state and checking it back during pop.
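The general idea can be sketched as follows. This is not Rhizosphere’s actual code: the property name __owner, the helper names and the 'gallery' identifier are all assumptions made for the example.

```javascript
// Hypothetical property used to mark who pushed a given history state.
var OWNER_KEY = '__owner';

// Wraps a payload so that it carries the identifier of the participant pushing it.
function tagState(ownerId, payload) {
  var tagged = { payload: payload };
  tagged[OWNER_KEY] = ownerId;
  return tagged;
}

// Returns the payload if `state` was pushed by `ownerId`, or null otherwise
// (including for null states and states pushed by untagged third parties).
function ownedPayload(ownerId, state) {
  if (state && typeof state === 'object' && state[OWNER_KEY] === ownerId) {
    return state.payload;
  }
  return null;
}

// Browser wiring (skipped when no window object is available).
if (typeof window !== 'undefined') {
  window.addEventListener('popstate', function (event) {
    var payload = ownedPayload('gallery', event.state);
    if (payload !== null) {
      console.log('restoring gallery state', payload);
    }
  });
  // A push would then look like:
  // window.history.pushState(tagState('gallery', { photo: 3 }), '');
}
```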

The problem with this is that every participant must be aware of the convention. A participant that doesn’t respect the rule may fail to parse history events during onpopstate, because it receives events generated by someone else it’s not aware of (therefore producing JavaScript errors). This might be especially problematic if your page allows third-party content or if you develop an embeddable library.

Alternatively, you can use iframes to sandbox each participant into a separate context, but I think this is suboptimal. It would be better for history to be structured in a pub-sub way, so that each participant could decide to receive only its own events back or the full stream.
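Lacking native support, a page can approximate the pub-sub idea with a small shared dispatcher that every participant agrees to go through. The sketch below is purely illustrative: createHistoryBus, the channel/payload state shape and the '*' wildcard are invented for this example, and it only helps if all parties on the page actually use it.

```javascript
// Hypothetical page-level dispatcher: participants push and receive history
// states through named channels instead of touching window.history directly.
function createHistoryBus(win) {
  var subscribers = Object.create(null); // channel name -> array of handlers

  win.addEventListener('popstate', function (event) {
    var state = event.state;
    if (!state || typeof state.channel !== 'string') {
      return; // not pushed through this bus: ignore
    }
    // Handlers for the specific channel first, then the full-stream ones.
    var handlers = (subscribers[state.channel] || [])
      .concat(subscribers['*'] || []);
    handlers.forEach(function (handler) {
      handler(state.payload, state.channel);
    });
  });

  return {
    // Use '*' (reserved, never push on it) to receive the full stream.
    subscribe: function (channel, handler) {
      if (!subscribers[channel]) {
        subscribers[channel] = [];
      }
      subscribers[channel].push(handler);
    },
    push: function (channel, payload) {
      win.history.pushState({ channel: channel, payload: payload }, '');
    }
  };
}
```

A gallery widget would call bus.subscribe('gallery', handler) and bus.push('gallery', state), and would never see states pushed on other channels.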

You are still stuck with hash tag serialization

JavaScript objects that you associate with a history frame via pushState() are obviously not bookmarkable. If you want bookmarking support, you are still stuck with manipulating document location hash parameters.

This means that even if pushState() and onpopstate let you seamlessly store and retrieve arbitrarily complex JavaScript state objects, you still have to maintain the logic to rebuild the state from URL hash parameters, somewhat defeating the benefit the new API provides…

Even the specs agree that you only get a minor optimization:

State objects are intended to be used for two main purposes: first, storing a preparsed description of the state in the URL so that in the simple case an author doesn’t have to do the parsing (though one would still need the parsing for handling URLs passed around by users, so it’s only a minor optimization) …

Some form of automated serialization of the state received by pushState() would have been nice. Sure, you can always JSON-ify your state data and set the hash parameters by hand, but you still have the problem of multiple participants in the same page possibly overwriting each other’s location hash. This could be solved by having the browser handle hash-parameter serialization rather than leaving it to the application (the browser could enforce some sort of namespacing / separation between different parties trying to modify the hash). Note that this is a compartmentalization problem that already existed before HTML5.
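Done by hand, the JSON-ification with a cooperative form of namespacing could look like the sketch below. The #name=value&name=value layout and the function names are assumptions for the example, and nothing stops a misbehaving participant from overwriting the whole hash anyway.

```javascript
// Splits "#a=x&b=y" into { a: 'x', b: 'y' }, leaving values still URI-encoded.
// Namespaces are assumed to be simple identifiers without '&' or '='.
function parseHash(hash) {
  var entries = Object.create(null);
  hash.replace(/^#/, '').split('&').forEach(function (pair) {
    var eq = pair.indexOf('=');
    if (eq > 0) {
      entries[pair.slice(0, eq)] = pair.slice(eq + 1);
    }
  });
  return entries;
}

// Returns the state stored under `namespace`, or null if missing or malformed.
function readState(hash, namespace) {
  var raw = parseHash(hash)[namespace];
  if (raw === undefined) {
    return null;
  }
  try {
    return JSON.parse(decodeURIComponent(raw));
  } catch (e) {
    return null; // hand-edited or truncated URL
  }
}

// Returns a new hash string where only the entry for `namespace` is replaced.
function writeState(hash, namespace, state) {
  var entries = parseHash(hash);
  entries[namespace] = encodeURIComponent(JSON.stringify(state));
  return '#' + Object.keys(entries).map(function (key) {
    return key + '=' + entries[key];
  }).join('&');
}

// In a browser, a participant would do something like:
// window.location.hash = writeState(window.location.hash, 'gallery', { photo: 3 });
```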

No native indication whether you moved back or forward

This is minor, but still annoying. onpopstate doesn’t give you a clue about whether the user arrived in the current state by moving back or forward.

This may be a helpful bit of information that the application wants. For example, you might want to visually highlight the state transition with a matching animation that would cause the document to scroll left/right (or top/bottom) depending on whether you moved forward or back in the history.

In Rhizosphere’s case, recomputing the entire state received during onpopstate is expensive, so it is preferable to operate on the delta between the current state and the popped one. To compute the delta correctly, you need to know whether the popped state occurred before or after the current one (that is, you need to know whether the user landed on the current state from a ‘back’ or a ‘forward’). Rhizosphere falls back to explicitly timestamping each state object before pushing it onto the history stack in order to derive this bit of information.
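A minimal sketch of the timestamping idea follows. It is not Rhizosphere’s implementation: the pushedAt field and the helper names are assumptions, and the comparison only works for states that were stamped this way.

```javascript
// Wraps a payload with the time at which it is being pushed.
function stampState(payload, now) {
  return { payload: payload, pushedAt: now };
}

function isStamped(state) {
  return !!state && typeof state.pushedAt === 'number';
}

// Compares the state we are leaving with the one just popped.
// Returns 'back', 'forward', or 'unknown' when either state is unstamped
// (for example the initial, null-state entry) or the timestamps are equal.
function direction(currentState, poppedState) {
  if (!isStamped(currentState) || !isStamped(poppedState)) {
    return 'unknown';
  }
  if (poppedState.pushedAt < currentState.pushedAt) {
    return 'back';
  }
  if (poppedState.pushedAt > currentState.pushedAt) {
    return 'forward';
  }
  return 'unknown';
}

// Browser wiring (skipped when no window object is available).
if (typeof window !== 'undefined') {
  var currentStamped = null;
  var pushStamped = function (payload) {
    currentStamped = stampState(payload, Date.now());
    window.history.pushState(currentStamped, '');
  };
  window.addEventListener('popstate', function (event) {
    console.log('moved', direction(currentStamped, event.state));
    currentStamped = event.state;
  });
}
```

Stamping the initial entry with replaceState() at startup would remove most of the 'unknown' cases.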

Conflicts about what state to use during page loads

Because history onpopstate and document load events (onload) are distinct, you now have 2 entry points in your application code that might dictate what state the application should transition into, and they do not always play nice together.

Consider this navigation flow:

  1. user goes to http://www.google.com
  2. follows a link to http://www.yourdomain.com/
  3. performs an action that triggers a pushState() on http://www.yourdomain.com
  4. follows a link to http://www.somethingelse.com

Each of the above steps maps to a distinct history state. The following combination of events might trigger in your application code, depending on the circumstance:

  • User arrives on your page (navigation 1->2 in the sequence above). No history event triggers (the user just arrived). onload triggers.
  • User performs the action (2->3) and then hits the back button (3->2). Since the interaction is confined within your domain, with no page loads, only the onpopstate event triggers.
  • User follows an outbound link (3->4) and then hits back (4->3). Since the web page is different between the 2 states, but you are returning to a previous history state, the page is reloaded and both the onload and onpopstate events trigger.

Given you can have any possible combination of onload and onpopstate, at which point do you decide what the ‘official’ state of the webpage is? If you set the page up when the onload triggers (using a default or initial state for the web page), you may have to do it again when onpopstate triggers. You cannot rely on onpopstate alone (or wait for it before deciding), since it may not trigger at all.

If setting the page state is an expensive or visually noticeable operation (for example because the state dictates the positioning/visibility of some DOM elements), you may introduce flickering if you do it during both onload and onpopstate in the last use case described.

As far as I know, the HTML5 specs do not address this issue of interaction between history change notifications and page initialization. Would passing history state information as an additional (optional) field to the load event be good enough to solve it?
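One way to soften the problem in application code is to route both entry points through a single function that skips work when the requested state is already on screen. The sketch below assumes states are JSON-serializable and reads window.history.state during load, a property that not all browsers of the time exposed (where it is missing, the default state is used). createStateApplier and the { view: 'home' } default are hypothetical.

```javascript
// Funnels onload and onpopstate through one `apply` function that renders
// only when the requested state differs from the one rendered last.
function createStateApplier(render, defaultState) {
  var lastApplied; // JSON of the last rendered state; undefined = nothing yet

  function apply(state) {
    // A null/undefined state means "no history state": use the default.
    var effective = (state === null || state === undefined) ? defaultState : state;
    // Note: JSON comparison is sensitive to property order.
    var serialized = JSON.stringify(effective);
    if (serialized === lastApplied) {
      return false; // already on screen: skip to avoid flicker
    }
    lastApplied = serialized;
    render(effective);
    return true;
  }

  return { apply: apply };
}

// Browser wiring (skipped when no window object is available).
if (typeof window !== 'undefined') {
  var applier = createStateApplier(function (state) {
    console.log('rendering', state);
  }, { view: 'home' });
  window.addEventListener('load', function () {
    applier.apply(window.history.state);
  });
  window.addEventListener('popstate', function (event) {
    applier.apply(event.state);
  });
}
```

This does not remove the double render when onload applies the default and a later onpopstate carries a different state; it only avoids redundant work when both point at the same state.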

Browser inconsistencies

And finally, like most standards, the spec does not pin down implementation details precisely enough to prevent inconsistencies between different implementors. Firefox, Safari and Chrome all introduce minor differences in implementation that make it even harder to write JavaScript workarounds for the previous issues (especially the last one). At the time of writing (Safari 5, Chrome 8, FF 3.6 or 4beta) these include:

  • timing differences in when the onpopstate event fires with respect to the document load / complete events.
  • double firings of the same PopStateEvent under certain conditions (and combinations of browser/os)
  • spurious events: some browsers would fire a PopStateEvent, with a null payload, when landing directly on a page (no back/forward buttons used), while others would not fire the event at all.

This callback in Rhizosphere’s code gives you an idea of the intricacies needed in your JavaScript handling code to distinguish between all the possible scenarios.
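Independently of Rhizosphere’s actual callback, a commonly used defensive pattern for the spurious load-time event is to ignore null-state popstate events until the page has finished initializing. The sketch below is only an approximation under that assumption: createInitialPopGuard is a hypothetical helper, and the heuristic will also drop a legitimate null-state event that arrives before the page is marked ready.

```javascript
// Filters out the null-payload popstate event that some browsers fire
// when a page is loaded directly (no back/forward involved).
function createInitialPopGuard(handler) {
  var ready = false;
  return {
    // Call once the page has completed its own initialization.
    markReady: function () {
      ready = true;
    },
    onPopState: function (event) {
      if (!ready && event.state === null) {
        return; // most likely the spurious load-time event
      }
      handler(event.state);
    }
  };
}

// Browser wiring (skipped when no window object is available).
if (typeof window !== 'undefined') {
  var guard = createInitialPopGuard(function (state) {
    console.log('history state popped', state);
  });
  window.addEventListener('popstate', guard.onPopState);
  window.addEventListener('load', function () {
    // Deferred so that a popstate dispatched right after load is still filtered.
    window.setTimeout(guard.markReady, 0);
  });
}
```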

Parting words

New APIs are never perfect. The new crop of infrastructure and APIs under the HTML5 umbrella is very powerful and leads to a new generation of web apps, but it is also still fairly young and leaves lots of room for another generation of browser quirks. History management has greatly improved with the advent of HTML5, but it seems like you’ll still have to code around spec limitations and ambiguities. Hopefully this article helped highlight some of the corner cases you might face.

Riccardo Govoni, last modified on Oct 21, 2011 - 22:08


1 Comment to this page

khoji at ignitebyte dot net about 2 years ago, Khoji said:

The idea with iFrames probably won't work in iOS. The history there is more than horribly broken in connection with iFrames, I've been struggling with this for the last couple of months for an online help system. The problem is that loading pages into iFrames "poisons" the iOS Safari history. The steps get recorded on the stack but they don't play back correctly -- the address bar progress indicator hangs at around 50% and the "loading.." disc rotates endlessly. You can stop this by clicking x in the address bar, but most users won't get this. This history stack poisoning also carries over to other pages: If you visit a page that loads iFrames from another page and then return to that page with the history buttons you will get the address bar bug there as well. So far the only solution I've found is to prevent creating of a history entirely -- as far as possible -- by loading iframes with location.replace().
