The new version of Opera (9.5x, codename: Kestrel) has been released, and one of its stated aims is to improve performance across the board. This has resulted in a new ECMAScript engine and substantial revisions to overall page rendering compared to the previous version (9.x, codename: Merlin). So, using the good old scientific method, we can quantify how Opera's performance has changed.
First off, it is important to understand that a browser is composed of many subsystems, each of which affects overall performance. A browser could be screaming fast except for the display of centred transparent PNG images, and it might just so happen that this is what your favourite site uses! Though it is impossible to cover all aspects of "performance", we can take a cross-section of rendering subsystems and ask how fast each one performs. I've chosen to focus on ECMAScript and DOM manipulation, as these have become increasingly important now that applications are made from what were once web pages. I've also done three "real world" page loading tests.
I will update these tests as new builds of Kestrel with significant changes are released; I believe there are several more changes to come. My focus is on Kestrel, so I may not always include results from every other browser. For example, IE7 was sometimes so slow that I simply didn't wait for it to complete, and I added Firefox 2 only after Firefox 3, as I'm more interested in the next generation of rendering engines.
One thing that always bothers me about almost all benchmarks I've seen published online is the total lack of error information! If I measure Object A five times I might get 1.8, 6.5, 3.4, 11.5 and 6.1, which has an average of 5.86. But the value is not reliably 5.86, so if I measure Object B and get 6.48, although it is higher than the average for Object A, the variability means I cannot say it is really 'different'. For all the graphs you see here, I've calculated the 99.9% confidence limits, so I can give you a better idea of the variability of the sample presented. Here is an example: the brown box at the end of each bar is the positive confidence interval, and the negative confidence interval is behind the main bar. What the confidence interval tells you is that, after repeated testing, 99.9% of values are expected to fall within that box for the sample used. Here, for example, someone could claim that Object A at 5.4 was smaller than Object B at 6.1; however, looking at the confidence intervals makes it harder to be certain that it really is different:
Without such limits, any supposed difference, be it in car speeds, the number of crêpes consumed per hour, or how fast a browser renders something, should be taken with large pinches of sea salt. The confidence intervals give you an indicator of how much to trust the value differences for the samples presented here, no more and no less. No p-values are given.
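The Object A example above can be checked with a short calculation. This is a minimal sketch in Python, assuming the limits are the usual normal-theory interval for the mean (the mean plus or minus a critical value times the standard error). The function name `normal_confidence_interval` is made up for illustration, and the graphs here may well have been produced differently.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def normal_confidence_interval(samples, level=0.999):
    """Return (mean, lower, upper) for the sample mean, assuming normality."""
    m = mean(samples)
    # Standard error of the mean, using the sample standard deviation (n - 1).
    standard_error = stdev(samples) / sqrt(len(samples))
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * standard_error
    return m, m - half_width, m + half_width


# The five Object A measurements and the single Object B value from the text.
object_a = [1.8, 6.5, 3.4, 11.5, 6.1]
object_b = 6.48

avg, low, high = normal_confidence_interval(object_a)
print(f"Object A: mean {avg:.2f}, 99.9% limits {low:.2f} to {high:.2f}")
print("Object B inside Object A's limits:", low <= object_b <= high)
```

With only five samples, a Student's t critical value would give wider limits than the normal value used in this sketch, so treat the numbers as a lower bound on the uncertainty.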
The first test is both mathematically intensive and stresses the DOM, as each pixel is rendered as a separate DIV. There are two settings, "basic" and "full", which use 3-pixel and 1-pixel DIVs respectively. First the results from the basic pass:
When run on full, some 59,000 DIVs are dynamically created, substantially testing the DOM. Internet Explorer 7 fails to render all pixels at full resolution, and though Opera Merlin renders quickly, it becomes highly unstable. Firefox 2 took longer than 700 seconds, so I don't plot it. We are therefore left with only Safari 3, Opera Kestrel and Firefox 3:
Finally, when trying to render the full test, I was curious to see what the memory consumption was after the DIVs had been created. It seems Kestrel is the most memory efficient with such a large DOM:
Try it here…
This is taken from the WebKit wiki, and is a pure ECMAScript computation:
Try it here…
This is another test which pushes both the ECMAScript engine and the display routines; it calculates a 3D box which is rotated in real time. There are two settings, small and large; I show the results for large here, giving the averaged time per loop (thus the elapsed time differences will be larger). Also plotted are the results from OS X Opera Merlin and Opera Kestrel. Mac users have noted that Mac Opera feels slower than the equivalent Windows build. This test shows that this is indeed true for Mac Merlin, but note that with Kestrel the platform difference is much smaller.
Try it here…
Warning: I have been told that there are several bugs in Celtic Kane's tests, so take them with a pinch of salt — they are measuring something but not necessarily always accurately. I had already fixed the layer movement test, but the other tests are the same as the original.
Try it here…
Taken from Ian Hickson's performance tests, this tests a set of core DOM manipulations:
Try it here…
Internet Explorer 7 fails the first two, and performs terribly on the second two. For DIV 1, it spends ages at 100% CPU before the test starts, then renders normally. I included the freeze time in the results, as that is fairest.
It is also instructive to look at the CPU time used during these tests. I'll choose DIV 2, as it uses standard DOM methods for indexing the DIVs (time measured using CPU Time in Process Explorer from Sysinternals):
Try the tests: Table, Canvas, DIV 1 and DIV 2.
Safari is missing from the charts above because it fires its onLoad sooner than other browsers. Its performance therefore cannot be compared directly, as the page load tests depend on onLoad. Safari does have a page load timer in its debug menu on OS X, which gives 495ms for Digg, 298ms for the New York Times and 201ms for BBC News. But it does not give the standard deviation, so the error cannot be judged. It certainly looks as fast as Opera Kestrel though.
For Mac users, there are also encouraging signs that Opera Kestrel on the Mac is close to parity with the Windows version (and even faster in some cases).
The current Firefox 3 test build is performing pretty slowly on the page load tests. The Mozilla engineers have done major work on display, switching over to Cairo and changing the reflow heuristics substantially. I expected most of that work to have already stabilised (as those changes occurred some time ago), but let's hope Firefox 3 will improve performance once it gets to beta. It appears that Firefox 3 also fires onLoad later than Firefox 2, fixing the bug Firefox 2 had when it fired before CSS inline resources had come in.
All tests were performed on a 2GHz MacBook with 2GB RAM. XP SP2 was fully patched and run natively. OS X was 10.4.10 and fully updated. Memory and CPU consumption were measured using Process Explorer (Win) or Activity Monitor (OS X). Tests were run 7-10 times, interleaved when possible, from the cache after the first run was discarded. Running benchmarks once is meaningless; ideally tests should be run several hundred times, but that is practically impossible. Confidence limits assumed normal sample distributions (I did bootstrap some samples for comparison, which is non-parametric). Safari 3 was the latest beta 3.0.2. Firefox 3 was the latest alpha 7; Firefox 2 was 22.214.171.124. IE 7 was fully patched. Opera Merlin was the current public release version, 9.23.
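For readers who want to compare normal-theory limits with a bootstrap, a percentile bootstrap can be sketched in a few lines of Python. This only illustrates the general technique and is not the exact procedure used for these results: the timings are hypothetical, and `bootstrap_mean_interval`, the resample count and the fixed seed are all assumptions.

```python
import random


def bootstrap_mean_interval(samples, level=0.999, resamples=20000, seed=0):
    """Percentile bootstrap interval for the mean (no normality assumption)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(samples)
    # Resample with replacement, recording the mean of each resample.
    means = sorted(sum(rng.choices(samples, k=n)) / n for _ in range(resamples))
    # Cut off an equal share of the resampled means in each tail.
    tail = (1 - level) / 2
    lower = means[int(tail * resamples)]
    upper = means[min(resamples - 1, int((1 - tail) * resamples))]
    return lower, upper


# Hypothetical page load timings in milliseconds for one browser,
# with the first (uncached) run already discarded.
timings_ms = [412.0, 398.0, 431.0, 405.0, 420.0, 399.0, 415.0, 408.0]

low, high = bootstrap_mean_interval(timings_ms)
print(f"Bootstrap 99.9% limits for the mean: {low:.1f} to {high:.1f} ms")
```

With only 7-10 runs per test, a bootstrap at the 99.9% level is crude, since the resampled means can never fall outside the range of the observed values. It is best read as a sanity check on the normal-theory limits rather than a replacement for them.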