Before I started this blog, I had one inside Google with similar content. This post is recycled from there.
Dean dropped Chrome startup time by 40 milliseconds on Vista, which is pretty serious given how much time has been put into making startup fast. E.g., from the blog:
We carefully monitor startup performance using an automated test that runs for almost every change to the code. This test was created very early in the project, when Google Chrome did almost nothing, and we have always followed a very simple rule: this test can never get any slower.
So what's the change? r5069 — some futzing with DLL loading.
Chrome is built as an exe/dll pair, where the exe just calls
the one exported function from the DLL, called ChromeMain. This
has a few benefits, most of which I don't understand, but one
easy-to-grasp trick is that it makes auto-update easier: since Windows
doesn't let you overwrite files that are in use, the auto-update writes
a new dll in a new directory and the same exe picks up the new dll when
it starts the next time.
During startup we also use the Windows API to query the version info out of the DLL. This is fed into the crash catcher (which wants to annotate crashes with which version crashed) so it happens pretty early on startup. The API for this query follows a standard Windows pattern, where you call the function once and it returns how large the buffer needs to be, then you call it again with the appropriate buffer. Dean found (via this neat timing tracer he wrote) that this function was loading and unloading the DLL twice and that was relatively expensive.
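For anyone who hasn't run into that call-twice pattern, here is a minimal sketch of its shape in portable C++. It is not Chrome's code: `QueryVersionInfo`, `ReadVersion`, and the version string are made-up stand-ins. On Windows the real pair is, as far as I know, `GetFileVersionInfoSize` followed by `GetFileVersionInfo`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a Windows-style query function. Pass a null
// buffer to learn the required size; pass a large-enough buffer to get the
// data. Either way it returns the number of bytes required.
std::size_t QueryVersionInfo(char* buffer, std::size_t buffer_size) {
  static const char kVersion[] = "1.2.3.4";     // made-up value for illustration
  const std::size_t needed = sizeof(kVersion);  // includes the trailing NUL
  if (buffer != nullptr && buffer_size >= needed) {
    std::memcpy(buffer, kVersion, needed);
  }
  return needed;
}

// The caller's side of the pattern: ask for the size, allocate, ask again.
std::string ReadVersion() {
  const std::size_t needed = QueryVersionInfo(nullptr, 0);  // call 1: size only
  std::vector<char> buffer(needed);
  const std::size_t written =
      QueryVersionInfo(buffer.data(), buffer.size());       // call 2: fill buffer
  assert(written == needed);
  return std::string(buffer.data());
}
```

The point to notice is that the caller makes two separate trips into the API, so any setup work the API does internally (here, loading the DLL) can end up being paid twice.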
So the fix, since we're going to load the DLL anyway to call into
ChromeMain, is to preload the DLL just before calling the version-query
functions. This was a tiny win, on the order of a few milliseconds,
on the XP machine he was testing on (as well as the XP performance
tester), but on the Vista performance buildbot startup time plummeted.
He theorized that Vista does a lot more work in loading this DLL.
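To see why preloading helps, here is a toy model of a reference-counted loader. It is a sketch under stated assumptions, not a measurement: `FakeLoader`, `QueryVersionStep`, and `CountExpensiveLoads` are hypothetical names. I am assuming that each of the two version-info calls does one load/unload internally, and that the load is only expensive when the reference count goes from zero to one.

```cpp
#include <cassert>

// Toy model of a reference-counted library loader. Assumption (not measured):
// the expensive work happens only when the count goes from 0 to 1.
struct FakeLoader {
  int ref_count = 0;
  int expensive_loads = 0;

  void Load() {
    if (ref_count == 0) ++expensive_loads;  // first reference does the real work
    ++ref_count;
  }

  void Unload() {
    assert(ref_count > 0);
    --ref_count;
  }
};

// Stand-in for one version-info call that loads and unloads the DLL internally.
void QueryVersionStep(FakeLoader& loader) {
  loader.Load();
  loader.Unload();
}

// Runs the two-call query, optionally holding the DLL loaded beforehand, and
// reports how many times the expensive load path was taken.
int CountExpensiveLoads(bool preload) {
  FakeLoader loader;
  if (preload) loader.Load();   // the fix: load first and keep it loaded
  QueryVersionStep(loader);     // size query
  QueryVersionStep(loader);     // data query
  if (!preload) loader.Load();  // the DLL gets loaded anyway to call ChromeMain
  return loader.expensive_loads;
}
```

In this model `CountExpensiveLoads(false)` takes the expensive path three times and `CountExpensiveLoads(true)` takes it once, since the preloaded reference keeps the count above zero while the queries run.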
(You could reasonably argue that you shouldn't care about milliseconds, but a lot of these measurements are done when all the disk caches are warm. What takes 40ms now could take seconds if your disk is loaded. There are tools in the tree that attempt to flush the disk cache for the relevant data before running so we can try to estimate performance in the cold disk case.)