Video: Matt Sweeney — YUI 3 Performance

Video published 2009-10-29.

Transcript:

Matt Sweeney: Thanks for stopping in. I guess we'll get started here. I'm Matt, and it's great to see everybody out here. I see lots of familiar faces out there and some people I don't know, but maybe we can chat over the next couple of days. Welcome to the first day of the conference. I'm pretty excited to tell you guys about some of the performance improvements and enhancements that we've made based on what we've learned over four plus years of working on YUI. I'm going to just dive right in here. We're going to be looking at this from the code's-eye view. Eric did a nice job this morning, if you caught his talk, of covering a bunch of these concepts from the high level with lots of pretty pictures and whatnot. I'm more into the code side of things, so we'll dive in there a little bit.

The first concept I want to talk about is loading the library: getting the library into your browser, up and running, and ready to go for your users to start interacting. This probably looks fairly common. We've got a couple of script includes embedded in your HTML. We've learned over the years from performance testing and whatnot that this really isn't the most efficient way to serve your files, but it'll get the job done. Some of the downsides to embedding your scripts this way, just to give some background here (and this is all caveated depending on the browser, so some of these are more or less true): it'll pause other assets from loading and block the rendering of the page.

Resolved dependencies are the nice thing there, because you don't have to worry about this script loading before this script. That's kind of the reason that this embedding model works, basically, because traditionally the script would halt loading of any subsequent scripts until it was loaded and executed. Then it would be ready to go for the next script, so in case there were any dependencies to work out or anything, that would take care of it for you. You might hear these referred to as blocking scripts, but I'm trying to move away from that term a little bit just because browsers are getting smarter and smarter about this, so there's less blocking happening in the newer versions of these browsers, at least of the loading. They're still blocking execution to manage the dependencies and things.

The next approach we'll look at is creating your scripts dynamically. What we're doing here is creating some script nodes, giving them a source, and manually loading those. So what's the difference here? Well, we're talking about dynamic scripts now. Those are taken right out of the normal browser rendering thread, if you will, and it allows your resources to load concurrently, and doesn't block your rendering. However, your dependencies don't get magically resolved for you. In this context you'll have to do something else: chain your onload listeners, ready states, etc., to manage that manually on your end if you do have inter-script dependencies.
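A minimal sketch of what that manual chaining can look like, as an illustration rather than YUI's own implementation: 'loadScript' injects one dynamic script node and reports when it has finished, and 'loadInOrder' chains those callbacks so each script only starts after the previous one has loaded. Both function names and the injected 'loadOne' parameter are made up for this example. The 'onreadystatechange' branch is there for older IE versions, which report script loading through 'readyState'.

```javascript
// Browser-only helper: inject one dynamic script node and call back when done.
// Hypothetical helper for illustration; not part of YUI.
function loadScript(url, callback) {
  var script = document.createElement('script');
  var done = false;

  function finish() {
    if (done) { return; }
    done = true;
    script.onload = script.onreadystatechange = null;
    callback(url);
  }

  script.onload = finish;
  // Older IE reports progress through readyState instead of onload.
  script.onreadystatechange = function () {
    if (script.readyState === 'loaded' || script.readyState === 'complete') {
      finish();
    }
  };

  script.src = url;
  document.getElementsByTagName('head')[0].appendChild(script);
}

// Load urls one after another so later scripts can depend on earlier ones.
// loadOne(url, cb) is injected so the chaining can be exercised without a DOM.
function loadInOrder(urls, loadOne, whenAllLoaded) {
  var index = 0;

  function next() {
    if (index >= urls.length) {
      whenAllLoaded();
      return;
    }
    loadOne(urls[index++], next);
  }

  next();
}
```

In a page this would be called as loadInOrder(['a.js', 'b.js'], loadScript, start), where the file names and 'start' are placeholders. Scripts with no dependencies on each other can be passed to 'loadScript' directly so they load concurrently.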

The next piece we've got here is what we're calling our "combo handler". What this does is allow you to dynamically decide which files you want to choose, or to statically include a URL like this where you can say 'I know exactly what I want'. It'll take both of these files, combine them into a single request, cache it after the first time it gets called, so that any subsequent requests take advantage of that. And it works for both JavaScript and CSS, although obviously those are handled separately; you get a script for your JS, and links for your CSS.
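To make the combo URL idea concrete, here is a small sketch that joins a list of file paths into a single request. The 'buildComboUrl' name is made up for this example, and the base URL and the version-prefixed file paths are assumptions about the URL layout rather than something spelled out in the talk.

```javascript
// Join several file paths into one combo-handler request.
// base and the example paths below are assumed values for illustration.
function buildComboUrl(base, paths) {
  if (!paths.length) {
    throw new Error('at least one file path is required');
  }
  // The combo handler receives the files as '&'-separated paths after '?'.
  return base + '?' + paths.join('&');
}

var comboUrl = buildComboUrl('http://yui.yahooapis.com/combo', [
  '3.0.0/build/yui/yui-min.js',
  '3.0.0/build/oop/oop-min.js'
]);
```

JavaScript and CSS files would each get their own combo URL, since one ends up in a script node and the other in a link node.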

You can see here that we're pointing at yui.yahooapis.com, and we make all of our stuff available from our CDN. Some people may want to roll or serve their own stuff, so you can actually use the PHP Loader that we have. Chad is doing a talk on that tomorrow, so I encourage you to check that out if you're interested in the PHP Loader. It includes a thin combo handler as well. Dav's also hooked up a local client-side loader that you can grab from his blog. Those are two ways to go ahead and serve your own rather than relying on the Yahoo! APIs, although our uptime is pretty fantastic so I wouldn't worry about that. But I understand that some organizations may be a little sensitive to who's hosting what, where.

OK, now we'll get right into YUI 3. We've kind of taken these concepts and attempted to boil those into the architecture, so what we're seeing here is a single script include, and then we'll go right ahead and start using stuff. In this case, what you need to use – anim – didn't come with this initial script. But what's happening under the hood is we're dynamically fetching all the requirements that we know anim needs and then your callback gets executed once everything is ready to go, and your scripts will run.

What does this do for you? This is what we call the Seed File, and this is the lightest way to load the library. It'll give you something that you can throw in your page, 6K. It'll block, but it'll be a short amount of blocking happening, or embedded script action if you will. Then you get two additional dynamic requests. One is going to bootstrap the loader, which will come in and have all the metadata it needs. Then it will examine what you're using, and use its knowledge of the dependencies to go ahead and roll up a single request to the combo handler and take care of all that for you.

Another approach you can use is to frontload the loader as well. This will help you minimize HTTP requests, and it still does the subsequent additional dynamic request for any dependencies. What you're looking at here is about 11K over the wire, and again, it's the same built in dependency management you get for free. These are just different techniques that you can use. Depending on your needs, you can tailor this to your particular use case based on profiling and other tools that you run, which we'll look at a little bit; I'll talk about a couple of tools that we like to use to measure the performance.

A third way to do it is to go ahead and create one big combo-handled script, dump that into your page, and then you know everything in your dependencies is going to be there. The use statement is smart enough to know that if you say 'star' then we're just going to assume everything you need is already there, so we're just going to go ahead and provide you everything that is already available on the page. What this does, though, is it gives you a heavier initial request. Again, this is an embedded script, so it could have some effects on your loading of other resources, or the rendering of your HTML. But that might be good in some cases, where you really do just want to take this hit up front because you've got some pseudo-application functionality that's very script-dependent, let's say. You don't have to manually construct these crazy combo URLs; you can use our configurator, a nice little UI there where you can just say OK, I'm going to need animation, and it'll fill in the whole URL series based on the dependencies. Here's the URL to that, so you can play around with it.

Another concept Eric briefly touched on was lazy loading. We'll get a bit deeper into that, and what that actually means here. You can see here, if you've looked at YUI 3 at all, you're probably familiar with this use syntax with the Y instance callback. What that's essentially doing is once all the dependencies for node are ready, it passes in an instance of YUI that's your unique instance which won't be affected by any other instances on the page. You can do whatever you want there; it's sandboxed, naturally, with this anonymous function wrapper.

But let's say you don't want to frontload stuff that your user may not ever see or touch or use. Depending on how you want to set it up… In this case, I said whenever somebody mouses over a calendar I want to load the overlay, so then I can throw in my dynamic calendar. But I've got lots of little calendar thumbnails on my page and they might never touch one of those, so I may not want to pull all of that up front. You can see here that we've got a nested use statement. I'm still working off of our same YUI instance, and then if our overlay has any dependencies that aren't loaded yet, loader will get those and then execute this function when everything's ready to go. That gives you on-demand loading and allows you to really fine-tune based on what the user needs, rather than what you think the user might need in terms of what you're delivering to the page. This will work for components that have a CSS file, and it's smart enough to know that, so if overlay has its own style sheet then it'll go ahead and pull that in as well.
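The on-demand pattern described here boils down to: do nothing until the first interaction, fetch once, and run every waiting callback when the fetch completes. Below is a library-free sketch of that idea. 'createLazyLoader' and the 'load' function passed to it are hypothetical names; in YUI 3 the load step would be the nested use call inside the mouseover handler.

```javascript
// Wrap an asynchronous load so it starts on first use and runs only once.
// load(done) is any function that fetches the resource and then calls done().
function createLazyLoader(load) {
  var state = 'idle';      // 'idle' -> 'loading' -> 'ready'
  var waiting = [];

  return function whenReady(callback) {
    if (state === 'ready') {
      callback();
      return;
    }
    waiting.push(callback);
    if (state === 'idle') {
      state = 'loading';
      load(function () {
        state = 'ready';
        // Flush everyone who asked while the load was in flight.
        var queued = waiting;
        waiting = [];
        for (var i = 0; i < queued.length; i++) {
          queued[i]();
        }
      });
    }
  };
}
```

Every calendar thumbnail can share one loader created this way: the first mouseover triggers the fetch, and later mouseovers run their callback immediately.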

Another tool that we have is created by a non-YUI team member, but he's a fellow Yahoo!, works in Travel, Matt Mlinac. He introduced the ImageLoader component back in 2002, sometime pretty early on. What this allows you to do is defer the loading of your images and then, based on certain triggers, load them then. In this case we're saying OK, when somebody hovers over the tabset, load the background image for the control sprites, and load the source image for our image on tab two. It's kind of a nice way to not take the hit for all of your resources up front. Especially in the case where you've got tabs that somebody may never touch, this allows you to not worry about pulling those resources.

Another way of using ImageLoader is using class names. Additionally, it's got this built in functionality for checking below the fold, so if you give it a fold distance it will check the offset below the fold. It does require you to hide your images before they start loading, otherwise there's going to be a race. What we like to do is only hide them when JS is there, and that way if JavaScript doesn't load for whatever reason — a less capable browser, bad request, what have you — the images will still show. Then in this case, we're saying anything with the class name, check fold, and if it's below the fold don't load it yet. So it'll run through and find anything that matches that pattern, and apply that rule to it.
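The below-the-fold check is essentially one comparison. The sketch below is not ImageLoader's code; 'shouldLoadNow' and its parameters are made-up names, and it assumes the fold distance means how far below the visible area an image may sit and still load right away.

```javascript
// Decide whether an image should load now or wait until the user scrolls.
// All values are in pixels; imageTop is measured from the top of the document.
function shouldLoadNow(imageTop, scrollTop, viewportHeight, foldDistance) {
  var fold = scrollTop + viewportHeight;   // bottom edge of the visible area
  return imageTop <= fold + foldDistance;
}
```

Re-running the same check on scroll or resize would pick up images as they come within range of the fold.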

Again, this is similar to lazy loading but for images. It's got a nice little API around it that allows you to do the on-demand stuff with the custom triggers, including custom events. If you want to react to something that's not a DOM event, you could do that if you're firing events from another component. It's available, along with all of our other components, from developer.yahoo.com. That's our YDN site.

Another way to load your own stuff in a lazy loading kind of model is to use the Get Utility. This is what we use under the hood for loader; this drives loader functionality. This is what comes with the Seed File, bootstraps the loader, loader comes in with all the meta data, does its thing. Eric demonstrated how it can be used to add your own module and it have its own dependencies and things, but that might be overkill in some cases. You may just want to fetch a script, or fetch a style sheet, where you can do that and there are handlers for success and failure that you can hook into. StyleSheet is just going to dump a link in and load it, so your styles will get applied when it's ready. I won't read the slides here but you get the picture.

We're going to move right into running YUI and some runtime enhancements and improvements that we've made. The first thing that we're looking at here is your standard event listener. In this case, however, instead of adding it to one element, we're adding it to any element with the class name 'headline' that lives inside of a news module. That's going to run through all these headlines, add a click listener to each one, and then make sure that the current target actually maps to one of the items that fits the pattern. If you're familiar with event bubbling you know that the target might be a text node, or a node within a headline.

This is another concept that Eric briefly touched on: Event Delegation. This is a way to manage a bunch of potential interactions with a single listener. This is really nice when you have N number of things that you might be running through, and touching each one, too. What you can do here is listen on an ancestor, because all of those events will bubble up to the ancestor and you can route them from there. Well, the target will always be the thing that was interacted with. What we're really interested in is the headline, so what we're going to do is check if they clicked on the headline. If not, then they probably clicked inside the headline, but they may have clicked something between the headline and its news ancestor, so we're going to go ahead and try to resolve that. Then if we have a node at the end of the day there, we know it's a headline and we can go ahead and do whatever we want to do with it. This is an example of how you would do manual event delegation in YUI 3.
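Stripped of the DOM, the manual version is a walk from the event target up toward the ancestor that owns the listener, stopping at the first element that matches. The sketch below models elements as plain objects with 'className' and 'parent' fields so it runs anywhere; 'resolveDelegateTarget' is a hypothetical helper, not a YUI method.

```javascript
// Walk from the event target up to (but not including) the delegating
// ancestor and return the first element that satisfies matches(), or null.
function resolveDelegateTarget(target, container, matches) {
  var el = target;
  while (el && el !== container) {
    if (matches(el)) {
      return el;
    }
    el = el.parent;
  }
  return null;
}

// Tiny stand-in for a DOM tree: news > headline > emphasis text.
var news = { className: 'news', parent: null };
var headline = { className: 'headline', parent: news };
var emphasis = { className: '', parent: headline };
var byline = { className: 'byline', parent: news };

function isHeadline(el) {
  return el.className === 'headline';
}
```

A click on 'emphasis' resolves to 'headline', while a click on 'byline' resolves to null and can be ignored by the single listener on 'news'.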

We've taken this one step further and built it into the library, so now all of your nodes come with a delegate method which takes a similar signature to a normal listener. That last argument there is the scoped query to match, so that will find all the headlines within the news module and go ahead and clean up the current target for you, so you don't have to worry about any of this 'what's the target, crawling up', etc. It's always guaranteed to be one of the headlines that they clicked on. So this can really be a huge performance benefit, especially during the clean-up phase of events. But it also comes in handy if you've got dynamic content coming in, you're adding items, new headlines, and you don't want to have to manually add a new listener. All those clicks will just bubble up, and the ancestor that's delegating them will go ahead and route those as needed.

Another thing that we provide is a handle to detach your events. You can see here the event subscription returns a listener, and the listener has a detach method. This is one way you can detach events; there are a couple of ways. You generally don't need to do this unless you're really conscious of memory management in your application, or you've got a long-running application that you're potentially adding and removing things to and you want to be able to detach. When you destroy a node you can optionally tell it to detach listeners, but this will always get cleaned up for you when the page unloads, just to keep memory down between browser sessions, especially in IE.
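The handle pattern itself is small enough to sketch without the library. This is a toy event target, not YUI's event system; 'createTarget', 'on', 'fire', and 'detach' are names chosen for the example.

```javascript
// Minimal event target whose on() returns a handle with a detach() method.
function createTarget() {
  var listeners = [];

  return {
    on: function (fn) {
      listeners.push(fn);
      return {
        detach: function () {
          var i = listeners.indexOf(fn);
          if (i !== -1) {
            listeners.splice(i, 1);
          }
        }
      };
    },
    fire: function (payload) {
      // Copy first so a listener may detach itself while we iterate.
      var current = listeners.slice();
      for (var i = 0; i < current.length; i++) {
        current[i](payload);
      }
    }
  };
}
```

Holding on to the returned handle means the caller never has to pass the original function back in to unsubscribe.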

Another interesting piece of the puzzle here is the later method. What this will allow you to do is take what would normally be a synchronous function, meaning it would run in the normal flow of your program, and make it asynchronous. Now your script isn't going to have to wait until this thing is done before it moves on to the next thing. This is really nice when you could potentially have something that's going to take a while, you're not sure how long it's going to take, or your subsequent code doesn't care if it's done or not because it's going to have its own callback to do whatever it needs to do. In this case we're saying, OK, in zero milliseconds run this function with no context. You can actually correct the context if you want to, say, set 'this' to something different, and that comes in handy if you've got methods that are scoped to instances. What happens here is that when it gets done finding all of your 'loading' elements, it will remove the class, and you don't have to worry about that stopping the execution of your script.
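As a sketch of the idea, using a plain 'setTimeout' wrapper rather than YUI's own implementation, deferring a function with a zero-millisecond delay and an optional context looks roughly like this. 'runLater' and its argument order are assumptions made for the example.

```javascript
// Schedule fn to run after ms milliseconds with `this` bound to context.
// Returns a handle whose cancel() stops the call if it has not run yet.
function runLater(ms, context, fn, args) {
  var timer = setTimeout(function () {
    fn.apply(context, args || []);
  }, ms);

  return {
    cancel: function () {
      clearTimeout(timer);
    }
  };
}

var log = [];
var widget = { name: 'news' };

runLater(0, widget, function (label) {
  log.push(label + ':' + this.name);
}, ['deferred']);

// Runs first: the deferred call waits until the current script has finished.
log.push('sync');
```

Even with a delay of zero, the deferred function only runs after the current script has finished, which is what keeps the rest of the code from waiting on it.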

This helps keep your UI responsive, especially in the case where a long-running process can bog down the user experience. We really want to make sure that things feel snappy to the end user, because otherwise they're going to assume this is a bad product.

What we're looking at here is using node instances. The node is like our DOM façade: this is the thing that wraps up functionality and controls what happens to these underlying DOM nodes. There's a one-to-one mapping, one node instance to one DOM node. It provides a DOM-like API, plus some sugar and helper methods to do common tasks. We've optimized these pretty substantially, so you can do things like add snippets of HTML on the fly, remove elements without having to call parent node, remove child, element, and all that fun stuff. You can see in the first example that there's some chaining there, so you can write a little bit terser code when the occasion arises to chain functions that aren't getters.

Another thing this does is it will stop at the first match. So, it won't run through and look for every one of these things, it'll just find the first one and be done. This is a great way to search for things by ID or if you're just looking for the first one of something. I don't know if you've caught this here, but there's also a one method on the node instance, so what's happening on the second line under 'news' is it's a scoped query, so we're saying OK, find the one loading message under 'news', the first, and remove it.

If you guys want to throw out any questions we can pause, but we'll also do a Q&A at the end. I'll try to slow it down a little bit so some of this can sink in.

Now what we're looking at is the all method. This is a factory for creating node list instances. A node list is essentially a batch wrapper for node operations, so under the hood it will use a single temporary node instance to crank out all of the functionality that you need on each one of these nodes. That's highly optimized as well. One of the nice things that that allows you to do is avoid creating all these extraneous node instances if you're just doing some one-off task: you don't need to make this a node, this a node, this a node, you just create one node list instance, run the function against that, and call it a day. Unlike node, obviously, it's going to look for all of the matches for the query you pass in.

OK, that's it for loading and running. In terms of performance, before we start thinking about performance optimization or any of these kinds of things we should probably measure performance to see what it's looking like, and see what bubbles up to potentially need to be investigated for further optimization, etc.

In terms of loading, a great tool Yahoo! provides is YSlow. Most of you are probably familiar with this. It will go ahead and analyze your page based on a number of rules it knows in terms of performance best practices, and give you a report card essentially, based on that score. You can see here one of the reports coming out of there, and it's going to have tips and recommendations for how to resolve some of these cases so that you can work towards your A-grade. In some cases you're breaking rules; hopefully you're understanding the rules before you start breaking them. In some cases you can't put your JavaScript at the bottom, in some cases you might not be able to combine all your requests, or you're running ad code on separate domains, etc., things like that. But like I said, before you start breaking the rules it's really good to understand the rules. This will also give you some great tips for how to optimize your page load.

It's also got some pretty graphs for analyzing the page weight. There's one with an empty cache, there's one with a primed cache, and it'll show you the difference there. You can look at it and see, OK, in terms of the overhead for this page, what are we looking at here? Where should I start thinking about breaking things up, or potentially lazy loading some things, or deferring some things? You can see here that we've got 130K of JavaScript, and it will be interesting to know: of that JavaScript, how much do we really need during the loading phase? Is it possible to start deferring some of this stuff? If you haven't already, you should have the YSlow Firebug plugin and run that against your sites. It's a great first step toward making your initial page view come up quick.

That was a good segue into Google Page Speed, which they released, I think, maybe early this year or late last year. There's some overlap with YSlow in terms of known rule sets, and a score that it will provide you. One of the really cool things that it has, though, is this 'profile deferrable JavaScript' option, which answers the previous question: we've got all this JavaScript loading, and what of that can we take out of the mix? So if you enable this option, then you'll get as part of your overall report some kind of breakout about the amount of script. In this case, over 80 per cent of the script isn't used before the onload gets called, so that could be a good place to start looking for things to defer.

There's also an activity panel as part of Page Speed. This is kind of like a net tab, except it's a little juiced up here. One of the cool things that it'll do is break out JavaScript parse phase, JavaScript execution phase, and you can visualize which parts of your script might be taking longer than others. Are these things affecting the overall load? You can kind of see some waterfall happening here, so it looks like there's some blocking going on that we may want to investigate, and potentially defer some of that stuff. So it's a great first step in terms of looking at what we're going to do to get this initial view up quicker.

You can see up here that we've also got these buttons for show uncalled functions, and show delayed functions. Those will give you a report of all the explicit functions that it knows were not called up to onload firing, and it will also show you functions that were never called at all. They may or may not be needed. You may have some component that was never interacted with, so those scripts were never executed. Those might be a good candidate for a lazy loading technique.

Like YSlow, it's a Firebug plugin. I highly recommend you grab and use that, and start thinking more about how to manage your resources, especially as we're just throwing more and more stuff in the browser and really wanting to keep the initial view up quick and keep things responsive.

Nicholas Zakas contributed a profiler component to YUI 2, and a port of it to YUI 3. This is a great tool for doing cross-browser profiling. If you're not familiar with profiling, this is a way to register various portions of your application to give you a report based on how many times it was called, how long it took to execute, what was the maximum, minimum, average, and those kinds of things. Ideally, a profiler will give you some results that you can export and analyze in interesting ways. The YUI profiler does all of this for you.

As we move down the list here you can see some of the ways you can profile things. You can register a single function for profiling; in this case, the node list factory. We want to see how long it is taking to crank out these node lists. We may want to profile a whole instance and say: everything that happens to a node instance, I want to know about. When you register a constructor, what will happen is it will run through and instrument all of its prototype methods for profiling. Then the third way is if, say, you've got a static object that's got a bunch of methods hanging off it, you can just register your object (this is how event is designed), and this way it'll do the same thing: it'll run through and instrument all the methods on this object, and all of those will be included in your report.

Additionally, often there are times when you don't want to time functions, but blocks of code, or larger sections of code. There's also the stopwatch style functionality that you can use to, say, start a timer, name it, and do whatever you need to do, then stop the timer. The nice thing about this is that it all gets rolled into your report, so you'll see that along with the rest of your profile data.

Here's a sample report. As I mentioned, you get fields for how many times it was called, what's the average speed, min and max. Points is kind of interesting, especially if you're trying to export this and do some charting or some other visualizations. You could do a scatter, or some other diagram to visualize each of the calls. The results are JSON, so they're easily portable. If you want to pull those into a table, or if you want to send those to your server and collect those, and be able to compare those with other browser results, all of those things are easily enabled here.
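To show where those report fields come from, here is a bare-bones instrumenting wrapper. It is a sketch of the general technique, not the YUI Profiler's code; 'instrument' and 'report' are hypothetical names, and the clock is passed in as a parameter so that it can be swapped for a fake one in tests.

```javascript
// Replace owner[name] with a timed wrapper and collect per-call durations.
// now() returns the current time in milliseconds (e.g. Date.now).
function instrument(owner, name, now) {
  var original = owner[name];
  var points = [];

  owner[name] = function () {
    var start = now();
    try {
      return original.apply(this, arguments);
    } finally {
      points.push(now() - start);
    }
  };

  return {
    // Summarize the calls seen so far.
    report: function () {
      var total = 0;
      for (var i = 0; i < points.length; i++) {
        total += points[i];
      }
      return {
        calls: points.length,
        avg: points.length ? total / points.length : 0,
        min: points.length ? Math.min.apply(null, points) : 0,
        max: points.length ? Math.max.apply(null, points) : 0,
        points: points.slice()
      };
    }
  };
}
```

Because the report is a plain object, JSON.stringify can turn it into something that is easy to post to a server and compare across browsers.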

I do have a couple more slides here. There's an interesting area of discussion on some other tools that have popped up. This one in particular seems to get a lot of traction out there, and people seem to think that this is the be all and end all of performance measuring, and what determines the performance of a library. I would caveat that by saying in most cases, rarely do we see selector queries bubbling up in profiling. So from my perspective, this isn't really hugely important. It's more important for YUI 3, now that we've got the selector engine embedded into the core and everything running on top of that. In YUI 2 it's an a la carte module that you can use to pass around your collections of elements to other methods that operate on the collections.

What we're looking at here is CSS 2-only selectors, and I've whittled some of those down so I could cram it into the slide here. These are the results for IE7, which is really the only one that matters anymore. Well really, IE less than 8 in compatibility mode, or IE8 in quirks mode. All the other A-grade browsers now have their own native query engines, so you'll basically see zeros, ones, and twos down the line, which will make that even less of an issue.

Again, I think that this is an interesting tool, especially for comparing your versions against each other, making sure you're moving in the right direction, and also to test out various selectors and seeing how they perform, just in case you're wondering if this is more optimized than this, etc. It's really easy to pull the Slickspeed test suite down and just update the selectors list and add or remove any test from there. It's also really easy to add another library if you've got your own thing, or if there's another one floating out there that you want to see some results on, you can go ahead and do that.

You can see that we've made improvements on this. YUI 3 is significantly faster than YUI 2. But again, all of that should be taken with a grain of salt. For those that aren't familiar with Slickspeed, what it does is measure JavaScript selector queries. You might get the impression it's measuring CSS parsing or something like that, but this is really a manual implementation of JavaScript query, or forking to a native query. Again, the primary use case is IE, and here's the main repository. I went ahead and threw up the full CSS 2 test that has the ones that I stripped out there, so if you wanted to go ahead and play around and run those on your own, feel free. If you're seeing results that are wildly different, or if you want to discuss that, I'd like to talk about that a little bit more.

The other one is piggybacked off Slickspeed and uses the same kind of foundational PHP. Unlike selector, though, it doesn't have a simple set of tests that you can run against every browser. You have to manually instrument all of these tests based on whatever the syntax might be for these various libraries. So this one's a little more tedious to add tests to. This one also, to me, is more meaningful than Slickspeed.

Now we're really getting into the area where you're touching the DOM. This is probably the foundation of most of the work you're doing in the browser, so we want this to be really fast. Again, I'm including these other libraries as a basis for comparison, and also to just address any performance concerns that YUI users might have out there. Ultimately, we're competing against pure DOM in my mind. If you can't get close to a pure DOM approach, then personally I would be less inclined to use a library and more inclined to just use the DOM. That's our goal, ultimately: to close the gap with pure DOM. In this case, the asterisk just says that it's basically pure DOM functions only, with a couple of utility methods for a basic selector query and an add-listener wrapper.

Here it is on GitHub, the original TaskSpeed repository. Again, I went ahead and threw this up on YUI Library, so if folks want to play around with that and compare results, feel free to grab me and talk about any of that stuff.

That's all I had, but if anyone had any more questions about any of that, we've got a few minutes and we could talk a little more. Or we could just break early. Anything? OK. I'm going to be around for the next couple of days and I'd love to meet you guys and talk about things like performance, things about the core architecture, animations, CSS, node, any of that kind of stuff. If you're interested and want to start a conversation, or have any other questions and you want to grab me offline, feel free. I appreciate it. Good day.

[applause]

Copyright © 2010 Yahoo! Inc. All rights reserved.
