Our little browser now renders pages that change. That means it’s now laying out the page multiple times. That translates to a lot of wasted work: a page doesn't usually change much, and layout is expensive. So in this chapter we'll modify our browser to reuse as much as it can between layouts.
InlineLayoutcreates both line and text items; it makes reflow for that layout type very odd.
Before we start speeding up our browser, let's confirm that layout is taking up a lot of time. And before that, let's list out everythin our browser does. First, on initial load:
And then every time it does layout:
I'd like to measure how long each of these phases takes. Python does have various profilers, but the easiest thing to do is to check the clock every now and then. To keep it all well-contained, I'm going to make a
Timer class to store the time and report how long each phase took.
That wacky string in the
Timer is pretty easy. First we define a
timer field in our browser:
Then we call
start every time we start one of the phases above. For example, in
browse, I start the
Then, at the end of
render, I stop the timer:
Your results may not match mine,These results were recorded on a 2019 13-inch MacBook Pro with a 2.4GHz i5, 8 GB of LPDDR3 memory, and an Intel Iris Plus graphics 655 with 1.5 GB of video memory, running macOS 10.14.6. but here's what I saw on my console on a full page load for this web page.
[ 0.225331] Download [ 0.028922] Parse HTML [ 0.001835] Parse CSS [ 0.003599] Run JS [ 0.007073] Style [ 0.517131] Layout [ 0.022553] Display List [ 0.275645] Rendering [ 0.008225] Chrome
The overall process takes about one second (60 frames), with layout consuming half and then rendering and network consuming the rest. Moreover, you only download on initial load, so it's really layout and rendering that need to be faster. By the way, keep in mind that while networking in a real web browser is similar enough to our toy version,Granted with caching, keep-alive, parallel connections, and incremental parsing to hide the delay… layout is much more complex in real browsers!Of course they're also not written in Python…Now is a good time to mention that benchmarking a real browser is a lot harder than benchmarking ours. A real browser will run most of these phases simultaneously, and may also split the work over multiple CPU and GPU processes. Counting how much time everything takes is a chore. Real browsers are also memory hogs, so optimizing memory usage is just as important for them, which is a real pain to measure in Python. Luckily our browser doesn't have tabs so it's unlikely to strain for memory!
By the way, this might be the point in the book where you realize you accidentally implemented something super-inefficiently. If something other than the network, layout, or rendering is taking a long time, look into that.The exact speeds of each of these phases can vary quite a bit between implementations, and might depend (for example) on the exercises you ended up implementing, so don't sweat the details too much. This chapter can only help speed up layout and rendering!
To really demand faster layout and rendering, let's implement a browser feature that really taxes them: hover styles. The
:hover CSS selector applies to whichever element the mouse is currently over. It’s often used together with other selectors: an
a:hover selector to change the color of links that you’re hovering over, for example. In our browser, with our limited selectors language, we can at least try out the style:
This should draw a box around any element we hover over. Let’s implement it. First, we need to parse the
:hover selector. Start with a new selector class:
This expects an
ElementNode.pseudoclasses field, which I'll initialize to an empty set:
PseudoclassSelectors need to be created in the parser. That’s basically the same as class selectors, replacing the period with a colon:
Finally we need to handle hover events. In Tk, you do that by binding the
handle_hover method is pretty simple; it calls
find_element, walks up the tree until it finds an
ElementNode, and then sets its
hover pseudoclass. It also has to unset the
hover pseudoclass on the previously-hovered element, so I store a reference to that element in the
hovered_elt field, initialized to
None in the constructor.
class Browser: def handle_hover(self, e): x, y = e.x, e.y - 60 + self.scrolly elt = find_element(x, y, self.nodes) while elt and not isinstance(elt, ElementNode): elt = elt.parent if self.hovered_elt: self.hovered_elt.pseudoclasses.remove("hover") if elt: elt.pseudoclasses.add("hover") self.hovered_elt = elt self.relayout()
Note that hover calls
relayout, because by changing the pseudoclasses it potentially changes which rules apply to which elements and thus which borders are applied where.
Try this out! Black rectangles should appear around every element you hover over—except it'll take about a second to do! We really need to speed up layout.
Actually, this is not how
:hover works, because in normal CSS if you hover over an element you probably also hover over its parent, and both get the
:hover style. With our selector language, I’m at a loss for a true hover style that incrementalizes well, so I’m implementing a fake
:hover selector instead. Put simply, this chapter is a guide to incremental reflow, not hover selectors.
How can we make layout faster? In the intro to this chapter, I mentioned that the layout doesn't change much, and that's going to be the key here. But what exactly do I mean? When the page is reflowed,That is, laid out but excluding initial layout. like on hover, the sizes and positions of most elements on the page change. Borders-on-hover changes the height of the hovered element, for example, and that moves other elements down the page.
But even if some parts of the page move, they don’t change size, and their relative positions won't change. We will leverage this fact to skip as much work as possible on reflow by splitting layout into two phases. First, we'll compute relative positions for each element; then, we'll compute the absolute positions by adding the parent offset to the relative position. The
layout function will become two:
layout2.Look, I've got a bit of a cold, I'm not feeling very creative.
I'll start with block layout to understand this split better. In block layout, the x position is computed from the parent's left content edge and then changed in
layout to account for the margin. Likewise, the y position is initialized from the argument to
layout and then changed once to account for margins. In other words, neither changes much, so it's safe to move these absolute position fields to
layout1 doesn't take any arguments any more, since it doesn't need to know its position to figure out the relative positions of its contents. Meanwhile,
layout2 still needs a
y parameter from its parent.
layout method created children and recursively called
child.layout to lay out its children. We need to split that recursive call into two:
layout1 should recursively call
layout2 should recursively call
One final subtlety: I plan to keep layout objects around between reflows, and that means we won't call the constructor on reflow. So layout object can’t read from the page in the constructor: the constructor won't be re-run when the page changes. In block layout, the margin, padding, and border values are computed in the constructor; let's move that computation into
layout1. Plus, the
children array is initialized in the constructor, but it's only modified in
layout1. Let's move that
children array to
Let’s review the changes before executing them in the other layout modes:
This method is responsible for computing the twelve margin, padding, and border fields, and for creating the child layouts, and for assigning the
h fields on the
BlockLayout. Plus, it must call
layout1 on its children. It may not read the
y fields on anything.
This method may read the
h fields and is responsible for assigning the
y fields on the
BlockLayout. Plus, it must call
layout2 on its children.
BlockLayout sorted, let's move on to text, input, and line layout. I'll save inline layout to the very end.
layout function does basically nothing, because everything happens in the constructor. We'll rename
layout2 and move the computation of the
h fields from the constructor into
InputLayout, the width and height are computed in the constructor (so let's move them to
y but also creates a child
InlineLayout for textareas. Let's put the child layout creation in
layout1 and leave the
y computation in
layout2 will need to call
layout2 on the child layout, if there is one.
LineLayout there’s a small quirk: it does not create its own children (
InlineLayout is in charge of that) and it does not compute its own width (its children do that in their
attach methods). So unlike the other layout modes,
LineLayout will initialize the
w fields in its constructor, and its
layout1 won't recursively call
layout1 on its children or compute the
layout2 method looks more normal, however:
Finally, inline layout. Currently,
InlineLayout.layout computes the
w fields, then creates children with the
recurse method, and then calls
layout on each child. While layout phase should recurse go in? Peeking inside
recurse and its helper methods
input, I see that it only reads the
w field, so all of
recurse can happen in
InlineLayout is responsible not only for its children (line layouts) but their children (text and input layouts) as well, we need update the
input helpers to call
layout1 on the new layout objects they create.
layout2 will compute
y and also recursively call
Note that I'm accepting a
y argument in
layout2, even though I don't use it. That's because
BlockLayout passes one. Probably it would be good to accept both
y in every
layout2 method, though that's not how I've written my code. Also don't forget to update
InputLayout to pass a
y argument in
layout2 when it handles its child layout.
I want to emphasize that
recurse is by far the slowest part of our browser's layout algorithm,Because it calls
font.measure, which has to do a slow and expensive text rendering to get the right size. Text is crazy. and since it happens in
layout1 we will be able to mostly skip it. This will be our the big performance win.
Finally, we need to update our browser to actually call all of these functions. In
Browser.relayout, where we used to call
layout, let's call
layout1 followed by
I cannot overemphasize how it important it is to stop and debug now. Fix the minor bugs (for me: forgetting to add
self to variables when moving them out of constructors; forgetting to move the
children array; and passing the wrong number of arguments to
layout2 before we make things more complicated by only calling
layout1 sometimes. It’s also a good idea to read through all of the constructors for the layout classes to make sure they're not doing anything interesting.
If you time the browser, now that we've split layout into two phases, you'll find that not much has changed; for me, layout got about five percent slower. To find out more, I made
layout2 separate phases in the timer. Here's what it looks like:
[ 0.498251] Layout1 [ 0.006418] Layout2
layout1 is taking all of the time and
layout2 is near-instantaneous.“Near-instantaneous‽ 6.418 milliseconds is almost 40% of our one-frame time budget!” This is a toy web browser written in Python. Cut me some slack. That validates our next step: avoiding
layout1 whenever we can.
The idea is simple, but the implementation will be tricky, because sometimes you do need to lay an element out again; for example, the element you hover over gains a border, and that changes its width and therefore potentially its line breaking behavior. So you may need to run
layout1 on that element. But you won't need to do so on its siblings. The responsibilities of
layout2, outlined above, will be our guide to what must run when. Because
layout1 only reads the node style, it only needs to be called when the node or its style changes.
Let's plan. Look over your browser and list all of the places where
relayout is called. For me, these are:
parse, on initial page load.
js_innerHTML, when new elements are added to the page (and old ones removed).
edit_input, when the contents of an input element is changed.
handle_hover, when an element is hovered over.
Each of these needs a different approach:
parse, we are doing the initial page load so we don't even have an existing layout. We need to create one, and that involves calling
js_innerHTML, the new elements and their parent are the only elements that have had their nodes or style changed.
edit_input, only the input element itself has had a change to node or style.
handle_hover, only the newly-hovered and newly-not-hovered elements have had a change to node or style.
relayout into pieces to reflect the above. First, let's move the construction of
parse, since they only occur on initial load. Then let's create a new
reflow function that calls
layout1 on an element of your choice. And finally,
relayout will just contain the call to
layout2 and the computation of the display list. Here's
class Browser: def parse(self, body): # ... self.page = Page() self.layout = BlockLayout(self.page, self.nodes) self.reflow(self.nodes) self.relayout() def relayout(self): self.start("Layout2") self.layout.layout2(0) self.max_h = self.layout.h self.timer.start("Display List") self.display_list = self.layout.display_list() self.render()
reflow method is a little more complex. Here's what it looks like after a simple reorganization:
Note that while
reflow takes an element as an argument, it ignores it, and restyles and re-lays-out the whole page. That's clearly silly, so let's fix that. First, the
style call only needs to be passed
Second, instead of calling
self.layout, we only want to call it on the layout object corresponding to
elt. The easiest way to find that is with a big loop:There is a subtlety in the code below. It's important to check the current node before recursing, because some nodes have two layout objects, in particular block layout elements that contain text and thus have both a
BlockLayout and an
InlineLayout. We want the parent, and doing the check before recursing guarantees us that.
This is definitely inefficient, because we could store the element-layout correspondence on the node itself, but let's run with it for the sake of simplicity. We can now change the
self.layout.layout1() line to:
This mostly works, but
find_layout won't be happy on initial page load because at that point some of the layout objects don't have children yet. Let's add a line for that:I recognize that in many languages, unlike in Python, you can't just add fields in some method without declaring them earlier on. I assume that in all of those cases you've been initializing the fields with dummy values, and then you'd check for that dummy value instead of using
hasattr. Just make sure your dummy value for
children isn't the empty list, since that is also a valid value. Better to use a null pointer or something like that, whatever your language provides.
hasattr is extremely ugly. Perhaps we need dirty bits?
The logic of returning any layout object without a
children field is that if some layout object does not have such a field, it definitely hasn't had
layout1 called on it and therefore we should, which we do by returning it from
Finally, let's go back to place where we call
relayout and add a call to
reflow. The only complicated case is
handle_hover, where you need to call
reflow both on the old
hovered_elt and on the new one. (You only need to call
layout once, though.)
With this tweak, you should see
layout1 taking up almost no time, except on initial page load, and hovering should be much more responsive. For me the timings now look like this:
I should add rendering times.
So now rendering takes up roughly 89% of the runtime when hovering, and everything else takes up 32 milliseconds total. That's not one frame, but it's not bad for a Python application!
Let's put a bow on this lab by speeding up
render. It's actually super easy: we just need to avoid drawing stuff outside the browser window; in the graphics world this is called clipping. Now, sometimes stuff is half-inside and half-outside the browser window. We still want to draw it! For that, we’ll need to know where that stuff starts and ends. I'm going to update the
DrawText constructor to compute that:
This misdirection is stupid. It should implement it the right way, notice the slowdown, and improve it.
Ok, wait, that's not the code you expected. Why 50? Why not use
font.metrics are quite slow: they actually execute text layout, and that takes a long time! So I'll be using only y position for clipping, and I'll be using an overapproximation to
font.metrics. The 50 is not a magic value; it just needs to be bigger than any actual line height. If it's too big, we render a few too many
DrawText objects, but it won't change the resulting page.
DrawRect objects have top-left and bottom-right coordinates and we can check those in
That takes rendering down from a quarter-second to a hundredth of a second for me, and makes the hover animation fairly smooth. A hover reflow now takes roughly 25 milliseconds, with the display list computation 44% of that, rendering 39%, and
layout2 13%. We could continue optimizing (for example, tracking invalidation rectangles in rendering), but I'm going to call this a success. We've made interacting with our browser more than 30 times faster, and in the process made the
:hover selector perfectly usable.
The more complex, two-phase layout algorithm in this chapter sped up my toy browser by roughly 30×, to the point that it can now run simple animations like hovering.
DrawText, it already knows the width and height of the text to be laid out. Use that to compute
DrawText. Add horizontal clipping to the rendering function.
font.measurefunction is quite slow! Change Add a cache for the size of each word in a given font. Measure the speed-up that results.
BlockLayoutobjects, how much in
InlineLayout, and so on. What percentage of the
layout1time is spent handling inline layouts? If you did the first and second exercises, measure the effect.
styleattribute, which can change which styles apply to the affected element; make sure to handle that with reflow. Furthermore, you can use
setAttributeto update the
hrefattribute of a
<link>element, which means you must download a new CSS file and recompute the set of CSS rules. Make sure to handle that edge case as well. If you change the
srcattribute of a
setTimeout(ms, f)should run the function
msmilliseconds. In Python, you can use the
Timerclass from the
setTimeoutto implement a simple animation; for example, you might implement a "typewriter effect" where a paragraph is typed out letter-by-letter. Check that your browser handles the animation relatively smoothly.