Scheduling Tasks and Threads | Web Browser Engineering

Modern browsers must handle user input, request remote files, run various callbacks, and ultimately render to the screen, all while staying fast and responsive. That requires a unified task abstraction to keep track of the browser’s pending work. Moreover, browser work must be split across multiple CPU threads, with different threads running tasks in parallel to maximize responsiveness.

Tasks and Task Queues

So far, most of the work our browser’s been doing has come from user actions like scrolling, pressing buttons, and clicking on links. But as the web applications our browser runs get more and more sophisticated, they begin querying remote servers, showing animations, and prefetching information for later. And while users are slow and deliberative, leaving long gaps between actions for the browser to catch up, applications can be very demanding. This requires a change in perspective: the browser now has a never-ending queue of tasks to do.

Modern browsers adapt to this reality by multitasking, prioritizing, and deduplicating work. Every bit of work the browser might do—loading pages, running scripts, and responding to user actions—is turned into a task, which can be executed later, where a task is just a function (plus its arguments) that can be executed:
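Here is a minimal sketch of such a task. The names Task, task_code, and args are the ones the surrounding text uses; the run method name and the clearing of fields after running are assumptions made for illustration.

```python
class Task:
    """A deferred function call: the function plus its arguments."""

    def __init__(self, task_code, *args):
        # *args collects any extra positional arguments into a tuple.
        self.task_code = task_code
        self.args = args

    def run(self):
        # Unpack the stored arguments back into an ordinary call.
        self.task_code(*self.args)
        # Drop the references so a finished task cannot keep data alive.
        self.task_code = None
        self.args = None


results = []
task = Task(lambda a, b: results.append(a + b), 2, 3)
task.run()  # appends 5 to results
```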

Note the special *args syntax in the constructor arguments and in the call to task_code. This syntax indicates that a Task can be constructed with any number of arguments, which are then available as the tuple args. Then, calling a function with *args unpacks that tuple back into multiple arguments.

The point of a task is that it can be created at one point in time, and then run at some later time by a task runner of some kind, according to a scheduling algorithm. (The event loops we discussed in Chapter 2 and Chapter 11 are task runners, where the tasks to run are provided by the operating system.) In our browser, the task runner will store tasks in a first-in, first-out queue:

When the time comes to run a task, our task runner can just remove the first task from the queue and run it. (First-in, first-out is a simplistic way to choose which task to run next, and real browsers have sophisticated schedulers which consider many different factors.)
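A first-in, first-out task runner along these lines might look like the sketch below. TaskRunner and its run method are named in the text; schedule_task appears later in the chapter; the tab argument and the small Task class repeated here (so the block runs on its own) are assumptions.

```python
class Task:
    def __init__(self, task_code, *args):
        self.task_code = task_code
        self.args = args

    def run(self):
        self.task_code(*self.args)


class TaskRunner:
    def __init__(self, tab):
        self.tab = tab    # the tab whose work this runner holds
        self.tasks = []   # oldest task first

    def schedule_task(self, task):
        self.tasks.append(task)

    def run(self):
        # Run at most one task per call, so the caller can check for
        # user events between tasks.
        if len(self.tasks) > 0:
            task = self.tasks.pop(0)
            task.run()


log = []
runner = TaskRunner(tab=None)
runner.schedule_task(Task(log.append, "first"))
runner.schedule_task(Task(log.append, "second"))
runner.run()  # runs only the oldest task
```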

To run those tasks, we need to call the run method on our TaskRunner, which we can do in the main event loop:

The TaskRunner allows us to choose when exactly different tasks are handled. Here, I’ve chosen to check for user events between every Task the browser runs, which makes our browser more responsive when there are lots of tasks. I’ve also chosen to only run tasks on the active tab, which means background tabs can’t slow our browser down.

With this simple task runner, we can now queue up tasks and execute them later. For example, right now, when loading a web page, our browser will download and run all scripts before doing its rendering steps. That makes pages slower to load. We can fix this by creating tasks for running scripts:

Now our browser will not run scripts until after load has completed and the event loop comes around again. And if there are lots of scripts to run, we’ll also be able to process user events while the page loads.

Timers and setTimeout

Tasks are also a natural way to support several JavaScript APIs that ask for a function to be run at some point in the future. For example, setTimeout lets you run a JavaScript function some number of milliseconds from now. This code prints “Callback” to the console one second from now:

As with addEventListener in Chapter 9, we’ll implement setTimeout by saving the callback in a JavaScript variable and creating a handle by which the Python-side code can call it:

The exported setTimeout function will create a timer, wait for the requested time period, and then ask the JavaScript runtime to run the callback. That last part will happen via __runSetTimeout. (Note that we never remove callback from the SET_TIMEOUT_REQUESTS dictionary. This could lead to a memory leak if the callback is holding on to the last reference to some large data structure. Chapter 9 had a similar issue with handles. Avoiding memory leaks in data structures shared between the browser and the browser application takes a lot of care, and this book doesn’t attempt to do it right.)

Now let’s implement the Python side of this API. We can use the Timer class in Python’s threading module. (An alternative approach would be to record when each Task is supposed to occur, and compare against the current time in the event loop. This is called polling, and is what, for example, the SDL event loop does to look for events and tasks. However, that can mean wasting CPU cycles in a loop until the task is ready, so I expect the Timer to be more efficient.) You use the class like this:
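A small sketch of that usage follows. threading.Timer is the standard-library class; the run_later helper is a hypothetical wrapper added so the delay can be varied.

```python
import threading


def run_later(delay_seconds, callback):
    """Start a timer that calls callback on a new thread after the delay."""
    timer = threading.Timer(delay_seconds, callback)
    timer.start()
    return timer


def callback():
    print("Callback")
```

Calling run_later(1.0, callback) returns immediately, and the timer's own thread prints the message later.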

This runs callback one second from now. Simple! But threading.Timer executes its callback on a new Python thread, and that introduces a lot of challenges. The callback can’t just call evaljs directly: we’d end up with JavaScript running on two Python threads at the same time, which is not good. (JavaScript is not a multithreaded programming language. It’s possible on the web to create workers of various kinds, but they all run independently and communicate only via special message-passing APIs.) So as a workaround, the callback will add a new Task to the task queue to call __runSetTimeout. That has the downside of potentially delaying the callback, but it means that JavaScript will only ever execute on the main thread.

But this still isn’t quite right. We now have two threads accessing the task_runner: the primary thread, to run tasks, and the timer thread, to add them. This is a race condition that can cause all sorts of bad things to happen, so we need to make sure only one thread accesses the task_runner at a time.

To do so we use a Condition object, which can only be held by one thread at a time. Each thread will try to acquire condition before reading or writing to the task_runner, avoiding simultaneous access. (The blocking parameter to acquire indicates whether the thread should wait for the condition to be available before continuing; in this chapter you’ll always set it to True. When the thread is waiting, it’s said to be blocked.)

The Condition class is actually a Lock, plus functionality to be able to wait until a state condition occurs. If you have no more work to do right now, acquire condition and then call wait. This will cause the thread to stop at that line of code. When more work comes in to do, such as in schedule_task, a call to notify_all will wake up the thread that called wait.

It’s important to call wait at the end of the run loop if there is nothing left to do. Otherwise that thread will tend to use up a lot of the CPU, plus constantly be acquiring and releasing condition. This busywork not only slows down the computer, but also causes the callbacks from the Timer to happen at erratic times, because the two threads are competing for the lock. (Try removing this code and observe: the timers will become quite erratic.)
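The locking scheme can be sketched as follows. This is an illustration under assumptions, not a definitive implementation: it waits before taking a task rather than at the end of the loop, which gives the same "sleep when idle" behavior in a form that is easy to run standalone, and the Task class is repeated so the block is self-contained.

```python
import threading


class Task:
    def __init__(self, task_code, *args):
        self.task_code = task_code
        self.args = args

    def run(self):
        self.task_code(*self.args)


class TaskRunner:
    def __init__(self):
        self.condition = threading.Condition()
        self.tasks = []

    def schedule_task(self, task):
        # May be called from any thread, for example a Timer thread.
        self.condition.acquire(blocking=True)
        self.tasks.append(task)
        self.condition.notify_all()  # wake a runner blocked in wait()
        self.condition.release()

    def run(self):
        self.condition.acquire(blocking=True)
        while len(self.tasks) == 0:
            # wait() releases the lock while blocked and re-acquires it
            # before returning.
            self.condition.wait()
        task = self.tasks.pop(0)
        self.condition.release()
        # The lock is released before the task runs: once the task is off
        # the queue, no other thread can reach it.
        task.run()
```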

When using locks, it’s super important to remember to release the lock eventually and to hold it for the shortest time possible. The code above, for example, releases the lock before running the task. That’s because after the task has been removed from the queue, it can’t be accessed by another thread, so the lock does not need to be held while the task is running.

The setTimeout code is now thread-safe, but still has yet another bug: if we navigate from one page to another, setTimeout callbacks still pending on the previous page might still try to execute. That is easily prevented by adding a discarded field on JSContext and setting it when loading a new page:
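One way to picture the discarded flag is the sketch below. JSContext and discarded come from the text; the dispatch_settimeout name, the callbacks dictionary (a Python stand-in for the JavaScript-side callback table), and the shape of Tab.load are assumptions.

```python
class JSContext:
    def __init__(self):
        self.discarded = False
        self.callbacks = {}  # handle -> stand-in for a JavaScript callback

    def dispatch_settimeout(self, handle):
        # A timer from a previous page may fire after navigation; ignore it.
        if self.discarded:
            return
        self.callbacks[handle]()


class Tab:
    def __init__(self):
        self.js = None

    def load(self):
        # Mark the old context as dead before replacing it.
        if self.js:
            self.js.discarded = True
        self.js = JSContext()
```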

Long-lived threads

Threads can also be used to add browser multitasking. For example, in Chapter 10 we implemented the XMLHttpRequest class, which lets scripts make requests to the server. But in our implementation, the whole browser would seize up while waiting for the request to finish. That’s obviously bad. (For this reason, the synchronous version of the API that we implemented in Chapter 10 is not very useful and a huge performance footgun. Some browsers are now moving to deprecate synchronous XMLHttpRequest.) Python’s Thread class lets us do better:
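A minimal sketch of starting a thread with the standard-library threading.Thread class; the thread name and the ran_on list are only there to make the effect visible.

```python
import threading

ran_on = []


def callback():
    # Record which thread actually executed the callback.
    ran_on.append(threading.current_thread().name)


thread = threading.Thread(target=callback, name="Worker")
thread.start()  # returns right away; callback runs on the new thread
thread.join()   # not part of the pattern; lets this example finish cleanly
```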

This code creates a new thread and then immediately returns. The callback then runs in parallel, on the new thread, while the initial thread continues to execute later code.

We’ll implement asynchronous XMLHttpRequest calls using threads. Specifically, we’ll have the browser start a thread, do the request and parse the response on that thread, and then schedule a Task to send the response back to the script.

Like with setTimeout, we’ll store the callback on the JavaScript side and refer to it with a handle:

When a script calls the open method on an XMLHttpRequest object, we’ll now allow the is_async flag to be true. (In browsers, the is_async parameter is optional and defaults to true, but our browser doesn’t implement that.)

On the browser side, the XMLHttpRequest_send handler will have three parts. The first part will resolve the URL and do security checks:

Then, we’ll define a function that makes the request and enqueues a task for running callbacks:

Note that the task runs dispatch_xhr_onload, which we’ll define in just a moment.

Finally, depending on the is_async flag the browser will either call this function right away, or in a new thread:
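The second and third parts can be sketched like this. Everything network-related is stubbed out: fake_request, the tuple used as a task, and the schedule_task callable are hypothetical stand-ins, and the thread is returned only so that a caller of this sketch can join it.

```python
import threading


def fake_request(url):
    # Hypothetical stand-in for the real network request.
    return "response for " + url


def xhr_send(url, is_async, handle, schedule_task):
    def run_load():
        body = fake_request(url)
        # Hand the result back as a task instead of calling into
        # JavaScript from this (possibly non-main) thread.
        schedule_task(("dispatch_xhr_onload", body, handle))
        return body

    if not is_async:
        return run_load()           # blocks the caller until done
    thread = threading.Thread(target=run_load)
    thread.start()                  # caller continues immediately
    return thread                   # returned for illustration only
```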

Note that in the asynchronous case, the XMLHttpRequest_send method starts a thread and then immediately returns. That thread will run in parallel with the browser’s main work until the request is done. (In theory two parallel requests could race while accessing the cookie jar; I’m not fixing this out of expediency, but a proper implementation would have locks for the cookie jar.)

To communicate the result back to JavaScript, we’ll call a __runXHROnload function from dispatch_xhr_onload:

The __runXHROnload method just pulls the relevant object from XHR_REQUESTS and calls its onload function, which is the standard callback for asynchronous XMLHttpRequests:

As you can see, tasks allow not only the browser but also applications running in the browser to delay tasks until later.

The Cadence of Rendering

There’s more to tasks than just implementing some JavaScript APIs. Once something is a Task, the task runner controls when it runs: perhaps now, perhaps later, or maybe at most once a second, or even at different rates for active and inactive pages, or according to its priority. A browser could even have multiple task runners, optimized for different use cases.

Now, it might be hard to see how the browser can prioritize which JavaScript callback to run, or why it might want to execute JavaScript tasks at a fixed cadence. But besides JavaScript the browser also has to render the page, and as you may recall from Chapter 2, we’d like the browser to render the page exactly as fast as the display hardware can refresh. On most computers, this is 60 times per second, or 16 ms per frame. However, even with today’s computers, it’s quite difficult to maintain such a high frame rate, and certainly too high a bar for our toy browser.

So let’s establish 30 frames per second (33 ms for each frame) as our refresh rate target. (Of course, one thirtieth of a second is actually 33.33333… ms. But it’s a toy browser, and having a round value also makes tests easier to write.)

Now, drawing a frame is split between the Tab and Browser. The Tab needs to call render to compute a display list. Then the Browser needs to raster and draw that display list (and also the chrome display list). Let’s put those Browser tasks in their own method:

Now, we don’t need each tab redrawing itself every frame, because the user only sees one tab at a time. We just need the active tab redrawing itself. Therefore, it’s the Browser that should control when we update the display, not individual Tabs. So let’s write a schedule_animation_frame method that schedules a task to render the active tab. (It’s called an “animation frame” because sequential rendering of different pixels is an animation, and each time you render it’s one “frame”, like a drawing in a picture frame.)

We can kick off the process when we start the browser. In the top-level loop, after running a task on the active tab the browser will need to raster and draw, in case that task was a rendering task:

The additional call to schedule_animation_frame will happen every time through the loop. However, because of the check for self.animation_timer being None, it will only have an effect once callback was called, which only happens after 33 ms. Thus we’re scheduling a new rendering task every 33 ms, just as we wanted to.
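The timer check described above can be sketched as follows. animation_timer and schedule_animation_frame are named in the text; REFRESH_RATE_SEC and the schedule_render_task hook (a stand-in for queuing a render task on the active tab) are assumptions. Locking is left out here, as it is at this point in the chapter.

```python
import threading

REFRESH_RATE_SEC = 0.033  # assumed constant name for the 33 ms target


class Browser:
    def __init__(self, schedule_render_task):
        # Hypothetical hook that queues a render task for the active tab.
        self.schedule_render_task = schedule_render_task
        self.animation_timer = None

    def schedule_animation_frame(self):
        def callback():
            self.schedule_render_task()
            self.animation_timer = None  # allow the next frame to be scheduled

        # Safe to call on every pass through the event loop: while a timer
        # is pending, further calls do nothing.
        if not self.animation_timer:
            self.animation_timer = threading.Timer(REFRESH_RATE_SEC, callback)
            self.animation_timer.start()
```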

Optimizing with Dirty Bits

If you run this on your computer, there’s a good chance your CPU usage will spike and your batteries will start draining. That’s because we’re calling render every frame, which means our browser is now constantly styling elements, building layout trees, and painting display lists. Most of that work is wasted, because on most frames, the web page will not have changed at all, so the old styles, layout trees, and display lists would have worked just as well as the new ones.

Let’s fix this using a dirty bit, a piece of state that tells us if some complex data structure is up to date. Since we want to know if we need to run render, let’s call our dirty bit needs_render:
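A dirty bit of this kind might look like the sketch below. needs_render is the name from the text; set_needs_render and the render_count counter (which stands in for the real style, layout, and paint work) are assumptions.

```python
class Tab:
    def __init__(self):
        self.needs_render = False
        self.render_count = 0  # only here to make the effect observable

    def set_needs_render(self):
        self.needs_render = True

    def render(self):
        if not self.needs_render:
            return              # nothing changed since the last render
        self.render_count += 1  # stands in for style, layout, and paint
        self.needs_render = False
```

Setting the flag several times before a frame still leads to only one render.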

One advantage of this flag is that we can now set needs_render when the HTML has changed instead of calling render directly. The render will still happen, but later. This makes scripts faster, especially if they modify the page multiple times. Make this change in innerHTML_set, load, click, and keypress when changing the DOM. For example, in load, do this:

There are more calls to render; you should find and fix all of them … except, let’s take a closer look at click.

We now don’t immediately render when something changes. That means that the layout tree (and style) could be out of date when a method is called. Normally, this isn’t a problem, but in one important case it is: click handling. That’s because we need to read the layout tree to figure out what object was clicked on, which means the layout tree needs to be up to date. To fix this, add a call to render at the top of click:

Another problem with our implementation is that the browser is now doing raster_and_draw every time the active tab runs a task. But sometimes that task is just running JavaScript that doesn’t touch the web page, and the raster_and_draw call is a waste.

We can avoid this using another dirty bit, which I’ll call needs_raster_and_draw. (The needs_raster_and_draw dirty bit doesn’t just make the browser a bit more efficient. Later in this chapter, we’ll add multiple browser threads, and at that point this dirty bit is necessary to avoid erratic behavior when animating. Try removing it later and see for yourself!)

We will need to call set_needs_raster_and_draw every time either the Browser changes something about the browser chrome, or any time the Tab changes its rendering. The browser chrome is changed by event handlers:

Now the rendering pipeline is only run if necessary, and the browser should have acceptable performance again.

Animating Frames

One big reason for a steady rendering cadence is so that animations run smoothly. Web pages can set up such animations using the requestAnimationFrame API. This API allows scripts to run code right before the browser runs its rendering pipeline, making the animation maximally smooth. It works like this:

By calling requestAnimationFrame, this code is doing two things: scheduling a rendering task, and asking that the browser call callback at the beginning of that rendering task, before any browser rendering code. This lets web page authors change the page and be confident that it will be rendered right away.

The implementation of this JavaScript API is straightforward. Like before, we store the callbacks on the JavaScript side:

In JSContext, when that method is called, we need to schedule a new rendering task:

Then, when render is actually called, we need to call back into JavaScript, like this:

Note that __runRAFHandlers needs to reset RAF_LISTENERS to the empty array before it runs any of the callbacks. That’s because one of the callbacks could itself call requestAnimationFrame. If this happens during such a callback, the specification says that a second animation frame should be scheduled. That means we need to make sure to store the callbacks for the current frame separately from the callbacks for the next frame.
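The real handler is JavaScript, but the swap it relies on is language-independent. Here is a Python analogue under assumed names: AnimationCallbacks, listeners, and run_handlers are hypothetical stand-ins for the JavaScript-side state, RAF_LISTENERS, and __runRAFHandlers.

```python
class AnimationCallbacks:
    def __init__(self):
        self.listeners = []

    def request_animation_frame(self, fn):
        self.listeners.append(fn)

    def run_handlers(self):
        # Take this frame's callbacks and reset the shared list first, so
        # callbacks registered while we run land in the next frame's list.
        handlers = self.listeners
        self.listeners = []
        for handler in handlers:
            handler()


raf = AnimationCallbacks()
state = {"count": 0}


def tick():
    state["count"] += 1
    if state["count"] < 3:
        raf.request_animation_frame(tick)  # schedule the following frame
```

If the list were cleared after running the callbacks instead, the re-registered tick would be thrown away and the animation would stop after one frame.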

This situation may seem like a corner case, but it’s actually very important, as this is how pages can run an animation: by iteratively scheduling one frame after another. For example, here’s a simple counter “animation”:

This script will cause 100 animation frame tasks to run on the rendering event loop. During that time, our browser will display an animated count from 0 to 99. Serve this example web page from our HTTP server:

One flaw with our implementation so far is that an inattentive coder might call requestAnimationFrame multiple times and thereby schedule more animation frames than expected. If other JavaScript tasks appear later, they might end up delayed by many, many frames.

Luckily, rendering is special in that it never makes sense to have two rendering tasks in a row, since the page wouldn’t have changed in between. To avoid having two rendering tasks we’ll add a dirty bit called needs_animation_frame to the Browser that indicates whether a rendering task actually needs to be scheduled:

A tab will set the needs_animation_frame flag when an animation frame is requested:

Note that set_needs_animation_frame will only actually set the dirty bit if called from the active tab. This guarantees that inactive tabs can’t interfere with active tabs. Besides preventing scripts from scheduling too many animation frames, this system also makes sure that if our browser consistently runs slower than 30 frames per second, we won’t end up with an ever-growing queue of rendering tasks.
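A simplified sketch of how this flag collapses repeated requests: needs_animation_frame, set_needs_animation_frame, and schedule_animation_frame are names from the text, while the frames_scheduled counter stands in for starting the real frame timer, and tabs are represented by plain strings.

```python
class Browser:
    def __init__(self):
        self.active_tab = None
        self.needs_animation_frame = False
        self.frames_scheduled = 0

    def set_needs_animation_frame(self, tab):
        # Requests from background tabs are ignored.
        if tab == self.active_tab:
            self.needs_animation_frame = True

    def schedule_animation_frame(self):
        # Called on every pass through the event loop.
        if not self.needs_animation_frame:
            return
        self.needs_animation_frame = False
        self.frames_scheduled += 1  # stands in for starting the frame timer
```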

Profiling Rendering

We now have a system for scheduling a rendering task every 33 ms. But what if rendering takes longer than 33 ms to finish? Before we answer this question, let’s instrument the browser and measure how much time is really being spent rendering. It’s important to always measure before optimizing, because the result is often surprising.

To instrument our browser, let’s have it output the JSON tracing format used by chrome://tracing in Chrome, Firefox Profiler, or Perfetto UI. (Note, though, that these three tools seem to have somewhat different interpretations of the JSON format and display the same trace in slightly different ways.)

A trace file is just a JSON object with a traceEvents field, which contains a list of trace events. (There are other optional fields too, which provide various kinds of metadata. We won’t need them here.)

Each trace event has a number of fields. The ph and name fields define the event type. For example, setting ph to M and name to process_name allows us to change the displayed process name:

The new name (“Browser”) is passed in args, and the other fields are required. Since our browser only has one process, I just pass 1 for the process ID, and the category has to be __metadata for metadata trace events. The ts field stores a timestamp; since this is the first event, it’ll set the start time for the whole trace, so it’s important to put in the actual current time.

We’ll create this MeasureTime object when we start the browser, so we can use it to measure how long various browser components take:

Now let’s add trace events when our browser does something interesting. We specifically want B and E events, which mark the beginning and end of some interesting computation. Because we have that initial trace event, every later trace event needs to be preceded by a comma:

Here, the name argument to time should describe what kind of computation is starting, and it needs to match the name passed to the corresponding stop event:

Do the same for raster_and_draw, and for all of the code that calls evaljs to run JavaScript.

Finally, when we finish tracing (that is, when we close the browser window), we want to leave the file a valid JSON file:

By the way, note that I’m careful to flush after every write. This makes sure that if the browser crashes, all of the log events, which might help me debug, are already safely on disk. (Some of the tracing tools listed above actually accept invalid JSON files, in case the trace comes from a browser crash.)
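Putting the tracing steps together, a single-threaded sketch might look like this. MeasureTime, time, and stop are named in the text; the finish and write_event names, the file argument (any writable text file object), the "_" category, and the fixed tid of 1 are assumptions. Timestamps are written in microseconds.

```python
import io
import json
import time


class MeasureTime:
    def __init__(self, file):
        self.file = file
        self.file.write('{"traceEvents": [')
        # The first event names the process and sets the trace start time.
        self.write_event({
            "name": "process_name", "ph": "M", "cat": "__metadata",
            "ts": self.now(), "pid": 1, "args": {"name": "Browser"},
        }, first=True)

    def now(self):
        return time.time() * 1_000_000  # seconds to microseconds

    def write_event(self, event, first=False):
        # Every event after the first is preceded by a comma.
        if not first:
            self.file.write(", ")
        self.file.write(json.dumps(event))
        self.file.flush()  # keep events on disk even if we crash later

    def time(self, name):
        self.write_event({"ph": "B", "cat": "_", "name": name,
                          "ts": self.now(), "pid": 1, "tid": 1})

    def stop(self, name):
        self.write_event({"ph": "E", "cat": "_", "name": name,
                          "ts": self.now(), "pid": 1, "tid": 1})

    def finish(self):
        # Close the list and the object so the file is valid JSON.
        self.file.write("]}")
        self.file.flush()
```

Passing an io.StringIO instead of a real file makes it easy to check that the finished trace parses as JSON.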

In Chrome tracing, you can choose the cursor icon from the toolbar and drag a selection around a set of trace events. That will show counts and average times for those events in the details window at the bottom of the screen. On my computer, my browser spent about 23 ms in render and about 62 ms in raster_and_draw on average, as you can see in the zoomed-in view in Figure 2. That clearly blows through our 33 ms budget. So, what can we do?

Two Threads

Well, one option, of course, is optimizing raster-and-draw, or even render, and we’ll do that in Chapter 13. But another option (complex, but worthwhile and done by every major browser) is to do the render step in parallel with the raster-and-draw step by adopting a multithreaded architecture. Not only would this speed up the rendering pipeline (dropping from 85 ms to 62 ms) but we could also execute JavaScript on one thread while the expensive raster_and_draw task runs on the other.

Let’s call our two threads the browser thread and the main thread. (In modern browsers the thread analogous to our browser thread is often called the compositor thread, though modern browsers have lots of threads and the correspondence isn’t exact. For the main thread, I’m going with the name real browsers often use; a better name might be the “DOM” thread, since JavaScript can sometimes run on other threads.) The browser thread corresponds to the Browser class and will handle raster-and-draw. It’ll also handle interactions with the browser chrome. The main thread, on the other hand, corresponds to a Tab and will handle running scripts, loading resources, and rendering, along with associated tasks like running event handlers and callbacks. If you’ve got more than one tab open, you’ll have multiple main threads (one per tab) but only one browser thread.

To start, the one thread that exists already—the one that runs when you start the browser—will be the browser thread. We’ll make a main thread every time we create a tab. These two threads will need to communicate to handle events and draw to the screen.

When the browser thread needs to communicate with the main thread, to inform it of events, it’ll place tasks on the main thread’s TaskRunner. (You might be wondering why the main thread doesn’t also communicate back to the browser thread with a TaskRunner. That could certainly be done. Here I chose to only do it in one direction, because the main thread is generally the “slowest” thread in browsers, due to the unpredictable nature of JavaScript and the unknown size of the DOM.) The main thread will need to communicate with the browser thread to request animation frames and to send it a display list to raster-and-draw, and the main thread will do that via two methods on browser: set_needs_animation_frame to request an animation frame and commit to send it a display list.

Let’s implement this design. To start, we’ll add a Thread to each TaskRunner, which will be the tab’s main thread. This thread will need to run in a loop, pulling tasks from the task queue and running them. We’ll put that loop inside the TaskRunner’s run method.

Note that I name the thread; this is a good habit that helps with debugging. Let’s also name the browser thread:

Remove the call to run from the top-level while True loop, since that loop is now going to be running in the browser thread. And run will have its own loop:

Because this loop runs forever, the main thread will live on indefinitely. So if the browser quits, we’ll want it to ask the main thread to quit as well:

The set_needs_quit method sets a flag on TaskRunner that’s checked every time it loops:
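The threaded task runner described in this section can be sketched as below. TaskRunner, run, schedule_task, and set_needs_quit come from the text; main_thread, start_thread, and the use of plain callables as tasks are assumptions. The with statement acquires and releases the condition around each block.

```python
import threading


class TaskRunner:
    def __init__(self, tab):
        self.tab = tab
        self.tasks = []
        self.condition = threading.Condition()
        self.needs_quit = False
        # Naming the thread helps when debugging.
        self.main_thread = threading.Thread(
            target=self.run, name="Main thread")

    def start_thread(self):
        self.main_thread.start()

    def schedule_task(self, task):
        with self.condition:
            self.tasks.append(task)
            self.condition.notify_all()

    def set_needs_quit(self):
        with self.condition:
            self.needs_quit = True
            self.condition.notify_all()  # wake the loop so it can exit

    def run(self):
        while True:
            with self.condition:
                # Sleep until there is a task to run or a request to quit.
                while not self.tasks and not self.needs_quit:
                    self.condition.wait()
                if self.needs_quit:
                    return
                task = self.tasks.pop(0)
            task()  # run the task without holding the lock
```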

The Browser should no longer call any methods on the Tab. Instead, to handle events, it should schedule tasks on the main thread. For example, here is loading:

We need to clear any pending tasks before loading a new page, because those previous tasks are now invalid:

We also need to split new_tab into a version that acquires a lock and one that doesn’t (new_tab_internal):

This way new_tab_internal can be called directly by methods, like Chrome’s click method, that already hold the lock. (Using locks while avoiding race conditions and deadlocks can be quite difficult!)

Event handlers are mostly similar, except that we need to be careful to distinguish events that affect the browser chrome from those that affect the tab. For example, consider handle_click. If the user clicked on the browser chrome, we can handle it right there in the browser thread. But if the user clicked on the web page, we must schedule a task on the main thread:

So now we have the browser thread telling the main thread what to do. Communication in the other direction is a little subtler.

Committing a Display List

We already have a set_needs_animation_frame method, but we also need a commit method that a Tab can call when it’s finished creating a display list. And if you look carefully at our raster-and-draw code, you’ll see that to draw a display list we also need to know the URL (to update the browser chrome), the document height (to allocate a surface of the right size), and the scroll position (to draw the right part of the surface).

When running an animation frame, the Tab should construct one of these objects and pass it to commit. To keep render from getting too confusing, let’s put this in a new run_animation_frame method, and move __runRAFHandlers there too. (Why not reuse render instead of a new method? Because the render method is just about updating style, layout and paint when needed; it’s called for every frame, but it’s also called from click, and in real browsers from many other places too. Meanwhile, run_animation_frame is only called for frames, and therefore it, not render, runs RAF handlers and calls commit.)

Think of the CommitData object as being sent from the main thread to the browser thread. That means the main thread shouldn’t access it any more, and for this reason I’m resetting the display_list field. The Browser should now schedule run_animation_frame:

On the Browser side, the new commit method needs to read out all of the data it was sent and call set_needs_raster_and_draw as needed. Because this call will come from another thread, we’ll need to acquire a lock. Another important step is to not clear the animation_timer object until after the next commit occurs. Otherwise multiple rendering tasks could be queued at the same time. Finally, store all the CommitData: save the scroll in active_tab_scroll, the url in active_tab_url, and additionally store the height and, if available, the display_list:
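A sketch of the commit data and the commit method, under stated assumptions: CommitData, commit, active_tab_scroll, active_tab_url, and animation_timer are names from the text, while active_tab_height, active_tab_display_list, and the lock field are hypothetical. Tabs are plain strings here, and a scroll of None is treated as "keep the browser's own offset", which anticipates the threaded scrolling section.

```python
import threading


class CommitData:
    def __init__(self, url, scroll, height, display_list):
        self.url = url
        self.scroll = scroll
        self.height = height
        self.display_list = display_list


class Browser:
    def __init__(self):
        self.lock = threading.Lock()
        self.active_tab = None
        self.active_tab_url = None
        self.active_tab_scroll = 0
        self.active_tab_height = 0
        self.active_tab_display_list = None
        self.animation_timer = None
        self.needs_raster_and_draw = False

    def commit(self, tab, data):
        # Called on the main thread, so take the browser lock first.
        with self.lock:
            if tab == self.active_tab:
                self.active_tab_url = data.url
                if data.scroll is not None:
                    self.active_tab_scroll = data.scroll
                self.active_tab_height = data.height
                if data.display_list:
                    self.active_tab_display_list = data.display_list
                # Only now may another frame be scheduled.
                self.animation_timer = None
                self.needs_raster_and_draw = True
```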

Make sure to update the Chrome class to use this new url field, since we don’t want the chrome, running on the browser thread, to read from the tab, running on the main thread.

Note that commit is called on the main thread, but acquires the browser thread lock. As a result, commit is a critical time when both threads are “stopped” simultaneously. (For this reason commit needs to be as fast as possible, to maximize parallelism and responsiveness. In modern browsers, optimizing commit is quite challenging, because their method of caching and sending data between threads is much more sophisticated.) Also note that it’s possible for the browser thread to get a commit from an inactive tab, because even inactive tabs might be processing one last animation frame, so the tab parameter is compared with the active tab before copying over any committed data.

Now that we have a browser lock, we also need to acquire the lock any time the browser thread accesses any of its variables. For example, in set_needs_animation_frame, do this:

In schedule_animation_frame you’ll need to do it both inside and outside the callback:

Add locks to raster_and_draw, handle_down, handle_click, handle_key, and handle_enter as well.

We also don’t want the main thread doing rendering faster than the browser thread can raster and draw. So we should only schedule animation frames once raster and draw are done. (The technique of controlling the speed of the front of a pipeline by means of the speed of its end is called back pressure.) Luckily, that’s exactly what we’re doing:

And that’s it: we should now be doing render on one thread and raster and draw on another!

Threaded Profiling

Now that we have two threads, we’ll want to be able to visualize this in the traces we produce. Luckily, the Chrome tracing format supports that. First of all, we’ll want to make the MeasureTime methods thread-safe, so they can be called from either thread:

Next, in every trace event, we’ll want to provide a real thread ID in the tid field, which we can get by calling get_ident from the threading library:

Do the same thing in stop. We can also show human-readable thread names by adding metadata events when finishing the trace. (Note that our browser doesn’t let you close tabs, so any thread stays around until the trace is finished. If closing tabs were possible, we’d need to do thread names somewhat differently.)

You can see how the render and raster tasks now happen on different threads, and how our multithreaded architecture allows them to happen concurrently. (However, in this case the two threads are not running tasks concurrently. That’s because all of the JavaScript tasks are requestAnimationFrame callbacks, which are scheduled by the browser thread, and those are only kicked off once the browser thread finishes its raster and draw work. Exercise 12-8 addresses that problem.)

Threaded Scrolling

Splitting the main thread from the browser thread means that the main thread can run a lot of JavaScript without slowing down the browser much. But it’s still possible for really slow JavaScript to slow the browser down. For example, imagine our counter adds the following artificial slowdown:

Now, every tick of the counter has an artificial pause during which the main thread is stuck running JavaScript. This means it can’t respond to any events; for example, if you hold down the down key, the scrolling will be janky and annoying. I encourage you to try this and witness how annoying it is, because modern browsers usually don’t have this kind of jank. (Adjust the loop bound to make it pause for about a second or so on your computer.)

To fix this, we need the browser thread to handle scrolling, not the main thread. This is harder than it might seem, because the scroll offset can be affected by both the browser (when the user scrolls) and the main thread (when loading a new page or changing the height of the document via JavaScript). Now that the browser thread and the main thread run in parallel, they can disagree about the scroll offset.

The best we can do is to keep two scroll offsets, one on the browser thread and one on the main thread. Importantly, the browser thread’s scroll offset refers to the browser’s copy of the display list, while the main thread’s scroll offset refers to the main thread’s display list, which can be slightly different. We’ll have the browser thread send scroll offsets to the main thread when it renders, but then the main thread will have to be able to override that scroll offset if the new frame requires it.

Let’s implement that. To start, we’ll need to store an active_tab_scroll variable on the Browser, and update it when the user scrolls:

This code calls set_needs_raster_and_draw to redraw the screen with a new scroll offset, and also sets needs_animation_frame to cause the main thread to receive the scroll offset asynchronously in the future. Even though the browser thread has already handled scrolling, it’s still important to synchronize the new value back to the main thread soon because APIs like click handling depend on it.

The scroll offset also needs to change when the user switches tabs, but in this case we don’t know the right scroll offset yet. We need the main thread to run in order to commit a new display list for the other tab, and at that point we will have a new scroll offset as well. Move tab switching (in load and handle_click) to a new method set_active_tab that simply schedules a new animation frame (both callers already hold the lock, so this method doesn’t need to acquire it):
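One way this method could look is sketched below. Resetting active_tab_scroll to 0 as a placeholder is an assumption, and handle_click is given a simplified, hypothetical signature (the real handler takes an event) just to show a caller that already holds the lock.

```python
import threading

class Browser:
    def __init__(self):
        self.lock = threading.Lock()
        self.active_tab = None
        self.active_tab_scroll = 0
        self.needs_animation_frame = False

    def set_active_tab(self, tab):
        # Callers are expected to hold self.lock already.
        self.active_tab = tab
        # Placeholder until the new tab's main thread commits the real offset.
        self.active_tab_scroll = 0
        # Ask the main thread to run a frame and commit a fresh display list.
        self.needs_animation_frame = True

    def handle_click(self, tab):
        # Simplified stand-in for the real click handler.
        with self.lock:
            self.set_active_tab(tab)
```

Because threading.Lock is not reentrant, acquiring it a second time inside set_active_tab would deadlock, which is why the method relies on its callers to hold it.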

So far, this is only updating the scroll offset on the browser thread. But the main thread eventually needs to know about the scroll offset, so it can pass it back to commit. So, when the Browser creates a rendering task for run_animation_frame, it should pass in the scroll offset. The run_animation_frame function can then store the scroll offset before doing anything else. Add a scroll parameter to run_animation_frame:
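As a rough sketch, the hand-off could be modeled like this. Task follows the *args pattern described at the start of the chapter; make_animation_frame_task is a hypothetical helper standing in for the part of the Browser that creates the rendering task, and the rest of the frame's work is omitted.

```python
class Task:
    def __init__(self, task_code, *args):
        self.task_code = task_code
        self.args = args

    def run(self):
        # Unpack the stored arguments back into a normal call.
        self.task_code(*self.args)

class Tab:
    def __init__(self):
        self.scroll = 0

    def run_animation_frame(self, scroll):
        # Store the browser thread's offset before doing anything else.
        self.scroll = scroll
        # requestAnimationFrame callbacks, render, and commit would follow.

def make_animation_frame_task(tab, scroll):
    # Hypothetical helper: capture the offset at the time the task is
    # created, so the main thread sees a consistent value when it runs.
    return Task(tab.run_animation_frame, scroll)
```

Passing the offset as a task argument, rather than having the main thread read it from the Browser later, means the main thread never touches browser-thread state directly.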

But the main thread also needs to be able to modify the scroll offset. We’ll add a scroll_changed_in_tab flag that tracks whether it’s done so, and only store the browser thread’s scroll offset if scroll_changed_in_tab is not already true. (Two-threaded scroll has a lot of edge cases, including some I didn’t anticipate when writing this chapter. For example, it’s pretty clear that a load should force scroll to 0, unless the browser implements scroll restoration for back-navigations, but what about a scroll clamp followed by a browser scroll that brings it back to within the clamped region? By splitting the browser into two threads, we’ve brought in all of the challenges of concurrency and distributed state.)

We’ll set scroll_changed_in_tab when loading a new page or when the browser thread’s scroll offset is past the bottom of the page:
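The sketch below illustrates both cases under simplifying assumptions: load takes a document height directly (a hypothetical stand-in for fetching and laying out a page), and document_height stands in for the result of layout. Clearing the flag after commit is not shown.

```python
class Tab:
    def __init__(self, tab_height):
        self.tab_height = tab_height
        self.scroll = 0
        self.scroll_changed_in_tab = False
        # Assumption: stand-in for the height computed by layout.
        self.document_height = 0

    def clamp_scroll(self, scroll):
        maxscroll = max(0, self.document_height - self.tab_height)
        return max(0, min(scroll, maxscroll))

    def load(self, document_height):
        # A new page always starts at the top, overriding the browser.
        self.scroll = 0
        self.scroll_changed_in_tab = True
        self.document_height = document_height

    def run_animation_frame(self, scroll):
        # Accept the browser's offset only if the main thread
        # has not already changed the offset itself.
        if not self.scroll_changed_in_tab:
            self.scroll = scroll
        # If the offset is past the bottom of the page, clamp it
        # and record that the main thread changed it.
        clamped = self.clamp_scroll(self.scroll)
        if clamped != self.scroll:
            self.scroll_changed_in_tab = True
        self.scroll = clamped
```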

If the main thread hasn’t overridden the browser’s scroll offset, we’ll set the scroll offset to None in the commit data:
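A sketch of the commit side, assuming a CommitData with just three fields and a simplified commit that takes only the data. The make_commit_data helper is hypothetical and stands in for the code in the tab that builds the commit data.

```python
class CommitData:
    def __init__(self, scroll, height, display_list):
        self.scroll = scroll
        self.height = height
        self.display_list = display_list

def make_commit_data(tab_scroll, scroll_changed_in_tab, height, display_list):
    # Hypothetical helper: only report an offset if the main thread
    # actually overrode the browser thread's value.
    scroll = tab_scroll if scroll_changed_in_tab else None
    return CommitData(scroll, height, display_list)

class Browser:
    def __init__(self):
        self.active_tab_scroll = 0
        self.active_tab_height = 0
        self.active_tab_display_list = None

    def commit(self, data):
        # None means "keep the browser thread's own offset".
        if data.scroll is not None:
            self.active_tab_scroll = data.scroll
        self.active_tab_height = data.height
        self.active_tab_display_list = data.display_list
```

The explicit comparison against None matters here: 0 is a perfectly valid override (for example after a load), so a plain truthiness check would silently drop it.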

As you’ve seen, moving tasks to the browser thread can be challenging, but can also lead to a much more responsive browser. These same trade-offs are present in real browsers, at a much greater level of complexity.

Threaded Style and Layout

Now that we have separate browser and main threads, and now that some operations are performed on the browser thread, our browser’s thread architecture has started to resemble that of a real browser. (Many browsers now also run some parts of the browser thread and main thread in different processes, which has advantages for security and error handling.) But why not move even more browser components into even more threads? Wouldn’t that make the browser even faster?

In a word, yes. Modern browsers have dozens of threads, which together serve to make the browser even faster and more responsive. For example, raster-and-draw often runs on its own thread so that the browser thread can handle events even while a new frame is being prepared. Likewise, modern browsers typically have a collection of network or input/output (I/O) threads, which move all interaction with the network or the file system off the main thread.

On the other hand, some parts of the browser can’t be easily threaded. For example, consider the earlier part of the rendering pipeline: style, layout and paint. In our browser, these run on the main thread. But could they move to their own thread?

In principle, yes. The only thing browsers have to do is implement all the web API specifications correctly, and draw to the screen after scripts and requestAnimationFrame callbacks have completed. The specification spells this out in detail in what it calls the “update-the-rendering” steps. These steps don’t mention style or layout at all—because style and layout, just like paint and draw, are implementation details of a browser. The specification’s update-the-rendering steps are the JavaScript-observable things that have to happen before drawing to the screen.

Nevertheless, in practice, no current modern browser runs style or layout on any thread but the main one. (Some browsers do use multiple threads within style and layout; the Servo research browser was the pioneer here, attempting a fully parallel style, layout, and paint phase. Some of Servo’s code is now part of Firefox. Still, even if style or another phase uses threads internally, those steps still don’t happen concurrently with, say, JavaScript execution.) The reason is simple: there are many JavaScript APIs that can query style or layout state. For example, getComputedStyle requires first computing style, and getBoundingClientRect requires first doing layout. (There is no JavaScript API that allows reading back state from anything later in the rendering pipeline than layout, which is what made it possible to move the back half of the pipeline to another thread.) If a web page calls one of these APIs, and style or layout is not up to date, then it has to be computed then and there. These computations are called forced style or forced layout: style or layout are “forced” to happen right away, as opposed to possibly 33 ms in the future, if they’re not already computed. Because of these forced style and layout situations, browsers have to be able to compute style and layout on the main thread. (Or the main thread could force the browser thread to do that work, but that’s even worse, because forcing work on the compositor thread will make scrolling janky unless you do even more work to avoid that somehow.)

One possible way to resolve these tensions is to optimistically move style and layout off the main thread, similar to optimistically doing threaded scrolling if a web page doesn’t preventDefault a scroll. Is that a good idea? Maybe, but forced style and layout aren’t just caused by JavaScript execution. One example is our implementation of click, which causes a forced render before hit testing:
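The pattern can be sketched as follows. This is a simplified model rather than the browser's actual code: the needs_render flag, the render_count counter, and the boxes list (a stand-in for the layout tree) are assumptions added so that the forced render is observable.

```python
class Tab:
    def __init__(self):
        self.needs_render = True
        self.render_count = 0
        # Assumption: (x, y, width, height, name) tuples model layout objects.
        self.boxes = []

    def set_needs_render(self):
        self.needs_render = True

    def render(self):
        if not self.needs_render:
            return
        # Style, layout, and paint would run here.
        self.render_count += 1
        self.boxes = [(0, 0, 100, 20, "link")]
        self.needs_render = False

    def click(self, x, y):
        # Forced render: hit testing needs an up-to-date layout tree,
        # so it cannot wait for the next scheduled animation frame.
        self.render()
        hits = [name for (bx, by, w, h, name) in self.boxes
                if bx <= x < bx + w and by <= y < by + h]
        return hits[-1] if hits else None
```

In this model, a click on a clean tab costs nothing extra, while a click on a dirty tab pays for style and layout immediately, on the main thread.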

It’s possible (but very hard) to move hit testing off the main thread or to do hit testing against an older version of the layout tree, or to come up with some other technological fix. Thus it’s not impossible to move style and layout off the main thread “optimistically”, but it is challenging. That said, browser developers are always looking for ways to make things faster, and I expect that at some point in the future style and layout will be moved to their own thread. Maybe you’ll be the one to do it?

Summary

This chapter demonstrated the two-thread rendering system at the core of modern browsers. The main points to remember are:

Additionally, you’ve seen how hard it is to move tasks between the two threads, such as the challenges involved in scrolling on the browser thread, or how forced style and layout make it hard to fully isolate the rendering pipeline from JavaScript.

Outline

The complete set of functions, classes, and methods in our browser should now look something like this:

COOKIE_JAR

class URL:
    def __init__(url)

    def request(referrer, payload)

    def resolve(url)

    def origin()

    def __str__()

class Text:
    def __init__(text, parent)

    def __repr__()

class Element:
    def __init__(tag, attributes, parent)

    def __repr__()

def print_tree(node, indent)

def tree_to_list(tree, list)

class HTMLParser:
    SELF_CLOSING_TAGS

    HEAD_TAGS

    def __init__(body)

    def parse()

    def get_attributes(text)

    def add_text(text)

    def add_tag(tag)

    def implicit_tags(tag)

    def finish()

class CSSParser:
    def __init__(s)

    def whitespace()

    def literal(literal)

    def word()

    def ignore_until(chars)

    def pair()

    def selector()

    def body()

    def parse()

class TagSelector:
    def __init__(tag)

    def matches(node)

class DescendantSelector:
    def __init__(ancestor, descendant)

    def matches(node)

FONTS

def get_font(size, weight, style)

def linespace(font)

NAMED_COLORS

def parse_color(color)

def parse_blend_mode(blend_mode_str)

REFRESH_RATE_SEC

class MeasureTime:
    def __init__()

    def time(name)

    def stop(name)

    def finish()

class Task:
    def __init__(task_code)

    def run()

class TaskRunner:
    def __init__(tab)

    def schedule_task(task)

    def set_needs_quit()

    def clear_pending_tasks()

    def start_thread()

    def run()

    def handle_quit()

DEFAULT_STYLE_SHEET

INHERITED_PROPERTIES

def style(node, rules)

def cascade_priority(rule)

WIDTH, HEIGHT

HSTEP, VSTEP

INPUT_WIDTH_PX

BLOCK_ELEMENTS

class DocumentLayout:
    def __init__(node)

    def layout()

    def should_paint()

    def paint()

    def paint_effects(cmds)

class BlockLayout:
    def __init__(node, parent, previous)

    def layout_mode()

    def layout()

    def recurse(node)

    def new_line()

    def word(node, word)

    def input(node)

    def self_rect()

    def should_paint()

    def paint()

    def paint_effects(cmds)

class LineLayout:
    def __init__(node, parent, previous)

    def layout()

    def should_paint()

    def paint()

    def paint_effects(cmds)

class TextLayout:
    def __init__(node, word, parent, previous)

    def layout()

    def should_paint()

    def paint()

    def paint_effects(cmds)

class InputLayout:
    def __init__(node, parent, previous)

    def layout()

    def should_paint()

    def paint()

    def paint_effects(cmds)

    def self_rect()

class DrawText:
    def __init__(x1, y1, text, font, color)

    def execute(canvas)

class DrawRect:
    def __init__(rect, color)

    def execute(canvas)

class DrawRRect:
    def __init__(rect, radius, color)

    def execute(canvas)

class DrawLine:
    def __init__(x1, y1, x2, y2, color, thickness)

    def execute(canvas)

class DrawOutline:
    def __init__(rect, color, thickness)

    def execute(canvas)

class Blend:
    def __init__(opacity, blend_mode, children)

    def execute(canvas)

def paint_tree(layout_object, display_list)

def paint_visual_effects(node, cmds, rect)

EVENT_DISPATCH_JS

SETTIMEOUT_JS

XHR_ONLOAD_JS

RUNTIME_JS

class JSContext:
    def __init__(tab)

    def run(script, code)

    def dispatch_event(type, elt)

    def dispatch_settimeout(handle)

    def dispatch_xhr_onload(out, handle)

    def get_handle(elt)

    def querySelectorAll(selector_text)

    def getAttribute(handle, attr)

    def innerHTML_set(handle, s)

    def XMLHttpRequest_send(...)

    def setTimeout(handle, time)

    def requestAnimationFrame()

SCROLL_STEP

class Tab:
    def __init__(browser, tab_height)

    def load(url, payload)

    def run_animation_frame(scroll)

    def render()

    def allowed_request(url)

    def raster(canvas)

    def clamp_scroll(scroll)

    def set_needs_render()

    def scrolldown()

    def click(x, y)

    def go_back()

    def submit_form(elt)

    def keypress(char)

class Chrome:
    def __init__(browser)

    def tab_rect(i)

    def paint()

    def click(x, y)

    def keypress(char)

    def enter()

    def blur()

class CommitData:
    def __init__(...)

class Browser:
    def __init__()

    def schedule_animation_frame()

    def commit(tab, data)

    def render()

    def raster_and_draw()

    def raster_tab()

    def raster_chrome()

    def draw()

    def set_needs_animation_frame(tab)

    def set_needs_raster_and_draw()

    def new_tab(url)

    def new_tab_internal(url)

    def set_active_tab(tab)

    def schedule_load(url, body)

    def clamp_scroll(scroll)

    def handle_down()

    def handle_click(e)

    def handle_key(char)

    def handle_enter()

    def handle_quit()

def mainloop(browser)

Exercises

12-1 setInterval. setInterval is similar to setTimeout but runs repeatedly at a given cadence until clearInterval is called. Implement these APIs. Make sure to test setInterval with various cadences in a page that also uses requestAnimationFrame with some expensive rendering pipeline work to do. Record the actual timing of setInterval tasks; how consistent is the cadence?

12-2 Task timing. Modify Task to add trace events every time a task executes. You’ll want to provide a good name for these trace events. One option is to use the __name__ field of task_code, which will get the name of the Python function run by the task.

12-3 Clock-based frame timing. Right now our browser schedules each animation frame exactly 33 ms after the previous one completes. This actually leads to a slower animation frame rate cadence than 33 ms. Fix this in our browser by using the absolute time to schedule animation frames, instead of a fixed delay between frames. Also implement main-thread animation frame scheduling that happens before raster and draw, not after, allowing both threads to do animation work simultaneously.

12-4 Scheduling. As more types of complex tasks end up on the event queue, there comes a greater need to carefully schedule them to ensure the rendering cadence is as close to 33 ms as possible, and also to avoid task starvation. Implement a task scheduler with a priority system that balances these two needs: prioritize rendering tasks and input handling, and deprioritize (but don’t completely starve) tasks that ultimately come from JavaScript APIs like setTimeout. Test it out on a web page that taxes the system with a lot of setTimeout-based tasks.

12-5 Threaded loading. When loading a page, our browser currently waits for each style sheet or script resource to load in turn. This is unnecessarily slow, especially on a bad network. Instead, make your browser send off all the network requests in parallel. You must still process resources like styles in source order, however. It may be convenient to use the join method on a Thread, which will block the thread calling join until the thread being joined completes.

12-6 Networking thread. Real browsers usually have a separate thread for networking (and other I/O). Tasks are added to this thread in a similar fashion to the main thread. Implement a third networking thread and put all networking tasks on it.

12-7 Optimized scheduling. On a complicated web page, the browser may not be able to keep up with the desired cadence. Instead of constantly pegging the CPU in a futile attempt to keep up, implement a frame time estimator that estimates the true cadence of the browser based on previous frames, and adjust schedule_animation_frame to match. This way complicated pages get consistently slower, instead of having random slowdowns.

12-8 Raster-and-draw thread. Right now, if an input event arrives while the browser thread is rastering or drawing, that input event won’t be handled immediately. This is especially a problem because raster and draw are slow. Fix this by adding a separate raster-and-draw thread controlled by the browser thread. While the raster-and-draw thread is doing its work, the browser thread should be available to handle input events. Be careful: SDL is not thread-safe, so all of the steps that directly use SDL still need to happen on the browser thread.