While our toy browser can render complex styles, visual effects, and animations, all of those apply basically just to text. Yet web pages contain a variety of non-text embedded content, from images to other web pages. Support for embedded content has powerful implications for browser architecture, performance, security, and open information access, and has played a key role throughout the web’s history.
Images are certainly the most popular kind of embedded content on the web, dating back to early 1993. (This history is also the reason behind a lot of inconsistencies, like src versus href or img versus image. It’s also a little ironic that images only make their appearance in chapter 15 of this book! That’s because Tkinter doesn’t support many image formats or proper sizing and clipping, so I had to wait for the introduction of Skia.) They’re included on web pages via the <img> tag, which looks like this:
<img src="https://browser.engineering/im/hes.jpg">
And which renders something like this:
Luckily, implementing images isn’t too hard, so let’s just get started. There are four steps to displaying images in our browser:
Let’s start with downloading images from a URL. Naturally, that
happens over HTTP, which we already have a request
function
for. However, while all of the content we’ve downloaded so far—HTML,
CSS, and JavaScript—has been textual, images typically use binary data
formats. We’ll need to extend request
to support binary
data.
The change is pretty minimal: instead of passing the "r"
flag to makefile
, pass a "b"
flag indicating
binary mode:
def request(url, top_level_url, payload=None):
    # ...
    response = s.makefile("b")
    # ...
Now every time we read from response
, we will get
bytes
of binary data, not a str
with textual
data, so we’ll need to change some HTTP parser code to explicitly
decode
the data:
def request(url, top_level_url, payload=None):
    # ...
    statusline = response.readline().decode("utf8")
    # ...
    while True:
        line = response.readline().decode("utf8")
        # ...
    # ...
Note that I didn’t add a decode
call when we
read the body; that’s because the body might actually be binary data,
and we want to return that binary data directly to the browser. Now,
every existing call to request
, which wants textual data,
needs to decode
the response. For example, in
load
, you’ll want to do something like this:
class Tab:
    def load(self, url, body=None):
        # ...
        headers, body = request(url, self.url, body)
        body = body.decode("utf8")
        # ...
Make sure to make this change everywhere in your browser that you
call request
, including inside
XMLHttpRequest_send
and in several other places in
load
.
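To make the text/binary split concrete, here’s a minimal sketch of the parsing step on its own. The parse_response helper and the io.BytesIO stand-in for the socket file are assumptions made for illustration; they are not part of the browser’s request function.

```python
import io

def parse_response(response):
    """Parse an HTTP response from a binary file-like object."""
    # The status line and headers are text, so decode them explicitly.
    statusline = response.readline().decode("utf8")
    version, status, explanation = statusline.split(" ", 2)
    headers = {}
    while True:
        line = response.readline().decode("utf8")
        if line == "\r\n":
            break
        header, value = line.split(":", 1)
        headers[header.casefold()] = value.strip()
    # The body stays as bytes; callers decode it only if it is text.
    body = response.read()
    return status, headers, body

# io.BytesIO stands in for the object returned by s.makefile("b").
fake = io.BytesIO(
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<p>caf\xc3\xa9</p>")
status, headers, body = parse_response(fake)
html = body.decode("utf8")  # what a text caller like load would do
```

An image caller would skip that last decode and hand the bytes straight to the decoder.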
When we download images, however, we won’t call
decode
, and just use the binary data directly.
class Tab:
    def load(self, url, body=None):
        # ...
        images = [node
            for node in tree_to_list(self.nodes, [])
            if isinstance(node, Element)
            and node.tag == "img"]
        for img in images:
            src = img.attributes.get("src", "")
            image_url = resolve_url(src, self.url)
            assert self.allowed_request(image_url), \
                "Blocked load of " + image_url + " due to CSP"
            header, body = request(image_url, self.url)
Once we’ve downloaded the image, we need to turn it into a Skia
Image
object. That requires the following code:
class Tab:
    def load(self, url, body=None):
        for img in images:
            # ...
            img.encoded_data = body
            data = skia.Data.MakeWithoutCopy(body)
            img.image = skia.Image.MakeFromEncoded(data)
There are two tricky steps here: the requested data is turned into a Skia Data object using the MakeWithoutCopy method, and then into an image with MakeFromEncoded.
Because we used MakeWithoutCopy, the Data object just stores a reference to the existing body and doesn’t own that data. That’s essential, because encoded image data can be large (maybe megabytes) and copying that data wastes memory and time. But that also means that the data will become invalid if body is ever garbage-collected; that’s why I save the body in an encoded_data field. (This is a bit of a hack. Perhaps a better solution would be to write the response directly into a Skia Data object using the writable_data API. It would require some refactoring of the rest of the browser, which is why I’m choosing to avoid it.)
These download and decode steps can both fail; if that happens we’ll load a “broken image” placeholder (I used this one):
BROKEN_IMAGE = skia.Image.open("Broken_Image.png")

class Tab:
    def load(self, url, body=None):
        for img in images:
            try:
                # ...
            except Exception as e:
                print("Exception loading image: url="
                    + image_url + " exception=" + str(e))
                img.image = BROKEN_IMAGE
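The same fallback pattern can be sketched without Skia or the network. In this sketch, load_with_fallback, failing_fetch, and the use of bytes.upper as a stand-in decoder are all hypothetical names chosen for illustration.

```python
def load_with_fallback(fetch, decode, url, placeholder):
    """Return the decoded image, or the placeholder if any step fails."""
    try:
        return decode(fetch(url))
    except Exception as e:
        print("Exception loading image: url="
            + url + " exception=" + str(e))
        return placeholder

def failing_fetch(url):
    # Simulates a download that never completes.
    raise ConnectionError("network down")

broken = load_with_fallback(
    failing_fetch, bytes.upper, "http://example.org/a.png", "BROKEN")
loaded = load_with_fallback(
    lambda url: b"abc", bytes.upper, "http://example.org/a.png", "BROKEN")
```

Because both the download and the decode sit inside the same try block, a failure in either one ends with the placeholder.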
Now that we’ve downloaded and saved the image, we need to use it.
Recall that the Image
object is created using a
MakeFromEncoded
method. That name reminds us that the image
we’ve downloaded isn’t raw image bytes. In fact, all of the image
formats you know—JPG, PNG, and the many more obscure ones—encode the
image data using various sophisticated algorithms. The image therefore
needs to be decoded before it can be used.
Luckily, Skia will automatically do the decoding for us, so drawing the image is pretty simple:
class DrawImage(DisplayItem):
    def __init__(self, image, rect):
        super().__init__(rect)
        self.image = image

    def execute(self, canvas):
        canvas.drawImageRect(self.image, self.rect)
Skia applies a variety of clever optimizations to decoding, such as directly decoding the image to its eventual size and caching the decoded image as long as possible. (There’s also an HTML API to control decoding, so that the web page author can indicate when to pay that cost.) That’s because raw image data can be quite large: a pixel is usually stored as four bytes, so a 12 megapixel camera (as you can find on phones these days) produces 48 megabytes of raw data. (Decoding costs both a lot of memory and a lot of time, since just writing out all of those bytes can take a big chunk of our render budget. Optimizing image handling is essential to a performant browser.)
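The arithmetic behind that figure is easy to check. This is just a sketch: raw_image_bytes is a hypothetical helper, and 4000 by 3000 is one assumed shape for a 12 megapixel photo.

```python
def raw_image_bytes(width, height, bytes_per_pixel=4):
    """Size of a fully decoded image, at four bytes per pixel."""
    return width * height * bytes_per_pixel

size = raw_image_bytes(4000, 3000)   # 12 million pixels
megabytes = size / 1_000_000
```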
But because image decoding can be so expensive, Skia actually has several algorithms for decoding to different sizes, some of which are faster but result in a worse-looking image. For example, just for resizing an image, there’s fast, simple, “nearest neighbor” resizing and the slower but higher-quality “bilinear” or even “Lanczos” resizing algorithms. (Image formats like JPEG are also lossy, meaning that they don’t faithfully represent all of the information in the original picture, so there’s a time/quality trade-off going on before the file is saved. Typically these formats try to drop “noisy details” that a human is unlikely to notice, just like different resizing algorithms might.)
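To get a feel for why nearest neighbor is the fast option, here’s a minimal sketch of it on a grid of pixel values. It is not how Skia implements resizing; resize_nearest and the list-of-lists image representation are assumptions for illustration.

```python
def resize_nearest(pixels, new_width, new_height):
    """Resize a 2D grid by copying the nearest source pixel."""
    old_height = len(pixels)
    old_width = len(pixels[0])
    return [
        [pixels[y * old_height // new_height][x * old_width // new_width]
         for x in range(new_width)]
        for y in range(new_height)]

small = [[1, 2],
         [3, 4]]
big = resize_nearest(small, 4, 4)
```

Each output pixel costs one lookup and no arithmetic on colors, which is why the result is quick to compute but looks blocky; bilinear and Lanczos resizing instead blend several source pixels per output pixel.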
To give web page authors control over this performance bottleneck,
there’s an image-rendering
CSS property that indicates which algorithm to use. Let’s add that as an
argument to DrawImage
:
class DrawImage(DisplayItem):
    def __init__(self, image, rect, quality):
        # ...
        if quality == "high-quality":
            self.quality = skia.FilterQuality.kHigh_FilterQuality
        elif quality == "crisp-edges":
            self.quality = skia.FilterQuality.kLow_FilterQuality
        else:
            self.quality = skia.FilterQuality.kMedium_FilterQuality

    def execute(self, canvas):
        paint = skia.Paint(FilterQuality=self.quality)
        canvas.drawImageRect(self.image, self.rect, paint)
With the images downloaded and decoded, all we need to see the downloaded images is to add images into our browser’s layout tree.
The HTTP Content-Type header lets the web server tell the browser whether a document contains text or binary data. The header contains a value called a MIME type, such as text/html, text/css, and text/javascript for HTML, CSS, and JavaScript; image/png and image/jpeg for PNG and JPEG images; and many others for various font, video, audio, and data formats. (“MIME” stands for Multipurpose Internet Mail Extensions, and was originally intended for enumerating all of the acceptable data formats for email attachments. These days the loop has basically closed: most email clients are now “webmail” clients, accessed through your browser, and most emails are now HTML, encoded with the text/html MIME type. Many mail clients do still have an option to encode the email in text/plain, however.) Interestingly, we didn’t need to specify the image format in the code above. That’s because many image formats start with “magic bytes”; for example, PNG files always start with byte 137 followed by the letters “PNG”. These magic bytes are often more reliable than web-server-provided MIME types, so such “format sniffing” is common inside browsers and their supporting libraries.
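Here’s a minimal sketch of what format sniffing can look like. Our browser doesn’t need it, since Skia does its own detection; sniff_image_format is a hypothetical helper that only knows three common signatures.

```python
def sniff_image_format(data):
    """Guess a MIME type from the first bytes of an image file."""
    # PNG: byte 137, then the letters "PNG", then CR LF, 0x1A, LF.
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    # JPEG files begin with the bytes FF D8 FF.
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    # GIF files begin with one of two ASCII version strings.
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    return None

png_header = bytes([137]) + b"PNG\r\n\x1a\n" + b"rest of file"
detected = sniff_image_format(png_header)
```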
Based on your experience with prior chapters, you can probably guess
how to add images to our browser’s layout and paint process. We’ll need
to create an ImageLayout
class; add a new
image
case to BlockLayout
’s
recurse
method; and generate a DrawImage
command from ImageLayout
’s paint
method.
As we do this, you might recall doing something very similar for <input> elements. In fact, text areas and buttons are very similar to images: both are leaf nodes of the DOM, placed into lines, affected by text baselines, and painting custom content. (Images aren’t quite like text because a text node is potentially an entire run of text, split across multiple lines, while an image is an atomic inline. The other types of embedded content in this chapter are also atomic inlines.) Since they are so similar, let’s try to reuse the same code for both.
(In a real browser, input elements are usually called widgets because they have a lot of special rendering rules that sometimes involve CSS.) Let’s split the existing InputLayout into a superclass called EmbedLayout, containing most of the existing code, and a new subclass with the input-specific code, InputLayout:
class EmbedLayout:
    # frame defaults to None so that subclasses which don't need it
    # (InputLayout, ImageLayout) can omit it.
    def __init__(self, node, parent, previous, frame=None):
        # ...

    def get_ascent(self, font_multiplier=1.0):
        return -self.height

    def get_descent(self, font_multiplier=1.0):
        return 0

    def layout(self):
        self.zoom = self.parent.zoom
        self.font = font(self.node.style, self.zoom)
        if self.previous:
            space = self.previous.font.measureText(" ")
            self.x = \
                self.previous.x + space + self.previous.width
        else:
            self.x = self.parent.x

class InputLayout(EmbedLayout):
    def __init__(self, node, parent, previous):
        super().__init__(node, parent, previous)

    def layout(self):
        super().layout()
The idea is that EmbedLayout
should provide common
layout code for all kinds of embedded content, while its subclasses like
InputLayout
should provide the custom code for that type of
content. Different types of embedded content might have different widths
and heights, so that should happen in InputLayout
, and each
subclass has its own unique definition of paint
:
class InputLayout(EmbedLayout):
    def layout(self):
        # ...
        self.width = device_px(INPUT_WIDTH_PX, self.zoom)
        self.height = linespace(self.font)

    def paint(self, display_list):
        # ...
ImageLayout
can now inherit most of its behavior from
EmbedLayout
, but take its width and height from the image
itself:
class ImageLayout(EmbedLayout):
    def __init__(self, node, parent, previous):
        super().__init__(node, parent, previous)

    def layout(self):
        super().layout()
        self.width = device_px(self.node.image.width(), self.zoom)
        self.img_height = device_px(self.node.image.height(), self.zoom)
        self.height = max(self.img_height, linespace(self.font))
Notice that the height of the image depends on the font size of the element. Though odd, this is how image layout actually works: a line with a single, very small, image on it will still be tall enough to contain text. (In fact, a page with only a single image and no text or CSS at all still has its layout affected by a font: the default font. This is a common source of confusion for web developers. In a real browser, it can be avoided by forcing an image into a block or other layout mode via the display CSS property.) The underlying reason is that, as a type of inline layout, images are designed to flow along with related text, which means the bottom of the image should line up with the text baseline (in fact, img_height is saved in the code above to ensure they line up).
Painting an image is also straightforward:
class ImageLayout(EmbedLayout):
    def paint(self, display_list):
        cmds = []
        rect = skia.Rect.MakeLTRB(
            self.x, self.y + self.height - self.img_height,
            self.x + self.width, self.y + self.height)
        quality = self.node.style.get("image-rendering", "auto")
        cmds.append(DrawImage(self.node.image, rect, quality))
        display_list.extend(cmds)
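To see how that rectangle pins the image to the bottom of its line, here’s a sketch of the same arithmetic with plain numbers. The image_rect helper is hypothetical, and it returns a (left, top, right, bottom) tuple instead of a Skia rect.

```python
def image_rect(x, y, width, img_height, line_height):
    """Compute the paint rect for an image within its line."""
    # The layout object is at least as tall as a line of text.
    height = max(img_height, line_height)
    # The image hangs from the bottom, leaving any gap above it.
    top = y + height - img_height
    return (x, top, x + width, y + height)

small = image_rect(10, 20, 50, img_height=8, line_height=16)
tall = image_rect(0, 0, 100, img_height=40, line_height=16)
```

For the small image the layout object is 16 pixels tall, but the picture itself only occupies the bottom 8 of them; for the tall image the two heights coincide.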
Now we need to create ImageLayout
s in
BlockLayout
. Input elements are created in an
input
method, so we could duplicate it, calling it
image
…but input
is itself a duplicate of
text
, so this would be a lot of almost-identical methods.
The only part of these methods that differs is the part that computes
the width of the new inline child; most of the rest of the logic is
shared.
Let’s instead refactor the shared code into new methods which
text
, input
, and image
can call.
First, all of these methods need a font to determine how big of a space to leave after the inline (yes, this is how real browsers do it too); let’s make a function for that:
def font(style, zoom):
    weight = style["font-weight"]
    variant = style["font-style"]
    size = float(style["font-size"][:-2])
    font_size = device_px(size, zoom)
    return get_font(font_size, weight, variant)
There’s also shared code that handles line layout; let’s put that
into a new add_inline_child
method. We’ll need parameters
for the layout class to instantiate and a word
parameter
that is only passed for some layout classes.
class BlockLayout:
    def add_inline_child(self, node, w, child_class, word=None):
        if self.cursor_x + w > self.x + self.width:
            self.new_line()
        line = self.children[-1]
        if word:
            child = child_class(node, line, self.previous_word, word)
        else:
            # Layout classes without an extra argument, like
            # InputLayout, take only the node, line, and previous child.
            child = child_class(node, line, self.previous_word)
        line.children.append(child)
        self.previous_word = child
        self.cursor_x += w + font(node.style, self.zoom).measureText(" ")
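The line-breaking rule inside add_inline_child can be tried out on its own. This is a simplified sketch, assuming the line starts at x = 0 and that every inline is followed by the same space width; break_into_lines is a hypothetical helper, not browser code.

```python
def break_into_lines(widths, line_width, space):
    """Group inline widths into lines, wrapping when one won't fit."""
    lines = [[]]
    cursor_x = 0
    for w in widths:
        # Same test as add_inline_child: wrap if this inline overflows.
        if cursor_x + w > line_width:
            lines.append([])
            cursor_x = 0
        lines[-1].append(w)
        cursor_x += w + space
    return lines

lines = break_into_lines([40, 40, 40], line_width=100, space=10)
```

Because the rule only looks at widths, it doesn’t care whether an inline is a word, an input element, or an image, which is what makes the shared method possible.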
We can redefine text
and input
in a
satisfying way now:
class BlockLayout:
    def text(self, node):
        node_font = font(node.style, self.zoom)
        for word in node.text.split():
            w = node_font.measureText(word)
            self.add_inline_child(node, w, TextLayout, word)

    def input(self, node):
        w = device_px(INPUT_WIDTH_PX, self.zoom)
        self.add_inline_child(node, w, InputLayout)
Adding image
is now also straightforward:
class BlockLayout:
    def recurse(self, node):
            # ...
            elif node.tag == "img":
                self.image(node)

    def image(self, node):
        w = device_px(node.image.width(), self.zoom)
        self.add_inline_child(node, w, ImageLayout)
Now that we have ImageLayout
nodes in our layout tree,
we’ll be painting DrawImage
commands to our display list
and showing the image on the screen!
But what about our second output modality, screen readers? That’s
what the alt
attribute is for. It works like this:
<img src="https://browser.engineering/im/hes.jpg"
alt="A computer operator using a hypertext editing system in 1969">
Implementing this in AccessibilityNode
is very easy:
class AccessibilityNode:
    def __init__(self, node):
        else:
            # ...
            elif node.tag == "img":
                self.role = "image"

    def build(self):
        # ...
        elif self.role == "image":
            if "alt" in self.node.attributes:
                self.text = "Image: " + self.node.attributes["alt"]
            else:
                self.text = "Image"
As we continue to implement new features for the web platform, we’ll always need to think about how to make features work in multiple modalities.
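The alt-text rule is small enough to check on its own. In this sketch, image_accessibility_text is a hypothetical free function that takes the element’s attribute dictionary directly.

```python
def image_accessibility_text(attributes):
    """Text a screen reader would announce for an <img> element."""
    if "alt" in attributes:
        return "Image: " + attributes["alt"]
    return "Image"

described = image_accessibility_text(
    {"src": "hes.jpg", "alt": "A computer operator in 1969"})
undescribed = image_accessibility_text({"src": "hes.jpg"})
```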
Videos are similar to images, but demand more bandwidth, time, and memory; they also have complications like Digital Rights Management (DRM). The <video> tag addresses some of that, with built-in support for advanced video codecs, DRM, and hardware acceleration. (In video, it’s called a “codec”, but in images it’s called a “format”; go figure.) It also provides media controls like a play/pause button and volume controls.
So far, an image’s size on the screen is its size in pixels, possibly zoomed. (Note that zoom already may cause an image to render at a size different than its regular size, even before introducing the features in this section.) But in fact it’s generally valuable for authors to control the size of embedded content. There are a number of ways to do this (for example, the width and height CSS properties, which were an exercise in Chapter 13 and are not to be confused with the width and height attributes!), but one way is the special width and height attributes. (Images have these mostly for historical reasons, because these attributes were invented before CSS existed.)
If both those attributes are present, things are pretty easy: we just read from them when laying out the element, both in image:
class BlockLayout:
    def image(self, node):
        if "width" in node.attributes:
            w = device_px(int(node.attributes["width"]), self.zoom)
        else:
            w = device_px(node.image.width(), self.zoom)
        # ...
And in ImageLayout
:
class ImageLayout(EmbedLayout):
    def layout(self):
        # ...
        width_attr = self.node.attributes.get("width")
        height_attr = self.node.attributes.get("height")
        image_width = self.node.image.width()
        image_height = self.node.image.height()

        if width_attr and height_attr:
            self.width = device_px(int(width_attr), self.zoom)
            self.img_height = device_px(int(height_attr), self.zoom)
        else:
            self.width = device_px(image_width, self.zoom)
            self.img_height = device_px(image_height, self.zoom)
        # ...
This works great, but it has a major flaw: if the ratio of width to height isn’t the same as that of the underlying image, the image ends up stretched in weird ways. Sometimes that’s on purpose but usually it’s a mistake. So browsers let authors specify just one of width and height, and compute the other using the image’s aspect ratio. (Despite it being easy to implement, this feature of real web browsers only appeared in 2021. Before that, developers resorted to things like the padding-top hack. Sometimes design oversights take a long time to fix.)
Implementing this aspect ratio tweak is easy:
class ImageLayout(EmbedLayout):
    # ...
    def layout(self):
        # ...
        aspect_ratio = image_width / image_height

        if width_attr and height_attr:
            # ...
        elif width_attr:
            self.width = device_px(int(width_attr), self.zoom)
            self.img_height = self.width / aspect_ratio
        elif height_attr:
            self.img_height = device_px(int(height_attr), self.zoom)
            self.width = self.img_height * aspect_ratio
        else:
            # ...
        # ...
Your browser should now be able to render this example page correctly.
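If you’d like to check the four cases without loading a page, here’s the same sizing logic as a standalone sketch. The image_size helper is hypothetical, and device_px is assumed here to simply multiply by the zoom factor.

```python
def device_px(css_px, zoom):
    # Assumption for this sketch: zoom is a plain multiplier.
    return css_px * zoom

def image_size(image_width, image_height, width_attr, height_attr, zoom=1):
    """Return (width, height) for an image, honoring its attributes."""
    aspect_ratio = image_width / image_height
    if width_attr and height_attr:
        width = device_px(int(width_attr), zoom)
        height = device_px(int(height_attr), zoom)
    elif width_attr:
        width = device_px(int(width_attr), zoom)
        height = width / aspect_ratio
    elif height_attr:
        height = device_px(int(height_attr), zoom)
        width = height * aspect_ratio
    else:
        width = device_px(image_width, zoom)
        height = device_px(image_height, zoom)
    return width, height

# A 400x200 image with only a width attribute keeps its 2:1 shape.
only_width = image_size(400, 200, "100", None)
```

Attribute values arrive as strings from the HTML parser, which is why the sketch passes "100" rather than 100.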
Our browser computes an aspect ratio from the loaded image
dimensions, but that’s not available before an image loads, which is a
problem in real browsers where images are loaded asynchronously and
where the image size can respond
to layout parameters. Not knowing the aspect ratio can cause the layout to shift when the image loads,
which can be frustrating for users. The aspect-ratio
property is one way web pages can address this issue.
So far, our browser has two kinds of embedded content: images and input elements. (There are also variations like the <canvas> element. Instead of loading an image from the network, JavaScript can draw on a <canvas> element via an API. Unlike images, <canvas> elements don’t have intrinsic sizes, but besides that they are pretty similar.) While both are important and widely used, they don’t offer quite the customizability and flexibility that complex embedded content use cases like maps, PDFs, ads, and social media controls require. (There’s actually ongoing work aimed at allowing web pages to customize what input elements look like, and it builds on earlier work supporting custom elements and forms. This problem is quite challenging, interacting with platform independence, accessibility, scripting, and styling.) So in modern browsers, these are handled by embedding one web page within another using the <iframe> element.
Semantically, an <iframe>
is almost exactly a
Tab
inside a Tab
—it has its own HTML document,
CSS, and scripts. And layout-wise, an <iframe>
is a
lot like the <img>
tag, with width
and
height
attributes. So implementing basic iframes just
requires handling three significant differences:
Iframes have no browser chrome. So any page navigation
has to happen from within the page (either through an
<a>
element or script), or as a side effect of
navigation on the web page that contains the
<iframe>
element.
Iframes can share a rendering event loop. (For example, if an iframe has the same origin as the web page that embeds it, then scripts in the iframe can synchronously access the parent DOM. That means that it’d be basically impossible to put that iframe in a different thread or CPU process, and in practice it ends up in the same rendering event loop as a result.) In real browsers, cross-origin iframes are often “site isolated”, meaning that the iframe has its own CPU process for security reasons. In our toy browser we’ll just make all iframes (even nested ones; yes, iframes can include iframes!) use the same rendering event loop.
Cross-origin iframes are script-isolated from the containing page. That means that a script in the iframe can’t access the containing page’s variables or DOM, nor can scripts in the containing page access the iframe’s variables or DOM. Same-origin iframes, however, can.
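Both of the last two differences hinge on comparing origins. Here’s a rough sketch of such a check using Python’s standard URL parser; origin and same_origin are hypothetical helpers, and this version treats an explicit default port (like :443 on an https URL) as different from an omitted one, which a real browser would not.

```python
from urllib.parse import urlsplit

def origin(url):
    """An origin is the scheme, host, and port of a URL."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)

def same_origin(a, b):
    return origin(a) == origin(b)

embedder = "https://example.org/page.html"
sibling = same_origin(embedder, "https://example.org/widgets/map.html")
ad = same_origin(embedder, "https://ads.example.com/banner.html")
```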
We’ll get to these differences, but for now, let’s start working on the idea of a Tab within a Tab. What we’re going to do is split the Tab class into two pieces: Tab will own the event loop and script environments, and Frames will do the rest.
It’s good to plan out complicated refactors like this in some detail.
A Tab
will:
Browser
and the
Frame
s to handle events.
And the new Frame
class will:
Create these two classes and split the methods between them accordingly.
Naturally, every Frame
will need a reference to its
Tab
; it’s also convenient to have access to the parent
frame and the corresponding <iframe>
element:
class Frame:
    def __init__(self, tab, parent_frame, frame_element):
        self.tab = tab
        self.parent_frame = parent_frame
        self.frame_element = frame_element
        # ...
Now let’s look at how Frame
s are created. The first
place is in Tab
’s load method, which needs to create the
root frame:
class Tab:
    def __init__(self, browser):
        # ...
        self.root_frame = None

    def load(self, url, body=None):
        self.history.append(url)
        # ...
        self.root_frame = Frame(self, None, None)
        self.root_frame.load(url, body)
Note that the guts of load
now lives in the
Frame
, because the Frame
owns the DOM tree.
The Frame
can also construct child
Frame
s, for <iframe>
elements:
class Frame:
    def load(self, url, body=None):
        # ...
        iframes = [node
                   for node in tree_to_list(self.nodes, [])
                   if isinstance(node, Element)
                   and node.tag == "iframe"
                   and "src" in node.attributes]
        for iframe in iframes:
            document_url = resolve_url(iframe.attributes["src"],
                self.tab.root_frame.url)
            if not self.allowed_request(document_url):
                print("Blocked iframe", document_url, "due to CSP")
                iframe.frame = None
                continue
            iframe.frame = Frame(self.tab, self, iframe)
            iframe.frame.load(document_url)
        # ...
So we’ve now got a tree of frames inside a single tab. But because we will sometimes need direct access to an arbitrary frame, let’s also give each frame an identifier, which I’m calling a window ID:
class Frame:
    def __init__(self, tab, parent_frame, frame_element):
        # ...
        self.window_id = len(self.tab.window_id_to_frame)
        self.tab.window_id_to_frame[self.window_id] = self

class Tab:
    def __init__(self, browser):
        # ...
        self.window_id_to_frame = {}
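Here’s a stripped-down sketch of that registration scheme, so you can see which IDs come out. SketchTab and SketchFrame are stand-ins invented for this example; they keep only the fields that matter for window IDs.

```python
class SketchTab:
    def __init__(self):
        self.window_id_to_frame = {}

class SketchFrame:
    def __init__(self, tab, parent_frame):
        self.tab = tab
        self.parent_frame = parent_frame
        # The next ID is the number of frames registered so far.
        self.window_id = len(self.tab.window_id_to_frame)
        self.tab.window_id_to_frame[self.window_id] = self

tab = SketchTab()
root = SketchFrame(tab, None)
child = SketchFrame(tab, root)
grandchild = SketchFrame(tab, child)
```

The root frame always gets ID 0, and a parent always gets a smaller ID than its children. Note that this numbering relies on entries never being removed from the dictionary; if they were, a new frame could be handed an ID that is still in use.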
Now that we have frames being created, let’s work on rendering those frames to the screen.
For quite a while, browsers also supported embedded content in the
form of plugins like Java applets or Flash. But there
were performance,
security, and accessibility problems because plugins typically
implemented their own rendering, sandboxing, and UI primitives. Over time, new APIs have closed the gap between web-native content and “non-web” plugins, and plugins have therefore become less common. (For example, in the last decade the <canvas> element has gained support for hardware-accelerated 3D content, while WebAssembly can run at near-native speed.) Personally, I think that’s a good thing: the web is about
making information accessible to everyone, and that requires open
standards, including for embedded content.
Rendering is split between the Tab and its Frames: the Frame does style and layout, while the Tab does accessibility and paint. (Why split the rendering pipeline this way? Because the output of accessibility and paint is combined across all frames, into a single display list and a single accessibility tree, while the DOMs and layout trees don’t intermingle.) We’ll need to implement that split, and also add code to trigger each Frame’s rendering from the Tab.
Let’s start with splitting the rendering pipeline. The main method
here is still the Tab
’s render
method, which
first calls render
on each frame to do style and
layout:
class Tab:
    def render(self):
        self.measure_render.start_timing()

        for id, frame in self.window_id_to_frame.items():
            frame.render()

        if self.needs_accessibility:
            # ...

        if self.pending_hover:
            # ...
Note that the needs_accessibility
,
pending_hover
, and other flags are all still on the
Tab
, because they relate to the Tab
’s part of
rendering. Meanwhile, style and layout happen in the Frame
now:
class Frame:
    def __init__(self, tab, parent_frame, frame_element):
        # ...
        self.needs_style = False
        self.needs_layout = False

    def set_needs_render(self):
        self.needs_style = True
        self.tab.set_needs_accessibility()
        self.tab.set_needs_paint()

    def set_needs_layout(self):
        self.needs_layout = True
        self.tab.set_needs_accessibility()
        self.tab.set_needs_paint()

    def render(self):
        if self.needs_style:
            # ...

        if self.needs_layout:
            # ...
Again, these dirty bits move to the Frame
because they
relate to the frame’s part of rendering.
Yet unlike images, iframes have no intrinsic size: the layout size of an <iframe> element does not depend on its content. (There was an attempt to provide iframes with intrinsic sizing in the past, but it was removed from the HTML specification when no browser implemented it. This may change in the future, as there are good use cases for a “seamless” iframe whose layout coordinates with its parent frame.) That means there’s a crucial extra bit of communication that needs to happen between the parent and child frames: how wide and tall should a frame be laid out? This is defined by the attributes and CSS of the iframe element:
class BlockLayout:
    # ...
    def recurse(self, node):
            # ...
            elif node.tag == "iframe" and \
                "src" in node.attributes:
                self.iframe(node)
            # ...

    def iframe(self, node):
        # Read the attribute from the iframe's own node, and include
        # the 2px border allowance, matching IframeLayout.layout.
        if "width" in node.attributes:
            w = device_px(int(node.attributes["width"]) + 2,
                          self.zoom)
        else:
            w = device_px(IFRAME_WIDTH_PX + 2, self.zoom)
        self.add_inline_child(node, w, IframeLayout, self.frame)
The IframeLayout
layout code is also similar, inheriting
from EmbedLayout
, but without the aspect ratio code:
class IframeLayout(EmbedLayout):
    def __init__(self, node, parent, previous, parent_frame):
        super().__init__(node, parent, previous, parent_frame)

    def layout(self):
        # ...
        if width_attr:
            self.width = device_px(int(width_attr) + 2, self.zoom)
        else:
            self.width = device_px(IFRAME_WIDTH_PX + 2, self.zoom)

        if height_attr:
            self.height = device_px(int(height_attr) + 2, self.zoom)
        else:
            self.height = device_px(IFRAME_HEIGHT_PX + 2, self.zoom)
Note that if the width or height
isn’t specified, the layout uses a default
value, chosen a long time ago based on the average screen sizes of
the day:
IFRAME_WIDTH_PX = 300
IFRAME_HEIGHT_PX = 150
The extra 2 pixels (corrected for zoom, of course) provide room for a border later on.
Now, note that this code is run in the parent frame. We need to get this width and height over to the child frame, so it can know its width and height during layout. So let’s add a field for that in the child frame:
class Frame:
    def __init__(self, tab, parent_frame, frame_element):
        # ...
        self.frame_width = 0
        self.frame_height = 0
And we can set those when the parent frame is laid out:
class IframeLayout(EmbedLayout):
    def layout(self):
        # ...
        if self.node.frame:
            self.node.frame.frame_height = \
                self.height - device_px(2, self.zoom)
            self.node.frame.frame_width = \
                self.width - device_px(2, self.zoom)
The conditional is only there to handle the (unusual) case of an iframe blocked by CSP.
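Putting the two halves together, here’s a sketch of how the outer (border-inclusive) size and the inner frame size relate. The iframe_sizes helper is hypothetical, and device_px is assumed to be a plain multiplication by the zoom factor.

```python
IFRAME_WIDTH_PX = 300
IFRAME_HEIGHT_PX = 150

def device_px(css_px, zoom):
    # Assumption for this sketch: zoom is a plain multiplier.
    return css_px * zoom

def iframe_sizes(width_attr, height_attr, zoom):
    """Return ((outer_w, outer_h), (frame_w, frame_h)) for an iframe."""
    if width_attr:
        width = device_px(int(width_attr) + 2, zoom)
    else:
        width = device_px(IFRAME_WIDTH_PX + 2, zoom)
    if height_attr:
        height = device_px(int(height_attr) + 2, zoom)
    else:
        height = device_px(IFRAME_HEIGHT_PX + 2, zoom)
    # The child frame gets the size with the border allowance removed.
    border = device_px(2, zoom)
    return (width, height), (width - border, height - border)

default_outer, default_inner = iframe_sizes(None, None, zoom=1)
```

At a zoom of 1, the default iframe takes up 302 by 152 pixels in its parent, while the child frame lays itself out in the familiar 300 by 150.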
The root frame, of course, fills the whole window:
class Tab:
    def load(self, url, body=None):
        # ...
        self.root_frame.frame_width = WIDTH
        self.root_frame.frame_height = HEIGHT - CHROME_PX
Note that there’s a tricky dependency order here. We need the parent frame to do layout before the child frame, so the child frame has an up-to-date width and height when it does layout. That order is guaranteed for us by Python (3.7 or later), where dictionaries iterate in insertion order and a parent frame is always registered before its children, but if you’re following along in another language, you might need to sort frames before rendering them.
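Here’s a small sketch of why insertion order is enough. OrderedFrame is a made-up stand-in for Frame, and the rule that each child gets half of its parent’s width is purely illustrative; the point is only that each frame sees its final width by the time it renders.

```python
class OrderedFrame:
    def __init__(self, registry, parent_frame):
        self.parent_frame = parent_frame
        self.children = []
        self.frame_width = 0
        # Parents are constructed, and so registered, before children.
        registry[len(registry)] = self
        if parent_frame:
            parent_frame.children.append(self)

    def render(self, log):
        log.append(self.frame_width)
        # Laying out the parent assigns each child's width
        # (halving is an arbitrary choice for this sketch).
        for child in self.children:
            child.frame_width = self.frame_width // 2

registry = {}
root = OrderedFrame(registry, None)
child = OrderedFrame(registry, root)
grandchild = OrderedFrame(registry, child)
root.frame_width = 800

log = []
for window_id, frame in registry.items():
    frame.render(log)
```

If the loop visited the grandchild first, it would lay itself out with a stale width of 0.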
Alright, we’ve now got frames styled and laid out, and just need to
paint them. Unlike layout and style, all the frames in a tab produce a
single, unified display list, so we’re going to need to work
recursively. We’ll have the Tab
paint the root
Frame
:
class Tab:
    def render(self):
        if self.needs_paint:
            self.display_list = []
            self.root_frame.paint(self.display_list)
            self.needs_paint = False
We’ll then have the Frame
call the layout tree’s
paint
method:
class Frame:
    def paint(self, display_list):
        self.document.paint(display_list)
Most of the layout tree’s paint
methods don’t need to
change, but to paint an IframeLayout
, we’ll need to paint
the child frame:
class IframeLayout(EmbedLayout):
    def paint(self, display_list):
        frame_cmds = []

        rect = skia.Rect.MakeLTRB(
            self.x, self.y,
            self.x + self.width, self.y + self.height)
        bgcolor = self.node.style.get("background-color",
            "transparent")
        if bgcolor != "transparent":
            radius = device_px(float(
                self.node.style.get("border-radius", "0px")[:-2]),
                self.zoom)
            frame_cmds.append(DrawRRect(rect, radius, bgcolor))

        if self.node.frame:
            self.node.frame.paint(frame_cmds)
Note the last line, where we recursively paint the child frame.
Before putting those commands in the display list, though, we need to add a border, clip content outside of it, and transform the coordinate system:
class IframeLayout(EmbedLayout):
    def paint(self, display_list):
        # ...
        diff = device_px(1, self.zoom)
        offset = (self.x + diff, self.y + diff)
        cmds = [Transform(offset, rect, self.node, frame_cmds)]
        inner_rect = skia.Rect.MakeLTRB(
            self.x + diff, self.y + diff,
            self.x + self.width - diff, self.y + self.height - diff)
        cmds = paint_visual_effects(self.node, cmds, inner_rect)
        paint_outline(self.node, cmds, rect, self.zoom)
        display_list.extend(cmds)
The Transform shifts over the child frame contents so that its top-left corner starts in the right place, while paint_outline adds the border and paint_visual_effects clips content outside the viewable area of the iframe. (This book doesn’t go into the details of the CSS box model, but the width and height attributes of an iframe refer to the content box, and adding the border width yields the border box. Note also that the clip we’re applying is an overflow clip, which is not quite the same as an iframe clip, and the differences have to do with the box model as well. As a result, what we’ve implemented is somewhat incorrect with respect to all of those factors.) Conveniently, we’ve already implemented all of these features and can simply trigger them from our browser CSS file (another good reason to delay iframes and images until chapter 15, perhaps?):
iframe {
    outline: 1px solid black;
    overflow: clip;
}
Finally, let’s also add iframes to the accessibility tree. Like the
display list, the accessibility tree is global across all frames. We can
have iframes create iframe
nodes:
class AccessibilityNode:
    def __init__(self, node):
        else:
            # ...
            elif node.tag == "iframe":
                self.role = "iframe"
To build
such a node, we just recurse into the
frame:
class AccessibilityNode:
    def build_internal(self, child_node):
        if isinstance(child_node, Element) \
            and child_node.tag == "iframe" and child_node.frame:
            child = AccessibilityNode(child_node.frame.nodes)
        # ...
So we’ve now got iframes showing up on the screen. The next step is interacting with them.
Before iframes, there were the <frameset> and <frame> elements. A <frameset> replaces the <body> tag and splits the browser window among multiple <frame>s; this was an early alternative layout algorithm to the one presented in this book. Frames had confusing navigation and accessibility, and lacked the flexibility of <iframe>s, so they aren’t used much these days.
Now that we’ve got iframes rendering to the screen, let’s close the loop with user input. We want to add support for clicking on things inside an iframe, and also for tabbing around or scrolling inside one.
At a high level, event handlers just delegate to the root frame:
class Tab:
    def click(self, x, y):
        self.render()
        self.root_frame.click(x, y)
When an iframe is clicked, it passes the click through to the child frame, and immediately returns afterwards, because iframes capture click events:
class Frame:
    def click(self, x, y):
        # ...
        while elt:
            # ...
            elif elt.tag == "iframe":
                new_x = x - elt.layout_object.x
                new_y = y - elt.layout_object.y
                elt.frame.click(new_x, new_y)
                return
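As a rough illustration of how the coordinates change as a click passes through nested iframes, the sketch below subtracts each iframe's position in turn. The function to_frame_coords is hypothetical, and it ignores scrolling and borders.

```python
def to_frame_coords(x, y, iframe_positions):
    # iframe_positions lists the (x, y) position of each nested iframe
    # element, outermost first, each in its parent frame's coordinates.
    for frame_x, frame_y in iframe_positions:
        x, y = x - frame_x, y - frame_y
    return (x, y)

# A click at (150, 220) lands in an iframe at (100, 200), which itself
# contains another iframe at (30, 10).
print(to_frame_coords(150, 220, [(100, 200), (30, 10)]))
```

Each level of nesting only needs to know its own iframe's position, which is why the recursive click method can stay so short.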
Now, clicking on <a>
elements will work, which
means that you can now cause a frame to navigate to a new page. And
because a Frame
has all the loading and navigation logic
that Tab
used to have, it just works without any more
changes!
You should now be able to load this example. Repeatedly clicking on the link will add another recursive iframe.
Let’s get the other interactions working as well, starting with focusing an element. You can focus on only one element per tab, so we will still store the focus on the Tab, but we’ll need to store the frame the focused element is on too:
class Tab:
    def __init__(self, browser):
        self.focus = None
        self.focused_frame = None
When a frame tries to focus on an element, it sets itself as the focused frame, but before it does that, it needs to un-focus the previously-focused frame:
class Frame:
    def focus_element(self, node):
        # ...
        if self.tab.focused_frame and self.tab.focused_frame != self:
            self.tab.focused_frame.set_needs_render()
        self.tab.focused_frame = self
        # ...
We need to re-render the previously-focused frame so that it stops drawing the focus outline.
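The focus hand-off can be tried out in isolation with a pair of toy classes. ToyTab and ToyFrame below are stand-ins invented for this sketch; they keep only the fields that matter for focus and are not the browser's real Tab and Frame.

```python
class ToyTab:
    def __init__(self):
        self.focus = None
        self.focused_frame = None

class ToyFrame:
    def __init__(self, tab, name):
        self.tab = tab
        self.name = name
        self.needs_render = False

    def set_needs_render(self):
        self.needs_render = True

    def focus_element(self, node):
        # Ask the previously focused frame to repaint so that it drops
        # its focus outline.
        if self.tab.focused_frame and self.tab.focused_frame != self:
            self.tab.focused_frame.set_needs_render()
        self.tab.focused_frame = self
        self.tab.focus = node
        self.set_needs_render()

tab = ToyTab()
parent, child = ToyFrame(tab, "parent"), ToyFrame(tab, "child")
parent.focus_element("input")
parent.needs_render = False      # pretend the parent frame has rendered
child.focus_element("button")
print(tab.focused_frame.name, parent.needs_render)
```

After the second call the tab points at the child frame, and the parent frame has been marked dirty again even though nothing inside it changed.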
Another interaction is pressing Tab to cycle through focusable elements in the current frame. Let’s move the advance_tab logic into Frame and just dispatch to it from the Tab:
class Tab:
    def advance_tab(self):
        frame = self.focused_frame or self.root_frame
        frame.advance_tab()
Do exactly the same thing for keypress and enter, which are used for interacting with text inputs and buttons.
Another big interaction we need to support is scrolling. We’ll store the scroll offset in each Frame:
class Frame:
    def __init__(self, tab, parent_frame, frame_element):
        self.scroll = 0
Now, as you might recall from Chapter 13, scrolling happens both inside Browser and inside Tab, to reduce latency. That was already quite complicated, so to keep things simple, we won’t support browser-thread scrolling for non-root iframes. We’ll need a new commit parameter so the browser thread knows whether the root frame is focused:
class CommitData:
    def __init__(self, url, scroll, root_frame_focused, height,
        display_list, composited_updates, accessibility_tree, focus):
        # ...
        self.root_frame_focused = root_frame_focused

class Tab:
    def run_animation_frame(self, scroll):
        root_frame_focused = not self.focused_frame or \
            self.focused_frame == self.root_frame
        # ...
        commit_data = CommitData(
            # ...
            root_frame_focused,
            # ...
        )
        # ...
The Browser
thread will save this information in
commit
and use it when the user requests a scroll:
class Browser:
    def commit(self, tab, data):
        # ...
        self.root_frame_focused = data.root_frame_focused

    def handle_down(self):
        self.lock.acquire(blocking=True)
        if self.root_frame_focused:
            # ...
        active_tab = self.tabs[self.active_tab]
        task = Task(active_tab.scrolldown)
        active_tab.task_runner.schedule_task(task)
        self.lock.release()
When a tab is asked to scroll, it then scrolls the focused frame:
class Tab:
    def scrolldown(self):
        frame = self.focused_frame or self.root_frame
        frame.scrolldown()
        self.set_needs_paint()
There’s one more subtlety to scrolling. After we scroll, we want to
clamp the scroll position, to prevent the user scrolling past
the last thing on the page. Right now clamp_scroll
uses the
window height to determine the maximum scroll amount; let’s move that
function inside Frame
so it can use the current frame’s
height:
class Frame:
    def scrolldown(self):
        self.scroll = self.clamp_scroll(self.scroll + SCROLL_STEP)

    def clamp_scroll(self, scroll):
        height = math.ceil(self.document.height)
        maxscroll = height - self.frame_height
        return max(0, min(scroll, maxscroll))
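Here is a small worked example of the clamping arithmetic, written as a standalone function so it can be run directly. Passing the document and frame heights as parameters is a simplification for this sketch; the method above reads them from the Frame.

```python
import math

def clamp_scroll(scroll, document_height, frame_height):
    # Same arithmetic as the method above, with the heights passed in.
    height = math.ceil(document_height)
    maxscroll = height - frame_height
    return max(0, min(scroll, maxscroll))

print(clamp_scroll(500, 1000.2, 600))  # scrolled too far: pulled back
print(clamp_scroll(-50, 1000.2, 600))  # negative offsets clamp to zero
print(clamp_scroll(100, 300, 600))     # document shorter than the frame
```

The last case shows why the outer max matters: when the document is shorter than the frame, maxscroll is negative, and the only valid scroll offset is zero.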
Make sure to use the new clamp_scroll in place of the old one, everywhere in Frame:
class Frame:
    def scroll_to(self, elt):
        # ...
        self.scroll = self.clamp_scroll(new_scroll)
Scroll clamping can also come into play if a layout causes a page’s maximum height to shrink. You’ll need to move the scroll clamping logic out of Tab’s run_animation_frame method and into the Frame’s render to handle this:
class Frame:
    def render(self):
        clamped_scroll = self.clamp_scroll(self.scroll)
        if clamped_scroll != self.scroll:
            self.scroll_changed_in_frame = True
        self.scroll = clamped_scroll
There’s also a set of accessibility hover interactions that we need to support. This is hard, because the accessibility interactions happen in the browser thread, which has limited information:
The accessibility tree doesn’t know where the iframe is, so it doesn’t know how to transform the hover coordinates when it goes into a frame.
It also doesn’t know how big the iframe is, so it doesn’t ignore things that are clipped outside an iframe’s bounds. (Observe that frame-based click already works correctly, because we don’t recurse into iframes unless the click intersects the iframe element’s bounds. And before iframes, we didn’t need to do that, because the SDL window system already did it for us.)
It also doesn’t know how far a frame has scrolled, so it doesn’t adjust for scrolled frames.
We’ll make a subclass of AccessibilityNode
to store this
information:
class FrameAccessibilityNode(AccessibilityNode):
    pass
We’ll create one of those below each iframe
node:
class AccessibilityNode:
    def build_internal(self, child_node):
        if isinstance(child_node, Element) \
            and child_node.tag == "iframe" and child_node.frame:
            child = FrameAccessibilityNode(child_node)
Hit testing now has to become recursive, so that
FrameAccessibilityNode
can adjust for the iframe
location:
class AccessibilityNode:
    def hit_test(self, x, y):
        node = None
        if self.intersects(x, y):
            node = self
        for child in self.children:
            res = child.hit_test(x, y)
            if res: node = res
        return node
Hit testing FrameAccessibilityNodes
will use the frame’s
bounds to ignore clicks outside the frame bounds, and adjust clicks
against the frame’s coordinates:
class FrameAccessibilityNode(AccessibilityNode):
    def hit_test(self, x, y):
        if not self.intersects(x, y): return
        new_x = x - self.bounds.x()
        new_y = y - self.bounds.y() + self.scroll
        node = self
        for child in self.children:
            res = child.hit_test(new_x, new_y)
            if res: node = res
        return node
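The two hit_test methods can be exercised without Skia by using tuples for bounds. The classes ToyNode and ToyFrameNode below are hypothetical stand-ins that mirror the logic above; the half-open intersects test is an assumption of this sketch.

```python
class ToyNode:
    def __init__(self, name, bounds, children=()):
        self.name = name
        self.bounds = bounds            # (x, y, width, height)
        self.children = list(children)

    def intersects(self, x, y):
        bx, by, bw, bh = self.bounds
        return bx <= x < bx + bw and by <= y < by + bh

    def hit_test(self, x, y):
        node = self if self.intersects(x, y) else None
        for child in self.children:
            res = child.hit_test(x, y)
            if res: node = res
        return node

class ToyFrameNode(ToyNode):
    def __init__(self, name, bounds, scroll, children=()):
        super().__init__(name, bounds, children)
        self.scroll = scroll

    def hit_test(self, x, y):
        # Ignore points outside the iframe, then switch to the child
        # frame's coordinate space, accounting for its scroll offset.
        if not self.intersects(x, y): return None
        new_x = x - self.bounds[0]
        new_y = y - self.bounds[1] + self.scroll
        node = self
        for child in self.children:
            res = child.hit_test(new_x, new_y)
            if res: node = res
        return node

link = ToyNode("link", (10, 120, 50, 20))   # in the child frame's coordinates
frame = ToyFrameNode("iframe", (100, 200, 300, 150), 100, [link])
root = ToyNode("root", (0, 0, 800, 600), [frame])

print(root.hit_test(115, 225).name)   # inside the scrolled-into-view link
print(root.hit_test(105, 205).name)   # inside the iframe, but not the link
print(root.hit_test(115, 500).name)   # below the iframe
```

Note that the link sits at y=120 inside the frame but is hit at y=225 on screen: the iframe's offset of 200 is added, and its scroll of 100 is subtracted.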
Hit testing should now work, but the bounds of the hovered node when drawn to the screen are still wrong. For that, we’ll need a method that returns the absolute screen rect of an AccessibilityNode. And that method in turn needs parent pointers to walk up the accessibility tree, so let’s add that first:
class AccessibilityNode:
    def __init__(self, node, parent = None):
        # ...
        self.parent = parent

    def build_internal(self, child_node):
        # ...
            child = FrameAccessibilityNode(child_node, self)
        else:
            child = AccessibilityNode(child_node, self)
And now the method to map to absolute coordinates:
class AccessibilityNode:
    def absolute_bounds(self):
        rect = skia.Rect.MakeXYWH(
            self.bounds.x(), self.bounds.y(),
            self.bounds.width(), self.bounds.height())
        obj = self
        while obj:
            obj.map_to_parent(rect)
            obj = obj.parent
        return rect
This method calls map_to_parent to adjust the bounds. For most accessibility nodes we don’t need to do anything, because they are in the same coordinate space as their parent:
class AccessibilityNode:
    def map_to_parent(self, rect):
        pass
A FrameAccessibilityNode, on the other hand, adjusts for the iframe’s position:
class FrameAccessibilityNode(AccessibilityNode):
    def map_to_parent(self, rect):
        rect.offset(self.bounds.x(), self.bounds.y() - self.scroll)
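The walk up the parent pointers can be sketched with tuples in place of skia.Rect. Because tuples are immutable, the hypothetical map_to_parent below returns a new rectangle rather than mutating one in place, which is a deliberate difference from the code above.

```python
class ToyNode:
    def __init__(self, bounds, parent=None):
        self.bounds = bounds        # (x, y, width, height)
        self.parent = parent

    def map_to_parent(self, rect):
        # Same coordinate space as the parent: nothing to adjust.
        return rect

    def absolute_bounds(self):
        rect = self.bounds
        obj = self
        while obj:
            rect = obj.map_to_parent(rect)
            obj = obj.parent
        return rect

class ToyFrameNode(ToyNode):
    def __init__(self, bounds, scroll, parent=None):
        super().__init__(bounds, parent)
        self.scroll = scroll

    def map_to_parent(self, rect):
        # Shift by the iframe's position and undo the frame's scroll.
        x, y, w, h = rect
        return (x + self.bounds[0], y + self.bounds[1] - self.scroll, w, h)

root = ToyNode((0, 0, 800, 600))
frame = ToyFrameNode((100, 200, 300, 150), 100, parent=root)
link = ToyNode((10, 120, 50, 20), parent=frame)
print(link.absolute_bounds())
```

This is the inverse of the hit-testing adjustment: hit testing subtracts the iframe's position and adds the scroll, while mapping back to the screen adds the position and subtracts the scroll.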
You should now be able to hover on nodes and have them read out by our accessibility subsystem.
Alright, we’ve now got all of our browser’s forms of user interaction properly recursing through the frame tree. It’s time to add more capabilities to iframes.
Our browser can only scroll the root frame on the browser thread, but real browsers have put in a lot of work to make scrolling happen on the browser thread as much as possible, including for iframes. The hard part is handling the many obscure combinations of containing blocks, stacking orders, scroll bars, transforms, and iframes: with scrolling on the browser thread, all of these complex interactions have to be communicated from the main thread to the browser thread, and correctly interpreted by both sides.
We’ve now got users interacting with iframes—but what about scripts
interacting with them? Of course, each frame can already run
scripts—but right now, each Frame
has its own
JSContext
, so these scripts can’t really interact with each
other. Instead, same-origin iframes should run in the same
JavaScript context and should be able to access each other’s globals,
call each other’s functions, and modify each other’s DOMs. Let’s
implement that.
For two frames’ JavaScript environments to interact, we’ll need to put them in the same JSContext. So, instead of each Frame having a JSContext of its own, we’ll want to store JSContexts on the Tab, in a dictionary that maps origins to JS contexts:
class Tab:
    def __init__(self, browser):
        # ...
        self.origin_to_js = {}

    def get_js(self, origin):
        if origin not in self.origin_to_js:
            self.origin_to_js[origin] = JSContext(self, origin)
        return self.origin_to_js[origin]
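To see the sharing behavior, here is a sketch with stand-in classes. ToyJSContext and ToyTab are invented for illustration, and origin_of is a hypothetical helper built on urllib.parse that assumes an origin is the scheme, host, and port (with default ports filled in); the browser's own url_origin may differ in detail.

```python
from urllib.parse import urlsplit

def origin_of(url):
    # Hypothetical helper: scheme, host, and port identify an origin.
    parts = urlsplit(url)
    port = parts.port or {"http": 80, "https": 443}.get(parts.scheme)
    return "{}://{}:{}".format(parts.scheme, parts.hostname, port)

class ToyJSContext:
    def __init__(self, origin):
        self.origin = origin

class ToyTab:
    def __init__(self):
        self.origin_to_js = {}

    def get_js(self, origin):
        # Create a context lazily, the first time an origin is seen.
        if origin not in self.origin_to_js:
            self.origin_to_js[origin] = ToyJSContext(origin)
        return self.origin_to_js[origin]

tab = ToyTab()
a = tab.get_js(origin_of("https://example.org/a.html"))
b = tab.get_js(origin_of("https://example.org:443/b.html"))
c = tab.get_js(origin_of("http://example.org/"))
print(a is b, a is c)
```

The first two URLs differ only in path and an explicit default port, so they share a context; the third uses a different scheme and gets its own.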
Each Frame
will then ask the Tab
for its
JavaScript context:
class Frame:
    def load(self, url, body=None):
        # ...
        self.js = self.tab.get_js(url_origin(url))
        # ...
So we’ve got multiple pages’ scripts using one JavaScript context. But now we’ve got to keep their variables in their own namespaces somehow. The key is going to be the window global, of type Window. In the browser, this refers to the global object, and instead of writing a global variable like a, you can always write window.a instead. (There are various proposals to expose multiple global namespaces as a JavaScript API. It would definitely be convenient to have that capability in this chapter, to avoid this restriction!) To keep our implementation simple, in our browser, scripts will always need to reference variables and functions via window. (This also means that all global variables in a script need to do the same, even if they are not browser APIs.) We’ll need to do the same in our runtime:
window.console = { log: function(x) { call_python("log", x); } }
// ...
window.Node = function(handle) { this.handle = handle; }
// ...
Do the same for every function or variable in the
runtime.js
file. If you miss one, you’ll get errors like
this:
_dukpy.JSRuntimeError: ReferenceError: identifier 'Node' undefined
duk_js_var.c:1258
eval src/pyduktape.c:1 preventsyield
Then you’ll need to go find where you forgot to put window. in front of Node. You’ll also need to modify EVENT_DISPATCH_CODE to prefix classes with window:
EVENT_DISPATCH_CODE = \
    "new window.Node(dukpy.handle)" + \
    ".dispatchEvent(new window.Event(dukpy.type))"
Demos from previous chapters will need to be similarly fixed up before they work. For example, setTimeout might need to change to window.setTimeout.
To get multiple frames’ scripts to play nice inside one JavaScript context, we’ll create multiple Window objects: window_1, window_2, and so on. Before running a frame’s scripts, we’ll set window to that frame’s Window object, so that the script uses the correct Window. (Some JavaScript engines support a simple API for changing the global object, but the DukPy library that we’re using isn’t one of them. There is a standard JavaScript statement called with which sort of does this, but the rules are complicated and not quite what we need here. It’s also not recommended these days.)
So to begin with, let’s define the Window class when we create a JSContext:
class JSContext:
    def __init__(self, tab, url_origin):
        self.url_origin = url_origin
        # ...
        self.interp.evaljs("function Window(id) { this._id = id };")
Now, when a frame is created and wants to use a
JSContext
, it needs to ask for a window
object
to be created first:
class JSContext:
    def add_window(self, frame):
        code = "var window_{} = new Window({});".format(
            frame.window_id, frame.window_id)
        self.interp.evaljs(code)
Before running any JavaScript, we’ll want to change which window the
window
global refers to:
class JSContext:
    def wrap(self, script, window_id):
        return "window = window_{}; {}".format(window_id, script)
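It can help to look at the exact JavaScript text that wrap produces. The sketch below is the same string formatting as a free function; the script contents and window IDs are made up for illustration.

```python
def wrap(script, window_id):
    # Point the `window` global at this frame's Window object, then run
    # the frame's script.
    return "window = window_{}; {}".format(window_id, script)

print(wrap("window.counter = 1;", 1))
print(wrap("window.counter = 2;", 2))
```

Both scripts write to window.counter, but because window is reassigned first, each write lands on a different Window object.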
We can use this to, for example, set up the initial runtime environment for each Frame:
class JSContext:
    def add_window(self, frame):
        # ...
        with open("runtime15.js") as f:
            self.interp.evaljs(self.wrap(f.read(), frame.window_id))
We’ll need to call wrap any time we use evaljs, which also means we’ll need to add a window ID argument to a lot of methods. For example, in run we’ll add a window_id parameter:
class JSContext:
    def run(self, script, code, window_id):
        try:
            code = self.wrap(code, window_id)
            print("Script returned: ", self.interp.evaljs(code))
        except dukpy.JSRuntimeError as e:
            print("Script", script, "crashed", e)
And we’ll pass that argument from the load
method:
class Frame:
    def load(self, url, body=None):
        for script in scripts:
            # ...
            task = Task(self.js.run, script_url, body,
                self.window_id)
            # ...
The same holds for various dispatching APIs. For example, to dispatch an event, we’ll need the window_id:
class JSContext:
    def dispatch_event(self, type, elt, window_id):
        # ...
        code = self.wrap(EVENT_DISPATCH_CODE, window_id)
        do_default = self.interp.evaljs(code,
            type=type, handle=handle)
Likewise, we’ll need to pass a window ID argument in click, submit_form, and keypress; I’ve omitted those code fragments. Note that you should have modified your runtime.js file to store the LISTENERS on the window object, meaning each Frame will have its own set of event listeners to dispatch to:
window.LISTENERS = {}
// ...
window.Node.prototype.dispatchEvent = function(evt) {
    var type = evt.type;
    var handle = this.handle
    var list = (window.LISTENERS[handle] &&
        window.LISTENERS[handle][type]) || [];
    for (var i = 0; i < list.length; i++) {
        list[i].call(this, evt);
    }
    return evt.do_default;
}
Do the same for requestAnimationFrame, passing around a window ID and wrapping the code so that it correctly references window.
For calls from JavaScript into the browser, we’ll need JavaScript to pass in the window ID it’s calling from:
window.document = { querySelectorAll: function(s) {
var handles = call_python("querySelectorAll", s, window._id);
return handles.map(function(h) { return new window.Node(h) });
}}
Then on the browser side we can use that window ID to get the
Frame
object:
class JSContext:
    def querySelectorAll(self, selector_text, window_id):
        frame = self.tab.window_id_to_frame[window_id]
        selector = CSSParser(selector_text).selector()
        nodes = [node for node
                 in tree_to_list(frame.nodes, [])
                 if selector.matches(node)]
        return [self.get_handle(node) for node in nodes]
We’ll need something similar in innerHTML and style because we need to call set_needs_render on the relevant Frame.
Finally, for setTimeout and XMLHttpRequest, which involve a call from JavaScript into the browser and later a call from the browser into JavaScript, we’ll likewise need to pass in a window ID from JavaScript, and use that window ID when calling back into JavaScript.
I’ve omitted many of the code changes in this section because they are quite repetitive. You can find all of the needed locations by searching your codebase for evaljs. Once you’ve got scripts working again, let’s make it possible for scripts in different frames to interact.
Same-origin iframes can access each other’s state, but cross-origin
ones can’t. But the obscure domain
property lets an iframe change its origin, moving itself in or out of
same-origin status in some cases. I personally think it’s a misfeature:
it’s hard to implement securely, and interferes with various sandboxing
techniques; I hope it is eventually removed from the web. Instead, there
are various
headers where an iframe can opt into less sharing in order to get
better security and performance.
We’ve now managed to run multiple Frames’ worth of JavaScript in a single JSContext, and isolated them somewhat so that they don’t mess with each other’s state. But the whole point of this exercise is to allow some interaction between same-origin frames. Let’s do that now.
The simplest way two frames can interact is that they can get access to each other’s state via the parent attribute on the Window object. If the two frames have the same origin, that lets one frame call methods, access variables, and modify browser state for the other frame. Because we’ve had these same-origin frames share a JSContext, this isn’t too hard to implement. Basically, we’ll need a way to go from a window ID to its parent frame’s window ID:
class JSContext:
    # ...
    def parent(self, window_id):
        parent_frame = \
            self.tab.window_id_to_frame[window_id].parent_frame
        if not parent_frame:
            return None
        return parent_frame.window_id
On the JavaScript side, we now need to look up the
Window
object given its window ID. There are lots of ways
you could do this, but the easiest is to have a global map:
class JSContext:
    def __init__(self, tab, url_origin):
        # ...
        self.interp.evaljs("WINDOWS = {}")
We’ll add each window to the global map as it’s created:
class JSContext:
    def add_window(self, frame):
        # ...
        self.interp.evaljs("WINDOWS[{}] = window_{};".format(
            frame.window_id, frame.window_id))
Now window.parent
can look up the correct
Window
object in this global map:
Object.defineProperty(Window.prototype, 'parent', {
configurable: true,
get: function() {
var parent_id = call_python('parent', window._id);
if (parent_id != undefined) {
var parent = WINDOWS[parent_id];
if (parent === undefined) parent = new Window(parent_id);
return parent;
}
} });
Note that it’s possible for the lookup in WINDOWS to fail, if the parent frame is not in the same origin as the current one and therefore isn’t running in the same JSContext. In that case, this code returns a fresh Window object with that id. But iframes are not allowed to access each other’s documents across origins (or call various other APIs that are unsafe), so add a method that checks for this situation and raises an exception:
class JSContext:
    def throw_if_cross_origin(self, frame):
        if url_origin(frame.url) != self.url_origin:
            raise Exception(
                "Cross-origin access disallowed from script")
Then use this method in all JSContext methods that access documents (note that in a real browser this is woefully inadequate security; a real browser would need to very carefully lock down the entire runtime.js code and audit every single JavaScript API with a fine-toothed comb):
class JSContext:
    def querySelectorAll(self, selector_text, window_id):
        frame = self.tab.window_id_to_frame[window_id]
        self.throw_if_cross_origin(frame)

    def innerHTML_set(self, handle, s, window_id):
        frame = self.tab.window_id_to_frame[window_id]
        self.throw_if_cross_origin(frame)

    def style_set(self, handle, s, window_id):
        frame = self.tab.window_id_to_frame[window_id]
        self.throw_if_cross_origin(frame)
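A stripped-down sketch shows what the check does when a script passes the window ID of a frame from another origin. ToyFrame, ToyJSContext, and CrossOriginError are hypothetical names for this example, and origins are stored directly as strings rather than derived from URLs.

```python
class CrossOriginError(Exception):
    pass

class ToyFrame:
    def __init__(self, origin):
        self.origin = origin

class ToyJSContext:
    def __init__(self, origin, window_id_to_frame):
        self.origin = origin
        self.window_id_to_frame = window_id_to_frame

    def throw_if_cross_origin(self, frame):
        if frame.origin != self.origin:
            raise CrossOriginError(
                "Cross-origin access disallowed from script")

    def querySelectorAll(self, selector_text, window_id):
        frame = self.window_id_to_frame[window_id]
        self.throw_if_cross_origin(frame)
        return []   # a real implementation would search frame.nodes

frames = {
    1: ToyFrame("https://a.test:443"),
    2: ToyFrame("https://b.test:443"),
}
js = ToyJSContext("https://a.test:443", frames)

print(js.querySelectorAll("div", 1))
try:
    js.querySelectorAll("div", 2)
except CrossOriginError as e:
    print("blocked:", e)
```

The check runs before any document access, so a script that fabricates a Window with someone else's ID gets an exception instead of that frame's nodes.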
So via parent
, same-origin iframes can communicate. But
what about cross-origin iframes? It would be insecure to let them access
each other’s variables or call each other’s methods, so instead browsers
allow a form of message
passing, a technique for structured communication between two
different event loops that doesn’t require any shared state or
locks.
Message-passing in JavaScript works like this: you call the postMessage API on the Window object you’d like to talk to, with the message itself as the first parameter and * as the second (the second parameter has to do with origin restrictions; see the exercises):
window.parent.postMessage("...", '*')
This will send the first argument to the parent frame. (In a real browser, you can also pass data that is not a string, such as numbers and objects. It works via a serialization algorithm called structured cloning, which converts most JavaScript objects (though not, for example, DOM nodes) to a sequence of bytes that the receiver frame can convert back into a JavaScript object. DukPy doesn’t support structured cloning natively for objects, so our browser won’t support this either.) The parent frame can receive the message by handling the message event on its Window object:
window.addEventListener("message", function(e) {
console.log(e.data);
});
Note that in this second code snippet, window
is the
receiving Window
, a different Window
from the
window
in the first snippet.
Let’s implement postMessage, starting on the receiver side. Since this event happens on the Window, not on a Node, we’ll need a new WINDOW_LISTENERS object:
window.WINDOW_LISTENERS = {}
Each listener will be called with a MessageEvent
object:
window.MessageEvent = function(data) {
this.type = "message";
this.data = data;
}
The event listener and dispatching code is the same as for Node, except it’s on Window and uses WINDOW_LISTENERS. You can just duplicate those methods:
Window.prototype.addEventListener = function(type, listener) {
// ...
}
Window.prototype.dispatchEvent = function(evt) {
// ...
}
That’s everything on the receiver side; now let’s do the sender side.
First, let’s implement the postMessage
API itself. Note
that this
is the receiver or target window:
Window.prototype.postMessage = function(message, origin) {
call_python("postMessage", this._id, message, origin)
}
In the browser, postMessage schedules a task on the Tab:
class JSContext:
    def postMessage(self, target_window_id, message, origin):
        task = Task(self.tab.post_message,
            message, target_window_id)
        self.tab.task_runner.schedule_task(task)
Scheduling the task is necessary because postMessage
is
an asynchronous API; sending a synchronous message might involve
synchronizing multiple JSContext
s or even multiple
processes, which would add a lot of overhead and probably result in
deadlocks.
The task finds the target frame and calls a dispatch method:
class Tab:
    def post_message(self, message, target_window_id):
        frame = self.window_id_to_frame[target_window_id]
        frame.js.dispatch_post_message(
            message, target_window_id)
Which then calls into the JavaScript dispatchEvent
method we just wrote:
POST_MESSAGE_DISPATCH_CODE = \
    "window.dispatchEvent(new window.MessageEvent(dukpy.data))"

class JSContext:
    def dispatch_post_message(self, message, window_id):
        self.interp.evaljs(
            self.wrap(POST_MESSAGE_DISPATCH_CODE, window_id),
            data=message)
You should now be able to use postMessage to send messages between frames, including cross-origin frames running in different JSContexts, in a secure way. (In this demo, for example, you should see “Message received from iframe: This is the contents of postMessage.” printed to the console. This particular example uses a same-origin postMessage. You can test cross-origin locally by starting two local HTTP servers on different ports, then changing the URL of the example15-img.html iframe document to point to the second port.)
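The asynchronous behavior of postMessage can be seen in a toy model with no JavaScript engine at all. ToyTaskRunner and ToyWindow below are hypothetical stand-ins for the browser's task runner and the receiving Window; the point of the sketch is only that sending enqueues a task and the listener runs later.

```python
from collections import deque

class ToyTaskRunner:
    def __init__(self):
        self.tasks = deque()

    def schedule_task(self, callback, *args):
        self.tasks.append((callback, args))

    def run_all(self):
        while self.tasks:
            callback, args = self.tasks.popleft()
            callback(*args)

class ToyWindow:
    def __init__(self):
        self.listeners = []

    def add_message_listener(self, listener):
        self.listeners.append(listener)

    def dispatch_message(self, data):
        for listener in self.listeners:
            listener(data)

def post_message(runner, target, message):
    # Sending never runs the listener directly; it only queues a task.
    runner.schedule_task(target.dispatch_message, message)

runner = ToyTaskRunner()
parent = ToyWindow()
received = []
parent.add_message_listener(received.append)

post_message(runner, parent, "hello")
print(received)     # still empty: the task has not run yet
runner.run_all()
print(received)
```

Because the sender only touches the queue, it never needs to wait on, or share state with, the receiving frame.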
Ads are commonly served with iframes and are big users of the web’s sandboxing, embedding, and animation primitives. This means they are a challenging source of performance and user experience problems. For example, ad analytics are important to the ad economy, but involve running a lot of code and measuring lots of data. Some web APIs, such as Intersection Observer, basically exist to make analytics computations more efficient. And, of course, the most popular browser extensions are probably ad blockers.
Iframes add a whole new layer of security challenges atop what we discussed in Chapter 10. The power to embed one web page into another creates a commensurate security risk when the two pages don’t trust each other—both in the case of embedding an untrusted page into your own page, and the reverse, where an attacker embeds your page into their own, malicious one. In both cases, we want to protect your page from any security or privacy risks caused by the other frame.
The starting point is that cross-origin iframes can’t access each other directly through JavaScript. That’s good—but what if a bug in the JavaScript engine, like a buffer overrun, lets an iframe circumvent those protections? Unfortunately, bugs like this are common enough that browsers have to defend against them. For example, browsers these days run frames from different origins in different operating system processes, and use operating system features to limit how much access those processes have.
Other parts of the browser mix content from multiple frames, like our
browser’s Tab
-wide display list. That means that a bug in
the rasterizer could allow one frame to take over the rasterizer and
then read data that ultimately came from another frame. This might seem
like a rather complex attack, but it’s worth defending against, so
modern browsers use sandboxing
techniques to prevent it. For example, Chromium can place the rasterizer
in its own process and use a Linux feature called seccomp
to limit what system calls that process can make. Even if a bug
compromised the rasterizer, that rasterizer wouldn’t be able to
exfiltrate data over the network, preventing private data from
leaking.
These isolation and sandboxing features may seem “straightforward”, in the same sense that the browser thread we added in Chapter 13 is “straightforward”. In practice, the many browser APIs mean the implementation is full of subtleties and ends up being extremely complex. Chromium, for example, took many years to ship the first implementation of site isolation.
Site isolation has become much more important in recent years, due to the CPU cache timing attacks called Spectre and Meltdown. In short, these attacks allow an attacker to read arbitrary locations in memory (including another frame’s data, if the two frames are in the same process) by measuring the time certain operations take. Placing sensitive content in different operating system processes, which come with their own memory address spaces, is a good protection against these attacks.
That said, these kinds of timing attacks can be subtle, and there are doubtless more that haven’t been discovered yet. To try to dull this threat, browsers currently prevent access to high-precision timers that can provide the accurate timing data typically required for timing attacks. For example, browsers reduce the accuracy of APIs like Date.now or setTimeout.
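One simple way to picture reduced timer accuracy is rounding every timestamp down to a coarse grid. The function coarsen and the resolution values below are illustrative assumptions only, not what any particular browser does.

```python
import math

def coarsen(timestamp_ms, resolution_ms):
    # Round down to a multiple of the resolution, hiding finer detail.
    return math.floor(timestamp_ms / resolution_ms) * resolution_ms

# Two events 99.9ms apart become indistinguishable at 100ms resolution.
print(coarsen(1200.0, 100), coarsen(1299.9, 100))
print(coarsen(1234.5678, 1))
```

An attacker who needs to distinguish operations that differ by microseconds learns nothing from a clock like this, at least not from a single reading.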
Worse yet, there are browser APIs that don’t seem like timers but can be used as such. For example, the SharedArrayBuffer API lets two JavaScript threads run concurrently and share memory, which can be used to construct a clock. These APIs are useful, so browsers don’t quite want to remove them, but there is also no way to make them “less accurate”, since they’re not primarily clocks anyway. Browsers now require certain optional HTTP headers to be present in the parent and child frames’ HTTP responses in order to allow use of SharedArrayBuffer, though this is not a perfect solution.
The SharedArrayBuffer issue caused problems when I added JavaScript support to the embedded browser widgets on this website. I was using SharedArrayBuffer to allow synchronous calls from a JSContext to the browser, and that required APIs that browsers restrict for security reasons. Setting the security headers wouldn’t work, because Chapter 14 embeds a YouTube video, and YouTube doesn’t send those headers. In the end, I worked around the issue by not embedding the browser widget and asking the reader to open a new browser window.
This chapter introduced how the browser handles embedded content use cases like images and iframes. Reiterating the main points:
Non-HTML embedded content—images, video, canvas, iframes, input elements, and plugins—can be embedded in a web page.
Embedded content comes with its own performance concerns—like image decoding time—and necessitates custom optimizations.
Iframes are a particularly important kind of embedded content, having over time replaced browser plugins as the standard way to easily embed complex content into a web page.
Iframes introduce all the complexities of the web—rendering, event handling, navigation, security—into the browser’s handling of embedded content. However, this complexity is justified, because they enable important cross-origin use cases like ads, video, and social media buttons.
And as we hope you saw in this chapter, none of these features are too difficult to implement, though—as you’ll see in the exercises below—implementing them well requires a lot of attention to detail.
The complete set of functions, classes, and methods in our browser should now look something like this:
def print_tree(node, indent)
class Text:
def __init__(text, parent)
def __repr__()
class Element:
def __init__(tag, attributes, parent)
def __repr__()
def resolve_url(url, current)
def tree_to_list(tree, list)
INHERITED_PROPERTIES
def layout_mode(node)
COOKIE_JAR
def url_origin(url)
def draw_text(canvas, x, y, text, font, color)
def get_font(size, weight, style)
def linespace(font)
def parse_blend_mode(blend_mode_str)
CHROME_PX
SCROLL_STEP
class MeasureTime:
def __init__(name)
def start_timing()
def stop_timing()
def text()
def diff_styles(old_style, new_style)
class CompositedLayer:
def __init__(skia_context, display_item)
def can_merge(display_item)
def add(display_item)
def composited_bounds()
def absolute_bounds()
def raster()
def __repr__()
def absolute_bounds(display_item)
def absolute_bounds_for_obj(obj)
class DrawCompositedLayer:
def __init__(composited_layer)
def execute(canvas)
def __repr__()
class Task:
def __init__(task_code)
def run()
class TaskRunner:
def __init__(tab)
def schedule_task(task)
def set_needs_quit()
def clear_pending_tasks()
def start_thread()
def run()
def handle_quit()
class SingleThreadedTaskRunner:
def __init__(tab)
def schedule_task(callback)
def run_tasks()
def clear_pending_tasks()
def start_thread()
def set_needs_quit()
def run()
def clamp_scroll(scroll, tab_height)
def add_parent_pointers(nodes, parent)
class DisplayItem:
def __init__(rect, children, node)
def is_paint_command()
def map(rect)
def add_composited_bounds(rect)
class DrawText:
def __init__(x1, y1, text, font, color)
def is_paint_command()
def execute(canvas)
def __repr__()
class DrawLine:
def __init__(x1, y1, x2, y2)
def is_paint_command()
def execute(canvas)
def __repr__()
def paint_visual_effects(node, cmds, rect)
WIDTH
HEIGHT
INPUT_WIDTH_PX
REFRESH_RATE_SEC
HSTEP
VSTEP
SETTIMEOUT_CODE
XHR_ONLOAD_CODE
class Transform:
def __init__(translation, rect, node, children)
def execute(canvas)
def map(rect)
def clone(children)
def __repr__()
ANIMATED_PROPERTIES
class SaveLayer:
def __init__(sk_paint, node, children, should_save)
def execute(canvas)
def clone(children)
def __repr__()
def parse_color(color)
def draw_rect(canvas, l, t, r, b, fill_color, border_color, width)
class DrawRRect:
def __init__(rect, radius, color)
def is_paint_command()
def execute(canvas)
def print(indent)
def __repr__()
def is_focused(node)
def paint_outline(node, cmds, rect, zoom)
def has_outline(node)
def device_px(css_px, zoom)
def cascade_priority(rule)
def style(node, rules, tab)
def is_focusable(node)
def get_tabindex(node)
def announce_text(node, role)
def speak_text(text)
class CSSParser:
def __init__(s, internal)
def whitespace()
def literal(literal)
def word()
def until_char(chars)
def pair(until)
def ignore_until(chars)
def body()
def simple_selector()
def selector()
def media_query()
def parse()
class DrawOutline:
def __init__(rect, color, thickness)
def is_paint_command()
def execute(canvas)
def __repr__()
def main_func(args)
class Browser:
def __init__()
def render()
def commit(tab, data)
def set_needs_animation_frame(tab)
def set_needs_raster()
def set_needs_composite()
def set_needs_accessibility()
def set_needs_draw()
def composite()
def clone_latest(visual_effect, current_effect)
def paint_draw_list()
def update_accessibility()
def composite_raster_and_draw()
def schedule_animation_frame()
def handle_down()
def handle_tab()
def focus_addressbar()
def clear_data()
def set_active_tab(index)
def go_back()
def cycle_tabs()
def toggle_accessibility()
def speak_node(node, text)
def speak_document()
def toggle_mute()
def is_muted()
def toggle_dark_mode()
def handle_click(e)
def handle_hover(event)
def handle_key(char)
def schedule_load(url, body)
def handle_enter()
def increment_zoom(increment)
def reset_zoom()
def load(url)
def load_internal(url)
def raster_tab()
def raster_chrome()
def draw()
def handle_quit()
def request(url, top_level_url, payload)
class DrawImage:
def __init__(image, rect, quality)
def execute(canvas)
def __repr__()
class DocumentLayout:
def __init__(node, frame)
def layout(width, zoom)
def paint(display_list, dark_mode, scroll)
def __repr__()
def font(style, zoom)
class BlockLayout:
def __init__(node, parent, previous, frame)
def layout()
def recurse(node)
def new_line()
def add_inline_child(node, w, child_class, frame, word)
def text(node)
def input(node)
def image(node)
def iframe(node)
def paint(display_list)
def __repr__()
class EmbedLayout:
def __init__(node, parent, previous, frame)
def get_ascent(font_multiplier)
def get_descent(font_multiplier)
def layout()
class InputLayout:
def __init__(node, parent, previous, frame)
def layout()
def paint(display_list)
def __repr__()
class LineLayout:
def __init__(node, parent, previous)
def layout()
def paint(display_list)
def role()
def __repr__()
class TextLayout:
def __init__(node, parent, previous, word)
def get_ascent(font_multiplier)
def get_descent(font_multiplier)
def layout()
def paint(display_list)
def rect()
def __repr__()
def filter_quality(node)
class ImageLayout:
def __init__(node, parent, previous, frame)
def layout()
def paint(display_list)
def __repr__()
IFRAME_WIDTH_PX
IFRAME_HEIGHT_PX
class IframeLayout:
def __init__(node, parent, previous, parent_frame)
def layout()
def paint(display_list)
def __repr__()
class AttributeParser:
def __init__(s)
def whitespace()
def literal(literal)
def word(allow_quotes)
def parse()
class HTMLParser:
def __init__(body)
def parse()
def get_attributes(text)
def add_text(text)
SELF_CLOSING_TAGS
def add_tag(tag)
HEAD_TAGS
def implicit_tags(tag)
def finish()
INTERNAL_ACCESSIBILITY_HOVER
EVENT_DISPATCH_CODE
POST_MESSAGE_DISPATCH_CODE
class JSContext:
def __init__(tab, url_origin)
def throw_if_cross_origin(frame)
def add_window(frame)
def wrap(script, window_id)
def run(script, code, window_id)
def dispatch_event(type, elt, window_id)
def get_handle(elt)
def querySelectorAll(selector_text, window_id)
def getAttribute(handle, attr)
def parent(window_id)
def dispatch_post_message(message, window_id)
def postMessage(target_window_id, message, origin)
def innerHTML_set(handle, s, window_id)
def style_set(handle, s, window_id)
def dispatch_settimeout(handle, window_id)
def setTimeout(handle, time, window_id)
def dispatch_xhr_onload(out, handle, window_id)
def XMLHttpRequest_send(method, url, body, isasync, handle, window_id)
def now()
def dispatch_RAF(window_id)
def requestAnimationFrame()
class AccessibilityNode:
def __init__(node, parent)
def build()
def build_internal(child_node)
def intersects(x, y)
def hit_test(x, y)
def map_to_parent(rect)
def absolute_bounds()
def __repr__()
class FrameAccessibilityNode:
def __init__(node, parent)
def build()
def hit_test(x, y)
def map_to_parent(rect)
def __repr__()
BROKEN_IMAGE
class Frame:
def __init__(tab, parent_frame, frame_element)
def set_needs_render()
def set_needs_layout()
def allowed_request(url)
def load(url, body)
def render()
def paint(display_list)
def advance_tab()
def focus_element(node)
def activate_element(elt)
def submit_form(elt)
def keypress(char)
def scrolldown()
def scroll_to(elt)
def click(x, y)
def clamp_scroll(scroll)
class CommitData:
def __init__(url, scroll, root_frame_focused, height, display_list, composited_updates, accessibility_tree, focus)
class Tab:
def __init__(browser)
def load(url, body)
def get_js(origin)
def set_needs_render_all_frames()
def set_needs_accessibility()
def set_needs_paint()
def request_animation_frame_callback()
def run_animation_frame(scroll)
def render()
def click(x, y)
def keypress(char)
def scrolldown()
def enter()
def get_tabindex(node)
def advance_tab()
def zoom_by(increment)
def reset_zoom()
def go_back()
def toggle_accessibility()
def toggle_dark_mode()
def post_message(message, target_window_id)
def draw_line(canvas, x1, y1, x2, y2, color)
def add_main_args()
if __name__ == "__main__"
Canvas element: Implement the <canvas> element, the 2D aspect of the
getContext API, and some of the drawing commands on
CanvasRenderingContext2D. Canvas layout is just like an iframe,
including its default width and height. You should allocate a Skia
canvas of an appropriate size when getContext("2d") is called, and
implement some of the APIs that draw to the canvas. (Note that once
JavaScript draws to a canvas, the drawing persists forever until reset
or similar is called. This allows a web developer to build up a display
list with a sequence of commands, but also places the burden on them to
decide when to do so, and also when to clear it when needed. This
approach is called an immediate mode of rendering, as opposed to the
retained mode used by HTML, which does not have this complexity for
developers; instead, the complexity is borne by the browser.) It should
be straightforward to translate most API methods to their Skia
equivalent.
Background images: Elements can have a background-image. Implement the
basics of this CSS property: a url(...) value for the background-image
property. Avoid loading the image if the background-image property does
not actually end up used on any element. For a bigger challenge, also
allow the web page to set the size of the background image with the
background-size CSS property.
Object-fit: Implement the object-fit CSS property. It determines how
the image within an <img> element is sized relative to its container
element.
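As a starting point, here is a minimal sketch of the sizing arithmetic behind object-fit. The function names object_fit_size and centered_origin are hypothetical and not part of the browser's code, and the sketch assumes the image's natural size and the container's size are already known. It follows the CSS meaning of the five keywords: fill stretches, none keeps the natural size, contain and cover scale uniformly, and scale-down never enlarges.

```python
def object_fit_size(mode, box_w, box_h, img_w, img_h):
    """Return the (width, height) at which the image content is drawn.

    box_w/box_h: size of the <img> element's box.
    img_w/img_h: natural size of the decoded image (assumed nonzero).
    """
    if mode == "fill":
        # Stretch to the box, ignoring the image's aspect ratio.
        return (box_w, box_h)
    if mode == "none":
        # Keep the natural size; the box clips any overflow.
        return (img_w, img_h)
    # Uniform scale factors that preserve the image's aspect ratio.
    contain = min(box_w / img_w, box_h / img_h)
    cover = max(box_w / img_w, box_h / img_h)
    if mode == "contain":
        scale = contain
    elif mode == "cover":
        scale = cover
    elif mode == "scale-down":
        # Like "contain", but never enlarges the image.
        scale = min(1.0, contain)
    else:
        raise ValueError("unknown object-fit value: " + mode)
    return (img_w * scale, img_h * scale)

def centered_origin(box_w, box_h, w, h):
    """Offset of the drawn image inside the box when centered."""
    return ((box_w - w) / 2, (box_h - h) / 2)
```

With cover, the result can be larger than the box, so painting also needs to clip to the element's rectangle.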
Iframe aspect ratio. Implement the aspect-ratio CSS property and use it
to provide an implicit sizing to iframes and images when only one of
width or height is specified (or when the image is not yet loaded, if
you did the lazy loading exercise).
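The arithmetic here is small: aspect-ratio is a width divided by a height, so the missing dimension follows from the one that is given. The sketch below is one possible shape for it, under some assumptions: parse_aspect_ratio and resolve_size are hypothetical helpers, only the "auto", "N" and "W / H" forms are handled, and when neither dimension is given it simply falls back to the defaults.

```python
import math

def parse_aspect_ratio(value):
    """Parse 'W / H' or a single number into width/height.

    Returns None for 'auto' or anything that doesn't parse.
    """
    value = value.strip()
    if value == "auto":
        return None
    parts = [part.strip() for part in value.split("/")]
    try:
        if len(parts) == 1:
            w, h = float(parts[0]), 1.0
        elif len(parts) == 2:
            w, h = float(parts[0]), float(parts[1])
        else:
            return None
    except ValueError:
        return None
    if not (math.isfinite(w) and math.isfinite(h)) or w <= 0 or h <= 0:
        return None
    return w / h

def resolve_size(width, height, ratio, default_width, default_height):
    """Fill in a missing width or height (None) from the ratio."""
    if width is not None and height is not None:
        return (width, height)
    if ratio is not None:
        if width is not None:
            return (width, width / ratio)
        if height is not None:
            return (height * ratio, height)
    # Simplification: with no usable ratio or no given dimension,
    # fall back to the element's default size.
    return (width if width is not None else default_width,
            height if height is not None else default_height)
```

For an iframe, the defaults passed in would be IFRAME_WIDTH_PX and IFRAME_HEIGHT_PX; for an image, its natural size once it has loaded.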
Lazy loading: Even encoded images can be quite large. (In the early
days of the web, computer networks were slow enough that browsers had a
user setting to disable downloading of images until the user expressly
asked for them.) Add support for the loading attribute on img elements.
Your browser should only download images if they are close to the
visible area of the page. This kind of optimization is generally called
lazy loading. Implement a second optimization in your browser that only
renders images that are within a certain number of pixels of being
visible on the screen.
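Both optimizations come down to the same test: does the image's vertical extent overlap the viewport once the viewport is padded by some threshold? Here is a minimal sketch of that test. The names near_viewport and should_load, and the 500-pixel default threshold, are assumptions made for illustration; the sketch also treats any loading value other than "lazy" as eager.

```python
def near_viewport(y, height, scroll, viewport_height, threshold):
    """True if [y, y + height] overlaps the viewport padded by threshold."""
    top = scroll - threshold
    bottom = scroll + viewport_height + threshold
    return y + height >= top and y <= bottom

def should_load(loading_attr, y, height, scroll, viewport_height,
                threshold=500):
    """Decide whether to fetch an image now.

    loading_attr is the value of the img's loading attribute, or None
    if the attribute is absent.
    """
    if loading_attr is None or loading_attr.lower() != "lazy":
        return True
    return near_viewport(y, height, scroll, viewport_height, threshold)
```

The same near_viewport check can be reused for the second optimization by calling it at paint time instead of load time. Note that a lazy image has to be reconsidered whenever the scroll offset changes.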
Image placeholders: Building on top of lazy loading, implement
placeholder styling of images that haven’t loaded yet. This is done by
setting a 0x0 sizing, unless width or height is specified. Also add
support for hiding the “broken image” if the alt attribute is missing
or empty. (That’s because if alt text is provided, the browser can
assume the image is important to the meaning of the website, and so it
should tell the user that they are missing out on some of the content
if it fails to load. But otherwise, the broken image icon is probably
just ugly clutter.)
Media queries. Implement the width media query. Make sure it works inside iframes. Also make sure it works even when the width of an iframe is changed by its parent frame.
Target origin for postMessage: Implement the targetOrigin parameter to
postMessage. This parameter is a string which indicates the frame
origins that are allowed to receive the message.
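The core of this exercise is an origin comparison. As a rough sketch, assuming a standalone helper rather than the browser's own URL handling: origin_of and may_deliver are hypothetical names, an origin is taken to be the (scheme, host, port) triple, and only the "*" wildcard and exact origins are handled.

```python
from urllib.parse import urlsplit

# Default ports, so that "https://example.com" and
# "https://example.com:443" compare as the same origin.
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin_of(url):
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)

def may_deliver(target_origin, frame_url):
    """True if a message sent with target_origin may reach frame_url."""
    if target_origin == "*":
        return True
    return origin_of(target_origin) == origin_of(frame_url)
```

The check belongs on the delivery side: the message is dropped, without an error visible to the sender, when the receiving frame's current URL does not match.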
Multi-frame focus: in our toy browser, pressing Tab cycles through the
elements in the focused frame. But this means it’s impossible to access
focusable elements in other frames via the keyboard alone. Fix it to
move between frames after iterating through all focusable elements in
one frame.
Iframe history: Ensure that iframes affect browser history. For example, if you click on a link inside an iframe, and then hit the back button, it should go back inside the iframe. Make sure that this works even when the user clicks links in multiple frames in various orders. (It’s debatable whether this is a good feature of iframes, as it causes a lot of confusion for web developers who embed iframes they don’t plan on navigating.)
Iframes under transforms: painting an iframe that has a CSS transform
on it or an ancestor should already work, but event targeting for
clicks doesn’t work, because click doesn’t account for that transform.
Fix this. Also make sure that accessibility handles iframes under
transforms correctly in all cases.
Iframes added or removed by script: the innerHTML API can cause iframes
to be added or removed, but our browser doesn’t load or unload them
when this happens. Fix this: new iframes should be loaded and old ones
unloaded.