A web browser doesn’t just download a web page; it also has to show
that page to the user. In the 21st century, that means a
graphical application. How does that work? In this chapter we’ll equip
the toy browser with a graphical user interface. (There are some obscure
text-based browsers: I used w3m as my main browser for most of 2011. I
don’t anymore.)
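Desktop and laptop computers run operating systems that provide desktop environments: windows, buttons, and a mouse. So programs don’t directly draw to the screen; the desktop environment controls the screen. Instead, a program asks the desktop environment for a window, the desktop environment tells the program about clicks and key presses, and the program draws things in its window.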
Though the desktop environment is responsible for displaying the window, the program is responsible for drawing its contents. Applications have to redraw these contents quickly for interactions to feel fluid, and must respond quickly to clicks and key presses so the user doesn’t get frustrated. (On older systems, applications drew directly to the screen, and if they didn’t update, whatever was there last would stay in place, which is why in error conditions you’d often have one window leave “trails” on another. Modern systems use a technique called compositing, in part to avoid trails; performance and application isolation are additional reasons. Even while using compositing, applications must redraw their window contents to change what is displayed. Chapter 13 will discuss compositing in more detail.)
“Feel fluid” can be made more precise. Graphical applications such as browsers typically aim to redraw at a speed equal to the refresh rate, or frame rate, of the screen, and/or a fixed 60Hz. (Most screens today have a refresh rate of 60Hz, and that is generally considered fast enough to look smooth. However, new hardware is increasingly appearing with higher refresh rates, such as 120Hz. Sometimes rendering engines, games in particular, refresh at lower rates on purpose if they know the rendering speed can’t keep up.) This means that the browser has to finish all its work in less than 1/60th of a second, or 16ms, in order to keep up. For this reason, 16ms is called the animation frame budget of the application.
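The arithmetic behind the frame budget is easy to check. The sketch below is only an illustration; the helper name frame_budget_ms is made up for this example. Note that 1/60th of a second is closer to 16.7ms, which the text rounds down to 16ms.

```python
def frame_budget_ms(refresh_rate_hz):
    """Time available to produce one frame, in milliseconds."""
    return 1000 / refresh_rate_hz

# Compare the budget at a few common refresh rates.
for hz in (30, 60, 120):
    print(f"{hz} Hz -> {frame_budget_ms(hz):.1f} ms per frame")
```

Doubling the refresh rate halves the budget, which is why 120Hz screens make the browser's job harder.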
You should also keep in mind that not all web page interactions are animations: there are also discrete actions such as mouse clicks. Research has shown that it usually suffices to respond to a discrete action in 100ms; below that threshold, most humans are not sensitive to discrete action speed. This is very different from interactions such as scrolling, where a rate below 60Hz or so is quite noticeable. The difference between the two has to do with the way the human mind processes movement (animation) versus discrete action, and the time it takes for the brain to decide upon such an action, execute it, and understand its result.
Doing all of this by hand is a bit of a drag, so programs usually use a graphical toolkit to simplify these steps. These toolkits allow you to describe your program’s window in terms of widgets like buttons, tabs, or text boxes, and take care of drawing and redrawing the window contents to match that description.
Python comes with a graphical toolkit called Tk, available through the
Python package tkinter. (The library is called Tk, and it was originally
written for a different language called Tcl. Python contains an interface
to it, hence the name.) Using it is quite simple:

    import tkinter
    window = tkinter.Tk()
    tkinter.mainloop()
tkinter.Tk() creates a window and
tkinter.mainloop() starts the process of redrawing the
screen. Inside Tk,
tkinter.Tk() asks the desktop
environment to create the window and returns its identifier, while
tkinter.mainloop() enters a loop that looks similar to this:

    while True:
        for evt in pendingEvents():
            handleEvent(evt)
        drawScreen()

Here, drawScreen draws the various widgets, pendingEvents asks the
desktop environment for recent mouse clicks or key presses, and
handleEvent calls into the library user’s code in response to that
event. This event loop pattern is common in many applications, from web
browsers to video games. A simple window does not need much event
handling (it ignores all events) or much drawing (it is a uniform white
or gray). But in more complex graphical applications the event loop
pattern makes sure that all events are eventually handled and the screen
is eventually updated, both essential to a good user experience.

(The example event loop above may look like an infinite loop that locks
up the computer, but it’s not, because of preemptive multitasking among
threads and processes and/or a variant of the event loop that sleeps
unless it has inputs that wake it up from another thread or process.)
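To make the pattern concrete, here is a small simulation of an event loop that runs without any window. It is only a sketch: the ToyEventLoop class and its methods are invented for this illustration and are not how Tk is implemented. The loop is bounded so that the example terminates, whereas a real event loop runs until the application exits.

```python
from collections import deque

class ToyEventLoop:
    """A simulated event loop; no real desktop environment is involved."""

    def __init__(self):
        self.queue = deque()   # events waiting to be handled
        self.handlers = {}     # event name -> callback
        self.frames_drawn = 0

    def bind(self, name, handler):
        self.handlers[name] = handler

    def post(self, name):
        self.queue.append(name)

    def pending_events(self):
        # Hand back everything queued so far and empty the queue.
        events = list(self.queue)
        self.queue.clear()
        return events

    def handle_event(self, name):
        # Events with no bound handler are silently ignored.
        handler = self.handlers.get(name)
        if handler is not None:
            handler(name)

    def draw_screen(self):
        # Stand-in for real drawing: just count the frames.
        self.frames_drawn += 1

    def run(self, iterations):
        # A real loop would be `while True`.
        for _ in range(iterations):
            for evt in self.pending_events():
                self.handle_event(evt)
            self.draw_screen()

loop = ToyEventLoop()
log = []
loop.bind("<Down>", log.append)
loop.post("<Down>")
loop.post("<Click>")   # no handler bound for this one
loop.run(iterations=3)
print(log, loop.frames_drawn)
```

Every queued event is handled before the next redraw, and the screen is redrawn on every pass even when nothing happened.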
Tk’s event loop is the Tk_UpdateObjCmd function, which calls
XSync to redraw the screen and
Tcl_DoOneEvent to handle an event. There’s also a lot of
code to handle errors.
Our toy browser will draw the web page text to a canvas, a
rectangular Tk widget that you can draw circles, lines, and text
in. (You may be familiar with the HTML <canvas> element, which is a
similar idea: a 2D rectangle in which you can draw shapes.) Tk also
has widgets like buttons and dialog boxes, but our browser won’t use
them: we will need finer-grained control over appearance, which a canvas
provides. (This is why desktop applications are more uniform than web
pages: desktop applications generally use the widgets provided by a
common graphical toolkit, which limits their creative possibilities.)
Creating a canvas looks like this:
    WIDTH, HEIGHT = 800, 600
    window = tkinter.Tk()
    canvas = tkinter.Canvas(window, width=WIDTH, height=HEIGHT)
    canvas.pack()
After the size constants are defined, the call to tkinter.Tk() creates
the window, as above, and the call to tkinter.Canvas creates the
Canvas inside that window. We pass the window as an
argument, so that Tk knows where to display the canvas, and some
arguments that define the canvas’s size; I chose 800×600 because that
was a common old-timey monitor size. (This size, called Super Video
Graphics Array, was standardized in 1987, and probably did seem super
back then.) The final call, canvas.pack(), is a Tk peculiarity, which
positions the canvas inside the window.
There’s going to be a window, a canvas, and later some other things, so to keep it all organized let’s make an object:
    class Browser:
        def __init__(self):
            self.window = tkinter.Tk()
            self.canvas = tkinter.Canvas(
                self.window,
                width=WIDTH,
                height=HEIGHT
            )
            self.canvas.pack()
Once you’ve made a canvas, you can call methods that draw shapes on
the canvas. Let’s do that inside load, which we’ll move into the new
Browser class:
    class Browser:
        def load(self, url):
            # ...
            self.canvas.create_rectangle(10, 20, 400, 300)
            self.canvas.create_oval(100, 100, 150, 150)
            self.canvas.create_text(200, 150, text="Hi!")
To run this code, create a Browser, call load, and then start the Tk
mainloop:
    if __name__ == "__main__":
        import sys
        Browser().load(sys.argv[1])
        tkinter.mainloop()
You ought to see: a rectangle, starting near the top-left corner of the canvas and ending at its center; then a circle inside that rectangle; and then the text “Hi!” next to the circle.
Coordinates in Tk refer to X positions from left to right and to Y positions from top to bottom. In other words, the bottom of the screen has larger Y values, the opposite of what you might be used to from math. Play with the coordinates above to figure out what each argument refers to. (The answers are in the online documentation.)
The Tk canvas widget is quite a bit more powerful than what we’re using it for here. As you can see from the tutorial, you can move the individual things you’ve drawn to the canvas, listen to click events on each one, and so on. In this book, I’m not using those features, because I want to teach you how to implement them.
Let’s draw a simple web page on this canvas. So far, the toy browser steps through the web page source code character by character and prints the text (but not the tags) to the console window. Now we want to draw the characters on the canvas instead.
To start, let’s change the show function from the previous chapter
into a function that I’ll call lex, which just returns the
text-not-tags content of an HTML document, without printing it:
    def lex(body):
        text = ""
        # ...
        for c in body:
            # ...
            elif not in_angle:
                text += c
        return text
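For readers who don’t have the previous chapter’s code at hand, here is one way the elided parts of lex could be filled in. This is a sketch under an assumption: that in_angle is a flag that is set on "<" and cleared on ">". The actual code from the previous chapter may differ in its details.

```python
def lex(body):
    text = ""
    in_angle = False   # assumed: True while we are between "<" and ">"
    for c in body:
        if c == "<":
            in_angle = True
        elif c == ">":
            in_angle = False
        elif not in_angle:
            # Only characters outside of tags are kept.
            text += c
    return text

print(lex("<p>Hello, <b>world</b>!</p>"))
```

Because lex returns a string instead of printing, the caller is free to decide what to do with the text, which here means drawing it on the canvas.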
Then, load will draw that text, character by character:

    def load(self, url):
        # ...
        for c in text:
            self.canvas.create_text(100, 100, text=c)
Let’s test this code on a real webpage. For reasons that might seem
inscrutable (it’s to delay a discussion of basic typography to the next
chapter), let’s test it on the first chapter of 西游记 or “Journey to
the West”, a classic Chinese novel about a monkey. Run this URL through
load. (Right click on the link and “Copy URL”.) If you’re not in Asia,
you’ll probably see this phase take a while: China is far away!
You should see a window with a big blob of black pixels inset a bit from
the top left corner of the window.
Why a blob instead of letters? Well, of course, because we are drawing every letter in the same place, so they all overlap! Let’s fix that:
    HSTEP, VSTEP = 13, 18
    cursor_x, cursor_y = HSTEP, VSTEP
    for c in text:
        self.canvas.create_text(cursor_x, cursor_y, text=c)
        cursor_x += HSTEP
The variables cursor_x and cursor_y point to where the next character
will go, as if you were typing the text in a word processor. I picked
the magic numbers, 13 and 18, by trying a few different values and
picking one that looked most readable. In the next chapter, we’ll
replace these magic numbers.
The text now forms a line from left to right. But with an 800 pixel wide canvas and 13 pixels per character, one line only fits about 60 characters. You need more than that to read a novel, so we also need to wrap the text once we reach the edge of the screen:
    for c in text:
        # ...
        if cursor_x >= WIDTH - HSTEP:
            cursor_y += VSTEP
            cursor_x = HSTEP
The code increases cursor_y and resets cursor_x once
cursor_x goes past 787 pixels. (Not 800, because we started at
pixel 13 and I want to leave an even gap on both sides.)
Wrapping the text this way makes it possible to read more than a single
line.

(In the olden days of typewriters, increasing y meant
feeding in a new line, and resetting x meant
returning the carriage that printed letters to the
left edge of the page. So ASCII standardizes two separate
characters, “carriage return” and “line feed”, for these operations, so
that ASCII could be directly executed by teletypewriters. That’s why
headers in HTTP are separated by \r\n, even though modern
computers have no mechanical carriage.)
Now we can read a lot of text, but still not all of it: if there’s enough text, all of the lines of text don’t fit on the screen. We want users to scroll the page to look at different parts of it.
Chinese characters are usually, but not always, independent: 开关 means “button” but is composed of 开 “on” and 关 “off”. A line break between them would be confusing, because you’d read “on off” instead of “button”. The ICU library, used by both Firefox and Chrome, uses dynamic programming to guess phrase boundaries based on a word frequency table.
Scrolling introduces a layer of indirection between page coordinates (this text is 132 pixels from the top of the page) and screen coordinates (since you’ve scrolled 60 pixels down, this text is 72 pixels from the top of the screen). Generally speaking, a browser lays out the page (determines where everything on the page goes) in terms of page coordinates and then renders the page (draws everything) in terms of screen coordinates. (Sort of. What actually happens is that the page is first drawn into a bitmap or GPU texture, then that bitmap/texture is shifted according to the scroll, and the result is rendered to the screen. Chapter 12 will have more on this topic.)
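The conversion between the two coordinate systems is a single subtraction, and the reverse direction is an addition. The helper names below are made up for this illustration; the numbers are the ones from the example above.

```python
def page_to_screen(page_y, scroll):
    """Where a page position appears on screen after scrolling down by `scroll`."""
    return page_y - scroll

def screen_to_page(screen_y, scroll):
    """The inverse: which page position a screen position corresponds to."""
    return screen_y + scroll

# Text 132 pixels down the page, after scrolling 60 pixels:
print(page_to_screen(132, 60))
```

The inverse conversion is not needed for drawing, but it is what you would use to work out which part of the page a given screen position refers to.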
Our browser will have the same split. Right now load
both computes the position of each character and draws it: layout and
rendering. Let’s have a
layout function to compute and
store the position of each character, and a separate draw
function to then draw each character based on the stored position. This
way, layout can operate with page coordinates and only
draw needs to think about screen coordinates.
Let’s start with layout. Instead of calling
canvas.create_text on each character, let’s add it to a
list, together with its position. Since layout doesn’t need
to access anything in Browser, it can be a standalone
function:
    def layout(text):
        display_list = []
        cursor_x, cursor_y = HSTEP, VSTEP
        for c in text:
            display_list.append((cursor_x, cursor_y, c))
            # ...
        return display_list
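For reference, here is a self-contained sketch of layout with the elided part filled in using the cursor-advance and wrapping logic shown earlier in this chapter. The constants are repeated so that the snippet runs on its own, and the last lines are just a quick check made up for this example.

```python
WIDTH = 800
HSTEP, VSTEP = 13, 18

def layout(text):
    display_list = []
    cursor_x, cursor_y = HSTEP, VSTEP
    for c in text:
        # Record where this character goes, in page coordinates.
        display_list.append((cursor_x, cursor_y, c))
        cursor_x += HSTEP
        # Wrap to the next line once we reach the right margin.
        if cursor_x >= WIDTH - HSTEP:
            cursor_y += VSTEP
            cursor_x = HSTEP
    return display_list

display_list = layout("x" * 100)
chars_on_first_line = sum(1 for x, y, c in display_list if y == VSTEP)
print(chars_on_first_line)
```

Note that nothing here touches the canvas: layout is a pure function from text to positions, which makes it easy to test without opening a window.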
The resulting list is called a display list: it is a list of
things to display. (The term is standard.) Since layout is all about
page coordinates, we don’t need to change anything else about it to
support scrolling.
Once the display list is computed,
draw needs to loop
through the display list and draw each character:
    class Browser:
        def draw(self):
            for x, y, c in self.display_list:
                self.canvas.create_text(x, y, text=c)
Since draw does need access to the canvas, we keep it a method on
Browser. Now the load method just needs to call
layout followed by draw:
    class Browser:
        def load(self, url):
            headers, body = request(url)
            text = lex(body)
            self.display_list = layout(text)
            self.draw()
Now we can add scrolling. Let’s have a variable for how far you’ve scrolled:
    class Browser:
        def __init__(self):
            # ...
            self.scroll = 0
The page coordinate
y then has screen coordinate
y - self.scroll:
    def draw(self):
        for x, y, c in self.display_list:
            self.canvas.create_text(x, y - self.scroll, text=c)
If you change the value of scroll, the page will now
scroll up and down. But how does the user change scroll?
Storing the display list makes scrolling faster: the browser isn’t
doing layout every time you scroll. Modern browsers take
this further, retaining much of the display list even when the web
page changes.
Most browsers scroll the page when you press the up and down keys, rotate the scroll wheel, drag the scroll bar, or apply a touch gesture to the screen. To keep things simple, let’s just implement the down key.
Tk allows you to bind a function to a key, which instructs Tk to call that function when the key is pressed. For example, to bind to the down arrow key, write:
    def __init__(self):
        # ...
        self.window.bind("<Down>", self.scrolldown)
Here, self.scrolldown is an event handler, a
function that Tk will call whenever the down arrow key is pressed.
It is passed an event object as an argument by Tk, but since
scrolling down doesn’t require any information about the key press,
besides the fact that it happened, scrolldown ignores that
event object. All it needs to do is increment scroll and redraw
the canvas:
    SCROLL_STEP = 100

    def scrolldown(self, e):
        self.scroll += SCROLL_STEP
        self.draw()
If you try this out, you’ll find that scrolling draws all the text a
second time. That’s because we didn’t erase the old text before drawing
the new text. Call
canvas.delete to clear the old text:
    def draw(self):
        self.canvas.delete("all")
        # ...
Scrolling should now work!
But this scrolling is pretty slow. (How fast exactly seems to depend a
lot on your operating system and default font.) Why? It turns out that
loading information about the shape of a character, inside
create_text, takes a while. To speed up scrolling we need
to make sure to do it only when necessary (while at the same time
ensuring the pixels on the screen are always correct).
Real browsers incorporate a lot of quite tricky optimizations to this
process, but for this toy browser let’s limit ourselves to a simple
improvement: on a long page most characters are outside the viewing
window, and we can skip drawing them in draw:
    for x, y, c in self.display_list:
        if y > self.scroll + HEIGHT: continue
        if y + VSTEP < self.scroll: continue
        # ...
The first if statement skips characters below the
viewing window; the second skips characters above it. In that second
if statement, y + VSTEP computes the bottom
edge of the character, so that characters that are halfway inside the
viewing window are still drawn.
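To see how much work these two checks save, the same tests can be pulled out into a standalone function and run on a made-up display list. The function visible_items and the sample data are assumptions for this illustration; in the browser itself the checks stay inside draw.

```python
HEIGHT = 600
VSTEP = 18

def visible_items(display_list, scroll):
    """Return the items that draw would actually draw, in screen coordinates."""
    visible = []
    for x, y, c in display_list:
        if y > scroll + HEIGHT: continue   # below the viewing window
        if y + VSTEP < scroll: continue    # above the viewing window
        visible.append((x, y - scroll, c))
    return visible

# A made-up page: one character per line, 1000 lines tall.
display_list = [(13, VSTEP * (i + 1), "x") for i in range(1000)]
print(len(visible_items(display_list, scroll=0)), "of", len(display_list))
```

Only a few dozen of the thousand lines survive the filter, so the expensive create_text call runs far less often on each redraw.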
Scrolling should now be pleasantly fast, and hopefully well within
the 16ms animation frame budget. And because we split layout and
draw, we don’t need to change
layout at all to implement this optimization.
Though you’re probably writing your browser on a desktop computer, many people access the web through mobile devices such as phones or tablets. On mobile devices, there’s still a screen, a rendering loop, and most other things discussed in this book. (For example, most real browsers have both desktop and mobile editions, and the rendering engine code is almost exactly the same for both.) But there are several differences worth noting:
In a page’s <head> you’ll often see a “viewport”
<meta> tag. This tag gives instructions to the browser for how to handle zooming on a mobile device. Without this tag, the browser assumes, for historical reasons, that the site is “desktop-only” and needs some special tricks to make it readable on a mobile device, such as allowing the user to use a pinch-zoom or double-tap touchscreen gesture to focus in on one part of the page. Once zoomed in, the part of the page visible on the screen is the “visual viewport” and the whole document’s bounds are the “layout viewport”.
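This chapter went from a rudimentary command-line browser to a graphical user interface with text that can be scrolled. The browser now opens a window with a canvas, lays out the page’s text character by character, draws it to the canvas, and scrolls when the down key is pressed.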
Next, we’ll make this browser work on English text, with all its complexities of variable width characters, line layout, and formatting.
The complete set of functions, classes, and methods in our browser should look something like this:
if __name__ == "__main__"
Line breaks: Change
layout to end the current
line and start a new one when it sees a newline character. Increment
y by more than
VSTEP to give the illusion of
paragraph breaks. There are poems embedded in “Journey to the West”;
you’ll now be able to make them out.
Mouse wheel: Add support for scrolling up when you hit the
up arrow. Make sure you can’t scroll past the top of the page. (It’s
harder to stop scrolling past the bottom of the page; we will implement
this in Chapter 5.) Then bind the
<MouseWheel> event, which triggers when you scroll
with the mouse wheel. (It will also trigger with touchpad gestures, if
you don’t have a mouse.) The associated event object has an
event.delta value which tells you how far and in what
direction to scroll. Unfortunately, Mac and Windows give the
event.delta values opposite sign and different scales, and
on Linux, scrolling instead uses the <Button-4> and
<Button-5> events. (The Tk manual has more information about this.
It’s not easy to write cross-platform applications!)
Emoji: Add support for emoji 😀 to our browser. Emoji are
characters, and you can call
create_text to draw them, but
the results aren’t very good. Instead, head to the OpenMoji project,
download the emoji for “grinning face” as a PNG file, convert to GIF,
resize it to 16×16 pixels, and
save it to the same folder as the browser. Use Tk’s
PhotoImage class to load the image and then the
create_image method to draw it to the canvas. In fact,
download the whole OpenMoji library (look for the “Get OpenMojis” button
at the top right); then your browser can look up whatever emoji is used
in the page.
Resizing: Make the browser resizable. To do so, pass the
fill and expand arguments to
canvas.pack, and bind to the
<Configure> event, which happens when the window is
resized. The window’s new width and height can be found in the
width and height fields on the event object.
Remember that when the window is resized, the line breaks must change,
so you will need to call layout again.
Zoom: Make the
+ key double the text size. You
will need to use the
font argument in
create_text to change the size of text, like this:
    import tkinter.font
    font = tkinter.font.Font(size=32)
    canvas.create_text(200, 150, text="Hi!", font=font)
Be careful in how you split the task between layout and
draw. Make sure that text doesn’t overlap when you zoom in
and that scrolling works when zoomed in.