Drawing to the Screen

A web browser doesn’t just download a web page; it also has to show that page to the user. In the 21st century, that means a graphical application. How does that work? In this chapter we’ll equip the toy browser with a graphical user interface. (There are some obscure text-based browsers: I used w3m as my main browser for most of 2011. I don’t anymore.)

Creating windows

Desktop and laptop computers run operating systems that provide desktop environments: windows, buttons, and a mouse. So programs don’t directly draw to the screen; the desktop environment controls the screen. Instead, a program asks the desktop environment for a window and then draws inside it.

Though the desktop environment is responsible for displaying the window, the program is responsible for drawing its contents. Applications have to redraw these contents quickly for interactions to feel fluid, and must respond quickly to clicks and key presses so the user doesn’t get frustrated. (On older systems, applications drew directly to the screen, and if they didn’t update, whatever was there last would stay in place, which is why in error conditions you’d often have one window leave “trails” on another. Modern systems use a technique called compositing, in part to avoid trails; performance and application isolation are additional reasons. Even while using compositing, applications must redraw their window contents to change what is displayed. Chapter 12 will discuss compositing in more detail.)

“Feel fluid” can be made more precise. Graphical applications such as browsers typically aim to redraw at a speed equal to the refresh rate, or frame rate, of the screen, and/or a fixed 60Hz. (Most screens today have a refresh rate of 60Hz, and that is generally considered fast enough to look smooth. However, new hardware is increasingly appearing with higher refresh rates, such as 120Hz. Sometimes rendering engines, games in particular, refresh at lower rates on purpose if they know the rendering speed can’t keep up.) This means that the browser has to finish all its work in less than 1/60th of a second, or about 16ms, in order to keep up. For this reason, 16ms is called the animation frame budget of the application.
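
As a quick check of the arithmetic, the frame budget is just the reciprocal of the refresh rate. The helper name frame_budget_ms below is made up for this illustration and is not part of the browser:

```python
def frame_budget_ms(refresh_rate_hz):
    """Time available per frame, in milliseconds, at the given refresh rate."""
    return 1000 / refresh_rate_hz

for hz in (60, 120):
    print(f"{hz}Hz -> {frame_budget_ms(hz):.1f}ms per frame")
```

At 60Hz this comes to roughly 16.7ms, which the text rounds down to 16ms; at 120Hz the budget is half of that.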

You should also keep in mind that not all web page interactions are animations: there are also discrete actions such as mouse clicks. Research has shown that it usually suffices to respond to a discrete action within 100ms; below that threshold, most humans are not sensitive to discrete action speed. This is very different from interactions such as scrolling, where a rate below 60Hz or so is quite noticeable. The difference between the two has to do with the way the human mind processes movement (animation) versus discrete action, and the time it takes for the brain to decide upon such an action, execute it, and understand its result.

Doing all of this by hand is a bit of a drag, so programs usually use a graphical toolkit to simplify these steps. These toolkits allow you to describe your program’s window in terms of widgets like buttons, tabs, or text boxes, and take care of drawing and redrawing the window contents to match that description.

Python comes with a graphical toolkit called Tk, available through the Python package tkinter. (The library is called Tk, and it was originally written for a different language called Tcl. Python contains an interface to it, hence the name.) Using it is quite simple:

import tkinter
window = tkinter.Tk()
tkinter.mainloop()

Here tkinter.Tk() creates a window and tkinter.mainloop() starts the process of redrawing the screen. Inside Tk, tkinter.Tk() asks the desktop environment to create the window and returns its identifier, while tkinter.mainloop() enters a loop that looks similar to the one below. (That loop may look like an infinite loop that locks up the computer, but it’s not, because of preemptive multitasking among threads and processes and/or a variant of the event loop that sleeps unless it has inputs that wake it up from another thread or process.)

while True:
    for evt in pendingEvents():
        handleEvent(evt)
    drawScreen()

Here, drawScreen draws the various widgets, pendingEvents asks the desktop environment for recent mouse clicks or key presses, and handleEvent calls into library user code in response to that event. This event loop pattern is common in many applications, from web browsers to video games. A simple window does not need much event handling (it ignores all events) or much drawing (it is a uniform white or gray). But in more complex graphical applications the event loop pattern makes sure that all events are eventually handled and the screen is eventually updated, both essential to a good user experience.
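
To make the pattern concrete without opening a window, here is a minimal sketch of the same loop. Everything in it is a stand-in: the deque plays the role of the desktop environment’s event queue, the function names are invented for the example, and the loop is capped at a fixed number of frames so that it terminates. It is not how Tk implements its loop.

```python
from collections import deque

pending = deque(["click", "keypress"])  # stand-in for the desktop environment's queue
log = []                                # records what the loop did, for inspection

def pending_events():
    """Drain and return every event that has arrived since the last call."""
    events = list(pending)
    pending.clear()
    return events

def handle_event(evt):
    log.append(f"handled {evt}")

def draw_screen():
    log.append("drew screen")

def run_event_loop(max_frames):
    # A real event loop runs until the program exits; this one stops
    # after max_frames iterations so the example finishes.
    for _ in range(max_frames):
        for evt in pending_events():
            handle_event(evt)
        draw_screen()

run_event_loop(2)
print(log)
```

Both events are handled in the first frame, before the first redraw; the second frame finds no events and only redraws.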

Tk’s event loop is the Tk_UpdateObjCmd function, found in tkCmds.c, which calls XSync to redraw the screen and Tcl_DoOneEvent to handle an event. There’s also a lot of code to handle errors.

Drawing to the window

Our toy browser will draw the web page text to a canvas, a rectangular Tk widget that you can draw circles, lines, and text in. (You may be familiar with the HTML <canvas> element, which is a similar idea: a 2D rectangle in which you can draw shapes.) Tk also has widgets like buttons and dialog boxes, but our browser won’t use them: we will need finer-grained control over appearance, which a canvas provides. (This is why desktop applications are more uniform than web pages: desktop applications generally use the widgets provided by a common graphical toolkit, which limits their creative possibilities.) To create one:

WIDTH, HEIGHT = 800, 600
window = tkinter.Tk()
canvas = tkinter.Canvas(window, width=WIDTH, height=HEIGHT)
canvas.pack()

Setting aside the WIDTH and HEIGHT constants, the first line creates the window, as above; the second creates the Canvas inside that window. We pass the window as an argument, so that Tk knows where to display the canvas, and some arguments that define the canvas’s size; I chose 800×600 because that was a common old-timey monitor size. (This size, called Super Video Graphics Array, was standardized in 1987, and probably did seem super back then.) The last line is a Tk peculiarity, which positions the canvas inside the window.

There’s going to be a window, a canvas, and later some other things, so to keep it all organized let’s make an object:

class Browser:
    def __init__(self):
        self.window = tkinter.Tk()
        self.canvas = tkinter.Canvas(
            self.window, 
            width=WIDTH,
            height=HEIGHT
        )
        self.canvas.pack()

Once you’ve made a canvas, you can call methods that draw shapes on the canvas. Let’s do that inside load, which we’ll move into the new Browser class:

class Browser:
    def load(self, url):
        # ...
        self.canvas.create_rectangle(10, 20, 400, 300)
        self.canvas.create_oval(100, 100, 150, 150)
        self.canvas.create_text(200, 150, text="Hi!")

To run this code, create a Browser, call load, and then start the Tk mainloop:

if __name__ == "__main__":
    import sys
    Browser().load(sys.argv[1])
    tkinter.mainloop()

You ought to see: a rectangle, starting near the top-left corner of the canvas and ending at its center; then a circle inside that rectangle; and then the text “Hi!” next to the circle.

Coordinates in Tk refer to X positions from left to right and to Y positions from top to bottom. In other words, the bottom of the screen has larger Y values, the opposite of what you might be used to from math. Play with the coordinates above to figure out what each argument refers to. (The answers are in the online documentation.)

The Tk canvas widget is quite a bit more powerful than what we’re using it for here. As you can see from the tutorial, you can move the individual things you’ve drawn to the canvas, listen to click events on each one, and so on. In this book, I’m not using those features, because I want to teach you how to implement them.

Laying out text

Let’s draw a simple web page on this canvas. So far, the toy browser steps through the web page source code character by character and prints the text (but not the tags) to the console window. Now we want to draw the characters on the canvas instead.

To start, let’s change the show function from the previous chapter into a function that I’ll call lex (foreshadowing future developments…), which just returns the text-not-tags content of an HTML document, without printing it:

def lex(body):
    text = ""
    # ...
    for c in body:
        # ...
        elif not in_angle:
            text += c
    return text
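
For reference, here is one way the elided parts might be filled in. This is a guess, assuming lex tracks whether it is inside a tag with an in_angle flag toggled by the angle brackets; only the elif branch and the return are taken from the listing above.

```python
def lex(body):
    text = ""
    in_angle = False  # assumed: True while we are between "<" and ">"
    for c in body:
        if c == "<":
            in_angle = True
        elif c == ">":
            in_angle = False
        elif not in_angle:
            text += c
    return text

print(lex("<p>Hello, <b>world</b>!</p>"))
```

On this small input the tags are dropped and only the text between them is returned.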

Then, load will draw that text, character by character:

def load(self, url):
    # ...
    for c in text:
        self.canvas.create_text(100, 100, text=c)

Let’s test this code on a real webpage. For reasons that might seem inscrutable (it’s to delay a discussion of basic typography to the next chapter…), let’s test it on the first chapter of 西游记 or “Journey to the West”, a classic Chinese novel about a monkey. Run this URL (right click on the link and “Copy URL”) through request, lex, and load. (If you’re not in Asia, you’ll probably see this phase take a while: China is far away!) You should see a window with a big blob of black pixels inset a bit from the top left corner of the window.

Why a blob instead of letters? Well, of course, because we are drawing every letter in the same place, so they all overlap! Let’s fix that:

HSTEP, VSTEP = 13, 18
cursor_x, cursor_y = HSTEP, VSTEP
for c in text:
    self.canvas.create_text(cursor_x, cursor_y, text=c)
    cursor_x += HSTEP

The variables cursor_x and cursor_y point to where the next character will go, as if you were typing the text in a word processor. I picked the magic numbers (13 and 18) by trying a few different values and picking the ones that looked most readable. In the next chapter, we’ll replace magic numbers with font metrics.

The text now forms a line from left to right. But with an 800 pixel wide canvas and 13 pixels per character, one line only fits about 60 characters. You need more than that to read a novel, so we also need to wrap the text once we reach the edge of the screen:

for c in text:
    # ...
    if cursor_x >= WIDTH - HSTEP:
        cursor_y += VSTEP
        cursor_x = HSTEP

The code increases cursor_y and resets cursor_x once cursor_x reaches or goes past 787 pixels. (Not 800, because we started at pixel 13 and I want to leave an even gap on both sides.) Wrapping the text this way makes it possible to read more than a single line. (In the olden days of typewriters, increasing y meant feeding in a new line, and resetting x meant returning the carriage that printed letters to the left edge of the page. So ASCII standardizes two separate characters, “carriage return” and “line feed”, for these operations, so that ASCII could be directly executed by teletypewriters. That’s why headers in HTTP are separated by \r\n, even though modern computers have no mechanical carriage.)

Now we can read a lot of text, but still not all of it: if there’s enough text, all of the lines of text don’t fit on the screen. We want users to scroll the page to look at different parts of it.

Chinese characters are usually, but not always, independent: 开关 means “switch” but is composed of “on” and “off”. A line break between them would be confusing, because you’d read “on off” instead of “switch”. The ICU library, used by both Firefox and Chrome, uses dynamic programming to guess phrase boundaries based on a word frequency table.

Scrolling text

Scrolling introduces a layer of indirection between page coordinates (this text is 132 pixels from the top of the page) and screen coordinates (since you’ve scrolled 60 pixels down, this text is 72 pixels from the top of the screen). Generally speaking, a browser lays out the page (determines where everything on the page goes) in terms of page coordinates and then renders the page (draws everything) in terms of screen coordinates. (Sort of. What actually happens is that the page is first drawn into a bitmap or GPU texture, then that bitmap/texture is shifted according to the scroll, and the result is rendered to the screen. Chapter 12 will have more on this topic.)
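
The conversion between the two coordinate systems is a single subtraction. The sketch below uses a hypothetical helper, page_to_screen_y, purely to spell out the example numbers from the paragraph above:

```python
def page_to_screen_y(page_y, scroll):
    """Convert a Y position on the page to a Y position on the screen."""
    return page_y - scroll

# Text 132 pixels down the page, after scrolling 60 pixels down:
print(page_to_screen_y(132, 60))
```

This gives 72, matching the figure in the text. X coordinates are unaffected because the browser only scrolls vertically.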

Our browser will have the same split. Right now load both computes the position of each character and draws it: layout and rendering. Let’s have a layout function to compute and store the position of each character, and a separate draw function to then draw each character based on the stored position. This way, layout can operate with page coordinates and only draw needs to think about screen coordinates.

Let’s start with layout. Instead of calling canvas.create_text on each character let’s add it to a list, together with its position. Since layout doesn’t need to access anything in Browser, it can be a standalone function:

def layout(text):
    display_list = []
    cursor_x, cursor_y = HSTEP, VSTEP
    for c in text:
        display_list.append((cursor_x, cursor_y, c))
        # ...
    return display_list

The resulting list is called a display list: it is a list of things to display. (The term is standard.) Since layout is all about page coordinates, we don’t need to change anything else about it to support scrolling.
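
Putting the pieces together, a complete layout might look like the sketch below. It assumes the elided lines are the cursor advance and line wrap shown earlier in the chapter, and it repeats the constants so that it runs on its own.

```python
WIDTH = 800
HSTEP, VSTEP = 13, 18

def layout(text):
    display_list = []
    cursor_x, cursor_y = HSTEP, VSTEP
    for c in text:
        display_list.append((cursor_x, cursor_y, c))
        cursor_x += HSTEP
        # Wrap to the next line once we reach the right margin.
        if cursor_x >= WIDTH - HSTEP:
            cursor_y += VSTEP
            cursor_x = HSTEP
    return display_list

display_list = layout("a" * 61)
print(display_list[0])   # first character on the first line
print(display_list[59])  # last character that fits on the first line
print(display_list[60])  # first character on the second line
```

With these constants exactly 60 characters fit on a line: the 60th lands at x = 780, and the 61st wraps to x = 13 on the next line, one VSTEP further down.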

Once the display list is computed, draw needs to loop through the display list and draw each character:

class Browser:
    def draw(self):
        for x, y, c in self.display_list:
            self.canvas.create_text(x, y, text=c)

Since draw does need access to the canvas, we keep it a method on Browser. Now load just needs to call layout followed by draw:

class Browser:
    def load(self, url):
        headers, body = request(url)
        text = lex(body)
        self.display_list = layout(text)
        self.draw()

Now we can add scrolling. Let’s have a variable for how far you’ve scrolled:

class Browser:
    def __init__(self):
        # ...
        self.scroll = 0

The page coordinate y then has screen coordinate y - self.scroll:

def draw(self):
    for x, y, c in self.display_list:
        self.canvas.create_text(x, y - self.scroll, text=c)

If you change the value of scroll the page will now scroll up and down. But how does the user change scroll?

Storing the display list makes scrolling faster: the browser isn’t doing layout every time you scroll. Modern browsers take this further, retaining much of the display list even when the web page changes due to JavaScript or user interaction.

Reacting to the user

Most browsers scroll the page when you press the up and down keys, rotate the scroll wheel, drag the scroll bar, or apply a touch gesture to the screen. To keep things simple, let’s just implement the down key.

Tk allows you to bind a function to a key, which instructs Tk to call that function when the key is pressed. For example, to bind to the down arrow key, write:

def __init__(self):
    # ...
    self.window.bind("<Down>", self.scrolldown)

Here, self.scrolldown is an event handler, a function that Tk will call whenever the down arrow key is pressed. (scrolldown is passed an event object as an argument by Tk, but since scrolling down doesn’t require any information about the key press, besides the fact that it happened, scrolldown ignores that event object.) All it needs to do is increment scroll and re-draw the canvas:

SCROLL_STEP = 100

def scrolldown(self, e):
    self.scroll += SCROLL_STEP
    self.draw()

If you try this out, you’ll find that scrolling draws all the text a second time. That’s because we didn’t erase the old text before drawing the new text. Call canvas.delete to clear the old text:

def draw(self):
    self.canvas.delete("all")
    # ...

Scrolling should now work!
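
To see why the delete call matters without opening a window, the sketch below swaps the real canvas for FakeCanvas, a hypothetical stand-in that only records what would be drawn. The Browser here takes a display list directly instead of loading a URL; both simplifications are assumptions made for the example.

```python
class FakeCanvas:
    """Hypothetical stand-in for tkinter.Canvas that only records items."""
    def __init__(self):
        self.items = []

    def create_text(self, x, y, text):
        self.items.append((x, y, text))

    def delete(self, tag):
        # Only the "all" tag is modelled here.
        if tag == "all":
            self.items.clear()

SCROLL_STEP = 100

class Browser:
    def __init__(self, display_list):
        self.canvas = FakeCanvas()
        self.display_list = display_list
        self.scroll = 0

    def draw(self):
        self.canvas.delete("all")  # erase the previous frame first
        for x, y, c in self.display_list:
            self.canvas.create_text(x, y - self.scroll, text=c)

    def scrolldown(self, e):
        self.scroll += SCROLL_STEP
        self.draw()

browser = Browser([(13, 18, "a"), (26, 18, "b")])
browser.draw()
browser.scrolldown(None)  # Tk would pass an event object; it is ignored
print(browser.canvas.items)
```

After one scroll the canvas holds two items, both shifted up by 100 pixels. If the delete call is removed, it holds four: the old frame and the new one on top of each other.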

Faster rendering

But this scrolling is pretty slow. (How fast exactly seems to depend a lot on your operating system and default font.) Why? It turns out that loading information about the shape of a character, inside create_text, takes a while. To speed up scrolling we need to make sure to do it only when necessary (while at the same time ensuring the pixels on the screen are always correct).

Real browsers incorporate a lot of quite tricky optimizations to this process, but for this toy browser let’s limit ourselves to a simple improvement: on a long page most characters are outside the viewing window, and we can skip drawing them in draw:

for x, y, c in self.display_list:
    if y > self.scroll + HEIGHT: continue
    if y + VSTEP < self.scroll: continue
    # ...

The first if statement skips characters below the viewing window; the second skips characters above it. In that second if statement, y + VSTEP computes the bottom edge of the character, so that characters that are halfway inside the viewing window are still drawn.
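
The same two checks can be tried out in isolation. In this sketch visible_items is a hypothetical helper, not part of the browser, that returns the entries draw would keep; the sample display list is made up.

```python
HEIGHT = 600
VSTEP = 18

def visible_items(display_list, scroll):
    """Return only the display-list entries that overlap the viewing window."""
    result = []
    for x, y, c in display_list:
        if y > scroll + HEIGHT: continue   # below the viewing window
        if y + VSTEP < scroll: continue    # above the viewing window
        result.append((x, y, c))
    return result

display_list = [(13, 18, "a"), (13, 90, "b"), (13, 650, "c"), (13, 900, "d")]
print(visible_items(display_list, scroll=100))
```

With the page scrolled 100 pixels, the window covers page coordinates 100 to 700. The character at y = 18 is entirely above it and the one at y = 900 is below it, so both are skipped; the one at y = 90 straddles the top edge and is kept.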

Scrolling should now be pleasantly fast, and hopefully well within the 16ms animation frame budget. And because we split layout and draw, we don’t need to change layout at all to implement this optimization.

Mobile devices

Though you’re probably writing your browser on a desktop computer, many people access the web through mobile devices such as phones or tablets. On mobile devices, there’s still a screen, a rendering loop, and most other things discussed in this book. (For example, most real browsers have both desktop and mobile editions, and the rendering engine code is almost exactly the same for both.) But there are several differences worth noting:

Summary

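This chapter went from a rudimentary command-line browser to a graphical user interface with text that can be scrolled. The browser now creates a window through Tk, lays out the page text into a display list, draws that list on a canvas, and scrolls in response to the down arrow key.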

Next, we’ll make this browser work on English text, with all its complexities of variable width characters, line layout, and formatting.

Outline

The complete set of functions, classes, and methods in our browser should look something like this:

def request(url)
def lex(body)
WIDTH, HEIGHT
HSTEP, VSTEP
SCROLL_STEP
def layout(text)
class Browser:
    def __init__()
    def load(url)
    def draw()
    def scrolldown(e)
if __name__ == "__main__"

Exercises

Line breaks: Change layout to end the current line and start a new one when it sees a newline character. Increment y by more than VSTEP to give the illusion of paragraph breaks. There are poems embedded in “Journey to the West”; you’ll now be able to make them out.

Mouse wheel: Add support for scrolling up when you hit the up arrow. Make sure you can’t scroll past the top of the page. (It’s harder to stop scrolling past the bottom of the page; we will implement this in Chapter 5.) Then bind the <MouseWheel> event, which triggers when you scroll with the mouse wheel. (It will also trigger with touchpad gestures, if you don’t have a mouse.) The associated event object has an event.delta value which tells you how far and in what direction to scroll. Unfortunately, Mac and Windows give event.delta opposite signs and different scales, and on Linux, scrolling instead uses the <Button-4> and <Button-5> events. (The Tk manual has more information about this.) It’s not easy to write cross-platform applications!

Emoji: Add support for emoji 😀 to our browser. Emoji are characters, and you can call create_text to draw them, but the results aren’t very good. Instead, head to the OpenMoji project, download the emoji for “grinning face” as a PNG file, convert to GIF, resize it to 16×16 pixels, and save it to the same folder as the browser. Use Tk’s PhotoImage class to load the image and then the create_image method to draw it to the canvas. In fact, download the whole OpenMoji library (look for the “Get OpenMojis” button at the top right)—then your browser can look up whatever emoji is used in the page.

Resizing: Make the browser resizable. To do so, pass the fill and expand arguments to canvas.pack, and bind to the <Configure> event, which happens when the window is resized. The window’s new width and height can be found in the width and height fields on the event object. Remember that when the window is resized, the line breaks must change, so you will need to call layout again.

Zoom: Make the + key double the text size. You will need to use the font argument in create_text to change the size of text, like this:

import tkinter.font
font = tkinter.font.Font(size=32)
canvas.create_text(200, 150, text="Hi!", font=font)

Be careful in how you split the task between layout and draw. Make sure that text doesn’t overlap when you zoom in and that scrolling works when zoomed in.
