Handling Buttons and Links

Our toy browser draws web pages, but it is still missing the key insight of hypertext: pages linked together into a web of information. We can watch the waves, but cannot yet surf the web. We need to implement hyperlinks, and we might as well add an address bar and a back button while we’re at it.

Click handling

To implement links, the browser chrome, and so on, we need to start with clicks. We already handle key presses; clicks work similarly in Tk: an event handler bound to a certain event. For scrolling, we defined scroll_down and bound it to <Down>; for click handling we will define handle_click and bind it to <Button-1>, button number 1 being the left button on the mouse. (Button 2 is the middle button; button 3 is the right-hand button.)

def handle_click(e):
    pass

window.bind("<Button-1>", handle_click)

Inside handle_click, we want to figure out what link the user has clicked on. We’ll need to look at the e argument, which contains an "event object". This object has x and y fields, which refer to where the click happened, relative to the top-left corner of the browser window. Since the canvas is in the top-left corner of the window, those are also the x and y coordinates relative to the canvas. To get coordinates relative to the web page, we need to account for scrolling:

x, y = e.x, e.y + scrolly

The next step is to figure out what links or other elements are at that location. Naively, this seems like it should be easy. We already have a tree of layout objects, each of which records size and position. We could use those to find the element clicked on, something like this:

def handle_click(e):
    x, y = e.x, e.y + scrolly
    elt = find_element(x, y, mode)
    print(elt.tag)

Here the find_element function is a straightforward variant of code we’ve already written a few times:

def find_element(x, y, layout):
    for child in reversed(layout.children):
        result = find_element(x, y, child)
        if result: return result
    if layout.x <= x < layout.x + layout.w and \
       layout.y <= y < layout.y + layout.h:
        return layout.node

In this code snippet, I am checking the children of a given node before checking the node itself. That's because if you click on a link, you want to click on the link, not the paragraph that it’s in. I search the children in reverse order in case children overlap; the last one would be “on top”. (Real browsers use what are called stacking contexts to resolve the overlapping-elements question, while allowing the order to be controlled with the z-index property.)
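
To see the children-first, last-child-wins order in action without a real layout tree, here is a minimal sketch. The Box class is a hypothetical stand-in for the layout classes (it only has the fields find_element reads), and the node values are plain strings rather than HTML nodes.

```python
class Box:
    """Hypothetical stand-in for a layout object; not part of the browser."""
    def __init__(self, node, x, y, w, h, children=()):
        self.node = node
        self.x, self.y, self.w, self.h = x, y, w, h
        self.children = list(children)

def find_element(x, y, layout):
    # Check children first (the deepest match wins), last child first
    # (later children are treated as being on top of earlier ones).
    for child in reversed(layout.children):
        result = find_element(x, y, child)
        if result: return result
    if layout.x <= x < layout.x + layout.w and \
       layout.y <= y < layout.y + layout.h:
        return layout.node

# Two overlapping children inside one paragraph-sized box.
first = Box("first", 0, 0, 60, 20)
second = Box("second", 40, 0, 60, 20)
para = Box("paragraph", 0, 0, 200, 40, [first, second])

hit_overlap = find_element(50, 10, para)    # inside both children
hit_first = find_element(10, 10, para)      # only inside `first`
hit_para = find_element(150, 30, para)      # inside no child
hit_nothing = find_element(500, 500, para)  # outside everything
```

In the overlapping region the later child is returned, and a click outside every box falls through to None, which is why the caller has to check the result before using it.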

Let's test it. But actually, first, let's handle a silly omission: we don't have any special style for links! Let's quickly add support for text color, which is controlled by the color property.

Once links have colors, you can actually find them on the page. So try clicking on them!

… but it won't work. There is no layout object corresponding to a link. The link text is laid out by InlineLayout, but each InlineLayout handles a whole paragraph of text, not a single link. We'll need to do some surgery on InlineLayout to fix this.

Adding line and text layout

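Here's how I want inline layout to work, at a high level: each InlineLayout will hold a list of LineLayout children, one per line of text, and each LineLayout will hold a list of TextLayout children, one per word.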

To begin with, we'll need to create two new data structures:

class LineLayout:
    def __init__(self, parent):
        self.parent = parent
        self.children = []
        self.w = 0

class TextLayout:
    def __init__(self, node, text):
        self.children = []
        self.node = node
        self.text = text

LineLayout is pretty run-of-the-mill here, but TextLayout is unusual. (There's no dummy node field on LineLayout because there's no HTML node that corresponds to a line of text, and there couldn't be.) First, TextLayout has a dummy children field. I added that just to keep it uniform; it'll always be an empty list. (You may want to use inheritance to group all the Layout classes into a hierarchy, but I'm trying to stick to some kind of easily-translatable subset of Python.) Second, it takes both a node and a text argument. That's because a single TextNode contains multiple words, and, because of line breaking, I want each TextLayout to be a single word. Finally, I'm not attaching the TextLayout to its parent in the constructor. I'll do that in a separate attach method, which I'll define below.

Next, since each TextLayout corresponds to a particular TextNode, we can immediately compute its width and height. I'm going to make the height of a TextLayout be just the linespace for its font, with the extra leading added in LineLayout. Make sure you measure text, not node.text, which contains multiple words! That's an easy and confusing bug.

bold = node.style["font-weight"] == "bold"
italic = node.style["font-style"] == "italic"
self.color = node.style["color"]
self.font = tkinter.font.Font(
    family="Times", size=16,
    weight="bold" if bold else "normal",
    slant="italic" if italic else "roman"
)
self.w = self.font.measure(text)
self.h = self.font.metrics('linespace')

We can compute all this immediately in the constructor; that's one of the benefits of implementing styles and inheritance in the previous chapter.

With a basic TextLayout and LineLayout in place, we can start changing InlineLayout. First, let's create that list of lines, and initialize it with a blank line:

class InlineLayout:
    def __init__(self, parent):
        # ...
        self.children = [LineLayout(self)]

We'll need to create LineLayout and TextLayout objects as we lay out text in InlineLayout.layout. That function does little but recurse and call text on each TextNode. (In the last lab, we stopped relying on open and close for changing the bold and italic variables, so you might as well delete those functions entirely.) In that text function, there is a lot of control flow and then either an increment to x or an increment to y and a reset of x. We'll just replace the first case by creating a TextLayout and adding it to the last child in self.children, while the second case will create a new LineLayout.

If you ignore the terminal_space stuff, my version of text now looks like this:

def text(self, node):
    words = node.text.split()
    for i, word in enumerate(words):
        tl = TextLayout(node, word)
        line = self.children[-1]
        if line.w + tl.w > self.w:
            line = LineLayout(self)
            self.children.append(line)
        tl.attach(line)

Note that I've removed the DrawText command and the display list. I'm planning to do that in TextLayout now.

Here, TextLayout.attach just adds the word to a line and increments the line's width:

def attach(self, parent):
    self.parent = parent
    parent.children.append(self)
    parent.w += self.w
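
To check the wrapping logic without opening a Tk window, here is a stripped-down sketch of the same classes. It makes several assumptions that are not in the browser: measure is a made-up fixed-width font (10 pixels per character), text takes a plain string instead of a TextNode, and spaces between words are ignored.

```python
def measure(text):
    # Assumption: a made-up fixed-width font, 10 pixels per character,
    # standing in for tkinter.font.Font.measure so this runs without Tk.
    return 10 * len(text)

class LineLayout:
    def __init__(self, parent):
        self.parent = parent
        self.children = []
        self.w = 0

class TextLayout:
    def __init__(self, node, text):
        self.children = []
        self.node = node
        self.text = text
        self.w = measure(text)   # measure the word, not the whole node

    def attach(self, parent):
        self.parent = parent
        parent.children.append(self)
        parent.w += self.w

class InlineLayout:
    def __init__(self, w):
        self.w = w
        self.children = [LineLayout(self)]

    def text(self, node_text):
        # Simplification: node_text is a string, not a TextNode.
        for word in node_text.split():
            tl = TextLayout(node_text, word)
            line = self.children[-1]
            if line.w + tl.w > self.w:
                # The word does not fit: start a new line for it.
                line = LineLayout(self)
                self.children.append(line)
            tl.attach(line)

inline = InlineLayout(w=100)
inline.text("surf the web with a toy browser")
lines = [[tl.text for tl in line.children] for line in inline.children]
widths = [line.w for line in inline.children]
```

With a 100-pixel-wide block, the seven words end up on three lines, and each line's w field is the sum of its words' widths.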

What about terminal_space? Well, remember that we only have TextLayout objects for words of text, not the intermediate inline style nodes. We just need to know which TextLayout objects have spaces after them, and which do not. I'm going to add a space field to TextLayout, which is going to tell me whether a space goes after that word, and set it like this:

if node.text[0].isspace() and len(self.children[-1].children) > 0:
    self.children[-1].children[-1].add_space()

for i, word in enumerate(words):
    # ...
    if i != len(words) - 1 or node.text[-1].isspace():
        tl.add_space()

Here the add_space function sets the space field and also increases the parent line's width:

def add_space(self):
    if self.space == 0:
        gap = self.font.measure(" ")
        self.space = gap
        self.parent.w += gap
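
The rule for which words get a trailing space is easy to get subtly wrong, so here is a small sketch of just that rule. The function words_with_spaces is hypothetical: it treats everything as one line, works on plain strings instead of TextNodes, and records a boolean where add_space would store a pixel width. It also skips empty strings, which the snippet above does not guard against.

```python
def words_with_spaces(texts):
    """Return [word, has_space_after] pairs for a run of text node strings."""
    out = []
    for text in texts:
        if not text:
            continue  # guard: text[0] below would fail on an empty string
        # Leading whitespace belongs to the previous word, if there is one.
        if text[0].isspace() and out:
            out[-1][1] = True
        words = text.split()
        for i, word in enumerate(words):
            # Words inside a node are always separated by a space; the
            # last word gets one only if the node ends in whitespace.
            has_space = i != len(words) - 1 or text[-1].isspace()
            out.append([word, has_space])
    return out

# Text nodes from something like: Click <a>this link</a>, then <b>go </b>back
pairs = words_with_spaces(["Click ", "this link", ", then ", "go ", "back"])
```

Note that "link" gets no space, so the comma that follows it in the next text node sits right against it, as it should.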

Now that we have created the LineLayout and TextLayout objects, we need to compute their x and y positions. Let's start from the simplest and work up to the hardest. TextLayout does barely anything; it is told where to be and it goes there:

class TextLayout:
    def layout(self, x, y):
        self.x = x
        self.y = y

Recall that for a TextLayout we compute the width and height in the constructor. Now, for a line, we need to just lay out the words in the line, one by one:

class LineLayout:
    def layout(self, y):
        self.y = y
        self.x = self.parent.x
        self.h = 0

        x = self.x
        leading = 2
        y += leading / 2
        for child in self.children:
            child.layout(x, y)
            x += child.w + child.space
            self.h = max(self.h, child.h + leading)
        self.w = x - self.x

Note the height computation. It will be totally wrong if you mix fonts of different sizes in one line. You should instead first compute the largest ascenders and descenders, use that to compute a baseline, then place all the boxes, and finally compute the line height. Leading would be computed per-word and would factor into the placement of the baseline. I'm not doing any of that here because we don't have any elements of different font sizes anyway.
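
For the curious, the baseline approach could look roughly like the sketch below. It is not what the browser in this chapter does. The layout_line function and its (text, ascent, descent) tuples are hypothetical, the metrics are made-up numbers standing in for what font.metrics would report, and it uses a single leading value per line rather than per-word leading.

```python
def layout_line(words, y, leading=2):
    """Place words on a shared baseline.

    `words` is a list of (text, ascent, descent) tuples. Returns
    (positions, line_height), where positions maps each word's text
    to the y coordinate of the top of its box.
    """
    max_ascent = max(ascent for _, ascent, _ in words)
    max_descent = max(descent for _, _, descent in words)
    # Half the leading goes above the tallest ascender.
    baseline = y + leading / 2 + max_ascent
    # Each word hangs from the baseline by its own ascent.
    positions = {text: baseline - ascent for text, ascent, _ in words}
    line_height = max_ascent + max_descent + leading
    return positions, line_height

positions, height = layout_line(
    [("small", 12, 4), ("BIG", 24, 8)], y=100)
```

The smaller word starts lower on the page than the bigger one, so that the bottoms of their letters line up on the same baseline.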

Now that we have words and lines laying themselves out, we need only modify InlineLayout. This involves the most surgery, but the end result is much simpler now that we've got proper line and text layout.

First, we can delete the bold, italic, terminal_space, and dl fields; pass node to the InlineLayout constructor; and rename the layout method to recurse:

class InlineLayout:
    def __init__(self, parent, node):
        # ...
        self.node = node

    def recurse(self, node):
        if isinstance(node, ElementNode):
            for child in node.children:
                self.recurse(child)
        else:
            self.text(node)

This makes room for a new layout function, which calls recurse to create children and then lays them out:

def layout(self):
    self.x = self.parent.content_left()
    self.y = self.parent.content_top()
    self.w = self.parent.content_width()
    self.recurse(self.node)
    y = self.y
    for child in self.children:
        child.layout(y)
        y += child.h
    self.h = y - self.y

All that's left is generating a display list; let's just copy the recursive display_list method from BlockLayout to InlineLayout and LineLayout (skipping the borders and background color stuff), and add a simple display_list to TextLayout, which just issues a single DrawText call:

def display_list(self):
    return [DrawText(
        self.x, self.y,
        self.text, self.font, self.color)]
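
Put together, the display list methods might look like the sketch below. DrawText here is a hypothetical namedtuple stand-in for the browser's real drawing command, and the layout classes are cut down to just the fields these methods need.

```python
from collections import namedtuple

# Hypothetical stand-in for the browser's DrawText command.
DrawText = namedtuple("DrawText", "x y text font color")

class TextLayout:
    def __init__(self, x, y, text):
        self.x, self.y, self.text = x, y, text
        self.font, self.color = None, "black"   # placeholders
        self.children = []

    def display_list(self):
        return [DrawText(self.x, self.y, self.text, self.font, self.color)]

class LineLayout:
    def __init__(self, children):
        self.children = children

    def display_list(self):
        # No borders or backgrounds here: just concatenate the children.
        dl = []
        for child in self.children:
            dl.extend(child.display_list())
        return dl

# InlineLayout can use the same method body, since its children are lines.
line = LineLayout([TextLayout(0, 0, "Hello"), TextLayout(50, 0, "world")])
commands = line.display_list()
```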

Phew. That was a lot of surgery to InlineLayout. But as a result, InlineLayout should now look a lot like the other layout classes. And we now have an individual layout object corresponding to each word in the document. The handle_click function should now work correctly: when you click on a link, find_element should return the exact TextNode that you clicked on, from which you can get a link:

elt = find_element(x, y, mode)
while elt and not \
      (isinstance(elt, ElementNode) and \
       elt.tag == "a" and "href" in elt.attributes):
    elt = elt.parent
if elt:
    print(elt.attributes["href"])
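
Here is the same parent-walking loop wrapped in a function so it can be tried on a hand-built tree. The ElementNode and TextNode classes below are minimal stand-ins for the parser's classes, and enclosing_link is a hypothetical helper name.

```python
class ElementNode:
    # Minimal stand-in for the parser's ElementNode.
    def __init__(self, tag, attributes=None, parent=None):
        self.tag = tag
        self.attributes = attributes or {}
        self.parent = parent

class TextNode:
    # Minimal stand-in for the parser's TextNode.
    def __init__(self, text, parent=None):
        self.text = text
        self.parent = parent

def enclosing_link(elt):
    """Walk up from elt to the nearest <a href=...>; return its href or None."""
    while elt and not \
          (isinstance(elt, ElementNode) and \
           elt.tag == "a" and "href" in elt.attributes):
        elt = elt.parent
    return elt.attributes["href"] if elt else None

# <p><a href="/next"><b>bold link</b></a> plain text</p>
p = ElementNode("p")
a = ElementNode("a", {"href": "/next"}, parent=p)
b = ElementNode("b", parent=a)
inside = TextNode("bold link", parent=b)
outside = TextNode(" plain text", parent=p)
```

The walk matters because the clicked text is often not a direct child of the link: here the text sits inside a b element inside the a element.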

Once we’ve found the link, we need to navigate to that page.

Navigating between pages

I'd like clicking a link to cause the browser to navigate to that page. That would mean parsing the new URL, requesting the page, lexing and parsing the response, downloading and applying its styles, laying out the result, and finally computing and rendering its display list.

None of that is impossible, since we do all of it already, but right now it's split between two functions: show, which executes the last few of those steps, and the browser entry point, which does the first few. I'm going to rejigger this architecture by introducing a new Browser object, which will both manage the canvas and do the page-related stuff. The GUI will be set up in the constructor:

class Browser:
    def __init__(self):
        self.window = tkinter.Tk()
        self.canvas = tkinter.Canvas(self.window,
            width=800, height=600)
        self.canvas.pack()

        self.url = None
        self.scrolly = 0
        self.max_h = 0
        self.window.bind("<Down>", self.scrolldown)
        self.window.bind("<Button-1>", self.handle_click)

Then, we'll have a method to browse to a given web page:

def browse(self, url):
    self.url = url
    host, port, path = parse_url(url)
    headers, body = request(host, port, path)
    text = lex(body)
    self.nodes = parse(text)
    self.rules = []
    with open("browser.css") as f:
        r = CSSParser(f.read()).parse()
        self.rules.extend(r)
    for link in find_links(self.nodes):
        lhost, lport, lpath = parse_url(relative_url(link, self.url))
        header, body = request(lhost, lport, lpath)
        self.rules.extend(CSSParser(body).parse())
    self.rules.sort(key=lambda x: x[0].score())
    style(self.nodes, self.rules)
    self.page = Page()
    self.layout = BlockLayout(self.page, self.nodes)
    self.layout.layout(0)
    self.max_h = self.layout.height()
    self.display_list = self.layout.display_list()
    self.render()

Here the methods like self.scrolldown, self.handle_click, and self.render are the functions we used to have of that name, but now with an additional self argument.

Running the browser is straightforward:

browser = Browser()
browser.browse(sys.argv[1])
tkinter.mainloop()

In handle_click, that print statement can now call browse:

def handle_click(self, e):
    # ...
    if elt:
        self.browse(relative_url(elt.attributes["href"], self.url))
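
The relative_url helper comes from an earlier chapter and isn't repeated here. As a point of comparison, Python's standard library has urllib.parse.urljoin, which resolves a link against a base URL. I'm not claiming our relative_url matches it in every case, and the URLs below are made up, but it shows the kind of answers to expect.

```python
from urllib.parse import urljoin

base = "http://example.org/book/chapter7.html"

same_dir = urljoin(base, "chapter8.html")        # relative to the directory
from_root = urljoin(base, "/index.html")         # host-relative path
absolute = urljoin(base, "http://example.com/")  # already a full URL
```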

Try the code out, say on this page; you could use the links at the top of the page, for example. Our toy browser now suffices to read not just a chapter, but the whole book.

Browser chrome

Now that we are navigating between pages all the time, it's easy to get a little lost and forget what web page you're looking at. Browsers solve this issue with an address bar that shows the URL. Let's implement a little address bar ourselves.

The idea is to reserve the top 60 pixels of the canvas and then draw the address bar there. Those 60 pixels are called the browser chrome. (Yep, that predates and inspired the name of the Chrome browser.)

To do that, we first have to move the page content itself further down. We can do that in render:

def render(self):
    self.canvas.delete("all")
    for cmd in self.display_list:
        cmd.draw(self.scrolly - 60, self.canvas)

We need to make a similar change in handle_click to subtract that 60 pixels off when we convert back from screen to page coordinates. Next, we need to cover up any actual page contents that got drawn to that top 60 pixels:

def render(self):
    # ...
    self.canvas.create_rectangle(0, 0, 800, 60, fill='white')

Of course a real browser wouldn’t draw that content in the first place, but in Tk that's a little tricky to do, and covering it up later is easier. (Text that is partially covered by the browser chrome would be hard to handle.)
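
The coordinate change in handle_click is just the inverse of the one in render. Here is a sketch of both directions; the function names and the CHROME_HEIGHT constant are hypothetical, and it assumes that each draw command subtracts its first argument from the page y coordinate.

```python
CHROME_HEIGHT = 60  # pixels reserved at the top of the canvas

def page_to_screen_y(page_y, scrolly):
    # What render does: shift content up by the scroll, down by the chrome.
    return page_y - scrolly + CHROME_HEIGHT

def screen_to_page(x, y, scrolly):
    # What handle_click does: the inverse of page_to_screen_y.
    return x, y + scrolly - CHROME_HEIGHT

# A click just below the chrome on an unscrolled page hits the page top.
top_click = screen_to_page(10, 60, 0)

# Converting a page position to the screen and back should round-trip.
round_trip = screen_to_page(5, page_to_screen_y(300, 120), 120)
```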

The browser chrome area is now our playbox. Let’s add an address bar:

self.canvas.create_rectangle(10, 10, 790, 50)
self.canvas.create_text(15, 15, anchor='nw', text=self.url)

I’d like to tweak this a little to make the results look passable, tweaking the font and the size of things.

Browser history

The back button is another classic browser feature our browser really needs. I'll start by drawing the back button itself:

self.canvas.create_rectangle(10, 10, 35, 50)
self.canvas.create_polygon(15, 30, 30, 15, 30, 45, fill='black')

In Tk, create_polygon takes a list of coordinates and connects them into a shape. Here I've got three points that form a simple triangle evocative of a back button. You'll need to shrink the address bar so that it doesn't overlap this new back button.

Now we need to detect when that button is clicked on. This will go in handle_click, which must now have two cases, for clicks in the chrome and clicks in the page:

def handle_click(self, e):
    if e.y < 60: # Browser chrome
        if 10 <= e.x < 35 and 10 <= e.y < 50:
            self.go_back()
    else: # Page content
        # ...
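
The same dispatch, pulled out into a standalone function so the regions are easy to test. The classify_click name and the two constants are hypothetical; the rectangle matches the coordinates passed to create_rectangle for the back button.

```python
CHROME_HEIGHT = 60
BACK_BUTTON = (10, 10, 35, 50)  # x1, y1, x2, y2

def classify_click(x, y):
    """Return "back", "chrome", or "page" for a click at screen (x, y)."""
    if y < CHROME_HEIGHT:
        x1, y1, x2, y2 = BACK_BUTTON
        if x1 <= x < x2 and y1 <= y < y2:
            return "back"
        return "chrome"   # elsewhere in the chrome: nothing to do yet
    return "page"
```

Keeping the chrome test first means a click in the top 60 pixels never reaches the page, even though page content is drawn there and then covered up.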

How should self.go_back() work? Well, to begin with, we'll need to store the history of how the browser got to the current page. I'll add a history field to Browser, initialized to an empty list, and have browse append to it when navigating to a page. The self.url field now becomes the last element of self.history:

def browse(self, url):
    self.history.append(url)
    # ...

Now self.go_back() knows where to go:

def go_back(self):
    if len(self.history) > 1:
        self.browse(self.history[-2])

This is almost correct, but if you click the back button twice, you'll go forward instead, because browse has appended to the history. Instead, we need to do something more like:

def go_back(self):
    if len(self.history) > 1:
        self.history.pop()
        back = self.history.pop()
        self.browse(back)
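
To convince yourself of the difference, here are both versions side by side on a stub. MiniBrowser is hypothetical: it keeps only the history logic and does no networking or drawing.

```python
class MiniBrowser:
    """Hypothetical stub: only the history logic."""
    def __init__(self):
        self.history = []

    def browse(self, url):
        self.history.append(url)

    def go_back_buggy(self):
        if len(self.history) > 1:
            self.browse(self.history[-2])

    def go_back(self):
        if len(self.history) > 1:
            self.history.pop()          # drop the current page
            back = self.history.pop()   # browse will re-append this one
            self.browse(back)

def visit_then_back_twice(go_back_name):
    b = MiniBrowser()
    for url in ["a", "b", "c"]:
        b.browse(url)
    getattr(b, go_back_name)()
    getattr(b, go_back_name)()
    return b.history

buggy = visit_then_back_twice("go_back_buggy")
fixed = visit_then_back_twice("go_back")
```

After visiting a, b, and c, two presses of the buggy back button land on c again, while the fixed version ends up on a.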

I’d like to add a forward button too, which requires the history list to contain a cursor.
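
One way the cursor idea could work is sketched below. The History class is hypothetical and separate from Browser; in a real implementation, back and forward would also need to load the page without pushing a new entry.

```python
class History:
    """Hypothetical helper: a list of URLs plus a cursor into it."""
    def __init__(self):
        self.urls = []
        self.cursor = -1   # index of the current page; -1 means none yet

    def visit(self, url):
        # Following a link discards any "forward" entries past the cursor.
        del self.urls[self.cursor + 1:]
        self.urls.append(url)
        self.cursor += 1

    def current(self):
        return self.urls[self.cursor] if self.cursor >= 0 else None

    def back(self):
        if self.cursor > 0:
            self.cursor -= 1
        return self.current()

    def forward(self):
        if self.cursor < len(self.urls) - 1:
            self.cursor += 1
        return self.current()

h = History()
for url in ["a", "b", "c"]:
    h.visit(url)
steps = [h.back(), h.back(), h.forward()]   # c -> b -> a -> b
h.visit("d")                                # replaces the forward entry "c"
```

Because going back only moves the cursor, the entries after it survive until the user follows a new link, which is what makes a forward button possible.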

Summary

It's been a lot of work just to handle links! We have totally re-done line and text layout. That's allowed us to determine which piece of text a user clicked on, which allows us to determine what link they've clicked on and where that link goes. And as a cherry on top, we've implemented a simple browser chrome, which displays the URL of the current page and allows the user to go back.

Exercises