Sending Information to Servers | Web Browser Engineering

So far, our browser has seen the web as read-only—but when you post on Facebook, fill out a survey, or search Google, you’re sending information to servers as well as receiving information from them. In this chapter, we’ll start to transform our browser into a platform for web applications by building out support for HTML forms, the simplest way for a browser to send information to a server.

How Forms Work

First, in HTML there is a form element, which contains input elements, which in turn can be edited by the user. (There are other elements similar to input, such as select and textarea. They work similarly enough; they just represent different kinds of user controls, like dropdowns and multi-line inputs.) So a form might be written like this (see results in Figure 1):

This form contains two text entry boxes called name and comment. When the user goes to this page, they can click on those boxes to edit their values. Then, when they click the button at the end of the form, the browser collects all of the name–value pairs and bundles them into an HTTP POST request (as indicated by the method attribute), sent to the URL given by the form element’s action attribute, with the usual rules of relative URLs—so in this case, /submit. The POST request looks like this:
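To make the shape of that request concrete, here is a minimal Python sketch that assembles it. It assumes, purely for illustration, that the two boxes hold the values 1 and 2; the helper name build_post_request is hypothetical and is not part of the browser.

```python
def build_post_request(path, fields):
    # Join the name=value pairs with ampersands to form the request body.
    body = "&".join(name + "=" + value for name, value in fields)
    # Request line and headers, then a blank line, then the body.
    request = "POST {} HTTP/1.0\r\n".format(path)
    request += "Content-Length: {}\r\n".format(len(body.encode("utf8")))
    request += "\r\n"
    request += body
    return request

# Assumed field values: name=1 and comment=2.
example_request = build_post_request(
    "/submit", [("name", "1"), ("comment", "2")])
print(example_request)
```

The body here is name=1&comment=2, which is 16 bytes long, so the sketch emits a Content-Length of 16.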

In other words, it’s a lot like the regular GET requests we’ve already seen, except that it has a body—you’ve already seen HTTP responses with bodies, but requests can have them too. Note the Content-Length header; it’s mandatory for POST requests. The server responds to this request with a web page, just like normal, and the browser then does everything it normally does.

Implementing forms requires extending many parts of the browser, from implementing HTTP POST through new layout objects that draw input elements to handling button clicks. That makes it a great starting point for transforming our browser into an application platform, our goal for the next few chapters. Let’s get started implementing it all!

Rendering Widgets

First, let’s draw the input areas that the user will type into. (Most applications use OS libraries to draw input areas, so that those input areas look like other applications on that OS. But browsers need a lot of control over application styling, so they often draw their own input areas.) Input areas are inline content, laid out in lines next to text. So to support inputs we’ll need a new kind of layout object, which I’ll call InputLayout. We can copy TextLayout and use it as a template, though we’ll need to make some quick edits.

The input and button elements need to be visually distinct so the user can find them easily. Our browser’s styling capabilities are limited, so let’s use background color to do that:

Note that <button> elements can in principle contain complex HTML, not just a text node. That’s too complicated for this chapter, so I’m having the browser print a warning and skip the text in that case (see Exercise 8-8). Finally, we draw that text:

By this point in the book, you’ve seen many layout objects, so I’m glossing over these changes. The point is that new layout objects are one common way to extend the browser.

Finally, this new input method is similar to the text method, creating a new layout object and adding it to the current line. (It’s so similar, in fact, that they only differ in how they compute w. I’ll resist the temptation to refactor this code until we get to Chapter 15.)

But actually, there are a couple more complications due to the way we decided to resolve the block-mixed-with-inline-siblings problem (see Chapter 5). One is that if there are no children for a node, we assume it’s a block element. But <input> elements don’t have children, yet must have inline layout or else they won’t draw correctly. Likewise, a <button> does have children, but they are treated specially. (This situation is specific to these elements in our browser, but only because they are the only elements with special painting behavior within an inline context. These are also two examples of atomic inlines.)

We can fix that with this change to layout_mode to add a second condition for returning “inline”:
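As a rough sketch of that change, the function below adds the extra condition. It is written as a standalone function over a node rather than as a BlockLayout method, and the Text and Element classes and the shortened BLOCK_ELEMENTS list are simplified stand-ins assumed for illustration.

```python
# Abbreviated stand-in for the browser's list of block-level tags.
BLOCK_ELEMENTS = ["html", "body", "div", "p", "form", "h1", "ul", "li"]

class Text:
    def __init__(self, text, parent):
        self.text = text
        self.children = []
        self.parent = parent

class Element:
    def __init__(self, tag, attributes, parent):
        self.tag = tag
        self.attributes = attributes
        self.children = []
        self.parent = parent

def layout_mode(node):
    if isinstance(node, Text):
        return "inline"
    elif any(isinstance(child, Element) and child.tag in BLOCK_ELEMENTS
             for child in node.children):
        return "block"
    # Second condition: an <input> has no children but is still inline.
    elif node.children or node.tag == "input":
        return "inline"
    else:
        return "block"

print(layout_mode(Element("input", {}, None)))  # childless, yet inline
print(layout_mode(Element("div", {}, None)))    # childless, so block
```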

The second problem is that, again due to having block siblings, sometimes an <input> or <button> element will create a BlockLayout (which will then create an InputLayout inside). In this case we don’t want to paint the background twice, so let’s add some simple logic to skip painting it in BlockLayout in this case, via a new should_paint method. (Recall from Chapter 5 that we only get into this situation due to the presence of anonymous block boxes. Also, it’s worth noting that there are various other ways that our browser does not fully implement all the complexities of inline painting; one example is that it does not correctly paint nested inlines with different background colors. Inline layout and paint are very complicated in real browsers.)

Add a trivial should_paint method that just returns True to all of the other layout object types. Now we can skip painting objects based on should_paint:
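The sketch below shows one way these pieces could fit together. The layout classes are heavily trimmed stand-ins (no sizes, positions, or real drawing commands), and the strings in the display list are placeholders assumed for illustration.

```python
class Text:
    def __init__(self, text):
        self.text = text

class Element:
    def __init__(self, tag):
        self.tag = tag

class BlockLayout:
    def __init__(self, node, children=()):
        self.node = node
        self.children = list(children)

    def should_paint(self):
        # Inputs and buttons are painted by their InputLayout child,
        # so the enclosing BlockLayout must not paint them again.
        return isinstance(self.node, Text) or \
            (self.node.tag != "input" and self.node.tag != "button")

    def paint(self):
        return ["block background"]  # placeholder drawing command

class InputLayout:
    def __init__(self, node):
        self.node = node
        self.children = []

    def should_paint(self):
        return True  # the trivial version used by other layout types

    def paint(self):
        return ["input background"]  # placeholder drawing command

def paint_tree(layout_object, display_list):
    if layout_object.should_paint():
        display_list.extend(layout_object.paint())
    for child in layout_object.children:
        paint_tree(child, display_list)

button = Element("button")
display_list = []
paint_tree(BlockLayout(button, [InputLayout(button)]), display_list)
print(display_list)
```

In this example only the InputLayout contributes to the display list, so the button's background is painted once.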

With these changes the browser should now draw input and button elements as blue and orange rectangles.

Interacting with Widgets

We’ve got input elements rendering, but you can’t edit their contents yet. But of course that’s the whole point! So let’s make input elements work like the address bar does—clicking on one will clear it and let you type into it.

However, if you try this, you’ll notice that clicking does not actually clear the input element. That’s because the code above updates the HTML tree—but we need to update the layout tree and then the display list for the change to appear on the screen.

Right now, the layout tree and display list are computed in load, but we don’t want to reload the whole page; we just want to redo the styling, layout, paint and draw phases. Together these are called rendering. So let’s extract these phases into a new Tab method, render:

For this code to work, you’ll also need to change nodes and rules from local variables in the load method to new fields on a Tab. Note that styling moved from load to render, but downloading the style sheets didn’t: we don’t re-download the style sheets every time you type! (Actually, some changes to the web page could delete existing link nodes or create new ones. Real browsers respond to this correctly, either removing the rules corresponding to deleted link nodes or downloading new style sheets when new link nodes are created. This is tricky to get right, and typing into an input area definitely can’t make such changes, so let’s skip this in our browser.)

Now when we click an input element and clear its contents, we should call render to redraw the page with the input cleared. We also need to call render if we clicked off an input element, since we might have unfocused an input element in the process:

So that’s clicking in an input area. But typing is harder. Think back to how we implemented the address bar in Chapter 7: we added a focus field that remembered what we clicked on so we could later send it our key presses. We need something like that focus field for input areas, but it’s going to be more complex because the input areas live inside a Tab, not inside the Browser.

Naturally, we will need a focus field on each Tab, to remember which text entry (if any) we’ve recently clicked on:

Now when we click on an input element, we need to set focus (and clear focus if nothing was found to focus on):

But remember that keyboard input isn’t handled by the Tab—it’s handled by the Browser. So how does the Browser even know when keyboard events should be sent to the Tab? The Browser has to remember that in its own focus field!

In other words, when you click on the web page, the Browser updates its focus field to remember that the user is interacting with the page, not the browser chrome. And if so, it should unfocus (“blur”) the browser chrome:

The if branch that corresponds to clicks in the browser chrome unsets focus, meaning focus is no longer on the page contents, and key presses will thus be sent to the Chrome.

When a key press happens, the Browser either sends it to the address bar or calls the active tab’s keypress method (or neither, if nothing is focused):

Here I’ve changed keypress to return True if the browser chrome consumed the key:

That keypress method then uses the tab’s focus field to put the character in the right text entry:

Note that here we call render instead of draw, because we’ve modified the web page and thus need to regenerate the display list instead of just redrawing it to the screen.
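The following sketch pulls the focus logic of this section together in one place. Everything except the focus bookkeeping is stubbed out: the classes are trimmed stand-ins, render only counts its calls, and handle_key takes a plain character instead of an event object, all of which are simplifications assumed for illustration.

```python
class Element:
    def __init__(self, tag, attributes):
        self.tag = tag
        self.attributes = attributes

class Chrome:
    def __init__(self):
        self.focus = None
        self.address_bar = ""

    def keypress(self, char):
        # Return True only if the chrome consumed the key.
        if self.focus == "address bar":
            self.address_bar += char
            return True
        return False

    def blur(self):
        self.focus = None

class Tab:
    def __init__(self):
        self.focus = None       # the focused input element, if any
        self.render_count = 0

    def render(self):
        self.render_count += 1  # stand-in for style, layout, and paint

    def keypress(self, char):
        if self.focus:
            self.focus.attributes["value"] += char
            self.render()

class Browser:
    def __init__(self):
        self.chrome = Chrome()
        self.active_tab = Tab()
        self.focus = None       # "content" when the page has focus

    def handle_key(self, char):
        if self.chrome.keypress(char):
            return
        elif self.focus == "content":
            self.active_tab.keypress(char)

browser = Browser()
field = Element("input", {"value": ""})
browser.focus = "content"          # as if the user clicked on the page
browser.active_tab.focus = field   # ...and on this input element
browser.handle_key("h")
browser.handle_key("i")
print(field.attributes["value"])
```

Each layer only decides whether the key belongs to it or to the layer below, which is what lets the same pattern nest to any depth.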

Hierarchical focus handling is an important pattern for combining graphical widgets; in a real browser, where web pages can be embedded into one another with iframes, the focus tree can be arbitrarily deep. (The iframe element allows you to embed one web page into another as a little window. We’ll talk about this more in Chapter 15.)

So now we have user input working with input elements. Before we move on, there is one last tweak that we need to make: drawing the text cursor in the Tab’s render method. This turns out to be harder than expected: the cursor should be drawn by the InputLayout of the focused node, and that means that each node has to know whether or not it’s focused:

Add the same field to Text nodes; they’ll never be focused and never draw cursors, but it’s more convenient if Text and Element have the same fields. We’ll set this when we move focus to an input element:

Note that we have to un-focus the currently focused element, lest it keep drawing its cursor. Anyway, now we can draw a cursor if an input element is focused:

Now you can click on a text entry, type into it, and modify its value. The next step is submitting the now-filled-out form.

Submitting Forms

You submit a form by clicking on a button. So let’s add another condition to the big while loop in click:

Once we’ve found the button, we need to find the form that it’s in by walking up the HTML tree:
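A minimal sketch of that upward walk is shown below. The helper name find_enclosing_form is hypothetical, and the Element class is a trimmed stand-in; in the browser this logic would sit inside the click handler rather than in a separate function.

```python
class Element:
    def __init__(self, tag, attributes, parent):
        self.tag = tag
        self.attributes = attributes
        self.children = []
        self.parent = parent
        if parent:
            parent.children.append(self)

def find_enclosing_form(elt):
    # Follow parent pointers until we reach a form with an action.
    while elt:
        if elt.tag == "form" and "action" in elt.attributes:
            return elt
        elt = elt.parent
    return None

form = Element("form", {"action": "/submit", "method": "post"}, None)
paragraph = Element("p", {}, form)
button = Element("button", {}, paragraph)
print(find_enclosing_form(button).attributes["action"])
```

A button that is not inside any form simply yields None, in which case there is nothing to submit.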

The submit_form method is then in charge of finding all of the input elements, encoding them in the right way, and sending the POST request. First, we look through all the descendants of the form to find input elements:

For each of those input elements, we need to extract the name attribute and the value attribute, and form encode both of them. Form encoding is how the name–value pairs are formatted in the HTTP POST request. Basically, it is: name, then equal sign, then value; and name–value pairs are separated by ampersands:

Here, body initially has an extra & tacked on to the front, which is removed on the last line.

Now, any time you see special syntax like this, you’ve got to ask: what if the name or the value has an equal sign or an ampersand in it? So in fact, “percent encoding” replaces all special characters with a percent sign followed by those characters’ hex codes. For example, a space becomes %20 and an ampersand becomes %26. Python provides a percent-encoding function as quote in the urllib.parse module. (You can write your own percent_encode function using Python’s ord and hex functions if you like. I’m using the standard function for expediency. In Chapter 1, using these library functions would have obscured key concepts, but by this point percent encoding is necessary but not conceptually interesting.)
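Putting the last few steps together, a sketch of the encoding half of submit_form might look like this. The function name form_encode and the trimmed Element class are assumptions for illustration; the sketch also treats a missing value attribute as an empty string.

```python
import urllib.parse

class Element:
    def __init__(self, tag, attributes, parent):
        self.tag = tag
        self.attributes = attributes
        self.children = []
        self.parent = parent
        if parent:
            parent.children.append(self)

def tree_to_list(tree, out):
    # Flatten a tree into a list, parents before children.
    out.append(tree)
    for child in tree.children:
        tree_to_list(child, out)
    return out

def form_encode(form):
    inputs = [node for node in tree_to_list(form, [])
              if isinstance(node, Element)
              and node.tag == "input"
              and "name" in node.attributes]
    body = ""
    for node in inputs:
        name = urllib.parse.quote(node.attributes["name"])
        value = urllib.parse.quote(node.attributes.get("value", ""))
        body += "&" + name + "=" + value
    # Drop the extra & tacked onto the front.
    return body[1:]

form = Element("form", {"action": "/submit"}, None)
Element("input", {"name": "name", "value": "Ada Lovelace"}, form)
Element("input", {"name": "comment", "value": "a=b&c"}, form)
print(form_encode(form))
```

The equal sign and ampersand inside the second value are percent-encoded, so they can no longer be confused with the separators.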

Now that submit_form has built a request body, it needs to make a POST request. I’m going to defer that responsibility to the load function, which handles making requests:

In request, this new argument is used to decide between a GET and a POST request:

Note that the Content-Length is the length of the payload in bytes, which might not be equal to its length in letters, because characters from many languages take up multiple bytes. Finally, after the headers, we send the payload itself:
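As a sketch of how the request text could be built, the function below chooses the method from the payload and measures the length in bytes. It is a standalone helper with hypothetical name build_request; the real request method also opens the socket and reads the response, which is omitted here.

```python
def build_request(host, path, payload=None):
    # A payload means a form submission, so use POST; otherwise GET.
    method = "POST" if payload else "GET"
    request = "{} {} HTTP/1.0\r\n".format(method, path)
    request += "Host: {}\r\n".format(host)
    if payload:
        # Count bytes, not characters.
        length = len(payload.encode("utf8"))
        request += "Content-Length: {}\r\n".format(length)
    request += "\r\n"
    if payload:
        request += payload
    return request

# "caf\u00e9" is four characters but five bytes in UTF-8.
print(build_request("example.org", "/add", "guest=caf\u00e9"))
```

In practice a form payload is already percent-encoded and therefore plain ASCII; the unencoded example string is only there to show the byte count differing from the character count.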

So that’s how the POST request gets sent. Then the server responds with an HTML page and the browser will render it in the totally normal way. (Actually, because browsers treat going “back” to a POST-requested page specially, as Exercise 8-5 explores, it’s common to respond to a POST request with a redirect.) That’s basically it for forms!

How Web Apps Work

So … how do web applications (web apps) use forms? When you use an application from your browser, whether you are registering to vote, looking at pictures of your baby cousin, or checking your email, there are typically two programs involved: client code that runs in the browser, and server code that runs on the server. (Here I’m talking in general terms. There are some browser applications without a server, and others where the client code is exceptionally simple and almost all the code is on the server.) When you click on things or take actions in the application, that runs client code, which then sends data to the server via HTTP requests.

For example, imagine a simple message board application. The server stores the state of the message board—who has posted what—and has logic for updating that state. But all the actual interaction with the page—drawing the posts, letting the user enter new ones—happens in the browser. Both components are necessary.

The browser and the server interact over HTTP. The browser first makes a GET request to the server to load the current message board. The user interacts with the browser to type a new post, and submits it to the server (say, via a form). That causes the browser to make a POST request to the server, which instructs the server to update the message board state. The server then needs the browser to update what the user sees; with forms, the server sends a new HTML page in its response to the POST request. This process is shown in Figure 2.

Forms are a simple, minimal introduction to this cycle of request and response and make a good introduction to how browser applications work. They’re also implemented in every browser and have been around for decades. These days many web applications use the form elements, but replace synchronous POST requests with asynchronous ones driven by JavaScript, which makes applications snappier by hiding the time to make the HTTP request. (In the early 2000s, the adoption of asynchronous HTTP requests sparked the wave of innovative new web applications called Web 2.0.) In return for that snappiness, that JavaScript code must now handle errors, validate inputs, and indicate loading time. In any case, both synchronous and asynchronous uses of forms are based on the same principles of client and server code.

Receiving POST Requests

To better understand the request/response cycle, let’s write a simple web server. It’ll implement an online guest book, kind of like an open, anonymous comment thread. (Guest books were very hip in the 1990s: comment threads from before there was anything to comment on.) Now, this is a book on web browser engineering, so I won’t discuss web server implementation that thoroughly. But I want you to see how the server side of an application works.

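A web server is a separate program from the web browser, so let’s start a new file. The server will need to open a socket and listen for connections, parse the HTTP requests it receives, and respond to those requests with an HTML web page.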

Let’s start by opening a socket. Like for the browser, we need to create an internet streaming socket using TCP:

The setsockopt call is optional. Normally, when a program has a socket open and it crashes, your OS prevents that port from being reused for a short period. (When your process crashes, the computer on the other end of the connection won’t be informed immediately; if some other process opens the same port, it could receive data meant for the old, now-dead process.) That’s annoying when developing a server; calling setsockopt with the SO_REUSEADDR option allows the OS to immediately reuse the port.

Now, with this socket, instead of calling connect (to connect to some other server), we’ll call bind, which waits for other computers to connect:

Let’s look at the bind call first. Its first argument says who should be allowed to make connections to the server; the empty string means that anyone can connect. The second argument is the port others must use to talk to our server; I’ve chosen 8000. I can’t use 80, because ports below 1024 require administrator privileges, but you can pick something other than 8000 if, for whatever reason, port 8000 is taken on your machine.

Finally, after the bind call, the listen call tells the OS that we’re ready to accept connections.
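The socket setup described over the last few paragraphs can be sketched as a single function. The name open_server_socket and its port parameter are assumptions made so the steps can be shown together; the calls themselves are the standard ones from Python’s socket module.

```python
import socket

def open_server_socket(port=8000):
    # An internet streaming socket using TCP, as in the browser.
    s = socket.socket(
        family=socket.AF_INET,
        type=socket.SOCK_STREAM,
        proto=socket.IPPROTO_TCP,
    )
    # Optional: let the OS reuse the port right after a crash.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # The empty string means anyone may connect.
    s.bind(("", port))
    # Tell the OS we are ready to accept connections.
    s.listen()
    return s
```

Calling open_server_socket() with no arguments uses port 8000; passing 0 instead asks the OS to pick any free port, which is handy for quick experiments.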

To actually accept those connections, we enter a loop that runs once per connection. At the top of the loop we call s.accept to wait for a new connection:

That connection object is, confusingly, also a socket: it is the socket corresponding to that one connection. We know what to do with those: we read the contents and parse the HTTP message. But it’s a little trickier in the server than in the browser, because the server can’t just read from the socket until the connection closes—the browser is waiting for the server and won’t close the connection.

So, we’ve got to read from the socket line by line. First, we read the request line:

Then we read the headers until we get to a blank line, accumulating the headers in a dictionary:

Finally we read the body, but only when the Content-Length header tells us how much of it to read (that’s why that header is mandatory on POST requests):
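These three reading steps might be sketched as follows. The function name read_request is hypothetical, and it takes a binary file-like object (such as the result of calling makefile("b") on the connection socket) so that it can be tried out on an in-memory buffer; stopping at end of input is an extra guard added in this sketch.

```python
import io

def read_request(req):
    # First, the request line: method, URL, and HTTP version.
    reqline = req.readline().decode("utf8")
    method, url, version = reqline.split(" ", 2)
    assert method in ["GET", "POST"]
    # Then the headers, up to the blank line.
    headers = {}
    while True:
        line = req.readline().decode("utf8")
        if line == "\r\n" or line == "":
            break
        header, value = line.split(":", 1)
        headers[header.casefold()] = value.strip()
    # Finally the body, but only if Content-Length says how much to read.
    if "content-length" in headers:
        length = int(headers["content-length"])
        body = req.read(length).decode("utf8")
    else:
        body = None
    return method, url, headers, body

raw = b"POST /add HTTP/1.0\r\nContent-Length: 8\r\n\r\nguest=hi"
parsed = read_request(io.BytesIO(raw))
print(parsed)
```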

Now the server needs to generate a web page in response. We’ll get to that later; for now, just abstract that away behind a do_request call:
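Once do_request has produced a status and a page, the server still has to turn them into an HTTP response. One possible sketch, with the hypothetical helper name format_response, sends only a status line and a Content-Length header before the body:

```python
def format_response(status, body):
    # Status line, a single Content-Length header, blank line, body.
    response = "HTTP/1.0 {}\r\n".format(status)
    response += "Content-Length: {}\r\n".format(len(body.encode("utf8")))
    response += "\r\n" + body
    # Sockets send bytes, so encode the whole response.
    return response.encode("utf8")

print(format_response("200 OK", "<p>hi</p>"))
```

The returned bytes could then be passed to the connection socket’s send method before closing the connection.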

The architecture is summarized in Figure 3. Our implementation is all pretty bare-bones: our server doesn’t check that the browser is using HTTP 1.0 to talk to it, it doesn’t send back any headers at all except Content-Length, it doesn’t support TLS, and so on. Again: this is a web browser book—it’ll do.

Generating Web Pages

So far, all of this server code is “boilerplate”—any web application will have similar code. What makes our server a guest book, on the other hand, depends on what happens inside do_request. It needs to store the guest book state, generate HTML pages, and respond to POST requests.

Let’s store guest book entries in a Python list. Usually web applications use persistent state, like a database, so that the server can be restarted without losing state, but our guest book need not be that resilient.
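A sketch of the guest book state and the page that displays it could look like the following. The starting entry and the button label are placeholder text assumed for illustration; the form’s add action and guest field name match the URL and parameter used later in this section.

```python
# Guest book state; the starting entry is placeholder text.
ENTRIES = ["First entry"]

def show_comments():
    out = "<!doctype html>"
    # One paragraph per guest book entry.
    for entry in ENTRIES:
        out += "<p>" + entry + "</p>"
    # A form that POSTs a single field called guest to /add.
    out += "<form action=add method=post>"
    out += "<p><input name=guest></p>"
    out += "<p><button>Sign the book!</button></p>"
    out += "</form>"
    return out

print(show_comments())
```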

This is definitely “minimal” HTML, so it’s a good thing our browser will insert implicit tags and has some default styles! You can test it out by running this minimal web server and, while it’s running, direct your browser to http://localhost:8000/, where localhost is what your computer calls itself and 8000 is the port we chose earlier. You should see one guest book entry.

By the way, while you’re debugging this web server, it’s probably better to use a real web browser, instead of this book’s browser, to interact with it. That way you don’t have to worry about browser bugs while you work on server bugs. But this server does support both real and toy browsers.

When this form is submitted, the browser will send a POST request to http://localhost:8000/add. So the server needs to react to these submissions. That means do_request will field two kinds of requests: regular browsing and form submissions. Let’s separate the two kinds of requests into different functions.

This then frees up the do_request function to figure out which function to call for which request:

When a POST request to /add comes in, the first step is to decode the request body:

Note that I use unquote_plus instead of unquote, because browsers may also use a plus sign to encode a space. The add_entry function then looks up the guest parameter and adds its content as a new guest book entry:
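Both steps can be sketched together as below. The sketch assumes the body is a well-formed, non-empty form encoding, and its add_entry only updates the list; regenerating the page afterward is left out.

```python
import urllib.parse

ENTRIES = []

def form_decode(body):
    params = {}
    # Pairs are separated by ampersands; each pair is name=value.
    for field in body.split("&"):
        name, value = field.split("=", 1)
        name = urllib.parse.unquote_plus(name)
        value = urllib.parse.unquote_plus(value)
        params[name] = value
    return params

def add_entry(params):
    # Only the guest parameter matters to the guest book.
    if "guest" in params:
        ENTRIES.append(params["guest"])

add_entry(form_decode("guest=Hello+world%21"))
print(ENTRIES)
```

Here the plus sign decodes to a space and %21 to an exclamation mark, which is the case that unquote alone would get wrong.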

I’ve also added a “404” response. Fitting the austere stylings of our guest book, here’s the 404 page:
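The dispatch logic and the 404 page might be sketched like this. The show_comments and add_entry functions here are one-line stand-ins so the example runs on its own, and returning a status string alongside the page is an assumption of this sketch.

```python
def show_comments():
    return "<!doctype html><p>guest book page</p>"  # stand-in page

def add_entry(body):
    return show_comments()  # stand-in: would decode and store the entry

def not_found(url, method):
    out = "<!doctype html>"
    out += "<h1>{} {} not found!</h1>".format(method, url)
    return out

def do_request(method, url, headers, body):
    # Regular browsing, form submission, or anything else.
    if method == "GET" and url == "/":
        return "200 OK", show_comments()
    elif method == "POST" and url == "/add":
        return "200 OK", add_entry(body)
    else:
        return "404 Not Found", not_found(url, method)

print(do_request("GET", "/missing", {}, None))
```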

Try it! You should be able to restart the server, open it in your browser, and update the guest book a few times. You should also be able to use the guest book from a real web browser.

Summary

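With this chapter we’re starting to transform our browser into an application platform. We’ve added layout objects for input areas and buttons; code to click on them, focus them, and type into them; hierarchical focus handling; and code to submit forms and send them to a server.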

Plus, our browser now has a little web server friend. That’s going to be handy as we add more interactive features to the browser.

Outline

The complete set of functions, classes, and methods in our browser should now look something like this:

class URL:
    def __init__(url)

    def request(payload)

    def resolve(url)

    def __str__()

class Text:
    def __init__(text, parent)

    def __repr__()

class Element:
    def __init__(tag, attributes, parent)

    def __repr__()

def print_tree(node, indent)

def tree_to_list(tree, list)

class HTMLParser:
    SELF_CLOSING_TAGS

    HEAD_TAGS

    def __init__(body)

    def parse()

    def get_attributes(text)

    def add_text(text)

    def add_tag(tag)

    def implicit_tags(tag)

    def finish()

class CSSParser:
    def __init__(s)

    def whitespace()

    def literal(literal)

    def word()

    def ignore_until(chars)

    def pair()

    def selector()

    def body()

    def parse()

class TagSelector:
    def __init__(tag)

    def matches(node)

class DescendantSelector:
    def __init__(ancestor, descendant)

    def matches(node)

FONTS

def get_font(size, weight, style)

DEFAULT_STYLE_SHEET

INHERITED_PROPERTIES

def style(node, rules)

def cascade_priority(rule)

WIDTH, HEIGHT

HSTEP, VSTEP

class Rect:
    def __init__(left, top, right, bottom)

    def contains_point(x, y)

INPUT_WIDTH_PX

BLOCK_ELEMENTS

class DocumentLayout:
    def __init__(node)

    def layout()

    def should_paint()

    def paint()

class BlockLayout:
    def __init__(node, parent, previous)

    def layout_mode()

    def layout()

    def recurse(node)

    def new_line()

    def word(node, word)

    def input(node)

    def self_rect()

    def should_paint()

    def paint()

class LineLayout:
    def __init__(node, parent, previous)

    def layout()

    def should_paint()

    def paint()

class TextLayout:
    def __init__(node, word, parent, previous)

    def layout()

    def should_paint()

    def paint()

class InputLayout:
    def __init__(node, parent, previous)

    def layout()

    def should_paint()

    def paint()

    def self_rect()

class DrawText:
    def __init__(x1, y1, text, font, color)

    def execute(scroll, canvas)

class DrawRect:
    def __init__(rect, color)

    def execute(scroll, canvas)

class DrawLine:
    def __init__(x1, y1, x2, y2, color, thickness)

    def execute(scroll, canvas)

class DrawOutline:
    def __init__(rect, color, thickness)

    def execute(scroll, canvas)

def paint_tree(layout_object, display_list)

SCROLL_STEP

class Tab:
    def __init__(tab_height)

    def load(url, payload)

    def render()

    def draw(canvas, offset)

    def scrolldown()

    def click(x, y)

    def go_back()

    def submit_form(elt)

    def keypress(char)

class Chrome:
    def __init__(browser)

    def tab_rect(i)

    def paint()

    def click(x, y)

    def keypress(char)

    def enter()

    def blur()

class Browser:
    def __init__()

    def draw()

    def new_tab(url)

    def handle_down(e)

    def handle_click(e)

    def handle_key(e)

    def handle_enter(e)

def handle_connection(conx)

def do_request(method, url, headers, body)

def form_decode(body)

ENTRIES

def show_comments()

def not_found(url, method)

def add_entry(params)

Exercises

8-1 Enter key. In most browsers, if you hit the “Enter” or “Return” key while inside a text entry, that submits the form that the text entry was in. Add this feature to your browser.

8-2 GET forms. Forms can be submitted via GET requests as well as POST requests. In GET requests, the form-encoded data is pasted onto the end of the URL, separated from the path by a question mark, like /search?q=hi; GET form submissions have no body. Implement GET form submissions.

8-3 Blurring. Right now, if you click inside a text entry, and then inside the address bar, two cursors will appear on the screen. To fix this, add a blur method to each Tab which unfocuses anything that is focused, and call it before changing focus.

8-4 Check boxes. In HTML, input elements have a type attribute. When set to checkbox, the input element looks like a checkbox; it’s checked if the checked attribute is set, and unchecked otherwise. (Technically, the checked attribute only affects the state of the checkbox when the page loads; checking and unchecking a checkbox does not affect this attribute but instead manipulates internal state.) When the form is submitted, a checkbox’s name=value pair is included only if the checkbox is checked. (If the checkbox has no value attribute, the default is the string on.)

8-5 Resubmit requests. One reason to separate GET and POST requests is that GET requests are supposed to be idempotent (read-only, basically) while POST requests are assumed to change the web server state. That means that going “back” to a GET request (making the request again) is safe, while going “back” to a POST request is a bad idea. Change the browser history to record what method was used to access each URL, and the POST body if one was used. When you go back to a POST-ed URL, ask the user if they want to resubmit the form. Don’t go back if they say no; if they say yes, submit a POST request with the same body as before.

8-6 Message board. Right now our web server is a simple guest book. Extend it into a simple message board by adding support for topics. Each topic should have its own URL and its own list of messages. So, for example, /cooking should be a page of posts (about cooking) and comments submitted through the form on that page should only show up when you go to /cooking, not when you go to /cars. Make the home page, at /, list the available topics with a link to each topic’s page. Make it possible for users to add new topics.

8-7 Persistence. Back the server’s list of guest book entries with a file, so that when the server is restarted it doesn’t lose data.

8-8 Rich buttons. Make it possible for a button to contain arbitrary elements as children, and render them correctly. The children should be contained inside the button instead of spilling out—this can make a button really tall. Think about edge cases, like a button that contains another button, an input area, or a link, and test real browsers to see what they do.

8-9 HTML chrome. Browser chrome is quite complicated in real browsers, with tricky details such as font sizes, padding, outlines, shadows, icons and so on. This makes it tempting to try to reuse our layout engine for it. Implement this, using <button> elements for the new tab and back buttons, an <input> element for the address bar, and <a> elements for the tab names. It won’t look exactly the same as the current chrome (outline will have to wait for Chapter 14, for example), but if you adjust the default CSS you should be able to make it look passable. (Real browsers have in fact gone down this implementation path multiple times, building layout engines for the browser chrome that are heavily inspired by or reuse pieces of the main web layout engine. Firefox had one, and Chrome has one. However, because it’s so important for the browser chrome to be very fast and responsive to draw, such approaches have had mixed success.)