Applying User Styles


In the last chapter, we gave each pre element a gray background. It looks OK, and it is good to have defaults… but of course sites want a say in how they look. Web sites do that with Cascading Style Sheets, which allow web authors (and, as we’ll see, browser developers) to define how a web page ought to look.

Parsing with functions

One way a web page can change its appearance is with the style attribute. For example, this changes an element’s background color:

<div style="background-color:lightblue"></div>

More generally, a style attribute contains property/value pairs separated by semicolons. The browser looks at those property-value pairs to determine how an element looks, for example to determine its background color.

To add this to our browser, we’ll need to start by parsing these property/value pairs. I’ll use recursive parsing functions, which are a good way to build a complex parser step by step. The idea is that each parsing function advances through the text being parsed and returns the data it parsed. We’ll have different functions for different types of data, and organize them in a CSSParser class that stores the text being parsed and the parser’s current position in it:

class CSSParser:
    def __init__(self, s):
        self.s = s
        self.i = 0

Let’s start small and build up. A parsing function for whitespace increments the index i past every whitespace character:

def whitespace(self):
    while self.i < len(self.s) and self.s[self.i].isspace():
        self.i += 1

Whitespace is insignificant, so there’s no data to return in this case. On the other hand, we’ll want to return property names and values when we parse them:

def word(self):
    start = self.i
    while self.i < len(self.s):
        if self.s[self.i].isalnum() or self.s[self.i] in "#-.%":
            self.i += 1
        else:
            break
    assert self.i > start
    return self.s[start:self.i]

This function increments i through any word characters, much like whitespace. But to return the parsed data, it stores where it started and extracts the substring it moved through. (I’ve chosen the set of word characters here to cover property names, which use letters and the dash; numbers, which use the minus sign, digits, and periods; units, which use the percent sign; and colors, which use the hash sign. Real CSS values have a more complex syntax, but this is enough for our toy browser.)

Parsing functions can fail. The word function we just wrote has an assertion to check that i advanced through at least one character; otherwise it didn’t point at a word to begin with. (You can add error text to the assertion, too; I recommend doing that to help you debug problems.) Likewise, to check for a literal colon (or some other punctuation character) you’d do this:

def literal(self, literal):
    assert self.i < len(self.s) and self.s[self.i] == literal
    self.i += 1
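To see how these three functions cooperate, here is a small self-contained sketch that steps through one property-value pair by hand. The class simply collects the functions above; the assertion messages and the input string are my own additions for illustration.

```python
class CSSParser:
    def __init__(self, s):
        self.s = s
        self.i = 0

    def whitespace(self):
        while self.i < len(self.s) and self.s[self.i].isspace():
            self.i += 1

    def word(self):
        start = self.i
        while self.i < len(self.s):
            if self.s[self.i].isalnum() or self.s[self.i] in "#-.%":
                self.i += 1
            else:
                break
        # Error text makes a failed parse much easier to track down
        assert self.i > start, "expected a word at index {}".format(start)
        return self.s[start:self.i]

    def literal(self, literal):
        assert self.i < len(self.s) and self.s[self.i] == literal, \
            "expected {!r} at index {}".format(literal, self.i)
        self.i += 1

# Example input, chosen only for illustration
parser = CSSParser("background-color : lightblue")
prop = parser.word()      # consumes "background-color"
parser.whitespace()       # skips the space before the colon
parser.literal(":")       # fails loudly if the colon is missing
parser.whitespace()
val = parser.word()       # consumes "lightblue"
print(prop, val, parser.i)
```

Each call leaves i just past what it consumed, which is what lets the next call pick up where the previous one stopped.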

The great thing about parsing functions is that they can build on one another. For example, a property-value pair is a property, a colon, and a value, with whitespace in between. (In reality properties and values have different syntaxes, so using word for both isn’t quite right, but for our browser’s limited CSS this simplification will do.) The pair function parses exactly that sequence:

def pair(self):
    prop = self.word()
    self.whitespace()
    self.literal(":")
    self.whitespace()
    val = self.word()
    return prop.lower(), val

We can parse sequences by calling parsing functions in a loop. For example, style attributes are a sequence of property-value pairs:

def body(self):
    pairs = {}
    while self.i < len(self.s):
        prop, val = self.pair()
        pairs[prop.lower()] = val
        self.whitespace()
        self.literal(";")
        self.whitespace()
    return pairs

Now, in a browser, we always have to think about handling errors. Sometimes a web page author makes a mistake; sometimes our browser doesn’t support a feature some other browser does. So we should skip property-value pairs that don’t parse, but keep the ones that do.

We can skip things with this little function; it stops at any one of a set of characters, and returns that character (or None if it was stopped by the end of the file):

def ignore_until(self, chars):
    while self.i < len(self.s):
        if self.s[self.i] in chars:
            return self.s[self.i]
        else:
            self.i += 1

When we fail to parse a property-value pair, we either skip to the next semicolon or to the end of the string:

def body(self):
    # ...
    while self.i < len(self.s):
        try:
            # ...
        except AssertionError:
            why = self.ignore_until([";"])
            if why == ";":
                self.literal(";")
                self.whitespace()
            else:
                break
    # ...
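Putting the pieces together, the following sketch fills in the elided parts of body with the loop from the earlier version, so you can watch a bad pair being skipped. It is one plausible assembly of the code in this section, and the style attribute string at the bottom is made up for the example.

```python
class CSSParser:
    def __init__(self, s):
        self.s = s
        self.i = 0

    def whitespace(self):
        while self.i < len(self.s) and self.s[self.i].isspace():
            self.i += 1

    def word(self):
        start = self.i
        while self.i < len(self.s):
            if self.s[self.i].isalnum() or self.s[self.i] in "#-.%":
                self.i += 1
            else:
                break
        assert self.i > start
        return self.s[start:self.i]

    def literal(self, literal):
        assert self.i < len(self.s) and self.s[self.i] == literal
        self.i += 1

    def pair(self):
        prop = self.word()
        self.whitespace()
        self.literal(":")
        self.whitespace()
        val = self.word()
        return prop.lower(), val

    def ignore_until(self, chars):
        while self.i < len(self.s):
            if self.s[self.i] in chars:
                return self.s[self.i]
            else:
                self.i += 1
        return None  # reached the end of the text

    def body(self):
        pairs = {}
        while self.i < len(self.s):
            try:
                prop, val = self.pair()
                pairs[prop.lower()] = val
                self.whitespace()
                self.literal(";")
                self.whitespace()
            except AssertionError:
                # Skip the broken pair, then resume after the semicolon
                why = self.ignore_until([";"])
                if why == ";":
                    self.literal(";")
                    self.whitespace()
                else:
                    break
        return pairs

# The middle pair is missing its colon, so it should be dropped
attr = "color: red; width 100px; background-color: lightblue"
pairs = CSSParser(attr).body()
print(pairs)
```

Note that the last pair has no trailing semicolon. It is still kept, because the pair is stored before the parser looks for the semicolon; the failed literal call just ends the loop.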

Skipping parse errors is a double-edged sword. It hides error messages, making it harder for authors to debug their style sheets; it also makes it harder to debug your parser. (I suggest removing the try block when debugging.) So in most programming situations this “catch-all” error handling is a code smell.

But “catch-all” error handling has an unusual benefit on the web. The web is an ecosystem of many browsers (and of many browser versions, some of which haven’t been written yet but need to be supported as best we can), which support different kinds of property values. Our browser, for example, does not support parentheses in property values, which real browsers use for things like the calc and url functions. CSS that parses in one browser might not parse in another. With silent parse errors, browsers just ignore stuff they don’t understand, and web pages mostly work in all of them. The principle is variously called “Postel’s Law” (after a line in the specification of TCP, written by Jon Postel), the “Digital Principle” (after a similar idea in circuit design, where transistors must be nonlinear to reduce analog noise), or the “Robustness Principle”. It says: produce maximally conformant output but accept even minimally conformant input.

This parsing method is formally called recursive descent parsing for an LL(1) language. Parsers that use this method can be really, really fast, at least if you put a lot of work into it. In a browser, faster parsing means pages load faster.

The style attribute

Now that the style attribute is parsed, we can use that parsed information in the rest of the browser. We’ll store the parsed information in a style field on each node:

def style(node):
    node.style = {}
    # ...
    for child in node.children:
        style(child)

Call style in the browser’s load method, after parsing the HTML but before doing layout.

This style function will also fill in the style field by parsing the element’s style attribute:

def style(node):
    # ...
    if isinstance(node, Element) and "style" in node.attributes:
        pairs = CSSParser(node.attributes["style"]).body()
        for property, value in pairs.items():
            node.style[property] = value
    # ...

With the style information stored on each element, the browser can consult it for styling information:

class InlineLayout:
    def paint(self, display_list):
        bgcolor = self.node.style.get("background-color",
                                      "transparent")
        if bgcolor != "transparent":
            x2, y2 = self.x + self.width, self.y + self.height
            rect = DrawRect(self.x, self.y, x2, y2, bgcolor)
            display_list.append(rect)
        # ...

I’ve removed the default gray background from pre elements for now, but we’ll put it back soon.

Open this chapter up in your browser to test your code: the code block right after this paragraph should now have a light blue background.

<div style="background-color:lightblue"> ... </div>

So this is one way web pages can change their appearance. And in the early days of the web (I’m talking Netscape 3, the late 90s) something like this was the only way. But honestly, it’s a pain: you need to set a style attribute on each element, and if you change the style that’s a lot of attributes to edit. CSS was invented to improve on this state of affairs.

To do that, CSS extends the style attribute with two related ideas: selectors and cascading. (CSS rules can also be guarded by “media queries”, which say that a rule should apply only in certain browsing environments, like only on mobile or only in landscape mode. Media queries are super-important for building sites that work across many devices, like reading this book on a phone.) Selectors describe which HTML elements a list of property/value pairs applies to:

selector { property-1: value-1; property-2: value-2; }

Since one of these rules can apply to many elements, it’s possible for several rules to apply to the same element. So browsers have a cascading mechanism to resolve conflicts in favor of the most specific rule. Cascading also means a browser can ignore rules it doesn’t understand and choose the next-most-specific rule that it does understand.

So next, let’s add support for CSS to our browser. We’ll need to parse CSS files into selectors and property/value pairs; figure out which elements on the page match each selector; and then copy those property values to the elements’ style fields.

Actually, before CSS, you’d style pages with custom elements like font and center. This was easy to implement but made it hard to keep pages consistent. There were also properties on <body> like text and vlink that could consistently set text colors, mainly for links.

Selectors

Selectors come in lots of types, but in our browser, we’ll support two: tag selectors (p selects all <p> elements, ul selects all <ul> elements) and descendant selectors (article div selects all div elements with an article ancestor). The descendant selector associates to the left; in other words, a b c means a c that descends from a b that descends from an a, which maybe you’d write (a b) c if CSS had parentheses.

We’ll have a class for each type of selector to store the selector’s contents, like the tag name for a tag selector:

class TagSelector:
    def __init__(self, tag):
        self.tag = tag

Each selector class will also test whether the selector matches an element:

    def matches(self, node):
        return isinstance(node, Element) and self.tag == node.tag

A descendant selector works similarly. It has two parts, which are both themselves selectors:

class DescendantSelector:
    def __init__(self, ancestor, descendant):
        self.ancestor = ancestor
        self.descendant = descendant

Then the matches method is recursive: it calls matches on both of its parts, walking up the tree to look for a matching ancestor:

class DescendantSelector:
    def matches(self, node):
        if not self.descendant.matches(node): return False
        while node.parent:
            if self.ancestor.matches(node.parent): return True
            node = node.parent
        return False
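Here is a small sketch of both selector classes at work on a hand-built tree. The Element class below is a minimal stand-in for the browser’s own (it keeps only the tag, parent, and children), and the tree itself is invented for the example.

```python
class Element:
    # Minimal stand-in for the browser's Element class
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        if parent:
            parent.children.append(self)

class TagSelector:
    def __init__(self, tag):
        self.tag = tag

    def matches(self, node):
        return isinstance(node, Element) and self.tag == node.tag

class DescendantSelector:
    def __init__(self, ancestor, descendant):
        self.ancestor = ancestor
        self.descendant = descendant

    def matches(self, node):
        if not self.descendant.matches(node): return False
        # Walk up the tree looking for any matching ancestor
        while node.parent:
            if self.ancestor.matches(node.parent): return True
            node = node.parent
        return False

# html > article > section > div, plus a second div directly under html
html = Element("html")
article = Element("article", html)
section = Element("section", article)
inner = Element("div", section)
outer = Element("div", html)

sel = DescendantSelector(TagSelector("article"), TagSelector("div"))
print(sel.matches(inner))  # the article need not be the direct parent
print(sel.matches(outer))  # no article ancestor anywhere above
```

Because the ancestor can itself be a DescendantSelector, a selector like a b c nests naturally as DescendantSelector(DescendantSelector(a, b), c).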

Now, to create these selector objects, we need a parser. Once again, using word here for tag names is actually not quite right, but it’s close enough. One tricky side effect of using word is that a class name selector (like .main) or an identifier selector (like #signup) is mis-parsed as a tag name selector. But that won’t cause any harm since there aren’t any elements with those tags. In this case, the parser is just another parsing function:

def selector(self):
    out = TagSelector(self.word().lower())
    self.whitespace()
    while self.i < len(self.s) and self.s[self.i] != "{":
        tag = self.word()
        descendant = TagSelector(tag.lower())
        out = DescendantSelector(out, descendant)
        self.whitespace()
    return out

A CSS file is just a sequence of selectors and blocks:

def parse(self):
    rules = []
    while self.i < len(self.s):
        self.whitespace()
        selector = self.selector()
        self.literal("{")
        self.whitespace()
        body = self.body()
        self.literal("}")
        rules.append((selector, body))
    return rules

Once again, let’s pause to think about error handling. First, when we call body while parsing CSS, we need it to stop when it reaches a closing brace:

def body(self):
    # ...
    while self.i < len(self.s) and self.s[self.i] != "}":
        try:
            # ...
        except AssertionError:
            why = self.ignore_until([";", "}"])
            if why == ";":
                self.literal(";")
                self.whitespace()
            else:
                break
    # ...

Second, there might also be a parse error while parsing a selector. In that case, we want to skip the whole rule:

def parse(self):
    # ...
    while self.i < len(self.s):
        try:
            # ...
        except AssertionError:
            why = self.ignore_until(["}"])
            if why == "}":
                self.literal("}")
                self.whitespace()
            else:
                break
    # ...
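For reference, here is one way the whole parser might look once the elided parts are filled in. It is a sketch assembled from the snippets in this chapter (word is condensed slightly, and the selector classes are reduced to their constructors); the style sheet at the bottom is made up, and its middle rule uses the > combinator, which this parser does not support.

```python
class TagSelector:
    def __init__(self, tag):
        self.tag = tag

class DescendantSelector:
    def __init__(self, ancestor, descendant):
        self.ancestor = ancestor
        self.descendant = descendant

class CSSParser:
    def __init__(self, s):
        self.s = s
        self.i = 0

    def whitespace(self):
        while self.i < len(self.s) and self.s[self.i].isspace():
            self.i += 1

    def word(self):
        start = self.i
        while self.i < len(self.s) and \
              (self.s[self.i].isalnum() or self.s[self.i] in "#-.%"):
            self.i += 1
        assert self.i > start
        return self.s[start:self.i]

    def literal(self, literal):
        assert self.i < len(self.s) and self.s[self.i] == literal
        self.i += 1

    def pair(self):
        prop = self.word()
        self.whitespace()
        self.literal(":")
        self.whitespace()
        return prop.lower(), self.word()

    def ignore_until(self, chars):
        while self.i < len(self.s):
            if self.s[self.i] in chars:
                return self.s[self.i]
            self.i += 1

    def body(self):
        pairs = {}
        # Stop at a closing brace so parse() can consume it
        while self.i < len(self.s) and self.s[self.i] != "}":
            try:
                prop, val = self.pair()
                pairs[prop] = val
                self.whitespace()
                self.literal(";")
                self.whitespace()
            except AssertionError:
                why = self.ignore_until([";", "}"])
                if why == ";":
                    self.literal(";")
                    self.whitespace()
                else:
                    break
        return pairs

    def selector(self):
        out = TagSelector(self.word().lower())
        self.whitespace()
        while self.i < len(self.s) and self.s[self.i] != "{":
            descendant = TagSelector(self.word().lower())
            out = DescendantSelector(out, descendant)
            self.whitespace()
        return out

    def parse(self):
        rules = []
        while self.i < len(self.s):
            try:
                self.whitespace()
                selector = self.selector()
                self.literal("{")
                self.whitespace()
                body = self.body()
                self.literal("}")
                rules.append((selector, body))
            except AssertionError:
                # Bad selector: skip the whole rule
                why = self.ignore_until(["}"])
                if why == "}":
                    self.literal("}")
                    self.whitespace()
                else:
                    break
        return rules

css = "pre { background-color: gray } a > b { color: red; } article div { color: blue; }"
rules = CSSParser(css).parse()
for selector, body in rules:
    print(type(selector).__name__, body)
```

The unsupported middle rule is dropped, while the rules on either side of it survive, including the first one, whose last pair has no trailing semicolon.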

Error handling is hard to get right, so make sure to test your parser, just like the HTML parser two chapters back.

You can also add a print statement to the start and end of each parsing function with the name of the parsing function, the index i, and the parsed data. It’s a lot of output, but it’s a sure-fire way to find really complicated bugs. A few niceties make that output easier to use: if you print an open parenthesis at the start of the function and a close parenthesis at the end, you can use your editor’s “jump to other parenthesis” feature to skip through output quickly; if you also add the right number of spaces to each line, it’ll be a lot easier to read; and it can be especially helpful to print, say, the 20 characters around index i from the string. Don’t neglect debugging niceties like this!
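One way to get that kind of output without editing every function by hand is a decorator. The sketch below is just one possible approach: trace, TracedParser, padded_word, and the depth field are names I made up for this example, and the parser is cut down to two simplified functions.

```python
import functools

def trace(fn):
    # Hypothetical debugging decorator; not part of the browser itself
    @functools.wraps(fn)
    def wrapper(self, *args):
        depth = getattr(self, "depth", 0)
        pad = "  " * depth
        # Show roughly the 20 characters around the current index
        context = self.s[max(0, self.i - 10):self.i + 10]
        print("{}({} i={} near {!r}".format(pad, fn.__name__, self.i, context))
        self.depth = depth + 1
        try:
            result = fn(self, *args)
        except AssertionError:
            # Still print a closing parenthesis so the output stays balanced
            print("{}) {} failed at i={}".format(pad, fn.__name__, self.i))
            raise
        finally:
            self.depth = depth
        print("{}) {} i={} -> {!r}".format(pad, fn.__name__, self.i, result))
        return result
    return wrapper

class TracedParser:
    # Cut-down parser used only to demonstrate the decorator
    def __init__(self, s):
        self.s = s
        self.i = 0

    @trace
    def whitespace(self):
        while self.i < len(self.s) and self.s[self.i].isspace():
            self.i += 1

    @trace
    def word(self):
        start = self.i
        while self.i < len(self.s) and self.s[self.i].isalnum():
            self.i += 1
        assert self.i > start
        return self.s[start:self.i]

    @trace
    def padded_word(self):
        # Calls other traced functions, so their output is indented
        self.whitespace()
        return self.word()

parser = TracedParser("  color: red")
result = parser.padded_word()
```

Removing the @trace lines turns the tracing off again, which is handy once the parser works.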

A parser receives arbitrary bytes as input, so parser bugs are usually easy for bad actors to exploit. Parser correctness is thus crucial to browser security, as many parser bugs have demonstrated. Nowadays browser developers use fuzzing to try to find and fix such bugs.

Applying style sheets

With the parser debugged, the next step is applying the parsed style sheet to the web page. Since each CSS rule can style many elements on the page, this will require looping over all elements and all rules. When a rule applies, its property/value pairs are copied to the element’s style information:

def style(node, rules):
    # ...
    for selector, body in rules:
        if not selector.matches(node): continue
        for property, value in body.items():
            node.style[property] = value

Make sure to put this loop before the one that parses the style attribute: the style attribute should override style sheet values.
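The ordering is easier to see in a complete function. In this sketch, Element is a minimal stand-in for the browser’s class, and parse_inline is a hypothetical shortcut that takes the place of CSSParser(...).body() so the example stays short; neither is the real code.

```python
class Element:
    # Minimal stand-in for the browser's Element class
    def __init__(self, tag, attributes=None, parent=None):
        self.tag = tag
        self.attributes = attributes or {}
        self.parent = parent
        self.children = []
        if parent:
            parent.children.append(self)

class TagSelector:
    def __init__(self, tag):
        self.tag = tag

    def matches(self, node):
        return isinstance(node, Element) and self.tag == node.tag

def parse_inline(s):
    # Stand-in for CSSParser(s).body(), just for this sketch
    pairs = {}
    for chunk in s.split(";"):
        if ":" in chunk:
            prop, val = chunk.split(":", 1)
            pairs[prop.strip().lower()] = val.strip()
    return pairs

def style(node, rules):
    node.style = {}
    # Style sheet rules go first...
    for selector, body in rules:
        if not selector.matches(node): continue
        for property, value in body.items():
            node.style[property] = value
    # ...then the style attribute, so it wins any conflict
    if isinstance(node, Element) and "style" in node.attributes:
        pairs = parse_inline(node.attributes["style"])
        for property, value in pairs.items():
            node.style[property] = value
    for child in node.children:
        style(child, rules)

body = Element("body")
plain = Element("pre", parent=body)
custom = Element("pre", {"style": "background-color:lightblue"}, body)

rules = [(TagSelector("pre"), {"background-color": "gray"})]
style(body, rules)
print(plain.style, custom.style)
```

Both pre elements match the rule, but only the one without a style attribute ends up gray.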

To try this out, we’ll need a style sheet. Every browser ships with a browser style sheet (technically called a “User Agent” style sheet; User Agent, like the Memex), which defines its default styling for the various HTML elements. For our browser, it might look like this:

pre { background-color: gray; }

Let’s store that in a new file, browser.css, and have our browser read it when it starts:

class Browser:
    def __init__(self):
        # ...
        with open("browser.css") as f:
            self.default_style_sheet = CSSParser(f.read()).parse()

Now, when the browser loads a web page, it can apply that default style sheet to set up its default styling for each element:

def load(self, url):
    # ...
    rules = self.default_style_sheet.copy()
    style(self.nodes, rules)
    # ...

The browser style sheet is the default for the whole web. But each web site can also use CSS to set a consistent style for the whole site by referencing CSS files using link elements:

<link rel="stylesheet" href="/main.css">

The mandatory rel attribute identifies this link as a style sheet and the href attribute has the style sheet URL. We need to find all these links, download their style sheets, and apply them. (For browsers, stylesheet is the most important kind of link, but there’s also preload for loading assets that a page will use later and icon for identifying favicons. Search engines also use these links; for example, rel=canonical names the “true name” of a page and search engines use it to track pages that appear at multiple URLs.)

Since we’ll be doing similar tasks in the next few chapters, let’s generalize a bit and write a recursive function that turns a tree into a list of nodes:

def tree_to_list(tree, list):
    list.append(tree)
    for child in tree.children:
        tree_to_list(child, list)
    return list

I’ve written this helper to work on both HTML and layout trees, for later. We can use tree_to_list with a Python list comprehension (it’s kind of crazy, honestly, that Python lets you write things like this; crazy, but very convenient!) to grab the URL of each linked style sheet:

def load(self, url):
    # ...
    links = [node.attributes["href"]
             for node in tree_to_list(self.nodes, [])
             if isinstance(node, Element)
             and node.tag == "link"
             and "href" in node.attributes
             and node.attributes.get("rel") == "stylesheet"]
    # ...
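To check the comprehension in isolation, here is a sketch that runs it on a hand-built tree. The Element class is a minimal stand-in for the browser’s own, and the tree and its attribute values are invented for the example.

```python
class Element:
    # Minimal stand-in for the browser's Element class
    def __init__(self, tag, attributes, parent=None):
        self.tag = tag
        self.attributes = attributes
        self.parent = parent
        self.children = []
        if parent:
            parent.children.append(self)

def tree_to_list(tree, list):
    list.append(tree)
    for child in tree.children:
        tree_to_list(child, list)
    return list

html = Element("html", {})
head = Element("head", {}, html)
Element("link", {"rel": "stylesheet", "href": "/main.css"}, head)
Element("link", {"rel": "icon", "href": "/favicon.ico"}, head)  # wrong rel
Element("link", {"rel": "stylesheet"}, head)                    # no href
body = Element("body", {}, html)

# Nodes come out parent first, then children in document order
tags = [node.tag for node in tree_to_list(html, [])]
print(tags)

links = [node.attributes["href"]
         for node in tree_to_list(html, [])
         if isinstance(node, Element)
         and node.tag == "link"
         and "href" in node.attributes
         and node.attributes.get("rel") == "stylesheet"]
print(links)
```

Only the first link element passes every condition, so links contains a single URL.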

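Now, these style sheet URLs are usually not full URLs; they are something called relative URLs. A host-relative URL starts with a slash and reuses the existing scheme and host, while a path-relative URL doesn’t start with a slash and is resolved relative to the current page’s directory, like a file name would be. (There are other flavors, including query-relative and scheme-relative URLs, that I’m skipping.)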

To download the style sheets, we’ll need to convert each relative URL into a full URL:

def resolve_url(url, current):
    if "://" in url:
        return url
    elif url.startswith("/"):
        scheme, hostpath = current.split("://", 1)
        host, oldpath = hostpath.split("/", 1)
        return scheme + "://" + host + url
    else:
        dir, _ = current.rsplit("/", 1)
        while url.startswith("../"):
            url = url[3:]
            if dir.count("/") == 2: continue
            dir, _ = dir.rsplit("/", 1)
        return dir + "/" + url

When resolving path-relative URLs, we count the number of slashes in the “directory” to make sure we never strip off the scheme and host name.
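A few sample calls make the three cases concrete. The function below is copied from above; the example.org and cdn.example.com URLs are placeholders I picked for the demonstration.

```python
def resolve_url(url, current):
    if "://" in url:
        return url
    elif url.startswith("/"):
        scheme, hostpath = current.split("://", 1)
        host, oldpath = hostpath.split("/", 1)
        return scheme + "://" + host + url
    else:
        dir, _ = current.rsplit("/", 1)
        while url.startswith("../"):
            url = url[3:]
            # Only the two slashes of "://" remain: already at the host
            if dir.count("/") == 2: continue
            dir, _ = dir.rsplit("/", 1)
        return dir + "/" + url

current = "http://example.org/a/b/page.html"

# A full URL is returned unchanged
full = resolve_url("https://cdn.example.com/x.css", current)
# A leading slash keeps the scheme and host but replaces the path
host_relative = resolve_url("/main.css", current)
# Otherwise the URL is resolved against the current "directory"
sibling = resolve_url("style.css", current)
# Each "../" strips one directory...
parent = resolve_url("../style.css", current)
# ...but never the host, however many there are
too_many = resolve_url("../../../style.css", current)

for result in [full, host_relative, sibling, parent, too_many]:
    print(result)
```

The last call asks to go up three levels from a page that is only two directories deep, and the slash count stops it at http://example.org.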

Now the browser can request each linked style sheet and add its rules to the rules list:

def load(self, url):
    # ...
    for link in links:
        try:
            header, body = request(resolve_url(link, url))
        except:
            continue
        rules.extend(CSSParser(body).parse())

The try/except ignores style sheets that fail to download, but it can also hide bugs in your code, so if something’s not right try removing it temporarily.

Each browser has its own browser style sheet (Chromium, Safari, Firefox). Reset style sheets are often used to overcome any differences. This works because web page style sheets take precedence over the browser style sheet, just like in our browser, though real browsers fiddle with priorities to make that happen. (Our browser style sheet only has tag selectors in it, so just putting them first works well enough. But if the browser style sheet had any descendant selectors, we’d encounter bugs.)

Cascading

A web page can now have any number of style sheets applied to it. And since two rules can apply to the same element, rule order matters: it determines which rules take priority, and when one rule overrides another.

In CSS, the correct order is called cascade order, and it is based on the rule’s selector, with file order as a tie breaker. This system allows more specific rules to override more general ones, so that you can have a browser style sheet, a site-wide style sheet, and maybe a special style sheet for a specific web page, all co-existing.

Since our browser only has tag selectors, our cascade order just counts them:

class TagSelector:
    def __init__(self, tag):
        # ...
        self.priority = 1

class DescendantSelector:
    def __init__(self, ancestor, descendant):
        # ...
        self.priority = ancestor.priority + descendant.priority

Then our cascade order for rules is just those priorities:

def cascade_priority(rule):
    selector, body = rule
    return selector.priority

Now when we call style, we need to sort the rules, like this:

def load(self, url):
    # ...
    style(self.nodes, sorted(rules, key=cascade_priority))
    # ...

Note that before sorting, rules is in file order. Since Python’s sorted function is stable, meaning it keeps the relative order of rules with equal priority, file order thus acts as a tie breaker, as it should.
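The tie-breaking behavior can be checked with a short sketch. The selector classes here keep only their priority logic, and the three rules are invented for the example.

```python
class TagSelector:
    def __init__(self, tag):
        self.tag = tag
        self.priority = 1

class DescendantSelector:
    def __init__(self, ancestor, descendant):
        self.ancestor = ancestor
        self.descendant = descendant
        self.priority = ancestor.priority + descendant.priority

def cascade_priority(rule):
    selector, body = rule
    return selector.priority

# File order: the most specific rule happens to come first
rules = [
    (DescendantSelector(TagSelector("article"), TagSelector("p")),
     {"color": "green"}),
    (TagSelector("p"), {"color": "black"}),
    (TagSelector("p"), {"color": "gray"}),
]

ordered = sorted(rules, key=cascade_priority)
colors = [body["color"] for selector, body in ordered]
print(colors)
```

Rules are applied in sorted order and later rules overwrite earlier ones. The two p rules tie at priority 1 and stay in file order, so gray beats black; the priority 2 rule moves to the end, so a p inside an article ends up green.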

That’s it: we’ve added CSS to our web browser! I mean—for background colors. But there’s more to web design than that. For example, if you’re changing background colors you might want to change foreground colors as well—the CSS color property. But there’s a catch: color affects text, and there’s no way to select a text node. How can that work?

Web pages can also supply “alternative style sheets”, and some browsers provide (obscure) methods to switch from the default to an alternate style sheet. The CSS standard also allows for browser extensions that set custom style sheets for websites.

Inherited styles

The way text styles work in CSS is called inheritance. Inheritance means that if some node doesn’t have a value for a certain property, it uses its parent’s value instead. That includes text nodes. Some properties are inherited and some aren’t; it depends on the property. Background color isn’t inherited, but text color and other font properties are.

Let’s implement inheritance for four font properties: color, font-weight (normal or bold), font-style (normal or italic), and font-size (a length or percentage).

Let’s start by listing our inherited properties and their default values:

INHERITED_PROPERTIES = {
    "font-size": "16px",
    "font-style": "normal",
    "font-weight": "normal",
    "color": "black",
}

We’ll then add the actual inheritance code to the style function. It has to come before the other loops, since explicit rules should override inheritance:

def style(node, rules):
    # ...
    for property, default_value in INHERITED_PROPERTIES.items():
        if node.parent:
            node.style[property] = node.parent.style[property]
        else:
            node.style[property] = default_value
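Here is a sketch of inheritance reaching a text node. To keep it short I’ve made two simplifications that are not in the real browser: Node stands in for both Element and Text, and rules is a plain dictionary keyed by tag name rather than a list of selectors.

```python
INHERITED_PROPERTIES = {
    "font-size": "16px",
    "font-style": "normal",
    "font-weight": "normal",
    "color": "black",
}

class Node:
    # Minimal stand-in for the browser's Element and Text classes
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        if parent:
            parent.children.append(self)

def style(node, rules):
    node.style = {}
    # Inheritance first: copy from the parent, or use defaults at the root
    for property, default_value in INHERITED_PROPERTIES.items():
        if node.parent:
            node.style[property] = node.parent.style[property]
        else:
            node.style[property] = default_value
    # Simplification for this sketch: rules is a dict keyed by tag name
    for property, value in rules.get(node.tag, {}).items():
        node.style[property] = value
    for child in node.children:
        style(child, rules)

html = Node("html")
body = Node("body", html)
p = Node("p", body)
text = Node(None, p)  # text nodes have no tag, so no rule selects them

rules = {
    "body": {"color": "blue", "background-color": "gray"},
    "p": {"font-weight": "bold"},
}
style(html, rules)
print(text.style)
print("background-color" in p.style)
```

The text node picks up color from body and font-weight from p even though no rule targets it, while background-color stays on body because it is not in INHERITED_PROPERTIES.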
    # ...

Inheriting font size comes with a twist. Web pages can use percentages as font sizes: h1 { font-size: 150% } makes headings 50% bigger than surrounding text. But what if you had, say, a code element inside an h1 tag—would that inherit the 150% value for font-size? Surely it shouldn’t be another 50% bigger than the rest of the heading text?

So, in fact, browsers resolve percentages to absolute pixel units before storing them in the style and before those values are inherited; the result is called a “computed style”. (Full CSS is a bit more confusing: there are specified, computed, used, and actual values, and they affect lots of CSS properties besides font-size. We’re just not implementing those other properties in this book.) Of the properties our toy browser supports, only font-size needs to be computed in this way:

def compute_style(node, property, value):
    if property == "font-size":
        if value.endswith("px"):
            return value
        elif value.endswith("%"):
            # ...
        else:
            return None
    else:
        return value

Percentage sizes also have to handle a tricky edge case: percentage sizes for the root html element. In that case the percentage is relative to the default font size. (The code for this has to parse and unparse font sizes because our style field stores strings; in a real browser the computed style is stored parsed, so this doesn’t have to happen.) Here is the percentage branch:

if node.parent:
    parent_font_size = node.parent.style["font-size"]
else:
    parent_font_size = INHERITED_PROPERTIES["font-size"]
node_pct = float(value[:-1]) / 100
parent_px = float(parent_font_size[:-2])
return str(node_pct * parent_px) + "px"
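With the percentage branch slotted into place, compute_style can be tried on its own. In this sketch, Node is a minimal stand-in with just a parent and a style dictionary, and the sample sizes are made up.

```python
INHERITED_PROPERTIES = {
    "font-size": "16px",
    "font-style": "normal",
    "font-weight": "normal",
    "color": "black",
}

class Node:
    # Minimal stand-in: only the fields compute_style looks at
    def __init__(self, parent=None):
        self.parent = parent
        self.style = {}

def compute_style(node, property, value):
    if property == "font-size":
        if value.endswith("px"):
            return value
        elif value.endswith("%"):
            if node.parent:
                parent_font_size = node.parent.style["font-size"]
            else:
                # The root element has no parent: use the default size
                parent_font_size = INHERITED_PROPERTIES["font-size"]
            node_pct = float(value[:-1]) / 100
            parent_px = float(parent_font_size[:-2])
            return str(node_pct * parent_px) + "px"
        else:
            return None
    else:
        return value

root = Node()
root_size = compute_style(root, "font-size", "150%")   # 150% of 16px
root.style["font-size"] = "20px"
child = Node(root)
child_size = compute_style(child, "font-size", "50%")  # 50% of 20px
unknown = compute_style(child, "font-size", "2em")     # unsupported unit
passthrough = compute_style(child, "color", "blue")    # not font-size
print(root_size, child_size, unknown, passthrough)
```

The results are the strings 24.0px and 10.0px, so what children inherit is always a pixel size, never a percentage. The unsupported 2em value comes back as None, which the caller can use to skip the property.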

Now style can call compute_style any time it reads a property value out of a style sheet:

def style(node, rules):
    # ...
    for selector, body in rules:
        if not selector.matches(node): continue
        for property, value in body.items():
            computed_value = compute_style(node, property, value)
            if not computed_value: continue
            node.style[property] = computed_value
    # ...

Note that because the style function recurses at the end of the function, the node’s parent already has a font-size value stored when compute_style is called.

The loop that handles style attributes should likewise call compute_style. Remember that style attributes overwrite CSS rules; that means the loop above, which handles rules, should come before the loop that handles the style attribute.

Styling a page can be slow, so real browsers apply tricks like bloom filters for descendant selectors, indices for simple selectors, and various forms of sharing and parallelism. Some types of sharing are also important to reduce memory usage—computed style sheets can be huge!

Font Properties

So now with all these font properties implemented, let’s change layout to use them! That will let us move our default text styles to the browser style sheet:

a { color: blue; }
i { font-style: italic; }
b { font-weight: bold; }
small { font-size: 90%; }
big { font-size: 110%; }

The browser looks up font information in InlineLayout’s text method; we’ll need to change it to use the node’s style field:

class InlineLayout:
    def text(self, node):
        # ...
        weight = node.style["font-weight"]
        style = node.style["font-style"]
        if style == "normal": style = "roman"
        size = int(float(node.style["font-size"][:-2]) * .75)
        font = get_font(size, weight, style)
        # ...

Note that for font-style we need to translate CSS “normal” to Tk “roman” and for font-size we need to convert CSS pixels to Tk points.
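The .75 factor comes from CSS defining an inch as 96 pixels while a point is one 72nd of an inch, so one CSS pixel is 72/96 = 0.75 points. As a quick check of the arithmetic, here is the conversion pulled out into a helper; the function name is hypothetical, since the browser does this inline in text.

```python
def css_px_to_tk_points(font_size):
    # font_size is a computed style string such as "16px"
    px = float(font_size[:-2])
    # 72 points per inch divided by 96 CSS pixels per inch
    return int(px * .75)

default_points = css_px_to_tk_points("16px")    # the default font size
heading_points = css_px_to_tk_points("24.0px")  # 150% of the default
print(default_points, heading_points)
```

The int call truncates, so a computed size that doesn’t convert to a whole number of points is rounded down.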

Text color requires a bit more plumbing. First, we have to read the color and store it in the current line:

def text(self, node):
    color = node.style["color"]
    # ...
    for word in node.text.split():
        # ...
        self.line.append((self.cursor_x, word, font, color))
        # ...

The flush method then copies it from line to display_list:

def flush(self):
    # ...
    metrics = [font.metrics() for x, word, font, color in self.line]
    # ...
    for x, word, font, color in self.line:
        # ...
        self.display_list.append((x, y, word, font, color))
    # ...

That display_list is converted to drawing commands in paint:

def paint(self, display_list):
    # ...
    for x, y, word, font, color in self.display_list:
        display_list.append(DrawText(x, y, word, font, color))

DrawText now needs a color argument, and needs to pass it to create_text’s fill parameter:

class DrawText:
    def __init__(self, x1, y1, text, font, color):
        # ...
        self.color = color

    def execute(self, scroll, canvas):
        canvas.create_text(
            # ...
            fill=self.color,
        )

Phew! That was a lot of coordinated changes, so test everything and make sure it works. You should now see links on this page appear in blue, and you might also notice that the rest of the text has become slightly lighter. (The book’s main body text is colored #333, or roughly 97% black after gamma correction.) Also, now that we’re explicitly setting the text color, we should explicitly set the background color as well. My Linux machine sets the default background color to a light gray, while my macOS laptop has a “Dark Mode” where the default background color becomes a dark gray; setting the background color explicitly avoids the browser looking strange in these situations:

class Browser:
    def __init__(self):
        # ...
        self.canvas = tkinter.Canvas(
            # ...
            bg="white",
        )
        # ...

These changes obsolete all the code in InlineLayout that handles specific tags, like the style, weight, and size properties and the open_tag and close_tag methods. Let’s refactor a bit to get rid of them:

def recurse(self, node):
    if isinstance(node, Text):
        self.text(node)
    else:
        if node.tag == "br":
            self.flush()
        for child in node.children:
            self.recurse(child)

Styling not only lets web page authors style their own web pages; it also moves browser code to a simple style sheet. And that’s a big improvement: the style sheet is simpler and easier to edit. Sometimes converting code to data like this means maintaining a new format, but browsers get to reuse a format, CSS, they need to support anyway.

Usually a point is one 72nd of an inch while pixel size depends on the screen, but CSS instead defines an inch as 96 pixels, because that was once a common screen resolution. And these CSS pixels need not be physical pixels! Seem weird? OS internals are equally bizarre, let alone traditional typesetting.

Summary

This chapter implemented a rudimentary but complete styling engine, including downloading, parsing, matching, sorting, and applying CSS files.

Our styling engine is also relatively easy to extend with properties and selectors.


Outline

The complete set of functions, classes, and methods in our browser should now look something like this:

def request(url)
WIDTH
HEIGHT
HSTEP
VSTEP
SCROLL_STEP
FONTS
def get_font(size, weight, slant)
class Text:
    def __init__(text, parent)
    def __repr__()
class Element:
    def __init__(tag, attributes, parent)
    def __repr__()
def print_tree(node, indent)
class HTMLParser:
    def __init__(body)
    def parse()
    def get_attributes(text)
    def add_text(text)
    SELF_CLOSING_TAGS
    def add_tag(tag)
    HEAD_TAGS
    def implicit_tags(tag)
    def finish()
BLOCK_ELEMENTS
def layout_mode(node)
class DrawRect:
    def __init__(x1, y1, x2, y2, color)
    def execute(scroll, canvas)
    def __repr__()
def resolve_url(url, current)
def tree_to_list(tree, list)
class CSSParser:
    def __init__(s)
    def whitespace()
    def literal(literal)
    def word()
    def pair()
    def ignore_until(chars)
    def body()
    def selector()
    def parse()
class TagSelector:
    def __init__(tag)
    def matches(node)
    def __repr__()
class DescendantSelector:
    def __init__(ancestor, descendant)
    def matches(node)
    def __repr__()
INHERITED_PROPERTIES
def compute_style(node, property, value)
def style(node, rules)
def cascade_priority(rule)
class InlineLayout:
    def __init__(node, parent, previous)
    def layout()
    def recurse(node)
    def text(node)
    def flush()
    def paint(display_list)
    def __repr__()
class BlockLayout:
    def __init__(node, parent, previous)
    def layout()
    def paint(display_list)
    def __repr__()
class DocumentLayout:
    def __init__(node)
    def layout()
    def paint(display_list)
    def __repr__()
class DrawText:
    def __init__(x1, y1, text, font, color)
    def execute(scroll, canvas)
    def __repr__()
class Browser:
    def __init__()
    def load(url)
    def draw()
    def scrolldown(e)
if __name__ == "__main__"

Exercises

Fonts: Implement the font-family property, an inheritable property that names which font should be used in an element. Make text inside <code> elements use a nice monospaced font like Courier. Beware the font cache.

Width/Height: Add support to block layout objects for the width and height properties. These can either be a pixel value, which directly sets the width or height of the layout object, or the word auto, in which case the existing layout algorithm is used.

Class Selectors: Any HTML element can have a class attribute, whose value is a space-separated list of that element’s classes. A CSS class selector, like .main, affects all elements with the main class. Implement class selectors; give them priority 10. If you’ve implemented them correctly, the code blocks in this book should be syntax-highlighted.

Display: Right now, the layout_mode function relies on a hard-coded list of block elements. In a real browser, the display property controls this. Implement display with a default value of inline, and move the list of block elements to the browser style sheet.

Shorthand Properties: CSS “shorthand properties” set multiple related CSS properties at the same time; for example, font: italic bold 100% Times sets the font-style, font-weight, font-size, and font-family properties all at once. Add shorthand properties to your parser. (If you haven’t implemented font-family, just ignore that part.)

Fast Descendant Selectors: Right now, matching a selector like div div div div div can take a long time—it’s O(nd) in the worst case, where n is the length of the selector and d is the depth of the layout tree. Modify the descendant-selector matching code to run in O(n) time. It may help to have DescendantSelector store a list of base selectors instead of just two.

Selector Sequences: Sometimes you want to select an element by tag and class. You do this by concatenating the selectors without anything in between (not even whitespace!). For example, span.announce selects elements that match both span and .announce. Implement a new SelectorSequence class to represent these and modify the parser to parse them. Sum priorities. (Priorities for SelectorSequences are supposed to compare the number of ID, class, and tag selectors in lexicographic order, but summing the priorities of the selectors in the sequence will work fine as long as no one strings more than 16 selectors together.)

Important: a CSS property-value pair can be marked “important” using the !important syntax, like this:

#banner a { color: black !important; }

This gives that property-value pair (but not other pairs in the same block!) a higher priority than any other selector (except for other !important pairs). Parse and implement !important, giving any property-value pairs marked this way a priority 10000 higher than normal property-value pairs.

Ancestor Selectors: An ancestor selector is the inverse of a descendant selector: it styles an ancestor according to the presence of a descendant. This feature is one of the benefits provided by the :has syntax. Try to implement ancestor selectors. As I write this, no browser has actually implemented :has; why do you think that is? Hint: analyze the asymptotic speed of your implementation. There is a clever implementation that is O(1) amortized per element. Can you find it? (No, this clever implementation is still not fast enough for real browsers to implement.)

Inline Style Sheets: The link rel=stylesheet syntax allows importing an external style sheet (meaning one loaded via its own HTTP request). There is also a way to provide a style sheet inline, as part of the HTML, via the <style> tag; everything up to the following </style> tag is interpreted as a style sheet. (Inline style sheets should apply after all external style sheets in the cascade, and apply in order of their position in the HTML.) Inline style sheets are useful for creating self-contained example web pages, but more importantly are a way that web sites can load faster by reducing the number of round-trip network requests to the server. Since style sheets typically don’t contain left angle brackets, you can implement this feature without modifying the HTML parser.
