Browsers and the Web

The web browser (or just “browser”), and the web more broadly, is a marvel. Broadly defined, the web is the interlinked network (“web”) of web pages on the Internet. If you’ve never made a web page, I recommend MDN’s Learn Web Development series, especially the Getting Started guide. This book will be easier to read if you’re familiar with the core technologies. Every year the web expands its reach to more and more of what we do with computers. It now goes far beyond its original use for document-based information sharing; many people now spend their entire day in a browser, not using a single other application!

Nowadays, desktop applications are often built and delivered as web apps: web pages loaded by a browser and used in similar ways to installed applications. And on mobile devices, even native apps often use web views that embed a browser to render parts of the application UI. The fraction of these “hybrid” apps that are web content is also likely increasing over time. Perhaps in the future mobile devices will, like desktop computers, mostly be a container for web apps. Clearly, browsers are a critical and indispensable part of computing.

The basis of browsers is the web. And the web itself is built on a few simple, yet revolutionary, concepts; concepts that together present a vision of the future of software and information. Among them are open, decentralized and safe computing; a declarative document model for describing UIs; hyperlinks; and the User Agent. The User Agent is a way to view the computer, or software within the computer, as a trusted assistant and advocate.

Looked at from this point of view, the browser is what makes the web real, and as a result these same concepts form the core structure of the browser code itself. The browser is the User Agent, but it is also the mediator of web interactions and the enforcer of the rules that keep browsing safe and fair. It is the implementer of all of the ways information is explored: its algorithms realize the declarative UI, it navigates links, and it represents you to web pages. And of course, for web pages to load fast and react smoothly, the browser must be hyper-efficient as well.

This web+browsers setup is neither simple nor obvious. In fact, it is the result of experimentation and research reaching back to nearly the beginning of computing. Of course, the web also needed rich computer displays, powerful UI-building libraries, fast networks, and sufficient CPU power and information storage capacity. What happened is what so often happens with technology: the web had many similar-looking predecessors, but it only took its modern form once all those technologies were available.

Such lofty goals! How does the browser deliver on them? It’s a fascinating and fun journey. That’s what this book is about. But first let’s dig deeper into the thoughts raised here: how the web works, where the web came from, and the role browsers play in the web and computing.

Explaining the black box

HTML, CSS, HTTP, hyperlinks, and JavaScript—the core of the web—are approachable enough, and if you’ve made a web page before you’ve seen that programming ability is not required. But not many people—not even professional software developers—know much about how a browser renders web pages! (I usually prefer “engineer”—hence the title of this book—but “developer” or “web developer” is much more common on the web. One important reason is that anyone can build a web page—not just trained software engineers and computer scientists. “Web developer” also is more inclusive of additional, critical roles like designers, authors, editors, and photographers. A web developer is anyone who makes web pages, regardless of how.)

As a black box, the browser is either magical or frustrating (depending on whether it is working correctly or not!). And HTML & CSS are meant to be black boxes—declarative APIs—where one specifies what outcome to achieve, as opposed to how to achieve it. The browser itself is responsible for figuring out the how. Web developers don’t, and mostly can’t, draw their web page’s pixels on their own.

There are philosophical and practical reasons for this unusual design. Yes, developers lose some control and agency—when those pixels are wrong, developers cannot fix them directly. Loss of control is not necessarily specific to the web—much of computing these days relies on mountains of other people’s code. But they gain the ability to deploy content on the web without worrying about the details, to make that content instantly available on almost every computing device in existence, and to keep it accessible in the future, mostly avoiding the inevitable obsolescence of most software.

Behind this philosophy lie a web browser’s implementations of inversion of control, constraint programming, and declarative programming. The web inverts control, with an intermediary—the browser—handling most of the rendering, and the web developer specifying parameters and content to this intermediary. For example, in HTML there are many built-in form control elements that take care of the various ways the user of a web page can provide input. The developer need only specify parameters such as button names, sizing, and look-and-feel, or JavaScript extension points to handle form submission to the server. The rest of the implementation is taken care of by the browser. Further, these parameters usually take the form of constraints over relative sizes and positions instead of specifying their values directly. Constraint programming is clearest during web page layout, where font and window sizes, desired positions and sizes, and the relative arrangement of widgets are rarely specified directly. A fun question to consider: what does the browser “optimize for” when computing a layout? It’s the browser’s job to solve the constraints, or even to pick which ones to break if needed. The same idea applies to actions: web pages mostly require that actions take place without specifying when they do. This declarative style means that from the point of view of a developer, changes “apply immediately,” but under the hood, the browser can be lazy and delay applying the changes until they become externally visible, either due to subsequent API calls or because the page has to be displayed to the user. For example, when exactly does the browser compute which CSS styles apply to which HTML elements, after a web page changes those styles? The change is visible to all subsequent API calls, so in that sense it applies “immediately.” But it is better for the browser to delay style recalculation, avoiding redundant work if styles change twice in quick succession. Maximally exploiting the opportunities afforded by declarative programming makes real-world browsers very complex.
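To make that lazy, “apply immediately” behavior concrete, here is a minimal sketch in Python of the dirty-flag pattern just described. It is a simplified illustration with hypothetical names, not any real browser’s code: a style change only records that recomputation is needed, and the expensive work happens once, when the result is actually observed.

    class Element:
        def __init__(self):
            self.pending = {}          # style changes not yet applied
            self.style = {}            # last computed style
            self.needs_style = False   # the "dirty" flag

        def set_style(self, prop, value):
            # From the page's point of view the change applies "immediately"...
            self.pending[prop] = value
            self.needs_style = True    # ...but we only record that work is needed.

        def computed_style(self):
            # Recompute lazily, only when the result is observed. Two changes
            # in quick succession therefore cost a single recalculation.
            if self.needs_style:
                self.style.update(self.pending)
                self.pending.clear()
                self.needs_style = False
            return self.style

    e = Element()
    e.set_style("color", "red")
    e.set_style("color", "blue")   # overwrites the pending change; no work done yet
    print(e.computed_style())      # {'color': 'blue'} -- one recalculation, not two

Real engines apply the same idea at a much larger scale, invalidating and recomputing only the parts of the page affected by a change.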

The upshot of all this is that a browser is a pretty unusual piece of software, with unique challenges, interesting algorithms, and clever optimizations invented just for this domain. That makes browsers worth studying for the pure pleasure of it—even leaving aside their importance!

The browser and me

I (this is Chris speaking!) have known the web for all of my adult life. Ever since I first encountered the web, and its predecessors, in the early 90s, I was fascinated by browsers and the concept of networked user interfaces. (For me, the predecessors were BBS systems over a dial-up modem connection. A BBS is not all that different from a browser if you think of it as a window into dynamic content created somewhere else on the Internet.) When I surfed the web, even in its earliest form, I felt I was seeing the future of computing. In some ways, the web and I grew together—for example, 1994, the year the web went commercial, was the same year I started college; while there I spent a fair amount of time surfing it, and by the time I graduated in 1999, the browser had fueled the famous dot-com speculation gold rush. The company for which I now work, Google, is a child of the web and was founded during that time. The web for me is something of a technological companion, and I’ve never been far from it in my studies or work.

In my freshman year at college, I attended a presentation by a RedHat salesman. The presentation was of course aimed at selling RedHat Linux, probably calling it the “operating system of the future” and speculating about the “year of the Linux desktop”. But when asked about challenges RedHat faced, the salesman mentioned not Linux but the web: he said that someone “needs to make a good browser for Linux.” (Netscape Navigator was available for Linux at that time, but it wasn’t viewed as especially fast or featureful compared to its implementation on other operating systems.) Even back then, in the very first year or so of the web, the browser was already a necessary component of every computer. He even threw out a challenge: “how hard could it be to build a better browser?” Indeed, how hard could it be? What makes it so hard? That question stuck with me for a long time. (Meanwhile, the “better Linux browser than Netscape” took a long time to appear….)

How hard indeed! After seven years in the trenches working on Chrome, I now know the answer to his question: building a browser is both easy and incredibly hard, both intentional and accidental, both planned and organic, both simple and unimaginably complex. And everywhere you look, you see the evolution and history of the web wrapped up in one codebase.

As you’ll see in this book, it’s surprisingly easy to write a very simple browser, one that can, despite its simplicity, display interesting-looking web pages and support many interesting behaviors. This starting point—that it’s easy to write and support basic web pages—encapsulates the (intentionally!) easy-to-implement core of the web architecture. You might relate this to the history of the web and the idea of progressive enhancement.

Even in real browsers, simplicity is easy to find. For example, sometime during my first few months of working on Chrome, I came across the code implementing the <br> tag—look at that, the good old <br> tag that I’ve used many times to insert newlines into web pages! And the implementation turns out to be barely any code at all, both in Chrome and in this book’s simple browser.
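To give a flavor of just how little code that is, here is a minimal sketch in Python, in the spirit of this book’s simple browser rather than Chrome’s actual implementation (the class and method names are hypothetical): in a simple text layout loop, the entire <br> behavior is a single call that ends the current line.

    FONT_HEIGHT = 18   # assume a fixed line height, for simplicity

    class Layout:
        def __init__(self, width):
            self.width = width                 # available horizontal space
            self.cursor_x, self.cursor_y = 0, 0

        def open_tag(self, tag):
            if tag == "br":
                self.newline()                 # the entire <br> "implementation"

        def word(self, word_width):
            if self.cursor_x + word_width > self.width:
                self.newline()                 # ordinary line wrapping
            self.cursor_x += word_width        # place the word on the current line

        def newline(self):
            self.cursor_x = 0                  # back to the left edge...
            self.cursor_y += FONT_HEIGHT       # ...and down one line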

On the other hand, a browser with all the features, performance, security, and reliability of today’s top browsers—that is a whole lot of work. Thousands of person-years of effort went into the browsers you use today. And keeping a browser competitive is a lot of work as well: not only is there an inherent cost to maintaining large codebases, there is also constant pressure to do more—to add more features, improve performance, and keep up with the “web ecosystem”—the thousands of businesses, millions of developers, and billions of users that use the web.

Every browser has thousands of unfixed bugs, from the smallest of mistakes to the myriad ways features can mix and mismatch. And every browser has a complicated set of optimizations to squeeze out that last bit of performance. And every browser requires painstaking, but necessary, work to continuously refactor the code to reduce its complexity, often through the careful introduction of modularization and abstraction. (Browsers are so performance-sensitive in many places that merely the introduction of an abstraction—the function call or branching overhead—can cause an unacceptable performance cost.)

Working on such a codebase is often daunting. For one thing, there is the weighty history of each browser. It’s not uncommon to find lines of code last touched 15 years ago by someone you’ve never met; to discover, even after years of work, files and code that you didn’t know existed; or to see lines of code that don’t look necessary, yet seem to do something important. If I want to know what that 15-year-old code does, how can I find out? Does that code I just discovered matter at all? Can I delete those lines of code, or are they there for a reason?

These kinds of quandaries are common to all complex codebases. But what makes a browser different is that there is often an urgency to fix them. Browsers are nearly as old as any “legacy” codebase, but are not legacy, not abandoned or half-deprecated, not slated for replacement. On the contrary, they are vital to the world’s economy. Browser engineers are forced to fix and improve rather than abandon and replace. (I don’t intend the negative connotation: it’s because browsers are so important that we browser developers can afford an iterative and continuous process of improvement.)

It’s not just urgency though—understanding the cumulative answers to these small questions yields true insights into how computing actually works, and where future innovations may appear. In fact, browsers are where the fun of algorithms comes to life. Where else can one explore the limits of so many parts of computer science? Consider that a browser contains: a rendering engine more complex and powerful than any computer game; a full networking stack; many clever data structures and parallel programming techniques; a virtual machine, interpreted language and JIT; world-class security sandboxes; and uniquely dynamic systems for storing data. On top of this, the browser interacts in a fascinating and symbiotic way with the huge number of web pages deployed today.

The web in history

The public Internet and the Web co-evolved, and in fact many people’s first experiences of the Internet from the 1990s onward were more or less experiences with the web. However, it’s important to distinguish between them, since the Internet and the web are in fact not synonymous.

In the early days, the similarity between the physical structure of the web—where the web servers were—and the names of the websites was very strong. The Internet was a worldwide network of computers; those computers had domain names, many of them ran web servers, and those servers stored and provided web pages to browsers that asked for them. In this sense, the Internet and the web really were closely related at that time. However, there is of course nothing inherent about this: nothing forces you to host your own web server on your home computer and Internet connection (people actually did this! and when their website became popular, it often ran out of bandwidth or computing power and became inaccessible), and the same goes for a university or corporation. Likewise, there is nothing requiring everyone to have their own website rather than a social networking account. These days, almost everyone uses a virtual machine or service purchased from one kind of cloud computing service or another to run their websites, regardless of how small or large, and there are many products available that can easily publish your web content on your behalf on various social networking platforms.

This same virtualization concept also applies to the implementation of web pages themselves. While it’s still possible to write HTML by hand, few of the most popular web pages’ HTML payloads literally exist on a hard drive somewhere. Instead, their component pieces and dependent databases exist, and the final product is dynamically assembled on the fly by complex build and “rendering” systems and sent over the Internet on demand to your browser. (“Server-side rendering” is the process of assembling HTML on the server when loading a web page. It often uses web tech like JavaScript, and even a headless browser. Yet one more place browsers are taking over!) The reason for this is that the contents themselves are dynamic—composed of data from news, blog posts, inbox contents, advertisements, and algorithms adjusting to your particular tastes.

There is also the aforementioned web app, which is a computer application written entirely as a web page. These applications are widespread, and they are gradually expanding to include nearly all types of computer tasks as the capabilities of browsers to support those tasks improve. While these web apps are part of the web (e.g. they are loadable via URL), thinking of them as web pages is sometimes confusing. To deal with this confusion, there is often a conceptual distinction made (even if it is blurry in practice) between an “informational” web page and a “task-based” web app, even though they use the same underlying technology. Related to the notion of a web app is the PWA (Progressive Web App, where “progressive” refers to progressive enhancement), which is a web app that progressively becomes indistinguishable from a native app.

For these reasons, it’s sometimes confusing to know what we should think of as “the web”. It’s worth noting that the definition below is not accidental, and is part of the original design of the web; the fact that the web not only survived but thrived during the process of “virtualization” of hosting and content further demonstrates the elegance and effectiveness of its original design. Here is one definition that gets at the essence of its implementation building blocks:

One might try to argue that HTTP, URLs and hyperlinking are the only truly essential parts of the Web, or argue that a browser is not strictly necessary, since conceptually web pages exist independently of the browsers that render them, and could in principle render themselves through dedicated applications. For example, if you’re using an installed PWA, are you using a browser? In other words, one could try to distinguish between the networking and rendering aspects of the web; likewise, one could abstract the concept of linking and networking from the particular choice of protocols and data formats.

It is indeed true that one or more of the implementation choices could be replaced, and perhaps that will happen over time. For example, JavaScript might eventually be replaced by another language or technology, HTTP by some other protocol, or HTML by its successor. In practice, it is not really the case that networking and rendering are separated, and there are in fact important inter-dependencies—for example, HTML plays a critical role in both rendering and hyperlinks. It’s best to just consider browsers and HTML (and CSS and JavaScript) part of the core definition of the web. In any case, as with all technology, the web continues to evolve. The above definition may change over time, but for the purposes of this book, it’s a pretty good one.

Technological precursors

The web is, at its core, organized around representing and displaying information, and around providing a way for humans to effectively learn and explore that information. The collective knowledge and wisdom of the species long ago exceeded the capacity of a single mind, organization, library, country, culture, group or language. However, while we as humans cannot possibly know even a tiny fraction of what is possible to know, we can use technology to learn more efficiently than before, and most importantly, to quickly access information we need to learn or remember. Computers and the Internet allow us to process and store as much information as we want. The web, on the other hand, plays the role of organizing and finding that information and knowledge to make it useful. The search engine Google’s mission statement, to “organize the world’s information and make it universally accessible and useful”, is almost exactly the same as this. That is not a coincidence: the search engine concept is inherently connected to the web.

The earliest exploration of how computers might revolutionize information is a 1945 essay entitled As We May Think. (This brief prehistory of the web is by no means exhaustive; view it instead as a glimpse into a much larger—and quite interesting in its own right—subject.) This essay envisioned a machine called a Memex. The Memex was an imagined machine that helps (think: User Agent) an individual human see and explore all the information in the world. It was described in terms of the microfilm screen technology of the time, but its purpose and concept have some clear similarities to the web as we know it today, even if the user interface and technology details differ.

The concept of networked links of information began to appear in about 1964-65, when the term “link” appeared (though connected to text rather than pages). These concepts are also the computer-based evolution of the long tradition of citation in academics and literary criticism. Researchers then began to advocate for building a network of computers to realize the concept. Independently, the first hyperlink system appeared (though apparently not using that word; the word “hyperlink” may have first appeared in 1987, in connection with the HyperCard system on the Macintosh) for navigating within a single document; it was later generalized to linking between multiple documents. This work also formed one of the key parts of the mother of all demos, the most famous technology demonstration in the history of computing.

In 1983 the HyperTIES system was developed around highlighted hyperlinks. This was used to develop the world’s first electronically published academic journal, the 1988 issue of the Communications of the ACM. Tim Berners-Lee cites this 1988 event as the source of the link concept in his World Wide Web proposal, in which he proposed to join the link concept with the availability of the Internet, thus realizing many of the original goals of all the work from previous decades. (Nowadays the World Wide Web is called just “the web”, or “the web ecosystem”—“ecosystem” being another way to capture the same concept as “World Wide”. The original wording lives on in the “www” in many website domain names.) The web itself is, therefore, an example of the realization of previous ambitions and dreams, just as today we strive to realize the vision laid out by the web. (No, it’s not done yet!)

In 1989-1990, the first browser (named “WorldWideWeb”) and web server (named “httpd”, for “HTTP Daemon” according to UNIX naming conventions) were born, again written in their first version by Berners-Lee. Interestingly, that browser’s capabilities were in some ways inferior to the browser you will implement in this book (no CSS!), and in some ways went beyond the capabilities available even in modern browsers. For example, it included the concept of an index page meant for searching within a site (vestiges of which exist today in the “index.html” convention when a URL path ends in “/”), and had a WYSIWYG web page editor (the “contenteditable” HTML attribute and the “innerHTML” property on DOM elements have similar semantic behavior, but built-in file saving is gone). Today, the index is replaced with a search engine, and web page editors as a concept are somewhat obsolete due to the highly dynamic nature of today’s web page rendering. On December 20, 1990 the first web page was created. The browser we will implement in this book is easily able to render this web page, even today. (Also, as you can see clearly, that web page has not been updated in the meantime, and retains its original aesthetics!) In 1991, Berners-Lee advertised his browser and the concept on the alt.hypertext Usenet group.

Berners-Lee has also written a Brief History of the Web that highlights a number of other interesting factors leading to the establishment of the web as we know it. One key factor was its decentralized nature, which he describes as arising from the culture of CERN, where he worked. The decentralized nature of the web is a key feature that distinguishes it from many systems that came before or after, and his explanation of it is worth quoting here (highlight is mine):

There was clearly a need for something like Enquire [ed: a predecessor software system] but accessible to everyone. I wanted it to scale so that if two people started to use it independently, and later started to work together, they could start linking together their information without making any other changes. This was the concept of the web.

This quote captures one of the key value propositions of the web. The web was successful for several reasons, but I believe it’s primarily the following three:

The browser ecosystem

Browsers have a unique character in that they are not proprietary—no company controls the APIs of the web and there are multiple independent implementations. In addition, it turned out that over time almost all of the code became open source, developed by a very wide array of people and entities. As a corollary, the software platform for web pages is also not proprietary, and the information and capabilities contained within them are easy to make accessible to everyone. (Unless, of course, the web page owner chooses to restrict availability for one reason or another. The point is that the web platform does not restrict availability, and therefore the owner has the freedom to choose.)

I’ll now give a brief overview of the evolution of browser implementations. The first widely distributed browser may have been ViolaWWW, which also pioneered multiple interesting features such as applets and images. It was in turn the inspiration for NCSA Mosaic, which launched in 1993. One of the two original authors of Mosaic went on to co-found Netscape, which launched the first commercial browser in 1994. (By commercial I mean built by a for-profit entity. Netscape’s early versions were also not free software—you had to buy them from a store.) The era of the “first browser war” ensued, a competition between Netscape and Internet Explorer. In addition, there were other browsers with smaller market shares; one notable example is Opera. The WebKit project began in 1999 (Safari and Chromium-based browsers, such as Chrome and newer versions of Edge, descend from this codebase). Likewise, the Gecko rendering engine was originally developed by Netscape starting in 1997; the Firefox browser is descended from this codebase. During the first browser war period, nearly all of the core features you will implement in the browser that accompanies this book were added, including CSS, the DOM, and JavaScript.

The “second browser war”, which according to Wikipedia lasted from 2004 to 2017, was between a variety of browsers: Internet Explorer, Firefox, Safari and Chrome in particular. Chrome split off its rendering engine subsystem into its own codebase, called Blink, in 2013.

In parallel with these developments was another, equally important, one—the standardization of web APIs. In October 1994, the World Wide Web Consortium (W3C) was founded in order to provide oversight and standards for web features. After this point, browsers would often introduce new HTML elements or APIs, and competing browsers would copy them. Those elements and APIs were subsequently agreed upon and documented in W3C specifications. (These days, an initial discussion, design and specification precedes any new feature.) Later on, the HTML specification ended up moving to a different standards body called the WHATWG, but CSS and other features are still standardized at the W3C. JavaScript is standardized at TC39 (“Technical Committee 39” at ECMA, yet another standards body). HTTP is standardized by the IETF.

In the first years of the web, it was not at all clear that browsers would remain standardized, or that one browser would not end up “winning” and becoming another proprietary software platform. There are multiple reasons this didn’t happen, among them the egalitarian ethos of the computing community and the presence and strength of the W3C. Equally important was the networked nature of the web, and therefore the desire of web developers to make sure their pages worked correctly in most or all of the browsers (otherwise they would lose customers), leading them to avoid any proprietary extensions.

Despite fears that this might happen, there never really was a point where any browser openly attempted to break away from the standard. Instead, intense competition for market share was channeled into very fast innovation and an ever-expanding set of APIs and capabilities for the web, which we nowadays refer to as the web platform, not just the “World Wide Web”. This recognizes the fact that the web is no longer a document viewing mechanism, but has evolved into a fully realized computing platform and ecosystem. (There have even been operating systems built around the web! Examples include webOS, which powered some Palm smartphones; Firefox OS, which today lives on in KaiOS-based phones; and ChromeOS, which is a desktop operating system. All of these OSes are based on using the Web as the UI layer for all applications, with some JavaScript-exposed APIs on top for system integration.)

Given these outcomes, in retrospect it is clearly not so relevant to know which browser “won” or “lost” each of the browser “wars”. In both cases the web won and was preserved and enhanced for the benefit of the world. In economic terms, enforcing a standard set of APIs across browsers made the web platform a commodity; instead of competing based on lock-in, browsers compete on performance and quality, and on browser features that are not part of the web platform—for example, tabbed UIs and search engine integration. (Browser development is also primarily funded by revenue from search engine advertisements; a secondary typical funding motivation is to improve the competitive position of an operating system or device owned or controlled by the company building the browser. Compare this with the RedHat anecdote I related earlier!)

An important and interesting outcome of the second browser war was that all mainstream browsers today (of which there are many more than three) are based on three open-source web rendering / JavaScript engines: Chromium, Gecko and WebKit. Examples of Chromium-based browsers include Chrome, Edge, Opera (which switched to Chromium from the Presto engine in 2013), Samsung Internet, Yandex Browser, and UC Browser; in addition, there are many “embedded” browsers, based on one or another of the three engines, for a wide variety of automobiles, phones, TVs and other electronic devices. (The JavaScript engines are actually in different repositories, as are various other sub-components that we won’t get into here, and can and do exist outside of browsers as JavaScript virtual machines; one important such application is the use of v8 to power node.js. However, each of the three rendering engines does have its own JavaScript implementation, so conflating the two is reasonable.) Chromium and WebKit have a common ancestral codebase, and Gecko is an open-source descendant of Netscape, so all three date back to the 1990s—almost to the beginning of the web. That this occurred is not an accident, and in fact tells us something quite interesting about the most cost-effective way to implement a rendering engine based on a commodity set of platform APIs.

How browsers evolve

At the highest level, a browser has two major pieces of code: an implementation of the web platform APIs (sometimes called a web rendering engine), and a browsing UI and accompanying features, such as search engine integration, bookmarks, navigation, tabs, translation, autofill, password managers, data sync and so on.

Web rendering engines have a lot in common with any other very large software project—they have a very high total cost of development, and a significant maintenance cost that increases over time (due to the ever-expanding feature set). However, they also have a unique character in that their priorities are heavily influenced by the community and ecosystem. In other words, since the character of the web itself is highly decentralized, what use cases end up getting met by browsers is to a significant extent not determined by the companies “owning” or “controlling” a particular browser. For example, web developers and others in the community contribute many good ideas and proposals that end up implemented in browsers.

Due to the very high cost of building and maintaining an implementation of the web platform, and because they are commodities, web rendering engines are today all open-source. This allows sharing the burden (opportunity?) of maintenance and feature development across a larger community. They evolve like giant R&D projects, where new ideas are constantly being proposed and tested out in discussions and implementations. Like any R&D project, these engines have an iterative and incremental planning and shipping process. And just as you would expect, some features fail and some succeed. The ones that succeed end up in specifications and are implemented by all browsers.

Browsers and you

This book explains how to build a simple web rendering engine plus browser shell, as well as many details about advanced features and the architecture of a real browser’s rendering engine. From this you will learn what is easy and what is hard in these engines: which algorithms are simple, and which are tricky; what makes a browser fast, and what makes it slow; and all the core concepts you need to understand to predict the behavior of a real-world browser. After reading all of it, you should be able to dig into the source code for a real browser’s rendering engine and understand it without too much trouble.

The intention of the book is for you to build your own browser as you work through the early chapters. Once your browser is up and running, there are endless opportunities to improve performance or add features. Many of the exercises at the ends of the chapters are suggested feature enhancements that are similar to ones that come up in real browsers. We encourage you to try the exercises—adding these features is one of the most fun parts of browser development! It’s also a lot of fun (and very satisfying) to compare your browser with a real one, or see how many websites you can successfully render.

The browser is an essential part of computing, and this chapter gave evidence of that fact, along with a flavor of the depth and history of the web and browsers. However, I believe that only by really understanding how a browser works will you fully appreciate and understand its beauty, complexity and power. I hope you come away from this book with a deeper sense of its beauty in particular—how it works, its relationship to the culture and history of computing and information, and what it’s like to be someone building a browser. But most of all, I hope you can connect all of that to you, your career in software and computers, and the future in general. After all, it’s up to you to invent and discover what comes next!