Introduction
To make good sites you need good structure, and on the Web that starts with HTML. How you mark up your pages gives them a solid structure both now and in the future. Whatever the context, whether you’re building a heavily interactive web app, a hybrid mobile app, or a one-page brochure site, putting a sound structure in place is a top priority. A solid structure makes your pages more accessible and easier to author and maintain, and helps browsers and other user agents make sense of your pages. A well-structured DOM can also give a performance boost, making parsing easier for the browser and requiring less memory.
Beyond simple structure is semantic richness. Giving the content on your pages this extra meaning provides an immediate benefit: It’s easier for search engines to crawl and understand your data. Longer-term benefits that haven’t even been invented yet may also arise.
HTML5 and related technologies make all of this easy. Using existing and well-implemented methods, you can create pages that are solid, meaningful, high performing, and rich in data.
New Elements in HTML5
One of the major new features in HTML5 is a range of new semantic elements, extending the suite far beyond its roots in marking up scientific documents with headings, lists, and paragraphs. Most of the new elements are aimed at giving a page better structure and developers more options for marking up areas of content than just using a div with an associated id or classes.
Here’s one example. In the past, developers might have used this:
<div class="article">…</div>
In HTML5, they have the option of using this:
<article>…</article>
Chapter 2. Structure and Semantics
The W3C’s HTML5 spec lists ten structural elements. Of these, three already existed in HTML4: body, h1–h6 (if we cheat a little and count them as a single entity), and address. Of the seven new elements, four are what are known as sectioning content; I’ll get to what this means in a little while, but for now here’s the list:
- article. An independent part of a document or site, such as a forum post, blog entry, or user-submitted comment.
- aside. An area of a page that is tangentially connected to the content around it, but which could be considered separate, like a sidebar in a magazine article.
- nav. The navigation area of a document, an area that contains links to other documents or other areas of the same document.
- section. A thematic grouping of content, such as a chapter of a book, a page in a tabbed dialog box, or the introduction on a website home page.
The other three structural elements define areas within the sectioned content:
- hgroup. Used to group a set of multiple-level heading elements, such as a heading with its subheading or tagline.
- header. Possibly the header of a document, but could also be the header of an area of a document, generally containing heading (h1–h6) elements to mark up titles.
- footer. The footer of a document or of an area of a document, typically containing metadata about the section it’s within, such as author details.
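As a rough sketch of how these elements might sit together on one page, consider the markup below. The page content, the id values, and the link targets are all invented for illustration, and hgroup is left out to keep the example short.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example blog (hypothetical)</title>
</head>
<body>
  <!-- header: introductory content for the whole document -->
  <header>
    <h1>Example blog</h1>
    <!-- nav: links to other areas of this same document -->
    <nav>
      <ul>
        <li><a href="#post">Latest post</a></li>
        <li><a href="#related">Related reading</a></li>
      </ul>
    </nav>
  </header>

  <!-- article: an independent part of the site, here a blog entry -->
  <article id="post">
    <!-- header: this one belongs to the article, not the page -->
    <header>
      <h1>A placeholder post title</h1>
    </header>

    <!-- section: a thematic grouping inside the article -->
    <section>
      <h1>Background</h1>
      <p>Placeholder paragraph.</p>
    </section>

    <!-- footer: metadata about the article it sits in -->
    <footer>
      <p>Written by a placeholder author.</p>
    </footer>
  </article>

  <!-- aside: tangentially related content, like a sidebar -->
  <aside id="related">
    <h1>Related reading</h1>
    <p>Placeholder sidebar text.</p>
  </aside>

  <!-- footer: this one belongs to the document as a whole -->
  <footer>
    <p>Placeholder site-wide footer.</p>
  </footer>
</body>
</html>
```

Note that header and footer each appear twice: once for the document and once for the article, matching the "document or area of a document" wording in the list above.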
What’s the Point?
The stated aim of these new elements is to provide clear document outlines for better parsing by the browser and other machines, notably assistive technology like screen readers. Consider these outlines to be like document maps, showing the hierarchy of the content within, which headings are most important, the parent-child relationships between content areas, and so on.
In HTML4, this task was mostly done using the heading elements, h1 through h6: The h1 would be unique or the most important heading on the page, h2 elements were usually the direct children of h1, and so on. Seeing something like this was fairly common:
<h1>Great Apes</h1>
<h2>Gorilla</h2>
...
<h3>Eastern Gorilla</h3>
...
<h3>Western Gorilla</h3>
...
<h2>Orangutan</h2>
...
Nesting headings in this way creates this document outline:
- Great Apes
  - Gorilla
    - Eastern Gorilla
    - Western Gorilla
  - Orangutan
The structure I’ve created makes visual sense, and using headings in this way to create a document outline is known as implicit sectioning.
In HTML5, the sectioning content elements introduced earlier in this chapter create the sections in the outline, not the headers within those sections. This is explicit sectioning. So to get the same structure with our Great Apes markup in HTML5, we’d go for something like this:
<h1>Great Apes</h1>
<section>
  <h1>Gorilla</h1>
  <article>
    <h1>Eastern Gorilla</h1>
  </article>
  <article>
    <h1>Western Gorilla</h1>
    ...
  </article>
</section>
<section>
  <h1>Orangutan</h1>
  ...
</section>
The resulting outline would be the same as in the HTML4 example because each section or article element creates a new section in the outline. These are the sectioning content elements I mentioned earlier, along with aside and nav.
Each outline section should have a heading; any heading will do. In my example, I’ve used all h1 headings, but the heading level used doesn’t really matter because the sectioning content is what creates new sections. I could have rolled a die and used that number for each heading level for all the difference it makes.
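To make the die-rolling point concrete, here is a sketch of the same Great Apes markup with arbitrary heading levels. Going by the explicit sectioning described above, it should yield the same outline as the all-h1 version; the particular levels chosen are an assumption made purely for demonstration, and the paragraph text is placeholder.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Great Apes</title>
</head>
<body>
  <!-- Heading for the document itself; the level is arbitrary -->
  <h3>Great Apes</h3>

  <!-- Each section or article opens a new outline section -->
  <section>
    <h6>Gorilla</h6>
    <article>
      <h2>Eastern Gorilla</h2>
      <p>Placeholder text.</p>
    </article>
    <article>
      <h4>Western Gorilla</h4>
      <p>Placeholder text.</p>
    </article>
  </section>

  <section>
    <h1>Orangutan</h1>
    <p>Placeholder text.</p>
  </section>
</body>
</html>
```

This equivalence holds only for the outline as described here. Any tool that reads heading levels directly, rather than the sectioning elements, would see a very different hierarchy.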
As well as the heading (or headings, and possibly an hgroup element to wrap them in), each section can contain a distinct header and footer, plus further sections and sectioning roots. These roots are elements such as blockquote and figure, which can have their own outlines but don’t contribute to the overall document outline.
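A minimal sketch of a single section carrying all of these parts might look like the following. The text content is placeholder, and the heading inside the blockquote is there only to show a sectioning root holding its own heading.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Great Apes</title>
</head>
<body>
  <h1>Great Apes</h1>

  <section>
    <!-- header: wraps the heading and introductory content of this section -->
    <header>
      <h1>Gorilla</h1>
      <p>Placeholder introduction.</p>
    </header>

    <p>Placeholder body text.</p>

    <!-- blockquote is a sectioning root: per the description above, this
         heading belongs to the quotation's own outline and does not add
         an entry to the document outline -->
    <blockquote>
      <h1>Placeholder heading inside the quotation</h1>
      <p>Placeholder quoted text.</p>
    </blockquote>

    <!-- A further nested section, which does add to the document outline -->
    <section>
      <h1>Eastern Gorilla</h1>
      <p>Placeholder text.</p>
    </section>

    <!-- footer: metadata about the section it sits in -->
    <footer>
      <p>Placeholder author details.</p>
    </footer>
  </section>
</body>
</html>
```

Read against the description above, the document outline here would contain Great Apes, Gorilla, and Eastern Gorilla, with the blockquote's heading kept out of it.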
If this discussion isn’t making a lot of sense to you, you’re in good company. The confusion over what each of the sectioning content elements does is so common that the good HTML5 Doctor has created a flowchart (Figure 2-1) to help you choose the right element for the task at hand.
A flowchart. To help you choose an element. If you’re a good judge of tone, you might have started to get the impression that I’m not a fan of the new HTML5 structural elements. If so, you’re right.