HTML Text and Semantics

There are a variety of elements in HTML that are used for providing structure to the content.

The semantics and hierarchical structure of an HTML document are important for a variety of reasons, including:

  1. The content is organized in a well-formatted way.
  2. The content is accessible without the need for vision, for example through a screen reader.
  3. The web page is optimized for search engines.
  4. It is responsible development etiquette: if our styles fail and the HTML is well formatted, the content is still accessible.

HTML Document

Every HTML document has a basic structure which is required for it to be valid. Here is the basic structure of an HTML document:

<!DOCTYPE html>
<html lang="en">
  <head>
    <!-- Document information -->
    <meta charset="UTF-8">
    <title>Document</title>
  </head>
  <body>
    <!-- Document Content -->
  </body>
</html>

  • <!DOCTYPE html> is the first thing we add to our HTML document. It tells the browser that this is an HTML5 document, so the page is rendered in standards mode. The doctype is not case-sensitive, so <!doctype html> works as well.
  • <html></html> is the next tag and the root of the page: anything else we add to the page must be inside the opening <html> and closing </html> tags. Its lang attribute declares the language of the content.
  • The <html></html> tag has two child tags, <head></head> and <body></body>. <head></head> is the brain of the document: this is where we add information about the document and links to the stylesheets, which are the CSS documents. This information is mainly for the browser and is mostly not displayed to users. The two tags added to the head in the example should always be used:
    • <meta charset="UTF-8"> is a meta tag that tells the browser the character encoding of the web page. UTF-8 covers the characters of almost every language in the world.
    • <title>Document</title> is the title of the web page, which is displayed on the browser tab, not on the page itself.
  • <body></body> is the main body of the document; this is what we see in our browsers. All the content of the web page must be added inside the opening <body> and closing </body> tags.
  • <!-- --> is an HTML comment. The browser does not display comments; they are only for adding notes in the code for ourselves or future developers. Anything between the opening comment syntax <!-- and the closing comment syntax --> is ignored when the page is rendered.

Parent child relation

One of the points above mentions a child. We use these relations in HTML and in CSS for selecting elements:

  • Parent: the element that contains other elements
  • Child: the element that is directly inside another element, its parent
  • Sibling: an element that is inside the same parent as another element

From our example above:

  • <head></head> is the parent element
  • <meta charset="UTF-8"> and <title></title> are the children of <head></head>
  • <meta charset="UTF-8"> and <title></title> are siblings
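
The same relations can be annotated directly on the basic document. This is only an illustrative sketch that reuses the structure from the example above, with comments added to label each relation.

```html
<html lang="en">
  <!-- <html> is the parent of <head> and <body> -->
  <head>
    <!-- <meta> and <title> are children of <head> and siblings of each other -->
    <meta charset="UTF-8">
    <title>Document</title>
  </head>
  <!-- <body> is a child of <html> and a sibling of <head> -->
  <body></body>
</html>
```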

Indentation

When writing HTML we use indentation to structure our code, which makes it easier to read.

In the example above, each time we add a child element inside its parent, we indent it one level using the tab key:

<html>
  <head></head>
  <body></body>
</html>

Page layout

When creating HTML documents we will mainly focus on the content displayed to the users of our website, which is the content we add inside the <body></body> tag.

Page layout semantic tags are used to give specific meaning to different parts of our page, such as the main article, header of the page, the footer of the page, navigation, different sections of the page and so on.

<header></header> acts as the master header of the page when used as a child of the <body></body> tag. When added inside an <article></article> tag, it acts as the header for that article.

<footer></footer> acts as the footer of the web page when used as a child of the <body></body> tag. When added inside an <article></article> tag, it acts as the footer for that article.

<nav></nav> is the navigation of the page; it contains the links to other pages of the website.

<main></main> is the primary content of the page. A page should have only one visible <main></main>.

<article></article> contains an independent piece of content. It could be shared as a stand-alone piece of content and would still make sense.

<aside></aside> is secondary content on the page. This information is only loosely related to the main topic of the page and is not required to understand it.

<section></section> is a group of related content on the page. A section should have its own heading.

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <title>My First Page</title>
  </head>
  <body>
    <header>
      This is my Website Header
      <nav>This is where the navigation links will be added</nav>
    </header>
    <main>
      This is the main content of the page
      <article>This is the article on the page</article>
      <aside>Some additional but not relevant information</aside>
      <section>Some more information about our website</section>
    </main>
    <footer>The footer of the web page</footer>
  </body>
</html>
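
The page example uses <header></header> and <footer></footer> only at the page level. As a minimal sketch of the other case described above, the fragment below places them inside an <article></article>, where they describe the article instead of the page. All of the text content is placeholder wording.

```html
<main>
  <article>
    <!-- Inside <article>, <header> and <footer> belong to the article, not the page -->
    <header>The title and introduction of the article</header>
    The body of the article
    <footer>The author and publication date of the article</footer>
  </article>
  <aside>Related information that is not part of the article</aside>
</main>
```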

Generic Tags

There are two main generic tags in HTML that carry no semantic meaning. They are mainly used to wrap content when we do not want to give it any semantic meaning but need a parent element to style the content or add interactivity using JavaScript.

<div></div> is used to wrap around multiple child elements. A div divides the content into logical groups and is a block-level element, so by default it starts on a new line.

<span></span> is used for a single element or a short piece of content. A span is displayed inline with the rest of the content.
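
A short sketch of the two generic tags together is given here. The class names "notice" and "delivery-time" are hypothetical hooks for CSS or JavaScript, and the text is placeholder content.

```html
<!-- <div> groups several elements so they can be styled or scripted as one block -->
<div class="notice">
  <!-- <span> wraps a few words inside the line without breaking it -->
  <p>Your order ships in <span class="delivery-time">2 days</span>.</p>
  <p>Thank you for shopping with us.</p>
</div>
```

Without any CSS, neither tag changes how the text looks: the div only starts on its own line, and the span stays in the flow of the sentence.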

Text Semantics

Text semantic tags allow us to give semantic meaning to the text on the page. The browser styles the content of most of these tags so that the text is visually distinct.

Headings

We can use the <h1></h1> tag for the most important heading of the page. For the next levels of heading we have <h2></h2>, <h3></h3>, <h4></h4>, <h5></h5>, and <h6></h6>. The browser will style each heading differently, starting with the biggest size for h1, then h2, and so on.

Paragraphs

To add a paragraph of text we can use the <p></p> tag. The browser adds vertical space before and after the paragraph.
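
Headings and paragraphs are usually combined into an outline. The sketch below uses an invented topic and assumes the common practice of not skipping heading levels, so an h3 only appears under an h2.

```html
<h1>Growing Tomatoes</h1>
<p>Tomatoes are easy to grow at home.</p>

<h2>Planting</h2>
<p>Plant the seedlings after the last frost.</p>

<h3>Soil</h3>
<p>Use soil that drains well.</p>

<h2>Harvesting</h2>
<p>Pick the tomatoes when they are fully coloured.</p>
```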

Quote

<q></q> a small quotation embedded within other content, for example <q>You become what you think</q>. The browser adds the quotation marks.

Blockquote

<blockquote></blockquote> a large, stand-alone quote from another source.

Citation

<cite></cite> a citation for another source, often used with quotations. The HTML standard defines it as the title of a work (a book, a movie, etc.); in practice it is also often used for a person’s name or a URL.

<blockquote>Love yourself! Because you're the only person that will be with you, your entire life. -<cite>Mr. Bean</cite></blockquote>

Love yourself! Because you're the only person that will be with you, your entire life. -Mr. Bean

Emphasis

<em></em> a string of emphasized, slightly more important text. Some screen readers may change their voice for this text.

Important text

<strong></strong> a string of highly emphasized, much more important text. Some screen readers may change their voice for this text.
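
The difference between the two is easier to see side by side. In this sketch, with invented sentences, <em></em> changes the stress within a sentence while <strong></strong> flags something the reader should not miss. Browsers typically show the first in italics and the second in bold.

```html
<!-- Emphasis: the stress falls on "before" -->
<p>Please save your work <em>before</em> closing the editor.</p>

<!-- Importance: the warning label matters more than the text around it -->
<p><strong>Warning:</strong> unsaved changes will be lost.</p>
```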

Abbreviation

<abbr title="…"></abbr> an acronym or abbreviation, like “HTML”, “CSS”, etc. The title attribute contains the expanded version, for example <abbr title="Hypertext Markup Language">HTML</abbr>.

Highlighting text

<mark></mark> used to highlight a piece of text for reference. The keywords in a search results page, the current navigation item, etc.
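
As a sketch of the search results case, assume a hypothetical search for the word "semantic". The matching word in each result is wrapped in <mark></mark>, which browsers typically show with a yellow background.

```html
<!-- Hypothetical results page for the search term "semantic" -->
<p>Results for: semantic</p>
<p>Using <mark>semantic</mark> tags makes a page easier to navigate.</p>
<p>A <mark>semantic</mark> structure also helps search engines.</p>
```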

Technical term

<i></i> marks text that is set off from the surrounding content: a technical term, a ship name, a book title, a thought, sarcasm, or a phrase in another language.

Keyword

<b></b> draws attention to a keyword without adding importance, like a product name in a review or the lead sentence of a paragraph.
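
A few illustrative uses of both tags follow. The sentences and the product name are made up for this sketch, and the lang attribute on the second line is an optional addition that tells the browser which language the phrase is in.

```html
<!-- <i>: a ship name and a word in another language -->
<p>They crossed the Atlantic on the <i>Queen Mary</i>.</p>
<p>The word <i lang="fr">bonjour</i> means hello.</p>

<!-- <b>: a product name in a review -->
<p>The <b>Acme Kettle</b> boils a litre of water in two minutes.</p>
```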

Subscript text

<sub></sub> defines text as being subscript, for example H<sub>2</sub>O.

Superscript text

<sup></sup> defines text as being superscript, for example 10<sup>th</sup>.

Deleted content

<del datetime="…"></del> content that was deleted after the document was published. The datetime attribute defines when it was removed. For example: <del>No pun intended</del>
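
A minimal sketch of a deletion with a date is shown here. The sentence and the date 2024-03-01 are invented for illustration. Browsers typically draw a line through the deleted text.

```html
<!-- The old start time was removed on a hypothetical date, 1 March 2024 -->
<p>The workshop starts at <del datetime="2024-03-01">9:00</del> 10:00.</p>
```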

Address

<address></address> contact information: email, telephone, postal address, etc. For example:

<address>1345 Woodroffe Avenue</address>

Line Break

<br> creates a line break that’s significant to the content. It is useful in poems and addresses, where the division of lines is important. Do not use it to create space in a design; use CSS margins and padding instead.
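
Addresses are a typical place where line breaks carry meaning. In the sketch below, only the street line comes from the example above. The city line and the email address are hypothetical placeholders.

```html
<address>
  1345 Woodroffe Avenue<br>
  Example City, Example Province<br>
  hello@example.com
</address>
```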

Horizontal Line

<hr> represents a thematic break in the content, for example a scene change or topic change. Do not use it to create a decorative horizontal line; use CSS borders instead.
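
A scene change in a story could be marked up as follows. The two sentences are invented for this sketch.

```html
<p>The rain stopped just as they reached the station.</p>
<!-- The story jumps to a new scene, so the break is part of the content -->
<hr>
<p>Three years later, the town looked completely different.</p>
```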


By default, all browsers give HTML some style. This includes increased font size and boldness on headings, indentation for quotes and for the items in a numbered list, and the space between each of these things.

This default styling can differ across browsers and screen sizes and is usually relative to the base font size.

Also by default, HTML scales to any screen size and can be served to many devices and systems.

HTML Text Based Reference

Thomas Bradley