Hypertext Markup Language
These notes generally follow the intro and header tutorials. There's also just a bit from text fundamentals.
  1. A Web Page Is
    1. A text document.
    2. Contains instructions for formatting.
  2. Where do web pages come from?
    1. Our web pages will come from a text editor. We will create the tags by hand.
    2. Other places to get HTML:
      1. Editors and content generation systems which output HTML.
      2. Export from programs such as word processors.
      3. Web sites for user content.
    3. Why do it by hand?
      1. More control.
      2. More understanding.
      3. You can fix the output of the fancy tools when they break, or work around them to get what you need.
  3. Tags
    1. Undecorated text is displayed in the browser.
    2. Tags structure the document.
      1. Opening and closing tags: <tagname>Content of the tag.</tagname>
      2. Open and close tags must balance, and may be nested. Think parentheses.
      3. The tags and contents are called an element.
    3. Attributes.
      1. Attributes are key-value pairs added to the (opening) tag to modify its behaviour.
      2. The value can be enclosed in single or double quotes.
      3. Quotes may sometimes be omitted, but it's asking for trouble.
      4. <tagname attrname="value">Content displayed accordingly</tagname>
    4. Block Elements. Control the arrangement of large blocks, such as paragraphs.
      1. Paragraph: <p>
        1. The paragraph tag accepts attributes, for instance align to position the text: <p align="center">, or "left" or "right".
        2. Modern web pages should use CSS instead of these attributes.
      2. Headings: <h1> through <h6>
      3. List creation: <ol>, <ul>, <li>. (More on lists below.)
    5. Inline Elements. These control what text looks like, but do not change the structure of the document.
      1. The first approach is to specify intent. The browser would decide how to actually display these.
        1. <em> Emphasize.
        2. <strong> Really emphasize!
        3. <cite> The Title of Some Work.
        4. <samp> Sample program output.
      2. But designers don't like giving up that much control. So from the early days there are also specific formatting tags:
        1. <b> Bold.
        2. <i> Italic.
        3. <u> Underline.
        4. <big> Larger text.
        5. <small> Smaller text.
        6. <strike> Strikeout.
      3. Modern pages should use CSS to control appearance, rather than the tags mentioned here; <big> and <strike> are obsolete in HTML5.
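    A minimal sketch of the block and inline elements above. The text content is made up for illustration, and the inline style attribute is just one assumed way of applying CSS in place of the old align attribute.

    ```html
    <h1>Main Heading</h1>
    <!-- Inline CSS centers the paragraph instead of the old align attribute. -->
    <p style="text-align: center">
      This is <em>emphasized</em>, this is <strong>strongly emphasized</strong>,
      and <cite>The Title of Some Work</cite> is a citation.
    </p>
    <h2>Subheading</h2>
    <p>The program printed <samp>Hello, world!</samp></p>
    ```

    Note that every opening tag is balanced by its closing tag, and the inline elements nest entirely inside the paragraph that contains them.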
  4. Correct page structure:
    <!DOCTYPE html>
    <html>
      <head>
        <meta charset="utf-8">
      </head>
      <body>
        Document contents.
      </body>
    </html>
    1. The <!DOCTYPE>, surprisingly, declares what type of document this is.
    2. The <html>, <head> and <body> tags structure the document.
    3. The head contains information about the document; the document itself is in the body.
    4. The <meta> tag can provide various information about a document. Here, it describes how the characters are encoded.
    5. The <!DOCTYPE> and <meta> tags are two of a very few that have no closing tag.
  5. Entities.
    1. Purposes
      1. To represent characters which would otherwise be markup.
      2. To represent characters which are not on the keyboard.
    2. Format
      1. Start with ampersand, end with semicolon.
      2. Numeric, &#60;
      3. Symbolic, &lt;
    3. Some entities:
      &lt;  ⇒  <
      &gt;  ⇒  >
      &amp;  ⇒  &
      &sum;  ⇒  ∑
      &divide;  ⇒  ÷
      &sube;  ⇒  ⊆
      &cent;  ⇒  ¢
      &pound;  ⇒  £
      &copy;  ⇒  ©
      &lambda;  ⇒  λ
      &Eacute;  ⇒  É
      &Uuml;  ⇒  Ü
      &atilde;  ⇒  ã
    4. Here is a list.
    5. Some Uses
      1. He paid €15 in München.
        He paid &euro;15 in M&uuml;nchen.
      2. A ⊆ { x | x ≥ 0 ∧ x < 2πr }
        A &sube; { x | x &ge; 0 &and; x &lt; 2&pi;r }
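    As a small sketch of the first purpose above, showing a tag literally on a page means escaping its angle brackets. The sentence itself is made up for illustration.

    ```html
    <!-- The browser displays: Use <p> & </p> around each paragraph. -->
    <p>Use &lt;p&gt; &amp; &lt;/p&gt; around each paragraph.</p>

    <!-- Numeric and symbolic forms of the same character: both display < -->
    <p>&#60; and &lt;</p>
    ```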
  6. HTML comment: <!-- Not displayed -->
  7. Practical Stuff
    1. Create a file with a text editor.
    2. View it locally.
    3. Copy to the server. View with the web browser.
    4. Copy changed versions across, press the reload button.
    5. Browsers may behave differently, though any modern browser should show a correct page the same way.
    6. Browsers may be forgiving of errors.
      1. You may have errors without knowing it.
      2. You may have odd behavior because of an error.
      3. Browsers may treat the same error very differently.
      4. Most browsers have developer tools which you might want to explore.
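    To illustrate the kind of error a browser may silently forgive, here is a hypothetical fragment with improperly nested tags, followed by a corrected version. A browser will typically display both without complaint, which is why such errors are easy to miss.

    ```html
    <!-- Incorrect: <em> is closed before the <strong> nested inside it. -->
    <p>This is <em>very <strong>important</em></strong> text.</p>

    <!-- Correct: tags close in the reverse of the order they were opened. -->
    <p>This is <em>very <strong>important</strong></em> text.</p>
    ```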
  8. The head section (<head>) contains metadata.
    1. Data about the document, rather than data which is part of the document.
    2. The browser does not display the head contents in the window.
  9. The title tag.
    1. This gives the title of the document.
    2. Generally displayed in the browser's title bar or tab.
    3. May be repeated in a heading tag such as <h1>, but they are separate things.
  10. The meta tag has several forms.
    1. <meta name="author" content="Your Name Here">
    2. <meta name="description" content="What this page is about">
    3. The similar <meta name="keywords"> tag was intended for search engines, but they ignore it now because of abuse.
    4. The Mozilla tutorial discusses some newer types created by certain web sites.
  11. We've already seen meta charset.
    1. Tells the browser how to interpret the bytes in your page as characters.
    2. The reasons why there is actually more than one choice are largely historic.
      1. Originally, computers pretty much only understood English. (Or maybe just American.)
      2. Various means were proposed to represent other characters; none was universally adopted.
      3. These conflicted, and browsers had to guess right or the page could look horrible.
      4. HTML finally standardized on the meta charset tag.
    3. The utf-8 encoding is a good compromise that can represent any language, but usually doesn't break under old software that hasn't heard of the rest of the world.
    4. We will be happy with utf-8 in this class, but feel free to experiment with the document type if you like.
  12. You can declare what language the page is written in using the lang attribute of the <html> tag:
    1. <html lang="en-US"> (or other).
    2. Helps with indexing and screen readers.
  13. The head may contain style, link and/or script tags, relevant to style sheets and JavaScript, which we will discuss later.
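    Putting the head-related pieces above together, a page might look like the sketch below. The title and heading text are placeholders chosen for illustration; the meta values are the ones used in these notes.

    ```html
    <!DOCTYPE html>
    <html lang="en-US">
      <head>
        <!-- Metadata: not displayed in the browser window. -->
        <meta charset="utf-8">
        <title>My First Page</title>
        <meta name="author" content="Your Name Here">
        <meta name="description" content="What this page is about">
      </head>
      <body>
        <!-- The heading repeats the title, but it is a separate thing. -->
        <h1>My First Page</h1>
        <p>Document contents.</p>
      </body>
    </html>
    ```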
  14. Lists.
    1. <ul>: Unordered (bullet) list
    2. <ol>: Ordered (numbered) list.
    3. Each item in a list (either type) is a list item, <li>.
    4. Lists may be nested. The default bullet symbol of an unordered list changes with nesting depth.
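    A short sketch of a nested list, with made-up item text. The inner list is placed inside the list item it belongs to, before that item's closing tag.

    ```html
    <ol>
      <li>First step</li>
      <li>Second step
        <!-- The nested list sits inside this <li>, before it closes. -->
        <ul>
          <li>A detail</li>
          <li>Another detail</li>
        </ul>
      </li>
      <li>Third step</li>
    </ol>
    ```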