Traditional Internet Application Protocols
  1. Application-Layer Protocols.
    1. Must define:
      1. The syntax and semantics of exchanged messages.
      2. Whether the client or server starts first.
      3. How to handle errors.
      4. How to know when you're done.
    2. Standard or private.
      1. Standard protocols are published by an authoritative organization, such as the Internet Engineering Task Force (IETF) or the World Wide Web Consortium (W3C).
      2. Anyone may make a private protocol and use it themselves or within an organization.
  2. Text-Based Protocols
    1. “Traditional” Internet application protocols use messages which are lines of plain text.
    2. Each line is terminated with the two-character sequence, \r\n.
    3. Built atop TCP streams.
    4. The line convention essentially breaks the stream into messages.
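A minimal sketch of how the line convention turns a TCP byte stream into messages. The class name `LineSplitter` and the sample bytes are illustrative assumptions, not part of any standard library; the point is only that chunk boundaries and line boundaries are unrelated.

```python
class LineSplitter:
    """Accumulates raw bytes from a stream and hands back complete lines."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk: bytes) -> list[bytes]:
        """Add a chunk as it arrived from the network; return any complete lines."""
        self._buffer += chunk
        # Everything before the last \r\n is complete; the remainder
        # stays in the buffer until more data arrives.
        *lines, self._buffer = self._buffer.split(b"\r\n")
        return lines


splitter = LineSplitter()
# TCP may deliver the bytes in arbitrary pieces.
first = splitter.feed(b"USER smi")
second = splitter.feed(b"th\r\nPASS secr")
third = splitter.feed(b"et\r\n")
print(first, second, third)
```

Each returned element is one protocol message with the \r\n terminator already removed.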
  3. Hypertext Transfer Protocol (HTTP).
    1. RFC 2616, but there are several other relevant ones.
    2. A simple file-transfer protocol.
    3. Traditional text protocol.
    4. Usually transfers HTML files, but may transport any type.
    5. Client-server protocol.
    6. Request
      1. Parts
        1. Request line starting with the request type.
        2. Zero or more headers of the form name: value
        3. A blank line.
        4. A body, possibly empty. (Often empty on requests.)
        GET /index.html HTTP/1.0\r\n
        User-Agent: FredView/0.03\r\n
        Host: sandbox.mc.edu\r\n
        Accept: */*\r\n
        Connection: Keep-Alive\r\n
        \r\n
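The four parts of a request can be assembled mechanically. The helper below is a sketch; the function name `build_request` is an assumption made for illustration, and the header values are taken from the example request.

```python
def build_request(method: str, path: str, headers: dict[str, str], body: bytes = b"") -> bytes:
    """Assemble request line, headers, blank line, and body into one byte string."""
    lines = [f"{method} {path} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Joining with \r\n and two trailing empty strings terminates the last
    # header and then adds the blank line that precedes the body.
    head = "\r\n".join(lines + ["", ""])
    return head.encode("ascii") + body


request = build_request(
    "GET",
    "/index.html",
    {"User-Agent": "FredView/0.03", "Host": "sandbox.mc.edu"},
)
print(request)
```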
    7. Response
      1. Parts
        1. Response line including the status code.
        2. Zero or more headers of the form name: value
        3. A blank line.
        4. A body, possibly empty.
        HTTP/1.1 200 OK\r\n
        Date: Tue, 22 Jan 2019 17:57:45 GMT\r\n
        Server: Apache/2.4.34 (Fedora)\r\n
        Last-Modified: Fri, 05 Oct 2012 22:55:37 GMT\r\n
        ETag: "81b-4cb57c5327434"\r\n
        Accept-Ranges: bytes\r\n
        Content-Length: 2075\r\n
        Keep-Alive: timeout=5, max=100\r\n
        Connection: Keep-Alive\r\n
        Content-Type: text/html; charset=UTF-8\r\n
        \r\n
        page contents
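Going the other way, a response can be taken apart at the blank line. This is a simplified sketch under the assumption that the whole response is already in memory; `parse_response` is a hypothetical helper, and a real client would also have to honor Content-Length and handle repeated headers.

```python
def parse_response(raw: bytes):
    """Split a complete response into (version, code, reason, headers, body)."""
    # The first blank line separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        # Split at the first colon only; values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Date: Tue, 22 Jan 2019 17:57:45 GMT\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"\r\n"
    b"page contents"
)
version, code, reason, headers, body = parse_response(raw)
print(code, reason, headers["content-type"], body)
```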
    8. The standard describes a few header names.
      1. These should be used as described.
      2. Other headers can be created at will.
      3. Clients ignore headers they don't understand.
    9. Request types.
        GET: Request a document. The body of the response will contain the document.
        HEAD: Request the header for the document. Like a GET, but the response body will be empty. The main use is to acquire the Last-Modified header to see if a local copy of the document should be refreshed.
        POST: Send data to the server. The request body will contain the data. This is usually used to send form data.
        PUT: Send data to the server, and store it in the indicated file. It is not often used.
      1. A server may create new types, but not redefine these.
      2. A server is required to implement only GET and HEAD.
    10. Response codes
      1. Three-digit codes, the first digit giving the category.
        1xx: Information; continuing.
        2xx: Success.
        3xx: Redirection.
        4xx: Client error.
        5xx: Server error.
      2. For instance,
        200 OK
        206 Partial Content
        301 Moved Permanently
        400 Bad Request
        403 Forbidden
        404 Not Found
        500 Internal Server Error
        501 Not Implemented
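Because the first digit carries the category, a client can react sensibly even to a code it has never seen. A small sketch, with the table and the function name `category` chosen here for illustration:

```python
CATEGORIES = {
    1: "Information",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def category(code: int) -> str:
    """Classify a three-digit response code by its first digit."""
    return CATEGORIES.get(code // 100, "Unknown")


for code in (200, 206, 301, 404, 501):
    print(code, category(code))
```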
    11. Caching.
      1. Clients retain pages to avoid fetching them unnecessarily.
      2. Can use HEAD to tell if a page has changed without fetching it.
      3. Newer method adds this header to a GET to save a round trip:
        If-Modified-Since: Wed, 21 Oct 2015 07:28:00 GMT
        Server returns 304 if document has not changed.
      4. Expires response header advises client how long to retain.
        Expires: Wed, 21 Oct 2015 07:28:00 GMT
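The server side of the If-Modified-Since check amounts to a date comparison. The sketch below uses `email.utils.parsedate_to_datetime` from the Python standard library to read the date format shown above; the function name `conditional_status` is an assumption, and a real server considers more than this one header.

```python
from email.utils import parsedate_to_datetime


def conditional_status(if_modified_since: str, last_modified: str) -> int:
    """Return 304 if the document is no newer than the client's copy, else 200."""
    client_copy = parsedate_to_datetime(if_modified_since)
    document = parsedate_to_datetime(last_modified)
    return 304 if document <= client_copy else 200


# The document was last changed in 2012 and the client's copy dates from 2015,
# so nothing needs to be sent.
status = conditional_status(
    "Wed, 21 Oct 2015 07:28:00 GMT",
    "Fri, 05 Oct 2012 22:55:37 GMT",
)
print(status)
```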
    12. Versions.
      1. 1.0 Original
      2. 1.1 Multiple requests on one connection, and many other incremental changes.
      3. 2.0 Substantial changes.
        1. Multiple simultaneous requests via multiple streams.
        2. Requests share headers to reduce redundancy.
        3. Server may send likely-needed documents before being requested.
  4. File Transfer Protocol (FTP) RFC 959
    1. Very old protocol; actually predates the Internet.
    2. Used for general file transfer.
    3. Login sequence:
      Server: 220 Welcome to our FTP server\r\n
      Client: USER smith\r\n
      Server: 331 Please specify password.\r\n
      Client: PASS Some Password\r\n
      Server: 230 Login successful\r\n
    4. Transferring a file requires some setup.
      1. Binary mode means to transfer the file literally. Pretty much the only mode used these days.
      2. Passive mode means the client, not the server, opens the data connection.
      Client: TYPE I\r\n
      Server: 200 Switching to binary mode\r\n
      Client: PASV\r\n
      Server: 227 Entering Passive Mode (10,27,0,14,75,41)\r\n
    5. The client makes a second connection to the server.
      1. The numbers in the 227 give the endpoint to connect to.
      2. IP 10.27.0.14
      3. Port 256×75+41=19241
      4. Connect to 10.27.0.14:19241
      Client: RETR somefile.txt\r\n
      Server: 150 Opening BINARY connection for somefile.txt\r\n
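The endpoint arithmetic for the 227 reply can be written out as a short function. This is a sketch; `parse_pasv` is a hypothetical helper name, and the regular expression assumes the six numbers appear in parentheses as in the example reply.

```python
import re


def parse_pasv(reply: str) -> tuple[str, int]:
    """Extract the (IP address, port) endpoint from a 227 reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"no endpoint in reply: {reply!r}")
    # The first four numbers are the IP address; the last two are the
    # high and low bytes of the port.
    *octets, high, low = match.groups()
    return ".".join(octets), 256 * int(high) + int(low)


endpoint = parse_pasv("227 Entering Passive Mode (10,27,0,14,75,41)")
print(endpoint)
```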
    6. The client downloads the file contents on the second connection, then closes it.
      Server: 226 File send OK\r\n
      Client: QUIT\r\n
      Server: 221 Goodbye\r\n
    7. Passive mode.
      1. The PASV request asks the server to listen and the client connects.
      2. In traditional use, the server connects to the client.
      3. The client-server idea was not part of the original Internet design.
    8. Multiple connections.
      1. Very unusual arrangement.
      2. If we sent the file on the control connection, it would be problematic if a file contains FTP commands.
    9. FTP is heck to firewall properly.
  5. Evolution of Email
    1. In the beginning (say the 1980's)
      1. No PCs. Users make text logins to timesharing systems.
      2. Mailboxes are just files (or directories) on these machines. Users read these messages locally.
      3. Mail sent is transmitted to the recipient's machine using the Simple Mail Transfer Protocol (SMTP).
      4. Machines run SMTP servers which receive email from others and store it in a local mailbox.
      5. SMTP is a peer-to-peer protocol for sharing mail. No notion of clients and servers.
      6. SMTP servers accept mail from any connected host to any local recipient.
      7. SMTP servers may also accept mail in transit to another server. This helps mail transit to machines with poor or intermittent connections.
    2. The PC is invented.
      1. PCs aren't suitable to run SMTP servers.
        1. In the early days, simply not powerful enough.
        2. And people tend to turn them off at night.
        3. Besides, network admins get picky about outside hosts connecting to PCs.
      2. Mailboxes
        1. Stay on the time-sharing system.
        2. Users use POP or IMAP to access their mailboxes.
          1. POP to just download the mail to the PC.
          2. IMAP to manage the mailbox remotely, but read it on the PC.
        3. Mail sent from the PC goes directly to the recipient by SMTP.
        4. People quit logging in to the time-sharing system, and we start calling it a server.
    3. Spam is invented.
      1. The ability for any SMTP server to accept inbound mail from anywhere is a boon to spammers.
      2. Organizations designate certain servers as mail exchangers.
      3. Mail exchangers accept mail by SMTP
        1. From within their own organization.
        2. From designated mail exchangers of other organizations.
        3. From nowhere else (send attempts are refused by the SMTP server).
      4. When you send mail, it goes
        1. SMTP to the organization mail exchanger.
        2. SMTP to the recipient's mail exchanger.
        3. POP or IMAP to the recipient's PC.
      5. ISPs may configure blacklists so their mail exchangers won't accept messages from known spammers.
    4. People start reading mail in web browsers (thereby damaging the fabric of the universe).
      1. A web server now takes the role of the PC, and communicates to the mail exchanger by POP or IMAP.
      2. Software on the web server formats the messages into HTML and transmits to the browser via HTTP.
  6. Email Protocols
    1. Simple Mail Transfer Protocol (SMTP) (RFC 821, updated as RFC 5321). Here one SMTP server is sending an email to another. The sender initiates the connection.
      Rcver: 220 mail.fred.com SMTP ready\r\n
      Sender: HELO sender.somewhere.com\r\n
      Rcver: 250 OK\r\n
      Sender: MAIL FROM:<fsmith@somewhere.com>\r\n
      Rcver: 250 OK\r\n
      Sender: RCPT TO:<jones@fred.com>\r\n
      Rcver: 250 OK\r\n
      Sender: RCPT TO:<william@somewhere.com>\r\n
      Rcver: 550 No such user here\r\n
      Sender: RCPT TO:<sally@somewhere.com>\r\n
      Rcver: 250 OK\r\n
      Sender: DATA\r\n
      Rcver: 354 Start mail input; end with <CRLF>.<CRLF>\r\n
      Sender: Date: Thu, 7 Jan 2016 16:18:01 -0600\r\n
      From: "(Fred Smith)" <fsmith@somewhere.com>\r\n
      To: jones@elsewhere.edu\r\n
      Subject: Backup tapes.\r\n
      \r\n
      Do you still have that backup tape from Tuesday?\r\n
      \r\n
      - Fred\r\n
      .\r\n
      Rcver: 250 OK\r\n
      Sender: QUIT\r\n
      Rcver: 221 Closing\r\n
      1. Message itself is a series of lines, ending with one that is just a period.
      2. Note that the sender and receiver are specified separately, and not just taken from the message headers.
        1. Spammers have long used this for spoofing.
        2. Current practice would be to check a bit better.
      3. No login password in the original protocol. May be required, but not always practical.
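On the receiving side, finding the end of the message after DATA is a matter of watching for the line that is just a period. The sketch below assumes the lines have already been split and stripped of \r\n; `read_message` is a hypothetical name. It also undoes the extra leading period that RFC 5321 has the sender add to any message line beginning with a period, a detail the outline above does not cover.

```python
def read_message(lines):
    """Collect message lines up to, but not including, the lone-period line."""
    message = []
    for line in lines:
        if line == ".":
            return message
        # Per RFC 5321, a sender doubles a leading period so that message
        # text can never be mistaken for the terminator; undo that here.
        if line.startswith("."):
            line = line[1:]
        message.append(line)
    raise ValueError("input ended without a terminating '.' line")


incoming = [
    "Subject: Backup tapes.",
    "",
    "Do you still have that backup tape from Tuesday?",
    ".",
    "QUIT",
]
message = read_message(incoming)
print(message)
```

Note that QUIT is not part of the message: once the lone period is seen, the connection is back in command mode.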
    2. Post Office Protocol (POP) (updated as RFC 1939)
      Server: +OK POP3 server ready\r\n
      Client: USER jones\r\n
      Server: +OK send password\r\n
      Client: PASS the password\r\n
      Server: +OK maildrop locked and ready\r\n
      Client: LIST\r\n
      Server: +OK 2 messages (386 octets)\r\n
      1 186\r\n
      2 200\r\n
      .\r\n
      Client: RETR 1\r\n
      Server: +OK 186 octets\r\n
      Date: Thu, 7 Jan 2016 16:18:01 -0600\r\n
      From: "(Fred Smith)" <fsmith@somewhere.com>\r\n
      To: jones@elsewhere.edu\r\n
      Subject: Backup tapes.\r\n
      \r\n
      Do you still have that backup tape from Tuesday?\r\n
      \r\n
      - Fred\r\n
      .\r\n
      Client: DELE 1\r\n
      Server: +OK message 1 deleted\r\n
      Client: QUIT\r\n
      Server: +OK pop server closing\r\n
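A client has to turn the multi-line LIST reply into something usable. A sketch, assuming the reply has already been split into lines without the \r\n; the name `parse_list_reply` is an illustration only, and any status line not starting with +OK is treated as a failure.

```python
def parse_list_reply(lines):
    """Turn a LIST reply into a {message number: size in octets} mapping."""
    status, *rest = lines
    if not status.startswith("+OK"):
        raise ValueError(f"server reported failure: {status!r}")
    sizes = {}
    for line in rest:
        if line == ".":  # the lone period ends the multi-line reply
            break
        number, size = line.split()
        sizes[int(number)] = int(size)
    return sizes


sizes = parse_list_reply(["+OK 2 messages (386 octets)", "1 186", "2 200", "."])
print(sizes)
```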
    3. Internet Message Access Protocol (IMAP) (updated as RFC 3501). Similar to POP with more commands.
  7. Binary Email Content
    1. Email messages are like HTTP messages: a series of headers, a blank line, then a plain-text body.
      Date: Thu, 7 Jan 2016 16:18:01 -0600
      From: "(Fred Smith)" <fsmith@somewhere.com>
      To: jones@elsewhere.edu
      Subject: Backup tapes.

      Do you still have that backup tape from Tuesday?

      - Fred
    2. Email protocols assume messages are ASCII text.
      1. RFC 821: all communication is in ASCII, and
      2. messages may not have lines over 1000 characters.
    3. Nowadays, we like to send binary attachments: programs, images, word-processor documents.
    4. A binary file has non-ASCII codes, and need not contain a newline every thousand bytes.
    5. Solution: Code the binary data as text.
    6. Multipurpose Internet Mail Extensions (MIME)
      1. Provides notation for dividing an email message into parts.
      2. Provides encodings for non-ASCII data, primarily base 64 for binary.
    7. Base 64.
      1. Each group of three bytes is regrouped into four groups of six bits.
      2. A standard table assigns an ASCII character to each group. 52 letters (both cases), 10 digits, + and /
        1. Binary: 11010101 00000110 11010001
        2. Regroup: 110101 010000 011011 010001
        3. Assign: 1 Q b R
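The regrouping can be checked by hand-coding it and comparing against Python's standard `base64` module. This sketch handles exactly three bytes; the padding used when the input length is not a multiple of three is left out, and `encode_triple` is a hypothetical name.

```python
import base64

# The standard table: 52 letters, 10 digits, then + and /.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_triple(three_bytes: bytes) -> str:
    """Encode exactly three bytes as four base 64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch handles exactly three bytes")
    bits = int.from_bytes(three_bytes, "big")  # one 24-bit number
    # Peel off four 6-bit groups, most significant first.
    groups = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[group] for group in groups)


sample = bytes([0b11010101, 0b00000110, 0b11010001])
encoded = encode_triple(sample)
print(encoded, base64.b64encode(sample).decode("ascii"))
```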
  8. Securing Old Protocols
    1. FTP, HTTP and the mail protocols were designed as plain text.
    2. Two ways to retrofit
      1. All TLS, all the time.
        1. After each network connect, add TLS to the channel.
        2. Need a different port number to distinguish plain from secure.
        3. Client connects, endpoints complete TLS handshake, then operate exactly as the plain version.
      2. TLS on demand.
        1. Connects in the usual (plain) way, usual port.
        2. Negotiate TLS
          1. One end sends a (plain text) request to start TLS.
          2. The other agrees or refuses. Other details may be negotiated.
          3. If the ends agree, they perform the TLS handshake and proceed with a secure connection.
          4. If desired, one endpoint may refuse sensitive operations (such as login) if not secured.
        3. More flexible. Negotiation is usually designed so that new clients can treat old servers as simply refusing.
      3. HTTP does it the first way: HTTP and HTTPS, ports 80 and 443.
      4. The others listed here can be done either way.
  9. Domain Name System (DNS)
    1. The protocol
      1. A client is configured with a DNS server to which it sends requests.
      2. Requests are sent by UDP. Answers return the same way.
      3. Clients may have up to three servers, and send any request to all. Take the first response.
      4. Requests and responses are binary messages.
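To make "binary messages" concrete, here is a sketch that builds a query for an address record. The layout (a 12-byte header, then the name as length-prefixed labels, then type and class) follows RFC 1035 rather than anything stated in this outline, and the function name `build_query` and the ID value are illustrative assumptions.

```python
import struct


def build_query(name: str, query_id: int, recursive: bool = True) -> bytes:
    """Build a DNS query message asking for the A record of name."""
    flags = 0x0100 if recursive else 0x0000  # the "recursion desired" bit
    # Header: ID, flags, then four counts (one question, nothing else).
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # The name is sent as length-prefixed labels, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # type A, class IN
    return header + question


query = build_query("bills.accounting.mega.com", query_id=0x1234)
print(len(query), query)
```

The resulting bytes would be sent as the payload of a single UDP datagram to the configured server.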
    2. Multiple servers
      1. Too much load for one.
      2. Single point of failure.
    3. Hierarchical
      1. Distributed Authority
      2. Distributed Effort
    4. Processing a request
      1. A server is assigned to each domain, plus a root server which knows the servers for all the top-level domains.
      2. Clients must know at least the root server ahead of time.
      3. Look up bills.accounting.mega.com.
        1. Ask the root server the address of bills.accounting.mega.com. It will tell you the address of a com server.
        2. Ask the com server if it knows bills.accounting.mega.com. It will tell you the address of a name server for mega.com
        3. Ask the mega.com server if it knows bills.accounting.mega.com. It might be configured to tell you, or might refer to a server for the accounting department.
        4. If the latter, one more request should do it.
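The chain of referrals can be simulated without any network. Everything in the table below is hypothetical: the server names, the delegation to an accounting server, and the address 10.1.2.3 are made up to mirror the lookup of bills.accounting.mega.com described above.

```python
# Each hypothetical server maps a name suffix to either a referral to the
# next server or a final answer.
SERVERS = {
    "root": {"com": ("refer", "com-server")},
    "com-server": {"mega.com": ("refer", "mega-server")},
    "mega-server": {"accounting.mega.com": ("refer", "accounting-server")},
    "accounting-server": {"bills.accounting.mega.com": ("answer", "10.1.2.3")},
}


def ask(server: str, name: str):
    """Return the server's entry for the longest suffix of name it knows."""
    labels = name.split(".")
    for start in range(len(labels)):
        suffix = ".".join(labels[start:])
        if suffix in SERVERS[server]:
            return SERVERS[server][suffix]
    raise LookupError(f"{server} knows nothing about {name}")


def resolve(name: str):
    """Start at the root and follow referrals until a server answers."""
    server, path = "root", []
    while True:
        path.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, path
        server = value  # a referral: ask the next server the same question


address, path = resolve("bills.accounting.mega.com")
print(address, path)
```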
    5. Recursive and non-recursive requests.
      1. A recursive request asks the server to run through the whole chain and tell the final answer.
      2. A non-recursive request just asks for as much as the server already knows.
      3. Clients may mark their request recursive or not.
      4. A server may ignore the recursive mark if so configured.
      5. If a server tells the next step instead of the answer, that is called a “referral.”
    6. Practical Deployment
      1. Root and Top-Level Domain (TLD) servers organized by IANA and its licensees.
      2. Second-level servers provided by the domain owner or their ISP.
        1. Public, non-recursive service for outside requests.
        2. Recursive, caching service for internal requests.
      3. Local host resolver.
        1. Usually a library to implement DNS client operations.
        2. DNS server to use configured by DHCP, modifiable by hand.
        3. There are public DNS servers if you don't like the one from your ISP or organization.
    7. Efficiencies
      1. Caching and caching support in DNS.
        1. Clients cache final results so they don't have to be run again.
        2. Servers doing recursive lookups cache at each level, and can process new requests with last known part.
        3. DNS responses specify how long they should be cached.
      2. A response from the owning domain's server, rather than from a server that cached it along the way, is “authoritative.”
      3. Many requests are local.
      4. A name may map to several IP addresses.
        1. The DNS server may return them in rotation.
        2. The DNS server may return all of them and the client picks one.
        3. This allows load balancing.
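Caching with a server-specified lifetime can be sketched as a small class. The name `DnsCache`, its methods, and the sample address are assumptions for illustration; the clock is passed in so that expiry can be demonstrated without waiting.

```python
import time


class DnsCache:
    """Remembers answers until the lifetime given in the response runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        """Return the cached address, or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # stale: force a fresh lookup
            return None
        return address


now = [0.0]  # a fake clock we can advance by hand
cache = DnsCache(clock=lambda: now[0])
cache.put("bills.accounting.mega.com", "10.1.2.3", ttl_seconds=60)
fresh = cache.get("bills.accounting.mega.com")
now[0] = 61.0
stale = cache.get("bills.accounting.mega.com")
print(fresh, stale)
```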
    8. Security issues.
      1. Clients are generally assigned a local caching DNS server.
      2. Usually assigned by DHCP, which the machine trusts.
        1. The server you're assigned might be full of lies. (The coffee shop just might have a hacker running its DHCP).
        2. DHCP server identity is not verified. Another customer at the coffee shop might be running the one you get.
      3. Hackers can send unsolicited DNS responses in hopes of being believed. If so, these will be cached.
      4. A system using signed DNS records has been standardized, but is not widely deployed.
    9. Record types.
      1. There are many record types for different information DNS can hold.
      2. Major ones:
        A: Records the IP address for a given name.
        PTR: Map an IP address to a host name.
        MX: Where to send mail addressed to a host.
        CNAME: Name is an alias for another.
      3. There may be multiple A records pointing to the same IP.
      4. A PTR record can denote any name that points to its IP, or one that doesn't.
    10. International Domains.
      1. The DNS standard allows host names to be made only of ASCII characters.
      2. International names are coded with a complicated scheme called Punycode.
        1. www.xn--zrich-kva.com codes www.zürich.com
        2. The xn-- introduces a coded name.
        3. The first part holds the regular ASCII characters.
        4. The last part is an odd sort of base-36 number which gives the position and Unicode value for each non-ASCII character.
        5. The DNS server stores www.xn--zrich-kva.com, and it must be looked up under that form. The client converts and displays www.zürich.com.
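Python's standard library can perform this conversion, which gives a quick way to check the example. The built-in "idna" codec implements the older IDNA 2003 rules, which is enough for this name; the "punycode" codec shows the coded label without the xn-- prefix.

```python
# "\u00fc" is the letter u with an umlaut, written as an escape so the
# example does not depend on the source file's encoding.
name = "www.z\u00fcrich.com"

# What the client sends to DNS: each non-ASCII label becomes xn--...
encoded = name.encode("idna")

# What the client displays after converting back.
decoded = encoded.decode("idna")

# The raw Punycode of the single label, without the xn-- prefix.
label = "z\u00fcrich".encode("punycode")

print(encoded, decoded, label)
```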