Client/Server
  1. Accessing Web Pages
    1. Initial model
      1. Client-server
        1. Client sends a request for a particular URL.
        2. The URL maps to a particular file on the server.
        3. Server responds, with the file contents or error indication.
      2. Pages are stored on the server machine. Server copies them out.
      3. Interaction is clicking on links.
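The URL-to-file mapping in the initial model can be sketched as a single function. This is a minimal illustration, not how any particular server is written: `serve_static` and `doc_root` are hypothetical names, and the check that keeps requests inside the document root is an added assumption.

```python
from pathlib import Path


def serve_static(doc_root, url_path):
    """Map a URL path to a file under doc_root; return (status, body)."""
    root = Path(doc_root).resolve()
    # Strip the leading slash so the path is joined relative to the root.
    target = (root / url_path.lstrip("/")).resolve()
    # Refuse paths such as /../secret that escape the document root.
    if target != root and root not in target.parents:
        return 404, b""
    if not target.is_file():
        return 404, b""
    # The server simply copies the file contents out.
    return 200, target.read_bytes()
```

A request for `/index.html` returns the file's bytes with status 200 if the file exists under the root, and 404 otherwise.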
    2. Common Gateway Interface (CGI). (Not to be confused with the “Computer-Generated Imagery” in your favorite movie.)
      1. The earliest form of dynamic web page: Can respond to input rather than deliver a pre-made file.
      2. The file the URL refers to is a program instead of text.
      3. The program is run, and the output is sent to the browser.
      4. Usually, the output is HTML.
      5. Can use any programming language supported by the server system.
      6. Note that an external program must be started for each request. This adds considerable overhead.
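A CGI program can be sketched in a few lines. The example below assumes the server passes the query string in the `QUERY_STRING` environment variable, as CGI does; the `name` parameter and the function name `cgi_response` are made up for illustration.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def cgi_response(environ):
    """Build the complete CGI output: headers, a blank line, then HTML."""
    # The web server hands the part of the URL after '?' to the program.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]
    # Escape user input before placing it in the page.
    body = "<html><body><p>Hello, {}!</p></body></html>".format(html.escape(name))
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # Whatever the program writes to standard output is sent to the browser.
    sys.stdout.write(cgi_response(os.environ))
```

The server would run this script afresh for every request, which is the per-request overhead noted above.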
    3. Evolution.
      1. When PHP was first created, the PHP interpreter was run as a new process for each transaction.
      2. The Apache server then moved the PHP interpreter inside its own process, so the interpreter need not be started for every request.
      3. The current approach is to run a separate PHP server, to which the web server forwards requests for PHP pages.
      4. The HTTP and PHP servers communicate by inter-process communication (IPC) on the server machine, using a specialized protocol.
      5. Other dynamic support
        1. There are servers similar to the PHP one for Python, which is used by Flask and others.
        2. Programs in other languages speak HTTP themselves over an IPC channel, and the web server forwards requests to them as a standard HTTP proxy.
        3. CGI has largely been replaced by FastCGI, where the dynamic program can be in any language but must interact with the HTTP server through the FastCGI protocol.
  2. HTTP Protocol
    1. Requests and responses each contain a header and a body.
    2. The request header gives the method and the URL path requested.
    3. Response header gives a three-digit code indicating success (200 OK; 404 Not found, etc.)
    4. Response body is the page requested, or may be empty on errors.
    5. Request body is usually empty, but carries the form data when a form is submitted by POST.
    6. Headers are key/value pairs. There are many of these, and implementers are free to invent new ones.
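The header/body layout described above can be made concrete with a short sketch that builds a raw request and parses a raw response. It assumes HTTP/1.1 text framing (CRLF line endings, a blank line between header and body); `build_request` and `parse_response` are illustrative names, not part of any library.

```python
CRLF = "\r\n"


def build_request(method, path, host, headers=None, body=b""):
    """Return the bytes of a simple HTTP/1.1 request."""
    lines = ["{} {} HTTP/1.1".format(method, path), "Host: {}".format(host)]
    for key, value in (headers or {}).items():
        lines.append("{}: {}".format(key, value))
    if body:
        lines.append("Content-Length: {}".format(len(body)))
    # A blank line separates the header from the (possibly empty) body.
    return (CRLF.join(lines) + CRLF + CRLF).encode("ascii") + body


def parse_response(raw):
    """Split a raw response into (status code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split(CRLF)
    parts = lines[0].split(" ", 2)  # e.g. ["HTTP/1.1", "404", "Not Found"]
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in lines[1:]:
        key, _, value = line.partition(":")
        headers[key.strip().lower()] = value.strip()
    return int(parts[1]), reason, headers, body
```

Header names are lower-cased on parsing because HTTP treats them as case-insensitive.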
  3. Transmitting other data.
    1. Other data must be transmitted besides the web page.
    2. Data the user enters into a form must be sent to the server.
      1. It can be sent by the GET method: simply append it to the URL as a query string, as in https://www.google.com/search?q=wombats.
      2. It can be sent by the POST method, in which the browser transmits the user data along with the request for a page that will use it.
      3. As we see later, an HTML form may contain hidden fields which are sent to the server along with user entries.
      4. A dynamically-generated form may contain hidden values, which are then returned to the server.
    3. Cookies are key-value pairs which the server will ask the browser to store, and the browser will return them with subsequent requests.
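The three ways of carrying data described above (query string, POST body, cookies) can be sketched with the Python standard library. The cookie name `theme` and its value are invented for illustration; the search URL is the one from the notes.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlencode, urlsplit

form = {"q": "wombats"}

# GET: the form data is appended to the URL after a '?'.
get_url = "https://www.google.com/search?" + urlencode(form)

# POST: the same encoding travels in the request body instead.
post_body = urlencode(form).encode("ascii")

# The server recovers the values by decoding the query string (or body).
received = parse_qs(urlsplit(get_url).query)

# Cookie: the server asks the browser to store a key-value pair...
outgoing = SimpleCookie()
outgoing["theme"] = "dark"
set_cookie_line = outgoing.output(header="Set-Cookie:")

# ...and the browser sends it back in a Cookie header on later requests.
incoming = SimpleCookie()
incoming.load("theme=dark")
returned_value = incoming["theme"].value
```

Hidden form fields need no separate mechanism: they are encoded alongside the user's entries in the same query string or body.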
  4. Sessions and Login.
    1. HTTP is a stateless protocol: each request stands on its own, and the protocol itself has no notion of logging in.
    2. Original HTTP authentication, rarely used now.
      1. No need to create a page which requests credentials; browser handles the credential request and transmission.
      2. Server does not need to store login state; credentials are sent and checked on each request.
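As an illustration of credentials being sent and checked on every request, here is a sketch that assumes the Basic scheme, in which the browser sends `user:password` base64-encoded in an `Authorization` header. The notes do not say which scheme is meant, and the helper names and the `users` table are hypothetical.

```python
import base64


def basic_auth_header(username, password):
    """Build the header a browser would send with every request."""
    pair = "{}:{}".format(username, password).encode("utf-8")
    return "Authorization: Basic " + base64.b64encode(pair).decode("ascii")


def check_basic_auth(header_value, users):
    """Check the value of an Authorization header against a user table."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(token).decode("utf-8")
    except ValueError:  # malformed base64 or text that is not UTF-8
        return False
    username, _, password = decoded.partition(":")
    # Nothing is remembered between requests; the check is repeated each time.
    # A real server would compare against stored password hashes.
    return users.get(username) == password
```

Base64 is an encoding, not encryption, so this scheme is only reasonable over an encrypted connection.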
    3. Contemporary practice is to implement with a web page and server code. No special part of HTTP needed.
      1. The user logs in using an ordinary form, and the server creates a session record with an identifier, which it returns to the browser (typically in a cookie).
      2. The identifier is usually a large random number, so it is hard to guess.
      3. Knowing the random number is treated as having logged in.
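The form-based login scheme can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: `USERS`, `SESSIONS`, `log_in`, and `current_user` are hypothetical names, and a real server would store password hashes and expire old sessions.

```python
import secrets

# Hypothetical user table; plaintext passwords are for illustration only.
USERS = {"alice": "correct horse"}

# Session records, keyed by the random identifier given to the browser.
SESSIONS = {}


def log_in(username, password):
    """Check the submitted form values; return a new session id or None."""
    expected = USERS.get(username)
    if expected is None:
        return None
    # Constant-time comparison avoids leaking information through timing.
    if not secrets.compare_digest(expected.encode(), password.encode()):
        return None
    # A long random identifier is hard to guess.
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = {"user": username}
    return session_id


def current_user(session_id):
    """Presenting a known identifier is treated as being logged in."""
    record = SESSIONS.get(session_id)
    return record["user"] if record else None
```

The server would typically send the identifier back as a cookie, so the browser presents it automatically with each later request.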