HEAD of the Line
Use the Cleansocks interface to create a simple web client that takes a
single URL on the command line. It should report the server's IP address,
the response code, the server type, the value of the Location header
(if present), and any cookies set by the server. If the server forwards
the request, the client should follow. You might want to start with the
posted URL downloader example.
The output looks like this:
bennet@desktop$ headinfo http://sandbox.mc.edu/~bennet/cs423v2/syl.html
Server at 167.160.210.32 responds 200 OK
Server type: type Apache/2.4.62 (Fedora Linux) OpenSSL/3.2.2 mod_wsgi/5.0.0 Python/3.12.
No cookies were set.
bennet@desktop$ headinfo http://www.google.com
Server at 142.251.116.99 responds 200 OK
Server type: type gws.
Cookies:
AEC: AZ6Zc-XMPARk9m2Oi9MFVvy8JdNHL7bJgv56vXM7o47_7mboryybzLyxCA
NID: 521=Al9zIrHmK69jvULF7gesDbvu50_9OCtYBUMUhU2KEFS73TohyyRluGPPy9yHr0ZaZ17QbwIknPZvtE5ksKvugBxY3zdT3ZYY7GHmKMhlFP7vxFksR7vS940DDyvCJkAnmY_onzA9bIpGDKmeyhaGR4jRujWd-OMJKasqi4b8gdckV85xXu8tglsU2AY0-ITiv9m_GDcjpibZww
bennet@desktop$ headinfo http://www.parliament.uk/visiting/visiting-and-tours/tours-of-parliament/guided-tours-of-parliament/
Server at 104.17.177.119 responds 403 Forbidden
Server type: type cloudflare.
Cookies:
__cf_bm: og7HHo4mjKnxGU5bo4Go3DVI_6YFpHYoDZqUFSC1HV4-1737864139-1.0.1.1-yHKSJsfBs3l5Kx3JMnwBY_G8xxEr5suRkQlznsVrITOU459l4VtFrSL40zvxqQ5__vxiXloK_NH54FFl8Cs59w
Your application must extract the host name and path from the URL and
send an HTTP HEAD request to the indicated machine. Parse the response
line and headers to get the information you need to report.
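As a rough illustration of the request text, here is a minimal sketch that builds a HEAD request as a string. The function name make_head_request is made up for this example, and the Connection: close header is an assumption added so the server closes the connection after replying. Sending the string through cleansocks is not shown.

```cpp
#include <string>

// Build the text of a minimal HTTP/1.1 HEAD request.
// `host` and `path` are assumed to have been extracted from the URL already.
// (make_head_request is a hypothetical helper, not part of cleansocks.)
std::string make_head_request(const std::string &host, const std::string &path)
{
    // An empty path means the root of the site.
    std::string target = path.empty() ? "/" : path;

    // Each line ends with CR LF, and a blank line ends the request.
    return "HEAD " + target + " HTTP/1.1\r\n"
           "Host: " + host + "\r\n"
           "Connection: close\r\n"
           "\r\n";
}
```

The Host header is required in HTTP/1.1, which is why the host name has to be kept after it is used for the address lookup.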
The relevant header names are Server, Location, and Set-Cookie. The
Server header will appear once or not at all. If it does not appear,
just say that the server type is “Unknown”.
The Location header may appear once, but usually does not appear
at all. If it is not present, simply don't mention it in your output.
A server may set no cookies, or it may set more than one, so the
Set-Cookie header may appear any number of times. If it does not
appear, state that no cookies were set; otherwise, list them all.
You should list the name and value for each one (see below).
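One way to organize the parsing is sketched below. It assumes the whole response has already been read into a single string; the HeadInfo struct and the parse_response function are names invented for this example. Header names are compared without regard to case, since HTTP header names are case-insensitive.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the fields the assignment asks you to report.
struct HeadInfo {
    int code = 0;
    std::string message;
    std::string server = "Unknown";    // default when no Server header is sent
    std::string location;              // empty if no Location header is sent
    std::vector<std::string> cookies;  // raw Set-Cookie values, one per header
};

static std::string lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

static std::string trim(const std::string &s) {
    const char *ws = " \t\r\n";
    auto b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    auto e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Returns false if the text does not begin with an HTTP status line.
bool parse_response(const std::string &text, HeadInfo &info) {
    std::istringstream in(text);
    std::string line;

    // Status line: HTTP-version, numeric code, then the message.
    if (!std::getline(in, line)) return false;
    std::istringstream status(line);
    std::string version;
    if (!(status >> version >> info.code)) return false;
    if (version.compare(0, 5, "HTTP/") != 0) return false;
    std::getline(status, info.message);
    info.message = trim(info.message);

    // Header lines, up to the blank line that ends the header block.
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty()) break;
        auto colon = line.find(':');
        if (colon == std::string::npos) continue;  // skip malformed lines
        std::string name = lower(trim(line.substr(0, colon)));
        std::string value = trim(line.substr(colon + 1));
        if (name == "server") info.server = value;
        else if (name == "location") info.location = value;
        else if (name == "set-cookie") info.cookies.push_back(value);
    }
    return true;
}
```

Keeping the raw Set-Cookie values in a vector leaves the name and value splitting as a separate step.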
If some networking error prevents the reception of any response from the
server, or the server's response cannot be parsed as an HTTP response, print
an appropriate error message. Otherwise, give the numeric response code and
message from the response (even if it is an error), then print the type of
server, and list any cookies set by the server.
Network errors are thrown as exceptions by cleansocks, so you will need
to catch them and print the value returned by the exception's .what() method.
bennet@desktop$ headinfo http://www.forgetit.calm
Error: [IPaddress::lookup(www.forgetit.calm)] Name or service not known
You need only consider very simple URLs. Accept only http or https
URLs, and don't look for port numbers or passwords. If the URL is not
simple or can't be parsed, just report an error and exit.
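A minimal sketch of this kind of URL splitting is shown here. The SimpleUrl struct and parse_simple_url function are hypothetical names. Rejecting any host that contains a colon, at sign, question mark, or hash mark is one possible reading of “not simple,” not a rule stated by the assignment.

```cpp
#include <string>

// Hypothetical result type for the pieces of a simple URL.
struct SimpleUrl {
    std::string scheme;  // "http" or "https"
    std::string host;
    std::string path;    // always begins with '/'
};

// Returns false if the URL is not one of the simple forms accepted.
bool parse_simple_url(const std::string &url, SimpleUrl &out)
{
    const std::string sep = "://";
    auto pos = url.find(sep);
    if (pos == std::string::npos) return false;

    std::string scheme = url.substr(0, pos);
    if (scheme != "http" && scheme != "https") return false;

    // Everything up to the first slash is the host; the rest is the path.
    std::string rest = url.substr(pos + sep.size());
    auto slash = rest.find('/');
    std::string host = rest.substr(0, slash);
    std::string path = (slash == std::string::npos) ? "/" : rest.substr(slash);

    // Reject empty hosts and anything with a port, password, or query
    // attached directly to the host (an assumption about "simple").
    if (host.empty() || host.find_first_of(":@?#") != std::string::npos)
        return false;

    out = {scheme, host, path};
    return true;
}
```

A URL with no path at all, such as http://www.google.com, gets the path / so that the request line is still valid.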
The value of the Set-Cookie header is a string which gives the name and
value of the cookie. The form is something like this:
Set-Cookie: name=value; other stuff
where the ; other stuff may or may not be present. (The standard calls
this part “unparsed attributes,” even though the client must parse it.
We'll discard it.)
If you find a ; in the cookie string, discard the (first) ; and everything after
it. If you don't find an =, then the cookie is invalid, and you should
discard the whole thing. Otherwise, the name of the cookie is the portion
of the string up to the first =, and the value is the portion after the
first =. “First” is important here, because the value may
contain additional equal signs. (Google seems to love these.) The
standard says that either the name or value is allowed to be empty, but
I don't know that I've seen this actually happen.
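The rules above can be sketched as a small function. The name parse_cookie is invented for this example, and the input is assumed to be the header value only, with the header name and colon already removed. No whitespace trimming is done here.

```cpp
#include <string>

// Split a Set-Cookie header value into the cookie's name and value.
// Returns false if the cookie is invalid (no '=' before the first ';').
bool parse_cookie(const std::string &header_value,
                  std::string &name, std::string &value)
{
    // Discard the first ';' and everything after it. If there is no ';',
    // find() returns npos and substr() keeps the whole string.
    std::string pair = header_value.substr(0, header_value.find(';'));

    // Split at the FIRST '=' only; the value may contain more of them.
    auto eq = pair.find('=');
    if (eq == std::string::npos) return false;

    name = pair.substr(0, eq);
    value = pair.substr(eq + 1);
    return true;
}
```

Cutting at the semicolon before looking for the equal sign matters: otherwise an equal sign inside the discarded attributes could make an invalid cookie look valid.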
If the response code is in the 300s, and the Location header is set,
the server is directing the client to another location. In this case, repeat
the operation using the Location URL. Keep following forwards, but
limit yourself to five fetches. A redirect loop is a server error, but
it can happen. A forwarded request looks like this:
bennet@desktop$ headinfo http://sandbox.mc.edu/~bennet
Server at 167.160.210.32 responds 301 Moved Permanently
Server type: type Apache/2.4.62 (Fedora Linux) OpenSSL/3.2.2 mod_wsgi/5.0.0 Python/3.12.
Location: http://sandbox.mc.edu/~bennet/.
No cookies were set.
Following to http://sandbox.mc.edu/~bennet/
Server at 167.160.210.32 responds 200 OK
Server type: type Apache/2.4.62 (Fedora Linux) OpenSSL/3.2.2 mod_wsgi/5.0.0 Python/3.12.
No cookies were set.
bennet@desktop$ headinfo http://news.google.com
Server at 142.250.113.100 responds 301 Moved Permanently
Server type: type ESF.
Location: https://news.google.com/.
Cookies:
NID: 521=AQnANbf6C58EMIaGN_SyJubHPvWiVMsJ-iyuJ70hiEuYv9TflR9k5KYu6HvELqY_2PMyiidRWCBtM0dG-2mNKPoLIezQqRW6aN9-P8g66YXQV8HSNbN4yTcaOIDEyyoNDoXa8_-t90e8rhcbSutOM-RBlBbhY_iqxXpWOBmK7pQ0Pt48j-3cYLTzeXi7oQsG
Following to https://news.google.com/
Server at 142.250.113.138 responds 302 Found
Server type: type ESF.
Location: https://news.google.com/home?hl=en-US&gl=US&ceid=US:en.
Cookies:
GN_PREF: W251bGwsIkNBSVNDd2laNnRhOEJoRFFfb2xFIl0_
NID: 521=aCu2vM01ZCxnnsHhI_ySzYyswTEoBJFPMqZ8U3t-fZuE-OxHt19jjXQbQTTSQkbxVdJyU-Zy2igQRuguPEjBk5Am0PZhcO7T8u_gRcGQ0Bs8A9Pt956MbelRqZzsVlSYVOLs0ZScZdnCbyzdveIfI0xg9IYli2und2ki2qUfmBrMIotdMjjVbG-RLMIA7ppJEg
Following to https://news.google.com/home?hl=en-US&gl=US&ceid=US:en
Server at 142.250.113.101 responds 200 OK
Server type: type ESF.
Cookies:
GN_PREF: W251bGwsIkNBSVNEQWlaNnRhOEJoQ2d5N1dlQVEiXQ__
NID: 521=00IPu9HXOqV3eO4fk8oPlv7wavE1FVW8jXRs4nUKF78H5Jipstgf4Tlrzof6_Hf9yn-DftyMQbmuZ8Ov19u9WECYAIMz_NrTNzeyXx3OMyV8IMJUsa9HTj4tapjcMpBtWcgTFef5oDbGGG-JdXuDQ2OtEHJDo_VCN2cGH26xxVq0gNNnsB5HLcP_61HIwnHg
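The forwarding logic might be structured as in the sketch below. The Reply struct, the follow function, and the fetch callback are all hypothetical; fetch stands in for whatever code sends one HEAD request and parses the reply. The sketch counts every request, including the first, toward the limit of five, which is one reading of the rule above. It also assumes the Location value is a complete URL.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical summary of one parsed response.
struct Reply {
    int code = 0;
    std::string location;  // empty if no Location header was sent
};

// Follow redirects starting at `url`, making at most `max_fetches` requests.
// Returns the URLs actually fetched, in order.
std::vector<std::string> follow(const std::string &url,
                                const std::function<Reply(const std::string &)> &fetch,
                                int max_fetches = 5)
{
    std::vector<std::string> visited;
    std::string current = url;
    for (int i = 0; i < max_fetches; ++i) {
        visited.push_back(current);
        Reply r = fetch(current);

        // Only a 3xx code together with a Location header is a forward.
        bool redirect = r.code >= 300 && r.code < 400 && !r.location.empty();
        if (!redirect) break;
        current = r.location;
    }
    return visited;
}
```

Because the loop is bounded by a counter, a server that redirects in a circle simply uses up the five fetches, and no separate loop detection is needed.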
Submission
When your program is working, nicely commented and properly indented,
submit it using the form here.