In this assignment, you will capture and analyze a network trace between your browser and a Web server. Web servers and proxies are among the most widely deployed server applications, powering the websites that we visit every day. The protocol behind the Web, the HyperText Transfer Protocol (HTTP), started out as a simple plain-text protocol built on top of TCP. The most widely used plain-text version of the protocol is HTTP/1.1.
As the Web evolved, the number of protocol extensions kept increasing, and the size of websites, or “Web applications”, grew together with the number of users. HTTP/2 came out many years later, with the goals of improving HTTP performance and consolidating these extensions into one standard. Three important improvements introduced in HTTP/2 are:
Multiplexing. HTTP/2 is a multiplexed protocol: multiple requests can be in flight at the same time over a single connection, and their responses can be interleaved rather than delivered strictly one after another.
Header compression. During a web browsing session, the headers your browser will send to the server will remain more or less the same, and the headers the server will send to your browser will also not change by much. To spare bandwidth, headers are compressed and are sent less often, as you will see in the traces.
The most apparent one: HTTP/2 is a binary protocol. This makes it much more difficult to manually read and write HTTP messages.
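To make the difference between a plain-text and a binary protocol concrete, the sketch below prints a minimal HTTP/1.1 request as text and then decodes the fixed 9-byte header that starts every HTTP/2 frame (a 24-bit payload length, an 8-bit type, 8-bit flags, and a 31-bit stream identifier). The host name example.org, the sample bytes, and the helper name parse_frame_header are illustrative assumptions and are not taken from the assignment server.

```python
# Subset of the frame type codes defined by the HTTP/2 specification.
FRAME_TYPES = {
    0x0: "DATA",
    0x1: "HEADERS",
    0x3: "RST_STREAM",
    0x4: "SETTINGS",
    0x6: "PING",
    0x7: "GOAWAY",
    0x8: "WINDOW_UPDATE",
}


def parse_frame_header(raw: bytes) -> dict:
    """Decode the 9-byte header that precedes every HTTP/2 frame payload."""
    if len(raw) < 9:
        raise ValueError("an HTTP/2 frame header is 9 bytes long")
    length = int.from_bytes(raw[0:3], "big")        # 24-bit payload length
    frame_type = raw[3]                             # 8-bit frame type
    flags = raw[4]                                  # 8-bit flags
    # The top bit of the last 4 bytes is reserved, so mask it off.
    stream_id = int.from_bytes(raw[5:9], "big") & 0x7FFFFFFF
    return {
        "length": length,
        "type": FRAME_TYPES.get(frame_type, f"UNKNOWN(0x{frame_type:02x})"),
        "flags": flags,
        "stream_id": stream_id,
    }


# An HTTP/1.1 request is readable as-is (hypothetical host name).
http1_request = "GET / HTTP/1.1\r\nHost: example.org\r\n\r\n"
print(http1_request)

# Hypothetical HTTP/2 frame header: a 13-byte HEADERS frame on stream 1
# with the END_STREAM (0x1) and END_HEADERS (0x4) flags set.
sample = bytes.fromhex("00000d010500000001")
print(parse_frame_header(sample))
```

The stream identifier is what makes multiplexing possible: frames that belong to different requests carry different stream identifiers and can be interleaved on the same connection.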
Although a newer version of HTTP, called HTTP/3, exists, we focus in this assignment on HTTP/2. The most important change from HTTP/2 is that HTTP/3 uses UDP, not TCP, as its transport-layer protocol, and moves the responsibility for managing connections and reliable delivery from the transport layer to the application layer. For further reading on how HTTP/3 implements these changes, see QUIC. To obtain an overview of the HTTP protocol versions and their implementations, we recommend reading the articles available at https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP.
Assignment Description
You are going to interact with a Web server that is both HTTP/1.1- and HTTP/2-enabled, and see how HTTP requests and responses look in practice. First, you are going to take a look at HTTP/1.1, and then at HTTP/2.
Web browsers only support HTTP/2 when it is served over Transport Layer Security (TLS) (you will see the URL starts with https://). Although Wireshark captures all packets between your system and the Web server, TLS prevents us from reading the contents of the packets because it is designed to resist man-in-the-middle (MITM) attacks. This means we will need to do a bit of setup to enable Wireshark to decrypt the data, i.e., the HTTP/2 messages. This is an example of protocol encapsulation: HTTP/2 messages are encapsulated in TLS!
Setup
Download the support script from Canvas: mosaic_support.cmd for Windows, and mosaic_support.sh for Linux and macOS. By default, the scripts try to launch Google Chrome. If you prefer a different browser, please modify the script accordingly.
(For macOS and Linux only) Make the script executable by running chmod +x mosaic_support.sh at the command line.
Close all instances of Chrome (or whichever browser you modified the script to run). On Windows, end the task from Task Manager; on Linux and macOS, run pkill chrome.
Run the script. A browser window should appear.
Open Wireshark, and select your active network interface (you can judge which one it is by the activity graph next to its name). Then go to Edit (in the menu bar) → Preferences → Protocols → TLS. Press the “Browse” button next to “(Pre)-Master-Secret log filename”, and select the file named keylogfile.txt in your home directory.
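If Wireshark does not decrypt anything, a quick sanity check is to confirm that the key log file actually contains secrets. The sketch below assumes that the support script makes the browser write an NSS-style key log (one entry per line: a label, the client random, and a secret, with # starting a comment) to keylogfile.txt in your home directory; the helper name summarize_keylog is hypothetical.

```python
from collections import Counter
from pathlib import Path


def summarize_keylog(text: str) -> Counter:
    """Count key log entries per label, skipping comments and blank lines."""
    labels = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Assumed layout: <label> <client random (hex)> <secret (hex)>
        if len(fields) == 3:
            labels[fields[0]] += 1
    return labels


# Assumed location, matching the file selected in the Wireshark preferences.
keylog = Path.home() / "keylogfile.txt"
if keylog.exists():
    print(summarize_keylog(keylog.read_text(errors="replace")))
else:
    print(f"{keylog} not found; run the support script and browse first")
```

An empty result usually means the browser was not started through the support script, or that no TLS connection has been made yet.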
Making sense of packet traces in Wireshark without using filters can be very difficult. Type http in the filter box to show only HTTP/1.1 packets; for HTTP/2, use http2. There are many other filters, for example for filtering based on the host and/or port. Try them out and discover!
Requirements
Consider server_url to mean the URL of the assignment server, which can be found on Canvas.
Set up according to the instructions above and answer the questions below. When you are confident you have correctly answered all questions, discuss your trace and answers with a TA to get your assignment signed off. You do not need to redo the trace when discussing your answers with the TA.
While performing the steps below, you will get a certificate error from your web browser, warning you that the connection is not safe. For a real website, this would be a problem, but this is completely fine for our assignment. You can ignore the error by pressing “Advanced” and then clicking the link at the bottom (“Proceed to server name (unsafe)”).
Navigate to http://server_url:8080. Click on the links, and familiarize yourself with how the website looks. Look at the Wireshark trace, and identify the packets that go from the client to the server, and the ones that go from the server to the client.
Click on “Click for HTTP request information”. You will see the HTTP headers that the server received from your browser. Now look in Wireshark. Are the HTTP headers that the browser sends to the server the same as the ones on the screen? If there are differences, what are they?
What do the headers mean?
Reload the page, first by pressing F5, then by pressing Ctrl+F5. Are the headers different if you hold Ctrl when refreshing? Why? What do the changed headers mean?
Navigate to http://server_url:8080/gophertiles. You will see a picture made of smaller tiles loading. Can you find the request and response for each of the tiles in Wireshark? How does the server know which tile to serve? You may observe that your browser uses more than one TCP connection to load the pictures. Why is this happening? How can you find in Wireshark how many TCP connections are used by your browser, and which connection is used for every tile? How many connections are used?
(HTTP/2, difficult) Navigate to https://server_url:4430. What does this request look like in Wireshark? Are the headers and the page’s content separated? How does the decrypted response differ from the HTTP/1.1 version?
(HTTP/2, difficult) Navigate to https://server_url:4430/gophertiles. Once again, an image made of tiles is shown, but it loads much faster. Try selecting higher latencies from the top-left corner: whichever value you select, the HTTP/2-enabled page still loads much faster. Why is this the case? What do the client’s requests for the tiles look like, and what is the difference compared to the HTTP/1.1 version? What do the server’s responses look like? How many TCP connections does the browser use to load the tiles in the HTTP/2 version, and why?
(HTTP/2 and TLS, difficult) As HTTP/1.1 and HTTP/2 look completely different on the wire, there needs to be a way for the server and client to communicate which version to use, in a backward-compatible way. This is done through Application-Layer Protocol Negotiation (ALPN), encoded as a TLS extension. Identify the negotiation in the Wireshark trace. To force HTTP/1.1, you can use curl:
curl --insecure -v --http1.1 https://server_url:4430/
Similarly, for HTTP/2:
curl --insecure -v --http2 https://server_url:4430/
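As a companion to the curl commands, the sketch below illustrates ALPN from the client side. The first helper encodes the list of offered protocols the way it is carried in the TLS extension (a 2-byte list length, then each protocol name prefixed by a 1-byte length), which may help you recognize the bytes in the ClientHello. The second helper uses Python's standard ssl module to offer protocols to a server and report which one was selected. The helper names are hypothetical, and certificate verification is disabled only because the assignment server is assumed to present an untrusted certificate, as with curl --insecure.

```python
import socket
import ssl


def encode_alpn(protocols: list[str]) -> bytes:
    """Encode protocol names as an ALPN protocol name list."""
    body = b""
    for name in protocols:
        raw = name.encode("ascii")
        body += len(raw).to_bytes(1, "big") + raw   # 1-byte length prefix
    return len(body).to_bytes(2, "big") + body      # 2-byte list length


def decode_alpn(data: bytes) -> list[str]:
    """Decode an ALPN protocol name list back into protocol names."""
    total = int.from_bytes(data[0:2], "big")
    names, pos = [], 2
    while pos < 2 + total:
        size = data[pos]
        names.append(data[pos + 1:pos + 1 + size].decode("ascii"))
        pos += 1 + size
    return names


def negotiated_protocol(host: str, port: int, offered: list[str]):
    """Connect with TLS, offer the given protocols, return the server's choice."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False          # assumption: untrusted test certificate
    ctx.verify_mode = ssl.CERT_NONE     # comparable to curl --insecure
    ctx.set_alpn_protocols(offered)
    with socket.create_connection((host, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()


# "h2" and "http/1.1" are the ALPN identifiers for HTTP/2 and HTTP/1.1.
print(encode_alpn(["h2", "http/1.1"]).hex())
```

With the host part of server_url filled in, calling negotiated_protocol(host, 4430, ["h2", "http/1.1"]) and then negotiated_protocol(host, 4430, ["http/1.1"]) should mirror the two curl invocations, and each handshake can be inspected in Wireshark.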
Evaluation
A teaching assistant (TA) will approve your assignment if you correctly complete the requirements above. Upload the questions and your answers, in a text or PDF file, to Canvas. Please clearly state your answers to each question. The TA may ask additional questions to evaluate your understanding. Make sure you are ready to show your findings by having Wireshark and your network trace open.
Further Reading