The anatomy of an HTTP server
A Computerphile video made me do it. Computerphile is a YouTube channel about computer science, featuring professors explaining technical concepts in an informal setting. The drabness of the backdrop, often an empty classroom, a lab or an office, is counterpoised by the enthusiasm of the presenter, a jolliness in his demeanor. It’s quite British, indeed.
The video in question is titled Coding a Web Server in 25 Lines. It does what it says, or it says what it does. One might wonder what the point is, though. The world is awash with widely available, highly performant web servers. Education, of course! It’s a fun little exercise that upon completion yields a somewhat functional web server. Direct a web browser at it, and voilà! On the Internet, nobody knows you’re a dog.
That’s all good and well, but if Computerphile just did it, why do it again? Firstly, writing minimalist web servers is a genre unto itself. No, seriously. But there is something else, too. As a Lisper, I couldn’t help but notice the workflow: an edit/compile/run affair. How I wished it were a read/eval/print/loop instead! Interactive development environments à la Lisp, or Smalltalk for that matter, are far from commonplace, and there is plenty of room to make a statement. Sure, this could be an expression of smugness, delusion or loneliness. And I never want to be lonely again.
The one thing that Computerphile teaches you without fail is breaking down problems into smaller chunks. Arguably, the most important lesson of them all. Therefore, the first version of the HTTP server presented to the audience is a diminutive one. It prints out the request and does nothing about it. In Rust, this gives the following:
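use std::io::BufRead;

fn main() {
    let listener = std::net::TcpListener::bind("127.0.0.1:9999").unwrap();
    for mut stream in listener.incoming().flatten() {
        let mut rdr = std::io::BufReader::new(&mut stream);
        let mut l = String::new();
        loop {
            // read_line appends to the buffer, so reset it before each read.
            l.clear();
            // Zero bytes read means the client closed the connection.
            if rdr.read_line(&mut l).unwrap() == 0 {
                break;
            }
            print!("{l}");
        }
    }
}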
One might object that with netcat, this is a one-liner:
nc -vvv -l -s 127.0.0.1 -p 9999
Sure. But it reveals little of what is going on. The Rust snippet shows that TCP is the transport layer. The equivalent version in Clojure could look like the snippet below, where ServerSocket is the abstraction for TCP-based communication.
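(require '[clojure.java.io :as io]
         '[clojure.string :as str])
;; log/info assumes a logging namespace aliased as log (for example clojure.tools.logging)
(import java.net.ServerSocket)

(with-open [server (ServerSocket. 8085)
            conn   (.accept server)
            r      (io/reader (.getInputStream conn))]
  (loop [line (.readLine r)]
    (when (seq (str/trim line))
      (log/info line)
      (recur (.readLine r)))))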
Note the binding form following the with-open macro. The objects created within belong to classes implementing Java’s AutoCloseable interface, therefore lending themselves to the try-with-resources idiom. Whereas in Java it is a statement introduced by the language designers with Java 7, in Clojure it is a macro that anyone could have written had it been absent from the language. So there, we haven’t missed the opportunity to praise the virtues of Lisp’s homoiconicity.
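To make the claim concrete, here is a minimal sketch of such a macro. The name with-resource is hypothetical, and unlike the real with-open it accepts a single binding only; it merely shows that the try/finally expansion needs nothing beyond ordinary macro machinery.

```clojure
(import '(java.io BufferedReader StringReader))

;; A deliberately simplified, hypothetical cousin of with-open:
;; one binding, closed in a finally block whatever happens in the body.
(defmacro with-resource [[sym init] & body]
  `(let [~sym ~init]
     (try
       ~@body
       (finally
         (.close ~sym)))))

;; Read the first line of a canned request; the reader is closed afterwards.
(def first-line
  (with-resource [r (BufferedReader.
                     (StringReader. "GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"))]
    (.readLine r)))
```

Evaluating first-line at the REPL should give back the request line, and calling macroexpand-1 on a with-resource form shows the try/finally it turns into.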
The server above is well and good, but it makes the browser hang. The browser is waiting on the socket for a response, which isn’t coming. So here is a minimal modification that sends a response. We’ll just say OK to everyone.
(with-open [server (ServerSocket. 8085)
            conn   (.accept server)
            r      (io/reader (.getInputStream conn))
            w      (io/writer (.getOutputStream conn))]
  (loop [line (.readLine r)]
    (when (seq (str/trim line))
      (log/info line)
      (recur (.readLine r))))
  (.write w "HTTP/1.1 200 OK\r\n\r\n"))
The problem now is that we’re dealing with one request only. We ought to keep listening on the socket, processing multiple requests as they come.
(let [server (ServerSocket. 8085)]
  (loop [conn (.accept server)]
    (with-open [r (io/reader (.getInputStream conn))
                w (io/writer (.getOutputStream conn))]
      (loop [line (.readLine r)]
        (when (seq (str/trim line))
          (recur (.readLine r))))
      (.write w "HTTP/1.1 200 OK\r\n\r\n"))
    (recur (.accept server))))
Great. Now a word about our workflow. Executed in a REPL, the snippet above will block the top-level loop. In other words, we don’t get our prompt back. This is a problem, because this would force us to restart the REPL. We never want to restart our REPL.
Now, in an interruptible REPL such as Cider, we can press C-c C-b or Alt+x cider-interrupt to get back to the prompt. But if you try to rerun the code, you will get a BindException. The server wasn’t closed properly and it is still bound to the address, port 8085 on the loopback interface. In brief, we need to keep a handle on the server so that we can gracefully close it.
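The failure is easy to reproduce in isolation. The sketch below is an illustration under assumptions of its own: it asks the operating system for a free port (port 0) instead of using 8085, and the var name outcome is hypothetical.

```clojure
(import '(java.net ServerSocket BindException))

(def outcome
  (with-open [first-server (ServerSocket. 0)]       ; 0 lets the OS pick a free port
    (let [port (.getLocalPort first-server)]
      (try
        ;; Second bind on the same port while the first server is still open.
        (with-open [_second (ServerSocket. port)]
          :bound)
        (catch BindException _
          :address-in-use)))))
```

On a typical setup outcome should be :address-in-use. Once the first server is closed, the same port can be bound again, which is exactly why we want to hold on to the server object.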
The lesson here is that at a REPL, thought must be given to state, its initialization and its tear down. There are many solutions available to us, and some of them are reified in libraries, but here we will opt for a straightforward solution, yet no less effective. Instead of running that code at the top-level, we will encapsulate it in a function. And that function will receive the server as argument.
(defn tcp-transport [server]
  (while (not (.isClosed server))
    (loop [conn (.accept server)]
      (with-open [r (io/reader (.getInputStream conn))
                  w (io/writer (.getOutputStream conn))]
        (loop [line (.readLine r)]
          (when (seq (str/trim line))
            (recur (.readLine r))))
        (.write w "HTTP/1.1 200 OK\r\n\r\n"))
      (recur (.accept server)))))
Now we can write:
(def server (ServerSocket. 8085))
(tcp-transport server)
In order to get back to a clean state:
(.close server)
Close, but not there yet. We need to run the server in its own thread so that it doesn’t interfere with our top-level at the REPL. Handily, wrapping the code in a future will achieve the desired result.
(defn tcp-transport [server]
  (future
    (while (not (.isClosed server))
      (loop [conn (.accept server)]
        (with-open [r (io/reader (.getInputStream conn))
                    w (io/writer (.getOutputStream conn))]
          (loop [line (.readLine r)]
            (when (seq (str/trim line))
              (log/info line)
              (recur (.readLine r))))
          (.write w "HTTP/1.1 200 OK\r\n\r\n"))
        (recur (.accept server))))))
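What future buys us can be seen without any sockets. In this small sketch the sleeping task stands in for the blocking accept loop; the names task and result are hypothetical.

```clojure
;; The body runs on another thread, so the calling thread (our REPL)
;; gets its prompt back immediately.
(def task
  (future
    (Thread/sleep 200)
    :done))

;; realized? tells us, without blocking, whether the body has finished.
(realized? task)

;; deref blocks until the value is ready; the timeout form gives up
;; after 2000 ms and returns :timed-out instead.
(def result (deref task 2000 :timed-out))
```

The same deref is what later surfaces any exception thrown inside the future, since an exception in the body is rethrown when the future is dereferenced.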
Things are shaping up nicely. There is a tiny annoyance, though. When we call the close method on the server instance, the pending call to accept, the one that would have produced the next conn, is cut short, and that raises an exception which becomes apparent only if we dereference the future.
Execution error (SocketException) at sun.nio.ch.NioSocketImpl/endAccept (NioSocketImpl.java:694).
Socket closed
It is of no consequence, since the thread has ended, but we can do the following gentle adjustment which catches the exception.
(import java.net.SocketException)

(defn tcp-transport [server]
  (future
    (try
      (while (not (.isClosed server))
        (loop [conn (.accept server)]
          (with-open [r (io/reader (.getInputStream conn))
                      w (io/writer (.getOutputStream conn))]
            (loop [line (.readLine r)]
              (when (seq (str/trim line))
                (recur (.readLine r))))
            (.write w "HTTP/1.1 200 OK\r\n\r\n"))
          (recur (.accept server))))
      ;; Closing the server makes the blocked accept throw; report it as a value.
      (catch SocketException e
        (.getMessage e)))))
Now when you dereference the future, it will report:
"Socket closed"
If you have watched the video, you know that the web server, after being coaxed into saying OK to all requests, gets upgraded so that it can actually serve content. That second version is able to deliver the presenter’s personal static website to the browser. We are going to do the same move now.
First, we will teach our web server to parse the request and extract the URI, the resource being requested. The first line of an HTTP request contains the URI.
GET /index.html HTTP/1.1
If we split the string at the space boundary, it is going to be the second element. Hence:
(loop [line (.readLine r)]
  (when (seq (str/trim line))
    (when (str/starts-with? line "GET")
      (log/info line)
      (reset! resource (second (str/split line #" "))))
    (recur (.readLine r))))
Where resource is an atom used for the assignment. However, we are functional programmers and we’d like to avoid the assignment. We’ll see how in a moment. For now, let’s ignore the ickiness and move along. We can write the following program:
(defn tcp-transport [server]
  (let [path "/tmp/resources"]
    (future
      (try
        (while (not (.isClosed server))
          (loop [conn (.accept server)
                 resource (atom nil)]
            (with-open [r (io/reader (.getInputStream conn))
                        w (io/writer (.getOutputStream conn))]
              (loop [line (.readLine r)]
                (when (seq (str/trim line))
                  (when (str/starts-with? line "GET")
                    (log/info line)
                    (reset! resource (second (str/split line #" "))))
                  (recur (.readLine r))))
              (let [ok   "HTTP/1.1 200 OK\r\n\r\n"
                    ;; slurp the file so we send its contents, not its path
                    file (slurp (str path @resource))]
                (.write w (str ok file))))
            (recur (.accept server) resource)))
        (catch SocketException e
          {:msg (.getMessage e)})))))
This will do in an ideal world. But we can’t just pretend that the requested resource always exists. We can see where this is going:
(let [ok        "HTTP/1.1 200 OK\r\n\r\n"
      not-found "HTTP/1.1 404 Not Found\r\n\r\n"
      ;; nil when the file is missing (java.io.FileNotFoundException must be imported)
      file      (try
                  (slurp (str path @resource))
                  (catch FileNotFoundException e))]
  (if (seq file)
    (.write w (str ok file))
    (.write w not-found)))
More logic. But hey, our web server can now serve resources from a folder. We have caught up with the Computerphile video. The demonstration is over. We too have done the Hello, World! web server, and then an upgraded version serving static resources. But really, that’s no way to say goodbye. Our web server is one mess of a function. It does way too much: listening on a TCP port, parsing the request, crafting the response. And on top of that we’re doing a mutation. It’s high time for a refactoring.
We’ve just mentioned how to better separate concerns, so let’s do that, starting with parsing the request:
(defn parse-request [conn]
  (let [r (io/reader (.getInputStream conn))]
    (loop [line (.readLine r)
           request {}]
      (if (seq (str/trim line))
        (if (str/starts-with? line "GET")
          (let [[method uri scheme] (str/split line #" ")]
            (recur (.readLine r)
                   (assoc request :method method :uri uri :scheme scheme)))
          ;; Split on the first colon only, so values such as "localhost:8085"
          ;; survive, and add to :headers instead of overwriting it.
          (let [[k v] (str/split line #":\s*" 2)]
            (recur (.readLine r)
                   (assoc-in request [:headers k] v))))
        request))))
Look, ma, no mutation! Instead, we’re accumulating each header line in the request map, and that is what the function returns. It seems we’ve been feeding a couple of birds with the same scone. That request can now be passed on to our new response function:
(defn send-response [conn request]
  (let [path "/tmp/resources"]
    (with-open [w (io/writer (.getOutputStream conn))]
      (if (.exists (io/file (str path (:uri request))))
        (let [resource (slurp (str path (:uri request)))]
          (.write w (str "HTTP/1.1 200 OK\r\n\r\n" resource)))
        (.write w "HTTP/1.1 404 Not Found\r\n\r\n")))))
Tying everything together:
(defn tcp-transport [server]
  (future
    (try
      (while (not (.isClosed server))
        (loop [conn (.accept server)]
          (let [request (parse-request conn)]
            (send-response conn request))
          (recur (.accept server))))
      (catch SocketException e
        {:msg (.getMessage e)}))))
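A pleasant side effect of pulling the parsing out is that it can be tried at the REPL without opening a socket. The sketch below assumes a hypothetical variant, parse-lines, that reads from any BufferedReader; the name, the :version key and the nil-tolerant blank check are assumptions of this sketch, not part of the code above.

```clojure
(require '[clojure.string :as str])
(import '(java.io BufferedReader StringReader))

(defn parse-lines
  "Hypothetical variant of parse-request that reads from any BufferedReader."
  [^BufferedReader r]
  (loop [line (.readLine r)
         request {:headers {}}]
    (if (str/blank? line)                        ; true for "" and for nil (end of stream)
      request
      (if (str/starts-with? line "GET")
        (let [[method uri version] (str/split line #" ")]
          (recur (.readLine r)
                 (assoc request :method method :uri uri :version version)))
        (let [[k v] (str/split line #":\s*" 2)]  ; limit 2 keeps colons inside the value
          (recur (.readLine r)
                 (assoc-in request [:headers k] v)))))))

;; Feed it a canned request instead of a live connection.
(def request
  (parse-lines
   (BufferedReader.
    (StringReader.
     "GET /index.html HTTP/1.1\r\nHost: localhost:8085\r\nAccept: */*\r\n\r\n"))))
```

Inspecting request should show the URI under :uri and both header lines under :headers, which is a quick way to check the parser before wiring it back to the socket.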
Remember when I enumerated the reasons behind yet another minimalist web server? I said that we wanted to show the power of REPL-based development. Well, there’s another reason why I think this exercise is worth pursuing. We are not going to stop with a minimal web server. Our web server will be Ring-compliant. This transforms our static web server, one that serves static resources, into a dynamic web server, one that is driven by a program, any program, as long as it takes a request as parameter and produces a response as return value. In Ring speak, this is called a handler.
Ring is a Clojure web applications library inspired by Python’s WSGI and Ruby’s Rack. By abstracting the details of HTTP into a simple, unified API, Ring allows web applications to be constructed of modular components that can be shared among a variety of applications, web servers, and web frameworks.
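Since a handler is just a function from a request map to a response map, it can be called directly at the REPL, no server required. A minimal sketch, in which hello-handler and the sample request are hypothetical and the header name follows the usual Ring convention of a plain string key:

```clojure
(defn hello-handler
  "A hypothetical handler: takes a request map, returns a response map."
  [request]
  {:status  200
   :headers {"Content-Type" "text/plain"}
   :body    (str "You asked for " (:uri request))})

;; No socket involved: a handler is exercised like any other function.
(def response
  (hello-handler {:request-method :get
                  :uri            "/index.html"}))
```

This is the property that makes handlers easy to test and to compose: whatever sits between the socket and the function is somebody else’s concern.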
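Next to the handler, the Ring spec defines the request map and its required keys. Some we’ve already seen: the URI, the method and the scheme, all extracted from the first line of an HTTP request. (Strictly speaking, the third token on that line, HTTP/1.1, is the protocol version; the Ring spec keeps it under :protocol and reserves :scheme for :http or :https.) The response map consists of a headers and a status key. If present, the body must satisfy a protocol: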
(defprotocol StreamableResponseBody
  (write-body-to-stream [body response output-stream]))
The spec is of interest to us because we’re implementing a web server, but the user doesn’t need to care about these intricacies. From his perspective, the handler is important, because it defines the behavior of his web application. And that handler will be passed to the Ring adapter, which will start the HTTP server. The adapter, thus, is the entry point and has the following signature.
(run-adapter handler options)
We can thus write the following:
(defn run-adapter [handler options]
  (let [server (ServerSocket. (:port options))]
    (tcp-transport server handler)
    (fn close [] (.close server))))
The run-adapter function returns a close function that we can keep a handle on. We’ll call it when we want to terminate the server gracefully.
Our TCP listener requires a small modification. It receives an extra argument, the handler. Herein lies the secret sauce. This is how we are creating an abstraction external to the implementation details of our web server, its transport layer, etc.
(defn tcp-transport [server f]
  (future
    (try
      (while (not (.isClosed server))
        (loop [conn (.accept server)]
          (let [request (parse-request conn)]
            (send-response conn (f request)))
          (recur (.accept server))))
      (catch SocketException e
        {:msg (.getMessage e)}))))
We are able to supply a Ring adapter to users who in turn can focus on Web development without giving a second thought to the underlying machinery. As to the underlying machinery, our ServerSocket that establishes TCP connections, well, TCP is a byte-stream protocol, and that is why our Ring adapter needs to transform the response’s body to a byte-stream. Once the stream is passed down to the socket, TCP itself will chunk it and ensure that the data is reliably transferred over the network. The StreamableResponseBody protocol allows the user to pass different formats in the body of the response, as long as they satisfy the protocol. They all will be reduced to a stream of bytes.
Out of the box, Ring comes with implementations for byte-arrays, strings, Clojure sequences, files and Java input streams. Since we are not relying on the Ring libraries but rather are referring to the spec (and building everything from scratch), we will re-implement some of them.
It seems desirable that we always allow a string as body of a response.
(extend-protocol StreamableResponseBody
  String
  (write-body-to-stream [body _ output-stream]
    (.write output-stream (.getBytes body))
    (.close output-stream)))
However, consider the case of a static website. An index.html itself is a text file. We could slurp it and pass the string to the java Writer, as we’ve done before. The latter will convert the characters to bytes before writing them to the underlying socket. However, when the browser receives the index.html, it will follow up with more requests as it encounters embedded links asking for CSS, scripts, images, fonts, etc. Some files, like the images or the fonts, are binary. Slurping them won’t do us good. While all strings are bytes, not all bytes are strings.
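The last point can be checked at the REPL. The sketch below takes the first three bytes of a typical JPEG file, which are not valid UTF-8, and pushes them through a String and back, much as a character-based pipeline would; the var names are hypothetical.

```clojure
(import '(java.util Arrays))

;; FF D8 FF: the opening bytes of a JPEG, and not valid UTF-8 text.
(def original
  (byte-array [(unchecked-byte 0xFF)
               (unchecked-byte 0xD8)
               (unchecked-byte 0xFF)]))

;; Decode to characters, then encode back to bytes.
(def round-tripped
  (.getBytes (String. ^bytes original "UTF-8") "UTF-8"))

;; false: the decoder replaced the bytes it could not interpret,
;; so the original content is gone.
(def survived?
  (Arrays/equals ^bytes original ^bytes round-tripped))
```

Text files survive this round trip, binary files generally do not, hence the need to hand raw bytes to the output stream.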
Thus, a protocol method implementation for java.io.File seems highly useful, since it will cover all types of files, allowing us to serve an entire directory of static resources, for example.
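(import java.io.File java.nio.file.Files)

(extend-protocol StreamableResponseBody
  File
  (write-body-to-stream [body _ output-stream]
    ;; Read the file as raw bytes so that binary content survives intact.
    (.write output-stream (Files/readAllBytes (.toPath body)))
    (.flush output-stream)
    (.close output-stream)))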
Turning the body of the response into an output stream (of bytes) happens after we write the headers of the response to the output stream (as bytes).
(defn send-response [conn response]
  (->> (write-headers response (.getOutputStream conn))
       (write-body-to-stream (:body response) response)))
As for the write-headers function:
;; responses is not shown in this post; it is presumably a map from status
;; code to status line, for example 200 to "HTTP/1.1 200 OK\r\n".
(defn write-headers [response output-stream]
  (.write output-stream
          ;; encode the status line and the headers as bytes in one go
          (.getBytes
           (str (get responses (:status response))
                (apply str (for [[k v] (:headers response)]
                             (str k " " v "\r\n")))
                "\r\n")))
  output-stream)
The important point is that the write-headers function returns the output stream without closing it because it may be further processed. Indeed, we call the protocol method write-body-to-stream on the body, and if it is nil, we close the stream.
(extend-protocol StreamableResponseBody
  nil
  (write-body-to-stream [_ _ output-stream]
    (.close output-stream)))
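The protocol implementations can be exercised without a network connection by writing into an in-memory stream. In this sketch the protocol and the String and nil implementations are redefined so that it stands alone; body->string is a hypothetical helper, and the explicit UTF-8 charset is an assumption.

```clojure
(import '(java.io ByteArrayOutputStream))

;; Same shape as the protocol above, repeated so the sketch is self-contained.
(defprotocol StreamableResponseBody
  (write-body-to-stream [body response output-stream]))

(extend-protocol StreamableResponseBody
  String
  (write-body-to-stream [body _ output-stream]
    (.write output-stream (.getBytes body "UTF-8"))
    (.close output-stream))
  nil
  (write-body-to-stream [_ _ output-stream]
    (.close output-stream)))

(defn body->string
  "Hypothetical helper: runs a body through the protocol and returns
  whatever ended up on the stream, decoded as UTF-8."
  [body]
  (let [out (ByteArrayOutputStream.)]
    (write-body-to-stream body {} out)
    (.toString out "UTF-8")))

(def with-body    (body->string "<html>404</html>"))
(def without-body (body->string nil))
```

A string body comes out unchanged on the other side, while a nil body leaves the stream empty, which matches the two cases described above.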
In the Computerphile version of our web server, or the pre-Ring one if you will, serving static resources under a folder was all our web server could do. And the logic was baked in. In the context of the Ring framework, it is a user-provided unit, separate and independent. Serving static resources under a folder becomes application logic.
Here’s what a handler that serves static pages might look like.
(defn handler [request]
  (let [redirects {"/"          "index.html"
                   "/index.htm" "index.html"}
        base-path "/tmp/resources"]
    (cond
      ;; known aliases are redirected to the canonical page
      (contains? redirects (:uri request))
      {:status  301
       :headers {"Location:" (get redirects (:uri request))}}

      ;; existing files are served with a guessed content type
      (.exists (io/file (str base-path (:uri request))))
      (let [file         (io/file (str base-path (:uri request)))
            content-type (Files/probeContentType (.toPath file))]
        {:status  200
         :headers {"Content-Type:" content-type}
         :body    file})

      :else
      {:status  404
       :headers {"Content-Type:" "text/html"}
       :body    "<html>404</html>"})))
This concludes my exploration of minimalist web servers, and may herald the beginning of yours. Indeed, have a look at the final code, study it, play with it, make it better. I am welcoming contributions that are close to the original spirit: single namespace, no dependencies, code brevity, etc. Here are some ideas to explore further:
- Support for other HTTP verbs than GET
- Write asynchronously to the socket with Java NIO channel
- Workers in a thread pool
- Efficient data transfer through zero copy
- HTTP server on top of a non-TCP transport, for example Unix domain sockets.
- Other optimizations
- Benchmark our naive implementation against production-grade software
Before we go, it would be nice to reflect back on what we’ve done here. Hopefully, I’ve succeeded in my attempt to document the cognitive experience at a REPL. How it truly is a bicycle for the mind, fostering exploration, empowering the inquisitive mind. We saw a progression of ideas, a succession of layers, a feedback loop. Sure, we have focused on the anatomy of an HTTP server, but that is secondary. Any challenge would have fit the bill. REPL-based development is a methodology, ancient and well-known in the field of computer science pedagogy, but not nearly an established practice in the industry. Part of this has to do with the fact that not all REPLs are born equal. The REPL experience alone is rarely a determining factor behind language choice among developers. Perhaps it should be.