Here is a comprehensive guide to creating RESTful services over HTTP.
This page is a work-in-progress. Consider ALPHA status. It was last updated on 2021-04-13 00:49:05 +0100. However, it is considered in a state worthy of publication and you should still be able to learn a lot from it. If you want to provide feedback, head over to the project discussions page.
1. Introduction
So you want to write a RESTful Web API?
If you’re happy with a first-cut quick-and-dirty get-the-job-done implementation, then perhaps this guide isn’t for you.
If you’re new to programming, then this probably isn’t the best guide to follow either (although writing a Web API isn’t a bad programming project to improve your skills).
If you want to write a 'proper' RESTful Web API, one that can be run in a production environment, work well, be solid and maintainable, then this is the guide for you.
Along the way, you’ll learn about what it really means to be a RESTful Web API, and much more, including:
- The difference between resources and representations
- Content negotiation: how resources can be mapped to multiple representations, and when you need to pick one
- Methods: what you have to do for each method you support
- Status codes: which status codes you should return in your responses, and when
- Conditional requests: how certain requests should proceed only if certain tests pass (called preconditions)
- Ranges: when representations are large, how to serve only the fragments that are actually needed
- Caching: how to make use of web caches to help your API scale
- Authorization: how to restrict access to your service to authorized parties
This guide is meant to be followed one step after another. At each step, feel free to stop and perhaps dive into some of the references we’ll provide. If nothing else, you’ll learn a lot!
1.1. Terminology
First, a word about the terminology we’ll use.
Developers don’t (usually) write web servers, but make use of one that is already written. Examples include the venerable Apache web server, Nginx, Jetty and Netty on the Java platform, Express on Node.js and many others.
Many web servers provide a way to plug in your own code. The web server will handle all the networking stuff, and will call your code for each web request that is sent to it. The code you provide is called a handler. The job of a handler is to receive the web request as input and produce a web response as output. The form of these requests and responses varies between languages and web servers, but a handler is typically a single function or object.
1.2. The last handler you’ll ever need to write?
Common advice is to write a separate handler for each resource, or set of similar resources. However, this leads to a lot of duplication and unnecessary premature specialization. In this guide, we recommend you create one handler 'to rule them all', which you can take from project to project, tweaking as necessary.
One of the six 'architectural constraints' of REST is the 'Uniform interface'. In REST, each resource should behave more-or-less the same way, each following a set of rules. Although these rules are numerous and fairly arduous to implement in places, the good news is that you only need to code up these rules once. Ultimately, that should mean a lot less code to write and maintain!
Another benefit of this approach is that having separated the code from the data declarations that drive it, you can store that data inside a database. Hey, you could even use our Crux database!
1.3. Clojure?
We’ve chosen to use Clojure for the example code in this guide. Clojure is concise, and lends itself well to data-driven coding.
That said, it doesn’t actually matter which programming language you choose to implement your web API in. The example code is small and easily translated to other programming languages.
1.4. Is this a web framework?
No.
Library composition outperforms framework callbacks in the long run. The long run typically begins on day two.
None of the support libraries used here accept 'callbacks', and that’s why we don’t call the composition a 'web framework'.
The major downside of our approach is that there is more work for you to do. There is no magical web framework to orchestrate everything for you.
The payback is that you retain control of your implementation. Ultimately, you make the decisions and can choose to deviate from this guide when appropriate. You spend more time wrangling your own problems and less time wrangling the web framework you’ve adopted.
This also leads to you reaching a deeper understanding of your own web API service, what it does and how to change it to meet new requirements. You’ll also learn more about the parts of the web that web frameworks hide from you.
For many, this payback is well worth the extra effort.
Good luck, be brave, take small deliberate steps, one at a time.
1.5. Is there a complete solution somewhere?
If you want a complete solution to study, you can find one in the Site source code.
1.6. How to get involved?
If you want to provide feedback, share ideas or otherwise contribute, please head over to our project discussions page.
2. Preliminaries
2.1. Clojure setup
With Clojure’s Ring library, we can handle a web request with a function. A simple Ring handler is a function that takes the web request and returns a web response.
(fn [req]                       ; (1)
  {:status 200 :body "OK"})     ; (2)
(1) req is a Clojure map, containing details of the incoming web request.
(2) This is a Clojure map, the value returned from the function, representing the HTTP response.
The decision whether to adopt the classic synchronous single-arity Ring handler functions, or asynchronous 3-arity Ring handler functions, is out of scope for this guide. You may use either.
3. The Steps
3.1. Initialize the request’s state
(defn wrap-initialize-request
  "Initialize request."
  [h]
  (fn [req]
    (let [extended-req
          (into
           req
           {:start-date (java.util.Date.)
            :request-id (java.util.UUID/randomUUID)
            :uri
            (str "https://"
                 (get-in req [:ring.request/headers "host"])
                 (:ring.request/path req))})]
      (h extended-req))))
3.2. Is the service available? (Optional)
The 503 (Service Unavailable) status code indicates that the server is currently unable to handle the request due to a temporary overload or scheduled maintenance.
- Check that your service is not overwhelmed with requests.
- If it is, throw an exception. Otherwise, go to the next step.
How you detect an overload is beyond the scope of this guide. It might be a feature of the web listener you are working with. Or you might want to build something that signals that new web requests should be temporarily suspended. If you don’t know, just skip this section; it’s optional.
In Clojure, when throwing an exception, embed the Ring response as exception data. This might include a Retry-After header and the time to wait, in seconds.
(throw
 (ex-info "Service unavailable"
          {::response                           ; (1)
           {:status 503
            :headers {"retry-after" "120"}      ; (2)
            :body "Service Unavailable\r\n"}}))
(1) Embed the Ring response as exception data.
(2) Add a Retry-After header.
Your whole handler should be wrapped in a try/catch block.
The catch block should catch the exception, extract the Ring response, and return it to the Ring adapter of the web server you are running.
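The wrapping described above can be sketched as a piece of middleware. This is a minimal sketch, assuming exceptions carry the Ring response under the ::response key as in the 503 example. The name wrap-exception-response is hypothetical, and the 500 fallback is an assumption rather than something this guide prescribes.

```clojure
(defn wrap-exception-response
  "Hypothetical middleware: run handler h and, if it throws, return the
  Ring response embedded in the exception data."
  [h]
  (fn [req]
    (try
      (h req)
      (catch clojure.lang.ExceptionInfo e
        (or (::response (ex-data e))
            ;; Assumption: fall back to a generic 500 when the exception
            ;; does not carry a response.
            {:status 500 :body "Internal Server Error\r\n"})))))

;; Usage: a handler that always reports a temporary overload.
(def handler
  (wrap-exception-response
   (fn [_req]
     (throw (ex-info "Service unavailable"
                     {::response {:status 503
                                  :headers {"retry-after" "120"}
                                  :body "Service Unavailable\r\n"}})))))
```

Calling (handler {}) returns the embedded 503 response map instead of letting the exception escape to the Ring adapter.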
3.2.1. References
- 503 Service Unavailable
- Retry-After
3.3. Is the method implemented?
The 501 (Not Implemented) status code indicates that the server does not support the functionality required to fulfill the request.
The next step is to check whether the request method is one your implementation recognises.
- Check if the request method is recognised.
- If so, go to the next step.
- If not, throw an exception containing a 501 (Not Implemented) error response.
In Clojure, throw an exception like this:
(throw
 (ex-info
  "Method not implemented"
  {::response
   {:status 501
    :body "Not Implemented\r\n"}}))
The spin library offers a helper function that checks the request method is one of a set of known common HTTP methods, and if necessary, throws the exception as described:
(spin/check-method-not-implemented! request)
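If you prefer to write the check by hand, it could look something like the sketch below. This is not Spin’s implementation: the function name, the set of known methods and the use of the :ring.request/method key are all assumptions made for illustration.

```clojure
(def known-methods
  "Assumption: the request methods this implementation recognises."
  #{:get :head :post :put :delete :options})

(defn check-method-implemented!
  "Hypothetical helper: throw an exception carrying a 501 response
  unless the request method is one of known-methods."
  [req]
  (when-not (contains? known-methods (:ring.request/method req))
    (throw
     (ex-info "Method not implemented"
              {::response {:status 501
                           :body "Not Implemented\r\n"}}))))
```

The function returns nil for a recognised method, so it can be called purely for its side effect before moving to the next step.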
3.3.1. References
- 501 Not Implemented
3.4. Locate the resource
The target of an HTTP request is called a "resource".
- Use the URL of the request to look up or otherwise locate the resource object (which can be null).
- Hold this data structure as a variable, and go to the next step.
- Typically, a resource object will include the following:
  - The resource’s identifier (the URI) or, at least, its path
  - Which methods are allowed on the resource?
  - Current representations
  - Which ranges, if any, are acceptable?
  - Authorization rules: who is allowed to access this resource and how?
  - The allowed types of submitted representations
  - Anything else that is useful
An origin server maintains a mapping from resource identifiers to the set of representations corresponding to each resource
Architectural Styles and the Design of Network-based Software Architectures
Try to avoid using the request method when locating a resource; a resource value should encompass all its methods.
In Clojure, you might choose to model a resource as a map.
For example, here is a map that corresponds to a certain resource. It demonstrates some of the declarations recognised by functions in the Spin library (denoted by the use of the ::spin namespace prefix). Many additional application-specific entries may be added.
{:description "Prints 'Hello World!'"
 ::http/methods #{:get :head :options}}   ; (1)
(1) Allowed methods
You can use a router to locate the resource, but since resources can be modelled as data values, they can be stored in a key/value database. Locating a resource is simply a matter of looking it up using the URL as the key.
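As a minimal sketch of lookup by URL, the example below uses a plain Clojure map as a stand-in for a key/value database. The names resources and locate-resource, the example URL and the un-namespaced :methods key are assumptions for illustration. The :uri key is the one added to the request in Initialize the request’s state.

```clojure
(def resources
  "Hypothetical in-memory stand-in for a key/value database,
  keyed by the resource's URL."
  {"https://example.org/hello"
   {:description "Prints 'Hello World!'"
    :methods #{:get :head :options}}})

(defn locate-resource
  "Look up the resource for the request's :uri.
  Returns nil when no resource is mapped to that URL."
  [db req]
  (get db (:uri req)))
```

A nil result simply means there is no resource at that URL; later steps decide which error response, if any, that leads to.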
3.5. Redirect if necessary
3.6. Determine the current representations
A representation consists of both data (e.g. an HTML document, a JPEG image) and metadata, called representation metadata.
Representation metadata may include the following:
Key | Description | Example
---|---|---
content-type | The representation’s media type. If a |
content-encoding | How the representation’s data is encoded |
content-language | The human language used |
content-location | The URL of the representation, if different from the request URL |
last-modified | When the representation was last modified |
etag | A tag, uniquely identifying the version of this representation |
Representation data consists of payload header fields and a stream of bytes. Payload header fields may include the following:
Key | Description
---|---
content-length | The length of the representation’s stream of bytes
content-range | If a partial response, the range of the representation enclosed in the payload
trailer | Additional fields at the end of a chunked message
transfer-encoding | How the payload has been encoded in the message body
The vast majority of resources map to a single representation, but some resources can have multiple representations.
A representation reflects the current state of the resource. Where there are multiple representations, each representation should correspond with the current state of the resource.
- Using the resource, determine the currently mapped representations and store in a variable.
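To make this concrete, here is one way a resource with two current representations might be modelled as data. This is only a sketch: storing representations under a :representations key, using lower-case header names as metadata keys, and keeping the data under :body are all assumptions, not conventions required by this guide.

```clojure
(def hello-resource
  "Hypothetical resource with two representations of the same state."
  {:description "Prints 'Hello World!'"
   :representations
   [{"content-type" "text/html;charset=utf-8"
     "content-language" "en"
     "etag" "\"abc123\""
     :body "<h1>Hello World!</h1>\r\n"}
    {"content-type" "text/plain;charset=utf-8"
     "content-language" "en"
     "etag" "\"def456\""
     :body "Hello World!\r\n"}]})

(defn current-representations
  "Return the representations currently mapped to the resource,
  or an empty vector when there are none (or no resource at all)."
  [resource]
  (:representations resource []))
```

Both representations reflect the same resource state; they differ only in media type, so each carries its own entity tag.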
3.6.1. References
- Representation Metadata
- Payload Semantics
- Last-Modified
- ETag
3.7. Select the most acceptable current representation
For the given resource, determine the content negotiation strategy and follow one of the sections below (although it is permissible to use a hybrid or combination of strategies).
If in doubt, use proactive content negotiation, which is by far the most commonly employed strategy.
3.7.1. Proactive Content Negotiation
1. Load the current representations found in Determine the current representations.
2. If there are no representations, and the method is a GET or HEAD, return a 404 (Not Found) error response.
3. Select the most acceptable representation from this set, using the preferences contained in the request.
4. If there is no such acceptable representation, and the method is a GET or HEAD, throw an exception containing a 406 (Not Acceptable) error response. Construct a body containing links to each unacceptable representation from step 1.
5. Otherwise store the most acceptable current representation. This will be referred to from now on as the selected-representation. Move to the next step.
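The selection step can be sketched for the simplest case, negotiating on media type only via the Accept header. This is a simplified illustration under stated assumptions: representations are maps with a "content-type" entry, media-type parameters other than q are ignored, and Accept-Language, Accept-Encoding and Accept-Charset are not considered. All function names are hypothetical.

```clojure
(require '[clojure.string :as str])

(defn parse-accept
  "Parse an Accept header value into a seq of {:range ... :q ...} maps.
  A missing or blank header is treated as */*."
  [header]
  (for [part (str/split (if (str/blank? header) "*/*" header) #",")
        :let [[media-range & params] (map str/trim (str/split part #";"))
              q (some #(second (re-matches #"q=(\d(?:\.\d{0,3})?)" %)) params)]]
    {:range (str/lower-case media-range)
     :q (if q (Double/parseDouble q) 1.0)}))

(defn range-matches?
  "Does a media range such as text/* cover the given content type?"
  [media-range content-type]
  (let [[rt rs] (str/split media-range #"/")
        ;; Drop parameters such as ;charset=utf-8 before comparing.
        bare (str/lower-case (str/trim (first (str/split content-type #";"))))
        [ct cs] (str/split bare #"/")]
    (and (or (= rt "*") (= rt ct))
         (or (= rs "*") (= rs cs)))))

(defn specificity
  "2 for type/subtype, 1 for type/*, 0 for */*."
  [media-range]
  (let [[t s] (str/split media-range #"/")]
    (cond (= t "*") 0, (= s "*") 1, :else 2)))

(defn quality
  "The q-value assigned to a content type: that of the most specific
  matching media range, or 0.0 when no range matches."
  [accepts content-type]
  (let [matching (filter #(range-matches? (:range %) content-type) accepts)]
    (if (seq matching)
      (:q (apply max-key #(specificity (:range %)) matching))
      0.0)))

(defn select-representation
  "Return the representation with the highest non-zero quality, or nil
  when none is acceptable."
  [representations accept-header]
  (let [accepts (parse-accept accept-header)
        scored (for [r representations
                     :let [q (quality accepts (get r "content-type"))]
                     :when (pos? q)]
                 [q r])]
    (when (seq scored)
      (second (apply max-key first scored)))))
```

A nil result corresponds to the case where no representation is acceptable, at which point a GET or HEAD request would receive the 406 response.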
3.7.2. Reactive Content Negotiation
1. Determine the set of available representations for the resource.
2. If step 1 yields no representations, return a 404 error response. Go to Error.
3. Optionally, filter this set using the preferences contained in the request.
4. If step 3 yields a single representation, then use this as the representation and move on to the next section.
5. If step 3 yields multiple representations, respond with a 300 response and construct a body containing links to each representation in this filtered set.
3.8. Authenticate the request (Optional)
- Add to the request any roles, credentials or entitlements that can be acquired. Use information in the resource found in Locate the resource to determine the authentication scheme and/or protection space.
- This usually involves inspecting the request’s Authorization header and/or other headers, frequently Cookie headers.
3.9. Authorize the request (Optional)
- Update the resource object according to the authenticated request’s roles, credentials or other entitlements.
- If the resource cannot be accessed without credentials, and none have been supplied (or those that have been supplied are invalid), throw an exception that contains a 401 (Unauthorized) error response. This response must include a WWW-Authenticate header to indicate to the user agent that it should resend the request with credentials.
- If the request does contain valid authenticated credentials, but they are insufficient to provide access to the resource given the request’s method, throw an exception that contains a 403 (Forbidden) error response, or a 404 (Not Found) error response if you want to hide the existence of the unauthorized resource from the user.
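The distinction between 401 and 403 can be sketched as follows. This is a deliberately simple, role-based illustration: the :required-role key on the resource, the :roles set on the authenticated request, and the Basic challenge with realm "example" are all assumptions, not part of this guide or of Spin.

```clojure
(defn authorize!
  "Hypothetical authorization check. Returns the request when access is
  granted, otherwise throws an exception carrying a 401 or 403 response."
  [req resource]
  (let [required (:required-role resource)
        roles (:roles req)]
    (cond
      ;; The resource is public: nothing to check.
      (nil? required) req

      ;; No valid credentials were supplied: challenge the user agent.
      (empty? roles)
      (throw (ex-info "Unauthorized"
                      {::response
                       {:status 401
                        :headers {"www-authenticate" "Basic realm=\"example\""}
                        :body "Unauthorized\r\n"}}))

      ;; Credentials are valid but insufficient for this resource.
      (not (contains? roles required))
      (throw (ex-info "Forbidden"
                      {::response {:status 403
                                   :body "Forbidden\r\n"}}))

      :else req)))
```

To hide the existence of a resource from unauthorized users, the 403 branch could return a 404 response instead.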
3.10. Check method allowed
- Check the request method against the methods allowed by the resource.
- If the request method isn’t allowed, return a 405 (Method Not Allowed) error response containing an Allow header.
The rationale for authorizing the request prior to checking that the method is allowed is to hide which methods are allowed to unauthorized users.
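A minimal sketch of this check, assuming the resource lists its allowed methods as a set of keywords under a :methods key and the request method is found under :ring.request/method. Both keys and the function name are assumptions for illustration.

```clojure
(require '[clojure.string :as str])

(defn check-method-allowed!
  "Hypothetical helper: throw an exception carrying a 405 response, with
  an Allow header listing the permitted methods, unless the resource
  allows the request method."
  [req resource]
  (let [allowed (:methods resource #{})]
    (when-not (contains? allowed (:ring.request/method req))
      (throw
       (ex-info "Method not allowed"
                {::response
                 {:status 405
                  ;; e.g. "GET, HEAD, OPTIONS"
                  :headers {"allow" (->> allowed
                                         (map (comp str/upper-case name))
                                         sort
                                         (str/join ", "))}
                  :body "Method Not Allowed\r\n"}})))))
```

The same comma-separated Allow value can be reused when answering an OPTIONS request.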
3.11. Prepare the response
- Get the system time and store it in a variable. This will now be referred to as the message origination date for the response.
3.12. Perform the method
You should now perform the action associated with the request method.
- Go to the section that matches the request method:
3.12.1. The GET (and HEAD) methods
- If there is a Range header in the request, and ranges are supported on this resource, parse its value.
  - If the units of the range header aren’t supported by the resource, throw an exception with a 400 (Bad Request) error response.
- Compute the payload header fields and payload response body.
  - If there is a valid If-Range header, and ranges are supported, set the status to 206, add a Content-Range header to the payload header fields, and compute the shorter body to reflect the requested range.
- Add the Date header, using the message origination date stored in Prepare the response.
- If supported, add an Accept-Ranges header.
- Add the representation metadata to the response headers.
  - Only include the Content-Location metadata if this is different from the URL of the request.
- Add the payload header fields.
- If the request method is GET, add the representation’s data stream to the response’s body.
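The non-range parts of this procedure can be sketched as a single function. This is a simplified illustration: range handling is left out entirely, the representation is assumed to be a map with lower-case header names as metadata keys and a string under :body, and the date is assumed to be passed in already formatted as an HTTP date. The function name is hypothetical.

```clojure
(def metadata-keys
  "Assumed keys under which representation metadata is stored."
  ["content-type" "content-encoding" "content-language"
   "content-location" "last-modified" "etag"])

(defn get-response
  "Build the response to a GET or HEAD request from the selected
  representation. Both methods produce the same headers; only GET
  carries the body."
  [req selected-representation date]
  (let [metadata (select-keys selected-representation metadata-keys)
        ;; Only keep Content-Location when it differs from the request URL.
        metadata (if (= (get metadata "content-location") (:uri req))
                   (dissoc metadata "content-location")
                   metadata)
        body (:body selected-representation)
        length (count (.getBytes ^String body "UTF-8"))]
    (cond-> {:status 200
             :headers (assoc metadata
                             "date" date
                             "content-length" (str length))}
      (= :get (:ring.request/method req)) (assoc :body body))))
```

Computing Content-Length from the encoded bytes rather than the string length keeps the header correct for non-ASCII bodies.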
3.12.2. The POST method
The first step in processing a POST request is to receive any "representation enclosed in the request message payload" and check its validity.
- Process the received representation. This may involve per-resource custom code.
3.12.3. The PUT method
The PUT method requests that the state of the target resource be created or replaced with the state defined by the representation enclosed in the request message payload.
The first step in processing a PUT request is to check if there’s a Content-Range header in the request. If so, you should return a 400 error response.
The next step is to receive the "representation enclosed in the request message payload" and check its validity.
Here is the procedure:
- Check if there’s a Content-Range header in the request. If so, return a 400 error response.
The second part of processing a PUT request is to update the state of the resource. The representation read from the request indicates that the state of the resource needs to change, and that might involve changing all its current representations together. Ideally, this should happen atomically (all changes should succeed together, or fail together).
We must also evaluate any preconditions just before performing the required updates. To guarantee that we avoid losing updates, we should evaluate the preconditions at the beginning of the same transaction. That way, race conditions will be avoided.
Therefore, here is the procedure:
- Within a transaction,
  - Update the state of the resource (this might involve resource-specific code)
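The idea of checking a precondition and updating state inside one transaction can be illustrated with a Clojure atom standing in for the database. This is only a sketch under stated assumptions: the store layout, the put! function and the exact-string comparison of the If-Match value against the stored entity tag are all simplifications invented for this example.

```clojure
(def store
  "Hypothetical in-memory store: URI -> current state with an entity tag."
  (atom {"https://example.org/doc" {:etag "\"v1\"" :body "first"}}))

(defn put!
  "Replace the state at uri, but only if the If-Match value (when given)
  equals the current entity tag. The check and the update both run
  inside a single swap!, so another writer cannot slip in between them."
  [store uri if-match new-state]
  (swap! store
         (fn [db]
           (let [current (get db uri)]
             (when (and if-match (not= if-match (:etag current)))
               (throw (ex-info "Precondition failed"
                               {::response {:status 412
                                            :body "Precondition Failed\r\n"}})))
             (assoc db uri new-state)))))
```

Because swap! may retry its function under contention, the precondition is re-evaluated against the latest state each time, which is what prevents a lost update. A real database would need an equivalent transactional guarantee.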
3.12.4. The DELETE method
- Delete the mapping between the URI and the resource (this might involve resource-specific code).
3.12.5. The OPTIONS method
- Return a 200 (OK) response containing an Allow header to indicate the allowed methods on the resource.
3.13. Add security headers
3.14. Add CORS headers (Optional)
3.15. Handle errors
3.16. Log the request (Optional)
Appendix A: Procedures
The procedures in this section are linked to from the main content.
A.1. Evaluate preconditions
For any request method that involves the selection or modification of a representation (e.g. GET, POST, PUT, DELETE), a set of preconditions is evaluated.
Here’s the procedure:
1. If the request contains an If-Match header field value, and
   - If the value is * and the resource has no mapped representations, return a 412 (Precondition Failed) error response.
   - If none of the entity-tags in If-Match strongly match the entity tag of the selected representation, return a 412 (Precondition Failed) error response.
2. If the request does not have an If-Match header, but contains the header If-Unmodified-Since, and
   - If the last-modified value of the representation metadata of the selected representation is after the date in the If-Unmodified-Since header, return a 412 (Precondition Failed) error response.
3. If the request contains an If-None-Match header field value,
   - If the If-None-Match header field value contains an entity-tag which weakly matches the etag value of the representation metadata of the selected representation, OR if the If-None-Match header value is * and there is at least one current representation for the resource,
     - If the request method is a GET or HEAD, return a 304 (Not Modified) response,
     - Otherwise, return a 412 (Precondition Failed) error response.
4. Otherwise, if the request does not have an If-None-Match header field value,
   - If the request method is GET or HEAD, and the request has an If-Modified-Since header field value, unless the last-modified value of the representation metadata of the selected representation is after the value of the If-Modified-Since header field value, return a 304 (Not Modified) response.
Spin has a utility function you can call with the request, resource and representation metadata of the selected representation.
The function will evaluate the preconditions using the header field values in the request and the representation metadata of the selected representation, throwing an exception at any point one of the preconditions fails.
(spin/evaluate-preconditions!
request resource selected-representation-metadata date)
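To show what strong and weak matching mean in practice, here is a small sketch of the entity-tag comparisons behind If-Match and If-None-Match. It is not Spin’s implementation: the function names are hypothetical, and the date-based preconditions are left out.

```clojure
(require '[clojure.string :as str])

(defn parse-entity-tags
  "Extract entity-tags such as \"abc\" or W/\"abc\" from a header value."
  [header]
  (re-seq #"(?:W/)?\"[^\"]*\"" header))

(defn weak? [etag] (str/starts-with? etag "W/"))

(defn opaque-tag
  "The quoted part of an entity-tag, without any W/ prefix."
  [etag]
  (if (weak? etag) (subs etag 2) etag))

(defn strong-match?
  "Strong comparison: both tags are strong and identical."
  [a b]
  (and (not (weak? a)) (not (weak? b)) (= a b)))

(defn weak-match?
  "Weak comparison: the opaque tags are identical; weakness is ignored."
  [a b]
  (= (opaque-tag a) (opaque-tag b)))

(defn if-match-ok?
  "True when the If-Match value is satisfied by the selected
  representation's etag. A nil etag means there is no current
  representation, which always fails."
  [header etag]
  (cond
    (nil? etag) false
    (= "*" (str/trim header)) true
    :else (boolean (some #(strong-match? % etag) (parse-entity-tags header)))))

(defn if-none-match-ok?
  "True when the If-None-Match value does NOT match, so the request may
  proceed. A false result leads to 304 for GET/HEAD and 412 otherwise."
  [header etag]
  (cond
    (nil? etag) true
    (= "*" (str/trim header)) false
    :else (not-any? #(weak-match? % etag) (parse-entity-tags header))))
```

Note that If-Match uses the strong comparison while If-None-Match uses the weak one, matching the procedure above.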
A.1.1. References
- Evaluation
- Precedence
A.2. Receiving a representation enclosed in a request
Here is the procedure:
1. If the request doesn’t have a Content-Length header, return a 411 (Length Required) error response.
2. If the value of the Content-Length header field is more than the maximum content length allowed by the resource, then return a 413 (Payload Too Large) error response.
3. If there is no request message payload, return a 400 (Bad Request) error response.
4. Check that the representation metadata in the request headers meet the acceptability criteria for the resource and if not, either reconfigure the resource, transform the PUT representation somehow, or reject the request with a 415 (Unsupported Media Type) or 409 (Conflict) error response.
5. Load the representation from the request message payload. Close the input stream after reading exactly the number of bytes declared by the Content-Length request header (and no more).
Spin has a utility function that implements this procedure:
(spin/receive-representation request resource date)
Currently, if the representation doesn’t meet the criteria in the resource’s configuration, the request is rejected. There is no attempt to recover, either by reconfiguring the resource or transforming the representation.
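The Content-Length checks at the start of the procedure can be sketched by hand as follows. This is not Spin’s code: the helper names and the :max-content-length key on the resource are assumptions, and rejecting an unparseable Content-Length with a 400 is a choice made for this example.

```clojure
(defn fail!
  "Hypothetical helper: throw an exception carrying an error response."
  [status reason]
  (throw (ex-info reason {::response {:status status
                                      :body (str reason "\r\n")}})))

(defn check-content-length!
  "Return the declared Content-Length as a number, or throw:
  411 when the header is missing, 400 when it is not a number,
  413 when it exceeds the resource's :max-content-length."
  [req resource]
  (let [header (get-in req [:ring.request/headers "content-length"])
        length (when header
                 (try (Long/parseLong header)
                      (catch NumberFormatException _ nil)))]
    (cond
      (nil? header) (fail! 411 "Length Required")
      (nil? length) (fail! 400 "Bad Request")
      ;; Assumption: no limit applies when the resource declares none.
      (> length (:max-content-length resource Long/MAX_VALUE))
      (fail! 413 "Payload Too Large")
      :else length)))
```

The returned length is the exact number of bytes to read from the request body before closing the input stream.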
A.3. Error
If you want to send an error response, you should decide whether to send a body in the response. This might contain information about the error and explain to the user-agent (or human) how to avoid the error in future.
- Perform content negotiation to establish the best representation to send.
- Add the representation metadata to the response headers, and stream the representation data as the body of the response.
References
- [cowboy] Cowboy is a callback-based web framework in Erlang, sharing similar goals of full conformance with HTTP standards.
- [Fielding-2000] Fielding, Roy Thomas. Architectural Styles and the Design of Network-based Software Architectures. Doctoral dissertation, University of California, Irvine, 2000.
- [liberator] Liberator is a Clojure library by Philip Meier (et al.) based on Alan Dean’s activity diagram.
- [RFC7230] R. Fielding, J. Reschke (et al.). RFC 7230. Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing. Internet Engineering Task Force (IETF). 2014.
- [RFC7231] R. Fielding, J. Reschke (et al.). RFC 7231. Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content. Internet Engineering Task Force (IETF). 2014.
- [RFC7232] R. Fielding, J. Reschke (et al.). RFC 7232. Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests. Internet Engineering Task Force (IETF). 2014.
- [RFC7233] R. Fielding, J. Reschke (et al.). RFC 7233. Hypertext Transfer Protocol (HTTP/1.1): Range Requests. Internet Engineering Task Force (IETF). 2014.
- [RFC7234] R. Fielding, J. Reschke (et al.). RFC 7234. Hypertext Transfer Protocol (HTTP/1.1): Caching. Internet Engineering Task Force (IETF). 2014.
- [RFC7235] R. Fielding, J. Reschke (et al.). RFC 7235. Hypertext Transfer Protocol (HTTP/1.1): Authentication. Internet Engineering Task Force (IETF). 2014.
- [Webmachine] webmachine is based on an activity diagram, first created by Alan Dean.
- [yada] yada is a JUXT project with similar aims but technically a framework requiring callback functions. The library composition of Spin, pick and reap is far more complete (in terms of conforming to the RFCs, in both breadth and depth) and accurate, but not as well battle-tested.