I recently thought I’d google around a bit to learn more about REST, and just finding good information turned out to be much harder than I expected. This was due to the lack of good examples online, conflicting opinions about what REST really is, and a lot of so-called “RESTful” web services that don’t seem to be very RESTful.
This is my current understanding (with references). Hopefully I add more legitimate information to this topic than I do misinformation. I’ll try to cover my current thoughts regarding versioning, media types, URIs, hypermedia, and more.
To start off with, check out Roy Fielding’s blog (“Untangled”), specifically this entry: “REST APIs must be hypertext-driven”.
A few things can immediately be learned from that post alone. Specifically,
1. Hypermedia: REST is driven by hypertext. If you’re not using hypermedia, you’re not doing REST (more on that later).
2. URIs: REST doesn’t care about how you define your URIs. If your web-service goes into detail about what the URIs you build should look like, then you’re not doing REST. REST is 100% hypertext-driven and 0% URI-driven. If you didn’t fully get that from the above link, Roy further clarifies his position in a news-group posting:
“…any pre-definition of URI layout or WSDL-like service semantics is an absolute waste of time as far as a RESTful architecture is concerned. External artifacts might help the developers communicate about or improve the design of their system, just as readable URIs will help a human user understand where they are in a hypertext user-interface, but those external artifacts must not have a role in the runtime architecture if the system is truly hypertext-driven.”
3. HTTP: REST doesn’t depend on HTTP. You can choose to implement it using HTTP (and if you do, you shouldn’t change the way HTTP works), but you can implement it over any communication protocol. From that, it should also be clear that just using HTTP doesn’t make your API RESTful.
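To make the “hypertext-driven” point a bit more concrete, here is a minimal sketch of a client that knows only an entry point and a few link relation names. Everything in it is my own invention for illustration: the fake in-memory server, the URIs, and the relation names “orders” and “latest”. The URIs are deliberately meaningless to show that the client never depends on their layout.

```python
# Hypothetical in-memory "server": maps URIs to representations with links.
# The URIs and link relation names are made up for illustration only.
FAKE_SERVER = {
    "/": {"links": {"orders": "/x/17"}},
    "/x/17": {"links": {"latest": "/q?id=9"}, "items": ["order-9"]},
    "/q?id=9": {"links": {}, "status": "shipped"},
}


def fetch(uri):
    """Stand-in for an HTTP GET; returns the representation stored at uri."""
    return FAKE_SERVER[uri]


def follow(entry_uri, rels):
    """Start at the entry point and follow link relations by name.

    The client knows the entry URI and the relation names, but never
    builds or inspects the URIs in between.
    """
    doc = fetch(entry_uri)
    for rel in rels:
        doc = fetch(doc["links"][rel])
    return doc


result = follow("/", ["orders", "latest"])
print(result["status"])
```

If the server later changes “/x/17” to something else, this client keeps working as long as the relation names stay the same.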
It’s probably a good time to point out that — in the comments section of that same blog entry — Roy also adds:
REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them.
So, if you’re building an internal company web-service, don’t kill yourself trying to be 100% RESTful if it doesn’t quite work for you. Do what works best for you and your clients (that pretty much applies to any web-service you’re trying to build).
That being said, if you don’t follow all of the mandatory constraints of REST, then please don’t call your web-service RESTful (call it a web-service, or a hypermedia service, or whatever might actually be appropriate).
Service Versioning and Media Types:
There are a few ways you can version your service (we’ll discuss how applicable these are to REST in a moment):
- in the URI itself (e.g., “http://myservice/v2/resource”);
- as a query parameter (e.g., “http://myservice/resource?version=2”);
- inside the payload of your data (e.g., “<myservice version='2' … />”); or
- as part of a custom media type (examples to follow).
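To see the four options side by side, here is a small sketch that builds one request description for each style. The function name, the dictionary layout, and the media type name “application/vnd-example-v2+xml” are assumptions of mine, not part of any real service, and the payload is stripped down to just the version attribute.

```python
def versioning_styles(version, base="http://myservice"):
    """Return one hypothetical request description per versioning style."""
    return {
        # Version in the URI path
        "uri": {"url": f"{base}/v{version}/resource"},
        # Version as a query parameter
        "query": {"url": f"{base}/resource?version={version}"},
        # Version inside the payload (only the attribute is shown here)
        "payload": {
            "url": f"{base}/resource",
            "body": f"<myservice version='{version}'/>",
        },
        # Version as part of a custom media type (name is an assumption)
        "media_type": {
            "url": f"{base}/resource",
            "headers": {"Accept": f"application/vnd-example-v{version}+xml"},
        },
    }


for style, request in versioning_styles(2).items():
    print(style, request)
```

Note that only the first two styles change the URI; the last two leave the resource identifier alone.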
Media Types & Data Formats
Sticking with media types for a moment, let’s first address the question of whether you should use a generic media type (e.g., “application/xml”), some other standard media type (e.g., “application/atom+xml”), or your own custom media type (we’ll leave the choice between the last two for later).
When Nic James Ferrier talked about generic media types in an online discussion, Roy responded with:
…I would not use “application/xml” for anything useful. The argument that people can look inside at the root element died a long time ago (shortly after the people making it tried to build an intermediary).
Cool. So that clarifies his position on using a generic media type (like “application/xml” or “application/json”). Admittedly, Roy doesn’t specifically say here that doing so isn’t RESTful, but since he clearly thinks there are better ways, let’s rule this option out.
What about putting the data format in a query string, like “/resource?format=xml”? Regarding that, Roy makes the following comment: “I don’t know of any sane system that uses the query portion to distinguish format-specific resources” (see also here). In the same posting he mentions that there’s “absolutely no good reason” to do that instead of something like “resource.xml”.
So, with regard to where to put the resource format, adding it as part of the query string is out, while including it as part of the resource name seems like a viable approach (What about as part of the media type? Read on…).
This posting adds some more information (Roy’s comments follow after the initial poster’s comments):
> We talked about unique naming and how there
> shouldn’t be /resource.xml and /resource.json,
> but rather /resource and two representations based
> on the Accept header.
Actually, there should be all three if you want a negotiated resource.
It is important to understand that these are three *different*
resources (resource != file). Each identifier corresponds to a
unique semantic and mapping over time.
The former are requests on two different resources. The latter
are two varying requests on one resource. The only difference,
in my opinion, is that the single varying resource makes for a
better bookmark because it is less susceptible to both
differences in user agent capabilities (different accept lists)
and changes in supported media types over time. It is not,
however, a replacement for the media-specific resources and
their corresponding URIs.
The media-specific resources are also useful for the apps
that don’t want to negotiate, especially those performing
remote authoring or versioning.
So, to summarize, the resource format should go:
(i) in the resource name (e.g., “/resource.json”) or URI (e.g., “/json/resource”); OR
(ii) in the ACCEPT header as part of the media type; OR
(iii) in both.
However, it would be preferable for it to not be a part of the query string (e.g., “/resource?type=json”), although that’s not strictly disallowed.
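Here is a rough sketch of how I picture the “all three resources” idea from Roy’s reply: one negotiated resource plus two media-specific ones. The media type names, the sample bodies, and the resolve() function are hypothetical, and for brevity the negotiation ignores q-values and simply honors the order in which the client lists its types.

```python
# Hypothetical custom media types and sample representations.
JSON_TYPE = "application/vnd-example+json"
XML_TYPE = "application/vnd-example+xml"

REPRESENTATIONS = {
    JSON_TYPE: '{"name": "example"}',
    XML_TYPE: "<name>example</name>",
}

# Two media-specific resources, each with its own identifier.
MEDIA_SPECIFIC = {"/resource.json": JSON_TYPE, "/resource.xml": XML_TYPE}


def resolve(path, accept="*/*"):
    """Return (media_type, body) for a request, or None if nothing fits."""
    if path in MEDIA_SPECIFIC:
        # Media-specific resource: in this sketch the Accept header is ignored.
        media_type = MEDIA_SPECIFIC[path]
        return media_type, REPRESENTATIONS[media_type]
    if path == "/resource":
        # Negotiated resource: the first listed type we can serve wins.
        for part in accept.split(","):
            wanted = part.split(";")[0].strip()
            if wanted == "*/*":
                return JSON_TYPE, REPRESENTATIONS[JSON_TYPE]
            if wanted in REPRESENTATIONS:
                return wanted, REPRESENTATIONS[wanted]
    return None


print(resolve("/resource.xml"))
print(resolve("/resource", accept=JSON_TYPE))
```

A client that doesn’t want to negotiate can bookmark “/resource.xml”, while a client that can handle either format asks for “/resource” and lets the server decide.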
My personal preference at the moment is to put the format information in the ACCEPT header. I’ll go into more detail later on why I think that (via an example), but to offer a glimpse… after following a link that Roy provided to this article, I came across the following passage:
… although people may well be able to discern that two URLs probably point to the same thing, machines can’t.
Once you have a single resource at a single persistent URL you can start to do some interesting things. You can make that resource available in a variety of different formats each optimised for different uses: HTML for web browsers, XHTML MB or WML for mobile, JSON for Ajax applications etc.
While these aren’t Roy’s comments (they came from the aforementioned link, provided by Roy), I tend to agree with them myself.
I didn’t find any specific RESTful argument against putting the version information inside of your data — such as in a header, for example. Earlier it was mentioned that looking inside the root element for the media type may not be a good idea, but this doesn’t necessarily apply to the version information. After all, HTTP and XML both include version information in their headers, so this seems like a possibility that still fits with a RESTful approach.
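As a small illustration of the “version inside the payload” option, here is a sketch that reads a version attribute from the root element and picks a parser accordingly. The element names, the two parser functions, and the shape of the V1 and V2 documents are all assumptions I made up for the example.

```python
import xml.etree.ElementTree as ET


def parse_v1(root):
    """Hypothetical V1 payload: just a name."""
    return {"name": root.findtext("name")}


def parse_v2(root):
    """Hypothetical V2 payload: a name plus any number of tags."""
    return {
        "name": root.findtext("name"),
        "tags": [tag.text for tag in root.findall("tag")],
    }


PARSERS = {"1": parse_v1, "2": parse_v2}


def parse(xml_text):
    """Read the version from the root element, then dispatch on it."""
    root = ET.fromstring(xml_text)
    version = root.get("version")
    return PARSERS[version](root)


print(parse("<myservice version='1'><name>a</name></myservice>"))
print(parse("<myservice version='2'><name>a</name><tag>x</tag></myservice>"))
```

One consequence worth noticing: the receiver has to start parsing the payload before it knows which version it is dealing with.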
What about putting version information inside of the URI itself (as in: “/myservice/v2/resource”)?
Reasoning that this is not much different from putting the resource format in the URI, I tend to think that this option is also OK from a REST point of view. You may have heard objections to doing this along the lines of “URIs must be opaque.” But in fact Mr. Fielding himself wrote: “REST does not require that a URI be opaque.” So, REST clearly allows this.
Where did this “constraint” come from, then? Well, it was Tim Berners-Lee who wrote about “The Opacity Axiom” which reads:
The only thing you can use an identifier for is to refer to an object. When you are not dereferencing, you should not look at the contents of the URI string to gain other information.
So that pretty much makes this option a toss-up: the creator of REST states that REST does not require a URI to be opaque, while the creator of the World Wide Web states that URIs should be opaque.
For reasons similar to the discussion regarding data formats I think that it would be preferable that the version information not be in the query string, though.
So, we have the following options:
(i) put the version information inside the payload; OR
(ii) put the version information in the URI (not my personal favorite, but still allowed); OR
(iii) include the version information as part of the media-type (either explicitly or implicitly).
As before, it would be preferable for it to not be a part of the query string (e.g., “/resource?version=2”), although that’s not strictly disallowed either.
My preference, again, is to include the version information (implicitly or explicitly) as part of the media-type. The following two examples give some clues as to why I think that this approach may be more flexible.
EXAMPLE SCENARIO #1:
Assume you have two versions of your service: V1 and V2. Also assume that you provided support in your service implementation for both xml and json in both version V1 and version V2.
Now, let’s say you have a client that also supports both V1 and V2. But, due to time constraints, it only supported xml in V1. Furthermore, someone thought XML was too verbose so they switched to only supporting json in V2.
How can you let the server know what versions and formats you support?
You can’t easily do that as part of the URI. Regarding the version, you might get away with declaring support for only the latest version (as in “/service/v2/resource”), but that has some limitations, as your client is potentially limiting itself by not letting the service know that it can support multiple versions (see next example).
You can try to specify them in the query string, using multiple options, as in “/resource?v=1,2&format=xml,json”, but that might suggest that you can handle all four combinations (V1 & xml; V1 & json; V2 & xml; V2 & json), which you can’t. You could also explicitly put all the valid combinations in the query string, as in “/resource?versionAndFormat=v1_xml, v2_json”, but that seems like an ugly way to attempt what the ACCEPT header already gives us, and we’ve already found out that putting format information in the query string isn’t recommended.
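For contrast, here is a rough sketch of how the client in this scenario might build an ACCEPT header that lists only the two combinations it really supports. The media type naming pattern (“application/vnd-example-v<version>+<format>”) and the accept_header() helper are assumptions of mine for illustration.

```python
def accept_header(supported):
    """Build an Accept header from (version, fmt, q) tuples.

    Each tuple is one combination the client can actually handle,
    so unsupported pairings are never advertised.
    """
    parts = []
    for version, fmt, q in supported:
        # Hypothetical naming pattern for a versioned custom media type.
        media_type = f"application/vnd-example-v{version}+{fmt}"
        # q defaults to 1 in HTTP, so only lower preferences are spelled out.
        parts.append(media_type if q == 1.0 else f"{media_type}; q={q}")
    return ", ".join(parts)


# The client from this scenario: json in V2 (preferred), xml in V1.
print(accept_header([(2, "json", 1.0), (1, "xml", 0.5)]))
```

Because each entry is a complete version-and-format pair, there is no way to accidentally imply support for V1 with json or V2 with xml.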
EXAMPLE SCENARIO #2:
While this scenario may not be applicable to you as-is, it does highlight the flexibility that using the ACCEPT header gives you.
Imagine a system in which remote sites collect data and send it to a centralized host server. Ideally, clients will connect to this central server to view their data. However, in some cases this isn’t possible (e.g., the remote sites are connected via a satellite, which is currently down), or it just isn’t ideal (maybe there is too much latency, what with all the sunspot activity, or maybe your system batches its data and only periodically sends it to the server, and your client — which also is at the remote site — happens to want more real-time data in this case).
In this example, there are actually two (or more) web-servers that implement your (RESTful) web-service API. One is your “main” server (the central server), and the other is hosted at the remote site (or sites, if there are more of them). Since your remote server is, well, remote, its software hasn’t been updated in a while and still only supports Version 1 of your service (you can probably already see where this is going…).
Now, your client software could have some local config file that you need to keep up-to-date that says which servers support which versions (although keeping that up-to-date with multiple clients and multiple servers seems questionable), so that it knows exactly what to ask for. But remember, REST is hypertext-driven. If your entry point to the service is the “remote” server, then using the “version in the URI” approach requires you to use a URI like “/service/V1/”.
But what if you do that, and the “remote” server has to redirect you to information that only exists on the central server (or some other remote server)? Since the server you’ve initially connected to only knows about “V1” (as that’s what you gave it in your URI), that’s all it can pass on to the central server. So, for any information you get back from the central server (again, you’re just following a link, so you don’t really know where the data is coming from) you’re only getting version V1 of the data. But, darn it, you actually do support “V2” (in json format) and you’d prefer to get “V2” data as it contains a whole bunch of extra, useful, and otherwise really nice-to-have information. Unfortunately, you can’t easily do that if the version information is in the URI. You can probably come up with other scenarios that equally aren’t optimal…
Now, if you used the ACCEPT header instead, passing in something like:
Accept: application/vnd-example-v2+json, application/vnd-example-v1+xml; q=0.5
(and we’ll go over formatting the ACCEPT header some other time), then you’ll always get the best format and version possible, regardless of which server ultimately ends up responding to your request.
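To show what “best format and version possible” could look like on the server side, here is a minimal sketch of a negotiation routine run against two servers: a central one that supports V1 and V2, and a remote one stuck on V1. The function names and the sets of supported types are hypothetical, and the parser is deliberately simplified (no wildcards, no parameters other than q).

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, best first."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type, q = pieces[0], 1.0  # q defaults to 1 when omitted
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep the client's order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def negotiate(header, server_supports):
    """Return the client's most preferred type that the server can produce."""
    for media_type, q in parse_accept(header):
        if q > 0 and media_type in server_supports:
            return media_type
    return None


ACCEPT = "application/vnd-example-v2+json, application/vnd-example-v1+xml; q=0.5"

# Hypothetical capabilities of the two servers from this scenario.
REMOTE = {"application/vnd-example-v1+xml", "application/vnd-example-v1+json"}
CENTRAL = REMOTE | {
    "application/vnd-example-v2+xml",
    "application/vnd-example-v2+json",
}

print(negotiate(ACCEPT, CENTRAL))
print(negotiate(ACCEPT, REMOTE))
```

The same request header yields V2 json from the central server and falls back to V1 xml on the remote one, without the client needing to know which server it is talking to.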
There’s more that I’d like to write about on this topic, but since my wife is complaining about me spending too much time on my computer right now, I’m going to save that for a follow-up post.