Peer Pressure

Stuff to look at about looking at stuff. From Chris Dent.

Tue, Aug 19th

HTTP Response Code Poser?

TiddlyWeb has support for arbitrary storage mechanisms. For the sake of maximum flexibility a storage system does not have to support all the methods in the StorageInterface. Today I made it so when a method is not implemented a StoreMethodNotImplemented exception is raised. This is nice because it means we don’t need to check or create return codes from the store. Exceptions are a happy making nice thing, at least in this context.
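
A minimal sketch of that arrangement, not TiddlyWeb's actual code: the base class raises for everything and a store overrides only what it supports. Only the names StorageInterface and StoreMethodNotImplemented come from the description above. The ReadOnlyStore class, the recipe_delete method and the dict-shaped recipe are hypothetical, made up for illustration.

```python
class StoreMethodNotImplemented(Exception):
    """Raised when a store does not provide the requested method."""


class StorageInterface(object):
    """Every method raises by default; a store overrides what it supports."""

    def recipe_get(self, recipe):
        raise StoreMethodNotImplemented('recipe_get')

    def recipe_delete(self, recipe):
        raise StoreMethodNotImplemented('recipe_delete')


class ReadOnlyStore(StorageInterface):
    """Hypothetical store that can fetch recipes but not delete them."""

    def __init__(self, recipes):
        self.recipes = recipes

    def recipe_get(self, recipe):
        # Fill in the caller's recipe (a plain dict here) from what is stored.
        recipe.update(self.recipes[recipe['name']])
        return recipe


store = ReadOnlyStore({'foo': {'desc': 'a recipe'}})
print(store.recipe_get({'name': 'foo'}))
try:
    store.recipe_delete({'name': 'foo'})
except StoreMethodNotImplemented as exc:
    print('not implemented: %s' % exc)
```

The calling code never checks a return value; it either gets its object back or catches the one exception.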

However, in the web handlers, that StoreMethodNotImplemented exception needs to be translated into some kind of HTTP response code to inform the calling client that what they wanted, they can’t do, not on this server. I got confused about which one is best. Anybody know?

Because TiddlyWeb is supposed to present a fairly uniform API for clients the response code should indicate “What you did there, that is sometimes a normal thing, but on this server, this particular implementation, you can’t do that.”

To make this concrete let’s say that the store doesn’t support recipe_delete so we need a reasonable response to:

DELETE /recipes/recipe_name

404 Not Found makes some sense:

“The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.” (spec)

The server can find something matching the Request-URI. In fact it’s running code associated with it right now. Also, this is a sometimes valid TiddlyWeb entry point so do we want to give the impression that it is not? And the resource we’re talking to, the recipe, it does exist, you just can’t delete it given the current constraints of the server. Also, what about /search? An empty result set returns a 404. If lack of support for /search also returns a 404 that’s a confusing bit of ambiguity.

405 Method Not Allowed makes some sense too:

“The method specified in the Request-Line is not allowed for the resource identified by the Request-URI. The response MUST include an Allow header containing a list of valid methods for the requested resource.” (spec)

For our recipe example it is very much the case that we don’t support the DELETE method for this particular resource. But for list_tiddler_revisions it makes less sense. If there is no support for listing revisions, a 404 makes more sense: there are no methods at all on the revisions Request-URI. Neither gives the sense that the situation which obtains in this context may not be true in another. It would be nice to have a generic response we can throw here, not something specific for each URI.

501 Not Implemented seems an option:

“The server does not support the functionality required to fulfill the request. This is the appropriate response when the server does not recognize the request method and is not capable of supporting it for any resource.” (spec)

The server does not support the functionality required. However it does support the DELETE request method for some resources. Just not this one. And 500 and above is often interpreted to mean that something hit the fan, and that’s not really the case here.

400 Bad Request covers a lot of bases:

“The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.” (spec)

This is close, except that the “malformed syntax” is specific to this server rather than to the TiddlyWeb class of servers.

For the time being, after some chatting in #tiddlywiki, a 400 is being returned, but I suspect the code will need to be changed to be more specific; more aligned with the specific Request-URI. I suspect 405 when one of the *_{get,put,delete} methods is not implemented and 404 when one of the list_* or search methods is used. Hmmm. Or maybe just 405 all round.
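
To make the candidate mapping concrete, here is a rough sketch of what the translation could look like. It is an illustration of the idea being floated, not the code TiddlyWeb runs. The function names status_for_unimplemented and call_store are made up, and the 400 fallback for anything that matches neither pattern is my assumption. A real 405 would also have to carry the Allow header the spec demands.

```python
class StoreMethodNotImplemented(Exception):
    """Stand-in for the exception the store raises."""


def status_for_unimplemented(store_method):
    """Pick an HTTP status line for a store method that is not implemented."""
    if store_method.startswith('list_') or store_method == 'search':
        return '404 Not Found'
    if store_method.rsplit('_', 1)[-1] in ('get', 'put', 'delete'):
        return '405 Method Not Allowed'
    # Assumption: anything else falls back to the current behaviour.
    return '400 Bad Request'


def call_store(store_method, func, *args):
    """Run a store call and return (status, result) for the web handler."""
    try:
        return '200 OK', func(*args)
    except StoreMethodNotImplemented:
        return status_for_unimplemented(store_method), None


def no_delete(recipe):
    raise StoreMethodNotImplemented('recipe_delete')


print(call_store('recipe_delete', no_delete, 'recipe_name'))
print(status_for_unimplemented('list_tiddler_revisions'))
```

Keeping the decision in one function means switching to "405 all round" is a one-line change.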

Input appreciated. Lazyweb?

Mon, Aug 18th

And the Leaders said, "Let Them eat Cake"

I have a tendency to want to tie together threads that don’t appear to have much to do with one another. A few days ago JP made a posting …musing about leadership… in which he suggested:

Leadership is about taking the risk of managing meaning

Perhaps true in hierarchies, but I prefer to think that a more critical action of leadership is feedback. So, to me, it’s not about accepting risk, it’s about helping to shape meaning, and the process of shaping meaning leads to understandings which lead to goals. Leadership is spread around among all the communicating participants. Leadership is something anyone, in the right environment, can do.

What this can mean is that sometimes the person who gives the most feedback in any situation can exhibit significant power. In conversation with Fred (one of the Osmosoft guys) I suggested that he, by asking the most questions and making the most comments about TiddlyWeb (i.e. providing feedback), was leading the direction of TiddlyWeb.

So I asked him what he wanted next. And he wanted fudge cake. So I set about making that possible. Cake from TiddlyWeb.

In Web of TiddlyWebs with TiddlyWebWeb I gave an intro to the StorageInterface system in TiddlyWeb. It makes it relatively easy to provide different persistence engines.

There is a similar mechanism for handling multiple content-types for incoming and outgoing representations of resources. There is a SerializationInterface. Implementors of that interface are responsible for taking the internal object representation of some entity (Recipe, Bag, Tiddler) and turning it into a string that represents a particular type (HTML, JSON, a raw text format, Atom, etc). They also take a string of a particular type and turn it (if possible) into an object representation. The relevant methods are:

recipe_as(self, recipe):
as_recipe(self, recipe, input_string):
bag_as(self, bag):
as_bag(self, bag, input_string):
tiddler_as(self, tiddler):
as_tiddler(self, tiddler, input_string):
list_tiddlers(self, bag):
list_recipes(self, recipes):
list_bags(self, bags):

The default TiddlyWeb installation comes with four serializations: html, json, text and wiki. What serializations are available is controlled by entries in tiddlyweb.config.

Which serialization is used is determined by a simplified form of content negotiation, handled by code in tiddlyweb.web.negotiate. For a GET request we look at the Accept header (or an extension on the URL) and eventually use a *_as or list_* method on the serialization to turn an object or objects into the requested format representation. For a PUT or POST request we look at the Content-Type header and eventually use an as_* method to turn the provided representation into an object.

Calling code does not directly call recipe_as or similar. Instead it asks the Serializer to do a to_string() or from_string() on a provided object and optional string. A Serialization doesn’t have to provide support for transforming all objects, just stuff it cares about.
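
A sketch of how that indirection can work, under the assumption that the method name is derived from the class name of the object. The Tiddler and TextSerialization classes here are toy stand-ins, and the "title: ..." text format is invented for the example; only to_string(), from_string() and the *_as / as_* naming come from the description above.

```python
class Tiddler(object):
    """Toy stand-in for TiddlyWeb's Tiddler."""

    def __init__(self, title, text=''):
        self.title = title
        self.text = text


class TextSerialization(object):
    """Hypothetical serialization that only cares about tiddlers."""

    def tiddler_as(self, tiddler):
        return 'title: %s\n\n%s' % (tiddler.title, tiddler.text)

    def as_tiddler(self, tiddler, input_string):
        header, _, body = input_string.partition('\n\n')
        tiddler.title = header.split(': ', 1)[1]
        tiddler.text = body
        return tiddler


class Serializer(object):
    """Facade that picks the right *_as or as_* method for its object."""

    def __init__(self, serialization, obj=None):
        self.serialization = serialization
        self.object = obj

    def _kind(self):
        # Tiddler -> 'tiddler', Bag -> 'bag', Recipe -> 'recipe'
        return type(self.object).__name__.lower()

    def to_string(self):
        method = getattr(self.serialization, '%s_as' % self._kind())
        return method(self.object)

    def from_string(self, input_string):
        method = getattr(self.serialization, 'as_%s' % self._kind())
        return method(self.object, input_string)


serializer = Serializer(TextSerialization(), Tiddler('Hello', 'some text'))
as_text = serializer.to_string()
serializer.object = Tiddler('placeholder')
round_tripped = serializer.from_string(as_text)
print(round_tripped.title, round_tripped.text)
```

In this sketch a serialization that lacks, say, bag_as fails with an AttributeError when asked for a bag.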

All these things make it quite easy to add support for other serializations. The first optional one I made added Atom support, as a content-oriented web service without Atom is very sad making. There’s a README beyond that link which explains how to add Atom to your own TiddlyWeb installation.

So back to Fred. Fred wants TiddlyWeb to produce cake. In my mind a TiddlyWeb that produces cake is producing cake with Tiddler information in/on/around/with the cake. To keep things simple I decided all we need to do to make TiddlyCake was to write tiddler content onto a picture of a cake and send it out. This means that we only need to write a tiddler_as method and choose a content type for it. cake/x-fudge is what we’ll use.

First we customize the tiddlyweb.config by making our own tiddlywebconfig.py:

config = {
        'extension_types': {
            'cake': 'cake/x-fudge',
            },
        'serializers': {
            'cake/x-fudge': ['cake.cake', 'image/jpeg'],
            },
        }

This says that when we put .cake on the end of a resource, think of that as an Accept or Content-Type header being set to cake/x-fudge. When we have that content-type for requests, use the Serializer in the module cake.cake and output the results as image/jpeg.
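
Roughly, the lookup this config drives could be sketched as follows. This is a simplification and not the code in tiddlyweb.web.negotiate: the function name choose_serializer is made up, q-values in the Accept header are ignored, and letting the URL extension win over the header is an assumption.

```python
config = {
    'extension_types': {
        'cake': 'cake/x-fudge',
    },
    'serializers': {
        'cake/x-fudge': ['cake.cake', 'image/jpeg'],
    },
}


def choose_serializer(path, accept, config):
    """Return (module_name, outgoing_content_type), or None if nothing fits."""
    candidates = []
    last_segment = path.rsplit('/', 1)[-1]
    if '.' in last_segment:
        # An extension on the URL is treated like an Accept header.
        extension = last_segment.rsplit('.', 1)[-1]
        mime_type = config['extension_types'].get(extension)
        if mime_type:
            candidates.append(mime_type)
    for part in accept.split(','):
        # Drop any ";q=..." parameters; ordering by quality is skipped here.
        candidates.append(part.split(';')[0].strip())
    for mime_type in candidates:
        if mime_type in config['serializers']:
            module_name, content_type = config['serializers'][mime_type]
            return module_name, content_type
    return None


print(choose_serializer('/recipes/TiddlyWeb/tiddlers/TiddlyWeb.cake',
                        'text/html', config))
```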

Now we need to write cake/cake.py. We need a class called Serializer that inherits from SerializationInterface and implements tiddler_as().

In tiddler_as we gather up tiddler.text and do a bit of math to write it over the top of a picture of cake with some really lame line wrapping and then return a string in JPEG form.
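
The "bit of math" part can be sketched without any imaging library. This is not the code from cake.py, just an illustration of equally lame line wrapping: the function name layout_text and all of the numbers (characters per line, margins, line height) are made up. Each resulting position and line could then be handed to an image library's text drawing call.

```python
import textwrap


def layout_text(text, chars_per_line=30, left=20, top=40, line_height=20):
    """Wrap text and pair each line with the (x, y) where it would be drawn."""
    lines = textwrap.wrap(text, width=chars_per_line)
    return [((left, top + index * line_height), line)
            for index, line in enumerate(lines)]


for position, line in layout_text('TiddlyWeb makes cake', chars_per_line=10):
    print(position, line)
```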

Python has the very useful Python Imaging Library that makes most of the image manipulation easy (easy if you are willing to accept ugly).

I searched Creative Commons images on Flickr and found a nice looking cake from Just American Desserts in Spokane.

When you gather all the necessary pieces you can start up a server that will provide TiddlerCake. You give a URL like this:

http://0.0.0.0:8080/recipes/TiddlyWeb/tiddlers/TiddlyWeb.cake

and the browser will show something a little bit like this:

(Image: TiddlerCake)

Sadly, I wanted to make this bit of silliness show up on the live TiddlyWeb server at peermore.com but it does not have the necessary graphic library support on it right now, and updating it is a task I don’t want to do today. So images of images will have to do for now. You can, of course, try it out for yourself on your own TiddlyWeb server.

If Gordon Brown is still looking for a “big idea”, then he could do worse than adopt internet collaboration. That means not just bringing fast broadband internet into the home, especially the homes of poor people, but also to reverse the government’s lamentable resistance to open source.

Sun, Aug 17th

His post made the rounds on the expected social news sites like programming.reddit and Hacker News, where I was amused to note that my blog is now being used as an example of silly REST dogma by REST skeptics in such discussions. From reading Damien’s post and the various comments in response, it seems clear that there are several misconceptions as to what constitutes REST and what its benefits are from a practical perspective.

Sat, Aug 16th

Webs of TiddlyWebs with TiddlyWebWeb

In the previous posting I said creating a new storage mechanism for TiddlyWeb was a simple matter of creating a new module that supported an interface. It occurs to me that I can kill a few birds by writing about the process of creating such a module. One of the birds is explaining some of the facets of TiddlyWeb’s architecture. Another is explaining why, despite Damien Katz getting all up in people’s business and calling bullshit, REST is teh awesome and I want it.

What we’ll do here, just to be a bit perverse, is create a store for TiddlyWeb that uses another TiddlyWeb as the place where resources are stored. In the process we should see why TiddlyWeb’s API being “RESTful” is useful and also expose a few bugs that need to be fixed to have better behavior.

I started this a while ago and called it tiddlywebweb, so we’ll carry on with that.

In TiddlyWeb a store is a module with a known name, containing a class with the name Store that is a subclass of StorageInterface. The interface is defined in tiddlyweb/stores/__init__.py. The default store for TiddlyWeb is called text and is in tiddlyweb/stores/text.py. text uses the filesystem and easy to read (by humans) text files for storing data. It’s not terribly speedy, but it is easy to grok.

In usual use a particular implementation of a StorageInterface is not directly instantiated. Instead the calling code creates a Store object of the class defined in tiddlyweb/store.py, choosing the type of store by name:

from tiddlyweb.store import Store
store = Store('text')

Methods are then called on the store object to access data:

from tiddlyweb.tiddler import Tiddler
tiddler = Tiddler('MyTiddler', bag='foo')
store.get(tiddler)
print(tiddler.text)

(The unique id of a tiddler is its title and the name of the bag in which it lives, so we need both in order to be able to retrieve a tiddler from the store.)

When a Store object is being created the system first looks in the tiddlyweb.stores package for a module with the given name. If it finds that, it is imported and the Store class within is used. If that import does not happen, then the system tries to import the name directly (searching sys.path). Our new tiddlywebweb code is located in tiddlywebweb/tiddlywebstore.py so we could load it like this:
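
That two-step lookup can be sketched with importlib. This is my guess at the shape of it, not the code in tiddlyweb/store.py, and the function name load_store_module is made up. The example calls use the standard library's json package as a stand-in for tiddlyweb.stores so the sketch runs anywhere.

```python
import importlib


def load_store_module(name, package='tiddlyweb.stores'):
    """Look for the module inside the package first, then on sys.path."""
    try:
        return importlib.import_module('%s.%s' % (package, name))
    except ImportError:
        # Not a bundled store: treat the name as a full module path.
        return importlib.import_module(name)


# 'decoder' lives inside the stand-in package, 'textwrap' does not.
print(load_store_module('decoder', package='json').__name__)
print(load_store_module('textwrap', package='json').__name__)
```

One wrinkle with this approach: an ImportError raised from inside a bundled store (a missing dependency, say) also triggers the fallback, which can hide the real problem.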

from tiddlyweb.store import Store
store = Store('tiddlywebweb.tiddlywebstore')

Except that in most cases we would not. Each TiddlyWeb instance is assumed to have just one store (for now) so the storage system can be defined in the system configuration. The default system configuration lives in tiddlyweb/config.py and can be overridden by a Python file called tiddlywebconfig.py kept in the working directory of the TiddlyWeb server. The config dict in tiddlywebconfig.py is merged over the top of the config dict in tiddlyweb/config.py.
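
As an illustration of "merged over the top", here is one way such a merge could behave. Whether TiddlyWeb replaces nested dicts wholesale or folds them together is not something I am stating here: this sketch assumes dict values are combined one level deep and everything else is replaced. The function name merge_config and the 'html' default entry are invented for the example.

```python
def merge_config(defaults, overrides):
    """Return a new dict with overrides laid over the top of defaults."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Assumption: nested dicts are combined rather than replaced.
            combined = dict(merged[key])
            combined.update(value)
            merged[key] = combined
        else:
            merged[key] = value
    return merged


defaults = {
    'server_store': ['text', {'store_root': 'store'}],
    'extension_types': {'html': 'text/html'},
}
overrides = {
    'server_store': ['tiddlywebweb.tiddlywebstore',
                     {'server_base': 'http://localhost:8000'}],
    'extension_types': {'cake': 'cake/x-fudge'},
}
config = merge_config(defaults, overrides)
print(config['server_store'][0])
print(sorted(config['extension_types']))
```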

The server_store key in that dict holds the name of the store that is to be used, and a dict of any configuration information that it needs (path information, username and password handling, etc.). For the default text store it looks like this:

'server_store': ['text', {'store_root': 'store'}],

text is the name of the store. store_root is the path to the directory where data is to be stored.

To use tiddlywebweb as the store for our server A, we know we need the base URL of the other TiddlyWeb server (server B) so to start out our config entry will look a bit like this (assuming the other server is running on localhost:8000):

config = {
    'server_store': ['tiddlywebweb.tiddlywebstore',
        {'server_base': 'http://localhost:8000'} ],
}

That will go in the tiddlywebconfig.py of server A (more detail on that in a bit). Server B needs no particular modification (for the purposes of this exercise we’re not going to worry about access control, maybe next time), it just needs to be running.

Okay, so now we know how to use a different store, but what does a store do? As said before a Store needs to implement (some of) the StorageInterface. This is a collection of methods that get, put, delete and list a variety of entities used by TiddlyWeb. Here’s the complete list:

recipe_get(self, recipe):
recipe_put(self, recipe):
bag_get(self, bag):
bag_put(self, bag):
tiddler_delete(self, tiddler):
tiddler_get(self, tiddler):
tiddler_put(self, tiddler):
user_get(self, user):
user_put(self, user):
list_recipes(self):
list_bags(self):
list_tiddler_revisions(self, tiddler):
tiddler_written(self, tiddler):
search(self, search_query):

Some caveats:

  • Support for deleting recipes and bags is planned but not yet supported.
  • Support for users is optional (for example the GoogleAppEngine version of TiddlyWeb uses Google Users, so doesn’t need to store them).
  • Support for listing tiddler revisions is optional. If a store doesn’t support revisions, then a tiddler is always revision one.
  • tiddler_written is by default a pass. It is called by tiddler_put. If overridden it can be used to update an index if one is used by search.

Alright, let’s think about this a minute. If our server A is using server B for storage, then when a user asks server A “GET /recipes/foo” what needs to happen is that server A asks server B “GET /recipes/foo”. And when a user tells server A “PUT /recipes/foo”, server A needs to tell server B “PUT /recipes/foo”. Similar things for a bag or a tiddler or collections thereof. So what we need to do is proxy web requests to server A through to server B, right?

Wrong. In order for TiddlyWeb to be able to support multiple storage types it has to have fairly disciplined separation of concerns amongst all the bits of code. When a user asks server A for a recipe the request is processed something like this:

  1. The selector map dispatches to the right piece of code (tiddlyweb.web.recipe:get) based on urls.map.
  2. That piece of code uses the name provided by the URL to instantiate an empty Recipe object, recipe.
  3. The configured store is then asked store.get(recipe) to populate the recipe.
  4. Based on content negotiation, that recipe is then serialized to a particular string and sent out in the response.

Step 2 is required in order for step 3 to be most flexible. And Step 3 has no clue what the configured store is, just that it calls store.get().
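
Steps 2 through 4 can be sketched with toy classes. None of this is TiddlyWeb's own code: the MemoryStorage class, the bags field on Recipe and the name-based dispatch inside Store.get() are assumptions made for the illustration, and JSON is hard-wired where the real thing negotiates.

```python
import json


class Recipe(object):
    """Stand-in for TiddlyWeb's Recipe; 'bags' is a hypothetical field."""

    def __init__(self, name):
        self.name = name
        self.bags = []


class MemoryStorage(object):
    """Hypothetical storage implementation backed by a dict."""

    def __init__(self, recipes):
        self.recipes = recipes

    def recipe_get(self, recipe):
        recipe.bags = list(self.recipes[recipe.name])
        return recipe


class Store(object):
    """Facade: callers say store.get(thing) and never see what is behind it."""

    def __init__(self, storage):
        self.storage = storage

    def get(self, thing):
        # Recipe -> recipe_get, Bag -> bag_get, and so on.
        method = getattr(self.storage, '%s_get' % type(thing).__name__.lower())
        return method(thing)


def get_recipe(store, recipe_name):
    recipe = Recipe(recipe_name)   # step 2: empty object named from the URL
    store.get(recipe)              # step 3: the store populates it
    # step 4: serialize for the response
    return json.dumps({'name': recipe.name, 'bags': recipe.bags},
                      sort_keys=True)


store = Store(MemoryStorage({'foo': ['bagone', 'bagtwo']}))
print(get_recipe(store, 'foo'))
```

Swapping MemoryStorage for something that talks HTTP changes nothing in get_recipe, which is the point.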

So proxying is right out. Which means our tiddlywebweb store needs to be an HTTP client that constructs proper URLs and makes requests to a remote server. We’ll only pass content that is application/json to be simple (and because it is the most robust of the default TiddlyWeb serializations). And to keep things simple we’ll just get the bare bones in. Later we’ll add the fancy.

We know about recipe_get. Let’s describe what it needs to do:

  • Accept a recipe.
  • Use its name to construct a URL.
  • Make a GET request to that URL with an Accept header of application/json.
  • Transform a successful response from JSON into a proper Recipe object.
  • Get the object back up the stack.

recipe_put does something quite similar:

  • Accept a recipe.
  • Transform the recipe into JSON structure.
  • Use the name of the recipe to construct a URL.
  • Make a PUT request to that URL with a Content-Type header of application/json and a body of the JSON.
  • Promulgate success up the stack.
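
The recipe_put steps can be sketched as a function that only builds the request, leaving the sending to whatever HTTP client is in use. This is not the code in tiddlywebweb: the function name build_recipe_put and the shape of the recipe data are made up, and the /recipes/<name> URL pattern is taken from the examples above.

```python
import json

try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2


def build_recipe_put(server_base, recipe_name, recipe_data):
    """Return (method, url, headers, body) for storing a recipe on server B."""
    url = '%s/recipes/%s' % (server_base.rstrip('/'), quote(recipe_name))
    headers = {'Content-Type': 'application/json'}
    body = json.dumps(recipe_data)
    return 'PUT', url, headers, body


method, url, headers, body = build_recipe_put(
    'http://localhost:8000', 'my recipe', {'desc': 'a recipe'})
print(method, url)
```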

The rest of the methods are in some very fundamental ways basically the same (this is why mnot is basically right for many cases) so we’ll not bother to go into too much detail.

In tiddlywebweb.tiddlywebstore recipe_get looks like this:

def recipe_get(self, recipe):
    url = self.recipe_url % urllib.quote(recipe.name)
    self.doit(url, recipe, self._any_get, NoRecipeError)

Wait, what? That gets the first step (url construction) but what’s going on with the rest of it? Well, like I said, all the methods are basically the same so we can abstract that out. doit() looks like this:

def doit(self, url, object, method, exception):
    try:
        method(url, object)
    except TiddlyWebWebError as e:
        raise exception(e)

Which says “try to do method method, if it works, we’re golden, otherwise raise the exception named by exception”. doit() wraps either _any_get() or _any_put(). Here’s _any_get():

def _any_get(self, url, target_object):
    response, content = self._request('GET', url)
    if self._is_success(response):
        self.serializer.object = target_object
        self.serializer.from_string(content)
    else:
        raise TiddlyWebWebError('%s: %s' % (response['status'], content))

In other words:

  1. Make the request.
  2. If it was good, transform the JSON response to the object form.
  3. If it was bad, throw an error.
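
To see the whole chain run without a second server, here is a self-contained variant with a fake request function standing in for server B. It follows the shape of doit() and _any_get() but is not the tiddlywebweb code: fake_request, the desc field and the "any 2xx is success" rule in is_success() are assumptions, and the request function is passed in as an argument instead of living on self.

```python
import json


class TiddlyWebWebError(Exception):
    """The remote server answered with something other than success."""


class NoRecipeError(Exception):
    """Stand-in for the store-level 'no such recipe' exception."""


class Recipe(object):
    def __init__(self, name):
        self.name = name
        self.desc = ''  # hypothetical field


def is_success(response):
    # Assumption: any 2xx status counts as success.
    return str(response['status']).startswith('2')


def any_get(request, url, recipe):
    response, content = request('GET', url)
    if is_success(response):
        recipe.desc = json.loads(content)['desc']
    else:
        raise TiddlyWebWebError('%s: %s' % (response['status'], content))


def doit(request, url, recipe, method, exception):
    try:
        method(request, url, recipe)
    except TiddlyWebWebError as exc:
        raise exception(exc)


def fake_request(method, url):
    """Pretend server B: knows one recipe, 404s on everything else."""
    if url.endswith('/recipes/foo'):
        return {'status': '200'}, json.dumps({'desc': 'a recipe'})
    return {'status': '404'}, 'not found'


recipe = Recipe('foo')
doit(fake_request, 'http://localhost:8000/recipes/foo', recipe,
     any_get, NoRecipeError)
print(recipe.desc)
```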

So I’ve done that. You can see the store module and some extra bits for making it go.

This covers the basics:

  • You can list tiddlers, bags, recipes.
  • GET representations of each.
  • PUT some tiddlers.

So the basic concept of having the content hosted on a remote server is doable. But there are problems:

  • Listing a bag or recipe with a lot of tiddlers in it results in a lot of requests to server B.
  • We’ve got no delete yet.
  • There’s no auth handling.
  • It’s kind of slow.

Some other time, sooner rather than later if there is interest, I’ll improve the code to show how some solid use of good HTTP principles will make the code better and make the system operate more efficiently.
