The Python object publisher provides a simple mechanism for publishing a collection of Python objects as World-Wide-Web (Web) resources without any plumbing-specific (e.g. CGI) code.
Applications do not have to include code for interfacing with the web server.
Applications can be moved from one publishing mechanism, such as CGI, to another mechanism, such as Fast CGI or COM, with no change.
Python objects are published as Python objects. The web server "calls" the objects in much the same way that other Python objects would.
Automatic conversion of URL to object/sub-object traversal.
Automatic marshaling of form data, cookie data, and request meta-data to Python function arguments.
Automated exception handling.
Automatic generation of CGI headers.
Automated authentication and authorization.
Objects are published by including them in a published module. When a module is published, any objects that:
can be found in the module's global name space,
that do not have names starting with an underscore,
that have non-empty documentation strings, and
that are not modules
are published.
Sub-objects (or sub-sub objects, ...) of published objects are also published, as long as the sub-objects:
have non-empty doc strings,
have names that do not begin with an underscore, and
are not modules.
Note that object methods are considered to be subobjects.
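The publication rules above can be sketched as a small predicate. This is only an illustration, not the publisher's own code: the function names is_publishable and published_names are hypothetical, and treating a whitespace-only doc string as empty is an assumption.

```python
import types


def is_publishable(name, obj):
    """Return True if obj, bound to name, satisfies the publication rules."""
    if name.startswith("_"):
        return False                      # private names are never published
    if isinstance(obj, types.ModuleType):
        return False                      # modules are never published
    doc = getattr(obj, "__doc__", None)
    # Assumption: a doc string made only of white space counts as empty.
    return bool(doc and doc.strip())


def published_names(namespace):
    """List the publishable names in a module's global name space (a dict)."""
    return sorted(n for n, o in namespace.items() if is_publishable(n, o))
```

The same predicate applies to sub-objects, with the attribute name in place of the global name.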
Object-to-subobject traversal is done by converting steps in the URI path to get-attribute or get-item calls. For example, in traversing from http://some.host/some_module/object to http://some.host/some_module/object/subobject, the module publisher will try to get some_module.object.subobject. If the access fails with anything other than an attribute error, then the object publisher raises a "Not Found" exception. If the access fails with an attribute error, then the object publisher will try to obtain the subobject with some_module.object["subobject"]. If this access fails, then the object publisher raises a "Not Found" exception. If either of the accesses succeeds, then processing continues.
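A minimal sketch of one traversal step, assuming a hypothetical NotFound exception class as a stand-in for the publisher's "Not Found" error. The checks on doc strings and underscore names are left out to keep the attribute-then-item fallback visible.

```python
class NotFound(Exception):
    """Hypothetical stand-in for the publisher's "Not Found" error."""


def traverse_step(obj, name):
    """Resolve one URI path step: attribute access first, then item access."""
    try:
        return getattr(obj, name)
    except AttributeError:
        pass                              # fall through to item access
    except Exception as exc:
        raise NotFound(name) from exc     # any other failure is "Not Found"
    try:
        return obj[name]
    except Exception as exc:
        raise NotFound(name) from exc


def traverse(root, path):
    """Walk a slash-separated path, starting from root."""
    obj = root
    for name in path.split("/"):
        if name:
            obj = traverse_step(obj, name)
    return obj
```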
During object traversal, the names "." and ".." have special meaning if the application does not provide meaning for them. If the name "." is encountered and the application does not provide a value, then the name is effectively skipped. For example, the path x/./y is equivalent to x/y. If the name ".." is encountered and the application does not provide a value, then the parent of the object being traversed is used. For example, x/y/../z is almost equivalent to x/z, except that y is considered to be part of the path to z. If y has a user folder, it will be consulted when validating access to z before a user folder in x is consulted.
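The handling of "." and ".." can be sketched as follows, assuming the application defines neither name itself. The function walk and the record of visited objects are hypothetical; the record only illustrates why x/y/../z still counts y as part of the path to z.

```python
from types import SimpleNamespace


def walk(root, path):
    """Return (final_object, visited) for a slash-separated path."""
    parents = []            # objects above the current one
    visited = [root]        # every object touched, in traversal order
    obj = root
    for name in path.split("/"):
        if not name or name == ".":
            continue                      # "." is effectively skipped
        if name == "..":
            if parents:
                obj = parents.pop()       # step back to the parent
                visited.append(obj)
            continue
        parents.append(obj)
        obj = getattr(obj, name)
        visited.append(obj)
    return obj, visited


# A small hypothetical object tree: root.x has sub-objects y and z.
y = SimpleNamespace(label="y")
z = SimpleNamespace(label="z")
root = SimpleNamespace(x=SimpleNamespace(y=y, z=z))
```

With this tree, walk(root, "x/y/../z") ends at z, and y appears in the visited list.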
Normally, URL traversal begins with the published module. If the published module has a global variable named bobo_application, then traversal begins with this object instead.
If the final object encountered when traversing the URL has an index_html attribute, the object traversal will continue to this attribute. This is useful for providing default methods for objects.
Object access can be further controlled via roles and user databases.
The Bobo authorization model uses roles to control access to objects. As Bobo traverses URLs, it checks for __roles__ attributes in the objects it encounters. The last value found controls access to the published object.
If found, __roles__ should be None or a sequence of role names. If __roles__ is None, then the published object is public. If __roles__ is not None, then the user must provide a user name and password that can be validated by a user database.
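The "last value found" rule can be sketched like this. The helper names are hypothetical, and treating an object with no __roles__ anywhere on its path as public is an assumption rather than something stated above.

```python
_UNSET = object()   # marker meaning "no __roles__ attribute was found"


def effective_roles(objects):
    """Return the last __roles__ value found along the traversal path."""
    roles = _UNSET
    for obj in objects:
        found = getattr(obj, "__roles__", _UNSET)
        if found is not _UNSET:
            roles = found             # later objects override earlier ones
    return roles


def is_public(roles):
    """None means public; a missing value is assumed to mean public too."""
    return roles is None or roles is _UNSET
```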
If an object has a __roles__ attribute that is not empty and not None, Bobo tries to find a user database to authenticate the user. It searches for user databases by looking for an __allow_groups__ attribute, first in the published object, then in its container, and so on until a user database is found. When a user database is found, Bobo attempts to validate the user against the user database. If validation fails, then Bobo will continue searching for user databases until the user can be validated or until no more user databases can be found.
The user database may be an object that provides a validate method:
validate(request, http_authorization, roles)
where:
request
a mapping object that contains request information,
http_authorization
the value of the HTTP Authorization header, or None if no authorization header was provided, and
roles
a list of user role names.
The validate method returns None if it cannot validate a user and a user object if it can. Normally, if the validate method returns None, Bobo will try to use other user databases. However, a user database can prevent this by raising an exception.
If validation succeeds, Bobo assigns the user object to the request variable AUTHENTICATED_USER. Bobo currently places no restriction on user objects.
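A custom user database might look roughly like the sketch below. Only the validate(request, http_authorization, roles) signature, the __allow_groups__ attribute, and the AUTHENTICATED_USER variable come from the text above; the class SimpleUserDatabase, its storage layout, the authenticate helper, and the use of the user name as the user object are assumptions made for illustration. Plain-text passwords are used only to keep the example short.

```python
import base64


class SimpleUserDatabase:
    """Hypothetical user database; users maps name -> (password, roles)."""

    def __init__(self, users):
        self._users = users

    def validate(self, request, http_authorization, roles):
        """Return a user object (here, the user name) or None."""
        if not http_authorization:
            return None
        scheme, _, encoded = http_authorization.partition(" ")
        if scheme.lower() != "basic":
            return None
        try:
            decoded = base64.b64decode(encoded).decode("utf-8")
        except (ValueError, UnicodeDecodeError):
            return None                   # malformed credentials
        name, sep, password = decoded.partition(":")
        record = self._users.get(name)
        if not sep or record is None or record[0] != password:
            return None
        if not set(record[1]) & set(roles):
            return None                   # user holds none of the roles
        return name


def authenticate(containers, request, http_authorization, roles):
    """Search containers, innermost first, for a database that validates."""
    for obj in containers:
        database = getattr(obj, "__allow_groups__", None)
        if database is None:
            continue
        user = database.validate(request, http_authorization, roles)
        if user is not None:
            request["AUTHENTICATED_USER"] = user
            return user
    return None
```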
If the user database is a mapping object, then the keys of the object are role names and the values are the associated user groups for the roles. Bobo attempts to validate the user by searching the groups for the roles listed in the published object's __roles__ attribute for a user name and password matching those given in the HTTP Authorization header.
If validation succeeds, Bobo assigns the user name to the request variable AUTHENTICATED_USER.
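The mapping form of a user database can be sketched as below. The text does not say what a user group looks like; this example assumes each group is itself a mapping from user name to password, and the function name validate_with_mapping is hypothetical.

```python
def validate_with_mapping(groups_by_role, name, password, roles):
    """Return the user name if it matches in a group for one of the roles."""
    for role in roles:
        group = groups_by_role.get(role)
        # Assumption: a group maps user names to passwords.
        if group is not None and name in group and group[name] == password:
            return name
    return None


# Hypothetical user database keyed by role name.
user_database = {
    "manager": {"amy": "secret"},
    "editor": {"bob": "pw"},
}
```

Here "amy" with password "secret" validates for an object whose __roles__ includes "manager", but not for one that lists only "editor".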
When a user first accesses a protected object, Bobo returns an error response to the web browser that causes a password dialog to be displayed.
You can control the realm name used for Bobo's Basic authentication by providing a module variable named __bobo_realm__.
Some web servers cannot be coaxed into passing authentication information to applications. In this case, Bobo applications cannot perform authentication. If the web server is configured to authenticate access to a Bobo application, then the Bobo application can still perform authorization using the REMOTE_USER variable. Bobo does this automatically when mapping user databases are used, and custom user databases may do this too.
In this case, it may be necessary to provide more than one path to an application, one that is authenticated and one that is not, if public and non-public methods are interspersed.
For some interesting objects, such as functions and methods, it may not be possible for applications to set __roles__ attributes. In these cases, the object's parent object may contain an attribute named object_name__roles__, which will be used as a surrogate for the object's __roles__ attribute.
If a published object is a function, method, or class, then the object will be called and the return value of the call will be returned as the HTTP response. Calling arguments will be supplied from "environment variables", from URL-encoded form data, if any, and from HTTP cookies by matching argument names defined for the object with variable names.
If the object being called has an argument named REQUEST, then a request object will be passed. Request objects encapsulate request meta-data and provide full access to all environment data, form data, cookies, and the input data stream (i.e. body data as a stream).
If the object being called has an argument named RESPONSE, then a response object will be passed. This object can be used to specify HTTP headers and to perform stream-oriented output. Rather than returning a result, data may be output by calling the write and flush methods of the response object one or more times. This is useful, for example, when outputting results from a time-consuming task, since partial results may be displayed long before complete results are available.
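Argument marshaling by name can be approximated with the standard inspect module, as in the sketch below. The helper call_published, the in-memory ListResponse, and the two example functions are hypothetical; raising TypeError for a missing argument is a stand-in, since the text does not say how the publisher reports that case.

```python
import inspect


def call_published(obj, request, response):
    """Call obj with arguments looked up by name in the request mapping."""
    kwargs = {}
    for name, param in inspect.signature(obj).parameters.items():
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue                      # *args / **kwargs are not filled
        if name == "REQUEST":
            kwargs[name] = request
        elif name == "RESPONSE":
            kwargs[name] = response
        elif name in request:
            kwargs[name] = request[name]
        elif param.default is param.empty:
            raise TypeError("missing argument: " + name)
    return obj(**kwargs)


class ListResponse:
    """Stand-in response object that records written chunks in memory."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)

    def flush(self):
        pass


def greet(name, greeting="Hello"):
    """Return a greeting built from request variables."""
    return greeting + ", " + name


def countdown(start, RESPONSE):
    """Stream a countdown; start arrives as a string."""
    for n in range(int(start), 0, -1):
        RESPONSE.write(str(n) + "\n")
        RESPONSE.flush()
```

The countdown function returns nothing and instead writes each partial result as soon as it is available.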
Normally, string arguments are passed to called objects. The called object must be prepared to convert string arguments to other data types, such as numbers.
If file upload fields are used, however, then FileUpload objects will be passed instead for these fields. FileUpload objects behave like file objects and provide attributes for inspecting the uploaded file's source name and the upload headers, such as content-type.
If field names in form data are of the form name:type, then an attempt will be made to convert the data from strings to the indicated type. The data types currently supported are:
Python floating point numbers
Python integers
Python long integers
Python strings
non-blank Python strings
Date-time values
Python list of values, even if there is only one value.
Python list of values entered as multiple lines in a single field
Python list of values entered as multiple space-separated tokens in a single field
Python tuple of values, even if there is only one.
Augment PATH_INFO with information from the form field. (See "Method Arguments" below.)
For example, if the name of a field in an input form is age:int, then the field value will be passed in the argument age, and an attempt will be made to convert the argument value to an integer. This conversion also works with file upload, so using a file upload field with a name like myfile:string will cause the FileUpload object to be converted to a string before being passed to the object.
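The name:type convention can be sketched as below. The type names int and string appear in the examples above; the name float, the CONVERTERS table, and the convert_fields helper are assumptions for illustration, and only simple scalar types are covered.

```python
# Type names mapped to converter callables. "int" and "string" appear in
# the text; "float" is an assumed name for the floating point type.
CONVERTERS = {
    "int": int,
    "string": str,
    "float": float,
}


def convert_fields(form):
    """Strip :type suffixes from field names and convert the values."""
    result = {}
    for field, value in form.items():
        name, sep, type_name = field.partition(":")
        if sep and type_name in CONVERTERS:
            # A failed conversion raises ValueError in this sketch.
            result[name] = CONVERTERS[type_name](value)
        else:
            result[field] = value         # unknown or absent type: keep as is
    return result
```

For instance, a form containing age:int with the value "42" would yield an age argument holding the integer 42.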
Sometimes, it is desirable to control which method is called based on form data. For example, one might have a form with a select list and want to choose which method to call depending on the item chosen. Similarly, one might have multiple submit buttons and want to invoke a different method for each button.
Bobo provides a way to select methods using form variables through use of the "method" argument type. The method type allows the request PATH_INFO to be augmented using information from a form item name or value.
If the name of a form field is :method, then the value of the field is added to PATH_INFO. For example, if the original PATH_INFO is foo/bar and the value of a :method field is x/y, then PATH_INFO is transformed to foo/bar/x/y. This is useful when presenting a select list. Method names can be placed in the select option values.
If the name of a form field ends in :method and is longer than 7 characters, then the part of the name before :method is added to PATH_INFO. For example, if the original PATH_INFO is foo/bar and there is an x/y:method field, then PATH_INFO is transformed to foo/bar/x/y. In this case, the form value is ignored. This is useful for mapping submit buttons to methods, since submit button values are displayed and should, therefore, not contain method names.
Only one method field should be provided. If more than one method field is included in the request, the behavior is undefined.
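The two method-field forms can be sketched in a few lines. The function name augment_path_info is hypothetical, and because behavior with several method fields is undefined, this sketch simply uses the first one it meets.

```python
def augment_path_info(path_info, form):
    """Append the method named by a :method form field to PATH_INFO."""
    suffix = None
    for field, value in form.items():
        if field == ":method":
            suffix = value                        # the value names the method
        elif field.endswith(":method"):
            suffix = field[:-len(":method")]      # the field name does
        if suffix is not None:
            break                                 # use the first method field
    if suffix is None:
        return path_info
    return path_info.rstrip("/") + "/" + suffix.strip("/")
```

A select list would use the first form, while each submit button would carry its own name ending in :method.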
The base HREF is set when method fields are provided. In the above examples, the base HREF is set to .../foo/bar/x. Of course, if, in this example, y was an object with an index_html method, then the base HREF would be reset to .../foo/bar/x/y.
If a published object that is not a function, method, or class is accessed, then the object itself will be returned.
A published object, or the returned value of a called published object, can be of any Python type. If the returned value has an asHTML method, then this method will be called to convert the object to HTML; otherwise the returned value will be converted to a string and examined to see if it appears to be an HTML document. If it appears to be an HTML document, then the response content-type will be set to text/html. Otherwise the content-type will be set to text/plain.
If the returned object is None or the string representation of the returned object is an empty string, then the HTTP return status will be set to "No Content", and no body will be returned. On some browsers, this will cause the displayed document to be unchanged.
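The result-handling rules can be summarized in a sketch like the following. The text does not define how a string "appears to be" HTML, so looks_like_html is a deliberately crude assumed heuristic; the render helper, its tuple return value, and the text/html type for asHTML results are likewise illustrative assumptions.

```python
def looks_like_html(text):
    """Assumed heuristic: the text starts with an HTML-like opening."""
    start = text.lstrip().lower()
    return start.startswith(("<html", "<!doctype html"))


def render(result):
    """Return (status, content_type, body) for a published result."""
    if result is None:
        return 204, None, ""              # "No Content": no body is sent
    if hasattr(result, "asHTML"):
        return 200, "text/html", result.asHTML()
    body = str(result)
    if body == "":
        return 204, None, ""
    content_type = "text/html" if looks_like_html(body) else "text/plain"
    return 200, content_type, body
```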
In general, in Bobo, relative URL references should be interpreted relative to the parent of the published object, to make it easy for objects to provide links to siblings.
If
the result of a request is HTML text,
the text does not define a base tag in the head portion of the HTML, and
the published object had an index_html attribute that was not included in the request URL,
then a base reference will be inserted that is the URL of the published object.
Unhandled exceptions are caught by the object publisher and are translated automatically to nicely formatted HTTP output.
When an exception is raised, the exception type is mapped to an HTTP
code by matching the value of the exception type with a list of
standard HTTP status names. Any exception types that do not match
standard HTTP status names are mapped to "Internal Error" (500).
The standard HTTP status names are: "OK", "Created", "Accepted", "No Content", "Multiple Choices", "Redirect", "Moved Permanently", "Moved Temporarily", "Not Modified", "Bad Request", "Unauthorized", "Forbidden", "Not Found", "Internal Error", "Not Implemented", "Bad Gateway", and "Service Unavailable". Variations on these names with different cases and without spaces are also valid.
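The name matching can be sketched with a lookup table keyed by the names in lower case with spaces removed. The numeric codes are the usual HTTP values for these names; mapping "Redirect" to 302 is an assumption, and status_for is a hypothetical helper.

```python
# Status names, lower-cased and with spaces removed, mapped to HTTP codes.
STATUS_CODES = {
    "ok": 200, "created": 201, "accepted": 202, "nocontent": 204,
    "multiplechoices": 300, "movedpermanently": 301,
    "movedtemporarily": 302, "notmodified": 304,
    "redirect": 302,                      # assumed code for "Redirect"
    "badrequest": 400, "unauthorized": 401, "forbidden": 403,
    "notfound": 404, "internalerror": 500, "notimplemented": 501,
    "badgateway": 502, "serviceunavailable": 503,
}


def status_for(exception_type_name):
    """Map an exception type name to an HTTP status code (default 500)."""
    key = "".join(str(exception_type_name).split()).lower()
    return STATUS_CODES.get(key, 500)
```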
An attempt is made to use the exception value as the body of the returned response. The object publisher will examine the exception value. If the value is a string that contains some white space, then it will be used as the body of the returned error message. If it appears to be HTML, then the error content type will be set to text/html; otherwise, it will be set to text/plain. If the exception value is not a string containing white space, then the object publisher will generate its own error message.
There are two exceptions to the above rule:
If the exception type is "Redirect", "Multiple Choices", "Moved Permanently", "Moved Temporarily", or "Not Modified", and the exception value is an absolute URI, then no body will be provided and a Location header will be included in the output with the given URI.
If the exception type is "No Content", then no body will be returned.
When a body is returned, traceback information will be included in a comment in the output. The module variable __bobo_hide_tracebacks__ can be used to control how tracebacks are included. If this variable is defined and false, then tracebacks are included in PRE tags, rather than in comments. This is very handy during debugging.
Automatic redirection may be performed by a published object by raising an exception with a type and value of "Redirect" and a string containing an absolute URI.
If no object is specified in a URI, then the publisher will try to publish the object index_html, if it exists; otherwise the module's doc string will be published.
If a published module defines objects __bobo_before__ or __bobo_after__, then these functions will be called before or after a request is processed. One possible use for this is to acquire and release application locks in applications with background threads.
Do not copy the module to be published to the cgi-bin directory.
Copy the files: cgi_module_publisher.pyc and CGIResponse.pyc to the directory containing the module to be published, or to a directory in the standard (compiled in) Python search path.
Copy the file cgi-module-publisher to the directory containing the module to be published.
Create a symbolic link from cgi-module-publisher (in the directory containing the module to be published) to the module name in the cgi-bin directory.
File upload objects
File upload objects are used to represent file-uploaded data.
File upload objects can be used just like files.
In addition, they have a headers attribute that is a dictionary containing the file-upload headers, and a filename attribute containing the name of the uploaded file.
Model HTTP request data.
This object provides access to request data. This includes the input headers, form data, server data, and cookies.
Request objects are created by the object publisher and will be passed to published objects through the argument name, REQUEST.
The request object is a mapping object that represents a collection of variable to value mappings. In addition, variables are divided into four categories:
Environment variables
These variables include input headers, server data, and other request-related data. The variable names are as specified in the CGI specification.
Form data
These are data extracted from either a URL-encoded query string or body, if present.
Cookies
These are the cookie data, if present.
Other
Data that may be set by an application object.
The form attribute of a request is actually a FieldStorage object. When file uploads are used, this provides a richer and more complex interface than is provided by accessing form data as items of the request. See the FieldStorage class documentation for more details.
The request object may be used as a mapping object, in which case values will be looked up in the order: environment variables, other variables, form data, and then cookies.
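The lookup order can be modeled with collections.ChainMap, as in this sketch. SketchRequest is a hypothetical stand-in for the real request class, plain dictionaries replace the FieldStorage object, and the method name set is an assumption.

```python
from collections import ChainMap


class SketchRequest:
    """Stand-in request: environment, other, form, then cookies."""

    def __init__(self, environ, form=None, cookies=None):
        self.environ = dict(environ)
        self.other = {}
        self.form = dict(form or {})
        self.cookies = dict(cookies or {})
        # ChainMap searches its maps from left to right.
        self._lookup = ChainMap(self.environ, self.other,
                                self.form, self.cookies)

    def __getitem__(self, name):
        return self._lookup[name]

    def __contains__(self, name):
        return name in self._lookup

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.__dict__["_lookup"][name]
        except KeyError:
            raise AttributeError(name) from None

    def set(self, name, value):
        """Set a variable in the "other" category (assumed method name)."""
        self.other[name] = value
```

A value set by the application therefore shadows form data and cookies of the same name, but not environment variables.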
General Services Provided by Request
Convert a Request to a string that looks like a Python expression.
Convert a Request to a string.
Attribute Access Services Provided by Request
Return a value for the required variable name. The value will be looked up from one of the request data categories. The search order is environment variables, other variables, form data, and then cookies.
Indexing Services Provided by Request
Return a value for the required variable name. The value will be looked up from one of the request data categories. The search order is environment variables, other variables, form data, and then cookies.
This method is used to set a variable in the request's "other" category.
An object representation of an HTTP response.
The Response type encapsulates all possible responses to HTTP requests. Responses are normally created by the object publisher. A published object may receive the response object as an argument named RESPONSE. A published object may also create its own response object. Normally, published objects use response objects to:
Provide specific control over output headers,
Set cookies, or
Provide stream-oriented output.
If stream oriented output is used, then the response object passed into the object must be used.
Constructor For Response
Creates a new response. In effect, the constructor calls "self.setBody(body); self.setStatus(status); for name in headers.keys(): self.setHeader(name, headers[name])"
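The constructor behavior quoted above can be mirrored in a small stand-in class. SketchResponse is hypothetical: the default argument values, the lower-casing of header names, and the integer-only setStatus are assumptions, and the content-length update follows the description of setBody given later in this document.

```python
class SketchResponse:
    """Stand-in response showing the constructor's order of calls."""

    def __init__(self, body="", status=200, headers=None):
        self.headers = {}
        self.setBody(body)
        self.setStatus(status)
        for name in (headers or {}):
            self.setHeader(name, headers[name])

    def setBody(self, body):
        """Store the body and update the content-length header."""
        self.body = str(body)
        self.setHeader("content-length", str(len(self.body)))

    def setStatus(self, status):
        """Store the status code (integers only in this sketch)."""
        self.status = int(status)

    def setHeader(self, name, value):
        """Set a header, replacing any previous value."""
        self.headers[name.lower()] = value
```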
Instance Methods For Response
Cause a redirection without raising an error
Set an HTTP cookie on the browser
The response will include an HTTP header that sets a cookie on cookie-enabled browsers with a key "name" and value "value". This overwrites any previously set value for the cookie in the Response object.
Append a value to a cookie
Sets an HTTP return header "name" with value "value", appending it following a comma if there was a previous value set for the header.
Returns the current HTTP status code as an integer.
Sets an HTTP return header "name" with value "value", clearing the previous value set for the header, if one exists.
Returns an HTTP header that sets a cookie on cookie-enabled browsers with a key "name" and value "value". If a value for the cookie has previously been set in the response object, the new value is appended to the old one separated by a colon.
Returns a string representing the currently set body.
Get a header value
Returns the value associated with a HTTP return header, or "None" if no such header has been set in the response yet.
Set the body of the response
Sets the return body equal to the (string) argument "body". Also updates the "content-length" return header.
You can also specify a title, in which case the title and body will be wrapped up in html, head, title, and body tags.
If the body is a 2-element tuple, then it will be treated as (title,body)
Set the base URL for the returned document.
Cause an HTTP cookie to be removed from the browser
The response will include an HTTP header that will remove the cookie corresponding to "name" on the client, if one exists. This is accomplished by sending a new cookie with an expiration date that has already passed. Note that some clients require a path to be specified - this path must exactly match the path given when creating the cookie. The path can be specified as a keyword argument.
Sets the HTTP status code of the response; the argument may either be an integer or a string from { OK, Created, Accepted, NoContent, MovedPermanently, MovedTemporarily, NotModified, BadRequest, Unauthorized, Forbidden, NotFound, InternalError, NotImplemented, BadGateway, ServiceUnavailable } that will be converted to the correct integer value.
Return data as a stream
HTML data may be returned using a stream-oriented interface. This allows the browser to display partial results while computation of the response proceeds.
The published object should first set any output headers or cookies on the response object.
Note that published objects must not generate any errors after beginning stream-oriented output.
General Services Provided by Response
Convert a Response to a string that looks like a Python expression.
Convert a Response to a string.
Indexing Services Provided by Response
Get the value of an output header
Sets an HTTP return header "name" with value "value", clearing the previous value set for the header, if one exists.