Introduction
Why a document on programming in
stateless environments, and why is it part of the IIS help
pages? The answer is simple: applications on the web run
in a stateless environment. Thus you need to know how to
deal with this in order to write great apps.
This document tries to build a
solid understanding of what a stateless environment is,
how it affects your application design and which
solutions are commonly used. It describes the whole topic
from the web point of view. Its main focus, however, is
the general theory. It won't cover any server-specific
statements (although it might refer to some in the
examples). You might find this document interesting even
if you do no web development but work in other stateless
environments (like some mainframe transaction monitors or
MTS).
Version
This is version 0.1 of this
document. It was compiled on Sunday, August 11th, 1996.
Keep watching for updates! It was ported to the
Adiscon site in late December 1997 and is scheduled to
undergo a major "renovation" in the first
quarter of 1998. Your opinion on this document is
welcome. Simply email us.
Disclaimer
This document is a free public
service of Adiscon GmbH, Germany. The information
contained in this document is correct to the best of our
knowledge. However, we do not guarantee its accuracy. Use
the information contained herein at your sole risk.
What is a Stateless Environment?
Stateless vs. Stateful
In usual interactive programming,
you deal with a stateful environment. That means your
app is in a session with the user. You have multiple
queries and responses during this single session and you
keep some sort of processing state (usually in main
memory).
A typical example is a mail-order
application. The following is typical pseudo-code for such
an application:
log on user
do
    allow user to browse database
    add new selection to set of selections
until user has finished selection phase
if any products were selected
    ask for shipping/billing information
    enter selection as an order
log off user
This is a typical sample in a
procedure-oriented environment. It is not a neat fit for
an event-driven GUI, but one thing is common to
such applications: the application keeps state
information about the user session. This processing state
is preserved across multiple interactions (here, the
product selection phase). The preservation of the
processing state is done by the environment itself,
typically by letting your application stay loaded across
the multiple requests. Your application is always in
control of what's going on and can handle it. This is a
typical example of a stateful environment.
A stateless environment, on
the other hand, does not offer these benefits to your
application. The environment won't do anything to
preserve the session state. Each request and response is
a new one and not related to any other.
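The difference can be illustrated with a minimal Python
sketch. The function names and the layout of the request
are assumptions made up for this illustration; they do not
belong to any particular server.

```python
def stateful_session(picks):
    """Stateful: one running program keeps the selections in memory."""
    selections = []
    for product in picks:            # each pass is one user interaction
        selections.append(product)   # state survives between interactions
    return selections


def stateless_handler(request):
    """Stateless: called once per request; nothing survives between calls."""
    selections = []                  # starts out empty on every request
    selections.append(request["product"])
    return selections


print(stateful_session(["book", "lamp"]))        # ['book', 'lamp']
print(stateless_handler({"product": "book"}))    # ['book']
print(stateless_handler({"product": "lamp"}))    # ['lamp']
```

The second call to the stateless handler knows nothing about
the first one: the earlier selection is gone unless the
application itself carries it along.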
As you might guess, there are of
course ways to implement stateful sessions in stateless
environments. This is what this document is all about.
Why a stateless environment?
Good question, you will probably say.
Why haven't those real experts, the OS programmers, created
stateful environments only? Why do they bother me with the
burden of keeping things together that naturally belong
together? Shouldn't that be the job of the OS?
Well, stateless environments have
their advantages. They are relatively easy to implement
and thus efficient. They are inherently well adapted to
unreliable environments, as recovery is simple (there is
no state information to restore). They are very well
suited to retrieval-type applications.
A key point here is efficiency.
We suspect this is the main reason why statelessness was
implemented in almost all of the mainframe transaction
monitors.
Is the web stateless?
Indeed it is. We think the main
reason for this is its initial design as a retrieval
system. However, there are a lot of initiatives to adapt
it to the new needs of stateful applications. The basic
protocol (http), however, is stateless and will remain
so.
Stateless vs. Transactional
A stateless environment is often
mistaken for a transactional environment. In a
transactional environment, a single transaction (a logical
group of actions) is either completed fully or not done
at all. Many transactional environments are also
stateless, which means that a transaction is formed by a
single request and response. However, a stateless
environment is not necessarily transactional.
Take an http request as an example:
let's say that a single request and response makes up a
logical group of actions. So it seems to be both
stateless and transactional. However, the following might
happen:
- User sends request to server
- Server processes request,
including entering data into database
- Server sends back response to
user
- The response gets lost due to
network failure (or for whatever reason)
So the logical group of actions
could not be completed successfully: processing was done
OK, but the user could not be notified of the success. In
a transactional environment, the database update would
now be invalidated (rolled back). On the web, however,
this doesn't happen. It is not even certain that the
server will notice the session abort! If the user
resubmits the data, there will be duplicate database
entries!
So keep in mind: http is stateless,
but not transactional.
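The failure sequence above can be simulated in a few lines
of Python. This is only a sketch: the in-memory SQLite
database, the orders table and the send_over_network helper
are assumptions made for illustration, not part of any real
server.

```python
import sqlite3


def handle_request(db, item):
    """Server side: store the order, then build the response."""
    db.execute("INSERT INTO orders (item) VALUES (?)", (item,))
    db.commit()                      # the update is final at this point
    return "order accepted"


def send_over_network(response, network_ok):
    """Hypothetical transport: returns None when the response is lost."""
    return response if network_ok else None


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (item TEXT)")

# First attempt: the server commits, but the response never arrives.
reply = send_over_network(handle_request(db, "book"), network_ok=False)
if reply is None:
    # The user sees no confirmation and resubmits the same data.
    reply = send_over_network(handle_request(db, "book"), network_ok=True)

count = db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(count)  # 2
```

Nothing rolled back the first insert when its response was
lost, so the resubmitted order is stored a second time.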
Some definitions
In order to get any further, we
need to use some common terms. There are a lot of
definitions for all of these things out there. Many use
different names for the same thing and some use the same
name for different things. Thus we have decided to
define some terms ourselves and refer to these
definitions later on in this document. Note that these
definitions are neither "official" nor
"self made". We've based them on common
understanding but want to make sure you know what we are
talking about and remove any doubt about
our usage of terms.
Session
This is the logical session your
application has with the user. It is started by the log
on to your application and ended by the user (or by any
abnormal termination). There may be any number of
requests and responses during a single session. However,
all of these requests and responses logically belong to
the same application and are used to create a single
"effect". Thus, there might be multiple
sessions during one logon / logoff (e.g. if you have a
warehouse application and start the selection and order
process multiple times).
The beginning and end of a session are determined
only by application logic.
Session Identifier
The session identifier describes a
single session. It can be used to describe a "master
session" containing multiple sub-sessions (as
described above) or can identify a sub-session. It will
most likely describe the master session if there are
multiple operations going on during one session. The
sub-sessions will most likely be implemented in the form
of multiple contexts.
Context
The context is often also referred
to as the state of an operation / application. In this
document, context means all the information that the
application needs to describe its current state.
It includes all pre-entered data as
well as any housekeeping information. The context is a
logical rather than a physical entity. Its physical
representation might be stored in memory, in a database
or even "on the wire" (as part of the
request/response).
Context Identifier
This is an entity (usually a key)
that identifies a given context. It can be equivalent to a
session identifier, but it need not be (as a session might
be used to create several contexts).
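As a rough illustration of how these four terms relate, the
following Python sketch models them as plain data classes.
The class and field names are our own assumptions for this
example; the definitions above do not prescribe any
particular representation.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """All data needed to describe the current state of one operation."""
    context_id: str                              # the context identifier
    data: dict = field(default_factory=dict)     # pre-entered and housekeeping data


@dataclass
class Session:
    """The logical session between the application and one user."""
    session_id: str                              # the session identifier
    contexts: dict = field(default_factory=dict)

    def open_context(self, context_id):
        """Create a new context (sub-session) inside this session."""
        ctx = Context(context_id)
        self.contexts[context_id] = ctx
        return ctx


# One master session that goes through two order processes.
session = Session("s-1")
first = session.open_context("order-1")
first.data["selected"] = ["book"]
second = session.open_context("order-2")
print(sorted(session.contexts))  # ['order-1', 'order-2']
```

If a session only ever has one context, the two identifiers
can be the same key.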
How to handle stateless
environments
There is just one way to deal with
them: implement some kind of state awareness. The key to
handling stateless environments is the design of
appropriate sessions and contexts. For each of these
entities, an identifier is then created and passed
between requests. Thus you have at least two levels of
communication:
- the logical session, which
runs across multiple requests (with maybe
multiple contexts)
- the "physical"
environment-dependent request/response (in our
case an http request)
Designing Sessions
This is just like any other application design.
Consider which need your application should serve. Think
about which steps the user is required to take and which
he might take. Look at which data (and how much of it!)
makes up a context.
There are some special
considerations for stateless (http) environments:
- what should be done if the
session is discontinued?
- when to do final updates?
- which technology can be
expected on the user's side? (This restricts the
methods available for saving the context.)
Keeping track of Sessions
In most cases, sessions have a
single context, and both this context and the session are
identified by the same identifier. Thus, from now on we
will use "session" for the logical group
of actions and "context" for all of the
session's data as well as its identifier.
Saving the Context
In general, you need to decide on a
way to save the context between multiple requests. This
saved context is restored with each new request and thus
can be used to build the logical session. There are two
basic ways to save the context:
- store it in the request itself
- store it at the server
Each of these options is
discussed in greater detail in the following sections.
Saving the Context in the Request
itself
This is especially popular if your
context information is small. All of its data is simply
put into the response sent to the client and is
retransmitted by the client with its next request. It is
relatively easy to do this kind of context saving if you
have full control over your html files.
A major advantage
of this approach is its inherent handling of session
aborts etc.: if the user decides to discontinue a session
(or a fatal error happens), the data doesn't get
retransmitted to the server and thus the action is really
aborted. As there is nothing stored on the server side,
there is no need to clean up anything. So this is a
pretty elegant solution.
However, there are two major drawbacks
to this method. First, the context is transmitted in each
and every request. This works well with small data sizes
but soon becomes impractical as data sizes increase. Thus
it can only be used for small contexts. Second,
the context must be transmitted in every
request. If one request doesn't transmit it, the whole
context (and thus the session) is lost. This effectively
prohibits any static html inside such a session.
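One common way to carry the context in the request is to
write it into hidden form fields of the generated page. The
text above does not name a mechanism, so treat the following
Python sketch as one possible approach. The field names and
the "/select" action URL are hypothetical.

```python
from html import escape
from urllib.parse import parse_qs, urlencode


def render_form(context):
    """Embed every context item as a hidden field in the outgoing page."""
    hidden = "\n".join(
        f'<input type="hidden" name="{escape(key)}" value="{escape(value)}">'
        for key, value in context.items()
    )
    return (
        '<form method="post" action="/select">\n'
        f"{hidden}\n"
        '<input type="submit">\n'
        "</form>"
    )


def restore_context(body):
    """Rebuild the context from the form data the client sent back."""
    return {key: values[0] for key, values in parse_qs(body).items()}


context = {"user": "alice", "selected": "book,lamp"}
page = render_form(context)

# A browser would post the hidden fields back; urlencode stands in for it.
posted = urlencode(context)
restored = restore_context(posted)
print(restored == context)  # True
```

Every page in the session has to be generated this way. A
static page has no place for the hidden fields, which is
exactly the second drawback described above.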
Saving the Context at the Server
This method effectively eliminates
the drawbacks of saving the context in the request
itself: static html is no problem (well, mostly...) and
you can save rather large contexts. However, there are
some other drawbacks. You have to handle session aborts
with some special logic, as there is a need to clean up
resources (the orphaned contexts). The other, and major,
drawback is the lack of any really standardized way to
handle server context saving. The thing closest to a
standard is the cookie proposal by Netscape, together with
a related IETF draft, but this is a moving target and
might change.
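A minimal Python sketch of server-side context saving could
look like this. It keeps the contexts in an in-memory
dictionary, hands the identifier to the client in a cookie
and purges contexts that have not been used for a while. The
cookie name "ctx", the 30 minute idle limit and all function
names are assumptions chosen for this example.

```python
import secrets
import time
from http.cookies import SimpleCookie

MAX_IDLE_SECONDS = 1800          # assumed idle limit before a context is orphaned
_store = {}                      # context_id -> (last_used, context)


def save_context(context, context_id=None, now=None):
    """Store the context on the server and return its identifier."""
    now = time.time() if now is None else now
    context_id = context_id or secrets.token_hex(16)
    _store[context_id] = (now, context)
    return context_id


def load_context(context_id):
    """Return the saved context, or None if it is unknown or was purged."""
    entry = _store.get(context_id)
    return entry[1] if entry else None


def purge_orphans(now=None):
    """Remove contexts whose sessions were apparently abandoned."""
    now = time.time() if now is None else now
    stale = [cid for cid, (used, _) in _store.items()
             if now - used > MAX_IDLE_SECONDS]
    for cid in stale:
        del _store[cid]
    return len(stale)


def cookie_header(context_id):
    """Build the header line that hands the identifier to the client."""
    cookie = SimpleCookie()
    cookie["ctx"] = context_id
    return cookie.output(header="Set-Cookie:")


def id_from_cookie(header_value):
    """Extract the identifier from the Cookie header of a request."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return cookie["ctx"].value if "ctx" in cookie else None


context_id = save_context({"selected": ["book"]}, now=0)
header = cookie_header(context_id)                # sent with the response
returned = id_from_cookie("ctx=" + context_id)    # sent back by the client
print(load_context(returned))                     # {'selected': ['book']}
print(purge_orphans(now=MAX_IDLE_SECONDS + 1))    # 1
print(load_context(returned))                     # None
```

Only the small identifier travels with each request, so the
context itself can be large. The price is the cleanup logic:
without purge_orphans, abandoned contexts would pile up on
the server.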
.... to be extended