This is a mirror of Joe Marshall's ChangeSafe document at https://sites.google.com/site/evalapply/changesafe.html?attredirects=0. ChangeSafe code is now available here

What is ChangeSafe?

Quick History

Content Integrity, Inc. was founded in 1999 to develop and market what became ChangeSafe. Despite the fact that we had a product, customers, and revenue, our backers pulled the plug on us when the dotcom bubble burst. Content Integrity filed for Chapter 11 bankruptcy in April 2001.

In the summer of 2001, ChangeSafe LLC was formed. We purchased the intellectual property of Content Integrity in order to continue developing it. Rather than repeating the mistakes of Content Integrity (in particular, the mistake of relying on venture capital for funding), we decided to continue development on a shoestring. Progress has been steady, but very slow.

There were a few technical issues that we felt should be addressed. The original product used the ObjectStore ODBMS as the back end. We found, however, that ObjectStore's features were a poor match to our requirements and this caused performance problems when we attempted to scale up to larger data sets. The web interface of the original code base was difficult to maintain and we wished to replace it.

Over the past few years, I have rewritten the back end, replaced the web interface, redesigned the versioned object layer, and reworked the more problematic code. The code is once again working and approaching beta-release stage.

A New Language

ChangeSafe is a complex piece of software. Revision control and configuration management are tricky enough, but we wanted to solve an even more difficult problem, that of making revision control and configuration management look easy. To do this, we needed a new computer language: one in which versioned data structures are built in, one that allows us to switch our viewpoint from a single instant of time to the entire history of the stored data, one in which the primitive operations of assignment and object creation can themselves be reflected back into the language as objects to be manipulated. Of course this new language needs all the usual features --- numbers, strings, arithmetic, procedures etc. --- and we'll make heavy use of some advanced features --- multi-threading, structured error handling, reflection --- and we want a rich set of auxiliary libraries --- collections, web stuff like sockets, http, and html generation, database connectivity. We can't forget the infrastructure tools like debuggers, profilers, and a reasonable IDE.

But metalinguistic abstraction --- inventing custom computer languages --- doesn't mean that you have to start from scratch. Some computer languages make it easy and natural to extend the core language with new language constructs. That is why, with the exception of the little Java applet that runs on the client machine, all of ChangeSafe is written in Common Lisp.

Stratified Design

ChangeSafe uses the principle of stratified design. Each level of the design builds upon and extends the level below it. We can look upon each level as providing a specialized language tailored for the level above. Lisp is particularly suited for this style of design because it hides the distinction between the built-in language features and new ones. Each level of design provides an extended version of Lisp that we use to write the next level.

ChangeSafe has five principal layers. At the bottommost layer, we extend the vendor's Common Lisp system with several utilities: the SERIES package, high-performance vector routines, logging facilities for debugging, string manipulation functions, a number of simple datatypes, etc. We also shadow some built-in functions to extend their functionality or suppress undesired behavior.

Above the utility layer is the persistent store layer. We extend the Common Lisp Object System (CLOS) with persistent classes. Instances of persistent classes are allocated in non-volatile storage, so these objects survive shutdowns or system crashes. Access to persistent objects is controlled through transactions so the persistent objects have the desired ACID (atomic, consistent, isolated, and durable) properties. No special syntax or language constructs are needed to use persistent objects. They behave like normal CLOS objects (although they do have the convenient behavior of rolling back their state when a transaction aborts).

The ChangeSafe core engine sits atop the persistent object layer. It too extends CLOS, but provides a versioned object abstraction. Transactions at the core level specify which version or set of versions should be visible and the slots (fields) in the object are presented appropriately. Assignment to a slot of a versioned object causes the creation of a new version. Naturally, a versioned object is persistent. Core transactions build upon the lower transaction layer, so a transaction abort causes objects to be reverted to their previous state.

The version management layer provides the reflection capability for ChangeSafe. When new versions of core objects have been created as the result of assignments, the version management layer collects these changes as a named Change Set. In addition, the version management layer maintains collections of Change Sets as named version objects. Different paths of change evolution, represented as linear sequences of version objects, are kept in different branch objects, while related branch objects are collected within a project.

The repository file model is a relatively simple layer. It models a hierarchical file system like one would find on a Unix or Windows machine. But because it is built upon the version management layer, the objects in the model have full versioning capability. It is this layer that gives ChangeSafe PCM the revision control functionality.

The configuration management layer is more complex than the file model. The changes, versions, branches, and projects of the version management layer are once again reflected so that more complex relationships can be modeled. For example, two different software applications can share a single change-managed library component, or a customized product can be created that permits changes to flow in from the main product line, but prevents changes from flowing the other way.

In addition to these layers, there are some non-layered modules that are available across layers. The workspace model, built upon the persistent store layer, models the state of the user's workspace. The server module communicates with the file system agent applet to manipulate files and directories in the user's workspace. The web server module provides a web-based interface to the ChangeSafe system.

The Function of Lambda

I'm hoping that this monograph will be read by people with little or no Lisp experience as well as by people who have a lot. The experts can probably skip this section, but the newbies might find this helpful.

Common Lisp and Scheme have a special operator called lambda. Most people find lambda expressions somewhat confusing when they first encounter them. The word LAMBDA is weird and it isn't descriptive like IF or LET. Textbooks often describe lambda expressions as anonymous functions. Why not simply use a named one? You may read that lambda expressions create functions. Isn't it really the compiler that does that?

The textbooks are correct, but they miss an important point. A lambda expression is simply a chunk of code to be run at a later time. The word lambda means “take this chunk of code and stick it in a box, I'll run it later.” This is an incredibly powerful feature. Chunks of code no longer have to be physically located inline at the point they are used.

This works especially well in Lisp. Because every chunk of Lisp code is an expression, you can pretty much slap a LAMBDA around anything you want and wrap it up for later. And since the place where you activate the wrapped up code is also an expression, you can activate wrapped up code anywhere you want.

Here's a simple example. Suppose we have a database system where we require that data is accessed within a transaction. The transaction code is idiomatic: we create a transaction, do whatever, and then clean up. The clean-up is crucial. No matter what happens, we cannot forget to clean up. We would use a try...finally statement in a more traditional language. In Lisp we use unwind-protect.

  (unwind-protect
      (progn (begin-transaction) ;; Try these forms
             (do-whatever))
    (clean-up))                  ;; Finally clean up

But we don't want to sprinkle this boilerplate everywhere in our code. What if we need to change the setup code or the clean-up code? What we want to do is separate the transaction boilerplate from the other code. But we need to run the boilerplate around the other code. LAMBDA comes to the rescue.

;; TRANSACTION accepts some saved code.  This is traditionally
;; called a thunk.  It establishes the protected region using
;; unwind-protect and then activates the saved code by
;; funcalling it.

(defun transaction (thunk)
  (unwind-protect
      (progn (begin-transaction)
             (funcall thunk))
    (clean-up)))

;; Here we want to look up an address when given a name.  We want
;; to fetch the record and select the address, but we can't do
;; that without a transaction.

;; We don't want to run the code right now, so we wrap it in
;; lambda to save it for later.  The transaction procedure
;; will activate it.

(defun lookup-address (name)
  (transaction
    (lambda ()
      (select-address (fetch-record name)))))

It gets better. Suppose you have in your code a function that needs, say, three arguments. Suppose that at the location where it makes sense to call the function, only two of those arguments are available --- perhaps the third will be computed later. Just wrap the code that needs the third argument in a LAMBDA, save it for later, and activate it when you find out the third argument.

;; We're parsing some C code and we've discovered some square
;; brackets.  We know we've got an array declarator, but we don't
;; yet know what it is an array of.  We've got most of the
;; information we need to build the declarator, but not all.

;; We wrap the code that creates the declarator in a lambda
;; so we can activate it later once we know the base type.

(define (parse-array-declarator ctenv inner wrapper size)
  (parse-declarator ctenv inner
    (lambda (base)
      (make <array-declarator>
         :inner-declarator (wrapper base)
         :size size))))

It gets even better. Suppose you have to choose one of two ways of handling some data, but at the moment you don't know which one to choose. Take both options, wrap each in a lambda, and pass them on. Later on, when you can make the decision, unwrap and activate the appropriate code and discard the other.

;; In a game program, the player has hit the fire button.  We want
;; to either increase or decrease his score depending on what
;; the shot hits, but we won't know until we finish tracking the
;; bullet. 

(defun player-shoot (direction)
  (start-bullet direction
    (lambda ()   ;; if it misses, penalize
      (decrement-score 10))
    (lambda ()
      (increment-score 100)
      (play-sound explosion))))

;; A shot by the AI opponent, however, has different effects.

(defun ai-shoot (direction)
  (start-bullet direction
    (lambda ()    ;; if it misses, make a noise
      (play-sound ricochet))
    (lambda ()
      (play-sound ouch)
      (decrement-health)
      (if (< health 0) (die)))))

This technique of wrapping up code to be used later is called continuation-passing-style.
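
The examples above call start-bullet without defining it. As a minimal sketch of the receiving side, assume a hypothetical stand-in that is simply told whether the shot hits instead of tracking a real bullet. The names demo-start-bullet, demo-shoot, and *demo-score* are invented for this illustration.

```lisp
;; Hypothetical stand-in for START-BULLET.  A real game would track
;; the bullet over time; here the caller says whether it hits.
(defvar *demo-score* 0)

(defun demo-start-bullet (hits-p if-miss if-hit)
  "Activate exactly one of the two saved chunks of code."
  (if hits-p
      (funcall if-hit)
      (funcall if-miss)))

(defun demo-shoot (hits-p)
  (demo-start-bullet hits-p
    (lambda ()                     ;; if it misses, penalize
      (decf *demo-score* 10))
    (lambda ()                     ;; if it hits, reward
      (incf *demo-score* 100))))
```

Only one of the two lambdas is ever activated; the other is simply dropped.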

Real Examples

Here is an example taken from the ChangeSafe sources.

(defun list-products (request)
  "For each product, return a list of 3 items.
   The object-id of the product, the name of the product, and the
   description of the product."
  (call-with-master-catalog-transaction
    (request/repository-dbpath request)
    (request/user-name request)
    :transaction-type   :read-only
    :reason             "list products"
    :master-metaversion :latest-metaversion
    :version            :latest-version
    :receiver (lambda (master-repository master-transaction master-catalog)
                (declare (ignore master-repository master-transaction))
                (collect 'list
                         (map-fn 't (lambda (product)
                                      (list (object-identifier product)
                                            (object-name product)
                                            (object-description product)))
                                 (scan 'list (master-catalog-products master-catalog)))))))

The request object encapsulates the various parameters to the list-products command. It is created by either the web server front-end or the command line front-end and passed to this layer.

call-with-master-catalog-transaction establishes a transaction context for the purposes of examining the master catalog. This is a persistent, versioned data structure that contains the product configurations that are under the control of ChangeSafe. Within the transaction, we will collect a list that we obtain by mapping a simple function over the products in the master catalog.

This piece of code is quite simple, but that is because we are able to abstract away the details. Let's examine what happens when this code is invoked.

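call-with-master-catalog-transaction takes several arguments: the path to the repository database, the user name, the transaction type, a reason for the transaction, the version info (a metaversion and a version), and a receiver to invoke within the transaction.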

The version info needs some explanation. The master catalog is a versioned object. We can select which change sets are used to compose the view of the master catalog. We select :latest-version to indicate that we want to view the most up-to-date revision of the master catalog.

But what was the latest version last month is not what is the latest version today! The notion of “latest” itself changes over time, so we specify a metaversion against which we resolve the version. In this case, we use the :latest-metaversion to indicate that we want to resolve the :latest-version at the current time.

Versioning and metaversioning will make your brain hurt. The metaversion is generally used to specify a particular point in time for the view. If we wished to discover the products that were managed by ChangeSafe last year, we would use a metaversion of last year, and a version of :latest-version.

On the other hand, the version is generally used to specify a particular collection of change sets used to synthesize the view. If we wanted to discover the products that are currently managed, but undo an erroneous product deletion, we would specify the :latest-metaversion, and a version that omitted the change that deleted the product.
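
To make the two knobs concrete, here is a deliberately simplified sketch. It is not how ChangeSafe represents change sets. It assumes a hypothetical in-memory history in which each change set has an id, a timestamp, and a single add or delete of a product name. The metaversion is modeled as a cutoff time and the version as a list of change set ids to include.

```lisp
;; Hypothetical model, for illustration only.
(defstruct demo-change id timestamp action product)

(defparameter *demo-history*
  (list (make-demo-change :id 1 :timestamp 100 :action :add    :product "editor")
        (make-demo-change :id 2 :timestamp 200 :action :add    :product "compiler")
        (make-demo-change :id 3 :timestamp 300 :action :delete :product "editor")))

(defun demo-visible-changes (metaversion version)
  "METAVERSION is :latest-metaversion or a cutoff time.
VERSION is :latest-version or a list of change ids to include."
  (remove-if-not
   (lambda (change)
     (and (or (eq metaversion :latest-metaversion)
              (<= (demo-change-timestamp change) metaversion))
          (or (eq version :latest-version)
              (member (demo-change-id change) version))))
   *demo-history*))

(defun demo-view-products (metaversion version)
  "Replay the visible changes in order to synthesize the view."
  (let ((products '()))
    (dolist (change (demo-visible-changes metaversion version)
                    (reverse products))
      (ecase (demo-change-action change)
        (:add    (push (demo-change-product change) products))
        (:delete (setf products
                       (remove (demo-change-product change) products
                               :test #'string=)))))))
```

In this toy model, a cutoff time of 250 with :latest-version shows the catalog as it stood before the deletion, while :latest-metaversion with the version (1 2) shows today's catalog with the deletion left out.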

call-with-master-catalog-transaction opens the master database, establishes the appropriate transaction, retrieves the master catalog, and invokes the receiver on those objects. In addition, it arranges for change sets to be constructed should something change (although in a read-only transaction, this is forbidden), the transaction to be committed or aborted as appropriate, and the master database to be closed. These operations are vital for the correct functioning of the database, but they are all performed automatically.
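
The general shape of such a call-with procedure can be sketched as follows. This is not the ChangeSafe implementation. Every name here is a hypothetical stand-in, and the database operations are replaced by entries in an event log so the control flow can be observed.

```lisp
;; All names below are hypothetical stand-ins, not ChangeSafe's API.
(defvar *demo-events* '() "Log of what the sketch did, newest first.")

(defun demo-note (event)
  (push event *demo-events*))

(defun call-with-demo-transaction (receiver)
  "Open the database, run RECEIVER inside a transaction, commit on a
normal return, abort on a non-local exit, and always close."
  (let ((committed nil))
    (demo-note :open)
    (unwind-protect
         (progn
           (demo-note :begin)
           ;; Return the receiver's values, but commit first.
           (multiple-value-prog1 (funcall receiver)
             (demo-note :commit)
             (setf committed t)))
      ;; Clean-up forms run however the protected form is exited.
      (unless committed (demo-note :abort))
      (demo-note :close))))
```

A receiver that returns normally produces open, begin, commit, close. One that signals an error produces open, begin, abort, close. In both cases the caller writes none of that bookkeeping.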

Here is the definition of the master catalog:

(defclass master-catalog ()
  ;; subsystems and products (PC's) in this repository
  ((products   :initform nil
               :version-technique :composite-set
               :accessor master-catalog-products)

   ;; ... more slots omitted ....
   )
  (:documentation "Catalog/hierarchy-root of versioned information maintained in the master repository.")
  (:metaclass versioned-standard-class))

Other than the :metaclass of versioned-standard-class and the slot option of :version-technique :composite-set, this is a standard CLOS class definition.

At the configuration management layer, we can simply program using persistent, versioned objects. The lower layers take care of managing transactions and change sets and provide a simple API that acts more-or-less like the standard CLOS API.

Now let's examine a lower layer.

A versioned object is a persistent object that contains versioned slots. A versioned slot is a data structure that contains the history of the slot value. Upon access, the value of the slot is synthesized from an examination of the slot's history and the version information in the enclosing versioned transaction. This is done simply by overriding the definition of slot-value-using-class.

;;; The versioned slots in a versioned object actually contain a
;;; versioned value.
(defmethod clos:slot-value-using-class ((class versioned-standard-class)
                                        (instance versioned-standard-object)
                                        slot)
   (view-versioned-value (call-next-method) *transaction* instance slot))

The definition of a versioned value is this:

(defclass versioned-value ()
  ((change-records
     :initform (make-instance 'persistent-vector :size 0)
     :reader   versioned-value-change-records))
  (:metaclass persistent-standard-class))

At the versioned object layer, we program using non-versioned, persistent objects. The lower layers take care of persistence and provide the persistent-standard-class metaclass as a primitive to this layer.
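
As a rough illustration of how a value might be synthesized from change records, here is an in-memory sketch. It makes several assumptions that are not taken from the ChangeSafe sources: ordinary objects stand in for persistent ones, a change record is just a pair of change set id and value, and the view is simply the most recent record whose change set is active.

```lisp
;; Hypothetical, in-memory stand-in for a versioned value.
(defclass demo-versioned-value ()
  ((change-records
    :initform (make-array 0 :adjustable t :fill-pointer t)
    :reader   demo-change-records)))

(defun demo-record-change (value change-set-id new-value)
  "Assignment never overwrites; it appends to the history."
  (vector-push-extend (cons change-set-id new-value)
                      (demo-change-records value))
  new-value)

(defun demo-view-value (value active-change-sets)
  "Return the value of the most recent record whose change set is
active, or NIL if no record applies."
  (let ((record (find-if (lambda (r)
                           (member (car r) active-change-sets))
                         (demo-change-records value)
                         :from-end t)))
    (and record (cdr record))))
```

The same stored history yields different answers depending on which change sets the viewer selects, which is the behavior the slot-value-using-class method above provides transparently.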

The stratified design allows each layer to build upon the layer below it without knowing the details about how that layer is implemented. Each layer is itself ignorant of the layers above it. This is an important element in the stratified design. The lower layers are more general than strictly necessary because they must supply facilities that the upper layers can mix and match. But a more general design can be simpler than one that has been specialized for a particular use.

The stratified design also allows each layer to evolve independently of the other layers. A different approach to persistence, for example, could be used without making major changes to the versioning layer. In fact, I was able to replace the ObjectStore back end with one of my own design.

Why Lisp?

The ChangeSafe PCM product did not need to be written in Lisp. Other change set based PCM systems exist that have been written in C, Java, Haskell, etc. But none, to my knowledge, build the PCM system on top of a change-managed language substrate. The ChangeSafe core engine can be adapted to model any object that changes, not just files.

There are major obstacles to adding persistent versioning to other languages. Meta-object protocols are still experimental in most languages, while Lisp's MOP is stable, well-supported and mature. The abstraction of complex control flow made possible through the use of continuation-passing-style is extremely cumbersome in most languages, if it is possible at all. Few languages with explicit types can deal with anonymous procedures when used in this manner. While the example above did not illustrate it, macros are used to create an intuitive, streamlined syntax for common operations. Algebraic notation is hostile to macros.
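
To give a flavor of the macro point, here is a small sketch built on the transaction procedure from the lambda section. The with-demo-transaction macro and the logging stubs are hypothetical and are not taken from the ChangeSafe sources.

```lisp
;; Hypothetical stubs so the sketch runs on its own.
(defvar *demo-log* '())
(defun begin-transaction () (push :begin *demo-log*))
(defun clean-up () (push :clean-up *demo-log*))

(defun transaction (thunk)
  (unwind-protect
       (progn (begin-transaction)
              (funcall thunk))
    (clean-up)))

;; The macro only rewrites syntax: the body is wrapped in a LAMBDA
;; and handed to the ordinary procedure above.
(defmacro with-demo-transaction (&body body)
  `(transaction (lambda () ,@body)))
```

With this in place, (with-demo-transaction (select-address (fetch-record name))) reads like a built-in construct, yet all of the work is still done by the thunk-accepting procedure.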

I don't believe any other computer language could be as easily adapted to versioned object modelling.

Is it Available for a Test Drive?

Unfortunately, not yet. I cannot put as much time into ChangeSafe as I'd like, and the current version has many rough edges. For example, most errors drop you into the Lisp debugger rather than generating a nice friendly message telling you how to correct the problem. I'd be happy to demo it, however.

I'm hoping that my partner and I will be able to raise enough money to get ChangeSafe into a sellable state. How long that would take depends on how much spare time I have.