On a scheme for backward recovery in complex systems including both client processes and data servers
* Centre for Software Reliability, City University
CSR Technical Report. October 1996. Last revised: March 1997.
We consider backward error recovery for complex software systems, where different subsystems may belong to essentially different application areas, like databases and process control. Examples of such systems are found in modern telecommunication, transportation, manufacturing and military applications. Such heterogeneous subsystems are naturally built according to different design "models", viz. the "object-action" model (where the long-term state of the computation is encapsulated in data objects, and active processes invoke operations on these objects), and the "process-conversation" model (where the state is contained in the processes, which communicate via messages). Each of these two "models" of computation has its own most appropriate scheme for backward error recovery. For the object-action model, atomic transactions are now the accepted model of backward recovery. For the process-conversation model, a recovery scheme based on planned conversations has been widely studied. We have shown how checkpointing and roll-back can be co-ordinated between two sets of such heterogeneous subsystems, namely sets of message-passing processes organised in conversations and data servers offering atomic transactions. Assuming that each of the two kinds of subsystem already has functioning mechanisms for backward error recovery, we have described the additional provisions needed for co-ordination between heterogeneous subsystems. Our additions are based on rather general models of both transactions and conversations: they could be adapted for most specific instances of either scheme. Our solution involves altering the virtual machine on which the programs run and adopting programming conventions that seem rather natural and can be automatically enforced.
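As a rough illustration of the kind of co-ordination discussed above, the following toy sketch pairs checkpointed processes with data servers that offer transactions, so that a failed acceptance test rolls back both together. It is not the protocol specified in the report: every name here (`DataServer`, `Process`, `run_conversation`) is hypothetical, everything runs in a single thread, and the commit step is deliberately naive (no atomic commitment across servers).

```python
import copy


class DataServer:
    """Toy data server offering transactions (object-action model)."""

    def __init__(self, **data):
        self.committed = dict(data)
        self.tentative = None  # working copy while a transaction is open

    def begin(self):
        self.tentative = copy.deepcopy(self.committed)

    def write(self, key, value):
        self.tentative[key] = value

    def commit(self):
        self.committed, self.tentative = self.tentative, None

    def abort(self):
        self.tentative = None


class Process:
    """Toy process whose state lives inside the process itself."""

    def __init__(self, state):
        self.state = state
        self._checkpoint = None

    def checkpoint(self):
        self._checkpoint = copy.deepcopy(self.state)

    def rollback(self):
        self.state = copy.deepcopy(self._checkpoint)


def run_conversation(processes, servers, body, acceptance_test):
    """Run `body` as one recoverable unit spanning both kinds of subsystem."""
    for p in processes:
        p.checkpoint()          # recovery line for the processes
    for s in servers:
        s.begin()               # open a transaction on each data server
    try:
        body(processes, servers)
        ok = acceptance_test(processes, servers)
    except Exception:
        ok = False              # an error inside the body counts as failure
    if ok:
        for s in servers:
            s.commit()
    else:
        for s in servers:
            s.abort()           # discard tentative server updates
        for p in processes:
            p.rollback()        # restore process state to the checkpoint
    return ok


def take_widgets(n):
    """Hypothetical conversation body: take n widgets from stock."""
    def body(procs, servers):
        procs[0].state += n
        servers[0].write("widgets", servers[0].tentative["widgets"] - n)
    return body


def stock_not_negative(procs, servers):
    return servers[0].tentative["widgets"] >= 0


controller = Process(state=0)
stock = DataServer(widgets=10)
first = run_conversation([controller], [stock], take_widgets(4), stock_not_negative)
second = run_conversation([controller], [stock], take_widgets(20), stock_not_negative)
```

In this sketch the first call succeeds and commits, while the second fails its acceptance test, so the server transaction is aborted and the process state is restored together. A realistic scheme would also have to deal with concurrency, nested conversations and failures during commit, none of which is modelled here.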
Related papers
The ideas presented here were first outlined in:
A shorter version of the discussion and specification parts of this report, with an example of use of our method, is published in
Material from that paper is reused here with permission from IEE.
The older report