An Environment for Integrating Formal Methods Tools

Paul Anderson[*], Michael Goldsmith[+], Bryan Scattergood[+] and Tim Teitelbaum[*,#]

[*] GrammaTech, Inc.
317 N. Aurora St.
Ithaca, NY 14850, USA

[+] Formal Systems Design & Development, Inc.
P.O. Box 3004
Auburn, AL 36831, USA

[#] Department of Computer Science
Cornell University, Upson Hall
Ithaca, NY 14850

Presented at
User-Interfaces for Theorem Provers 97 (UITP97),
INRIA, Sophia-Antipolis,
September 1-2, 1997.

Abstract

    This paper describes a project that aims to speed adoption of formal methods in industry. A design for an open system for integrating diverse formal methods tools using an industry standard communications mechanism is presented. Initial work is particularly aimed at reasoning about state machines, but as a first step towards a universal Formal Methods Interface.

1 Introduction

Formal methods are not as widely used in industry as the undoubted benefits they can bring should imply. Some of the reasons are:

    Investment Catch-22. There is `Catch-22' situation: while the market is small, there is little incentive for tool companies to construct industrial-strength tools; but the market will remain small unless high-quality tools are available to industry.

    Industry Resistance. There is an inherent resistance to adoption of new notations and methods in industry. Formal methods are seen as esoteric techniques with limited applicability. The educational requirements seem prohibitive. The investment required to bear fruit is seen as too high and too risky.

    Researcher Isolation. Formal methods research and prototype development is often conducted in relative isolation from real world practice. Applicable formal methods solutions for practical problems go unrecognized and unappreciated because they are implemented in these remote settings. Researchers do not make their systems compatible with complex industry standards.

    Tool Isolation. Valuable multipurpose formal-methods nuggets are often locked inaccessibly inside tools built as monolithic hulks. Coupling formal-methods tools together is currently a difficult task done in an ad-hoc manner. Any formal effort is also generally forced to be separate from the mainstream development, making it hard to keep up with -- and influence -- the leading edge.

    User-Interface Inadequacies. Existing formal-methods tools are hobbled by weak user-interfaces. When improved user-interfaces are attempted, their implementation cost often dominates the cost of implementing the core formal method. Systems become ossified because the cost of updating the user-interface component is prohibitive.

This project aims to build an environment for integrating formal methods tools that will help overcome these problems. The Investment Catch-22 is addressed by capitalizing on the significant existing investment in CASE tools, and by developing generic components with dual-use application. Industrial resistance is ameliorated by building the FMI within an environment industry has already embraced. Researcher isolation from industrial practice is overcome if the FMI becomes the sandbox of choice for formal methods researchers and their prototypes. Tool isolation is tackled by adopting industry standards for component technology -- thereby enabling integration of diverse formal methods tools, along with established structured design and configuration management tools, within a common framework. User-interface inadequacies will be addressed by an Active Document Toolkit incorporated as part of the FMI.

The project is a collaborative 4-5 person-year effort between Formal Systems Design & Development, Inc. [FSDD], and GrammaTech, Inc., with support from the U.S. Office of Naval Research. Each company focuses on exploiting their existing expertise: FSDD (together with Formal Systems (Europe) Ltd [FSEL]) on building a semantic framework for integrating formal methods tools; and GrammaTech on extensible linguistic and presentational elements and integration with conventional software engineering tools.

The resulting framework will provide a formal methods ``interface'' in three senses, each of which raises a number of issues which have been considered in the initial design phase.

  • Between the engineer and the formal methods tools:
    • presentation of information
    • modes of interaction with the FMI itself, and with the tools it manages
    • task management and status displays
    • modes of experimentation -- support for ``what if'' investigations (for example, the effect of adding a new operating mode), outside the mainstream development.

  • To support the interworking of diverse tools:
    • provision of ``primary'' knowledge base
    • interaction with and management of requirement models and formal models / theories in store
    • ``fact'' and ``conjecture'' translation and transmittal
    • use of common intermediates to avoid need for any-to-any translations and to facilitate extensibility
    • information fusion and exploitation of results from multiple sources.

  • Between the formal tools and the project environment:
    • interaction with revision and configuration control
    • ensuring consistency or invalidation of previous results
    • allowing ``replay'' of tool interactions wherever possible
    • minimally interactive mode to be invoked by ``rebuild'' operations
An initial population of the framework, consisting of Cayenne's ObjectTeam [5] as a requirements capture / specification engine, SRI's PVS [14] as an interactive theorem prover and FSEL's FDR-2 high-level model-checker [22], will demonstrate the viability and utility of the interworking of disparate tools. A wide range of further tools are under consideration, and will be added as resources allow and customer interest dictates, including ---
  • The U.S. Naval Research Laboratory's SCR* toolkit [11], supporting the SCR specification method [15]
  • The Murphi model-checker [7], and SMV [13]
  • Isabelle [17] and HOL [9], as general-purpose theorem provers
  • The B-Toolkit [1], providing a route through proof to automatic code generation
  • Tools supporting the Temporal Logic of Actions [12], StateCharts [10], VDM, JSD.
The remainder of this paper is structured as follows. Section 2 outlines the vision by describing how a user might use formal methods systems integrated into the environment. Section 3 describes the toolkit and how different systems are made to fit into and communicate within the system.

2 The Vision

A key part of the project is that the user interface provided by the environment be based on compound document technology. This paradigm has proved very successful in standard desktop applications such as word processors and spreadsheets. It allows the user to embed fragments of objects such as spreadsheets, or drawings, or charts within a container document. As the data in the imported object is changed, the container document is updated implicitly. Also, the user can update the data in the imported object directly from within the container document.

We believe that such a user-interface can be provided for formal methods users. This user-interface would provide all the benefits of a `literate proof', but in addition the document would be active. For example, current user interfaces to theorem proving tools are typically ``temporal'', that is, the presentation takes the form of a sequence of actions or commands passed to the underlying prover. In contrast, the traditional presentation of a proof has been as a ``spatial'' document: the user sees a structured two-dimensional object. The active document paradigm takes things one further: the presentation is still as a two-dimensional object, but the user can activate hypertext links to retrieve or edit previously hidden underlying information.

To illustrate the power of this approach consider the example of an engineer who wishes to present a proof that a circuit satisfies some invariant. Say the specification of the circuit consists of a diagram illustrating the connections of the components, a set of tables defining the behavior of some of those components, and some equations also specifying behavioral properties. The proof of the invariant is derived from a model checker and a theorem prover. The entire specification along with the proof of the invariant can be presented as a unified document. The circuit diagram can be an embedded object whose display is generated by a CAD tool. The tables can also be embedded objects. The equations are terms of the theorem prover. The invariant is a theorem that specifies the goal of the theorem prover. The use that is made of the theorem prover is to decompose the problem into subgoals, some of which may be trivial to discharge; others are reduced to a form where theie truth can be established efficiently by appeal to the model-checker. Even the proof of the theorem is an embedded object which can be displayed in many ways: as a high-level strategy with the details hidden, or as the set of low-level instructions to the prover. Finally, the document might conclude with a Q.E.D., which is itself an object which is the result of applying the proof steps to the specification and the results yielded by the model-checker.

All of these objects are linked together by the object technology. The circuit diagram can be linked so that it is automatically translated into a term to be passed to the theorem prover. The tables can also be linked to the theorem prover so that their data is transferred automatically.

As the engineer develops the specification and proof, the display of the components is maintained to reflect ongoing changes. For example, if the circuit diagram is edited, the proof of the invariant is no longer known to be valid. In this case, the conclusion would change from Q.E.D. to Not proven. If an attempt to re-establish the proof failed, the conclusion line would change to Proof failed.

2.1 Key elements of the environment

2.1.1 Requirement engineering and system abstraction tools
Although the notations and intentions which they support are often quite distinct, the problems of interfacing with structured and semi-formal descriptions are similar, whether they lie on the ``specification'' or ``implementation'' side of the development. Apart from the purely technical aspects of interacting with tools in order to extract information from them, there are a number of issues which must be addressed in order for them to play their role in this framework:
  • System evolution capture, typically in the form of transition tables, possibly enhanced by evolution rules for combining or manipulating tables
  • Predicate capture, either directly as logical formulae or implicitly by claim of language inclusion or refinement
  • Checks on internal consistency and completeness, etc, as required or implied by the formalism; among these may be issues of syntax, type correctness, completeness of entity definitions, reachability of states, existence of defined initial values, disjointness and/or determinism, coverage and totality, circularity detection. More satisfactory tools will come with guarantees of many of these properties, but it may well be incumbent on the user to ``sanity-check'' others before it is safe to use them in proofs or as models.
  • Recognition of entities and other declarations, including multiple viewpoints and partitioning.
2.1.2 Formal verification tools
The other side of the integrated framework is made up of back-end tools including model-checkers, theorem provers and proof-checkers, but also other tools which operate in the semantic domain, such as simulators and automated code-generators. Here, one trusts, the outputs will require much less sanity checking, but it is important to identify what internal properties they expect and rely on in the data supplied to them. Other important issues include
  • The construction of formal theories appropriate to the problem, or the embedding of the problem in existing theories
  • Verification of preconditions for the applicability of the tool
  • Definitions and the creation of the environment necessary to interpret inputs correctly
  • Transmittal of theorems and counter-examples, both as lemmas (justified by other tools) and as results of an appeal to the tool.

2.2 Platforms

Although most work on formal methods is currently done on Unix machines, there is a clear trend in industry -- even for engineering activities -- away from Unix towards desktop-based machines running Windows NT. One goal of this project is to attain cross-platform interoperability. Thus presentation components may be run on a Windows desktop machine, while the core parts of the formal methods tools may be running as servers on a Unix machine. The ability for interfaces to run on Windows NT is a core goal of the project.

2.3 Tool Integration

Tool integration is not a problem unique to the formal-methods community -- it is universal. The emerging industry solution is compound document and object bus technology. The choice for the object bus is clear: CORBA [3] is the industry standard object bus for distributed environments. The compound document technology choice is more difficult. On the desktop, the dominant technology is clearly OLE (now ActiveX) from Microsoft [4]. This will be the initial direction the project will take. For other platforms there is no clear choice. OpenDoc [2] from Apple/IBM seems to have failed to gain acceptance. An emerging standard that has much more promise is JavaBeans, so we plan to consider this standard as well.

3 The Toolkit

The FMI toolkit will consist of two major subsystems providing encapsulation services and active document services, respectively. The Encapsulation Toolkit will allow a given formal methods system (FMS) to communicate on an object bus. The Active Document Toolkit will provide high-quality user-interface components for the encapsulated FMS.

Faceless tools, i.e., slave tools such as a BDD package that need no interactive user interface, and tools with their own GUI components, e.g., ObjectTeam, may only need the Encapsulation Toolkit. In contrast, the PVS integration will use both the Encapsulation Toolkit and the Active Document Toolkit.

The Active Document toolkit consists of two subsystems, one for constructing coarse-grain tools and the other for constructing fine-grain tools. Coarse-grain browsers present high-level graphical views of object collections and their relationships, e.g., theory libraries. Fine-grain browsers present low-level, mainly textual views of objects, e.g., individual theories, theorems, and proofs.

For example, the user-interface for a model checker might provide a language-sensitive editor for the textual description of a model, and a ``debugging tool'' for navigating from error output back to source input.

The underlying communications and control interface for all these tools will be CORBA [3] -- the industry standard object bus.

3.1 Encapsulation Toolkit

The Encapsulation Toolkit will provide the means whereby a given formal methods system can be made to communicate with other tools. Propositions form a core abstraction for the encapsulation. They are statements about systems expressed in a notation which must include the expressive power of all the tools in the framework. Because this is likely to be strictly greater than the expressive power of any single tool in the framework, we need to be able to translate propositions to and from tool-specific notation as part of the application of a tool to a problem. Defining these translations is part of the cost of integrating a tool into the framework.

Each tool to be integrated into the framework requires the provision of a callable interface for its control with respect to initialization, and the identification and execution of ``steps'' in its progress. This may, if the appropriate hooks are not directly available, require a veneer which stimulates the tool's original human-oriented interface.

In addition, a number of filters will be required to translate between the core representation and tool-specific notations and propositions, and to report back any anomalies discovered in terms appropriate to the tools in which the user chose to describe the problem.

The Encapsulation Toolkit abstracts the ``recipe'' for carrying out this process, and provides the basis common to the task of fitting a ``plug'' to a tool. The framework offers the corresponding ``socket'', which will provide the following services to support the interworking of diverse tools:

  • Provision of a primary knowledge base; the propositions, given, postulated or proven, about a system, together with the dependency relationships between them form the fundamental basis for the FMI's integration of tools. The knowledge repository maintains this information, and provides a variety of access mechanisms reflecting the needs of the framework and user specified groupings.
  • Interaction with, through the coarse-grain components of the Active Document Toolkit, and management of requirement models and formal models / theories in store, including information fusion and exploitation of results from multiple sources.
  • Use of common intermediates, and the fine-grain components of the Active Document Toolkit, to avoid need for any-to-any translations and to facilitate extensibility.

3.2 Active Document Toolkit

The active document toolkit is the component that allows the user to create a graphical user interface to the integrated formal methods system.
3.2.1 Coarse-grain interfaces
Coarse-grain tools are those that present a graphical interface to the high-level components of a system, e.g., proofs, and specifications. The coarse-grain toolkit will provide a high-level way of specifying the user interface using common components. We have identified the following components as ones that will be useful for formal methods systems. However, we do not wish to duplicate work. Where existing coarse-grain tools produced by other formal methods workers are appropriate or can easily be integrated, we will provide integrations for them rather than build new components.
  • Graph editor/browser. This is a component that is common to many formal systems user-interfaces. In the model-checking world, graph editors are used to construct the input model. In the theorem proving world, graph browsers are required for displaying and browsing dependencies between components.
  • Proof manager. As a user of a theorem-proving system develops a proof, many attempts may be made before a successful proof is constructed. Fragments of unsuccessful attempts may however be of use in the successful one, so the user will want to maintain a searchable record of these attempts. One role of the proof manager is to provide services to help maintain these multiple attempts. Another service is provision of facilities for better proof understanding. These include hiding of information not relevant to the current line of attack, and display of dependency information.
  • Table editor. A useful notation for specifying requirements and behaviors is the tabular notation [11,15]. Tables are easy to understand and develop, and large amounts of information can be concisely and clearly presented. A generic tabular representation editor is a necessary coarse grain toolkit component.
  • Library browsers. Most formal-methods systems have a notion of a library that contains collections of previously established theorems, tactics and decision procedures. One of the most essential skills a user must develop is the ability to use the library effectively. A good library browser will enable effective use by providing powerful query services.
3.2.2 Fine-grain interfaces
Fine-grain tools are those that present an interface to the lower level components of a system. For example, a fine-grain tool for PVS would provide access to low-level objects such as individual terms in the specification language. To the end user of the FMI, these lower level objects are typically represented textually. The toolkit will provide the ability to specify many interfaces to these objects including translation, editing and display, scripting, and attribution.

The technology that will be used for constructing fine-grain tools is GrammaTech's Synthesizer Generator (SG) [19,20,21]. This is a system for constructing language-sensitive user-interfaces. The SG takes a definition of a language, in terms of its abstract and parsing syntax, display rules, and attribution equations to define context-sensitive relationships; and produces a language-sensitive editor for that language. The SG runtime components manage the user-interface for the generated editor. This includes providing incremental parsing, editing, display, scripting, and attribution services. Standard text editing functionality, as well as structure editing, are supported. The attribution mechanism of the SG provides powerful translation services. These can be used to define the translation of terms in one language into terms in another. Programmatic control of running editors is provided by the SG's scripting languages, Scheme (which has full access to the attributed tree structures maintained in the editor), and Tcl (which does not).

4 Related work

4.1 PCTE

Previous work on integrating tools within a common environment has included work on PCTE [18]. Industrial interest in PCTE seems to have died out however as less ambitious mechanisms such as CORBA have gained momentum. One of the problems with PCTE was that it was too ambitious: it promised too much to too many kinds of tool. We hope to avoid repeating the PCTE errors by focusing more closely on our application domain, and by leveraging work on industry standard solutions.

4.2 Centaur

The research most similar in spirit to this work proposal is unquestionably that of Gilles Kahn and his associates at INRIA and Cambridge University [23]. Kahn's group has used the Centaur system [6] to build user-interfaces for HOL [9], Isabelle [16], and Felty's prover [8]. Our proposed approach has some similarities to the one that Kahn's group followed; however, there are a number of differences. The chief difference is that this project attempts to build a coherent theorem-prover-independent framework -- the FMI -- whereas Kahn's group built three separate interfaces in an ad hoc fashion (albeit while building up a library of reusable modules.) Other differences will stem from the fact that Centaur and the Synthesizer Generator each provide extensive support for certain functions that the other system does not.

Some of the noteworthy differences between Centaur and the described system that will bear on our relative approaches to building interfaces are:

  • Control integration model. This project will base the FME on CORBA and OLE, industry standard frameworks for supporting network-transparent, plug-and-play, compound-document interfaces, whereas the Centaur group used SophTalk, its own experimental broadcast message server.
  • Data integration model. Centaur has a detached (unattributed) tree server, whereas the Synthesizer Generator has an integral (attributed) tree module. However, because Centaur does not exploit this architecture in its theorem-prover interfaces, we will probably not imitate it.
  • Display model. Centaur's prettyprinting meta-language PPML supports 2-D display (e.g., $\sqrt{x_i^2}$), whereas the Synthesizer Generator does not.

5 Conclusion

We view this project as a strategic infrastructure initiative with broad application for the formal methods community. The ultimate goal of a universal integration framework is ambitious, but we believe it is achievable. This project is going to provide a firm foundation for future integration, with an initial combination of tools which itself a potent agent in the targeted application domain. We anticipate that the resulting product will be the sandbox of choice for formal methods researchers who want to achieve real industrial use.

References

[1] J.R. Abrial, M.K.O. Lee, D.S. Neilson, P.N. Scharbach, and I.H. Sorensen. The B-method. In S. Prehn and W.J. Toetenel, editors, VDM'91: Formal Software Development Methods, Volume 2, volume 552 of LNCS, pages 398-405. Springer-Verlag, 1991.

[2] Apple Computer, Inc. OpenDoc Technical Summary. April 14, 1994.

[3] Ron Ben-Natan. CORBA -- A Guide to Common Object Request Broker Architecture. McGraw-Hill, 1995.

[4] K. Brockschmidt. Inside OLE, Second Edition. Microsoft Press, 1995.

[5] Cadre, Inc., Providence, RI. OMT Users Guide.

[6] The Centaur 1.2 Manual. Jacobs, ed, March 1992. Available from INRIA -- Sophia-Antipolis.

[7] David L. Dill, Andreas J. Drexler, Alan J. Hu, and C. Han Yang. Protocol verification as a hardware design aid. In IEEE International Conference on Computer Design: VLSI in Computers and Processors, pages 522-525. IEEE Computer Society, 1992.

[8] A. Felty. Specifying and Implementing Theorem Provers in a Higher-Order Logic Programming Language. PhD thesis, University of Pennsylvania, August 1989.

[9] M. J. C. Gordon. HOL: A proof generating system for higher-order logic. In VLSI Specifications, Verification and Synthesis, pages 73-128. Kluwer, 1988.

[10] D. Harel. Statecharts: a visual approach to complex systems. Science of Computer Programming, 1987.

[11] C. Heitmeyer, A. Bull, C. Gasarch, and B. Labaw. SCR*: A Toolset for Specifying and Analyzing Requirements. Technical report, Center for High Assurance Computer Systems, Naval Research Laboratory, Washington, DC 20375, 1995.

[12] Leslie Lamport. The Temporal Logic of Actions. ACM TOPLAS, 16(3):872-923, May 1994.

[13] K.L. McMillan. Symbolic Model Checking. Kluwer Academic Publishers, Boston/Dordrecht/London, 1993.

[14] Sam Owre, John Rushby, Natarajan Shankar, and Friedrich von Henke. Formal Verification for Fault-Tolerant Architectures: Prolegomena to the Design of PVS. IEEE Transactions on Software Engineering, 21(2):107-125, February 1995.

[15] Parnas, D. and Madey, J. Functional Documentation for Computer Systems Engineering (Version 2). Technical report, Telecommunications Research Inst. of Ontario (TRIO), McMaster University, Hamilton, Ontario, 1991.

[16] L. C. Paulson. Isabelle: The next 700 theorem provers. In P. Odifreddi, editor, Logic and Computer Science, pages 361-386. Academic Press, 1990.

[17] Lawrence C. Paulson. Isabelle: A Generic Theorem Prover. Number 828 in LNCS. Springer, 1994.

[18] PCTE Project. PCTE: A basis for a portable common tool environment. In Directorate General XIII - Telecommunications, Information, Industries, and Innovation, editor, Esprit '86 - Results and Achievements. North-Holland, 1987.

[19] T. Reps and T. Teitelbaum. The Synthesizer Generator: A System for Constructing Language-Based Editors. Springer-Verlag, NY, 1988.

[20] T. Reps and T. Teitelbaum. The Synthesizer Generator Reference Manual. Springer-Verlag, Third edition, 1988. First edition, Cornell University, August, 1985; Second edition, Cornell University, June, 1987.

[21] T. W. Reps. Generating Language-Based Environments. The MIT Press, Cambridge, Massachusetts, 1984.

[22] A.W. Roscoe. Model-Checking CSP. In A Classical Mind, Essays in Honour of C.A.R. Hoare. Prentice-Hall, 1994.

[23] L. Thery, Y. Bertot, and G. Kahn. Real theorem provers deserve real user-interfaces. In Proceedings of the 5th ACM SIGSOFT Symposium on Software Development Environments, Tyson's Corner, Virginia, December 1992.