Chapter 18

And What About…

This last chapter provides a brief summary of some of the topics that this book does not cover. These are some quite important influences upon software design that either don't really fit into the narrative of this book, or that don't contribute quite enough to the theme of emerging software design knowledge to be included. And inevitably, there will also be some topics that don't appear here that others may feel should do.

The structure of this chapter is a little different too, in that the discussion of empirical knowledge is embedded in the sections, rather than appearing in a separate section. There is also no final ‘take home’ section, since the descriptions are really too short to merit this.

18.1 Open source software (OSS)

This has had occasional mention in the preceding chapters. One description for the way that open source software (OSS) is developed is that this is a reactive development process, whereby a particular application evolves largely in response to what its developers perceive as being needed. It can also be viewed as being a form of incremental development. Either way, it has become a major source of software that is used worldwide for an ever-increasing number of applications. For most users, the most important characteristic is usually that it is free to download and use, but for some the availability of the source code and the freedom to change it opens up important opportunities.

The ramifications of OSS can be somewhat theological in nature, but most would consider that its roots lie back in the pioneering work of Richard Stallman's Free Software Foundation and the GNU project (GNU is a recursive acronym for ‘GNU's Not Unix’!). Two factors accelerated its emergence in the 1990s: one was when the Linux operating system kernel came together with GNU to provide a widely distributed free operating system for personal computers; and the other was the emergence of the world-wide web as a platform for ready distribution and downloading of software applications. The establishment of the SourceForge web site as a trusted platform for hosting and distributing open source software has also provided an important element of confidence for users.

Our interest here however, does not lie in the complexities and philosophies of OSS definitions, licensing etc., or in the ever-growing corpus of widely used applications, but in the way that such software is developed. A relatively early study by Wu & Lin (2001) suggested that while the development process usually adopted by OSS projects was incremental, it also took a form that was very different to the way that this type of development process was used in more ‘conventional’ development practices. In particular, they observed that projects could work in a great variety of ways, ranging from the use of highly democratic processes for making decisions through to a ‘dictatorship’ model.

The key characteristic that they identified lay in the organisation of the development process, and particularly the massive co-ordination effort required, rather than in the development steps themselves. Instead of the ideas from a small number of decision-makers (designers) directing the efforts of many developers, the open source model involves a co-ordinator, or a team of co-ordinators, drawing together the ideas of many participants. Inevitably, this simple model does not come near to describing all of the variations. Many projects have begun with a single developer (as did Linux), and it was the later extension and evolution that was influenced by the many.

OSS development tends to make extensive use of code reviews, although because teams may well be globally distributed, these are not necessarily in the form described in Chapter 17. OSS projects can also demonstrate quite complex social structures that go well beyond the relatively simple forms outlined above. And joining such a project can be a complex process in terms of the group dynamics involved, as is illustrated by the systematic review reported by Steinmacher et al. (2015).

With the potential to involve so many contributors, open source development requires quite comprehensive support for change management (examples of tools include git and subversion, but new ones continue to emerge). The comprehensive and detailed records of open source projects that are readily available from these code management systems make them a valuable resource for empirical studies. However, the use of OSS data may also form a source of bias for their findings where these relate to the processes involved in design and development.

So, in terms of contribution to design knowledge, OSS has broadened our understanding of how to organise potentially large and distributed teams, as well as encouraging automation of change records. It has certainly demonstrated the benefits of having many pairs of eyes scrutinising design ideas too.

18.2 Formal description techniques (FDTs)

The notations discussed in the previous chapters have largely been both diagrammatical and systematic, with most of them lacking formal mathematical underpinning (the main exception being the statechart). And as noted, the advantage of diagrams is that they can be easily modified and can be used informally—reflecting the way that many designers work.

While informal syntax and semantics may be useful when exploring ways of addressing an ISP, the lack of rigour can present problems for such tasks as verification. And for design problems that fall into such categories as safety-critical, or that are concerned with financial transactions, an inability to formally and rigorously check for correctness and completeness is a significant disadvantage. Most of us, when travelling on an aircraft that is using ‘fly by wire’ technology to manage the control surfaces, would like to think that the software used for this purpose had undergone very thorough and comprehensive examination and assessment to ensure correct functioning in all situations!

From the early days of software engineering, researchers have therefore tried to harness the power of mathematical notation to aid with modelling such activities as software specification, design and testing. Unfortunately, while mathematical rigour may be achievable when solving many forms of WSP, the characteristics of ISPs make it difficult to employ mathematical reasoning very widely, particularly in a context that exhibits the combined characteristics of ISPs and of software that we discussed in Chapter 1. So although the resulting formal description techniques (FDTs) are sometimes referred to as formal methods, in practice they tend to have very powerful representation parts while providing very little guidance on how to use them for design.

While FDTs can be used almost anywhere in the software development process and life-cycle, the roles for which they are particularly well suited are:

  • specifying system properties for requirements specification (black box); and

  • specifying the detailed form of a design model in the detailed design stages (white box).

Set against these strengths, their use does require some mathematical training, and the amount of time needed for their use may create a substantial overhead, even with the support of tools. The variety of notations used, and their relative unfamiliarity, may also have made them less attractive to project managers. Incorporating ideas about software architecture has also proved to be challenging.

In a fairly early study of their use, Hall (1990) observed that, despite various limitations upon how and where they could be deployed, FDTs had been used on a reasonably large number of real projects, particularly those with safety-critical features. However, they also seemed to have been most successful when used on key elements of applications, rather than when being used to develop the complete application.

Little seems to have changed since then. FDTs have become part of the software engineer's toolbox, albeit a rather specialised part, to be used as and when appropriate. Researchers have also emphasised the use of ‘lightweight’ forms, seeking to use the power of mathematics where it can be most useful. However, there is little evidence that industry has embraced their use very widely, and indeed, they do sit awkwardly alongside ideas such as agile development.

This section provides a very brief introduction to one of the best-known model-based specification forms, the Z language (pronounced as ‘zed’). When creating a model-based specification, the specifier uses various mathematical forms to construct a ‘model’ of the application and then uses this to help reason about its properties and behaviour.

Z was created by J-R Abrial in the Programming Research Group at the University of Oxford; it underwent major development in the 1980s and became established in the 1990s. The text by Spivey (1998) is widely regarded as providing the authoritative definition, but many other, more introductory, texts are also available. An ‘object’ version of the formalism is described in Smith (2000).

Z exhibits an unfortunate characteristic that typifies many formal specification forms, which is the use of mathematical symbols that are not easily reproduced in everyday typefaces. (In fairness, most are quite easily drawn on a whiteboard.) The only real process involved in its use is one of refinement.

The basic vocabulary of Z is based upon three elements: sets, set operations and logic, each with their own notational features.

  • Sets. A set is a collection of elements (or set members), in which no sense of ordering of elements is involved. In Z, there are some pre-defined set types, such as ℤ, the set of all integers; and ℕ, the set of all natural numbers (positive integers and zero). In addition, the user may specify their own set types, and for many problems is likely to want to do so. The convention is to employ all upper case italic letters for the identifier of a set type, with set members being described in text or enumerated.

    Examples are:

        [PERSON]
        RUNWAY ::= north | centre | south

    The first example is self-evident (PERSON is simply the set of all people), while the second tells us that an airport has three runways. As in this example, the variables are in lower case.

  • Set operations. These are largely as would be expected. Two that may be less familiar are the use of # that returns the number of elements in the set, and ℙ for the powerset of a set (the set of all subsets of that set).

  • Logic. Operations and relations in a specification are usually expressed by using the standard operators of predicate logic. The logical operators can then be combined with the set elements and used to describe characteristic rules affecting a set, usually in the general form of:

    { declaration | predicate • expression }

    where the declaration introduces the variables of concern to the rule; the predicate constrains their values; and the expression describes the resulting set of values.
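For instance, the comprehension below (an illustrative example of ours, not drawn from the CCC model) declares n as a natural number, constrains it to values below 5, and collects the squares of those values:

    { n : ℕ | n < 5 • n ∗ n }

which denotes the set {0, 1, 4, 9, 16}.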

A key component of Z is the schema. The role of this is to describe a system operation, and it is usually drawn as an open box that has three elements:

  • the schema name;

  • the signature, which introduces any variables and assigns them to set theoretic types;

  • the predicates that form the state invariants, describing how the elements of the signature are related and constrain the operation, and that are described in terms of preconditions and postconditions.

A very simple example of a schema that is related to the activities of the CCC is provided below.

[Z schema box: ReserveCar]

Here, the elements used are:

  • the name of the schema, which is ReserveCar;

  • the signature that describes the ‘before’ and ‘after’ states of the set of drivers (CCC members who are currently using a CCC car), with the convention being that the primed identifier denotes the state of the set after the operation; the identifier d? denotes an input that is required as part of the schema;

  • the predicate describes the operations required to reserve a car, which are as follows:

    – firstly, the number of drivers should be fewer than the number of cars available (precondition 1);

    – the new driver should not already be driving a car (and hence is not in the set of existing drivers) (precondition 2);

    – the new set of drivers will be comprised of the original set together with the new driver (postcondition 1);

    – the number of drivers after the operation must now be either less than the number of available cars, or equal to it (postcondition 2).
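Drawing these elements together, the schema can be sketched in an approximate linear form as shown below; the given-set name MEMBER and the global set cars are our assumptions here, since their declarations do not form part of this example:

    ReserveCar ≙ [ drivers, drivers′ : ℙ MEMBER; d? : MEMBER |
                   #drivers < #cars ∧
                   d? ∉ drivers ∧
                   drivers′ = drivers ∪ {d?} ∧
                   #drivers′ ≤ #cars ]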

One advantage of this type of formal specification is that we can readily reason about it, and clearly there are some shortcomings we can identify fairly quickly. In particular, the set of cars implicitly includes all of the cars available at the site. But of course, some may be unavailable for other reasons, such as being serviced, and so we should probably use a subset of ‘active’ cars, together with another schema that keeps that subset up to date.
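One way to make such reasoning more concrete is to ‘animate’ the schema in an ordinary programming language. The sketch below is our own translation, not part of Z itself; it assumes that drivers and cars are modelled as simple sets, and that a violated precondition simply means the operation is not applicable in the current state.

```python
def reserve_car(drivers: set, cars: set, d: str) -> set:
    """Animate the ReserveCar schema: return drivers', the 'after' state."""
    # Precondition 1: the number of drivers is fewer than the number of cars.
    assert len(drivers) < len(cars)
    # Precondition 2: the new driver is not already driving a car.
    assert d not in drivers
    # Postcondition 1: drivers' is the original set plus the new driver.
    drivers_after = drivers | {d}
    # Postcondition 2: the drivers must not outnumber the available cars.
    assert len(drivers_after) <= len(cars)
    return drivers_after

# A member reserves a car while one of two cars is still free.
print(sorted(reserve_car({"ann"}, {"car1", "car2"}, "bob")))  # ['ann', 'bob']
```

Animations of this kind can only probe particular states, of course, whereas the mathematical form permits reasoning about all states, but they do offer a quick check that the preconditions and postconditions behave as intended.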

From a design perspective, Z probably does not particularly assist with the problems of producing a design solution for an ISP, beyond helping to manage completeness and consistency. In many ways it can make more of a contribution to the issue of clarifying the requirements that an application needs to meet (that is, the specification). One of the characteristics of an ISP is the lack of a definitive specification, and while Z might not be able to address all the aspects of that lack, it can help with clarification.

Few empirical studies appear to have been performed on the use of FDTs, particularly with regard to their use in industry. This in turn makes it difficult to assess what features of their use are particularly valued by those who adopt them. However, the survey of use reported in Woodcock, Larsen, Bicarregui & Fitzgerald (2009) does provide some interesting illustrations of the use of formal techniques in industry as well as reviewing the domains where there has been adoption. As might be expected, many of the examples do fall into what might be termed as domains requiring ‘high-integrity software’.

18.3 Model-driven engineering (MDE)

Model-driven engineering (MDE) shares some characteristics with the FDTs discussed in the previous section. In particular, it would seem that a condition for successful adoption may well be for it to contribute to part of a project, rather than be used for the whole of it.

In Part II of this book, we focused strongly on the issue of producing design models, whether these were relatively informal ones, largely based on sketches, or more formalised, such as those using the UML notations. MDE is focused on making use of such models for automating various tasks, such as the generation of code, but also such issues as evolution and testing.

As noted by Schmidt (2006), MDE has built upon earlier technologies, such as CASE (Computer Assisted Software Engineering), and in the process has moved from using more generalised tools to more specialised ones, such as domain-specific modelling languages (DSMLs), and generators that can create source code or simulations.

MDE can also be employed with the UML as the modelling language, and the OMG (Object Management Group) provides and supports standards for this.

The use of MDE does appear to have been the subject of more critical analysis than FDTs, with examples such as the relatively early systematic review from Mohagheghi & Dehlen (2008), looking at the use of MDE in industry. At that point in time they did note the lack of suitably mature tools, and observed that they found “reports of improvements in software quality and of both productivity gains and losses, but these reports were mainly from small-scale studies”.

A later survey by Whittle, Hutchinson & Rouncefield (2014) found quite widespread use of MDE, and observes that “many of the lessons point to the fact that social and organisational factors are at least as important in determining success as technical ones”. They also note that: “the companies who successfully applied MDE largely did so by creating or using languages specifically developed for their domain, rather than using general-purpose languages such as UML”. An interesting conclusion is that the real benefits to a company didn't stem from things like code generation but “in the support that MDE provides in documenting a good software architecture”. One of the tips that they offer for successful use of MDE is to “keep domains tight and narrow”, suggesting that it is easier to create DSMLs for “narrow, tight domains”, rather than attempting to use MDE for broad areas of application. (There are some parallels with FDTs here, where success seems to have often been associated with quite narrow (but important) domains such as telecoms.)
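By way of a deliberately toy illustration of the generative idea, a small domain-specific ‘model’ can be used to generate executable code. The model format, the Turnstile example and the generate function below are all our own inventions, not taken from any MDE toolset:

```python
# A tiny "model" of a domain (state names and transitions) that
# drives code generation -- the essence of the MDE pipeline.
model = {
    "machine": "Turnstile",
    "states": ["Locked", "Unlocked"],
    "transitions": [
        ("Locked", "coin", "Unlocked"),
        ("Unlocked", "push", "Locked"),
    ],
}

def generate(model: dict) -> str:
    """Generate Python source from the model (the model-to-text step)."""
    lines = [f"class {model['machine']}:",
             "    def __init__(self):",
             f"        self.state = {model['states'][0]!r}"]
    for src, event, dst in model["transitions"]:
        lines += [f"    def {event}(self):",
                  f"        if self.state == {src!r}:",
                  f"            self.state = {dst!r}"]
    return "\n".join(lines)

# Executing the generated source yields a working state machine class.
namespace = {}
exec(generate(model), namespace)
t = namespace["Turnstile"]()
t.coin()
print(t.state)  # Unlocked
```

Real MDE toolchains naturally operate at a far larger scale, but the essential pipeline, in which a narrow domain-specific model is transformed into a conventional implementation, is the same, and the example shows why a ‘tight and narrow’ domain makes the generator easy to write.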

And, where Moody (2009) criticises the visual aspects of the UML, Whittle et al. (2014) raise related concerns from an MDE perspective.

So from the perspective of this book, MDE does have some benefits when used appropriately, particularly with regard to helping with understanding and development of architectural features. Rather as FDTs require some familiarity with mathematical formalisms, successful use of MDE does need domain knowledge, but it is interesting to note that Whittle et al. (2014) also observe that the models themselves need not necessarily be formal ones, at least in the early stages of developing a design.

18.4 And the rest…

This book began with a discussion of the challenges posed by ill-structured problems such as those encountered during software development. The focus throughout this edition has been on the contribution that different techniques and forms make to addressing these challenges. And not only do ISPs have no stopping rule; a book like this doesn't really have one either. There are many more approaches to modelling and designing software than those covered in the existing chapters, and many variations that arise too.

This last chapter has looked briefly at a small selection of some of the other techniques that software engineers have devised for modelling and designing software, over and above those covered in the main chapters. If your favourite technique has been omitted or overlooked, my apologies.

And of course, vast numbers of software systems are developed without making use of formal modelling or of any form of systematic design process (not that these are always developed well of course). What matters though is that anyone seeking to design software applications should have a good understanding of the medium and of the purposes of the application they are developing. And acquiring the relevant knowledge schema needs an understanding of the topics covered here, regardless of how they are eventually employed…