Market Processes

A New Foundation for Software Engineering



by Brad Cox
George Mason Program on Social and Organizational Learning

Unpublished draft
1993 ACM SIGSOFT Conference
New Foundations for Software Engineering.

This article received extremely negative reviews and was emphatically rejected. I interpret the rejection as confirmation of the thesis of a paper that argues that computer science and software engineering are neither science nor engineering and that to become so they must undergo a paradigm shift of comparable magnitude to the scientific revolution. That is, they must discard the programmer-centric (process-oriented) attitudes of software engineering to date and adopt the user-centric (product-oriented) attitudes of mature disciplines such as manufacturing.

Abstract

After a quarter-century of deploying ever more sophisticated programming languages and methodologies, the software engineering community is still the cargo cult of the information age; a society of generalist craftspersons that advances primarily by acquiring faster computing technology (and the latest trendy fashions) from its specialized neighbors in the hardware engineering community.

The software crisis shows no sign of yielding to the established paradigms of software engineering and computer science. A new paradigm is needed in which we recognize that this crisis does not originate in a deficiency in software development technology or methodology. It originates in the easy-to-replicate nature of the very goods we produce.

The tangible goods of the manufacturing age are made of atoms. Since these abide by physical laws of conservation of mass, energy and spin, scarcity has emerged as the traditional basis for commerce. However the electronic goods of the information age are made of bits that can be copied in nanoseconds and transported at literally the speed of light. Since conservation laws do not apply for bits, this undercuts the market processes that support the commercial structure so characteristic of hardware engineering today; the very structures upon which hardware engineering has achieved the maturity to which software engineering and computer science can only aspire.

Superdistribution is a market mechanism for electronic goods now being pioneered in Japan. Instead of treating ease-of-replication as a liability to be laboriously prevented with copy protection technologies and legal or moral restrictions, superdistribution treats it as the asset upon which a new foundation for software engineering could be based. As with superconductivity, superdistribution lets information flow freely, without resistance from either copy protection or piracy. Unlike the low-tech property rights mechanisms that are widespread already (shrink-wrap software, license servers, dongles, demoware, shareware, etc), superdistribution allows producers and consumers at every level of a specialized labor hierarchy to buy and sell, not just software applications but information age goods of any granularity, just as in the advanced industrial societies we live in today. Instead of trying to enforce scarcity as the traditional basis for commerce, superdistribution assumes hyper-abundance and bases commerce on invocation instead.

Introduction

"Of all the monsters that fill the nightmares of our folklore, none terrify more than werewolves, because they transform unexpectedly from the familiar into horrors. For these, one seeks bullets of silver that can magically lay them to rest. The familiar software project, at least as seen by the nontechnical manager, has something of this character; it is usually innocent and straightforward, but is capable of becoming a monster of missed schedules, blown budgets, and flawed products. So we hear desperate cries for a silver bullet--something to make software costs drop as rapidly as computer hardware costs do." Fred Brooks [1]

Fred Brooks' seminal paper, `No Silver Bullet; Essence and Accidents of Software Engineering', is profoundly discouraging to those who seek an end to the software crisis. It argues that the crisis is inevitable, arising from software's inescapable `essence' and not from `accident', a flaw in how we build it today.

This paper adopts a broader point of view from which optimistic alternatives to this bleak conclusion can be considered. It argues that solutions to the software crisis can indeed be found if we can muster the determination to build and deploy them. In the terms of this symposium's call for papers, it argues that a new foundation can indeed be erected upon which software engineering can advance beyond its present primitive status. But this new foundation will not emerge by extending the paradigm upon which software engineering is based today. It will emerge from a paradigm shift, a software industrial revolution, during which software engineering's core paradigm is overthrown and replaced.

Software engineering is currently based on a process-centric, language-oriented paradigm. Its core ideology is that solutions will emerge from continual refinements in programming languages and methodologies for fabricating software from first principles. The paradigm shift involves adopting a product-centric, object-oriented paradigm in which software is primarily assembled from pre-fabricated components.

This involves far more than merely adopting new tools and methodologies. It involves transforming an unspecialized socioeconomic order into a specialized one, and is thus comparable in difficulty to other massive social upheavals of history; the scientific revolution, the industrial revolution, and so forth.

This does not mean that process-centric refinements will cease to be important. W.E. Deming's profound contributions to manufacturing[2] show quite the opposite. Process refinements are always important, and pre-fabricated components must clearly be fabricated by someone. But software engineers too readily overlook the fact that manufacturing's recent enthusiasm for process-centric innovation was always in conjunction with, not to the exclusion of, the product-centric perspective that those who make their living by selling tangible goods take for granted.

The process-centric paradigm of software engineering and computer science is very different from the product-centric paradigm of other engineering domains. In other domains, producers are required to deliver standard products but are free to choose any process for building them that they consider to be the right tool for the job. By contrast, software engineering defines standard processes such as programming languages and methodologies, then hopes that standard products will automagically ensue. We extol the virtues of abstract data types at the expense of the concrete ones upon which the prosperity of individuals, companies and nations has been based since antiquity.

Of course, our process-centric orientation was not willful. It is an inevitable consequence of an essential property of the electronic goods we produce; ease of replication. Unlike the tangible goods of the manufacturing age, the intangible goods of the information age can be copied in nanoseconds and transported at literally the speed of light. Although Brooks' paper neglected to mention this property, it is even more essential than the four properties that he considered in detail. It undercuts the very socioeconomic mechanisms upon which other domains have achieved the maturity to which we aspire. The specialization of labor that distinguishes engineering within specialized societies from the craftsmanship of unspecialized ones is based on the ability to own, buy and sell goods and services that are intrinsically hard to copy.

No Silver Bullet?

The subtitle of Brooks' paper originated in the Aristotelian distinction between essence, "the difficulties inherent in the nature of software", and accident, "those difficulties that today attend its production but are not inherent". His conclusion derives from considering "the inherent properties of this irreducible essence of modern software systems: complexity, conformity, changeability, and invisibility." This list does not include the essential property that will be the subject of this paper; ease of replication. It also excludes another essential property, single-threadedness, that I'll mention briefly in the next section.

I've argued elsewhere[3] that software development is not a mature engineering discipline but a pre-engineering craft analogous to cottage-industry manufacturing and a pre-scientific activity like Ptolemaic astronomy. Software engineering's emphasis on processes for fabricating software products, as distinct from emphasizing the products themselves, has clearly provided useful language and methodological improvements. But it has not provided, and seems increasingly unlikely to do so, a fundamental solution to the software crisis. Brooks made the same point when he argued that technology will not provide a silver bullet for the software crisis within the foreseeable future.

However silver bullets are actually common in human history. They were not new technologies but new paradigms; new exemplars for understanding a problem. Copernicus used one to eliminate the crisis in Ptolemaic astronomy. His bullet was not a tool for computing epicycles faster. It was a paradigm shift to a heliocentric model of the heavens that provided a new foundation upon which a true science of astronomy could be erected.

The viewpoint of this paper originated in Thomas Kuhn's book, The Structure of Scientific Revolutions[8]. Kuhn argued that science does not advance incrementally, by layering new knowledge upon old. It advances discontinuously, in tumultuous periods in which established paradigms are destroyed and new ones erected. During long stagnant periods of `normal science', scientists are not engaged in exploring new frontiers. Except for rare individuals like Copernicus, most scientists spend their careers doing what Kuhn calls `puzzle solving'; confirming an established paradigm by testing it against experimental observation of nature. A crisis arises if observations cannot be reconciled with the established paradigm. This may trigger a revolution in which the established paradigm is overthrown and replaced.

However even the most revolutionary upheavals occur one evolutionary step at a time. Although individuals can undergo a paradigm shift in milliseconds, it takes much longer, a generation or more, for an innovation to diffuse through a society (Figure 1). This diffusion of innovation `time constant' of a half-century or more may explain the pessimistic viewpoint of Brooks' paper, which had a much shorter time frame in mind:

"But, as we look to the horizon of a decade hence, we see no silver bullet. There is no single development, in either technology or in management technique, that by itself promises even one order-of-magnitude improvement in productivity, in reliability, in simplicity. In this article, I shall try to show why, by examining both the nature of the software problem and the properties of the bullets proposed."

Figure 1: Diffusion of a paradigm shift through a large population tends to follow a diffusion curve with a characteristic time-constant of a generation or more (50-100 years).

This is why this paper does not deny Brooks' analysis, but complements and extends it. The apparent disagreement is my assertion that a fundamental solution to the software crisis, a `silver bullet' in Brooks' terminology, can indeed be found if we muster the determination to do so. We agree that this solution will not be a mere methodology or technology, that it will emerge over a broader horizon than `a decade hence' and that it will involve far more than `the bullets proposed'.

Common Thread of Control

This paper will gloss over another fundamental property that was also omitted from Brooks' list; single-threadedness. This underlies our reluctance to include other programmers' objects in our applications. Any object can endanger all other objects if it misuses the machine's control thread. This control thread is a shared global resource that software engineers, unlike other engineers, must continually share and protect against misuse. This raises a perpetual obstacle to reusing code the way advanced engineering domains reuse the components of other members of their society.

However, although this obstacle clearly makes the software engineer's job harder, techniques like exception handling, multi-tasking, distributed processing and trellis architectures[4] already exist that are capable of controlling part of the resulting complexity. I've glossed over this difference, not because it is unimportant, but because the lack of robust market mechanisms is even more essential. Ease of replication inhibits us from using the specialization of labor that mature engineering domains rely on to control complexity, distributing it across time and space by encapsulating it within each other's products.

Chicken versus Egg

These two quotations exemplify why I say software engineering is based on a process-centric paradigm to the exclusion of a product-centric focus:

"Process improvement is central to the Software Productivity Consortium's long-term mission to forge significant advances in software engineering. In this view, process is the integrating "glue" with which distinct methods and tools can be implemented to address the specific needs of member companies." SPC Quarterly Newsletter
"The recently-formed ANSI C++ committee, X3J16, has the task of standardizing the C++ language. Part of this is to specify zero or more standard libraries. Which libraries become part of the standard is still an open question, and will not be settled soon. So for the next year or two, you are on your own." Internet NetNews Article

The relationship between process and product is an intimate one, so much so that taking the pro-product rhetoric with which I opened this topic to its logical extreme would be as bad as the pro-process orientation of these quotations. Kuhn's paradigm shift rhetoric has served its purpose by exposing the process-centric orientation of these quotes. So from here on I'll dispense with this rhetoric and concentrate on avoiding the opposing horn of the chicken versus egg dilemma. An exclusively product-centric paradigm is clearly just as indefensible as an exclusively process-centric one. Neither process nor product comes first since process is to product as chickens are to eggs (and vice versa).

Is Software Engineering?

Now let's consider the kinds of human organizations that can be said to do `engineering'. Since my purpose is not to find a narrow (exclusive) definition of this term that its diverse claimants might agree on, I shall adopt an inclusive definition so broad that it includes almost everything. By `engineering' in this inclusive sense, I'll include anyone engaged in transforming incoming products (raw materials) to produce new products. This inclusive definition will help to focus attention on specialization of labor as mediated by market processes as the key difference between primitive (non-specialized) and advanced (specialized) societies.

Thus engineering involves acquiring raw materials from nature or from other members of a society and transforming them into higher-level products for others to acquire. In an advanced society, this goes on at every level of a deep hierarchy of producers and consumers. For example, mining is an engineering process in which incoming products are acquired from nature and transformed into ore. Refining is an engineering process in which ore is purchased from miners and transformed into refined metals. Computer manufacturing is an engineering process in which silicon chips are purchased from a market in electronic components and transformed into higher-level consumer goods.

By my inclusive definition, an aborigine basket-maker who cuts reeds on the river bank to weave into baskets is also an engineer, but of the primitive variety that customary definitions of this term would exclude as `mere' craftsmanship. On the other hand, software end-users buy shrink-wrap software from a store and assemble it to make a personalized desktop publishing solution. This is also engineering by my inclusive definition, but of the advanced variety that software engineers would exclude by arbitrarily adopting a new definition that excludes non-programming end-users.

Notice that the conventional definition of software engineering includes the primitive variety of engineering that we'd normally exclude as mere craftsmanship. And it excludes the advanced variety that hardware engineers would enthusiastically endorse.

Consider a financial analyst who assembles a desktop publishing engine from a generic personal computer by buying shrink-wrapped word processors and spreadsheets, and then uses this engine to process a Dow Jones stock quotation data-feed to produce financial articles for publication. Isn't this analyst just as much engaged in advanced engineering as a refinery engineer who builds a refinery to process petroleum? Isn't a Smalltalk programmer who assembles classes from a class library to build such an application doing engineering in an equally advanced sense of the term, although at several levels lower in a specialized labor hierarchy? And isn't a C++ programmer who built a similar application, but with components fabricated solely for this application, engaged in the same unspecialized kind of engineering as the basket-weaver's hand-craftsmanship?

Paradoxically, we've defined software engineering to mean the organization of unspecialized societies. We've perversely excluded the style of engineering that is widespread within the non-programming end-user community.

Of course, this perverse situation was neither capricious nor malicious. It was a consequence of the fact that electronic products are so ephemeral that it has never been obvious how to treat them as `products' in the sense that ore, metals, and silicon chips are products that can be robustly bought and sold by the copy.

The key to resolving the software crisis hinges on a resolution to this long-standing matter of economics, which is to say human motivation. It is not a matter of hardware or software technology except insofar as technology may play a role in implementing a solution.

Information Age Economics

Electronic goods are not found in nature. They are invariably produced by people. Reusable software components can only be produced by the individual who uses them as in primitive unspecialized societies, or by other members of the society as in the advanced industrial societies we're so familiar with today.

Clearly intangible electronic goods like shrink-wrap applications, stock price quotations, and Smalltalk classes are very different from tangible goods like baskets, reeds, oil refinery machinery, and personal computer hardware. Although the differences are significant from a techno-centric perspective, they are immaterial from the perspective of human motivation. Those who produce electronic goods and services know that production of such goods will consume capital, labor and knowledge just as tangible goods do[5]. Reasonable people don't make such investments in the absence of a robust mechanism capable of assuring a positive return on their investment.

Once a robust incentive structure is provided, the self-organizing systems that we call markets can emerge. Market processes can then organize and coordinate the distributed decision-making of independent self-interested individuals by what the economist, Adam Smith, called `The Invisible Hand'. Markets eliminate complexity by dispersing it across time and space, encapsulating and hiding it within goods produced by others.

Market Mechanisms

The market mechanism for the tangible goods of the manufacturing age didn't require any particular attention. The hard-to-copy nature of tangible goods made the traditional pay-per-copy mechanism the natural choice. But the market mechanism is very much an issue for information age goods that can be copied in nanoseconds and transported at literally the speed of light. This so thoroughly undercuts the pay-per-copy mechanism of traditional markets that there is considerable dispute as to whether a robust supply of pre-fabricated information age goods is even possible.

This dispute is generally engaged in under the name `Intellectual Property', an unfortunate term that lumps property that resides externally, on computers and networks, with true intellectual property which resides internally, in the mind. There is considerable room for dispute as to whether true intellectual property can or even should be `protected'. However electronic property need not be enmeshed in this dispute because it does not reside in the mind. It resides externally, on computers and networks where it is completely accessible to technological intervention and metering. Computer-based electronic goods are even more accessible to such metering than other kinds of electronic goods, such as those produced by the music industry, which began solving these same property rights issues almost a century ago.

A common argument that is often advanced during any consideration of robust market mechanisms for software is that software is so uniquely malleable that any protection scheme can be subverted. However the perpetual race between bank vaults and safe-cracking technology, or between tank armor and anti-tank missile technology shows that the same is true of tangible goods. Robust markets thrive in shopping malls where the obstacles to shoplifting are almost as negligible as the barriers to software piracy in computers. There is a difference in degree for software, but not in kind. The conflict with those who wish to steal is always with us.

These examples show that absolute protection is neither possible nor necessary. All that is necessary is that the costs and risks to those who would subvert the protection be greater than the value of the goods they might acquire. The next section will show that this weaker condition is achievable by relying on the same combination of technological, social, moral and punitive sanctions upon which markets have relied since antiquity.

Superdistribution[6]

Existing copyright law distinguishes between copyright (the right to copy or distribute) and useright (the right to `perform', or to use a copy once obtained). These laws were stringently tested in court a century ago as the music publishers came to terms with broadcast technologies such as radio and TV.

In the eyes of the law, when we buy a record or CD disk we're actually purchasing a bundle of rights with respect to the electronic property contained on that medium. The individual rights in this bundle were established in the U.S. Constitution. Subsequent case law has upheld the property owner's right to bundle these rights for sale in any combination they please. So when Suzy Sixpack buys a record at the store, she's buying a rights bundle that includes ownership of the physical medium and a severely limited useright that only allows her to use the music on that medium for personal enjoyment.

On the other hand, television and radio broadcasting companies acquire an entirely different bundle of rights. The publishing companies thrust records and tapes on them for free in expectation of substantial fees for the useright to broadcast the music on the air. The physical copies are identical except for a `not for resale' sticker on the cover. The collection and distribution of usage-based fees is administered by ASCAP (American Society of Composers, Authors and Publishers) and BMI (Broadcast Music, Inc.) by monitoring how often each record is broadcast to how large a listening audience.

Superdistribution analogizes a personal computer to a broadcasting station whose `audience' is a single `listener'. Work on superdistribution has been underway since 1987. It was pioneered by Dr. Ryoichi Mori who heads JEIDA (Japan Electronics Industry Development Association), an industry-wide consortium of telecomputing companies. He calls this approach superdistribution because, like superconductivity, it allows information to flow freely without resistance from copy protection and piracy[7].

Superdistribution is based on the following observation. Electronic objects differ from tangible objects by being fundamentally unable to monitor their copying but trivially able to monitor their use. For example, it is easy to make software count how many times it has been invoked, but hard to make it count how many times it has been copied. Superdistribution builds an information age market economy around this difference between manufacturing age and information age goods.
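The observation that software can trivially count its own invocations can be illustrated with a minimal sketch. It is only an illustration under stated assumptions, not a description of Mori's design: the names `usage_ledger`, `metered` and `check_spelling` are hypothetical, and the counter lives in ordinary memory rather than in tamper-proof hardware.

```python
import functools
from collections import Counter

# Hypothetical usage ledger: maps a component name to its invocation count.
usage_ledger = Counter()

def metered(component_name):
    """Wrap a function so that every invocation is recorded in the ledger."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            usage_ledger[component_name] += 1  # count the use, not the copy
            return func(*args, **kwargs)
        return wrapper
    return decorator

@metered("spell_checker")
def check_spelling(word):
    # Stand-in for a real reusable component.
    return word.isalpha()

check_spelling("market")
check_spelling("proc3ss")
print(usage_ledger["spell_checker"])
```

Copying the file that contains `check_spelling` leaves the ledger untouched; only running the code changes it, which is the asymmetry the paragraph above describes.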

The premise of this approach is that copy protection is exactly the wrong idea for intangible, easily copied goods such as software. Superdistribution treats ease of replication as an asset instead of a liability. When revenue collection is based on monitoring the use of software inside a computer, vendors can dispense with copy protection altogether. They can distribute electronic objects for free in expectation of a usage-based revenue stream. It actively encourages free distribution of information age goods via any distribution mechanism you please. Users are actively encouraged to download superdistribution software from networks, to give it away to their friends, or to send it as junk mail to people they've never met.

This generosity is possible because the software is actually `meterware'. It has strings attached that decouple revenue collection from however the software was distributed. Superdistribution software contains embedded instructions that make it useless except on machines that are equipped for this new kind of revenue collection.

The computers that can run superdistribution software are otherwise quite ordinary. In particular, they run ordinary pay-by-copy software just fine. They just have additional capabilities that only superdistribution software uses. In Mori's prototype, these extra services are provided by a silicon chip that plugs into a Macintosh coprocessor slot. Electronic objects (not just applications, but objects of every granularity) that are intended for superdistribution invoke this hardware to ensure that the revenue collection hardware is present, that prior usage reports have been uploaded, and that prior usage fees have been paid.

The hardware is surprisingly uncomplicated (the main complexities are tamper-proofing, not base functionality), far less complicated than hardware that the computer industry has been routinely building for decades. The hardware merely provides several instructions that must be present before superdistribution software can run. These instructions count how many times they have been invoked by the software, storing the resulting usage information temporarily in a tamper-proof persistent store. Periodically (say monthly) this usage information is uploaded to an administrative organization for billing, using encryption to discourage tampering and to protect the secrecy of the metered information.
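The count, store and upload cycle described above can be sketched in software. This is a toy model under loud assumptions: `MeteringUnit`, `verify_report` and the `max_unreported` threshold are hypothetical names invented for illustration, the store is an ordinary dictionary rather than tamper-proof hardware, and the report is protected with an HMAC tag (which detects tampering but does not keep the contents secret) instead of the encryption the text mentions.

```python
import hashlib
import hmac
import json

class MeteringUnit:
    """Toy stand-in for the metering hardware (all names are hypothetical)."""

    def __init__(self, secret_key, max_unreported=1000):
        self._key = secret_key
        self._counts = {}                      # component name -> uses
        self._max_unreported = max_unreported  # assumed enforcement policy

    def invoke(self, component):
        """Record one use; refuse to run if too much usage is unreported."""
        if sum(self._counts.values()) >= self._max_unreported:
            raise PermissionError("usage report overdue; upload before continuing")
        self._counts[component] = self._counts.get(component, 0) + 1

    def build_report(self):
        """Serialize the counts, tag them, and reset the local store."""
        payload = json.dumps(self._counts, sort_keys=True).encode()
        tag = hmac.new(self._key, payload, hashlib.sha256).hexdigest()
        self._counts = {}
        return payload, tag

def verify_report(secret_key, payload, tag):
    """Billing-side check that the report was not altered in transit."""
    expected = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

unit = MeteringUnit(b"demo-key")
for _ in range(3):
    unit.invoke("reader")
payload, tag = unit.build_report()
print(json.loads(payload), verify_report(b"demo-key", payload, tag))
```

Refusing to run once unreported usage passes a threshold is one possible way to model the requirement that prior usage reports have been uploaded; the source does not say how the real hardware enforces it.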

The end-user receives a monthly bill for their usage of each top-level component. These payments are credited to each component's owner in proportion to usage. The owners' accounts are then debited according to their applications' usage of any sub-components. These are credited to the sub-components' owners, again in proportion to usage. In other words, the end-user's payments are recursively distributed through the producer-consumer hierarchy. The distribution is governed by usage metering information collected from each end-user's machine, plus usage pricing data provided to the administrative organization by each component vendor.
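The recursive distribution of payments can be made concrete with a small settlement sketch. The data model is an assumption made for illustration: each metering record names the caller (the end-user or a calling component), the component invoked and the number of invocations, and each vendor publishes a flat price per invocation. The names `settle`, `END_USER` and the component and vendor labels are hypothetical.

```python
from collections import defaultdict

END_USER = "end_user"

def settle(usage, prices, owners):
    """Return the net balance of every party, in cents.

    usage  : list of (caller, component, invocation_count); caller is END_USER
             or the name of the component that made the calls
    prices : component -> price per invocation, in cents
    owners : component -> owning party
    """
    balances = defaultdict(int)
    for caller, component, count in usage:
        # The end-user pays for top-level use; a component's owner pays
        # for whatever sub-components that component invoked.
        payer = END_USER if caller == END_USER else owners[caller]
        payee = owners[component]
        fee = count * prices[component]
        balances[payer] -= fee
        balances[payee] += fee
    return dict(balances)

usage = [
    (END_USER, "application", 10),       # the user ran the application 10 times
    ("application", "library", 10),      # each run invoked a library component
    ("library", "font_engine", 40),      # the library called a lower-level part
]
prices = {"application": 50, "library": 10, "font_engine": 1}
owners = {
    "application": "app_vendor",
    "library": "library_vendor",
    "font_engine": "font_vendor",
}
print(settle(usage, prices, owners))
```

With these made-up figures the end-user pays 500 cents, of which the application vendor keeps 400, the library vendor 60 and the font vendor 40. The balances always sum to zero, since every fee debited from one party is credited to another.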

Since communication is infrequent and involves only a small amount of metering information, the communication channel could be as simple as a modem that autodials a hardwired 800 number each month. Many other solutions are viable, such as flash cards or even floppy disks to be mailed back and forth each month.

Technical Implications

A change in the socioeconomics of telecomputing has radical implications with respect to underlying software technology.

For example, present-day notions of software property rights emerged out of ideologies that were established during the transition from timeshared to personal computing. Each individual works at a personal computer, loosely connected, if at all, to other individuals' computers by a network. Each computer provides an operating system that allows the individual to manage a disk on which the individual's electronic property resides. This architecture was shaped by the assumption that `owning' electronic property means having a copy of it on the hard drive. This assumption imposes fundamental restrictions to exchange of goods across different levels of a specialized labor hierarchy.

Consider an author who wishes to distribute (with superdistribution, even to sell) a multimedia document; i.e. a document that cannot be handled as a simple text file. Today, this author's market is limited, since only those who have already purchased a program capable of displaying this document could possibly read it. The same restriction occurs at each lower level of the producer/consumer hierarchy. The market of a programmer who wishes to sell a reusable software component is restricted to those who have already purchased the components and tools upon which the software component relies.

These restrictions could be relaxed if property rights are reconceived in terms of usage instead of acquisition of copies. This also enables network-based operating system architectures that are not possible today. With superdistribution, property exists `out there on the network'. The hard drive is no longer interesting since it is no longer a place where the user manages property purchased from others. The hard drive disappears to become just part of the plumbing; a cache that the operating system manages to avoid having to reacquire bytes that have been used recently and are likely to be needed again soon.

With superdistribution, the potential market for the hypothetical author's multimedia document becomes universal. The market is no longer restricted to those that own a copy of a program capable of reading the document. The reader program can be acquired automatically from the network as if it were bundled as part of the document. The owner of the document accrues revenue from those who read the document. The owner of the reader program is then paid from the document owner's account as the `user' of the reader program. And so on for any reusable software components that may have been used in constructing the program.

The user's operating system acquires subcomponents of the document, such as the reader program and any sub-components it relies on, from the hard drive. The operating system manages the hard drive as a cache, automatically loading it as needed from the network. The operating system can do this automatically and transparently, without even bringing this to the attention of either buyers or sellers, since acquisition of superdistribution software involves no financially binding commitments.
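The idea of the hard drive as a cache in front of the network can be sketched as follows. This is a minimal illustration under assumptions, not a proposed operating system design: `ComponentCache` and the `fetch` callable are hypothetical, and least-recently-used eviction is one reasonable policy chosen here, since the text only says that recently used bytes are kept.

```python
from collections import OrderedDict

class ComponentCache:
    """Toy model of a disk cache that loads components from the network."""

    def __init__(self, fetch, capacity):
        self._fetch = fetch            # hypothetical network loader
        self._capacity = capacity      # how many components fit on disk
        self._store = OrderedDict()    # oldest entry first
        self.network_fetches = 0

    def get(self, name):
        if name in self._store:
            self._store.move_to_end(name)    # mark as recently used
            return self._store[name]
        data = self._fetch(name)             # miss: acquire from the network
        self.network_fetches += 1
        self._store[name] = data
        if len(self._store) > self._capacity:
            self._store.popitem(last=False)  # evict the least recently used
        return data

def fake_network(name):
    # Stand-in for a real network transfer.
    return ("bytes of " + name).encode()

cache = ComponentCache(fake_network, capacity=2)
cache.get("reader")
cache.get("reader")        # served from the cache, no network traffic
cache.get("font_engine")
print(cache.network_fetches)
```

Because acquiring a component carries no financial commitment in this scheme, a miss can be resolved silently; charges arise only from the metered use that follows.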

Conclusion

Superdistribution addresses the perennial unanswered question of those who might provide the reusable software components upon which an advanced software engineering culture might be founded. Where do software components come from? Why should anyone bother to provide them? Why would anyone undertake such unpleasant and costly activities as testing and documenting software components sufficiently that others might want to reuse them? What is in it for me?

Whereas software's ease of replication is a profound liability today (by disincentivizing those who would provide it), superdistribution turns this liability into an asset by allowing information age goods to be distributed for free. Whereas software vendors must spend heavily to overcome software's invisibility, superdistribution thrusts software out into the world to serve as its own advertisement. Whereas the PC revolution isolates individuals inside a standalone personal computer, superdistribution establishes a cooperative/competitive community around an information age market economy.

By decoupling revenue collection from acquisition of copies, hard drives and computers can disappear to become just part of the plumbing that conveys information age goods between producers and consumers. Computers and telecommunications links become invisible; a transparent window through which individuals can communicate, cooperate, coordinate and compete as members of an advanced socioeconomic community.

To rephrase these arguments in the win-lose rhetoric of today's globally competitive markets, what if superdistribution really is a silver bullet for the information age issues that I've raised in this paper? And what if the competition deploys it first?


References

[1] Frederick P. Brooks, Jr.; No Silver Bullet; Essence and Accidents of Software Engineering; Computer Magazine; April 1987; First published in Information Processing 1986, ISBN No. 0444-7077-3, H. J. Kugler, ed., Elsevier Science Publishers B.V. (North-Holland) IFIP 1986.

[2] W.E. Deming, Out of the Crisis; Massachusetts Institute of Technology Center for Advanced Engineering Study; Cambridge, Mass; 1986.

[3] Brad Cox; Planning the Software Industrial Revolution; IEEE Software; November 1990.

[4] David Gelernter; Mirror Worlds; Oxford University Press; 1992.

[5] Friedrich Hayek; The Use of Knowledge in Society; American Economic Review, XXXV, No. 4; Sept 1945, 519-30.

[6] This section was adapted from an editorial in the June 1992 issue of the Journal of Object-oriented Programming, What if there is a silver bullet and the competition builds it first?. The editorial was subsequently republished in the October 1992 issue of Dr. Dobb's Journal. The author retained copyright for this material.

[7] Ryoichi Mori and Masaji Kawahara; Superdistribution: The Concept and the Architecture; Transactions of The IEICE; Vol E-73 #7 July 1990; Special Issue on Cryptography and Information Security.

[8] Thomas Kuhn; The Structure of Scientific Revolutions; University of Chicago Press; 1962.