Keeping Software Secure in a Networked World
By Terry Bollinger, MITRE
Abstract: It is not easy to keep software secure in a world that is globally networked. This article provides eight suggestions, some quite heretical, on how we can do better.
The Need for New Approaches to Software Security
Global networking necessarily changes how software acquisition, development, and support should be done. Old habits die hard, however, and it is my belief that some of our most widely held beliefs about how to create secure, high-quality software are badly out of date or just plain wrong. In this article I propose eight principles, some new and some old, for how to acquire, develop, and secure high-quality software for mission critical systems, and how to support that software at a sustainable cost rate. Since my goal is to encourage re-examination of traditional stances, I must emphasize that this list is my own, and does not represent the views of my company or its customers.
1. When protecting mission-critical systems and data, do not assume that “certified” software is safer or more secure than uncertified software.
I believe an excellent case can be made that current software certification methods [1] [2] are ineffective at dealing with what I call chaotic risk. A system with chaotic risk is one in which a tiny change can cause massive damage to critical properties such as safety, reliability, or security. Software is the poster child for chaotic risk, since an error as trivial as misplacement of one punctuation character can have an impact as large as shutting down most of California’s telephone network. [3]
Chaotic risk is the main reason why the software industry cannot provide the kind of solid, scientifically based certifications given to physical components. For example, a physical component certified to be free of mercury remains so despite any minor changes to its packaging, while in software a change as small as one character of code can easily invalidate a security certification that took hundreds of hours to create.
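The one-character problem can be made concrete with a small sketch. It assumes a snapshot certification is tied to a cryptographic fingerprint of the exact source text; the `in_bounds` function, the fingerprinting scheme, and all names below are hypothetical illustrations, not part of any real certification process.

```python
import hashlib

# Hypothetical "certified" source text: a bounds check that rejects i == limit.
CERTIFIED_SOURCE = "def in_bounds(i, limit):\n    return 0 <= i < limit\n"

# The same text with exactly one character added ('<' becomes '<=').
MODIFIED_SOURCE = CERTIFIED_SOURCE.replace("i < limit", "i <= limit")


def fingerprint(source: str) -> str:
    """Return the SHA-256 digest a snapshot certification could be tied to."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def load(source: str):
    """Execute the source text and return the in_bounds function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["in_bounds"]


certified_check = load(CERTIFIED_SOURCE)
modified_check = load(MODIFIED_SOURCE)

# The texts differ by one character, so the fingerprints no longer match.
fingerprints_match = fingerprint(CERTIFIED_SOURCE) == fingerprint(MODIFIED_SOURCE)

print(len(MODIFIED_SOURCE) - len(CERTIFIED_SOURCE))  # 1
print(fingerprints_match)                            # False
# The modified check now accepts an index one past the end of a buffer.
print(certified_check(8, 8), modified_check(8, 8))   # False True
```

The fingerprint shows only that something changed. Understanding what the change means still requires someone who can read the code, which is where the rest of this principle leads.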
In the smaller and less networked systems of the previous century, chaotic risk was handled by locking down both software and the hardware on which it ran. In the modern networked world, this approach is no longer feasible for three reasons. The first is that when software is extensively networked, it becomes flatly impossible to lock down the hardware configuration on which it executes. Even a very conservative network has daily component failures and dynamic additions, and the software responses needed to handle such changes introduce a small but unavoidable minimum level of chaotic risk. The second problem is that software has become too complex. Even for an embedded software component with few connections to the outside world, the likelihood that it will be so free of errors that it never requires updates is very close to zero. Finally, there is the competitive issue. If the rest of the world is racing ahead with new commercially available hardware and software that makes your locked-down systems look primitive, can you really afford not to make updates to your own capabilities?
The good news is that examples do exist of certification-like processes that reduce chaotic risk nearly to zero. The bad news is that the examples do not scale in any obvious way to large-scale certification of other products.
The best example of a process that minimizes chaotic risk is the one created by Theo de Raadt to support OpenBSD. [4] OpenBSD is a Unix-like Free or Open Source Software (FOSS) [5] operating system that is popular with systems administrators who have stringent security or stability needs. The OpenBSD approach relies on what I refer to as continuous certification, as opposed to the once-only “snapshot” approach used in most software certification methods.
The OpenBSD process minimizes chaotic risk in two ways. First, the high quality of the code provides a clean floor on which the negative impacts of a change show up quickly against a background of error-free code, in much the same way that a clump of dirt is easier to spot on a clean floor than on a dirty one. This is in contrast to dirty floor software, in which the number of undetected errors and poorly analyzed sections of code (the “dirty floor”) is too high for the negative impacts of a change to be understood easily or quickly. The second feature of the OpenBSD process that helps reduce chaotic risk is its reliance on a dedicated, long-term team of code experts who become familiar with every line of code in the software. This high level of familiarity lets them make rapid, precise assessments of the implications of any code change.
The code quality required for a clean floor process does not come cheaply, since it requires years of effort by a support team that is explicitly dedicated to performing a full cleanup of the code base. For that reason, very few large code bases of any type, whether proprietary, FOSS, or government, are sufficiently high in quality to qualify as clean floor software. In fact, I am not aware of any large code base that fully qualifies as clean floor software except OpenBSD.
If chaotic risk invalidates traditional certification processes, and if the only methods that are effective for removing chaotic risk require years of dedicated effort to accomplish, what is a project manager to do? This leads to my second principle.
2. Seasoned programmers are your best protection against security intrusions.
While very few projects can plausibly aspire to the levels of continuous code certification seen in the OpenBSD project, what they can do is make the best possible use of the seasoned programmers they have on hand.
I should emphasize that when I say “seasoned programmers,” I mean people with significant hands-on experience at reading, writing, and understanding executable code, and who have a wealth of knowledge about the tools and resources that make their jobs easier. I do not mean software architects, software modelers, Unified Modeling Language (UML) designers, or any other discipline that works a level or two away from code-level programming. Only programmers who work with and deeply understand programming at the code level have a plausible chance of identifying the kinds of factors that lead to chaotic risk, since these factors typically originate in the code itself, not in the higher level abstractions that tend to gloss over such issues.
I realize that such advice sounds heretical, since for decades software engineering has pushed the concept that higher levels of abstraction are always better. I happen to agree with that, but only if those higher levels of abstraction are highly automated. Thus in my book, icon-based programming that results in fully executable applications is a useful and powerful form of abstraction, whereas getting into the regular habit of drawing lots of boxes that lack formally defined meanings and then calling the result a “software design” is mostly a great way to turn your brain into mush.
In the networked world, it is the hands-on programmers with access to an Internet full of powerful executable abstractions who are the true power users of abstraction. Unlike boxes on paper, abstractions expressed as executable components “talk back” to their designers, refusing to let them get away with sloppy generalizations or incomplete specifications of what happens next. The proliferation of agile methods [6] is a reflection of this need for more precision in the use of abstractions, since such methods force their users to convert each new set of ideas into immediately executable models that always push the understanding of the problem forward.
Due to their ability to find and deal meaningfully with risk factors not readily visible to certification processes, it is seasoned programmers, not certifications, who are your first line of defense against security intrusions. When certifications are required by policy, one strategy for ensuring both code-level security and policy-level certification is to have your most seasoned power programmers help figure out an overall configuration of existing components and custom code that is secure and addresses your most pressing needs. Next, begin a search to see which of those components have already been certified. For components that lack certifications, consider helping those products get certified, possibly in cooperation with other groups that are also using them. The focus in all cases should be to ensure meaningful component and code-level security first, so that the subsequent meeting of policy requirements for component certifications can proceed without inadvertently endangering critical resources.
3. If what you need has been done before, don’t build it again.
If you are encountering skyrocketing software costs, it is hard to beat this simple rule: First search the private sector to see if anything there could meet your needs. This is old advice and hardly surprising. What is different is what you should search for, and why. Searches for existing private sector solutions break down into two main categories:
- Category 1 – Emerging Proprietary Products: These are innovative products that could short-circuit your need to develop a new capability or dramatically alter the approach to achieving it. Innovation in information technology is fast-paced in these days of global networks, so it is a good idea to initiate a new search every time you are confronted with a new need, instead of relying on searches done even just a few months earlier.
- Category 2 – FOSS and FOSS Combinations: Free and Open Source Software (FOSS) is a good source for infrastructure solutions, and especially for shared or networked infrastructure solutions. FOSS also tends to be easier to compose than proprietary software, since in the sharing-first FOSS model there is no profit advantage to locking users into an application. Thus your goal when searching for available FOSS components should not just be to find suitable individual applications, but to determine whether your needs might be met by doing nothing more than composing existing FOSS components in novel ways.
FOSS components are more likely to be composable and interoperable because the communities that support them operate under consortium-like rules in which all participants share equally from the benefits of the software created. This in turn creates a powerful incentive for members to support composability and interoperability. In contrast, proprietary companies often hesitate at opening up interfaces out of a fear (not always valid) that it might reduce market uniqueness and thus profitability.
When evaluating the many possible combinations of FOSS components, it is important to get the assistance of an expert in relevant FOSS technologies, such as a power programmer who is familiar with the FOSS world. Such an expert can help you assess quickly how much of your needs might be met through selection, composition, and configuration of available FOSS parts. The results of such searches can be surprising. New generations of enterprise-oriented FOSS (e.g., the customer management system SugarCRM [7]) have the potential to replace within weeks development efforts that in the past would have required years of development.
4. Dark alleys are your biggest security threat.
Obviously, I am not talking about the kind of dark alleys found between buildings, although I am talking about places where you could get mugged. A dark alley is any part of a software creation, deployment, or support process where the total number of people who control the code drops to a very low number. An example would be a distribution process that at some point gives one unsupervised person authority to package and distribute an entire software release. Dark alleys do not just present opportunities to get mugged. They actively attract muggers by presenting tempting targets where they could deceive, bribe, or coerce their way into an otherwise sound process.
Using a different terminology, a dark alley is a single point of security failure — that is, a location in the software process at which the actions by a single person could seriously undermine overall security.
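One rough way to look for dark alleys is to list each step in a release process alongside the people who control it, and flag any step where that number falls below a chosen threshold. The sketch below assumes a simple in-memory description of a pipeline; the `ReleaseStep` class, the step names, the people, and the threshold of two are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ReleaseStep:
    """One step in a hypothetical release pipeline and the people who control it."""
    name: str
    controllers: set = field(default_factory=set)


def find_dark_alleys(steps, minimum_people=2):
    """Return the names of steps controlled by fewer than minimum_people people."""
    return [step.name for step in steps if len(step.controllers) < minimum_people]


pipeline = [
    ReleaseStep("code review", {"alice", "bob", "carol"}),
    ReleaseStep("build", {"bob", "dave"}),
    ReleaseStep("package and distribute", {"erin"}),  # one unsupervised person
]

print(find_dark_alleys(pipeline))  # ['package and distribute']
```

A head count like this only finds candidates. Whether a flagged step is a real single point of security failure depends on what that person can actually change without anyone else noticing.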
The largest single source of dark alleys in military software, networks, and facilities is the use of proprietary software products developed by international firms. The difficulty in such cases is not the use of international development per se, but the fact that for-profit software companies often create and use dark alleys intentionally as a way to keep their software secret and reduce the risk of piracy. It is the combination of global development and excessive use of dark alleys to reduce piracy that can lead to serious security issues. The threat could in principle be minimized by introducing U.S.-oriented security procedures into the offshore facilities of the software vendors, but this would almost never be acceptable to vendors whose international buyers would object strongly to such U.S.-centric oversight.
5. Use bright alleys (wikis) to increase security when you partner with others.
As its name indicates, a bright alley is the opposite of a dark alley. That is, it is a process path that has been set up with incentives to encourage active participation by as many interested and knowledgeable people as possible. The free on-line Wikipedia encyclopedia is an example of a global project that makes unrelenting use of bright alleys, since in the Wikipedia process every article is subject not just to the scrutiny of a global readership, but to direct editing by every member of that community. Wiki is the generic term for a collaborative process that actively encourages scrutiny and review of the products it develops. When a wiki-like process is applied to the creation of software, the result is FOSS software (e.g., Linux). FOSS preceded wikis, however, so FOSS projects tend to use a slightly different terminology and approach.
The critical factor that keeps the wikis from turning into random noise generators is their high-visibility approach to configuration management. Wikis provide permanent, fully visible tracking of all changes made to them, and make it very easy to undo a change that any other participant judges to be vandalism. Thus while it is true that anyone can change the contents of a wiki, the quality of the change and the way it was done immediately becomes part of a highly visible and permanent public record. This visibility strongly discourages most users from making casual or poorly thought out changes.
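The configuration management idea described here can be sketched as an append-only revision history in which an undo is itself a new, attributed revision. This is a minimal illustration under that assumption, not how any particular wiki engine is implemented; the `Page` and `Revision` classes and the sample edits are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    """One immutable entry in a page's public history."""
    author: str
    text: str
    note: str


class Page:
    """A page whose full edit history is kept and never rewritten."""

    def __init__(self, author, text):
        self._history = [Revision(author, text, "created")]

    @property
    def text(self):
        return self._history[-1].text

    @property
    def history(self):
        return tuple(self._history)  # read-only view of every revision

    def edit(self, author, text, note="edit"):
        self._history.append(Revision(author, text, note))

    def revert(self, author, to_index):
        """Restore an earlier revision by appending it as a new revision."""
        old = self._history[to_index]
        self.edit(author, old.text, f"revert to revision {to_index}")


page = Page("alice", "OpenBSD is a Unix-like operating system.")
page.edit("mallory", "OpenBSD is bad.")  # a change others judge to be vandalism
page.revert("bob", 0)                    # any participant can undo it

print(page.text)
for number, revision in enumerate(page.history):
    print(number, revision.author, revision.note)
```

Because the revert is appended rather than erasing anything, both the vandalism and its author stay in the record, which is the visibility that discourages casual changes.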
Even more important is the way wikis encourage participation by subject matter experts. Most such experts are first attracted to wikis because they can quickly and easily repair bad or incomplete entries related to their areas of expertise. Once they have made such changes, there is a strong tendency for them to join with other similar participants in monitoring that part of the wiki to make sure their contributions are not lost or damaged.
It is this tendency of wikis to encourage active monitoring by subject matter experts that makes them “bright” as opposed to simply “clear.” That is, wikis do not simply make the process visible. By providing an incentive for subject matter experts to continue monitoring a site once they have made a change, wikis actively attract the attention and participation of the people best qualified to make accurate and timely corrections. It is this ability that helps explain how the wiki-like processes used to create FOSS can be so unexpectedly effective at finding and correcting programming errors.
The question, however, is how any of this applies to mission-critical military systems.
After all, perimeter security and hierarchical control are both fundamental concepts for ensuring the safety and responsiveness of any military force, and the concept of using bright alleys seems to fly in the face of both of these. How can software be secure if anyone can see and change it, and how can a software development goal be met in a timely fashion if no one person is responsible for overall development?
The answer, surprisingly, is that the military perspective is exactly correct: Those things which truly need to be kept secret should remain secret, and the processes that oversee them should remain tightly managed and goal oriented.
The catch is that the vast majority of operational military software should not be in this category. More specifically, software that requires extensive sharing across multiple groups (often called federated software) is a poor candidate for tight control, since no one group will ever be fully in charge of it. A bright alley model is a natural fit for adding a higher level of security to such software, since it encourages mutual examination and prevents any one group from trying to take it over.
All of this translates into the following simple corollary:
Keep secret software truly secret, and make the rest as open as you possibly can.
Cryptologists may recognize this as the same advice they learned decades ago: Trying to keep too many things secret at once just increases the odds of you losing your hold on what is really important. The lesson for software is fundamentally the same.
Pragmatically, this means that the safest default choice for shared, federated, or global military software is not proprietary or government software, but FOSS. A wiki-like FOSS model encourages the development of bright alleys, and thus actively discourages partners from trying to gain an advantage through use of dark alleys. The FOSS model thus amounts to a “trust but verify” approach that encourages constructive sharing without requiring full trust. The security and fit of FOSS products to your specific needs can be enhanced further by actively sponsoring members of your team to contribute to and actively participate in relevant FOSS efforts.
6. Hide your data, not your code.
One of the oldest (and unfortunately, most forgotten) software design principles of the U.S. Department of Defense is that it is the data, not the software, that should contain most if not all of the truly sensitive information in a classified system or network. The end result of this principle is to minimize the amount of code that needs to be classified.
Hiding data instead of code remains a very good design principle for building resilient, tamper-resistant systems. For example, it forces designers to recognize the absurdity of trying to hide secrets in the form of clever code, obscure code, or through compilation. Compilation in particular is one of the weakest forms of data encryption imaginable, since it translates the meaning of the code into an executable form that can be more amenable to automated analysis than the original source code!
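A short sketch shows why compilation hides very little. It uses Python's own bytecode compiler purely as a convenient stand-in; natively compiled binaries differ in detail, but string-extraction tools such as the Unix `strings` utility play a similar role there. The embedded value and the function names are hypothetical.

```python
import types

# Hypothetical source code that tries to hide a secret inside the program.
SOURCE = '''
def check_password(candidate):
    return candidate == "example-embedded-secret"
'''

# Compile the source; from here on only the compiled code object is used.
module_code = compile(SOURCE, "<hypothetical module>", "exec")


def string_constants(code):
    """Recursively collect every string constant stored in a code object."""
    found = []
    for const in code.co_consts:
        if isinstance(const, str):
            found.append(const)
        elif isinstance(const, types.CodeType):  # nested function body
            found.extend(string_constants(const))
    return found


# A few lines of automated analysis recover the embedded value.
print("example-embedded-secret" in string_constants(module_code))  # True
```

If the sensitive value lived in protected data instead, this code could be published in full without revealing it.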
7. Reward brevity.
Don’t reward code bloat. Every line of code written is another opportunity for someone to make an error, and errors are far more likely to occur in new code than in old, proven code. If, for example, a design team comes up with a way to create a new system by adding a few short scripts to configure and combine existing FOSS or proprietary components, you should always explore that option first. If it meets your needs, you will have saved a huge amount of risk by avoiding the development of thousands of lines of new, riskier code. Instead, you can focus your quality assurance efforts on the much smaller set of scripting code.
There is a distressingly common belief that good code should be readable by anyone. Don’t believe this nonsense for a minute. If an integrated circuit designer came to you and claimed he could create an integrated circuit that was so well documented internally and so simple that anyone could figure out how it works by looking at the circuit through a microscope, you would throw him out of the room. Even if the designer were telling the truth, the resulting silicon monstrosity would be so bloated and ineffective that no one in their right mind would actually use it. Since a typical large software application is more complex than a typical integrated circuit, why would you expect, or want, every line of code in it to be so bloated and dumbed down that “anyone” could read it?
What you really want is code that is well designed and fully understandable to other experts who share the same expertise as the person who wrote the code. Just as the only people who can truly tell whether an integrated circuit design is complete, effective, and secure are those who have a deep understanding of such circuits, the only people who can effectively evaluate whether a complex new software design is complete, effective, and secure are other experts in creating such software. Furthermore, trying to make such code “readable” to anyone who happens to drop in is almost always an illusion, since such casual viewers will almost never try to read the entire code set.
Ironically, the main effect of bloating code to make it more understandable on a per-line basis is to spread the real design out so thinly that even the experts will have trouble understanding what the code is doing. Such bloat is one of the main reasons why an expert coder will often throw up her hands at such code and say, correctly, that it would be cheaper and easier to recode the entire application from scratch than to try to fix it.
For a program manager, the solution to all of this is the same one mentioned earlier: get experts who understand the domain to evaluate the work. As with integrated circuits, only the people who understand the code and application domain will be able to judge accurately which features are on target and which are just noise.
8. Reduce long-term support costs by joining FOSS communities.
The main economic driver behind both wikis and FOSS is cost sharing, usually in the context of people who want to create and maintain a product that no one participant can afford to create by themselves. Thus Linux was created by a large group of people who wanted to have Unix-like capabilities on their home computers, Apache was created by people who wanted to create and host web sites, and Wikipedia was created by people who wanted a high-quality online encyclopedia. The incentives behind such efforts thus are very similar to those behind consortia, which are similarly based on the principle that if every member contributes, the entire group will be able to create and share the result.
Government groups can also benefit from the consortium-style cost savings of FOSS communities, but only if they resist the initial temptation to take over source code and maintain it internally. Such a decision inevitably leads to rapidly escalating support costs as the code base loses synchronization with the community that created it.
To see real cost savings, program managers must learn to think in terms of how best to represent their needs in FOSS communities, while at the same time striving to keep the amount of unique code that they must create and maintain internally to an absolute minimum. Supporting team member participation in FOSS communities not only helps ensure that the FOSS effort will address your project’s needs, but also helps your participants understand exactly what options are available for rapid composition when a new set of needs presents itself.
In summary
The networked world presents tremendous opportunities for faster and more effective development of mission critical software, but it also presents equally tremendous risks. Falling back on old habits and assumptions can lead over time to falling behind badly in how well we use and exploit new technologies, and this is something we cannot afford to do. Even if you do not agree with all (or any!) of the recommendations I have made, I hope that some of them may cause you to look at old problems in a different light. The problems faced in creating secure software are hard, but they are not insurmountable.
References
[1] NIAP web site: http://www.nsa.gov/ia/industry/niap.cfm
[2] Cyber Security Industry Alliance, “NIAP Certification: Proposals by CSIA for Strengthening Security Certification,” July 23, 2004. Online at: https://www.csialliance.org/resources/pdfs/CSIA_NIAP_Recommendations.pdf
[3] “DSC takes blame for net failures, says coding error led to STP failure,” http://findarticles.com/p/articles/mi_hb5025/is_199107/ai_n18237914
[4] “Security goals of the OpenBSD Process,” http://www.openbsd.org/security.html#goals
[5] “Use of Free and Open Source Software (FOSS) in the U.S. Department of Defense,” http://en.wikipedia.org/wiki/Dodfoss
[6] “Agile software development,” http://en.wikipedia.org/wiki/Agile_methods
[7] “SugarCRM – Commercial Open Source CRM,” http://www.sugarcrm.com/crm/
About the Author
Terry Bollinger is a technology analyst for the Defense Venture Catalyst Initiative (DeVenCI), which enlists the assistance of leading Venture Capitalists to help the U.S. Department of Defense find emerging technologies from the private sector. He is the author of a 2003 DISA report that was the first to document widespread use and the national security importance of free and open source software (FOSS) in military systems. Terry wrote the Wiley Encyclopedia of Software Engineering article on Linux and open source software, was a co-recipient of the Potomac Forum Leadership Award for his work on open source issues, and proposed and edited the first IEEE Software special issue on Linux and open source. More recently he proposed and edited an IEEE Software special issue on how to make classic software engineering properties such as security, scalability, and maintainability more resilient in the face of rapid software change and global networking. An article he wrote on the pros and cons of what is now called the Capability Maturity Model earned him a best paper award. He enjoys working with the hard sciences, and for entertainment he is currently exploring the mathematics underlying the spin statistics problem of quantum physics.
Author Contact Information
Email: terry.bollinger.ctr@osd.mil
Phone: 703-588-7410.