Arthur M. Bueche Award

2001 Bueche Award Recipient Remarks

By Dr. Ian M. Ross

First I wish to thank the NAE for bestowing on me the honor of this year's Arthur M. Bueche Award. I also want to thank the Academy for giving me this opportunity to make a few observations. I will briefly reminisce on some of my experiences in the early days of the semiconductor industry, with the Apollo program, and in the conversion of the telecommunication networks to all digital operation. I will end with a few comments on what I see as a current issue in information technology.

My professional career began when I joined Bill Shockley's organization at Bell Labs in Murray Hill, New Jersey, in March of 1952. This was the organization that evolved from the group that invented the transistor in 1947. When I arrived at Murray Hill most of the world's knowledge of transistor technology still resided in that building. Most of the original cast of contributors were still there.

In April of 1952 a transistor technology symposium was held for Western Electric licensees. The objective was to provide enough information to the 40 attendees to "enable qualified engineers to set up equipment, procedures, and methods for the manufacture of these products". I was asked to organize a laboratory session in which the attendees, in small groups, could measure the characteristics of transistors, specifically point contact transistors, the structure in which Bardeen and Brattain first observed the transistor effect. For most of the students this would be the first time they had actually seen a transistor.

I had arrived from England just a month earlier. I was used to a climate which made a gradual transition from winter to summer during a period called spring. To my surprise, New Jersey that year went from winter to summer in one day, and I suddenly found myself exposed to temperature and humidity levels that were totally new to me. That transition took place two days before the start of the symposium. Murray Hill at that time was not air conditioned. When I got to the lab that morning, I found that my transistors for the session had lost all their electrical characteristics - the CRT traces were flat! I had discovered for myself what many people already knew. The transistor was sensitive to its environment and particularly to humidity.

The lack of reliability of the early transistors was a huge setback and embarrassment to the semiconductor community. The transistor had been lauded as a device with no failure mechanisms, with nothing to wear out. Instead we had a severe reliability problem and one that took another 14 years before there was a complete solution.

In early 1952 there were two known transistor structures, the point contact transistor and the grown junction transistor. Both were proven to work, but neither of them was reliable nor suitable for large scale manufacture. Thus having invented the transistor, the challenge remained to find ways to design a product that would be reliable and would be easy to manufacture. This phase took the industry approximately another 8 years during which many challenging problems were addressed and fundamental solutions developed. This was an industry effort. Although many of the early advances came from Bell Labs, companies such as GE, TI, Fairchild, and others made major contributions.

The substantial completion of this effort was heralded by the development at Fairchild of the planar transistor. This structure brought it all together. All the key development and engineering problems were either solved or on course for an elegant solution. Thus by 1960 there was a sound foundation for the long term manufacture of semiconductor devices. The resulting devices would eventually, by a process step added in 1966, be solidly reliable. And all this could be done with batch processing with the promise of high yield and low unit cost. Some 13 years after its invention the transistor now had a sound engineering foundation. The planar transistor also provided the base for the next giant step, the development of the integrated circuit following its invention in 1958.

To my mind there were two outstanding characteristics of this industry effort. The first was the constant search for understanding. The industry strongly believed in the importance of basic understanding and avoiding the empirical approach, albeit resorting to such techniques while no basic solution existed. This attribute has remained with the industry. Even during the many years when empirical solutions were applied to the reliability problem, the search for a basic solution continued and eventually won out. The second characteristic was a willingness to share information. Although the semiconductor industry from its beginning was highly competitive, it operated with an unusual willingness to share information. This attribute has also remained with the industry as is exemplified by the Technology Roadmaps produced regularly under the guidance of the Semiconductor Industry Association.

In 1964 I transferred from Bell Labs to Bellcomm, an AT&T subsidiary established in Washington DC to provide systems engineering support for the Apollo manned space flight program. We, for example, prepared the top level specification for the Apollo program. I will address just one anecdote from this experience.

When I first became involved in the Apollo program there was much discussion on how to specify reliability as it related to mission success and crew safety. There was strong support for the conventional approach of specifying reliability in terms of probabilities with long strings of 'nines', and there were many debates on how long a string of 'nines' was appropriate for a particular phase of the program. At a NASA management meeting one day Wernher von Braun expressed his skepticism about this approach by asking the following question. Are you telling me that an astronaut, when leaving home in the morning, would kiss his wife goodbye on the doorstep and say "Dear, the probability of my being home for dinner is 99.999%"? Wernher had a wonderful ability to get to the heart of any matter. In the end NASA chose to include two simple statements in the top level documentation for Apollo, namely, "No single failure will cause loss of a mission. No two failures will cause a loss of crew". These two statements had a profound influence on the manner in which the program was designed and executed and represent an elegant example of how program requirements can be expressed.

In 1971 I returned to Bell Labs in charge of the Network Planning Division. This organization was responsible for system engineering studies of the AT&T networks. At about that time it was recognized that the technology was at hand to permit the conversion of the networks to all digital operation and that in turn would lead to significant enhancements of the services that could be provided and the reliability of all network services. This conversion would require not only the design of new digital systems, both transmission and switching, but also a major reconfiguration of the network itself. It was also decided that the new systems should be substantially controlled by software. There were many challenges to be met. Today I will highlight some of those in switching.

The new machines were to have a switching fabric that would handle digital traffic and would be controlled by a stored program processor which was in effect a special purpose computer. AT&T had a specification governing the reliability of switching machines in their network. The machines were to be designed to have a maximum of "2 hours' down time in 40 years". This may seem somewhat quaint today. Why not 1 hour's down time in 20 years or 3 minutes per year? I believe the intent was to specify two things - the percentage of time a machine could be down and that the switch should be designed to be able to provide service for 40 years. Indeed the machines that were introduced in the early 70s are still providing service today.
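As a rough illustration of the arithmetic behind that specification, the short Python sketch below converts a down time budget into minutes per year and into an availability fraction. The function names and the 365.25-day average year are assumptions made for this example; they are not part of the AT&T specification.

```python
# Sketch only: restates the down time budget quoted above in other units.
# Assumption: an average year of 365.25 days (not stated in the remarks).
HOURS_PER_YEAR = 365.25 * 24


def downtime_minutes_per_year(downtime_hours, period_years):
    """Average allowed down time per year, in minutes."""
    return downtime_hours * 60 / period_years


def availability(downtime_hours, period_years):
    """Fraction of the period during which the machine must be in service."""
    total_hours = period_years * HOURS_PER_YEAR
    return 1 - downtime_hours / total_hours


if __name__ == "__main__":
    # The two phrasings compared in the remarks: 2 hours in 40 years
    # and 1 hour in 20 years.
    for hours, years in [(2, 40), (1, 20)]:
        minutes = downtime_minutes_per_year(hours, years)
        fraction = availability(hours, years)
        print(f"{hours} h in {years} y: {minutes} min/year, "
              f"availability {fraction:.7f}")
```

Both phrasings work out to 3 minutes per year, an availability of roughly 99.9994 percent. What the 40-year wording adds is the expected service life, which a bare percentage does not convey.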

This is another example of one of those policy statements that has a powerful impact on all aspects of a project. For example, the only way we knew how to meet the down time requirement was to provide dual processors each simultaneously running an identical program.

Another major challenge resulted from the fact that the software contained several million lines of code. This was large for those days. Perhaps only IBM and some government projects required development of software of this size. We in the industry painfully learned some lessons on how to develop such programs. It became clear that the difficulty in developing software increased rapidly with the size of the program. It was therefore highly desirable, where possible, to partition the programs into smaller modules with clean interfaces so that each module could be checked independently of the others.

We also recognized the benefits, both in development cost and customer satisfaction, in detecting and correcting program bugs as early in the process as possible. It was much preferable to catch the bugs in the design phase before the program was transferred for manufacture. The least desirable occurrence was for a bug to be found in the field.

Looking back on all of these experiences, I am impressed by how much of our effort as engineers is dictated by the quality and reliability required to meet customer needs.

I will end with some observations on a development that has taken place largely since I retired, that is, the rapid expansion of the use of personal computers connected to the internet. The growth in the size of the user community and range of available services has been spectacular. However I have a concern that, without further improvement in the usability and dependability of these systems, future growth in the customer base and of new features may be unnecessarily limited. I will briefly explain my concerns.

In the case of personal computers, the problems I see are not in the hardware but are substantially in the software. Personal computer programs crash, machines hang, programs perform illegal operations, and programs interact with one another in unpredictable and harmful ways. I spend much more time than I should in trying to deal with these problems on my computer. I suppose in a way I enjoy the intellectual challenge as I do with solving crossword puzzles, but sometimes it can be frustrating. My wife comments that if I have such difficulties, how are 'ordinary' people supposed to manage. And that is the heart of the issue. I expect that most of the people in this audience are able to deal with these problems but I suspect that a great majority of people, including potential future users, could not cope with these difficulties, even if they plucked up the courage to try! If this situation persists surely a large number of people will be excluded or will exclude themselves from full participation and that would be unfortunate.

Speed of access to the internet is another problem. It has been reported that about 75% of customers are still connecting via modems and thus are limited to a maximum 56 kb/s speed. At my house, 5 miles from the telephone office, I never see more than 24 kb/s. Until we have a readily available, cost effective solution to the provision of much higher access speed, the use of existing services and the expansion of new services will surely be limited.

The problem I see with internet services themselves I believe stems from the poor design of many of the web sites, another software problem. This particularly impacts the effectiveness of making transactions via the internet. With a few exceptions, I prefer not to carry out transactions over the internet. This has resulted from my experience that, when they work, internet transactions often take too long and, frequently, they do not work. There are other means to carry out electronic transactions that can be faster and more dependable. I suspect that the majority of potential users would show even more reluctance to put up with this slowness and lack of dependability. Incidentally, there exist many web sites that are easy to use and are dependable. So the problem can be handled if the determination exists.

Clearly this whole enterprise is going through a period of reassessment. The financial markets have taken a much more realistic view of what is needed for an enterprise to be attractive for investment. Similarly, there are now more realistic reports on the extent of the use of internet services, much more realistic than some of the extravagant forecasts previously made.

If my concerns are well founded, perhaps now is a good time to address the need to improve ease of use and dependability. Since many of the solutions depend upon the application of existing engineering principles and practices, I believe we engineers have an obligation to speak up. Perhaps there is also a role for engineering associations and possibly this Academy.

Again I wish to thank the NAE for honoring me with the Arthur M. Bueche Award and for the opportunity to make a few observations today.