Modular. Integrated. Now.

Handle Writer/Spell™ Word processing with integrated spelling correction and verification.
Handle Calc™ Spreadsheet with up to 32,000 rows and columns. Conditional and iterative recalculation.

The Handle Office-Automation Series is a powerful set of modular, integrated software tools developed for today's multiuser office environment. Handle application modules can be used stand-alone or combined into a fully integrated system. The Handle Office-Automation Series modules offer:
• Ease of Use and Learning
• Insulation from UNIX
• Data Sharing Between Multiple Users
• Data Integration Between Modules
• Data Sharing with Other Software Products
• Sophisticated Document Security System

Handle Technologies, Inc. Corporate Office: 6300 Richmond, 3rd Floor, Houston, TX 77057, (713) 266-1415. Sales and Product Information: 850 North Lake Tahoe Blvd., P.O. Box 1913, Tahoe City, CA 95730, (916) 583-7283.
TM: HANDLE, HANDLE HOST, HANDLE WRITER, HANDLE SPELL, HANDLE WRITER/SPELL and HANDLE CALC are trademarks of Handle Technologies, Inc. TM: UNIX is a trademark of AT&T Bell Laboratories.
Circle No. 255 on Inquiry Card

How to go from UNIX to DOS without compromising your standards.

It’s easy. Just get an industry standard file access method that works on both. C-ISAM™ from RDS. It’s been the UNIX™ standard for years (used in more UNIX languages and programs than any other access method), and it’s fast becoming the standard for DOS.

Why? Because of the way it works. Its B+ Tree indexing structure offers unlimited indexes. There’s also automatic or manual record locking and optional transaction audit trails. Plus index compression to save disk space and cut access times.

How can we be so sure C-ISAM works so well? We use it ourselves. It’s a part of INFORMIX, INFORMIX-SQL and File-it!, our best selling database management programs.

For an information packet, call (415) 322-4100. Or write RDS, 4100 Bohannon Drive, Menlo Park, CA 94025. You’ll see why anything less than C-ISAM is just a compromise.

© 1985, Relational Database Systems, Inc. UNIX is a trademark of AT&T. INFORMIX is a registered trademark and RDS, C-ISAM and File-it! are trademarks of Relational Database Systems, Inc.
RELATIONAL DATABASE SYSTEMS, INC.
Circle No. 269 on Inquiry Card

How we improved Structured Query Language.

Actually, we didn’t change a thing. We just combined it with the best relational database management system.

Introducing INFORMIX®-SQL. It runs on either MS™-DOS or UNIX™ operating systems. And now with IBM’s SQL as part of the program, you can ask more of your database. Using the emerging industry-standard query language.

It comes with the most complete set of application building tools. Including a full report writer and screen generator. Plus a family of companion products that all work together. Like our embedded SQLs for C and COBOL. So you can easily link your programs with ours. File-it!™, our easy-to-use file manager. And C-ISAM™, the de facto standard ISAM for the UNIX operating system. It’s built into all our products, but you can buy it separately.

And when you choose RDS, you’ll be in the company of some other good companies. Computer manufacturers including AT&T, Northern Telecom, Altos and over 60 others. And major corporations like Anheuser Busch and The First National Bank of Chicago.

Which makes sense. After all, only RDS offers a family of products that work so well together. As well as with so many industry standards.

INFORMIX is a registered trademark and RDS, C-ISAM and File-it! are trademarks of Relational Database Systems, Inc. IBM, UNIX and MS are trademarks of International Business Machines Corporation, AT&T and Microsoft, respectively. © 1985, Relational Database Systems, Inc.

So call us for a demo, a manual and a copy of our Independent Software Vendor Catalog.
Software vendors be sure to ask about our new “Hooks” software integration program. Our number: 415/322-4100. Or write RDS, 4100 Bohannon Drive, Menlo Park, CA 94025. And we’ll show you how we took a good idea and made it better.
RELATIONAL DATABASE SYSTEMS, INC.

UNIX REVIEW
THE PUBLICATION FOR THE UNIX COMMUNITY
Volume 3, Number 11, November 1985

DEPARTMENTS:
Viewpoint
The Monthly Report, By David Chandler
The Human Factor, By Richard Morin
C Advisor, By Bill Freiboth and Bill Tuthill
Industry Insider, By Mark G. Sobell
Rules of the Game, By Glenn Groenewold
Fit to Print, By August Mohr
Devil's Advocate, By Stan Kelly-Bootle
The UNIX Glossary, By Steve Rosenthal
Recent Releases
Calendar
The Last Word
Advertisers' Index

Cover art by Heda Majlessi

FEATURES: SCIENTIFIC APPLICATIONS

THE FINAL FRONTIER
By Joseph S. Sventek
Most major segments of the computational fraternity have received UNIX happily—save the scientific community.

A RUN THROUGH THE MILL
By Robert Goff
Does UNIX have what it takes to handle data analysis? There's no substitute for actual experience.

INTERVIEW WITH STEVE WALLACH
By Rob Warnock
A Crayette pioneer tells why UNIX is becoming a pervasive presence on the supercomputer front.

DATA ANALYSIS THROUGH INTERACTION
By Richard A. Becker and John M. Chambers
The designers of the S system for data analysis discuss how human factors and UNIX influenced their work.

UNIX IN REALTIME
By Clement T. Cole and John Sundman
Some may scoff, but UNIX does meet the test—and the performance cost is surprisingly small.

UNIX REVIEW (ISSN-0742-3136) is published monthly by REVIEW Publications Co. It is a publication dedicated exclusively to the needs of the UNIX community. Second class postage paid at Renton, WA 98055 and at additional mailing offices. POSTMASTER: Please send Form 3579 to UNIX REVIEW, 500 Howard Street, San Francisco, CA 94105. Entire contents copyright 1985. All rights reserved and nothing may be reproduced in whole or in part without prior written permission from UNIX REVIEW.
Subscriptions to UNIX REVIEW are available at the following annual rates (12 issues): US$28 in the US; US$35 in Canada; US$48 in all other countries/surface mail; US$85 in all other countries/air mail. Correspondence regarding editorial (press releases, product announcements) and circulation (subscriptions, fulfillment, change of address) should be sent to 500 Howard Street, San Francisco, CA 94105. Telephone 415/397-1881. Correspondence regarding dealer sales should be sent to 901 South 3rd Street, Renton, WA 98055. Telephone 206/271-9605.
Letters to UNIX REVIEW or its editors become the property of the magazine and are assumed intended for publication and may so be used. They should include the writer's full name, address and home telephone number. Letters may be edited for the purpose of clarity or space. Opinions expressed by the authors are not necessarily those of UNIX REVIEW.
UNIX is a trademark of AT&T Bell Laboratories, Inc. UNIX REVIEW is not affiliated with AT&T Bell Laboratories.

PUBLISHER: Pamela J. McKee
ASSOCIATE PUBLISHERS: Ken Roberts, Scott Robin
EDITORIAL DIRECTOR: Stephen J. Schneiderman
EDITOR: Mark Compton
ASSOCIATE EDITOR: David Chandler
EDITORIAL ADVISOR: Dr. Stephen R. Bourne, Consulting Software Engineer, Digital Equipment Corporation
EDITORIAL REVIEW BOARD: Dr. Greg Chesson, Chief Scientist, Silicon Graphics, Inc.; Larry Crume, President/Managing Director, AT&T UNIX Pacific Co., Ltd.; Ted Dolotta, Senior Vice President of Technology, Interactive Systems Corporation; Ian Johnstone, Project Manager, Operating Software, Sequent Computer Systems; Bob Marsh, Chairman, Plexus Computers; John Mashey, Manager, Operating Systems, MIPS Computer Systems; Robert Mitze, Department Head, UNIX Computing System Development, AT&T Bell Labs; Deborah Scherrer, Computer Scientist, Mt.
Xinu; Jeff Schriebman, President, UniSoft Systems; Rob Warnock, Consultant; Otis Wilson, Manager, Software Sales and Marketing, AT&T Information Systems
HARDWARE REVIEW BOARD: Gene Dronek, Director of Software, Aim Technology; Doug Merritt, Consultant; Richard Morin, Consultant, Canta Forda Computer Laboratory; Mark G. Sobell, Consultant
SOFTWARE REVIEW BOARD: Eric Allman, Principal Systems Engineer, Britton Lee, Inc.; Ken Arnold, Consultant, UC Berkeley; Jordan Mattson, Programmer, UC Santa Cruz; Dr. Kirk McKusick, Research Computer Scientist, UC Berkeley; Doug Merritt, Consultant; Mark G. Sobell, Consultant
CONTRIBUTING EDITOR: Ned Peirce, Systems Analyst, AT&T Information Systems
PRODUCTION DIRECTOR: Nancy Jorgensen
PRODUCTION STAFF: Cynthia Grant, Tamara V. Heimarck, Florence O'Brien, Denise Wertzler
BUSINESS MANAGER: Ron King
CIRCULATION DIRECTOR: Wini D. Ragus
CIRCULATION MANAGER: Jerry M. Okabe
MARKETING MANAGER: Donald A. Pazour
OFFICE MANAGER: Tracey J. McKee
TRAFFIC: Tom Burrill, Dan McKee, Corey Nelson
NATIONAL SALES OFFICES: 500 Howard St., San Francisco, CA 94105, (415) 397-1881. Regional Sales Manager: Colleen M. Y. Rodgers. Sales/Marketing Assistant: Anmarie Achacoso. 370 Lexington Ave., New York, NY 10017, (212) 683-9294. Regional Sales Manager: Katie A. McGoldrick.
BPA membership applied for in March, 1985.

VIEWPOINT
The life cycle

It has been observed that since UNIX now has a fair amount of market momentum, it must be well past its prime technically. Common wisdom, after all, holds that public acceptance and heavy press coverage are the surest signs of obsolescence.

Given this perspective, however, it’s difficult to assess the role UNIX might have in the scientific community. Scientists certainly would not be quick to say that the system’s best years are behind it. They know that if UNIX is to make a significant contribution in their field, it will need to achieve a much greater penetration than it currently enjoys.
This, of course, has given rise to the question: is UNIX, in fact, suitable? The logic in this is good, but the question is bad.

For the last 15 years, UNIX has won hearts and minds in almost every other realm by virtue of its portability and flexibility. Only the most facile mind can imagine the array of esoteric UNIX adaptations already in use. The system has survived as long as it has largely because of its ready acceptance of change.

So we return to the question: can UNIX be adapted for scientific use? Yes, of course—but probably not without a price. The questions that demand answers are: what cost-effective adaptations might be made and how might UNIX offer solutions that are better than those already available to scientists?

This last question is especially intriguing since it takes inertia into account. Scientists, like people in other professions, have a vested interest in the status quo. Apart from explorers with masochistic tendencies, most people shun the pain of transition unless they can be assured that the grass is definitely greener on the other side—substantially greener, in fact.

UNIX has yet to demonstrate to scientists that its solutions are that much better than the ones offered by VMS. Indeed, some in the scientific community question whether UNIX is better at all. At the root of this doubt lies the Fortran question—a matter that Lawrence Berkeley Lab’s Joe Sventek wrestles with in the lead article of this issue.

Bob Goff follows with an account detailing some of the data analysis strengths brought to bear by UNIX. As a researcher who has spent much of his life manipulating seismic data, Goff speaks from experience.

The tools offered by UNIX are yet another lure deserving attention. One tool in particular, the S system, was specifically designed for data analysis. Rick Becker and John Chambers, the gentlemen responsible for the system’s development, describe it and discuss how UNIX influenced this design.
The issue then forges into a bugaboo topic—real time. Some critics say UNIX can’t handle real-time applications effectively. Clem Cole of MASSCOMP disagrees, and he offers seven good reasons why.

The theme closes with an interview of Steve Wallach, the man who helped generate the “Crayette” wave with his design of the Convex C-1. If Wallach’s name sounds especially familiar, it’s probably because you’ve read about him in The Soul of a New Machine. The questions he addresses come from Rob Warnock, himself a systems architect.

If all this seems to suggest that adventures still lie ahead for UNIX, so be it. In the scientific realm at least, UNIX still has many frontiers left to cross.

The First Name In Integrated Office Automation Software
Executive Mail • Telephone Directory • Menu Processor • Word Processor • Forms/Data Base • Spreadsheet
Certified and Deliverable Since 1981
XED was the first independent software company to introduce a Unix WP package and achieved early success by selling to the government and international market (XED is the only Unix WP package to meet government specifications). Worldwide sales of XED rank Computer Methods first in both sales and units installed in 1984.
INTEGRATED OFFICE SOFTWARE
Box 3938 • Chatsworth, CA 91313 U.S.A. • (818) 884-2000 • FAX (818) 884-3870 • Intl. TLX 292 662 XED UR
XED is a registered trademark of CCL Datentechnik AG. UNIX is a trademark of AT&T Bell Laboratories, Inc.
Circle No. 264 on Inquiry Card

THE MONTHLY REPORT
No simple answers
by David Chandler

Two friends were chatting one day. One, a casual fellow with a penchant for keeping things simple, was commenting on the other’s verbosity. “Richard”, he said, “you’re always launching into some diatribe when all I want is a simple response. Can’t you ever give me a straight answer?”

To this the friend replied, “Well, yes and no. Let me explain.
...”

As mentioned in last month’s Report, AT&T and Sun Microsystems, Inc., announced in September a major technology-sharing agreement whereby technical representatives from both companies will work together “to facilitate convergence” of System V and the 4.2BSD-based Sun OS. At first glance, the agreement seems to hold great potential for contributing to the evolution of UNIX as a computer industry standard. Further study, however, reveals that there are certain portions of the announcement which are quite significant, and certain others which are less so. While working to avoid verbosity, an explanation is in order.

There was commotion in the UNIX community when the announcement was first made—and for good reason. Perhaps the greatest excitement was felt by those who wish for UNIX to become the official standard that many say it already is unofficially. Industry watchers thus were stirred when UNIX giants AT&T and Sun signed an agreement. Add to this the opening line of the fact sheet Sun distributed along with its press release: “Sun and AT&T have agreed to work together to converge the two major UNIX standards into a single version.” Further fanning of the flames came from the industry press, as evidenced by the front page story in Computerworld that cried out, “AT&T, Sun to Redo UNIX”. Such stories may not be as racy as amendments to the Ten Commandments, but they do raise eyebrows.

The facts as presented in the announcement of the agreement are these: Sun and AT&T will incorporate a “reasonable superset” of both System V and Sun OS into a single, AT&T-endorsed, enhanced version of System V. The resulting package will be available from both companies—Sun will offer an implementation of the common interface on Sun workstations (by summer 1986), and AT&T will license it in a future enhanced version of System V. (Estimates from Sun hold that the process at AT&T may take as long as two years.)
The new system will continue to run the existing base of System V applications and will provide the networking services that previously have been available only in 4.2BSD systems. (All of this, of course, supports Bruce Borden’s thesis that, “The way a standard develops is from the implementation backward as opposed to the definition forward.” Borden, the manager of engineering at Silicon Graphics, Inc., should know—he’s been in the UNIX game since the Edition 4 days.)

Presenting this information, however, raises more questions than it answers. What will the Sun-AT&T convergence include? What will it exclude? Which company will contribute what? AT&T and Sun are known for having different views on networking—what does this agreement say about that?

The first two questions—what will and won’t be included in the system—are loaded ones, and company sources decline to be specific in responding. This indicates either that they are (understandably) protective of particular innovations to be announced later, or that such matters have yet to be decided, or both.

TANGO™
The PC-to-UNIX Connection
Use Tango to:
• Connect IBM and compatible PC’s running DOS to UNIX systems.
• Offload processing to PC’s.
• Control data and applications on remote PC’s.
• Distribute processing between UNIX and PC’s.
Buy Tango for:
• Execution of DOS programs on the PC under UNIX control.
• Simple elegant file transfer under error correcting protocol.
• DEC, IBM, and Tektronix (graphics) terminal emulation.
Tango utilizes a standard RS-232 serial port on the PC and connects to the UNIX computer via a modem or direct connection.
COSI, 313 N. First St., Ann Arbor, Michigan 48103, (313) 665-8778, Telex: 466568
Tango is a trademark of COSI. UNIX is a trademark of Bell Laboratories.
Circle No. 267 on Inquiry Card

The Truth of the Matter is...
Prevail is a UNIX-based office automation and application development solution which can be shipped to you today. If you are looking for office automation software or need a fourth-generation language, look to Prevail—an A.T.&T. co-labeled product. Prevail has seven components which will meet your needs.
• Word Processing
• Spreadsheet
• Database Management System
• Window Manager and User Interface
• Report Writer
• Applications Development Language
• Telecommunications
Prevail is available on AT&T 3B series, AT&T Unix PC Model 7300, NCR Tower, DEC VAX and MicroVax II series, Sun Microsystems computers, and Masscomp computers.
Inspiration Systems, Inc., 400 Cummings Park, Suite 4300, Woburn, MA 01801, (617) 938-1100
See Prevail on the IBM PC, Booth No. 5124, Fall '85, November 20-24, 1985, Las Vegas Convention Center, Las Vegas, Nevada
Circle No. 297 on Inquiry Card

According to Laurence Brown, supervisor of UNIX Networking Systems Engineering at AT&T Bell Labs, “All that’s been agreed to so far is that we will work together to ensure that there is a single UNIX standard that will both support current System V applications and will provide the networking services that traditionally have been offered on Berkeley-based systems.” Now, System V, of course, already “supports current System V applications”. Does this mean that the agreement essentially requires nothing more than the grafting of BSD networking facilities onto System V? Indications suggest that the process is somewhat more involved.

For its part, Sun’s first step will be to add complete compatibility with AT&T’s System V Interface Definition (SVID) to the Sun OS. While it is significant that another major UNIX vendor is making this move, it’s not now considered news. Bill Joy, vice president for research and development at Sun, stated at the UniForum conference in Dallas last January that Sun would commit to SVID.
Sun will port its Network File System (NFS) to System V, maintain 4.2BSD features and Sun enhancements, and incorporate 4.3 enhancements next Spring, but it’s conceivable that Sun might have done these things even without the agreement with AT&T.

The new package is not to be a “dual” or “layered” port. In the Sun System V facility, system calls and other facilities required for System V are implemented as “native” extensions to the Sun OS kernel. A separate library is used for commands and utilities unique to System V.

Perhaps even more interesting than Sun’s actions is the question of what AT&T will do. Since the focus of the agreement, as Brown stated, is on support for System V applications and the availability of networking services, and since AT&T is already very much engaged in the business of supporting System V, a major AT&T emphasis no doubt will be placed on networking. Brown observes: “The root of this agreement is that both companies feel applications are important—important to maintain compatibility for existing applications as our individual systems evolve; and that there is important new functionality coming in networking, and that it’s important to define UNIX standards there. AT&T and Sun will work on those together as part of this agreement. ... We saw networking as an area of potential divergence, and we’d like to bring everybody together there.”

The fact that networking is a key issue in the agreement is public knowledge. What is not yet public are the specific facilities the companies will use in their joint networking scheme. “Now, the exact technology that’s used to provide those additional [networking] services still needs to be worked out”, Brown said, “and that isn’t covered by the press release. ...
We need to agree on a common set of networking services that will be provided on all standard UNIX systems, and then any vendor, [in providing] upward compatibility for its customers, may extend beyond that and offer additional features on its systems.”

Good To Be True?
Call Us On It.
[Charts: Whetstone benchmark, single precision (million Whetstones per second), and LINPACK benchmark, double precision (million floating point operations per second), comparing the Celerity C1200 with the Apollo 660, VAX 780FPA, Pyramid FPA90X and Ridge 32/330.]
The C1200 computational system continues to set new performance standards. Results from industry-accepted benchmarks highlight the C1200’s performance when executing workloads characteristic of compute-intensive engineering and scientific applications. Similar performance results are achieved in a broad range of application environments: modeling, simulation, analysis, image processing. The C1200 combines mainframe performance and features with unmatched local control to provide you the best price/performance value available today.
• Optimized native UNIX 4.2BSD
• 32-bit RISC-like architecture
• Up to 24 MBytes physical memory
• 4 Gigabytes virtual memory
• Multiple high speed buses
• Industry-standard graphics, compilers, networking and communications options
• Up to 32 users
The proof of performance is in the execution of your application. We promise performance. We deliver performance. So Call Us On It.
CELERITY COMPUTING, Corporate Headquarters: 9692 Via Excelencia, San Diego, CA 92126, (619) 271-9940
Circle No. 266 on Inquiry Card
UNIX is a registered Trademark of AT&T Bell Laboratories. VAX is a registered Trademark of Digital Equipment Corporation.

This last remark leads to considerations of how AT&T’s fundamental document, the SVID, may be altered by the Sun-AT&T agreement. Writing in the February, 1985, issue of UNIX REVIEW, Doug Kevorkian, supervisor of UNIX System Architecture and
Operating System Engineering at AT&T Bell Labs, stated, “In defining the relationship between System V and application programs, the SVID describes a minimum set of system calls and library routines that should be common to all operating systems based on System V. The remaining commands and utilities have been grouped into a logical series of optional extensions to the base definition.”

CEEGEN-GKS GRAPHICS SOFTWARE in C for UNIX
□ Full implementation of Level 2B GKS.
□ Outputs, Inputs, Segments, Metafile.
□ Full Simulation for Linetypes, Linewidths, Fill Areas, Hatching.
□ Circles and Arcs, Ellipses and Elliptic Arcs, Bezier Curves.
□ Ports Available on all Versions of UNIX.
□ CEEGEN-GKS is Ported to Gould, Masscomp, Plexus, Honeywell, Cadmus, Heurikon, Codata, NBI, NEC APCIII, IBM-AT, Silicon Graphics, Pyramid, Tadpole Technology, Apollo, AT&T 3B2, AT&T 6300, DEC VAX 11/750, 11/780 (4.2, 5.2), NCR Tower.
□ CEEGEN-GMS GRAPHIC MODELING SYSTEM, An Interactive Object-Oriented Modeling Product for Developers of GKS Applications. CEEGEN-GMS and GKS Provide the Richest Development Environment Available on UNIX Systems.
□ Extensive List of Peripheral Device Drivers Including Tektronix 4010, 4014, 4105, 4109, HPGL Plotters, Houston Instruments, Digitizers, Dot Matrix Printers and Graphics CRT Controllers.
□ END USER, OEM, DISTRIBUTOR DISCOUNTS AVAILABLE.
CEEGEN CORPORATION, 20 S. Santa Cruz Avenue, Suite 102, Los Gatos, CA 95030, (408) 354-8841, TLX 287561 mlbx ur
EAST COAST: John Redding & Associates, (617) 263-8206
UNITED KINGDOM: Tadpole Technology PLC, 044 (0223) 861112
UNIX is a trademark of Bell Labs. CEEGEN-GKS is a trademark of Ceegen Corp.
Circle No. 298 on Inquiry Card

The intent of the Sun-AT&T agreement is to incorporate BSD-derivative networking features into the SVID. Does this then mean that AT&T will adopt Sun’s NFS?
Sun’s Bill Joy responded, “What has been announced so far is that [Sun] will supply an NFS for System V, and that NFS will be supportable under the AT&T networking scheme; in other words, whatever scheme AT&T has for supporting distributed file systems will support NFS. There hasn’t been any announcement [yet] as to what AT&T’s networking options for its customers will be.” That is, Sun may or may not be sure how, but, as Joy added, “Half the code in our system is networking, so that has to get worked into the common framework somehow.”

In seeking to determine what points are significant in the announcement of this agreement, representatives from all sides focus on the pivotal role of the SVID. This will determine how UNIX appears to the end user and the application; the technical manipulations that go on behind the interface are of secondary importance when one speaks of standards. Bernard Lacroute, executive vice president and general manager of Sun’s workstation division, emphasized this point: “Of first and foremost importance is that, at the application level, a System V application or a 4.2 application can run without knowing whether or not it’s System V or 4.2.”

What of significance then comes from the announcement of the Sun-AT&T agreement? First, Sun will support the SVID. Second, System V will continue to support current System V applications, while being modified to provide BSD-derived networking services, the specifics of which will be announced later as the Sun-AT&T relationship matures. The news, then, is not that “The Standard Is Here”, but rather that “another step in the continuing evolution of the standard is here”.

There is a third point of substance, or perhaps it should be said, “potential substance”. A popular computer industry perception holds that UNIX cannot be a standard because so many versions of it exist.
AT&T, however, has a different perspective—one that claims the SVID is the standard UNIX base from which other vendors can add features. These features may give each version a different flavor, but the UNIX system at the base will remain standard. If this Sun-AT&T agreement contributes to the industry’s adoption of AT&T’s perspective, it will be substantive for that alone.

“That is certainly our intention”, said Bob Mitze, the department head of UNIX Computer System Development at AT&T Bell Labs. “That is our expectation—that. . .people will find that by writing to the SVID they can write portable programs that will move from machine to machine. We expect we will be able to solidify the standard to the point where we won’t find ourselves with [the] perception [that the various UNIX versions are too disparate to be one standard]. Most programs turn out to be fairly easy to port. . . But the perception is nonetheless quite important, because that has a lot to do with how many people are going to write software. . . . The market frequently seems to be based on perception.”

CREATE LASTING IMPRESSIONS
HIEROGLYPH™ UNIVERSAL REPORT PRODUCTION SYSTEM
INTEGRATED TEXT/DATA/ADVANCED GRAPHICS SOFTWARE
A "report" is information presented in organized form, typically a printed document. If you are a professional whose work involves preparing reports, HIEROGLYPH is for you. HIEROGLYPH is designed to meet the needs of technical and office report preparation and production where the combination of text, data and advanced graphics are essential elements. Whether you are an engineer, scientist, architect, manager of business reports, a writer or graphic artist, HIEROGLYPH integrates all information in your UNIX Universe. HIEROGLYPH combines all the elements necessary to generate the final composition. You have the option of camera-ready copy or multiple copies in Color or Black and White. HIEROGLYPH uses simple, single-stroke commands. Both text and commands can be in English or several foreign languages. The user manual is written for three levels: New, Experienced and Expert. Regardless, you can be doing productive work the first day. HIEROGLYPH software incorporates Text Processing, Document Aids, Document Filing, Graphics, Data Handling and Production Tools. Together they comprise the most productive system for creating and producing superior reports, documents that make lasting impressions.
HIEROGLYPH is the product of Prescience, Inc. (pronounced pres-ce-ence), 820 Bay Avenue, Suite 100, Capitola, California 95010, (408) 462-6567.
NEW CORPORATE HEADQUARTERS: 1125 South Grant Street, Suite 510, San Mateo, CA 94402, (415) 573-1507
Circle No. 262 on Inquiry Card

ENCORE TAKES A BOW

A much-anticipated official announcement from Encore Computer Corp. has finally come to pass. Three new product lines and a version of UNIX are available as of this month: a family of general-purpose superminis; three models of interactive workstations; two models of a network communication computer (the “Annex”); and UMAX, yet another UNIX flavor.

The Multimax is designed to permit up to 20 main processors to share a common memory. Configurations cover a broad range of capabilities: performance that spans 1.5 to 15 MIPS; memory capacity ranging from 4 to 32 MB; and systems containing from one to 10 I/O channels. System prices begin at $112,000 for a dual processor (1.5 MIPS) system with 4 MB of shared memory, one I/O channel, one 515 MB disk drive, one 6250 bpi half-inch tape drive, and a workstation display or console printer. A large system with the same peripherals configured for parallel processing applications, with 20 processors (15 MIPS) and 32 MB of memory, is priced at $340,000.
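
As a rough price/performance check on the two Multimax configurations quoted above, the sketch below divides each quoted system price by its quoted MIPS rating. The dollars-per-MIPS metric and the names `CONFIGS`, `dollars_per_mips` and `summarize` are illustrative assumptions of ours, not figures or terms supplied by Encore; only the prices and MIPS ratings come from the announcement.

```python
# Prices and MIPS ratings are the two configurations quoted in the report.
# "Dollars per MIPS" is a derived, illustrative metric, not an Encore figure.
CONFIGS = {
    "dual-processor": {"price_usd": 112_000, "mips": 1.5},
    "20-processor": {"price_usd": 340_000, "mips": 15.0},
}


def dollars_per_mips(price_usd, mips):
    """Return the system price divided by its rated MIPS."""
    return price_usd / mips


def summarize(configs):
    """Map each configuration name to its cost per MIPS, rounded to whole dollars."""
    return {
        name: round(dollars_per_mips(cfg["price_usd"], cfg["mips"]))
        for name, cfg in configs.items()
    }


if __name__ == "__main__":
    for name, cost in summarize(CONFIGS).items():
        print(f"{name}: about ${cost:,} per MIPS")
```

On these figures the 20-processor configuration works out to roughly 30 percent of the entry system's cost per MIPS, although a raw MIPS rating says nothing about how well a given workload actually spreads across 20 processors.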
The Multimax superminis are aimed at the general-purpose computer market, and so compete with DEC VAXen and the Data General and Prime machines in this range.

Although the two low-end models in Encore’s HostStation line of workstations—the 100 and 110—are single-processor machines, they are upgradable to the top-of-the-line 550, a desktop box with two 32-bit processors and a base package including high resolution (1056 by 864) 19-inch monochrome display, 1 MB of memory, 41 MB of internal hard disk storage (expandable to over 370 MB), three RS-232 ports, and a $14,000 price tag.

The Multimax family runs under UMAX 4.2, Encore’s version of UNIX offering the full functionality of 4.2BSD. UMAX also offers parallel and distributed processing extensions, using thousands of hardware and software locks to protect individual elements within system tables. The system features “multithreading”, a design providing simultaneous access of system resources for multiple processors. (A single shared copy of UMAX resides in the Multimax shared memory, which facilitates multithreading. A uniprocessor version of UMAX operates the HostStation 550.) Multithreading is accomplished by multiprocessing primitives based on memory locks that provide access by multiple processors to operating system resources.

“Now we can build multi-user applications with a relational database—without the time and expense of programming. THAT'S PROGRESS!"
Steve Stone, S.B. Stone & Company, Cleveland, OH
“We have been a major supplier of custom applications in the Cleveland market for seven years. We switched to PROGRESS™, cut development times in half and are now delivering solutions to our users at lower cost, faster and more easily!”
PROGRESS is the only product that lets you build applications completely in a high-level fourth-generation environment. Its English language syntax increases productivity 10 to 40 times over COBOL, BASIC, and C. What’s more, PROGRESS is a relational database with crashproof recovery, high-performance multi-user capability on large databases, and portability across UNIX™, XENIX™ and MS-DOS™. If you want to save money and time on application development, call Data Language Corporation at 617-663-5000 and ask about PROGRESS.
DATA LANGUAGE CORPORATION, 47 Manning Road, Billerica, MA 01821, 617-663-5000
PROGRESS is a trademark of Data Language Corporation, developers of advanced software technology for business and industry. UNIX is a trademark of AT&T Bell Laboratories. MS-DOS and XENIX are trademarks of Microsoft Corporation.
Circle No. 289 on Inquiry Card

UNIX EXPO: BIG APPLE STAR

It appears Manhattan was twice taken by storm in September. Hurricane Gloria did her work the week of the 23rd, one week after the second annual UNIX Operating System Exposition and Conference, and while Gloria gratefully did not live up to dire predictions, UNIX Expo seems to have met its objectives and then some.

Held this year at the New York Hilton in Rockefeller Center, UNIX Expo is a business-oriented show, seeking not only to bring UNIX people together, but to assist UNIX companies in contacting potential customers: small companies, DP/MIS personnel in larger companies, VARs—any people or organizations considering the purchase of UNIX systems. Don Berey, account executive of show sponsor National Expositions Co., Inc., said those who came saw what they were hoping for: the event boasted 120 exhibitors and 10,460 attendees. The various conferences (four tracks covering UNIX and Office Automation, UNIX in a Data Processing Environment, UNIX Business Solutions, and UNIX and PCs) “played to standing-room-only crowds”, and the tutorials, designed and developed by AT&T specifically for the show, each operated with attendance “at or near capacity”.
Berey also pointed out that a large number of exhibitors have already reserved space for next year’s Expo, to be held again in New York City, this time at the new Jacob Javits Convention Center, October 20-22.

David Chandler is the Associate Editor of UNIX REVIEW. ■

UNIX UTILITY SOFTWARE - SO YOU CAN GET ON WITH YOUR JOB.

UBACKUP: BACKUP, RESTORE, AND MEDIA MANAGEMENT SYSTEM
USECURE: SYSTEM SECURITY MANAGEMENT
SPR: PRINT SPOOLING AND BATCH JOB SCHEDULING
SSL: FULL-SCREEN APPLICATION DEVELOPMENT
S-TELEX: TELEX COMMUNICATIONS MANAGEMENT
SSE: FULL-SCREEN TEXT EDITOR

For more information, call or write. (703)734-98- \4

These products are available for most UNIX or UNIX-derivative operating systems, including System V, 4.2 BSD, 4.1 BSD, Xenix, Version 7, System III, Uniplus, and others. UNIX is a trademark of AT&T Bell Laboratories.

UNITECH SOFTWARE INC., 8330 OLD COURTHOUSE RD., SUITE 800, VIENNA, VIRGINIA 22180

THE HUMAN FACTOR

Of megaflops and multiprocessors

by Richard Morin

As noted in a previous column (January, 1985), scientists tend to have voracious appetites for computing power. This, along with their tolerance for new and unusual ideas, makes them good prospects for exploratory computer architectures. Consequently, many firms with unusual hardware designs select the scientific marketplace as their first target. Unfortunately, this has often consigned scientific users to peculiar and even barbaric excuses for operating systems. The scientists, needing lots of megaflops, haven’t been able to be choosy.

A new day has dawned, however, and UNIX is coming to the rescue. Simply by being available, adaptable, and competently designed, it has become the operating system of choice for the current breed of offbeat scientific number crunchers. The fact that it has a significant following of users and vendors doesn’t hurt either.
Manufacturers are freed to produce just number crunchers, knowing that a wide range of workstations and other supporting components will be available from other vendors. Consequently, we see a host of vector processors, multiprocessors, RISCs (reduced instruction set computers), and other machines showing up at UNIX trade shows.

The fight isn’t over, of course, and a number of non-UNIX machines still are being developed. Some of these come from old-line manufacturers whose existing operating systems are quite satisfactory, at least to their current customers. Others, such as dataflow machines, reduction machines, and inference engines, are so peculiar as to make traditional operating systems such as UNIX entirely unsuitable. Still, the fact that a large number of vendors have chosen to base all or part of their new ventures on UNIX is suggestive of a strong trend.

Before beginning our survey, a few words of warning may be in order. First, different architectures are optimized for different purposes, and a given machine may be entirely unsuitable for a given purpose, despite glowing performance figures. A vector machine that performs very well on large array calculations may be very poor at Monte Carlo analysis. Second, benchmark figures are always somewhat suspect, and published performance ratings are often chosen to favor a vendor’s product. Thus, the figures offered here are more indicative than definitive. Finally, if real money is to be spent, a purchaser is well advised to investigate the track records of the models and vendors in question. Buying a low serial number product can occasionally be an all too interesting experience.

THE HIGH END

It's lonely at the top. Only a few companies are involved in the supercomputer game, and their customers, if few, are wealthy. Addressing hundreds of megabytes of RAM, and performing hundreds of millions of instructions per second, these machines are very powerful indeed.
Some of the traditional players are still around, but a number of new companies have also arrived on the scene.

Documentation and Software from Customer Information Center
World Headquarters for AT&T Documentation

AT&T 3B Computer Manuals

Description                                                Select Code   Price
3B2 PC INTERFACE GUIDE 999-801-020IS                       981-020       $24.00
PC 6300 PC INTERFACE GUIDE 999-801-021IS                   981-021       $15.00
MF COBOL LANGUAGE REFERENCE MANUAL 999-802-003IS           982-003       $35.00
MF COBOL LEVEL II OPERATING GUIDE 999-802-004IS            982-004       $35.00
RM/COBOL LANGUAGE REFERENCE MANUAL 999-802-020IS           982-020       $40.00
RM/COBOL USER’S GUIDE 999-802-021IS                        982-021       $20.00
RM/COBOL RUNTIME GUIDE 999-802-022IS                       982-022       $10.00
dBASE II USER’S GUIDE 999-803-000IS                        983-000       $25.00
dBASE II REFERENCE MANUAL 999-803-001IS                    983-001       $25.00
INGRESS SYSTEM OVERVIEW 999-803-002IS                      983-002       $20.00
INGRESS QUERY-BY-FORMS GUIDE 999-803-003IS                 983-003       $20.00
INGRESS REPORT-BY-FORMS GUIDE 999-803-004IS                983-004       $20.00
INGRESS REFERENCE MANUAL 999-803-005IS                     983-005       $20.00
INGRESS VISUAL FORMS EDITOR USER’S GD 999-803-006IS        983-006       $20.00
INGRESS REPORT WRITER REFERENCE MNL 999-803-007IS          983-007       $20.00
INGRESS EQUEL/C PROGRAMMER’S GUIDE 999-803-008IS           983-008       $20.00
INGRESS SELF-INSTRUCTION GUIDE 999-803-009IS               983-009       $20.00
INGRESS ADMINISTRATOR’S GUIDE 999-803-010IS                983-010       $20.00
INFORMIX MANUAL 999-803-015IS                              983-015       $45.00
FILE-IT! MANUAL 999-803-016IS                              983-016       $35.00
C-ISAM MANUAL 999-803-017IS                                983-017       $35.00
MULTIPLAN MANUAL 999-804-000IS                             984-000       $35.00
BACS INSTALLATION GUIDE 999-806-024IS                      986-024       $10.00
BACS ORDER INVENTORY MANUAL 999-806-025IS                  986-025       $40.00
BACS PAYROLL MANUAL 999-806-026IS                          986-026       $40.00
BACS ACCOUNTS PAYABLE MANUAL 999-806-027IS                 986-027       $40.00
BACS ACCOUNTS RECEIVABLE MANUAL 999-806-028IS              986-028       $40.00
BACS GENERAL LEDGER MANUAL 999-806-029IS                   986-029       $40.00
BACS WORKSHEETS 999-806-030IS                              986-030       $5.00

Special Package Price
INGRESS PACKAGE OF 8 ITEMS (983-003 thru 983-010)          999-900       $120.00
BACS PACKAGE OF 5 ITEMS (986-025 thru 986-029)             999-901       $180.00

AT&T PC 6300 Documentation
REFERENCE MANUAL                                           637-400       $65.00
SERVICE MANUAL                                             637-800       $150.00
SYSTEM PROGRAMMER’S GUIDE                                  982-200       $65.00

AT&T UNIX PC 7300 Documentation
SERVICE MANUAL                                             962-030       $150.00
USER’S MANUAL                                              981-312       $65.00
PROGRAMMER’S GUIDE                                         981-313       $65.00

Special While Quantities Last

AT&T PC6300 System/Programming Software      Select Code   Original Price   Special Price
DRI CBASIC COMPILER                          021-105       $600             $350
DRI C                                        021-107       $750             $600
DRI PL/1                                     021-108       $299             $199
DRI PASCAL MT+                               021-109       $389             $299
CATALOG: THE UNIX SYSTEM V SOFTWARE CATALOG
(Fall 1984 Issue)                            307-125       $19.95           $13.95

ORDER TOLL FREE 1-800-432-6600 OPERATOR 363

AT&T The right choice.
AT&T’s Customer Information Center, Marketing Dept., 2855 N. Franklin Rd., Indianapolis, In. 46219

Cray Research (with headquarters in Mendota Heights, MN), inspired by hardware guru Seymour Cray, is the premier American producer of scientific supercomputers. Cray’s 64-bit machines, optimized for fast floating point calculations and array manipulations, are the standard of comparison for scientific number crunchers. A Cray X-MP, for example, can do about 250 megaflops (250,000,000 floating point operations per second), for a mere $5 million. The Cray 2 is reputed to be faster still.
And, naturally, Seymour is hard at work on the Cray 3.

But what about software? COS, similar to CDC’s NOS, has been Cray’s historical proprietary operating system, but that picture has changed. Cray Research is using UNIX System V on the Cray 2, and says that it will port UNIX to the other models in the near future.

Some Japanese firms (Fujitsu, Hitachi, NEC) have produced very respectable supercomputers. A lack of software, among other things, has kept these machines from being distributed effectively outside of Japan. This is in the process of changing, however, and UNIX is playing a large role. All of these vendors have announced computers that run UNIX. In addition, the powerful Japanese Ministry of International Trade and Industry (MITI) has opted for UNIX as its primary standard. It is thus only a matter of time before UNIX-based supercomputers begin to arrive from Japan.

Denelcor (Aurora, CO) has not yet produced a machine that can take on a Cray, but it expects to do so in the near future. Currently, the firm produces only the HEP1 system, composed of up to eight processors, each of which can do 16 MIPS. Previously plagued by a lack of good support software, Denelcor has recently announced the introduction of a real-time, parallel processing version of UNIX for the HEP1. The HEP2, now being prototyped, is expected to be capable of 12K MIPS, putting it firmly in the supercomputer league.

ETA Systems (St. Paul, MN), a CDC spinoff, is scheduled to deliver its first ETA-10 UNIX-based vector multiprocessor in late 1986. Delivering performance in the range of 10 gigaflops, the system will be able to support eight 64-bit vector processors, each with up to 32 MB of memory. In addition, the ETA-10 can have up to 2 GB of shared memory.

Though commercial supercomputers are generally not well optimized for scientific tasks, their powerful processing and I/O capabilities can occasionally be very useful.
With the addition of attached array processors such as those made by Floating Point Systems (Beaverton, OR), a traditional commercial mainframe such as an IBM 3084 can easily qualify for scientific supercomputer status. IBM’s interest in UNIX has been tepid to date, however, and its future directions are quite unclear. Still, IBM has (grudgingly) announced support for UNIX on its mainframe computers. Amdahl (Sunnyvale, CA) is also a name to be reckoned with in the commercial supercomputer field, and it has been a UNIX advocate for some years now.

SEARCHING FOR STATISTICS

At last. TRANSTAT! A statistical package for Unix-based systems, written in C. TRANSTAT gives you frequencies, cross-tabs, correlations, regressions, and more. Completely menu-driven with fully labeled reports. TRANSTAT allows for total control of data recoding, case selection, and missing data. Call for more information on TRANSTAT and Unix hardware, software, consulting and training.

SPECIALISTS IN UNIX COMPUTING
1700 Shattuck Avenue, Berkeley, California 94709, 415 841 1800
Dealer inquiries are welcomed.
TRANSTAT is a registered trademark of BASIS. Unix is a trademark of AT&T Bell Laboratories.

YOU CHOOSE:

                                        MLINK   CU/UUCP
Terminal Emulation Mode
  Menu-driven Interface                 Yes
  Expert/brief Command Mode             Yes     Yes
  Extensive Help Facility               Yes
  Directory-based Autodialing           Yes
  Automatic Logon                       Yes     Yes
  Programmable Function Keys            Yes
  Multiple Modem Support                Yes     Yes
File Transfer Mode
  Error Checking Protocol               Yes     Yes
  Wildcard File Transfers               Yes     Yes
  File Transfer Lists                   Yes     Yes
  XMODEM Protocol Support               Yes
  Compatible with Non-Unix Systems      Yes
Command Language
  Conditional Instructions              Yes
  User Variables                        Yes
  Labels                                Yes
  Fast Interpreted Object Code          Yes
  Program Run                           Yes
  Subroutines                           Yes
  Arithmetic and String Instructions    Yes
  Debugger                              Yes
Miscellaneous
  Electronic Mail                       Yes     Yes
  Unattended Scheduling                 Yes     Yes
  Expandable Interface                  Yes
  CP/M, MS/DOS Versions Available       Yes

The choice is easy. Our MLINK Data Communications System is the most powerful and flexible telecommunications software you can buy for your Unix™ system. And it’s easy to use. MLINK comes complete with all of the features listed above, a clear and comprehensive 275-page manual, and 21 applications scripts which show you how our unique script language satisfies the most demanding requirements.

Unix System V, Unix System III, Unix Version 7, BSD 4.2, Xenix, VM/CMS, MS-DOS, CP/M and more...

Altos, Data General, IBM, Arete, DEC, Onyx, AT&T, Kaypro, Plexus, Compaq, Honeywell and more.

Choose the best. Choose MLINK.

MLINK is ideal for VARs and application builders. Please call or write for information.

Corporate Microsystems, Inc., P.O. Box 277, Etna, NH 03750, (603) 448-5193

MLINK is a trademark of Corporate Microsystems, Inc. Unix is a trademark of AT&T Bell Laboratories. IBM is a registered trademark of IBM Corp. MS-DOS and Xenix are trademarks of Microsoft Corp. CP/M is a registered trademark of Digital Research.

Finally, any number of supercomputer designs are always brewing in assorted laboratories and universities. Many of these will never be built, and most will be of only academic interest. Still, it is this ferment that has produced many of today’s hot machines, and it will no doubt continue to be a fertile source of new computer architectures.
The ACM SIGARCH newsletters and conference proceedings contain many interesting descriptions of novel theoretical, experimental, and even commercially produced architectures. Electronics magazine is also a very good source for information on new commercially produced machines and interesting hardware trends.

CRAYETTES

It occasionally happens that one's processing requirements are not matched by a budget allowing the purchase of a multimillion dollar number cruncher. This could happen to anyone, but fortunately there are several vendors who are quite eager to help. The machines they produce, known as Crayettes, typically cost less than a megabuck, but provide as much as a quarter of the power of a Cray. Many of these machines are augmented by an assortment of vectorizing compilers and other software aids.

Several Crayette producers are making full vector processors. Two such machines, aimed directly at Cray owners, are produced by American Super Computer and Scientific Computer Systems. These companies have chosen to maintain binary compatibility with Cray 1 processors, and are even porting Cray’s COS. This strategy may be short-lived, however, in light of Cray’s move to UNIX. A number of other vendors have decided to go with UNIX, occasionally assisted by an underlying parallel kernel.

Alliant Computer Systems (Acton, MA) makes a multiprocessor 4.2BSD system that supports up to 256 MB of real memory, 2 GB of virtual memory, and a mixture of computational and interactive processors. At its full configuration of eight 32-bit vector processing computational elements, the system can reach speeds of 94 megaflops and 35 MIPS. The Alliant Fortran compiler automatically detects opportunities for parallel execution, allowing the runtime environment to perform entire DO loop bodies on multiple processors.
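The idea of spreading DO loop bodies over several processors can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how the Alliant compiler or runtime actually works: the names `loop_body` and `parallel_do` are hypothetical, the worker count of eight simply echoes the eight computational elements mentioned above, and Python threads show the structure of the technique rather than any real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def loop_body(i, a, b):
    """One iteration of the loop; it reads only element i of each input."""
    return a[i] * 2.0 + b[i]


def serial_do(a, b):
    """The ordinary loop: iterations run one after another."""
    return [loop_body(i, a, b) for i in range(len(a))]


def parallel_do(a, b, workers=8):
    """Hand each iteration to a pool of workers (hypothetical helper).

    This is safe only because no iteration uses a value produced by
    another iteration; the iterations are independent.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in index order, matching the serial loop.
        return list(pool.map(lambda i: loop_body(i, a, b), range(len(a))))


if __name__ == "__main__":
    a = [float(i) for i in range(16)]
    b = [1.0] * 16
    print(parallel_do(a, b) == serial_do(a, b))
```

A loop such as a running sum, where iteration i needs the result of iteration i-1, could not be split this way without extra coordination between the workers.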
Special hardware and software allow the system to deal with dependencies of one iteration on another.

Convex Computer Corp. (Richardson, TX) produces the C-1 64-bit pipelined vector processing system, which runs an operating system based on 4.2BSD. The C-1 is able to do 60 megaflops and handle up to 128 MB of memory, while maintaining VAX/VMS Fortran compatibility. An interesting technique known as “disk striping” is now being used by Convex. With this technique, a set of disk drives is treated as a single drive.

Great-looking TROFF output from low-cost laser printer!
■ Now! Full support for LaserJet+ ■

For several years, Textware has been licensing TPLUS† software to process the output of troff and ditroff for a wide variety of phototypesetters, laser printers, etc. Now, with TPLUS driving the LaserJet, we have again set a new standard for price/performance. By adding our Graphics Option, with DWB‡, you have the total solution to your document production requirements.

Many organizations are now getting maximum benefit from the HP LaserJet, using our TPLUS/LJ software. The low-cost LaserJet is a remarkable value on its own: 8 page per minute output speed, 300 dot per inch resolution, and typesetter-quality fonts. TPLUS gives you access to all this and more from your own system.

We support all the characters and accents needed by troff and eqn; in addition, special characters (©; logos too) can be supplied or generated to meet specific requirements. Our precise handling of rules and boxes allows you to take full advantage of tbl for forms, charts, etc.

While even LaserJet output is not in the same class as the best phototype, it is certainly well suited to documentation and a broad range of other applications. When you do have a need for phototypeset images, TPLUS and the LaserJet will save you time and money.
Preview mode lets you proof all aspects of your documents conveniently, in-house, before sending out for phototypesetting (from our UNI#TEXT service). Cross-device proofing is a standard feature of TPLUS.

The HP LaserJet printer is not only inexpensive, it is an exceptional value! Want proof? This entire ad was set in position using TPLUS on the LaserJet!

† TPLUS is a trademark of Textware Intl.
‡ Documenter’s Workbench is a trademark of AT&T

For further information, please write or call.

Also available for:
• AM 5810/5900 & 6400, APS 5 & ^ 5, CG 8400 & 8600, Mergenthaler 202
• Xerox 4045, 2700/3700 & 8700/9700
• BBN, Sun, 5620 & ‘PC’ CRTs
• Diablo, Qume & NEC LQPs
• C Itoh & Epson dot-matrix

TEXTWARE INTERNATIONAL, POBoxM, Harvard Square, Cambridge, MA 02238. Telephone: (617) UNI-TEXT

EQN examples: $\lim_{x \to \pi/2} (\tan x)^{\sin 2x} = 1$, $\prod_{k \ge 1} e^{S_k z^k / k}$, $\frac{\alpha + \beta}{\sin(x)}$

Come to TERM with your Unix/Xenix communications problems.

TERM - More Powerful. Easier To Use.

Compare These Special Features:
• Easy to remember mnemonic commands
• Online user’s manual for instant help
• Menu driven interface
• Fast - 9600 baud file transfers
• Self installing modem support
• Powerful scripting language with variables
• Wildcard file send/receive capability
• Automatic error-checking and re-transmission
• Xon/Xoff, Etx/Ack, Line and character protocols for communications with non-TERM systems
• Xmodem protocol for remote bulletin boards
• Full/half duplex emulation modes
• Automatic login and logout
• Auto-dial, auto-redial, answer and hangup
• Unlimited phone number directory for auto-dialing
• Unattended file transfers
• Remote maintenance capability
• Sample scripts included
• MS-DOS and CP/M versions available

TERM - Powerful Communications.

TERM - Unix/Xenix’s most powerful communications program.
TERM Communications Software provides a full-featured, programmable communications tool under the Unix/Xenix environment. You’ll appreciate TERM’s ease of use, compatibility with a wide user base, and ability to talk to most other systems. TERM is both smart terminal and file transfer program. It has extras you won’t find in other Unix communications programs: On-line HELP, character translation, efficient error-checking protocols and file transfers for text and binary data. TERM provides full modem control, an extensive script language, auto-login and logout functions, and can be run unattended for remote maintenance.

TERM is available NOW on the Altos 586, 2086, IBM AT, AT&T 3B2, Tandy Model 16, 6000, IBM PC/XT, M*, and many others. Find out how easy it is to get your Unix, Xenix and MSDOS machines all talking together. $295.00

TERM COMMUNICATIONS SOFTWARE. Call or write for more information.

CENTURY SOFTWARE, 9558 South Pinedale, Salt Lake City, Utah 84092, (801) 943-8386. Visa / MC

Unix is a trademark of AT&T Bell Laboratories. MS-DOS and Xenix are trademarks of Microsoft Corp. CP/M is a registered trademark of Digital Research Inc.

Concurrent writing allows data transfer rates to be multiplied, and the increased apparent size of the disk allows much larger files to be handled.

Other vendors have opted simply to produce fast multiprocessor scalar machines. The most aggressive design of this sort comes from Flexible Computer (Dallas, TX), whose FLEX/32 can combine up to 20,840 processor cards, currently based on the 32-bit NS32032 microprocessor. A key design factor, however, is the system’s ability to integrate many different kinds of processing elements. Supporting real-time as well as number crunching applications, the system allows both hardware and software reconfiguration to be done while the system is running.
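Returning briefly to the disk striping technique mentioned for the Convex C-1: the essential idea, treating several drives as one by dealing consecutive blocks across them, can be sketched as follows. This is a minimal sketch under assumptions, not Convex’s implementation; the function names `stripe` and `unstripe`, the round-robin layout, and the block size are all illustrative choices, and in-memory lists stand in for real drives.

```python
def stripe(data, n_drives, block_size):
    """Split data into fixed-size blocks and deal them round-robin to drives.

    Each "drive" is just a list of byte blocks (an assumption for the sketch).
    """
    drives = [[] for _ in range(n_drives)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        drives[block_index % n_drives].append(data[offset:offset + block_size])
    return drives


def unstripe(drives):
    """Rebuild the original bytes by reading one block from each drive in turn."""
    out = []
    depth = max((len(d) for d in drives), default=0)
    for row in range(depth):
        for drive in drives:
            if row < len(drive):
                out.append(drive[row])
    return b"".join(out)


if __name__ == "__main__":
    payload = bytes(range(10))
    layout = stripe(payload, n_drives=3, block_size=2)
    print(layout)
    print(unstripe(layout) == payload)
```

Because consecutive blocks sit on different drives, several drives can be transferring data at once, which is where the multiplied transfer rate described above comes from; the logical file can also be larger than any single drive.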
ELXSI (San Jose, CA), another multiprocessor vendor, has chosen instead to use small numbers of very powerful processors. The ELXSI product is perhaps more of a parallel mainframe than a mini-supercomputer. Its 300 MB-per-second bus supports up to twelve 64-bit processors, each of which is approximately equivalent in power to a DEC VAX 8600. The processors can share 200 MB of memory, to be quadrupled with the introduction of 256 kilobit RAM chips. A company spokesman notes that parallelization of code is often far easier than vectorization, and that such programs as SPICE are easily and efficiently run on ELXSI architecture.

In its newly announced iPSC system, Intel Scientific Computers (Beaverton, OR) has opted to use CalTech’s hypercube architecture. In this design, 2^N processing nodes are used, with each node being able to communicate directly with N other nodes. The iPSC can be purchased in configurations of 32, 64, or 128 nodes, with each node containing an 80286 CPU, an 80287 FPU, and 512 KB of memory. The system is controlled by a UNIX-based “cube manager”, which is responsible for resource management, user interface, and other support functions. At 25 to 100 MIPS and 2 to 8 megaflops, the iPSC is hardly a full supercomputer, but at $500,000, it does provide a relatively economical base for research into arbitrary multiprocessor topologies.

Literally dozens of vendors are producing multiprocessor or otherwise unusual UNIX systems. February’s UniForum trade show in Anaheim will no doubt be full of such vendors hawking their wares, with the offbeat systems standing next to the YAWN (Yet Another Workstation or Network) products. These are exciting times for hardware junkies, and UNIX continues in its role as a distributed laboratory for computer science research.

Mail for Mr. Morin can be addressed to Canta Forda Computer Lab, PO Box 1488, Pacifica, CA 94044.
Richard Morin is an independent computer consultant specializing in the design, development, and documentation of software for engineering, scientific, and operating systems applications. He operates Canta Forda Computer Lab in Pacifica, CA. ■

MORNING STAR TECHNOLOGIES

UNIX* COMMUNICATIONS
X.25 • HASP • SNA3270 • SNA3770

Drop-in communication systems for MULTIBUS* based computers. Offload the CPU intensive process of communication with the HORIZON™ Series of boards from MORNING STAR. Complete systems include your choice of hardware and software combinations to custom fit your data communication needs.

Available for: Sun Microsystems, Masscomp, Pyramid, Heurikon, Plexus, NCR Tower, Sperry 5000, Celerity and more.

Call today for more information.

Morning Star Technologies, Inc., 1760 Zollinger Road, Columbus, Ohio 43221. In Ohio call (614) 451-1883. TWX - 510 - BOO - 32*72

*UNIX is a Trademark of AT&T Bell Labs • MULTIBUS is a Trademark of Intel Corp

The Firebreathers continue on the cutting edge of high performance computers.

The most powerful line of computer systems made.
Gould PowerNodes™ and CONCEPT/32s®. Any way you slice it, they beat the VAX™. Our mainframe PN9000 and CONCEPT 32/97 are up to twice as fast as the VAX 8600. And even though the mid-range PN6000 and CONCEPT 32/67 are 30-50% smaller than the VAX 11/780, they're still up to three times more powerful.

More power for a slice of the price.
Despite their superior power, our mid-range models cost 40% less than the VAX 11/780. Our mainframes cost about 30% less than the new VAX 8600. The bottom line is more power for less money.

Operating environments that are a cut above the rest.
There’s also a choice of system software to consider. Gould’s unique UTX/32® is the first operating system to combine UNIX* System V with Berkeley BSD 4.2. So it allows you to access virtually any command format you want whenever you want.
And in real-time environments, Gould's MPX/32™ operating system offers performance that's unmatched in the industry, as well.

Delivery that’s right on the mark.
Unlike the VAX 8600, that has up to a 12 month wait for delivery, when you order either a Gould PowerNode or a CONCEPT/32 system, they’ll be shipped within 90 days ARO. You can also be sure with Gould you’re getting a computer that’s backed by years of experience, the kind of experience we used to develop the first 32-bit real-time computer.

If you need more information or just have a few questions, give us a call at 1-800-327-9716. See for yourself why VAX no longer cuts it. Go with a Gould computer and ax the VAX.

Gould computers have a big enough edge to ax the VAX.

GOULD Electronics

CONCEPT/32 and UTX/32 are registered trademarks and PowerNode and MPX/32 are trademarks of Gould Inc. VAX is a trademark of Digital Equipment Corp. UNIX is a trademark of AT&T Bell Labs.

THE FINAL FRONTIER

UNIX and scientific applications: symbiosis or antithesis?

by Joseph S. Sventek

The popularity UNIX enjoys in many segments of the computing community is hardly a secret. Historically, it has been the operating system of choice on minicomputers in academic circles. Its recent availability on supermicro computers also has made it an attractive system for the business community. Even the home computer market has been affected by Xenix, PC/IX, and various UNIX work-alikes.

There is, however, one major segment of the computational fraternity that has received UNIX with something less than enthusiasm: the scientific community. This is not to imply that UNIX cannot be applied to scientific problems; the remaining articles in this issue provide evidence to the contrary. Even so, there are some legitimate reasons for the reticence scientists have shown in adopting UNIX.
This article explores those reasons and offers a prognosis for the future success of UNIX systems in this marketplace.

A TAXONOMY OF SCIENTIFIC APPLICATIONS

The first major category of scientific applications might best be described as “computationally intensive”. These applications are of two general types: numerical simulations of physical phenomena and analysis of experimental data. Applications of either type make huge demands on a CPU’s ability to perform floating point operations (FLOPs). The operating system features with the biggest impact on the execution of these applications are unlimited process address space and high-level compilers capable of generating efficient floating point object code.

The second major category of scientific applications is made up of event-driven tasks, which can be partitioned into data acquisition systems and experimental control systems. Each application must be able to respond quickly to external events, where the required quickness depends on the system being measured or controlled. The critical services provided by an operating system in support of this category include: 1) small (or bounded) interrupt latency, 2) a user-tailorable priority scheduler, and 3) non-blocking system services accessible to user processes.

One class of applications in particular represents a combination of the “computationally intensive” and “event-driven” requirements: computer graphics. Graphical summaries of simulated or analyzed data usually entail significant amounts of floating point computation, while real-time displays supporting event-driven applications represent additional peripherals for which the processing time must be bounded.
Applications of this sort require that the operating system provide interactive display tools and standard graphics library calls for program invocation.

In summary, for a single operating system to support all major categories of scientific applications, it must provide the following facilities:

• unlimited process address space (virtual memory).
• efficient high-level language compilers.
• bounded interrupt latency.
• priority scheduling.
• user-mode asynchrony.
• standard graphics subroutine libraries.
• interactive graphics utilities.

Illustration by Heda Majlessi

A SHORT HISTORY OF SCIENTIFIC COMPUTING

Although we are primarily concerned with the relevance of UNIX to scientific computation, knowledge of the culture that has developed in scientific computing circles will help us more fully understand the situation. It is important to note that, after code breaking, computationally intensive scientific applications made the first major use of early computers. These early programs were written in machine language.

As scientific computation grew more commonplace, it became apparent that the low-level programming languages of the day (machine and assembly) were hampering productivity severely. In response to this, one of the first high-level languages, Fortran (FORmula TRANslation), was designed to permit scientists to program in a language closer to the algebraic formulas used in their initial derivations of problems. The tremendous improvement provided by Fortran quickly led to its adoption as the lingua franca of scientific computing in the early 1960s. Since most computational resources were quite scarce, Fortran compilers developed a reputation for generating very efficient object code.
One factor that often determines the envelope of experimental and theoretical science is the amount of computational power that can be brought to bear on the problems at hand. Other system considerations (like the command language, the program development environment, and the file system) are secondary to the number of FLOPs that can be performed. As a result, organizations with an insatiable appetite for FLOPs (such as the US Department of Energy laboratories) tended to order only bare hardware from supercomputer manufacturers in the early days of scientific computing. The system programming staffs of these organizations would then craft minimal batch or timesharing operating systems on top of this iron. The highest priority item during the development of these operating systems was always an optimizing Fortran compiler. Other aspects of the system were usually made compatible with previous systems, leading to virtual immortality for a number of primitive operating system interfaces.

UNIX, as it is commonly delivered, is not able to provide the facilities necessary to support event-driven applications.

As scientists started to become involved in event-driven applications, they naturally wanted to use a programming language with which they were familiar. As a result, local extensions to Fortran (both the language and its runtime library) were implemented to permit the language’s use in these real-time systems. Such local extensions generally caused a decrease in the productivity of programmers because of the mobility of scientific researchers. The fact that the originators of these extensions often were not available for later support of their software meant that others either had to live with the problems they found or had to re-write whole sections of code. To guard against this, several progressive standards were developed for the Fortran language (ANSI X3J3).
This all has served to make a good Fortran compiler essential to a scientific computer system. The compiler must accept programs written in standard Fortran and generate efficient object code. Most scientists don’t care much about the rest of the system, opting for compatibility with the past whenever a choice is available.

STANDARD UNIX SUPPORT FOR SCIENTIFIC APPLICATIONS

UNIX provides many of the facilities necessary to support the computationally intensive class of scientific applications. In particular:
• With the introduction of 3BSD, virtual memory support in UNIX systems was established. Berkeley has continued to improve the virtual memory support in BSD releases, while many commercial vendors (AT&T included) have begun to offer their own support for virtual memory.
• A Fortran compiler (f77) is provided. While this compiler correctly processes standard Fortran, some of the design decisions in the construction of the compiler prevent it from generating efficient object code. (See
Continued to page 44
A RUN THROUGH THE MILL
Experiences with scientific data analysis using UNIX
by Robert Goff

Scientific computing is a very mixed bag. It seems that scientists are particularly imaginative when it comes to devising applications that find the weaknesses in an operating system or hardware configuration. The diversity in functionality required for even relatively simple scientific data analysis leads to system complexity and performance requirements almost unheard of in other major segments of the computing industry. It also raises the inevitable question: is UNIX really up to it?
A complete answer, of course, is not possible in the space allocated to this article, nor, for that matter, in this entire issue of UNIX REVIEW. Rather, what will be attempted here is a discussion of a few of the adversities awaiting the scientific system developer and a sampling of problems that UNIX has played a key role in solving.

CONVENTIONAL WISDOM

As in any other discipline, much of the “knowledge” surrounding the use of UNIX for particular types of applications comes in the form of old wives’ tales. Most of these tales have some basis in fact or history, but none should be taken at face value without investigating the implications for the specific application at hand. Many of the restrictions that purportedly beset UNIX either only apply to a restricted set of problems or reflect deficiencies that already have been solved over the course of the operating system’s evolution. The old unstable file system bugaboo is a classic example of a malady that no longer plagues UNIX.

Conventional wisdom says if you are trying to acquire even a moderate amount of data in real time, you shouldn’t use UNIX. Everybody knows that UNIX does not function well when subjected to the high interrupt rates characteristic of this kind of machine activity. Either you won’t be able to keep up with the incoming data or you’ll have to assign the acquisition such a high priority that the machine will become sluggish, the clock will lose time, the disk will blow revs, and the crash rate will go up alarmingly. But how much of this is really true? And what does it mean to your application?

REAL WORLD EXPERIENCES. Illustration by Stephen G. Luker.

At the heart of this consideration is the question: what do you mean by real time? A similar, equally perplexing issue is the need to come to grips with what a development environment is and how that differs from a production environment.
The virtues of UNIX in a development environment are well known by now and no doubt account for a large component of the system’s popularity in the research community. After all, the major cost component in most development projects is people time, a currency that well displays the value of UNIX. But does this mean that the system’s value goes down as we get closer and closer to freezing the code?

In scientific data analysis, a similar contrast often is suggested between off-line and research mode (interactive) processing. Does it become nothing more than a question of volume, or is there something fundamentally different in the way these two types of computing are done? If a difference is suspected, is it born of convention or necessity? Can UNIX support a high volume processing environment? Alas, there is no universal answer. The question that really needs to be asked is: what is meant by “high volume”?

REAL TIME AND UNIX

As a simple illustration, I will describe a data acquisition system now under development. The hardware is an MC68000 Multibus box running 4.2BSD. We intend to acquire approximately 20 channels of seismic data and digitize it at 20 samples per second to 16-bit accuracy. The sampling operation must be synchronized with coordinated universal time (UTC) to obtain an absolute sample timing uncertainty of less than 20 milliseconds. The application also must run in the presence of other processes that route the data to its final destination over a communications link.

In view of the relatively low data rate and the short time allotted for device driver development, we chose a simple, inexpensive multiplexed analog-to-digital converter board offering two major programming modes, both of which generate an interrupt per sample.
In retrospect, it probably would have been wiser to spend about half again as much money on this module to get a board with some on-board buffering and a DMA bus interface. This was rejected, however, since all the boards that we found required some custom firmware development for an on-board micro. We simply chose not to devote that much time to this aspect of the development.

The device driver for the system took about two weeks to write and was designed to allow us to use the board in any of its programming modes with any configuration of inputs. We knew that at a rate of approximately 400 interrupts per second, problems were likely, so we paid particular attention to streamlining the interrupt handler in the hope that we might avoid burning system time at an inordinate rate. Of equal importance was the fact that we needed to maintain an overall system throughput that would allow us to keep up with all of the tasks associated with reformatting and transmitting the data, and would allow us to maintain a reasonable set of records on the state of the system’s health. The upshot is that the system had to be designed so that it wouldn’t monopolize critical system resources unnecessarily. This amounted to integrating acquisition tasks into the system in a way that was polite to other activities.

We next wrote some user-level code to read the data. This is where we really started to learn how the system would react. The first programming mode we tried, known as random mode, required that the interrupt handler supply a new channel and re-arm the converter trigger for each sample. We started experimenting with relatively low trigger rates before cranking up the speed to see if we could reach our goal of an aggregate conversion rate of 400 samples per second. The actual conversion process takes only about 0.4 milliseconds, so we could tolerate interrupt service delays of as much as 24.6 milliseconds before data overruns would occur.
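The timing budget described above can be worked through numerically. The sketch below is illustrative only: the function names are invented here, and the 25 ms trigger interval is an assumption inferred from the two printed figures (0.4 ms conversion time, 24.6 ms tolerable delay) rather than a value the text states directly.

```python
def aggregate_interrupt_rate(channels, samples_per_sec):
    """Interrupts per second when every sample raises its own interrupt."""
    return channels * samples_per_sec


def service_slack_ms(trigger_interval_ms, conversion_ms):
    """Longest interrupt-service delay that still avoids a data overrun."""
    return trigger_interval_ms - conversion_ms


# 20 channels at 20 samples per second each, as described in the article.
rate = aggregate_interrupt_rate(channels=20, samples_per_sec=20)

# ASSUMPTION: a 25 ms trigger interval, inferred from the printed figures.
slack = service_slack_ms(trigger_interval_ms=25.0, conversion_ms=0.4)

print(rate, "interrupts per second")
print(round(slack, 1), "ms of slack per conversion")
```

Keeping the interval as an explicit parameter makes it easy to see how quickly the slack shrinks as the trigger rate is raised.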
We were somewhat disappointed to learn that even at a low 100 sample-per-second rate, we experienced an unacceptable number of data overruns, even when the machine was unloaded. We tried busy/wait loops in place of kernel sleeps and used other methods that made the acquisition less polite, but it quickly became clear that we were on the wrong track. Without placing the hardware interrupt priority of the board considerably higher than we wanted, we could see that this mechanism was not going to work.

The other programming mode that was available to us is called scan mode. Under this scheme, a trigger is used to initiate an entire scan that starts at some low channel number and proceeds to some higher channel number. The trigger begins the first conversion and, upon completion, issues an interrupt. The act of reading the data initiates conversion on subsequent channels until the high channel is reached, at which time an end-of-scan is signaled and the board stops, waiting for the next trigger. Note that the sequence involves fielding just as many interrupts as random mode would require but allows the interrupts to be spread out in accordion fashion over the full time interval allotted for the scan.

In order to use scan mode, we knew we would have to give up the generality of being able to sample the channels in any order. We decided, though, that this sacrifice was of minor practical importance since the data had to be massaged before transmission anyway, and samples could be rearranged at that stage with little penalty. Of more concern was the fact that scan mode would not allow us to control absolute sample timing as well. The timing of conversions became a function of how quickly we could get around to servicing the interrupts for all the earlier channels in a scan. To solve this problem, some instrumentation was necessary to determine the length of the average scan.
If we could get a good estimate of the constant portion of the delay between the trigger and conversion for each channel, we felt we could at least remove that component of the timing error.

Using scan mode, we found that we could speed up the aggregate conversion rate to more than 1,000 samples per second in the presence of other processing before we experienced any data overruns. It should be added that the prototype system was connected to other machines at our facility via a local area network and had no disk of its own. A fairly severe test of the system came when we inadvertently left the rwho daemon running during one of our test runs. The rwho daemon presents an intermittent load to the system and makes network traffic, in particular, fluctuate wildly.

These tests led us to another modification of our approach. We had allowed the interrupt handler to place newly arriving samples into a buffer (allocated in the device driver) that could hold about 1 second of incoming data. We devised a simple wrap-around scheme for handling buffer overflows and a sequencing method to report them as they occurred. The user-level code we were using read one scan at a time and placed the data in disk files (across the network). Only when we started to add functionality did further problems arise. It seemed we just couldn’t make the program run fast enough (or get enough of the CPU) to keep up with the data. The application would run for a while and sooner or later produce a buffer overflow. I suggested that we instrument the code until we really understood where the bottleneck existed. A few hours later, one of the programmers on the project mentioned that he’d noticed rwho running while he was performing his tests, leading him to wonder if this might be the source of the trouble.
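The wrap-around buffer and overflow sequencing described above might look roughly like the following. This is a user-space model, not the driver itself: the class name SampleRing, its methods, and the decision to count whole scans are all assumptions made for illustration. The producer (standing in for the interrupt handler) never blocks; it overwrites the oldest data, and the reader learns how many scans it missed by comparing sequence numbers.

```python
class SampleRing:
    """Fixed-size wrap-around buffer; the writer never waits for the reader."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buf = [None] * capacity
        self.written = 0    # sequence number of the next scan to be stored
        self.next_read = 0  # sequence number the reader expects next

    def put(self, scan):
        """Store one scan, overwriting the oldest slot once the buffer is full."""
        self.buf[self.written % self.capacity] = scan
        self.written += 1

    def get(self):
        """Return (scan, scans_lost), or None when nothing is waiting."""
        if self.next_read == self.written:
            return None
        lost = 0
        oldest = self.written - self.capacity  # oldest sequence still held
        if self.next_read < oldest:
            lost = oldest - self.next_read     # these scans were overwritten
            self.next_read = oldest
        scan = self.buf[self.next_read % self.capacity]
        self.next_read += 1
        return scan, lost


# A reader that stalls while 50 scans arrive (scan values are just 0..49).
small = SampleRing(capacity=20)    # about 1 second at 20 scans per second
large = SampleRing(capacity=160)   # eight times larger
for n in range(50):
    small.put(n)
    large.put(n)

print(small.get())  # oldest surviving scan plus the count of lost scans
print(large.get())  # nothing lost in the larger buffer
```

With the same stall, the small ring reports lost scans while the larger one absorbs the burst, which is the trade-off between buffer size and reader latency that the text is exploring.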
Of course, his comment led us right to where the real problem had been all along: in the device driver’s small buffer size. When we expanded this buffer to accommodate 8 seconds of data, all problems of this nature disappeared.

WHAT ARE THE LESSONS?

The two solutions illustrated above are not terribly surprising in and of themselves; in fact they are fairly obvious to everyday users of UNIX. What is surprising (or at least was to me) is the magnitude of the effect they portray. I never would have thought that we could squeeze a ten-fold sampling rate increase out of our system simply by easing the critical path problem presented by the interrupt service mechanism. Nor would I have dreamed that we would have to let our user-level code ignore the incoming data stream for anything close to 8 seconds during routine processing.

What this illustrates is that, compared to smaller, less functional operating systems, UNIX gives the appearance of a higher degree of asynchrony in its management of machine activities. On the face of it, this may seem to be a disadvantage in that system buffers need to be larger and critical paths may become more numerous and difficult to forecast. But, though critical path management may be more important, the facilities provided by the operating system for doing it are more numerous and general, and the help they provide to the system developer may result in better overall system throughput when the processing load is analyzed as a whole.

As the limitations on system throughput are explored, it’s invariably found that one critical resource or another is in short supply. Under UNIX, there may be a few more procedure calls between you and the handling of an interrupt or the reading of some data (and this may affect maximum system throughput if the resource in short supply is CPU time), but UNIX also provides some tools for use in dealing with these problems.
Without extraordinary effort, anything approaching total utilization of all system resources is unlikely. So the task faced by the system developer is to offload processing from resources that are approaching saturation to those that are under-utilized. UNIX helps with these efforts.

In the final analysis, the resource that the system developer must manage most carefully is development time. After only two weeks of development, we had a device driver with only one minor deficiency, and one that was easy to remedy. Further, the level of functionality and flexibility we were able to achieve in that time was significantly improved by the completeness and generality of the model on which UNIX device driver implementations are based. The availability of facilities such as a general-purpose kernel sleep mechanism can be invaluable when efficient use of critical resources becomes imperative.

Since this application does not tax most of the resources provided by our machine (the CPU in particular), it is clear that a significant increase in capacity should be possible. The job of implementing a device driver for a higher performance A/D system would not, in my estimation, be significantly harder or more time consuming than for the simple device we have used. Further, the impact of the acquisition on the remainder of a machine’s processing load should, if anything, be more controllable since the device in question should have more intelligence and its features should be fairly accessible. Thus, custom firmware development aside, an attractive expansion path for the system seems to exist. In our case at least, this owes in part to our choice of UNIX to do the job.

DIFFERENT GOALS, DIFFERENT APPROACH

To show how some of these issues stack up in the larger scheme of things, I will describe another data acquisition system I took part in developing.
The main emphasis of this project was directed toward doing a small amount of processing on a large volume of data at a minimal hardware cost. To accomplish this, a PDP-11/23 system running RSX-11M was chosen for the development work, with the idea that we would run the production system under RT-11 once we were done.

The project involved acquiring 30 channels of data from a microwave telemetry link connected to a small array of seismic stations some 70 miles away. Each channel was to be sampled at 250 readings per second, making for an aggregate data rate of 7500 samples per second, one considerably higher than in the previous example. The only processing to perform, however, consisted of demultiplexing and time-stamping the data before passing buffer loads of it to another machine (a PDP-11/34), via DMA, for further processing.

Since no off-the-shelf interface was suitable for connection to the telemetry, we had to develop our own. We chose to build a fairly fancy device for offloading the demultiplexing from the 11/23 CPU. This DMA device only required that it be told from time to time (about once a second) the hunk of memory into which it needed to poke its data. It then signaled the CPU with an interrupt whenever a buffer filled and a new address was needed. Since disposing of the data also was to be done via DMA, most of the heavy work could be handled by specialized hardware. The main tasks for the CPU consisted of reading time information from a UTC clock, attaching that information to buffers as they were filling, doling out buffer addresses as needed, and responding to operator requests for system monitoring information.

As it turns out, several things slowed the pace of our project, not the least of which was our initial unfamiliarity with the DEC operating system and hardware.
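The division of labor between the DMA device and the CPU described above (hand out a buffer address, time-stamp it, queue it when the "buffer full" interrupt arrives) can be modeled in a few lines. This is only a sketch under stated assumptions: BufferPool and its method names are hypothetical, integers stand in for memory addresses, and the clock value is passed in rather than read from a UTC clock.

```python
from collections import deque

# One second of data at the rates given in the article (30 channels x 250/s).
SAMPLES_PER_BUFFER = 30 * 250


class BufferPool:
    """Models the CPU's bookkeeping for a self-demultiplexing DMA device."""

    def __init__(self, count):
        self.free = deque(range(count))  # buffer ids standing in for addresses
        self.filling = None              # (buffer_id, start_time) now in use
        self.ready = deque()             # filled, time-stamped buffers

    def arm(self, now):
        """Give the device its next buffer and record when filling began.

        Raises IndexError if no free buffer is left, i.e. the consumer
        has fallen behind.
        """
        buf = self.free.popleft()
        self.filling = (buf, now)
        return buf

    def on_buffer_full(self, now):
        """Interrupt path: queue the filled buffer, then re-arm the device."""
        self.ready.append(self.filling)
        return self.arm(now)

    def release(self):
        """Called after a ready buffer has been shipped to the next machine."""
        buf, stamp = self.ready.popleft()
        self.free.append(buf)
        return buf, stamp


pool = BufferPool(count=3)
first = pool.arm(now=0.0)              # device starts filling buffer 0
second = pool.on_buffer_full(now=1.0)  # buffer 0 is full, buffer 1 is armed
shipped = pool.release()               # buffer 0 handed on with its timestamp
print(first, second, shipped)
```

Because the CPU touches each buffer only about once a second, its load depends on the number of buffers handled, not on the 7500 samples per second flowing through them.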
We had decided early on that since the development of our telemetry interface would take a fair amount of time, we would have to undertake the software development in parallel. Initial development proceeded rather slowly and resulted in a system that was far from expandable either in capacity or functionality.

By the time the system finally gave in and showed signs of working, we had evolved to a standalone-system approach for the software. This is not to say that the operating system we had chosen was unsuitable or deficient, but simply that we found we were using fewer and fewer of its facilities as time went by. We finally decided that we could do without it altogether since the activities we were trying to manage were few in number and highly synchronous in nature. In doing so, we gained considerably tighter control over utilization of the resources at hand.

The definition of this system was characterized by fairly tight machine constraints, a narrow set of goals, and the fact that the desired result of our development was a black box: a prime candidate for cross-development under UNIX. After re-writing our early assembly language software in C (and purchasing a C compiler for the 11/23), we now have a system that can be modified easily and yet still boasts the high performance/cost ratio we were looking for.

WHAT ABOUT ANALYSIS?

Scientific data analysis is another class of problems for which UNIX has some unique solutions. Like people, analysis techniques come into the world, grow up, get old, and sooner or later die. During the early stages of this life cycle, a technique’s developers will try to ascertain the applicability of their new analysis tool, so a fairly rough and unsophisticated implementation of the idea usually will suffice.
If a scheme proves useful and is allowed to continue its growth toward maturity, it will be cloned many times and the copies that are sent out into society will be molded by their environment into tools directed at the specific needs of particular projects. As these techniques grow more sophisticated, some may be trained toward higher and higher degrees of specialization. This transformation may leave little of the essence from which the techniques came. At many of the stages along this evolutionary path, malleability may be the factor that determines whether a particular instantiation will be a candidate for further duplication or be discarded in favor of a younger sibling.

A class of techniques important to seismologists is derived from the realm of signal processing, and is roughly categorized as time-series analysis. This extended family of processing schemes can be sub-categorized in various ways, but the building blocks include things like dot products, vector products, FFTs, weighted sums, and matrix multiplies. These are the fingers and toes of the individual packages.

In the same way that you can tell girls from boys, some people think that you can differentiate between high-volume production packages and research tools. Does this distinction mean that the parts are never interchangeable? Given that creative surgery is often the genetic engineering methodology of choice for software developers, we might ask, “Can we graft the toe of the fullback onto the foot of the ballerina and expect good results? Will we get a female track star or end up with a clumsy dancer?”

PARTS IS PARTS

A key method for increasing the predictability of surgery is to make processing modules work more like spare parts than fingers and toes.
Over the years, computing has generated many helpful models for the fabrication and interconnection of these parts, among them processes, procedures, libraries, include files, and macros. Tools for creation, routine maintenance, repair, and replacement also abound in language compilers, linkers, editors, library managers, and file systems. It should be clear, then, that among the important features of an operating system used for development are the availability, generality, flexibility, and portability of these models and tools.

An ongoing project comes to mind that illustrates how some of the features offered under UNIX are helpful when applied to a specific class of scientific data analysis. In the August, 1985, issue of UNIX REVIEW, Paula Hawthorn describes array processors as backend machines that are used to offload specialized tasks from the general-purpose hosts that they serve. An increasingly popular configuration for these allows you to embed a few small modules in your system by plugging them right onto the host’s bus. While less capable than their larger, standalone counterparts, they are often easier to use and less expensive to purchase and maintain. We recently purchased one of these and have begun to put together some of the building blocks to do time-series analysis with it.

After installing the hardware, we began to unpack the software that came with the unit. Since several people were to work on the project, we had to try to find a sensible home for all of the various pieces, but we discovered that some of the objects we had to deal with were a little foreign to our experience. The compiler and linker supplied for the development of AP microcode followed UNIX conventions closely enough that they could be installed in an official place for all to use. But what were we to do with all those things called task files that the compiler and linker produced?
And what about the mini-operating system for the AP that has to be downloaded with each task in order to run?

During the previous several years of working with UNIX, we had adopted a uniform directory template for software development projects. Once we understood where task files fit and knew what other pieces of software would need to find them, we easily determined how to extend this system to accommodate them and even how to build makefiles to provide for their maintenance. Because the mechanism embodied in the make utility is so general, we instantly had a powerful tool at our disposal for dealing with what turned out to be a relatively complex and recalcitrant hardware/software subsystem. This turned out to have hidden benefits that showed up not long afterwards.

Following a few weeks of dealing with the all too common problems of incomplete and misleading documentation and wrestling with the hidden eccentricities that all new systems exhibit, we became fairly confident that we would be able to produce some spectacular processing by the time we received a visit from the people funding the project. Since plugging in the unit worked in much the same way as a floating point accelerator, it probably was not illogical for the clients to assume that the unit would be installed in the system in such a way that other software could continue to function in its absence, with only a speed penalty to pay. While attempting to explain to these clients why the world really didn’t work that way, it occurred to me that if we took a larger view of our objectives, maybe the world really should (and could) work that way.

One of my golden rules is that unless a really significant improvement in functionality or capacity can be achieved by hacking the kernel, we don’t.
If you consider device drivers as part of the kernel, however, I violate my rule routinely. For this reason, we rejected the idea of developing a full implementation using the trap mechanism or of modifying the C system libraries. We did so because we felt these approaches would be too intrusive for our tastes. Instead, we investigated the method of substituting a funny device driver for one we used at the time to talk to the AP. Although we knew this probably would make the kernel grow substantially larger, it looked like an attractive idea.

What we finally settled on, though (for the time being at least), was a parallel library approach. In order for a processing module using the AP to interface with the rest of the system, it must be wrapped in some C clothing (which could be woven together using Fortran, or any other language for that matter; the point is that one must have software that runs in the host). By devising a naming convention for accessing task files, we were able to implement modules that could be emulated exactly by host-only software that had been carefully placed in a parallel library, with entry points accessible by identical calling sequences. Had I the source to the AP linker and the time to do it, I truly would have liked to make the linker understand the files produced by the ar utility, thus making the job of building task files measurably easier.

The organization of some of the processing schemes we have explored suggests that, for some applications, a host module may want to download several task files successively. For this reason, we might have explored archiving task files into libraries that could be accessed at runtime. Alternatively, since AP code can be expressed in a relatively compact fashion, we also could have devised a scheme under which it could be viewed by the host software in much the same way that compile-time initialized data is viewed (this is the sort of interface provided with many APs).
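The parallel library idea, in which a host-only routine and an AP-backed routine share one calling sequence, can be illustrated with a small dispatch sketch. Everything named here is hypothetical: the ".tsk" suffix, the task_file_for naming rule, and the run_on_ap callback are stand-ins invented for illustration, since the article does not give its actual naming convention or the AP vendor's interface.

```python
import os


def host_dot(a, b):
    """Host-only dot product with the calling sequence shared by both versions."""
    return sum(x * y for x, y in zip(a, b))


def task_file_for(name, task_dir):
    """HYPOTHETICAL naming convention: one task file per module name."""
    return os.path.join(task_dir, name + ".tsk")


def make_dot(task_dir, run_on_ap=None):
    """Return a dot-product routine, AP-backed when possible, host-only otherwise.

    run_on_ap is an assumed callback (path, a, b) -> result that would
    download the task file and run it; no real AP interface is implied.
    """
    path = task_file_for("dot", task_dir)
    if run_on_ap is not None and os.path.exists(path):
        return lambda a, b: run_on_ap(path, a, b)
    return host_dot


# With no task file present, callers get the host emulation transparently.
dot = make_dot("/no/such/directory")
print(dot([1, 2, 3], [4, 5, 6]))
```

Callers never learn which version they received, which is the property that lets other software keep working in the absence of the accelerator, with only a speed penalty to pay.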
If we had adopted symbol tables, procedure calls, and file formats that were consistent with those used by existing tools, we might have been able to trick these into helping us with the scheduling and management of AP processing functions.

CONCLUSIONS

Naturally, the examples offered here fall short of a thorough analysis of the applicability of UNIX to scientific applications, even within the restricted set of problems they address. UNIX is constantly evolving and most of us in the scientific research community hope it will never lose its ability to do so. Glittering generalities about the system’s usefulness when applied to a large class of problems (such as real-time applications) are almost certainly ill-advised, but an understanding of the specifics of the application at hand may well lead to a UNIX solution.

What can be said of the solutions we eventually settled on? Were we able to achieve acceptable data acquisition throughput at an acceptable cost in hardware by using UNIX? Yes, given the other goals of our project. Were we able to sustain a high-volume processing environment under UNIX? Yes, given the flexibility required of our approach. Can specialized versions of UNIX that have been tailored for specific environments still be considered UNIX? This can be a religious question, since what you like about UNIX will largely determine what you think UNIX is or should be.

Is UNIX really grown up enough to support large-scale scientific research, given the variety of processing constraints scientific problems impose? I think it is, but only time will tell.

Mr. Goff is Vice President for Computer Applications at Science Horizons Inc., a California research firm. Before helping to found this organization, he served as Staff Scientist and Senior Systems Engineer for Systems, Science and Software (now S-Cubed) for approximately 11 years.
During this time he was a key contributor to software systems now in use at the Center For Seismic Studies, a UNIX-based, DARPA-sponsored seismic research facility. ■ UNIVERSITY OF CALIFORNIA, BERKELEY COMPUTER FACILITY MANAGER World renowned University is recruiting for an experienced indi¬ vidual to serve as Administrative/Technical Manager of a large academic computing ins allation with a multi-million dollar bud¬ get and a staff of 40 p ogrammers and technicians. Coordinate departmental staff and faculty on campus-wide networking and assist faculty researche s with the planning of new equipment. REQUIRES: Proven experience in managing comparable operations and tech¬ nical staff, preferably n an academic or institutional setting. Substantial budget/fisi al experience to develop and monitor multi-million dollar operations is essential. Must possess knowl¬ edge of technical requir rnients for computer installations. Candi¬ date will possess a UNIX background and knowledge of other major operating systems. SALARY: $47.7K—S57.6K and excellent benefits program. UC Berkeley Personnel Office 2539 Channing Way Job #06-148-11 Berkeley, CA 94720 (415) 642-2348 UC BERKELEY IS AN AA/EEO EMPLOYER Circle Mo. 291 on Inquiry Card Communications Software for _WANG .Data General _Jat&t _PRIME —EES X M-inln -foppw Micros Minis Mainframes IBM Any computer with BLAST can talk to any other computer with BLAST, the universal file transfer software linking many different computers, operating systems, and networks No add-on boards, use any asynchronous modems or direct-connect for fast, error-free data transfer, even via noisy phone lines, satellites, LANS, and packet networks $250/micros $495-895minis $2495 up/mainframes Communications Research Group 1-80024BLAST 8939 Jefferson Hwy Baton Rouge. LA 70809 504-923-0888 Circle No. 290 on Inquiry Card J\ichardson. TX, seems like an unlikely locale for a guy from Brooklyn, but Steve Wallach seems to have adjusted to his new home. 
As the Vice President of Technology for an expanding, young company, he has taken part in the area's growth, and as a fellow with big ideas, he has demonstrated a Texas-style appreciation for the proper scale of things.

It was a little over a year ago that Convex unveiled the C-1, a 64-bit integrated vector processing system said to offer a quarter of the power of a Cray for a tenth of the cost. The numbers the C-1 has put up, in fact, are impressive: able to handle up to 128 MB of memory, the machine has been benchmarked at 60 megaflops.

The notoriety this innovation has brought is familiar stuff to Wallach. Indeed, it was Wallach's own notoriety that helped launch Convex. As one of the featured characters in Tracy Kidder's best-selling The Soul of a New Machine (Little, Brown, 1981), Wallach has long had a reputation good for opening doors. In the book, Wallach was portrayed in his role as the principal architect behind Data General's 32-bit Eclipse supermini series. Later, he served as Product Marketing Manager for Rolm Corporation's 32-bit MIL-spec minicomputer.

To explore the suitability of UNIX for number crunching, UNIX REVIEW asked Rob Warnock, who is himself an independent computer architect with nearly 20 years of experience, to ask Wallach about some of the problems that already have been dealt with, as well as some of those that have not.

THINKING BIG
An interview with Steve Wallach
Photos by Don Johnson

REVIEW: There seem to be a large number of companies producing what you call "affordable supercomputers". Why is that?

WALLACH: The reason is actually fairly simple. There's a phenomenal gap in the market. When we started this company [Convex], the companies making 32-bit superminis were talking about how the next big market was going to be office automation. It was clear that they weren't going to do anything to solve the problems of the people trying to do simulations.
The folks in the scientific market just aren't interested in office automation and menu-driven editors. They want their Fortran and C programs to run fast. So we saw an opening.

New companies tend to be most successful when they're able to create a new market. Tandem created a market. Apple created a market. And we're creating a market. Much like those other companies, we've identified a niche and we're right on it. Now, like in everything else, once you see a good thing, everyone immediately follows.

REVIEW: Why did you feel it was necessary to go after the low end instead of offering full Cray performance at a better price?

WALLACH: There are 110 Crays and 50,000 VAXen (or VAX-class machines). It's a lot easier for the customer to get capital authorization for $500,000 than for $10,000,000. DEC and Data General have conditioned the market to budget $500,000 every year for a new machine. Cray, though, has not conditioned the market to spend $10,000,000 every year. It should not be forgotten that the first Crays were bought by the DoD and DOE. Even for an oil company, $10,000,000 is a lot to spend.

REVIEW: Why didn't networks of supermicros or superminis satisfy the market requirement?

WALLACH: A woman gives birth after nine months. There aren't any shortcuts; nine months is what it takes. At some point, your program performance degrades to the level of your most constrained component. You only can go as fast as the slowest chip. Micro architectures simply are inappropriate for a lot of the applications out in the market. To be quite honest, a lot of the micros are just overgrown 8-bit and 16-bit machines, but people still try to run 64-bit problems on them.

There's also another issue.
Everyone talks about CPU performance these days, but the majority of applications out there are memory-bound and I/O-bound. It's important to know how much main memory you have and how fast your I/O is. That may have more of an effect on your performance than anything else.

REVIEW: Does Amdahl's Rule apply in the computer environment? [Amdahl's Rule holds that for each instruction per second a machine should have one byte of main memory and one byte per second of I/O bandwidth.]

WALLACH: It applies to every machine, but everybody seems to get hung up on the CPU. Let me give you a simple example: I have one machine that cranks out 20 MIPS and another machine that does 10 MIPS. The 20 MIPS machine has 20 MB of memory, in keeping with Amdahl's Rule. The 10 MIPS machine has 100 MB of memory. My application happens to have 100 MB of data, which is to be referenced several times during the execution of the program. In the latter case, that means that every time I make a reference to memory, my data is always there. On the 20 MIPS machine, 80 percent of my references are going to go to disk, which means 18 msec access time per access. Guess what? The 10 MIPS machine runs the application faster.

Applications and the people who buy computers are getting more sophisticated. Rather than just running CPU benchmarks, now there are system-level benchmarks that look at I/O and determine how sensitive it is to physical memory.

REVIEW: The second part of Amdahl's Rule holds that you should have one byte per second of I/O for every byte of memory.

WALLACH: Look, at one point there was a thing called Grosch's Law. It claimed that you needed to spend the square root of price to double your performance. Now, of course, we know that's bullshit. Just compare the VAX and the MicroVAX II. Semiconductor technology obviously has trashed that law.
With higher-performance I/O and higher-performance main memory, I think the rules have changed. Like with the Cray 2 and its two gigabytes of physical memory: if you put everything in memory, who cares about I/O?

REVIEW: Are you already feeling a push to put more than 128 MB on your system?

WALLACH: Yes, a strong push, in fact. When we started the company and said we were intending to build a system with 128 MB of memory, people said, "God, what a waste of time and effort! Who in hell is going to buy that? Just think of the expense!" That's because no one anticipated that the price of 256K RAMs would fall as much as it has. We now hear that, "Customers lust for memory."

I always explain this by telling a joke: A guy walks up to an Indian who is saying, "Chance, chance." So the fellow asks, "'Chance'? I thought Indians always said 'how'?" "No", says the Indian. "Already know how, just want chance." Supercomputer users always knew what they could do with more physical memory if they had it, but they never got the chance because, unless you worked for the government, you couldn't get your hands on a machine with a gigabyte of memory. Now, though, we actually have sold several machines with over 100 MB of memory.

Thus, the question becomes: do you really have to have all that memory? Yes, even for UNIX. This is interesting: 4.2BSD has a notion of disk cache that keeps you from having to go to disk if your block data can be located by the software that maintains the cache. With all this physical memory, we can make the disk cache as big as we like, so whenever we run up against I/O benchmarks, we just define a disk cache large enough to keep us from having to go out to disk. As a result, our machines have screamed through benchmarks. Some people cry, "Foul! That's not a fair benchmark because I can't do that on my VAX", to which, of course, we respond, "Right". Then we smile and don't say anything more.
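The memory argument running through the 20 MIPS versus 10 MIPS example and the disk cache remarks can be put into rough numbers. The sketch below is only a back-of-the-envelope model, not anything Convex published: the 18 msec disk access time and the memory sizes come from the interview, while the in-memory reference time (`MEMORY_ACCESS_S`), the function names, and the assumption that references are spread evenly over the data are all hypothetical.

```python
# Back-of-the-envelope model of the 20 MIPS vs. 10 MIPS example.
# Only the disk access time and the memory sizes come from the interview;
# MEMORY_ACCESS_S is an assumed figure chosen for illustration.

DISK_ACCESS_S = 18e-3     # 18 msec per disk access (from the interview)
MEMORY_ACCESS_S = 1e-6    # hypothetical: 1 microsecond per in-memory reference


def resident_fraction(memory_mb, data_mb):
    """Fraction of the data set that fits in main memory (capped at 1.0)."""
    return min(1.0, memory_mb / data_mb)


def average_reference_time(memory_mb, data_mb):
    """Average seconds per data reference, assuming evenly spread references."""
    hit = resident_fraction(memory_mb, data_mb)
    return hit * MEMORY_ACCESS_S + (1.0 - hit) * DISK_ACCESS_S


# 20 MIPS machine with 20 MB of memory, 100 MB of application data
small_memory = average_reference_time(memory_mb=20, data_mb=100)
# 10 MIPS machine with 100 MB of memory, same data
large_memory = average_reference_time(memory_mb=100, data_mb=100)

print(f"20 MB machine:  {small_memory * 1e3:.3f} ms per reference")
print(f"100 MB machine: {large_memory * 1e3:.3f} ms per reference")
```

Under these assumptions the average reference on the smaller-memory machine is dominated almost entirely by disk time, so a factor of two in instruction rate cannot make up the difference. The same reasoning applies to a disk cache sized to hold a benchmark's whole working set.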
REVIEW: Besides disk cache, what aspects of UNIX have you found well suited to your needs?

WALLACH: Most things are fine. But the I/O structure is really lacking for a Convex-class machine.

REVIEW: What aspect of I/O is a problem? The fact that it's synchronous?

WALLACH: Yes. And to deal with that, we've added disk striping, just like you now have on a Cray. Striping allows you to take one disk file and make it go across multiple spindles. For example, we can get approximately 1 MB a second out of a single Fujitsu Eagle. A file striped across four Fujitsu Eagles, though, can get 4 MB a second of I/O. We've also added asynchronous I/O, meaning that if you do a disk or tape reference, you can keep going and use the signal mechanism of 4.2 to synchronize yourself. That's an issue that always comes up.

REVIEW: Why did you choose 4.2 over System V?

WALLACH: When we started the company in September of 1982, we knew we wanted to put together a virtual memory machine. Berkeley had a virtual memory system; System V did not. We wanted networking and TCP/IP: 4.2 had it; System V did not. And, let's face it, 4.2 traditionally has focused more on scientific applications than on commercial ones. Since we're after a scientific customer base, 4.2 made sense. We just made a business decision.

REVIEW: So is 4.2 that much better suited to scientific applications?

WALLACH: At a base level. But neither 4.2 nor System V has the capabilities that a lot of people want. They don't have the disk striping or asynchronous I/O, and they lack a lot of real-time features like pre-emptive scheduling and the ability to lock pages into physical memory. What it boils down to is that the majority of scientific users are accustomed to their VMS, CDC, or Cray operating systems. Now, when they look at UNIX, they say, "This is great. But I'm used to these five features. Put them on and you've got a sale." They don't care about UNIX, SCHMUNIX.
It's features, functions, and benefits that they want.

REVIEW: I've always thought that I'd be happy if I could get my hands on a UNIX system that had TOPS-10 real-time features.

Without FORTRIX, moving up to C can cost you a bundle!

The bundle we're referring to consists of your existing FORTRAN programs and files. Costly items you'll have to discard when you move up to C, unless you save them with FORTRIX™! Here at last is a program that automatically and rapidly converts FORTRAN code to C code, allowing you to salvage your FORTRAN material at approximately 600 lines per minute. This incredible speed allows a single programmer to convert, debug and put into operation a typical 50,000 line package in only one to two weeks. Plus, the resulting "C" program will run 15% to 30% faster than the original FORTRAN program, while occupying 35% less disk space! And the system even helps you learn coding in C language as you compare your own familiar FORTRAN programs with the corresponding C language programs generated by FORTRIX™.

There's a complete selection of FORTRIX™ versions to suit the full range of user requirements: Original FORTRIX™-C, which translates FORTRAN code to C code, allowing input data files to remain fully compatible with your new C program; FORTRIX™-C+, with the added ability to handle COMMON and EQUIVALENCE statements, character handling and direct I/O; FORTRIX™-C', the complete FORTRIX™-C+ package configured for non-UNIX* systems including VAX/VMS; and FORTRIX™-C/micro, standard FORTRIX configured for use on the IBM PC and compatibles. FORTRIX™ has already been installed on 26 different brands of hardware, so whichever FORTRIX™ version meets your needs, you can be sure it will exceed your expectations in terms of speed and cost savings realized. Why not act now to save your bundle?
Get full technical details, plus references from among over 100 satisfied licensees, from Jim Flynn at (212) 687-6255, Extension 44, or write to him at Rapitech Systems Inc., Dept. A2, 565 Fifth Avenue, New York, NY 10017.

FORTRIX™ Fortran-to-C Conversionware™ from Rapitech Systems Inc.
Telephone (212) 687-6255/Telex 509210
*UNIX is a trademark of AT&T Bell Laboratories
Circle No. 256 on Inquiry Card

WALLACH: Then you might be interested in knowing that we have a customer who's bringing up a TOPS-20 shell on top of UNIX. There are a lot of programmers who are used to TOPS-20.

REVIEW: In light of that, why has UNIX taken over the supercomputer so quickly?

WALLACH: It's very simple: standards. In any DP shop, the biggest life-cycle cost is Joe in the software department. When people come out of school today, they tend to know UNIX. If you're trying to hire programmers, you pay attention to that because you know if you have UNIX, your new hires are productive after a week. If you have some proprietary operating system, you're looking at a six-month training cycle. That and the transportability of code are the name of the game for UNIX.

REVIEW: At one time, many vendors were afraid of standard operating systems, fearing that their customers might leave. Has that changed?

WALLACH: The bigger you are, the less you like standards because you want to lock people in. Standards mean that the lock these companies once had isn't a lock any more.

REVIEW: Has your commitment to UNIX caused you any business problems?

WALLACH: The only problems relate back to the fact that while these people want to use UNIX, they also want maybe 10 functions from their old operating system, like one for tape handling, for example. You know, UNIX is not very big on handling magnetic tapes. But certain industries are very dependent on tape: the geophysical [oil] industry, for example.
UNIX also does not support IBM communications, but, believe it or not, there are still lots of people out there who want IBM communications. So the problem is that while UNIX is a very good development system, it has some real drawbacks in a production environment. Most of the time we've spent on UNIX has been used to build up production and real-time capabilities, as well as some system management features so that customers don't have to hire a Ph.D. from UC Berkeley to handle their system administration.

One of the things that we've found as we've added these features, though, is that people will say, "That's not within the UNIX philosophy. That's not UNIX-like." But our idea is that if there's someone out there who has hard cash and wants to buy some machines that have certain features, we're going to say, "Yes, sir."

REVIEW: What about compatibility?

WALLACH: These extra features always are extensions, not modifications.

REVIEW: Besides the need for extra features, have you come across aspects of UNIX or C that have caused you problems? I would imagine, for instance, that you're more used to working with Fortran arrays than with C's pointer types.

WALLACH: That's correct. We are now doing a vectorizing C compiler that will be available shortly. It's an adaptation of our Fortran, so we've developed a new compiler technology for it.

REVIEW: So you're actually looking at how people use C and then optimizing that?

WALLACH: Yes, that's very important, in fact. A lot of our optimizations and a lot of our features come from application software. Rather than figuring out what to do next, we let benchmarks and user code drive the functionality. We're a very market-driven company.
I can point to features in the architecture, the compiler, and the operating system, and then point to pieces of major third-party software that stress these features. Asynchronous I/O is in all the finite element codes, like NASTRAN and ANSYS. Striping is useful in fluid dynamics code, as seismic interpretation is in reservoir models.

It's my opinion that the companies that succeed will be the ones that recognize what their customers need. Nobody wants or can afford to hire any more programmers. Companies want vendors to produce tools that will allow them to increase the productivity of the programmers they already have.

I have an interesting story about that. Among many other optimizations, our compiler does what is called "dead code elimination". That is, if a piece of code is never executed, we can detect it. Every so often, someone will bring in a benchmark that was written 10 years ago, and, like everything else, it has been patched for 10 years. We'll run it through our compiler and get back the message, "Line so and so, dead code removed". The fellow will look at that and ask, "What does that mean?" When we tell him that the code was never executed, he'll go back and start tracing and sooner or later he'll say, "I'll be darned, you're right. I've been maintaining a piece of code for 10 years that's never been executed."

When we first started, the only thing we pitched was MIPS and megaflops. Don't get me wrong, we still do that. But, more and more, the buy decisions are being made on the basis of productivity issues.

Not to downplay the importance of hardware, but we now have more software people than hardware people. That's basically where you have to focus. Hardware people, of course, can be much more productive now because of CAD.
In fact, compared to the [Data General] MV-8000, we had less people working on the design of this machine [at Convex], even though it's probably an order of magnitude more complex than the MV-8000. That's a very good milestone in my book.

I should point out that in all my years of computer development, I have never worked with a more talented or gifted team of people than here at Convex. In 15 months, the designers had a prototype working with full VLSI, running UNIX and executing code generated by our Fortran compiler.

REVIEW: Are there any aspects of UNIX that seem to cause problems for large-machine architectures? Right off hand, I would think that heavy use of character-at-a-time I/O would cause a lot of context switching.

WALLACH: We put a sledgehammer to that. All the character I/O is off-board on IOPs [I/O processors]. One of the best experiences we had with that occurred on a prototype. We were printing out something when the CPU stopped, but we didn't know it. We were still printing voluminous lines and pages. That's because there are no device drivers in the CPU; they're all on IOPs. So what happens is that when you print on a line printer, the kernel executing on the CPU supplies a byte number and byte count. It then interrupts the IOP, and leaves it to do its own work. That's what an MC68000 is really good for, as opposed to a high-speed processor. We can use 68000s to totally offload all disk I/O, all tape I/O, and all character I/O.

REVIEW: Does this make it possible for customers, for example, to write their own device drivers without having to learn your machine language?

WALLACH: Yeah, and what is more, all the device drivers are written in C. Even the diagnostics are written in C. You can go from disk to tape over the I/O bus without using the CPU. The thing is: when we built this machine, we built a system.
My experience has taught me that, while everyone focuses on CPUs, they let the I/O go by the wayside. But you've got to hit it with a sledgehammer, which by the way is something DEC hasn't done yet. The VAX is a very slow "eye opener". It was built as a minicomputer, you know.

REVIEW: Speaking of minicomputers, has your Data General experience, your notoriety with the MV-8000, been an asset or a liability?

WALLACH: Actually, it's been a massive asset. Since my life is a living resume, there's very little I could hide even if I wanted to. Companies that we deal with can figure out that we didn't just decide to build this machine yesterday even though they don't actually know what we were doing. There's a big advantage in having a very public resume. You know, the old joke is that when Bobby Thomson hit his home run in Ebbets Field, there were 30,000 people there, but 10 years later, if you had walked around Brooklyn, you'd have sworn that 300,000 people had been at the game. We've all seen resumes of people we've worked with five years earlier and seen things we know aren't true. It's great to have some credibility.

REVIEW: The process is called "due diligence", I believe.

WALLACH: That's right. When we were raising money, the venture capitalists made a run on bookstores to get copies of The Soul of a New Machine, because I would tell them, "Look, just read the book. It's fairly accurate."

REVIEW: You seem to enjoy attacks on the big powers. Does that belong in your resume?

WALLACH: Absolutely. In fact, the biggest thrill for me is the challenge. I've never backed away from one yet. I'm one of those people who shouldn't be kept around if I'm not motivated. You'd be better off getting a clerk to do the job. I think a lot of people around me feel the same way. We're doing battle now with the big powers, which for reasons known only to themselves have lost their focus on the scientific market.
If you're working on something you believe in, are having a bit of fun, and are making some money to boot, who can ask for more?

REVIEW: Do you find that certain general design decisions tend to live a long time and that you end up applying them over and over?

WALLACH: In a way, yes. I once met Gene Amdahl at a conference. This was back when he had just announced his 470 at Amdahl Corporation. I asked him if the machine offered IBM 1401 emulation because a lot of the 360s had it. And he responded, "The son should not pay for the sins of the grandfathers." You know, at some point we've got to stop propagating mistakes.

REVIEW: You said earlier that you had found yourself using some of the features of an APL machine you designed in 1971 in the Convex computer.

WALLACH: That's right. We used some features, some concepts, because they worked before. You know, if a wheel is round, it can be used by a car. Let's have a round wheel.

REVIEW: Is UNIX a wheel?

WALLACH: That's a good question. I think it's more a lever and fulcrum than a wheel. The best thing that can happen to UNIX, strictly from a business viewpoint, is for the schism between System V and 4.2BSD to disappear. As a manufacturer, I'd love to see it. It would be beautiful if UNIX were brought under some sort of ANSI control. Then, at least, there would be a defined document not under the control of a single manufacturer. [The IEEE P1003 Committee is, in fact, at work on a UNIX standard definition as of this writing.]

REVIEW: Is there any precedent for such independent control of operating system standards?

WALLACH: To my knowledge, no. But that doesn't mean it can't be done. Lack of precedents certainly never stopped UNIX.

HANDS-ON TRAINING THAT ISN'T SECONDHAND

When you learn the UNIX™ System directly from AT&T, you learn it from the people who develop it. So all the information you get is firsthand.
For over fifteen years, we've been teaching our people to use the UNIX System, which makes us the best trained to help you learn. The best training starts at your own terminal. That's why, at AT&T, each student gets the use of an individual terminal for real hands-on training.

Take your pick of courses from our extensive curriculum. Whatever your level of expertise, from first-time user to system developer, we have a course that will suit your individual needs. And all our courses are designed to teach you the specific skills that will soon have you using the UNIX System to organize and expand your computing system for maximum efficiency.

You also get experienced instructors, evening access to training facilities, and your choice of training centers. We can even bring our courses to your company and hold the training at your convenience. And because we are continually expanding our courses to incorporate the developments of UNIX System V, you're assured of always getting the most up-to-date information.

So take your training from AT&T. And discover the power of UNIX System V, right from the source. Call us today to reserve your seat or for a free catalog. 1-800-247-1212, Ext. 387.

Yes, I'd like some firsthand information on all UNIX System training courses.
Name, Title, Company, Address, State
Call 1-800-247-1212, Ext. 387 or send coupon to: AT&T Information Systems, P.O. Box 45038, Jacksonville, FL 32232-9974
©1985 AT&T Information Systems.
AT&T The right choice.

THE FINAL FRONTIER
Continued from Page 26

the discussion below.)

• Several implementations of subroutine libraries compliant with Graphics Kernel System are available for use on UNIX systems.
• Many standard UNIX utilities provide for interactive data analysis and display (awk, plot, hist, and S, among others).

In addition to these facilities, several other UNIX system utilities can provide significant support during the software development cycle.
These include vi, make, SCCS/RCS, and the system's various document preparation tools.

Note that UNIX, as it is commonly delivered, is not able to provide the facilities necessary to support event-driven applications. This does not imply that the system can never be used in such a capacity, but it does indicate that kernel modifications typically are required to integrate a UNIX-based computer into real-time applications.

THE FORTRAN PROBLEM

By now, it should be apparent that the single most important demand made by scientific applications of an operating system is for an optimizing Fortran compiler with a robust, standard Fortran runtime library. The f77 compiler does not satisfy this requirement very well. It is important to look into the reasons for the deficiency, and to see how the problem can be rectified.

UNIX achieved its first notoriety as a system programming and document preparation engine, neither of which requires much in the way of floating point support. C, meanwhile, is an excellent systems implementation language, and is the source language of most of the UNIX system. As a result, most work in compiler optimization for UNIX systems is devoted to C.

UNIX provides many of the facilities necessary to support the computationally intensive class of scientific applications.

Development of compilers for different high-level languages on the same system can follow two general approaches: 1) each compiles the source code directly into object code, or 2) each compiles the source code to an intermediate representation that a single code generator then can use to produce object code. In the first case, the compiler writer can generate object code that takes maximum advantage of the instruction set of the machine; in the latter, the compiler writer can generate intermediate code that maximizes use of the intermediate machine architecture.
Despite the portability and economies of scale represented by the second scenario, efficient object code generation is dependent on the richness of the intermediate machine architecture.

The f77 compiler under UNIX uses the intermediate approach for object code generation, taking advantage of the code generator offered by the system's C compiler. Unfortunately, many of the classic programming idioms employed by Fortran programmers are not typical of the way C programs use machine resources. As a result, there is a poor match between the idioms and C's intermediate machine architecture, leading to non-optimal object code for many of the most heavily used Fortran constructs.

One possible solution to this quandary is to convince scientists to use a different language. Much has been learned concerning the use of program and data structures since the first Fortran compilers appeared. The lack of general data structures and a pointer data type often causes algorithms that are really quite simple (when expressed in a modern structured language) to take on the appearance of spaghetti when expressed in Fortran. A new language will not succeed, though, unless it can be shown unequivocally to outperform Fortran in candidate scientific applications. Only then will scientists be induced to accept the startup costs of learning a new language.

Another possible way to increase the appeal of UNIX for the scientific community is to abandon f77 and develop an optimizing Fortran compiler that compiles source code directly into object code. The portability of the UNIX system permits vendors to quickly provide a proven, sophisticated, multi-programming operating system. With an optimizing Fortran compiler, these same vendors would be able to increase their penetration of the scientific and engineering market sectors.
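The second approach described in this section, where several language front ends share one code generator through an intermediate representation, can be illustrated with a toy sketch. Everything in it is hypothetical: the one-statement mini-syntax, the tuple-based intermediate form, and the imaginary stack machine are invented for illustration and do not reflect how f77 or the UNIX C compiler is actually built.

```python
# Toy illustration only: two hypothetical front ends lower their source
# statements to one shared intermediate representation (a list of tuples),
# and a single code generator turns that into pseudo-assembly.


def fortran_front_end(statement):
    """Lower a statement such as 'X = A + B' to the shared intermediate form."""
    target, expression = [part.strip() for part in statement.split("=")]
    left, operator, right = expression.split()
    return [("load", left), ("load", right), ("binop", operator), ("store", target)]


def c_front_end(statement):
    """Lower a statement such as 'x = a + b;' to the same intermediate form."""
    return fortran_front_end(statement.rstrip(";"))


def code_generator(intermediate):
    """Single back end: emit instructions for an imaginary stack machine."""
    opcodes = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}
    output = []
    for kind, argument in intermediate:
        if kind == "load":
            output.append(f"PUSH {argument}")
        elif kind == "binop":
            output.append(opcodes[argument])
        elif kind == "store":
            output.append(f"POP {argument}")
    return output


fortran_code = code_generator(fortran_front_end("X = A + B"))
c_code = code_generator(c_front_end("x = a + b;"))
print(fortran_code)
print(c_code)
```

The sketch also hints at the limitation the article raises: the back end can only exploit what the intermediate form is able to express, so a construct that has no natural encoding there (an array loop suited to vector hardware, say) reaches the code generator already flattened into generic operations.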
THE FUTURE IS NOW

The major roadblock to a more general acceptance of UNIX in the scientific community is the availability of an optimizing Fortran compiler for each particular hardware architecture. Despite the revulsion purists experience when contemplating such a project, several manufacturers are starting to pursue this approach. This is especially true among the supercomputer vendors. Note that though the Cray 2 provides a UNIX environment, it makes use of its own optimizing Fortran compiler. The same applies to Convex Computer Corp. and other "affordable supercomputer" manufacturers.

Most UNIX systems provide little of the necessary support for event-driven applications. Some companies have attempted to provide such facilities, but only at the expense of making major changes in the underlying UNIX kernel. It is also true (at least in the experimental physics community) that much of the data acquisition and experimental control performed by such systems is handled by dedicated microprocessor systems running standalone operating system kernels. The communication between these micros and other timesharing hosts typically occurs by way of standard local area networks. Thus, the need for a standard operating system to support event-driven applications is substantially reduced.

Of course, one can always hope that a structured successor to Fortran will eventually emerge. Despite the improved programming environment such a language would undoubtedly provide, its performance will have to be vastly superior to today's Fortran if it is to win general acceptance in the scientific community. In the meantime, we will find that Fortran continues to be heavily used in scientific applications, and that scientists continue to pass up UNIX systems unless they can be shown that UNIX satisfies their Fortran needs in a realistic manner.
Joe Sventek is a member of the Computer Science research staff at Lawrence Berkeley Laboratory and a member of the Computer Science faculty at the University of California at Berkeley. In a previous life, he authored programs representative of both general categories of scientific applications, none of which were crafted or run on UNIX systems.

Copyright 1985 by Joseph S. Sventek.

Q-CALC
A superior spreadsheet on UNIX*
As powerful as Lotus 1-2-3*

• large spreadsheet
• many business functions
• complete GRAPHICS package
• translates 1-2-3 models into Q-CALC
• already ported to: VAX, Callan, Fortune, c B2, Cyb, Plexus, Codata, Cadmus, Masscomp, Sun, etc.

Ideal for VARs/ISVs
Available since Jan. '84
For more information write/call Quality Software Products, 348 S. Clark Drive, Beverly Hills, CA 90211, 3-659-1560
*Lotus 1-2-3 is a trademark of Lotus Development Corp. UNIX is a trademark of AT&T.

68020 SOFTWARE TOOLS
WE ARE PROUD TO ANNOUNCE THE BIRTH OF THE NEWEST MEMBERS OF OUR 68000 FAMILY ... YOUR 68020 TOOLS ARE HERE!

TOOL KIT
• 68000/10/20 Assembler Package: Macro Cross/Native Assembler, Linker and Librarian, Cross Reference Facility, Symbol Formatter Utility, Object Module Translator
• Green Hills C 68000/10/20 Optimizing Compilers
• Symbolic Debuggers

AVAILABILITY
VAX, microVAX, 8600, Sun, Pyramid, Masscomp, IBM/PC, OASYS Attached Processors for VAX and PC, others. Runs under VMS, Bsd 4.2, System V, MS/DOS, dozens more. You name it... We provide a "One-Stop Shopping" service for more than 100 products running on, and/or targeting to, the most popular 32-, 16- and 8-bit micros and operating systems.

FEATURES
• Written in C: fast, accurate, portable.
• Supports 68000 and 68010.
• 5,000 line test suite included.
• EXORmacs compatible.
• Produces full listings and maps.
• Full Floating Point support.
• Runs native or cross. Extensive libraries.
• Supports OASYS compilers.
• Generates PROMable output and PIC.
• Outputs S-records and Tek-Hex formats.
UNIX REVIEW NOVEMBER 1985

DATA ANALYSIS THROUGH INTERACTION

Use of the S system to emphasize human effectiveness

by Richard A. Becker and John M. Chambers

S is a language and a system for the interactive analysis of data. The system has applications in any field where data is involved: financial analysis, business graphics, quality control, engineering, and many more. It runs under the UNIX operating system and is described in detail by a 550-page user’s guide, S: An Interactive Environment for Data Analysis and Graphics, by Becker and Chambers (Wadsworth, 1984). The system is currently used by businesses, universities, and research laboratories. Although it is hard to be precise, we know that there are hundreds of S sites and thousands of users.

The design goal for S, stated most broadly, is to enable and encourage good data analysis. S provides users with an environment that helps them look quickly and conveniently at many displays, summaries, and models for their data. It allows the user to follow the kind of iterative, exploratory path that most often leads to a thorough analysis. By typing simple but general expressions to the system, the user gets immediate, informative feedback, possibly including output on a graphical device. In addition, the system is open to change; even though the S system has many capabilities, a variety of mechanisms are available for extending the system as new applications and techniques appear.

OVERALL ORGANIZATION

An S user types expressions that describe the analysis to be done. Some examples can be found in Figure 1. The expressions involve a wide variety of operators and functions that carry out arithmetic and mathematical operations, statistical analyses, graphics, data manipulation, and other computations.
Expressions also use and create datasets containing data structures, such as vectors, arrays, time series, and tables. Datasets are automatically accessed by name. The S executive interactively parses expressions and controls their evaluation.

Illustration by Robert Williamson

    # read a vector of numbers from a file, create data set mydata
    mydata <- read("my.data.file")
    mydata - mean(mydata)    # subtract the mean from each value

    # Given a matrix of predictor variables longley.x
    # and a response variable longley.y
    # get the residuals from a multiple linear regression model
    r <- regress(longley.x, longley.y)$resid

    # compute the residuals
    # larger than the median absolute residual
    r[ abs(r) > median(abs(r)) ]

    Figure 1: Some sample S expressions.

The organization of S resembles that of an interactive operating system: the executive corresponds to a command interpreter, the datasets relate to files, and the functions can be equated with individual commands. Specific similarity to the UNIX system organization is probably not coincidental, although it was not conscious. There are significant differences, however. The expressions for data analysis need a richer syntax than the commands in an operating system, particularly for algebraic expressions, and data for arguments and results need more structure (commands in the UNIX system operate largely on unstructured streams of bytes).

S was designed in a research environment for statisticians who continually develop new techniques, so it was essential that the system be extensible. Some of this extension (macros and new data structures) can be done within the interpretive S language itself. Other extensions involve the creation of new S functions. Facilities for extension are intended for users; they are not restricted to those familiar with the internal workings of S.
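For readers without access to S, two of the Figure 1 expressions can be imitated in plain Python. This is only a rough sketch and not S itself: the function names center and large_residuals are invented for illustration, ordinary Python lists stand in for S vectors, and the numbers are made up.

```python
from statistics import mean, median


def center(values):
    """Subtract the mean from each value, like `mydata - mean(mydata)`."""
    m = mean(values)
    return [v - m for v in values]


def large_residuals(r):
    """Keep values whose magnitude exceeds the median absolute value,
    like `r[ abs(r) > median(abs(r)) ]`."""
    cutoff = median([abs(v) for v in r])
    return [v for v in r if abs(v) > cutoff]


residuals = [-3.0, 1.0, -0.5, 2.0]   # made-up numbers, not the Longley data
print(center([1.0, 2.0, 6.0]))       # each value minus the mean of the three
print(large_residuals(residuals))    # values larger in magnitude than the median
```

As in S, each call works on a whole vector at once; no explicit loop appears at the point of use.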
EXPRESSIONS: THE LANGUAGE

The user who types expressions to an applications system wants a combination of simplicity and flexibility. Simple requests should be straightforward and brief. At the same time, unusual but sensible requests should not be impossible or unreasonably complicated. Novice and expert users will place different emphasis on the simple or the unusual. In S, all user commands follow one general syntax: everything is an expression. The expressions that are given to S may be as short or as long as is comfortable for the user.

Expressions in S use functional and algebraic syntax, as Figure 1 shows. For users with some background in mathematics, science, or engineering, this syntax is readable and familiar. Extensions to ordinary algebraic notation introduce a few special operators; for example, a colon is a sequence operator such that x:y is a vector going in steps of ±1 from x to y.

When an expression is given to S, it is evaluated. The result may be assigned a name and thus saved as a dataset. If the result of an expression is not assigned or used inside another expression, it is printed for the user.

Algebraic notation (prefix or infix operators, in other words) is natural for functions with one or two arguments. However, data analysis quickly becomes involved with functions that have many arguments. Functions in S can have arbitrarily many arguments that can be specified by either position or name. Typical functions to carry out statistical or graphical analysis will have a few arguments to say what data is to be analyzed or plotted as well as many optional arguments to control details. Options are most easily supplied in the form name = value; the options of interest can be specified in any order. Functions return data structures that may have an arbitrary number of named components; thus, functions may have any number of inputs and produce any number of outputs.
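The colon operator and the name = value calling style described above have close Python analogues. The sketch below is an illustration only: seq imitates x:y, and summarize is a hypothetical analysis function (its name and its trim and digits options are invented here) that takes options by name and returns a result with named components.

```python
def seq(x, y):
    """Vector from x to y in steps of +1 or -1, like the S expression x:y."""
    step = 1 if y >= x else -1
    return list(range(x, y + step, step))


def summarize(data, trim=0, digits=2):
    """Hypothetical analysis function: one data argument plus optional
    name=value settings; returns a structure with named components."""
    kept = sorted(data)[trim:len(data) - trim]
    return {
        "n": len(kept),
        "mean": round(sum(kept) / len(kept), digits),
        "range": (kept[0], kept[-1]),
    }


print(seq(1, 5))
print(seq(6, 2))
# Options may be given by name, in any order.
print(summarize([9, 1, 4, 100], digits=1, trim=1))
```

Returning a dictionary is one way to mimic an S function that produces several named outputs at once.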
One of the most powerful functions in the S language is represented by the subscripting operator. Since S deals with vectors, it is natural that subscripts are also vectors. Thus:

    x[ 1:5 ]

returns the first five values in x. Since it is frequently necessary to exclude observations during data analysis, negative subscripts specify the values to be excluded:

    x[ -6 ]

returns x with the sixth value omitted.

Subscripting can also be used to answer database-like queries. Logical expressions used as subscripts cause the selection of data corresponding to TRUE values in the subscript. For example, the query “give the names of people under 25 who make more than $30,000” would be expressed as:

    name[ age < 25 & salary > 30000 ]

The subscripting operation extends naturally to multiway arrays, and in this context an empty subscript denotes all values in that subscript position. For a matrix y:

    y[ , 6:2 ]

returns all rows of columns six through two. As this example illustrates, the subscript operator can also permute data values (here reordering columns six through two).

The function order generates subscripts corresponding to a sorted version of its argument. Thus:

    x[ order(x) ]

is equivalent to:

    sort(x)

Using order also makes it simple to do passive sorting:

    name[ order(salary) ]

lists names in increasing order of salaries.

The print function, implicitly invoked whenever a result is not assigned, represents numerical results to the appropriate number of decimal places and can neatly lay out matrices, time series, multiway tables, and character data.

The function apply is able to invoke another function repeatedly on portions of data structures. In its simplest form, apply invokes a function on each of the rows or columns of a matrix. Thus:

    apply( y, 1, "mean" )

invokes mean once on each row (dimension 1) of the matrix y and returns the vector of row means.
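The subscripting rules above (positive, negative, and logical subscripts, plus order and apply) can be imitated in a few lines of Python. This is a sketch under stated assumptions rather than a description of how S is implemented: the helper names subscript, order and apply_rows are invented here, positions are 1-based to match the article, and a matrix is taken to be a list of rows.

```python
def subscript(x, idx):
    """Select from list x using an S-style subscript vector (1-based).

    idx may hold booleans (keep where True), negative integers
    (exclude those positions) or positive integers (select, in order).
    """
    if all(isinstance(i, bool) for i in idx):
        return [v for v, keep in zip(x, idx) if keep]
    if all(i < 0 for i in idx):
        drop = {-i for i in idx}
        return [v for pos, v in enumerate(x, start=1) if pos not in drop]
    return [x[i - 1] for i in idx]


def order(x):
    """1-based positions that would sort x in increasing order."""
    return sorted(range(1, len(x) + 1), key=lambda i: x[i - 1])


def apply_rows(matrix, fn):
    """Call fn once per row of a list-of-rows matrix, like apply(y, 1, ...)."""
    return [fn(row) for row in matrix]


name = ["Ann", "Bob", "Cy"]            # made-up data
age = [22, 30, 24]
salary = [35000, 50000, 20000]
young_and_paid = [a < 25 and s > 30000 for a, s in zip(age, salary)]
print(subscript(name, young_and_paid))   # the database-like query
print(subscript(name, order(salary)))    # passive sorting by salary
```

Because order returns subscripts rather than sorted values, the same result can reorder any parallel vector, which is what makes passive sorting a one-liner.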
With other choices for its second argument, apply can deal with slices of multiway arrays. Functions can also be applied over hierarchical data structures and ragged arrays.

DATA STRUCTURES AND DATA MANAGEMENT

Datasets in S contain self-describing, hierarchical (list-like) data structures. Datasets are created automatically by assignment expressions; no user control of storage is required. The elementary data structures are vectors of numbers, logical values, or character strings:

    > response
    1.01 .97 3.1 7.21
    > response > 2.5
    F F T T
    > species.name
    "Setosa" "Virginica" "Versicolor"

(Here the “>” is the S prompt for an expression.)

The numeric data modes are “real” and “integer”, but for the most part the distinction is unimportant to the user. In S, the value of the expression “3/2” is 1.5, even though in many programming languages integer arithmetic would produce an integer result of 1. A special operator is provided for integer division when it is needed. There is a special value, NA (not available), that can be used to signify missing data. Any arithmetic operations on NAs produce NAs.

General data structures consist of any number of components, each component being either a vector or another general data structure. Each component has a component name; syntactically, the component named Label of a structure z is denoted z$Label.

We designed S so that most users are unaware of the details of data structures, but also so that structures can be defined and manipulated easily to handle new analyses. Simplicity for the user is obtained because all functions that deal with a given type of data structure (for example, matrices, time series, or tree structures from clustering) recognize the structure type by looking for components with certain specific names. Functions that produce such structures as their value simply return structures with the appropriately named components.
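The arithmetic rules mentioned in this section (3/2 giving 1.5, a separate integer division, and NA propagating through arithmetic) can be illustrated in Python. This is a sketch only: using None to stand for NA and the helper name add are assumptions made here, not part of S.

```python
NA = None  # stand-in for S's NA; the choice of None is this sketch's assumption


def add(a, b):
    """Add two equal-length vectors; any NA operand yields NA."""
    return [NA if x is NA or y is NA else x + y for x, y in zip(a, b)]


print(3 / 2)    # true division, as in S
print(3 // 2)   # explicit integer division
print(add([1.0, NA, 3.0], [0.5, 2.0, NA]))
```

Letting missing values flow through a calculation, rather than stopping it, means one missing observation does not abort the whole expression.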
For example, a multiway array is defined as a structure with two vector components: one named Data containing the data values for the array (listed column-by-column), and one named Dim containing the extents of the array on each dimension. A 2 by 3 matrix, x, with data value 2i+j in the [i,j] position corresponds to the following list representation:

    ( "x" STR
      ( "Dim" INT 2 3 )
      ( "Data" REAL 3 5 4 6 5 7 ) )

Certain functions make use of a list representation of S data structures to enable structures, or entire databases, to be written to files in character form and subsequently read back in.

The ordinary user does not see this structure, however; x just appears to be a matrix. When a matrix or array is printed, it is laid out conventionally with no explicit reference to the components of the structure:

    > x
    Array: 2 by 3
         [,1] [,2] [,3]
    [1,]    3    4    5
    [2,]    5    6    7

Matrices and arrays are created and manipulated by a large number of S functions.

Data structures such as arrays or time series are so widely recognized that they are considered to be built into the language. Most of the basic functions, such as arithmetic, logic, printing, and plotting, include some special facilities for treating these structures sensibly. For example, the result of adding together two time series is a time series on the intersection of the two time domains.

A broader special class consists of vector structures, which are data structures that can act like vectors but have special structure in addition. Vector structures can be used in arithmetic and, in general, can act as a vector argument to any S function. Arrays and time series are examples of vector structures, but the class is open-ended. Internally, any structure with a vector component named Data is considered a vector structure. The Data component is the part that acts like a vector when necessary. Functions that operate element-by-element on a vector structure change the data values but leave the other components unaltered. If x is the matrix above, sin(x) produces a 2 by 3 matrix with data sin(3), and so forth, while x<4 is a matrix of logical values.

Functions that rearrange the order of elements, on the other hand, throw away the structure and leave just the data: sort(x) sorts the data values in the matrix but its result is a simple vector.
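The Dim and Data representation described above is easy to mimic with a Python dictionary. The sketch below is illustrative only: the component names Dim and Data come from the article, while the helper names (make_matrix, element, elementwise, sort_values) are invented here. It rebuilds the 2 by 3 matrix with value 2i+j and shows the two behaviours of vector structures: element-by-element functions keep the structure, reordering functions drop it.

```python
import math


def make_matrix(nrow, ncol, f):
    """Build a {Dim, Data} structure with Data listed column by column."""
    data = [f(i, j) for j in range(1, ncol + 1) for i in range(1, nrow + 1)]
    return {"Dim": [nrow, ncol], "Data": data}


def element(m, i, j):
    """Value at 1-based position [i,j] of a column-major matrix."""
    nrow = m["Dim"][0]
    return m["Data"][(j - 1) * nrow + (i - 1)]


def elementwise(fn, m):
    """Apply fn to every data value; the other components pass through."""
    out = dict(m)
    out["Data"] = [fn(v) for v in m["Data"]]
    return out


def sort_values(m):
    """Reordering discards the structure and returns a plain vector."""
    return sorted(m["Data"])


x = make_matrix(2, 3, lambda i, j: 2 * i + j)
print(x["Data"])                      # column-by-column data values
print(elementwise(math.sin, x))       # still carries Dim [2, 3]
print(elementwise(lambda v: v < 4, x))
print(sort_values(x))                 # a simple list, no Dim
```

Recognising a structure by the presence of a Data component is what lets new vector structures work with existing functions without modification.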
Since the original design of S, vector structures have been added to represent such structures as distance measures, categorical variables, and multiway tables. These structures can be used as vectors throughout the language, with no modification of the various S functions involved.

THE EXECUTIVE

The S executive performs tasks roughly comparable to an operating system command interpreter (such as the UNIX system shell). It controls most interactions with the user, parses user expressions, schedules the execution of various functions, and handles interrupts and error recovery. User expressions are accepted by a parser built using the yacc compiler-compiler with a customized lexical analyzer.

The process by which the executive invokes an S function is crucially system dependent. S consists of a large collection of functions (currently around 300). Furthermore, users must be free to write and use their own functions. The facilities of the operating system running S determine how such a collection can be maintained and used in a reasonably efficient way. Operating system constraints have forced us to use several different implementation strategies.

For the original version of S, on a Honeywell computer with a relatively primitive operating system (no virtual memory or process control), we wrote our own dynamic loader. Each S function was an overlay, read in by the executive; control was passed by a standardized transfer vector.

When we first moved S to PDP-11s running the UNIX system, the major constraint was the 16-bit program address space. For this environment, we implemented each S function as an independent program. The executive used the fork and exec operations to start up new processes, and they shared data by means of a common file and noted completion by means of signals. The current implementation on 32-bit hardware exploits the larger address space to incorporate some or all of the S functions as part of the program containing the executive.
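The PDP-11 strategy described above, one independent program per function with data exchanged through a common file, can be sketched in Python. This is a loose illustration and not the S implementation: it uses Python's subprocess module in place of direct fork and exec calls, simply waits for the child instead of using signals, and stores data as JSON. The names invoke and MEAN_PROGRAM are invented here.

```python
import json
import os
import subprocess
import sys
import tempfile

# Source for a stand-alone "function" program: it reads its arguments from
# the shared file named on its command line and writes the result back.
MEAN_PROGRAM = """
import json, sys
path = sys.argv[1]
with open(path) as f:
    args = json.load(f)
with open(path, "w") as f:
    json.dump(sum(args) / len(args), f)
"""


def invoke(program_source, args):
    """Run one function as a child process, exchanging data through a file."""
    fd, path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(args, f)
        # subprocess.run starts the child and waits for it to finish.
        subprocess.run([sys.executable, "-c", program_source, path], check=True)
        with open(path) as f:
            return json.load(f)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(invoke(MEAN_PROGRAM, [1, 2, 3, 6]))
```

Only invoke knows how a function is started; the function's own source would stay the same if the executive later called it in-process instead.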
For our goals of flexibility and extensibility, it is essential that these changes in implementation affect only the executive, not the source code for individual functions. Even in the executive, only a relatively small fraction of the code is system-dependent. This code, however, is more crucial to the reliability and efficiency of the system than its size might suggest; adapting the control of such a large-application software system to the features of a non-UNIX system is relatively difficult.

GRAPHICS

Data analysts use plots iteratively as an intimate part of their study of data. The unique role of plots comes from their information content: no other form of output conveys so much information so quickly. Users often react to plots by finding the unexpected and using this new information to shape subsequent analysis. A variety of graphical techniques for data analysis are presented in Graphical Methods for Data Analysis, by Chambers, Cleveland, Kleiner, and Tukey (Wadsworth, 1983).

S emphasizes interactive graphics as one of the most important tools in data analysis. Graphics functions in S provide the simple displays that are predominant in statistical graphics (most notably the scatter plot) in a flexible and easy-to-use form. For example:

    plot(x, y)    # scatter plot
    qqnorm(x)     # Normal probability plots

The general data structures and expressions in S help to provide graphical output from a variety of sources. Many analyses produce results that define a scatter plot: for example, a probability plot shows an ordered set of data plotted against corresponding quantiles of a probability distribution. Deviations from a straight-line pattern help assess distributional assumptions. Rather than duplicating scatter-plot software for each such plot, S functions return as their value a plotting data structure, which is passed automatically to the plot function to be displayed. The expression:

    qqnorm(mydata)

produces a probability plot of mydata against quantiles from the standard normal distribution.
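The idea of a function that returns a plotting data structure instead of drawing anything can be sketched in Python. This is an illustration under stated assumptions, not the S source: the plotting positions (i - 0.5)/n are a common convention chosen here because the article does not say which formula S uses, and the component names x and y are likewise assumed.

```python
from statistics import NormalDist


def qqnorm(data):
    """Return a plotting structure: normal quantiles (x) against sorted data (y)."""
    n = len(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n are an assumption of this sketch.
    x = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return {"x": x, "y": sorted(data)}


points = qqnorm([5.1, 4.8, 6.0, 5.5, 4.9])   # made-up data
for qx, qy in zip(points["x"], points["y"]):
    print(round(qx, 3), qy)
```

Because the result is ordinary data, it can be handed to any scatter-plot routine or used directly, for instance to fit a line through the points.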
Internally, qqnorm only generates the plotting data structure and then invokes the scatter-plot function; qqnorm needs to know nothing about plotting. The data structure consists of two vector components for the x and y coordinates of the points to plot. Once the probability plot is seen as a data structure, it is straightforward to use this structure for further analysis, by fitting some suitable line to the points in the plot, for example.

The graphical functions are not locked into specific devices because both the user-typed expression and the underlying algorithms are written independently of specific graphic devices. Actual graphical output is produced through a device driver that converts the graphics output, at a relatively low level, into commands for a particular device (see Figure 2). The commands are passed from the function to the device driver by means of a set of pipes. Drivers exist for ordinary printing terminals and a range of interactive plotting terminals.

A driver is written by implementing routines to carry out a specified set of graphic primitives (such as “draw a line” or “plot a character”), and by providing a definition of the device in terms of basic graphic parameters (for example, the device coordinate system or raster size). Incorporating a new device typically takes a few days or less; the process is straightforward enough that users can write their own device drivers by following the instructions in Extending the S System, by Becker and Chambers (Wadsworth, 1985).

Figure 2: Operation of device-independent graphics.
TOOLS: THE OPERATING SYSTEM

The complete S system contains about 6000 lines of interface language, 35,000 lines of algorithm language and 9000 lines of C code. Development and maintenance of S by the two of us requires efficient use of time. Our experience is that three aspects of the design particularly affect human efficiency: the languages in which programming is done, the tools for maintaining the application system, and the operating system interface.

We developed our own interface language and algorithm language. This may have accounted for perhaps 10 to 15 percent of our total effort, but this development has been cost-effective. Interface routines describe arguments to S functions, check for errors in arguments, allocate space for data structures, call computational routines, and return results.
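The division of labour just described, with an interface routine wrapped around a purely computational routine, can be sketched in Python. This is only an analogy: the real S interface language is compiled to Fortran, whereas here a closure does the argument checking and result naming. All names below (interface, fit_line, line) are hypothetical.

```python
from numbers import Real


def interface(arg_specs, algorithm, result_names):
    """Wrap a computational routine the way an interface routine would:
    describe and check the arguments, call the routine, name the results."""
    def s_function(**args):
        for name, kind in arg_specs.items():
            if name not in args:
                raise TypeError(f"missing argument: {name}")
            if not all(isinstance(v, kind) for v in args[name]):
                raise TypeError(f"argument {name} must hold {kind.__name__} values")
        values = algorithm(**args)
        return dict(zip(result_names, values))
    return s_function


def fit_line(x, y):
    """Computational routine: least-squares slope and intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


line = interface({"x": Real, "y": Real}, fit_line, ["slope", "intercept"])
print(line(x=[0, 1, 2], y=[1, 3, 5]))
```

Keeping fit_line free of any checking or packaging code mirrors the point that existing numerical routines can be reused unchanged behind a thin interface layer.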
If interface routines were written directly in a general language like Fortran, they would be much more complicated and error-prone, and all but the most sophisticated users would find it impossible to write their own S functions. During compilation, an interface routine typically expands into a much larger Fortran routine (representing an order of magnitude more lines of source code). Much of this expansion reflects inherent clumsiness in using Fortran to express the argument processing, dynamic storage management, and result generation encompassed in an S function. At the same time, the use of Fortran as an intermediate language is important. We could not re-implement all the basic computational algorithms previously written in Fortran.

The use of software tools is essential for creating and maintaining a system such as S. Compiler-compilers, macroprocessors, and more specialized tools ease the burden of system development. During compilation, the interface language goes through our own simple compiler, two passes of the M4 macroprocessor, RATFOR, and Fortran. Obviously, we are not trying to optimize compilation time. This multistep process, however, does enable us to modify individual steps as our needs change. Other tools are used to provide specific utilities for S developers. The make system for maintaining programs is used to generate the S executive and the individual functions.

For tools to be useful in large applications systems, they themselves should be easily adaptable. For example, our use of make is highly specialized. The interface routines and the support programs, whether based on RATFOR or C, all take advantage of special S facilities. We therefore replace and extend make’s built-in rules for compiling to include these special features. The result is a customized tool, itself built from a number of tools.

The ease with which tools are put together is also a function of the operating system environment.
The UNIX environment is convenient for developing a system such as S, both because of specific facilities and because the operating system tries not to be unnecessarily restrictive. Facilities such as pipes and a flexible command interpreter make the creation of customized tools much easier. The absence of complex rules about file formats and interprocess protocols, on the other hand, has meant that our implementation has had fewer barriers to scale than it might have otherwise.

The dependence of the current version of S on its operating system environment involves both the internal dependencies and the use of operating system features in the tools. The dependencies on computer hardware, such as machine accuracy, are relatively easy to handle. The large majority of S code passes through Fortran during compilation. Non-portable features, such as the choice of special characters and machine precision, are isolated in the macroprocessing phase and kept in a single file. The use of Fortran as an intermediate language and the parametrization of machine-dependencies make S source code quite portable.

On the other hand, implementing and using a system like S benefits from a good general computing environment. The UNIX system has allowed us to combine and modify tools to put together S. In a more restrictive system, we would have been obliged to provide more of the support environment ourselves. Perhaps most importantly, the UNIX system is being used on a variety of new computer systems, and when that is done, S goes along for free. Because of this form of portability, S currently runs on hardware ranging in size from AT&T’s UNIX PC to large IBM mainframes.

HISTORY

Work on S began at Bell Laboratories in 1976 and an initial implementation on a large Honeywell mainframe system was in use late that year.
(This was the machine left over after Bell Laboratories dropped out of the Multics project.) Starting in 1978, a version of S was developed for the UNIX system on an Interdata computer, just after the UNIX Seventh Edition port was accomplished on that machine. Since 1981, the UNIX-based version of S has been distributed outside Bell Laboratories by AT&T.

When the design of S began, a group of us at Bell Laboratories considered the statistical software that existed at the time in terms of our goal of good data analysis, particularly in an interactive, exploratory environment. We could ascertain three main approaches to doing statistics on a computer: programming in a conventional language, usually Fortran (this had been our own previous approach); mainframe statistical packages such as BMD, SAS, and SPSS; and a few interactive languages, notably APL. We recognized the need for better use of human resources than was possible when it was necessary for individuals to develop their own Fortran applications, but we found problems with the existing alternatives to Fortran.

Statistical packages arose during the 1960s and were closely modeled on the idea of sequentially processing a series of records on punched cards or magnetic tape. This model has had several bad influences. Good data analysis is highly iterative, responding to important facts observed in the analysis itself. Picturing analysis as processing a sequence of records through a limited set of statistical commands discourages this freewheeling interaction with the data. In particular, interactive use of the statistical packages was either not available or consisted largely of the ability to set up the card deck and run it from a terminal. S, on the other hand, was designed using the model of a language operating on complete datasets, interactively, in a nonsequential manner. A number of modern statistical techniques, like robust estimation,
cannot easily be expressed in the sequential form, and are therefore hard to incorporate in some of the packages.

Another result of the batch approach was the tendency to “shotgun” output, printing all the summaries likely ever to be relevant from a particular model or process. Instead, S tries to provide a wide variety of displays, particularly graphical, that can be used interactively to see summaries relevant to a particular user.

Graphics, like interaction, was not part of the original design of the mainframe packages. Since 1976, many of these packages have added graphical facilities, but the graphics tend to be viewed as “reports” rather than as integral parts of the analysis. For example, most of the graphics add-ons do not include graphic input, which in our opinion is essential for identifying important features observed in the plots.

The APL language, while not designed for statistical computing, offered a very different (and, in many ways, more attractive) approach. It was intended for interactive use, with users typing expressions that operated on whole datasets and produced immediate output at the terminal. Users can extend the language by defining interpreted functions that can then be used in the same way that primitive APL operators are. These are all features that contribute to APL’s usefulness for data analysis, and thus have been incorporated into S. The consistency and functionality of APL’s operators are also present in S; in S, however, such operations are normally carried out by functions rather than by operators. The main problems with APL are its syntax, its data structure, and its isolation from other languages.

S represents an approach to computing that emphasizes the effectiveness of the human as the most important design criterion, as shown by the emphasis on friendly interactive access to computing, information hiding, and on greater flexibility through delayed binding.
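The earlier contrast between record-at-a-time processing and functions applied to complete datasets can be made concrete with a small sketch. It is written in Python rather than S, and everything in it is an illustrative assumption rather than anything taken from S: the function names `running_mean` and `huber_location`, the Huber-type weighting, the tuning constant 1.5, and the use of the median absolute deviation as a scale. An ordinary mean fits the sequential model, since one pass over the records is enough. A robust estimate of location of this kind does not, because the weight given to each observation depends on the current estimate, so the whole dataset has to be revisited until the estimate settles.

```python
from statistics import median


def running_mean(records):
    """One sequential pass, one record at a time (the batch-package model)."""
    total, count = 0.0, 0
    for x in records:
        total += x
        count += 1
    return total / count


def huber_location(data, k=1.5, tol=1e-8, max_iter=100):
    """Iteratively reweighted location estimate with Huber-type weights.

    Hypothetical helper for illustration only. Each iteration uses the
    complete dataset, because the weights depend on the current estimate.
    """
    values = list(data)
    center = median(values)
    # Median absolute deviation, rescaled to be comparable to a standard deviation.
    scale = median(abs(x - center) for x in values) / 0.6745
    if scale == 0:
        return center  # more than half the observations are identical
    for _ in range(max_iter):
        # Observations within k*scale of the estimate keep weight 1;
        # those farther away are downweighted in proportion to their distance.
        weights = [min(1.0, k * scale / abs(x - center)) if x != center else 1.0
                   for x in values]
        new_center = sum(w * x for w, x in zip(weights, values)) / sum(weights)
        if abs(new_center - center) < tol:
            return new_center
        center = new_center
    return center


sample = [9.8, 10.1, 10.0, 9.9, 10.2, 55.0]  # made-up data with one gross outlier
print(running_mean(sample), huber_location(sample))
```

On data like this, the single-pass mean is pulled toward the outlier, while the reweighted estimate stays near the bulk of the observations. The price is repeated access to the complete dataset, which is the kind of access a strictly sequential record model makes awkward.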
Our philosophy is that the effectiveness of the human is the most important criterion for the design of any computer system.

EXPERIENCE AND EVOLUTION

A significant contribution to the evolution of S has come from user activities and experience. By far, the majority of our users are not professional statisticians. Instead, they are professionals in other areas who have a need for data analysis, graphics, or other S facilities to enhance their own work. In a number of cases, their specialized use of S has led them to develop, in effect, unique systems for their own specific user communities. This is usually done by creating a set of S macros to

Continued to Page 100

UNIX IN REAL TIME

What it takes to make the grade

by Clement T. Cole and John Sundman

As UNIX has moved from the world of computer science research into The Real World, its character has been altered. This article explores some of the modifications that have been made to support “real-time” processing and looks at a number of the demands that real-time applications make of the operating system.

Real-time operating systems can be described in terms of seven requirements. It is our contention that UNIX, suitably modified, can fill all seven. Before listing these requirements, however, it’s important to understand that they stem from applications, which, unlike UNIX, have no standard definition. (Admittedly, the /usr/group UNIX Interface Standard published in March, 1984, has its detractors, but the IEEE P1003 Portable Operating System Environment Working Group has started work on a better definition.) Because real-time applications la