An internationalized version of UNIX might be excused for not including everything, but developers who need added functionality shouldn't be frozen out until hardware and storage costs have fallen enough to prompt a complete redesign. Any company prepared to shoulder the additional cost before then should be allowed to.

It should be emphasized that cost is the real issue in any of these considerations. Technical difficulty is only an issue if it makes a solution too expensive, or if it prevents a solution from being regular and expandable. Implementors who find an assignment difficult to accomplish will do well not to complain. They were hired to solve hard problems, after all; if they don't like the job, they should consider a career in politics.

COMPARING, COLLATING, AND CONVERTING

Comparisons, collations, and conversions of characters are accomplished completely independently of encoding. That's because these are semantic, rather than lexical, issues. The distinction may make it difficult to base sorting algorithms closely on encoding, but that's the implementor's problem.

Users will need table-driven routines to perform tasks like sorting, character comparison, and other similar functions if their systems are to be capable of adapting to different languages and intended uses. English itself needs help with these tasks. In British telephone directories, for instance, it would be helpful if the names "McDonald" and "MacDonald" could be sorted to the same position.
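
A minimal sketch of what such a table-driven comparison might look like in C is given here. Everything in it is an assumption made for illustration: the fold_rule structure, the directory_rules table, and the collation_key and directory_compare functions are hypothetical names, not part of any UNIX library, and the single rule shown folds only the "Mc" spelling into "Mac".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a spelling and the form it collates as. */
struct fold_rule {
    const char *prefix;    /* spelling as written, e.g. "Mc"        */
    const char *sorts_as;  /* spelling used for ordering, e.g. "Mac" */
};

/* Illustrative table for a telephone directory; a user could supply another. */
static const struct fold_rule directory_rules[] = {
    { "Mc", "Mac" },
};

/*
 * Build a lower-case collation key for name. The first rule whose prefix
 * matches the start of name is applied; the rest of the name is copied.
 * The key is truncated to fit keysize bytes and is always NUL-terminated.
 */
static void collation_key(const char *name, const struct fold_rule *rules,
                          size_t nrules, char *key, size_t keysize)
{
    size_t i, n = 0;

    assert(keysize > 0);
    for (i = 0; i < nrules; i++) {
        size_t plen = strlen(rules[i].prefix);

        if (strncmp(name, rules[i].prefix, plen) == 0) {
            const char *s = rules[i].sorts_as;

            while (*s != '\0' && n + 1 < keysize)
                key[n++] = (char)tolower((unsigned char)*s++);
            name += plen;   /* skip the spelling that was replaced */
            break;
        }
    }
    while (*name != '\0' && n + 1 < keysize)
        key[n++] = (char)tolower((unsigned char)*name++);
    key[n] = '\0';
}

/* Compare two names by collation key; same sign convention as strcmp. */
int directory_compare(const char *a, const char *b)
{
    char ka[64], kb[64];   /* assumed large enough for directory names */
    size_t nrules = sizeof directory_rules / sizeof directory_rules[0];

    collation_key(a, directory_rules, nrules, ka, sizeof ka);
    collation_key(b, directory_rules, nrules, kb, sizeof kb);
    return strcmp(ka, kb);
}
```

With this table, directory_compare("McDonald", "MacDonald") returns 0, so a sort built on it would file both spellings together. A user who wanted a different scheme would change the table rather than the code.
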
Such approaches can be extended into absurdity, of course, with the suggestion that O'Donnell also be sorted to the same position, but it is not for the implementors to say where the line should lie between a genuine language-dependent problem and absurdity. The customer decides that question. And, in keeping with that, users should be able to generate their own collating and sorting tables so that they can introduce new schemes when the standard ones aren't adequate.

Conversions. Capabilities like toupper and tolower run into some entertaining internationalization problems. As far as I can ascertain, the German lower case character ß (the "sharp s") has no upper case equivalent, but converts into "SS" instead. What does toupper return in such circumstances? A string? This is a piece of woodwork that promises to reveal many worms once the paint is peeled off.

REGULAR EXPRESSIONS

The ramifications of internationalized regular expressions are actually unlikely to concern the great majority of system users. However, implementors may want to use them to express concepts with which most of us are familiar. Some common ones include:

    [a-z]                   all lower case alphabetics
    [A-Za-z][A-Za-z0-9]*    identifiers in C (almost)
    [^A-Za-z]               any non-alphabetic character

We could go on, but let's assume that these have been used in sed and awk scripts for data validation. How then are they to deal with Norwegian, a language that is nearly as hard to pronounce as it is to write regular expressions for? All of the Scandinavian languages have the interesting feature of "extended" alphabets, with more letters than English. (Of course, the Scandinavians think that English suffers from a "restricted" alphabet.) All of the extra letters in Norwegian are vowels: æ, ø, and å.

UNIX REVIEW DECEMBER 1985

C ADVISOR

Debugging with adb

by Bill Tuthill

In many ways, effective debugging is as critical as intelligent programming.
The UNIX operating system has achieved a certain stability in the marketplace, making debugging skills all the more important as consolidation of various versions of the system proceeds. Application software also demands debugging, since there is at least one program for every known application, and at least one bug for every page of program source. Some bugs are merely a nuisance; others prevent programs from working altogether.

Debugging isn't glamorous. The joy of creating something original is largely absent. Afterwards, there's no new program to show your users. Nor can you produce pages of code to impress your manager. All you can say is that the software you've tended to works better than it did before.

To my knowledge, no university offers a course on debugging, and no textbook exists that purports to teach debugging. This is too bad, because debugging requires much skill and programmers would benefit greatly from training on the subject.

Even after Fred Brooks counseled against it in The Mythical Man-Month, novice programmers have continued to be assigned debugging chores while more experienced programmers have been allowed to write new code. It is best if programmers maintain the code they write, but this does not usually happen because talented programmers like to move on to new challenges.

On the other hand, some programmers actually enjoy debugging. Programmers who are good at the task usually fall into three categories: 1) those with good intuition and a grasp of the "big picture"; 2) those with great patience and attention to detail; and 3) those with a good debugger. It is hard to find programmers in the first two groups. And, unfortunately, the standard UNIX debugger, adb, does not qualify its users for the third group. (It should be mentioned, though, that other debuggers are available under UNIX: sdb, dbx, and cdb in particular.)

All too often programmers use printf() statements instead of employing a debugger.
This is a slow method, because code must be recompiled at every step. Intelligent use of a good debugger can yield better productivity. This article is the first of a series that describes the UNIX debuggers.

The most widely propagated UNIX debugger is adb, which first appeared on Version 7 and has been on every major UNIX release since. One reason why UNIX programmers use printf() statements instead of a debugger is that adb is so limited. It may have the worst user interface of any UNIX program. Furthermore, it is not symbolic, so you can't display C source code as you debug. Better debuggers are provided on other systems, including VMS and MS-DOS.

THE adb DEBUGGER

Most of the time, programmers use adb to find out why a program dumped core, usually by examining a stack backtrace. To ensure valuable output, it's first necessary to check that the program hasn't been stripped of its symbol tables. If it has, few of adb's features will work. Invoke the debugger as follows, where program is the pathname of the executable file that dumped core:

    $ adb program core

Also, consider the program listed in Figure 1, which de-references (references through) a NULL pointer. This is a common (but illegal) operation on VAX/UNIX, but causes a core dump on MC68000-based UNIX systems. On many machines, an assignment to address zero will cause a core dump due to a segmentation violation or memory fault.

    #include <stdio.h>
    #define LIMIT 5

    main()          /* print message and die */
    {
        int i;

        for (i = 1; i <= 10; i++) {
            printf("Goodbye world!\n");
            dumpcore(i);
        }
        exit(0);
    }

    dumpcore(lim)   /* de-reference NULL pointer */
    int lim;
    {
        int *ip;

        if (lim >= LIMIT) {
            ip = NULL;
            *ip = lim;
        }
    }

    Figure 1: A program that de-references a NULL pointer.
Here's how you could find out why the program died:

    $ adb
    core file = core / program = a.out
    memory fault
    $c
    _dumpcore[80b8](5) + 26
    _main[8074](1,fffd84,fffd8c) + 2e
    $C
    _dumpcore[80b8](5) + 26
            ip:     0
    _main[8074](1,fffd84,fffd8c) + 2e
            i:      5

The request $c yields a C stack trace, while $C yields a stack trace and also prints the value of all local variables. Other useful requests are $r to print the contents of all registers, $e to print the value of external variables, and $m to print out the memory maps. Note that the values of the local variables ip and i are just what we would expect: 0 and 5.

You can print the values of local variables in active procedures (ones that actually are located on the stack) by typing the procedure name, a period, and then the variable name, followed by a slash:

    main.i/
    fffd68:         5       = orb   #0,d0
    dumpcore.ip/
    fffd58:         0       = ???

The value of ip in the dumpcore() procedure is suspicious because it doesn't point to anything. The three question marks are an indication that something is amiss.

If you are an assembly language buff, you can see the assembler instructions at the beginning of main() by typing:

    main,5?i
    _main:
    _main:          link    a6,#0
                    addl    #-4,a7
                    moveml  #<>,sp@
                    movl    #1,a6@(-4)
                    cmpl    #a,a6@(-4)

Now you'll probably want to edit the program. To get out of adb, type CTRL-D or use the $q request. Since adb traps signals, you can't interrupt out of it.

    Some adb Format Letters

    Letter  Description
    c       one byte as a character
    o       one short word in octal
    d       one short word in decimal
    x       one short word in hexadecimal
    O       one long word in octal
    D       one long word in decimal
    X       one long word in hexadecimal
    f       single-precision floating point
    F       double-precision floating point
    i       machine instruction
    s       a null terminated character string
    a       the value of dot (the address)
    n       print a newline
    t       print a tab
    ^       decrement dot (not really a format)

    Figure 2: A table of formats for the adb debugger.

SYNTAX SUMMARY

You can examine locations in an executable file with the ?
request, or locations in a core file with the / request. These requests take the form:

    address ? format
    address / format

The address may be a number or a symbol. The current address, called dot, is set when you specify an explicit address, and can be advanced by pressing RETURN. A table of formats is given in Figure 2. The debugger remembers the last format, so once you give an address and format, RETURN advances through memory in the same format. Note that capital letters indicate added length, as in the difference between short word and long word.

Requests are different from formats because they cause adb to react, rather than simply to print data. The general form of a request is:

    address,count command modifier

This sets dot to address and executes command count times. Figure 3 lists the meaning of various adb commands. The useful commands presented earlier ($c for a stack trace, $r for the registers, and $e for the externals) are all considered miscellaneous requests.

    Some adb Commands

    Command Description
    ?       print contents from a.out file
    /       print contents from core file
    =       print value of dot
    :       breakpoint control
    $       miscellaneous requests
    ;       request separator
    !       escape to shell

    Figure 3: The meaning of various adb commands.

SETTING BREAKPOINTS

Many programmers are intimidated by the adb documentation for breakpoints, but it isn't hard to learn and is well worth the effort. The main problem is that adb can set breakpoints only at the subroutine level, not at the statement level. (Statement-level breakpoints are possible, however, with the 4.2BSD debugger dbx.) When you invoke adb, give a dash as the second argument to indicate that the core file should be ignored. (On some systems a second argument is not necessary.) This will let you run the program under the control of adb:

    $ adb a.out -
    dumpcore+4:b
    $b
    breakpoints
    count   bkpt            command
    1       _dumpcore+4

On an MC68000, set the breakpoint at the subroutine plus 4 (the first instruction sets up the stack frame pointer) and then list the breakpoints with the $b request. To run the program, enter :r. To continue the program after the breakpoint, enter :c. Do this five times, printing the variable i to make sure it works:

    :r
    Goodbye world!
    breakpoint      _dumpcore+4:    addl    #-4,a7
    main.i/
    fffd68:         1       = orb   #0,d0
    :c
    Goodbye world!
    breakpoint      _dumpcore+4:    addl    #-4,a7
    main.i/
    fffd68:         2       = orb   #0,d0

When the value of i reaches 5, the program will have a memory fault. This is because the NULL pointer is de-referenced only when lim becomes 5 or greater:

    main.i/
    fffd68:         5       = orb   #0,d0
    :c
    memory fault
    stopped at      _dumpcore+26:   moveml  a6@(-4),#<>

Now you know exactly how the program got to the point where it core-dumped. If single-stepping proceeds too slowly, you can remove breakpoints with the :d request, which has the same syntax as :b.

SOME ANOMALIES

Like ed, adb issues no prompt. When it doesn't understand a request, it types its name back at you. For example, if you forget to put the slash after a variable name, you'll see something like this:

    main.i
    adb

This is the same response you'll get if you try to interrupt it. You can print external variables simply by giving their name before the slash. Local variables, however, must be preceded by their function name. If you forget to do this, you'll see the following message:

    i/
    symbol not found

If everything fails, you are probably trying to debug an executable file that has been stripped of its symbol tables.
Make sure the file command reports the program as "executable not stripped" before starting to worry. If the file has been stripped, recompile the program and try to duplicate the bug that caused it to dump core in the first place.

Most of these readers probably also know that Apple quickly responded by filing a lawsuit against its former leader, claiming that by planning and starting a new enterprise, Jobs had violated his obligations to the company and thus had harmed it.

It's too soon, of course, to know what will come of this controversy. Possibly the entire matter will end up being settled quietly. If so, that would be that. But if this doesn't happen, the dispute has the potential to generate a landmark court decision, one which could have a major impact on the future of the computing industry.

Jobs, after all, can be considered the archetypical computer entrepreneur. He started a business with Stephen Wozniak in a garage in 1976, and then presided over the endeavor's growth into a Fortune 500 company.
In the process he not only became a rich man, but, as much as anyone, helped bring the computer out of the domain of giant corporations and into America's schools and homes.

A new industry typified by such rapid technological innovation at the hands of a large number of gifted (and generally quite young) individuals is probably without parallel in our history. Though the infant motion picture industry had its rough-and-tumble beginning, film technology stabilized relatively quickly, and control over theaters and escalating costs of film production soon limited access for new participants.

By contrast, computer technology continues to advance at dizzying speed, and no one can predict where it might take us. Moreover, unlike other 20th Century industries founded on new technologies (automobile and aircraft production, for instance), computing has continued to provide nearly limitless opportunities for perceptive individuals and small concerns. Requirements of scale and financing precluded this sort of individual entrepreneurship in other fields.

He had been Apple's vice-president until he was pushed out of that position last May. Even afterwards he remained chairman of the company's board of directors. In these positions, the law dictates that he had a greater responsibility to serve the interests of the company than even a key employee would have.

In whichever capacity, Jobs certainly could have been expected to acquire intimate knowledge of Apple's operations, at least through last April. It would be surprising if this knowledge did not encompass many, if not all, of Apple's trade secrets. In addition, it seems reasonable to assume that Jobs would have become familiar with a great deal of proprietary information which, though lacking the exalted status of a trade secret, nevertheless remains valuable to Apple in maintaining its competitive position.
Apple's suit against its former chairman was triggered by the disclosure that he intended to launch a new company oriented toward the university market, and that five of Apple's employees would be joining him in the venture. As might be expected, each side has its own version of the facts beyond these bare details.

Apple contends that the former employees who will be associated with Jobs' new company, Next, Inc., held key technical, financial, and marketing positions. Specifically, Apple's suit names Richard A. Page, who while employed at Apple allegedly worked on the type of technology that it's speculated Next intends to utilize.

Who really cared about protecting a proprietary interest in something that soon would be obsolete?