Aphonlab.210
net.bugs.2bsd
utcsrgv!utzoo!decvax!ucbvax!ucsfcgl!sdcarl!sdcattb!sdcsvax!phonlab!donn
Tue Mar 9 03:50:07 1982
2.8 on the 11/34

Hello. I'm a novice kernel hacker who has taken on the job of getting Berkeley 2.8 Unix running on a pair of PDP11/34s and I have some technical questions about Unix's innards and 11/34 hardware. If you think small 11s are boring then feel free to skip to the next message...

1) Speed. We are using one of our 11s as a single-user system to which we've attached an Evans & Sutherland Picture System II graphics unit. We have molecular modelling software for the PS2 that was developed under Version 6 and it seems to run fine under V7, except that actual display manipulation runs some 5-10% slower under V7 than under V6. (Thanks to Berkeley's big disk blocks, database manipulation runs 10-45% faster.) I'm curious about where the lost time has gone and what can be done about it.

One of the first things I noticed about Berkeley 2.8 is that large 11s benefit from an "ed" script that turns calls to the priority level functions "spl[0-7]" in the assembler output of the kernel C compilation into actual SPL instructions. While the 11/34 has no SPL instruction, it can set priority with MFPS and MTPS, instructions which are equivalent to a MOV with a source (resp. destination) of the low byte of the processor status word. From my reading of the DEC Processor Handbook for PDP's, it appears that a call to one of the "spl" routines should take some 25-30 mu-sec, whereas a corresponding MFPS and/or MTPS takes 4-7 mu-sec, a substantial saving. There are lots and lots of calls to these functions, too; in the source for our currently configured kernel I counted well over 100.

I wrote a little C program that takes "spl[0-7]" and "splx" calls in assembly language and intelligently turns them into MFPS and MTPS instructions (incidentally producing a net small saving in space).
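For readers who want the flavor of such a rewrite, here is a minimal sketch; it is not the program described above. It handles only the "spl[0-7]" case and leaves "splx" calls alone, since those also need the argument push tracked. It assumes the compiler emits the call as "jsr pc,_splN" on a line by itself, that the assembler accepts "mfps" and "mtps" as mnemonics, and that it reads a bare constant as octal; the function names are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Processor priority lives in PS bits 7-5, so level n is n << 5. */
static unsigned ps_for_level(int level)
{
    assert(level >= 0 && level <= 7);
    return (unsigned)level << 5;
}

/*
 * Rewrite one line of compiler output into `out`.
 * Returns 1 if the line was a call to spl0..spl7 and was replaced,
 * 0 if it was copied through unchanged (including calls to splx).
 * (Hypothetical helper; the name and interface are illustrative.)
 */
int rewrite_spl(const char *line, char *out, size_t outsz)
{
    int level = -1;
    char tail = '\0';

    /* Match "jsr pc,_splN" followed directly by the end of the line. */
    if (sscanf(line, " jsr pc,_spl%1d%c", &level, &tail) == 2 &&
        tail == '\n' && level >= 0 && level <= 7) {
        /* Old PS byte goes to r0 (the return value), then set the new one.
         * The constant is printed in octal, an assumption about the assembler. */
        snprintf(out, outsz, "\tmfps\tr0\n\tmtps\t$%o\n", ps_for_level(level));
        return 1;
    }

    /* Anything else passes through untouched. */
    snprintf(out, outsz, "%s", line);
    return 0;
}
```

A filter built on this would read the assembler output a line at a time with fgets(), pass each line through rewrite_spl(), and write the result out.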
The kernel I get from this post-processing boots and seems to run all right (which is to say I booted it once and ran "fsck" with no ill effects, but was too nervous to persevere). Has anyone out there tried this already, or does anyone care to venture an opinion about whether this is both safe and more efficient? I'm a little suspicious of this change because if it works, why wasn't it done before?

2) Time. I noticed that one difference from V6 in 2.8 is that the priority of the clock routine in "clock.c" is (redundantly) set to level 6 just before performing callouts, while in V6 it gets set to level 5. I assume that this is done to block clock interrupts, since I can't find anything else that interrupts at that high a priority. The V6 code is written to be re-entrant, so that a clock interrupt can occur while a previous clock interrupt is still being serviced, with no harm done (as far as I can tell). Berkeley 2.8 V7 locks out clock interrupts, except at "lightning bolt" time after load metering is done.

Thus a pretty large amount of code is executed at level 6 at certain times: I can visualize a clock interrupt being lost when (say) some code somewhere else in the kernel sets priority level to 6 just before a clock interrupt wants to arrive; then when the interrupt finally does get serviced, it may have lots of callouts to do followed by a lot of load metering to do (say that it's lightning bolt time), so that by the time an "spl1" is done 34 milliseconds have been spent at priority level 6... Does this seem reasonable on an 11/34? And why do so many routines set priority to 6 to block clock interrupts when if the priority is merely elevated to 1 then the clock routine does almost nothing significant but decrement the callout timer?

The change to the clock routine to set priority to 6 is not commented, but there are remarks in e.g. "bk.c" that indicate that this has something to do with clock interrupt blocking in certain device drivers.
The particular comment in "bk.c" says that "dzrint" should not be executed at clock interrupt time in certain places so "spl5"'s need to be "spl6"'s, but the clock routine won't do any callouts at all if the priority was elevated at interrupt time... What's this all about? If I don't have any DZ's (I don't) can I change it back to 5?

3) Size. The bane of nonseparate I/D. The 2.8 distribution didn't come with "awk", a program I'm fond of using under 4.1 BSD on our VAXen. I found "awk" source on the 2.8 prerelease tape, and discovered that I needed "yacc" and "lex" to compile it. Well, it turns out that the available flavors of "yacc" on the prerelease tape (MEDIUM and HUGE) are rather too large for my poor old 11/34; with some hacking I managed to scrunch "yacc" down to about half of MEDIUM size, and it compiled. This SMALL "yacc" was sufficient to compile a SMALL "lex", but neither of these was capable of handling the parser and scanner of "awk"... I succeeded in compiling a nonseparate I/D "awk" on an 11/44 and brought it over to my 11/34, but most unfortunately it runs "out of space in ALLOC" when presented with any but the most trivial programs. Has anyone in netnews-land solved this problem with compiling "awk" before (short of buying a bigger machine)?

4) PDP11 long integer brain damage. Our facility has bought and is buying more VAXen, but we don't yet have enough cash to replace our PDP11s (partly because of the way our users are funded). One way to conserve resources and perhaps also to implement inter-machine communication is to share disks among our CPUs. We'd like to do this by dual-porting controllers. Wouldn't it be nice if it were possible to have the same filesystem structure on all our machines?
I asked Bill Jolitz about this at USENIX and he warned me that it would be a tough job, because although the filesystem structures are superficially similar, the PDP11 stores long integers the wrong way, with the high order word at the low address, unlike a VAX. All the places in the kernel where VAX long integers would be read in from disk would have to be modified to flip the words into PDP11 order.

Well, I thought of a way to accomplish this indirectly. It turns out that the most significant part of the funny word order on the 11s has to do with how they are loaded into registers (the high order word takes the lower-numbered register); instructions like ASHC, MUL and DIV can only operate on long quantities in register. I have munged a version of the C compiler to put long integers into register the way the PDP11 likes them but to keep them in memory in the right (VAX) word order. The compiler seems to work (cross my fingers) and I am about ready to start trying out programs like "mkfs" to see if they do the right thing. Before I sink too much more of my time into this, does anyone see why this wouldn't do what I want?

Sorry for the long opus; I hope people find it interesting enough to send me suggestions at ucbvax!sdcsvax!sdchema!donn or post them to the newsgroup.

Donn Seeley
UC San Diego Chemistry Dept.
ucbvax!sdcsvax!phonlab!donn
ucbvax!sdcsvax!sdchema!fred!donn
etc.

-----------------------------------------------------------------
gopher://quux.org/ conversion by John Goerzen of http://communication.ucsd.edu/A-News/

This Usenet Oldnews Archive article may be copied and distributed freely, provided:

1. There is no money collected for the text(s) of the articles.
2. The following notice remains appended to each copy:

The Usenet Oldnews Archive: Compilation Copyright (C) 1981, 1996 Bruce Jones, Henry Spencer, David Wiseman.