Aphonlab.210
net.bugs.2bsd
utcsrgv!utzoo!decvax!ucbvax!ucsfcgl!sdcarl!sdcattb!sdcsvax!phonlab!donn
Tue Mar  9 03:50:07 1982
2.8 on the 11/34
Hello.  I'm a novice kernel hacker who has taken on the job of
getting Berkeley 2.8 Unix running on a pair of PDP11/34s and I
have some technical questions about Unix's innards and 11/34
hardware.  If you think small 11s are boring then feel free to
skip to the next message...

1) Speed.  We are using one of our 11s as a single-user system to
which we've attached an Evans & Sutherland Picture System II
graphics unit.  We have molecular modelling software for the PS2
that was developed under Version 6 and it seems to run fine under
V7, except that actual display manipulation runs some 5-10%
slower under V7 than under V6.  (Thanks to Berkeley's big disk
blocks, database manipulation runs 10-45% faster.) I'm curious
about where the lost time has gone and what can be done about it.
One of the first things I noticed about Berkeley 2.8 is that
large 11s benefit from an "ed" script that turns calls to the
priority level functions "spl[0-7]" in the assembler output of
the kernel C compilation into actual SPL instructions.  While the
11/34 has no SPL instruction, it can set priority with MFPS and
MTPS, instructions which are equivalent to a MOV with a source
(resp. destination) of the low byte of the processor status word.
From my reading of the DEC Processor Handbook for PDP's, it
appears that a call to one of the "spl" routines should take some
25-30 mu-sec, whereas a corresponding MFPS and/or MTPS takes 4-7
mu-sec, a substantial saving.  There are lots and lots of calls
to these functions, too; in the source for our currently
configured kernel I counted well over 100.  I wrote a little C program
that takes "spl[0-7]" and "splx" calls in assembly language and
intelligently turns them into MFPS and MTPS instructions
(incidentally producing a net small saving in space).  The kernel
I get from this post-processing boots and seems to run all right
(which is to say I booted it once and ran "fsck" with no ill
effects, but was too nervous to persevere).  Has anyone out there
tried this already, or does anyone care to venture an opinion
about whether this is both safe and more efficient?  I'm a little
suspicious of this change because if it works, why wasn't it done
before?
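
To make the transformation concrete, here is a minimal sketch in C of
the kind of rewriting involved. It is not the program described above.
It assumes the compiler emits calls as "jsr pc,_splN", that the
assembler accepts "mfps" and "mtps" mnemonics, that the old priority is
wanted in r0, and that the priority field sits in bits 5-7 of the PS.
The name rewrite_spl is made up for illustration. Unlike the
"intelligent" version, it always saves the old PS and ignores "splx".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the original post): if `line` is a
 * compiler-generated call of the form "jsr pc,_splN" with N in 0..7,
 * write an inline replacement into `out` and return 1.  Otherwise
 * leave `out` alone and return 0.
 *
 * Assumptions: the "mfps"/"mtps" mnemonics are accepted by the
 * assembler, the old priority is wanted in r0, and the priority
 * field is PS bits 5-7.
 */
int rewrite_spl(const char *line, char *out, size_t outlen)
{
    const char *p = line;
    int level;

    assert(line != NULL && out != NULL);

    while (*p == ' ' || *p == '\t')
        p++;
    if (strncmp(p, "jsr", 3) != 0)
        return 0;
    p += 3;
    if (*p != ' ' && *p != '\t')
        return 0;
    while (*p == ' ' || *p == '\t')
        p++;
    if (strncmp(p, "pc,_spl", 7) != 0)
        return 0;
    p += 7;
    if (*p < '0' || *p > '7')
        return 0;               /* e.g. _splx, handled elsewhere */
    level = *p++ - '0';
    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;
    if (*p != '\0')
        return 0;               /* some other symbol, e.g. _spl6x */

    /* Priority N occupies bits 5-7 of the PS, i.e. N << 5 (octal). */
    snprintf(out, outlen, "mfps\tr0\nmtps\t$%o\n", (unsigned)(level << 5));
    return 1;
}
```

A driver would read the compiler's assembly output line by line, print
the replacement when rewrite_spl returns 1, and copy the line through
unchanged otherwise.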

2) Time.  I noticed that one difference from V6 in 2.8 is that
the priority of the clock routine in "clock.c" is (redundantly)
set to level 6 just before performing callouts, while in V6 it
gets set to level 5.  I assume that this is done to block clock
interrupts, since I can't find anything else that interrupts at
that high a priority.  The V6 code is written to be re-entrant,
so that a clock interrupt can occur while a previous clock
interrupt is still being serviced, with no harm done (as far as I can
tell).  Berkeley 2.8 V7 locks out clock interrupts, except at
"lightning bolt" time after load metering is done.  Thus a pretty
large amount of code is executed at level 6 at certain times: I
can visualize a clock interrupt being lost when (say) some code
somewhere else in the kernel sets priority level to 6 just before
a clock interrupt wants to arrive; then when the interrupt
finally does get serviced, it may have lots of callouts to do
followed by a lot of load metering to do (say that it's lightning
bolt time), so that by the time an "spl1" is done 34 milliseconds
have been spent at priority level 6...  Does this seem reasonable
on an 11/34?  And why do so many routines set priority to 6 to
block clock interrupts when, if the priority is merely elevated to
1, the clock routine does almost nothing significant except
decrement the callout timer?  The change to the clock routine to
set priority to 6 is not commented, but there are remarks in e.g.
"bk.c" that indicate that this has something to do with clock
interrupt blocking in certain device drivers.  The particular
comment in "bk.c" says that "dzrint" should not be executed at
clock interrupt time in certain places so "spl5"'s need to be
"spl6"'s, but the clock routine won't do any callouts at all if
the priority was elevated at interrupt time...  What's this all
about?  If I don't have any DZ's (I don't) can I change it back
to 5?
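
The worry about lost clock interrupts can be put in rough numbers. The
sketch below is a deliberately simplified model, not a description of
the real hardware or kernel: it assumes the clock remembers at most one
pending tick, that the processor stays at level 6 for the whole
interval, and that the interval starts just as a tick arrives. The
function name and the 60 Hz tick period (about 16667 microseconds) are
assumptions for illustration.

```c
#include <assert.h>

/*
 * Hypothetical model: count clock ticks lost while interrupts from
 * the clock are blocked for hold_us microseconds, given a tick every
 * tick_us microseconds.  Assumes only one pending tick is remembered
 * and that the first tick arrives at the start of the interval.
 */
long ticks_lost(long hold_us, long tick_us)
{
    long arrivals;

    assert(tick_us > 0);
    if (hold_us <= 0)
        return 0;

    /* Ticks arrive at 0, tick_us, 2*tick_us, ... strictly before hold_us. */
    arrivals = (hold_us + tick_us - 1) / tick_us;

    /* The first tick stays pending; every later one is dropped. */
    return arrivals - 1;
}
```

Under these assumptions ticks_lost(34000, 16667) evaluates to 2, while
any interval shorter than one tick period loses nothing.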

3) Size.  The bane of nonseparate I/D.  The 2.8 distribution
didn't come with "awk", a program I'm fond of using under 4.1 BSD
on our VAXen.  I found "awk" source on the 2.8 prerelease tape,
and discovered that I needed "yacc" and "lex" to compile it.
Well, it turns out that the available flavors of "yacc" on the
prerelease tape (MEDIUM and HUGE) are rather too large for my
poor old 11/34; with some hacking I managed to scrunch "yacc"
down to about half of MEDIUM size, and it compiled.  This SMALL
"yacc" was sufficient to compile a SMALL "lex", but neither of
these was capable of handling the parser and scanner of "awk"...
I succeeded in compiling a nonseparate I/D "awk" on an 11/44 and
brought it over to my 11/34, but most unfortunately it runs "out
of space in ALLOC" when presented with any but the most trivial
programs.  Has anyone in netnews-land solved this problem with
compiling "awk" before (short of buying a bigger machine)?

4) PDP11 long integer brain damage.  Our facility has bought and
is buying more VAXen, but we don't yet have enough cash to
replace our PDP11s (partly because of the way our users are
funded).  One way to conserve resources and perhaps also to
implement inter-machine communication is to share disks among our
CPUs.  We'd like to do this by dual-porting controllers.
Wouldn't it be nice if it were possible to have the same
filesystem structure on all our machines?  I asked Bill Jolitz about
this at USENIX and he warned me that it would be a tough job,
because although the filesystem structures are superficially
similar, the PDP11 stores long integers the wrong way, with the
high order word at the low address, unlike a VAX.  All the places
in the kernel where VAX long integers would be read in from disk
would have to be modified to flip the words into PDP11 order.
Well, I thought of a way to accomplish this indirectly.  It turns
out that the most significant part of the funny word order on the
11s has to do with how they are loaded into registers (the high
order word takes the lower-numbered register); instructions like
ASHC, MUL and DIV can only operate on long quantities in
registers.  I have munged a version of the C compiler to put long
integers into registers the way the PDP11 likes them but to keep
them in memory in the right (VAX) word order.  The compiler seems
to work (cross my fingers) and I am about ready to start trying
out programs like "mkfs" to see if they do the right thing.
Before I sink too much more of my time into this, does anyone see
why this wouldn't do what I want?
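
For readers who want the two layouts spelled out, here is a small C
sketch of the byte orders in question. It only illustrates the memory
formats and the word flip that would otherwise be needed when reading
the disk; it says nothing about how the modified compiler works. The
function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Store v in VAX order: low-order byte at the lowest address. */
void store_vax(uint32_t v, uint8_t b[4])
{
    assert(b != 0);
    b[0] = (uint8_t)(v & 0xff);
    b[1] = (uint8_t)((v >> 8) & 0xff);
    b[2] = (uint8_t)((v >> 16) & 0xff);
    b[3] = (uint8_t)((v >> 24) & 0xff);
}

/*
 * Store v in conventional PDP11 C order: high-order word at the low
 * address, with each 16-bit word itself stored low byte first.
 */
void store_pdp11(uint32_t v, uint8_t b[4])
{
    assert(b != 0);
    b[0] = (uint8_t)((v >> 16) & 0xff);
    b[1] = (uint8_t)((v >> 24) & 0xff);
    b[2] = (uint8_t)(v & 0xff);
    b[3] = (uint8_t)((v >> 8) & 0xff);
}

/* Exchange the two 16-bit words of a stored long, in place. */
void swap_words(uint8_t b[4])
{
    uint8_t t0 = b[0], t1 = b[1];

    assert(b != 0);
    b[0] = b[2];
    b[1] = b[3];
    b[2] = t0;
    b[3] = t1;
}
```

For the value 0x01020304, store_vax yields the bytes 04 03 02 01 and
store_pdp11 yields 02 01 04 03; applying swap_words to either result
gives the other.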

Sorry for the long opus; I hope people find it interesting enough
to send me suggestions at ucbvax!sdcsvax!sdchema!donn or post
them to the newsgroup.

                                        Donn Seeley
                                        UC San Diego Chemistry Dept.
                                        ucbvax!sdcsvax!phonlab!donn
                                        ucbvax!sdcsvax!sdchema!fred!donn
                                        etc.


-----------------------------------------------------------------
 gopher://quux.org/ conversion by John Goerzen <jgoerzen@complete.org>
 of http://communication.ucsd.edu/A-News/


This Usenet Oldnews Archive
article may be copied and distributed freely, provided:

1. There is no money collected for the text(s) of the articles.

2. The following notice remains appended to each copy:

The Usenet Oldnews Archive: Compilation Copyright (C) 1981, 1996 
 Bruce Jones, Henry Spencer, David Wiseman.