Monday, August 10, 2015

about DNIX

DNIX (original spelling: D-Nix) was a Unix-like real-time operating system from the Swedish company Dataindustrier AB (DIAB). A version called ABCenix was also developed for the ABC1600 computer from Luxor.
(Daisy Systems also had something called Daisy DNIX on some of their CAD workstations. It was unrelated to DIAB's product.)

History

Inception at DIAB in Sweden

Dataindustrier AB (literal translation: computer industries shareholding company) was started in 1970 by Lars Karlsson as a single-board computer manufacturer in Sundsvall, Sweden, producing a Zilog Z80-based computer called Data Board 4680. In 1978 DIAB started to work with the Swedish television company Luxor AB to produce the home and office computer series ABC 80 and ABC 800.
In 1983 DIAB independently developed the first UNIX-compatible machine, the DIAB DS90, based on the Motorola 68000 CPU. D-NIX made its first appearance here, based on a UNIX System V license from AT&T. DIAB was, however, an industrial automation company and needed a real-time operating system, so the company replaced the AT&T-supplied UNIX kernel with its own in-house developed, yet compatible, real-time variant. This kernel was originally a Z80 kernel called OS8.
Over time, the company also replaced several of the UNIX standard userspace tools with its own implementations, to the point where no code was derived from UNIX and its machines could be deployed independently of any AT&T UNIX license. Two years later, in cooperation with Luxor, a computer called ABC 1600 was developed for the office market, while in parallel DIAB continued to produce enhanced versions of the DS90 computer using newer Motorola CPUs such as the 68010, 68020, 68030 and eventually 68040. In 1990 DIAB was acquired by Groupe Bull, which continued to produce and support the DS machines under the brand name DIAB, with names such as DIAB 2320, DIAB 2340, etc., still running DIAB's version of DNIX.

Derivative at ISC Systems Corporation

ISC Systems Corporation (ISC) purchased the right to use DNIX in the late 1980s for use in its line of Motorola 68k-based banking computers. (ISC was later bought by Olivetti, and was in turn resold to Wang, which was then bought by Getronics. This corporate entity, most often referred to as 'ISC', has answered to a bewildering array of names over the years.) This code branch was the SVR2 compatible version, and received extensive modification and development at their hands. Notable features of this operating system were its support of demand paging, diskless workstations, multiprocessing, asynchronous I/O, the ability to mount processes (handlers) on directories in the file system, and message passing. Its real-time support consisted largely of internal event-driven queues rather than list search mechanisms (no 'thundering herd'), static process priorities in two classes (run to completion and timesliced), support for contiguous files (to avoid fragmentation of critical resources), and memory locking.
The quality of the orthogonal asynchronous event implementation has yet to be equalled in current commercial operating systems, though some approach it. (The concept that has yet to be adopted is that the synchronous marshalling point of all the asynchronous activity itself could be asynchronous, ad infinitum. DNIX handled this with aplomb.) The asynchronous I/O facility obviated the need for Berkeley sockets select or SVR4's STREAMS poll mechanism, though there was a socket emulation library that preserved the socket semantics for backward compatibility. Another feature of DNIX was that none of the standard utilities (such as ps, a frequent offender) rummaged around in the kernel's memory to do their job. System calls were used instead, and this meant the kernel's internal architecture was free to change as required.
The handler concept allowed network protocol stacks to be outside the kernel, which greatly eased development and improved overall reliability, though at a performance cost. It also allowed for foreign file systems to be user-level processes, again for improved reliability. The main file system, though it could have been (and once was) an external process, was pulled into the kernel for performance reasons. Were it not for this, DNIX could well have been considered a microkernel, though it was not formally developed as such. Handlers could appear as any type of 'native' Unix file, directory structure, or device, and file I/O requests that the handler itself could not process could be passed off to other handlers, including the underlying one upon which the handler was mounted. Handler connections could also exist and be passed around independent of the filesystem, much like a pipe. One effect of this is that TTY-like 'devices' could be emulated without requiring a kernel-based pseudo terminal facility.
An example of where a handler saved the day was in ISC's diskless workstation support, where a bug in the implementation meant that using named pipes on the workstation could induce undesirable resource locking on the file server. A handler was created on the workstation to field accesses to the afflicted named pipes until the appropriate kernel fixes could be developed. This handler required approximately 5 kilobytes of code to implement, an indication that a non-trivial handler did not need to be large.
ISC also received the right to manufacture DIAB's DS90-10 and DS90-20 machines as its file servers. The multiprocessor DS90-20s, however, were too expensive for the target market, so ISC designed its own servers and ported DNIX to them. ISC also designed its own GUI-based diskless workstations for use with these file servers, and ported DNIX again. (Though ISC used Daisy workstations running Daisy DNIX to design the machines that would run DIAB's DNIX, there was negligible confusion internally as the drafting and layout staff rarely talked to the software staff. Moreover, the hardware design staff didn't use either system! The running joke went something like: "At ISC we build computers, we don't use them.") The asynchronous I/O support of DNIX allowed for easy event-driven programming in the workstations, which performed well even though they had relatively limited resources. (The GUI diskless workstation had a 7 MHz 68010 processor and was usable with only 512K of memory, of which the kernel consumed approximately half. Most workstations had 1 MB of memory, though there were later 2 MB and 4 MB versions, along with 10 MHz processors.) A full-blown installation could consist of one server (16 MHz 68020, 8 MB of RAM, and a 200 MB hard disk) and up to 64 workstations. Though slow to boot up, such an array would perform acceptably in a bank teller application. Besides the innate efficiency of DNIX, the associated DIAB C compiler was key to high performance. It generated particularly good code for the 68010, especially after ISC got done with it. (ISC also retargeted it to the Texas Instruments TMS34010 graphics coprocessor used in its last workstation.) The DIAB C compiler was, of course, used to build DNIX itself, which was one of the factors contributing to its efficiency, and is still available (in some form) through Wind River Systems.
These systems are still in use as of this writing in 2006, in former Seattle-First National Bank branches now branded Bank of America. There may be, and probably are, other ISC customers still using DNIX in some capacity. Through ISC there was a considerable DNIX presence in Central and South America.

Asynchronous Events

DNIX's native system call was the dnix(2) library function, analogous to the standard Unix unix(2) or syscall(2) function. It took multiple arguments, the first of which was a function code. Semantically this single call provided all appropriate Unix functionality, though it was syntactically different from Unix and had, of course, numerous DNIX-only extensions.
DNIX function codes were organized into two classes: Type 1 and Type 2. Type 1 commands were those that were associated with I/O activity, or anything that could potentially cause the issuing process to block. Major examples were F_OPEN, F_CLOSE, F_READ, F_WRITE, F_IOCR, F_IOCW, F_WAIT, and F_NAP. Type 2 were the remainder, such as F_GETPID, F_GETTIME, etc. They could be satisfied by the kernel itself immediately.
To invoke asynchronicity, a special file descriptor called a trap queue had to have been created via the Type 2 opcode F_OTQ. A Type 1 call would have the F_NOWAIT bit OR-ed with its function value, and one of the additional parameters to dnix(2) was the trap queue file descriptor. The return value from an asynchronous call was not the normal value but a kernel-assigned identifier. At such time as the asynchronous request completed, a read(2) (or F_READ) of the trap queue file descriptor would return a small kernel-defined structure containing the identifier and result status. The F_CANCEL operation was available to cancel any asynchronous operation that hadn't yet been completed; one of its arguments was the kernel-assigned identifier. (A process could only cancel requests that it currently owned. The exact semantics of cancellation were up to each request's handler; fundamentally it only meant that any waiting was to be terminated. A partially completed operation could be returned.) In addition to the kernel-assigned identifier, one of the arguments given to any asynchronous operation was a 32-bit user-assigned identifier. This most often referenced a function pointer to the appropriate subroutine that would handle the I/O completion, but this was merely convention. It was the entity that read the trap queue elements that was responsible for interpreting this value.
struct itrq {                   /* Structure of data read from trap queue. */
        short   it_stat;        /* Status */
        short   it_rn;          /* Request number */
        long    it_oid;         /* Owner ID given on request */
        long    it_rpar;        /* Returned parameter */
};
Of note is that the asynchronous events were gathered via normal file descriptor read operations, and that such reading was itself capable of being made asynchronous. This had implications for semi-autonomous asynchronous event handling packages that could exist within a single process. (DNIX 5.2 did not have lightweight processes or threads.) Also of note is that any potentially blocking operation was capable of being issued asynchronously, so DNIX was well equipped to handle many clients with a single server process. A process was not restricted to having only one trap queue, so I/O requests could be grossly prioritized in this way.
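The completion flow described above can be modelled in ordinary C. This is only a user-space sketch, not DNIX code: the struct itrq layout is the one shown above, but the trap_queue type, the tq_post, tq_read and tq_drain helpers, the OID_ constants and the io_totals counters are all hypothetical names invented for illustration. It also assumes that an it_stat of zero means success and that it_rpar carries a byte count, neither of which the text confirms. A table index stands in for the function pointer that it_oid conventionally held.

```c
#include <assert.h>
#include <stddef.h>

/* Layout copied from the struct itrq shown above. */
struct itrq {
        short   it_stat;        /* Status */
        short   it_rn;          /* Request number */
        long    it_oid;         /* Owner ID given on request */
        long    it_rpar;        /* Returned parameter */
};

/* Everything below is a hypothetical user-space model, not DNIX code. */
#define TQ_DEPTH 8

struct trap_queue {             /* stands in for the F_OTQ descriptor */
        struct itrq ev[TQ_DEPTH];
        size_t head, count;
};

struct io_totals { long bytes_read, bytes_written; int errors; };

typedef void (*completion_fn)(const struct itrq *ev, struct io_totals *t);

/* Assumption: it_stat == 0 means success and it_rpar is a byte count. */
static void on_read_done(const struct itrq *ev, struct io_totals *t)
{
        if (ev->it_stat == 0) t->bytes_read += ev->it_rpar;
        else t->errors++;
}

static void on_write_done(const struct itrq *ev, struct io_totals *t)
{
        if (ev->it_stat == 0) t->bytes_written += ev->it_rpar;
        else t->errors++;
}

/* An index into this table plays the role of the user-assigned ID. */
enum { OID_READ = 0, OID_WRITE = 1, OID_COUNT };
static const completion_fn completions[OID_COUNT] = {
        on_read_done, on_write_done
};

/* Models the kernel posting a completion; returns -1 if the queue is full. */
static int tq_post(struct trap_queue *tq, short stat, short rn,
                   long oid, long rpar)
{
        size_t tail;
        if (tq->count == TQ_DEPTH) return -1;
        tail = (tq->head + tq->count) % TQ_DEPTH;
        tq->ev[tail].it_stat = stat;
        tq->ev[tail].it_rn = rn;
        tq->ev[tail].it_oid = oid;
        tq->ev[tail].it_rpar = rpar;
        tq->count++;
        return 0;
}

/* Models a non-blocking read of the trap queue; 1 if an event was read. */
static int tq_read(struct trap_queue *tq, struct itrq *out)
{
        if (tq->count == 0) return 0;
        *out = tq->ev[tq->head];
        tq->head = (tq->head + 1) % TQ_DEPTH;
        tq->count--;
        return 1;
}

/* Event loop: dispatch each completion on its user-assigned ID.
 * Returns the number of events consumed. */
static int tq_drain(struct trap_queue *tq, struct io_totals *t)
{
        struct itrq ev;
        int n = 0;
        assert(tq != NULL && t != NULL);
        while (tq_read(tq, &ev)) {
                if (ev.it_oid >= 0 && ev.it_oid < OID_COUNT)
                        completions[ev.it_oid](&ev, t);
                else
                        t->errors++;    /* unknown ID: nobody to call */
                n++;
        }
        return n;
}
```

In this model the reader of the queue, not the kernel stand-in, decides what the user-assigned ID means, which mirrors the convention described above.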

Compatibility

In addition to the native dnix(2) call, a complete set of 'standard' libc interface calls was available: open(2), close(2), read(2), write(2), etc. Besides being useful for backwards compatibility, these were implemented in a binary-compatible manner with the NCR Tower computer, so that binaries compiled for it would run unchanged under DNIX. The DNIX kernel had two trap dispatchers internally, one for the DNIX method and one for the Unix method. Choice of dispatcher was up to the programmer, and using both interchangeably was acceptable. Semantically they were identical wherever functionality overlapped. (In these machines the 68000 trap #0 instruction was used for the unix(2) calls, and the trap #4 instruction for dnix(2). The two trap handlers were really quite similar, though the [usually hidden] unix(2) call held the function code in the processor's D0 register, whereas dnix(2) held it on the stack with the rest of the parameters.)
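The difference between the two trap paths can be sketched as two thin entry points feeding one table. This is an illustrative model only: the trap_frame type, the FC_ function codes, the return values and the idea that both paths share a single numbering and table are assumptions made for the sketch, since the text says only that the two were semantically identical where they overlapped.

```c
#include <assert.h>

/* Hypothetical function codes; the real values are not given here. */
enum { FC_GETPID = 0, FC_GETTIME = 1, FC_COUNT };

/* Hypothetical snapshot of what a trap handler would see. */
struct trap_frame {
        long        d0;         /* 68000 D0 register at trap time */
        const long *stack;      /* the caller's stacked arguments */
};

typedef long (*sys_fn)(const long *args);

/* Dummy services returning fixed values so the sketch is testable. */
static long sys_getpid(const long *args)  { (void)args; return 42; }
static long sys_gettime(const long *args) { (void)args; return 1000; }

static const sys_fn sys_table[FC_COUNT] = { sys_getpid, sys_gettime };

/* Shared back end: validate the function code and call the service. */
static long dispatch(long fc, const long *args)
{
        if (fc < 0 || fc >= FC_COUNT)
                return -1;
        return sys_table[fc](args);
}

/* trap #0 path: function code in D0, arguments on the stack. */
static long unix_trap(const struct trap_frame *tf)
{
        return dispatch(tf->d0, tf->stack);
}

/* trap #4 path: function code is the first stacked argument. */
static long dnix_trap(const struct trap_frame *tf)
{
        assert(tf->stack != 0);
        return dispatch(tf->stack[0], tf->stack + 1);
}
```

Keeping the decoding in the entry points and the services in one table is one way the two dispatchers could stay "really quite similar"; the actual kernel layout may have differed.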
DNIX 5.2 had no networking protocol stacks internally (except for the thin X.25-based Ethernet protocol stack added by ISC for use by its diskless workstation support package); all networking was conducted by reading and writing to handlers. Thus, there was no socket mechanism, but a libsocket(3) existed that used asynchronous I/O to talk to the TCP/IP handler. The typical Berkeley-derived networking program could be compiled and run unchanged (modulo the usual Unix porting problems), though it might not be as efficient as an equivalent program that used native asynchronous I/O.

Handlers

Under DNIX, a process could be used to handle I/O requests and to extend the filesystem. Such a process was called a handler, and was a major feature of the operating system. A handler was defined as a process that owned at least one request queue, a special file descriptor that was procured in one of two ways: with an F_ORQ or an F_MOUNT call. The former invented an isolated request queue, one end of which was then typically handed down to a child process. (The network remote execution programs, of which there were many, used this method to provide standard I/O paths to their children.) The latter hooked into the filesystem so that file I/O requests could be adopted by handlers. (The network login programs, of which there were even more, used this method to provide standard I/O paths to their children, as the semantics of logging in under Unix requires a way for multiple perhaps-unrelated processes to horn in on the standard I/O path to the operator.) Once mounted on a directory in the filesystem, the handler then received all I/O calls to that point.
A handler would then read small kernel-assigned request data structures from the request queue. (Such reading could be done synchronously or asynchronously as the handler's author desired.) The handler would then do whatever each request required to be satisfied, often using the DNIX F_UREAD and F_UWRITE calls to read and write into the request's data space, and then would terminate the request appropriately using F_TERMIN. A privileged handler could adopt the permissions of its client for individual requests to subordinate handlers (such as the filesystem) via the F_T1REQ call, so it didn't need to reproduce the subordinate's permission scheme. If a handler was unable to complete a request itself, the F_PASSRQ function could be used to pass I/O requests from one handler to another. A handler could perform part of the work requested before passing the rest on to another handler. It was very common for a handler to be state-machine oriented so that requests it was fielding from a client were all done asynchronously. This allowed for a single handler to field requests from multiple clients simultaneously without them blocking each other unnecessarily. Part of the request structure was the process ID and its priority, so that a handler could choose what to work on first based upon this information; there was no requirement that work be performed in the order it was requested. To aid in this, it was possible to poll both request and trap queues to see if there was more work to be considered before buckling down to actually do it.
struct ireq {                   /* Structure of incoming request */
        short   ir_fc;          /* Function code */
        short   ir_rn;          /* Request number */
        long    ir_opid;        /* Owner ID that you gave on open or mount */
        long    ir_bc;          /* Byte count */
        long    ir_upar;        /* User parameter */
        long    ir_rad;         /* Random address */
        ushort  ir_uid;         /* User ID */
        ushort  ir_gid;         /* User group */
        time_t  ir_time;        /* Request time */
        ulong   ir_nph;
        ulong   ir_npl;         /* Node and process ID */
};
There was no particular restriction on the number of request queues a process could have. This was used to provide networking facilities to chroot jails, for example.
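The way a handler might choose and dispose of work can be sketched as a small simulation. This is not DNIX code: struct req borrows only three field names from the struct ireq above, while the prio and done fields, the FC_ values, and the next_request, serve and run_handler helpers are hypothetical. Returning TERMINATED stands in for finishing a request with F_TERMIN, and PASSED for handing it on with F_PASSRQ.

```c
#include <assert.h>
#include <stddef.h>

/* Subset of the struct ireq shown above, plus simulation-only fields. */
struct req {
        short ir_fc;    /* Function code */
        short ir_rn;    /* Request number */
        long  ir_bc;    /* Byte count */
        int   prio;     /* hypothetical: larger means more urgent */
        int   done;     /* simulation bookkeeping, not a DNIX field */
};

enum { FC_READ = 1, FC_WRITE = 2, FC_IOCTL = 3 };  /* hypothetical values */
enum outcome { TERMINATED, PASSED };

/* Pick the most urgent pending request, or NULL if none remain.
 * Ties go to the request that arrived first. */
static struct req *next_request(struct req *q, size_t n)
{
        struct req *best = NULL;
        size_t i;
        for (i = 0; i < n; i++) {
                if (q[i].done)
                        continue;
                if (best == NULL || q[i].prio > best->prio)
                        best = &q[i];
        }
        return best;
}

/* Serve one request: handle what this handler understands and
 * pass anything else on to a subordinate handler. */
static enum outcome serve(struct req *r)
{
        r->done = 1;
        switch (r->ir_fc) {
        case FC_READ:
        case FC_WRITE:
                return TERMINATED;
        default:
                return PASSED;
        }
}

/* Drain the queue in priority order. Records the request numbers in
 * the order served and counts how many were passed on.
 * Returns the number of requests served. */
static size_t run_handler(struct req *q, size_t n,
                          short *order, size_t *passed)
{
        struct req *r;
        size_t served = 0;
        assert(q != NULL && order != NULL && passed != NULL);
        while ((r = next_request(q, n)) != NULL) {
                order[served++] = r->ir_rn;
                if (serve(r) == PASSED)
                        (*passed)++;
        }
        return served;
}
```

A real handler would interleave this selection with asynchronous reads of its request and trap queues rather than scanning a fixed array.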

Examples

To give some appreciation of the utility of handlers, at ISC handlers existed for:
  • foreign file systems
    • FAT
    • CD-ROM/ISO9660
    • disk image files
    • RAM disk (for use with write-protected boot disks)
  • networking protocols
    • DNET (essentially X.25 over Ethernet, with multicast capability)
    • X.25
    • TCP/IP
    • DEC LAT
    • AppleTalk
  • remote filesystems
    • DNET's /net/machine/path/from/its/root...
    • NFS
  • remote login
    • ncu (DNET)
    • telnet
    • rlogin
    • wcu (DNET GUI)
    • X.25 PAD
    • DEC LAT
  • remote execution
    • rx (DNET)
    • remsh
    • rexec
  • system extension
    • windowman (GUI)
    • vterm (xterm-like)
    • document (passbook) printer
    • dmap (ruptime analog)
    • windowmac (GUI gateway to Macintosh)
  • system patches
    • named pipe handler

ISC's Extensions

ISC purchased both 5.2 (SVR2 compatible) and 5.3 (SVR3 compatible) versions of DNIX. At the time of purchase, DNIX 5.3 was still undergoing development at DIAB, so DNIX 5.2 was what was deployed. Over time, ISC's engineers incorporated most of their 5.3 kernel's features into 5.2, primarily shared memory and IPC, so there was some divergence of features between DIAB's and ISC's versions of DNIX. DIAB's 5.3 likely went on to contain more SVR3 features than ISC's 5.2 ended up with. Also, DIAB went on to DNIX 5.4, an SVR4 compatible OS.
At ISC, developers considerably extended their version of DNIX 5.2 (only listed are features involving the kernel itself) based upon both their needs and the general trends of the Unix industry:
  • Diskless workstation support. The workstation's kernel filesystem was removed, and replaced with an X.25-based Ethernet communications stub. The file server's kernel was also extended with a mating component that received the remote requests and handed them to a pool of kernel processes for service, though a standard handler could have been written to do this. (Later in its product lifecycle, ISC deployed standard SVR4-based Unix servers in place of the DNIX servers. These used X.25 STREAMS and a custom-written file server program. In spite of the less efficient structuring, the raw horsepower of the platforms used made for a much faster server. It is unfortunate that this file server program did not support all of the functionality of the native DNIX server. Tricky things, like named pipes, never worked at all. This was another justification for the named pipe handler process.)
  • gdb watchpoint support using the features of ISC's MMU.
  • Asynchronous I/O to the filesystem was made real. (Originally it blocked anyway.) Kernel processes (kprocs, or threads) were used to do this.
  • Support for a truss- or strace-like program. In addition to some repairs to bugs in the standard Unix ptrace single-stepping mechanism, this required adding a temporary process adoption facility so that the tracer could use the standard single-stepping mechanism on existing processes.
  • SVR4 signal mechanism extensions. Primarily for the new STOP and CONT signals, but encompassing the new signal control calls as well. Due to ISC's lack of source code for the adb and sdb debuggers the u-page could not be modified, so the new signals could only be blocked or receive default handling; they could not be caught.
  • Support for network sniffing. This required extending the Ethernet driver so that a single event could satisfy more than one I/O request, and conditionally implementing the hardware filtering in software to support promiscuous mode.
  • Disk mirroring. This was done in the file system and not the device driver, so that slightly (or even completely) different devices could still be mirrored together. Mirroring a small hard disk to the floppy was a popular way to test mirroring as ejecting the floppy was an easy way to induce disk errors.
  • 32-bit inode, 30-character filename, symbolic link, and sticky directory extensions to the filesystem. Added /dev/zero, /dev/noise, /dev/stdXXX, and /dev/fd/X devices.
  • Process group id lists (from SVR4).
  • #! direct script execution.
  • Serial port multiplication using ISC's Z80-based VMEbus communications boards.
  • Movable swap partition.
  • Core 'dump' snapshots of running processes. Support for fuser command.
  • Process renice function. Associated timesharing reprioritizer program to implement floating priorities.
  • A way to 'mug' a process, instantly depriving it of all memory resources. Very useful for determining what the current working set is, as opposed to what is still available to it but not necessarily being used. This was associated with a GUI utility showing the status of all 1024 pages of a process's memory map. (This being the number of memory pages supported by ISC's MMU.) In use you would 'mug' the target process periodically through its life and then watch to see how much memory was swapped back in. This was useful as ISC's production environment used only a few long-lived processes; controlling their memory utilization and growth was key to maintaining performance.
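The 'mug' technique in the last item amounts to clearing residency and then counting what comes back. The following is a hypothetical model of that measurement, not the ISC utility: the page_map type and the mug, touch and resident_pages helpers are invented for illustration, and only the figure of 1024 pages per process comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NPAGES 1024     /* pages per process map on ISC's MMU, per the text */

/* Hypothetical residency map: one flag per page. */
struct page_map {
        unsigned char resident[NPAGES];
};

/* 'Mug' the process: every page becomes non-resident at once. */
static void mug(struct page_map *m)
{
        memset(m->resident, 0, sizeof m->resident);
}

/* Model a page fault bringing one page back into memory. */
static void touch(struct page_map *m, size_t page)
{
        assert(page < NPAGES);
        m->resident[page] = 1;
}

/* Pages resident now; after a mug this approximates the working set. */
static size_t resident_pages(const struct page_map *m)
{
        size_t i, n = 0;
        for (i = 0; i < NPAGES; i++)
                n += m->resident[i];
        return n;
}
```

Sampling resident_pages some time after each mug gives the pages actually in use, as opposed to pages that were merely still mapped.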

Features that were never added

When DNIX development at ISC effectively ceased in 1997, a number of planned OS features were left on the table:
  • Shared objects - There were two dynamically-loaded libraries in existence, an encryptor for DNET and the GUI's imaging library, but the facility was never generalized. ISC's machines were characterized by a general lack of virtual address space, so extensive use of memory-mapped entities would not have been possible.
  • Lightweight processes - The kernel itself already had multiple threads that shared a single MMU context, so extending this to user processes should have been straightforward. The API implications would have been the most difficult part of this.
  • Access Control Lists - Trivial to implement using an ACL handler mounted over the stock file system.
  • Multiple swap partitions - DNIX already used free space on the selected volume for swapping, so it would have been easy to give it a list of volumes to try in turn, potentially with associated space limits to keep it from consuming all free space on a volume before moving on to the next one.
  • Remote kernel debugging via gdb - All the pieces were there to do it either through the customary serial port or over Ethernet using the kernel's embedded X.25 link software, but they were never assembled.
  • 68030 support - ISC's prototypes were never completed. Two processor piggyback plug-in cards were built, but were never used as more than faster 68020s. They were not reliable, nor were they as fast as they could have been due to having to fit into a 68020 socket. The fast context switching ISC MMU would be left disabled (and left out altogether in proposed production units), and the embedded one of the 68030 was to have been used instead, using a derivative of the DS90-20's MMU code. While the ISC MMU was very efficient and supported instant switching among 32 resident processes, it was very limited in addressability. The 68030 MMU would have allowed for much more than 8 MB of virtual space in a process, which was the limit of the ISC MMU. Though this MMU would be slower, the overall faster speed of the 68030 should have more than made up for it, so that a 68030 machine was expected to be in all ways faster, and support much larger processes.
