Channel based scheduling - the transputer model Larry Dickson tjoccam@tjoccam.com 4-5 October 2008 SUMMARY The purpose of this note is to show how both hardware and software IO can drive scheduling of processes (by whatever name - "tasklets" is the term used in Stackless Python) that are run in a single memory and sequential CPU or analogue. No multitasking OS is needed (or desired) because the model handles all scheduling. Asynchronous stimulation by hardware IO readiness is OK. The architecture described is that of the transputer, which is slim and can easily be simulated (see the Transterpreter project). Transputer instructions, whose names are usually self-explanatory, will occasionally be referred to in single quotes, e.g. 'startp'. The Reference gives details. This note is derived from the Reference, which can still be found. Unfortunately the Reference does not seem to be available in machine-readable form. PROCESSES A process, in the sense meant in this note, comes into existence ('startp'), carries out a series of effectively sequential operations, and then terminates ('endp'). When it comes into existence, it takes possession of a block of memory keyed at a workspace address, W, and takes control of an instruction sequence starting at an instruction address, I. When it terminates, its block of memory returns to the possession of a parent, and its instruction sequence is abandoned. By "effectively sequential" is meant that each instruction except the last has a unique successor, and each except the first has a unique predecessor. However, large gaps of time may exist between the execution of an instruction and its successor (blocking or swapping out). This means multitasking can be done, as managed by queues and stimuli, with the process's instruction sequence being a proper subset of the CPU's instruction sequence. The process envisioned here is a maximal sequential process. I do not generalize the term "process" in either direction - to smaller processes (like procedure calls or event responses) or to larger processes (like PAR constructs of several processes running multitasked and intercommunicating). QUEUES A process takes or loses control of the CPU's instruction sequence by being scheduled or descheduled. Except for the last time ('endp') it always gets (re)scheduled after being descheduled, and therefore information must be stored that makes this rescheduling possible. ALMOST ALL SUCH INFORMATION IS STORED IN THE PROCESS'S WORKSPACE, which is dedicated to the process throughout the process's life. However, in every case the pointer W to the workspace itself must be stored in an external queue to make rescheduling possible. There are three kinds of queues. (In process and timer queue cases, the transputer has two of each kind, corresponding to high and low priority processes, but I will ignore that here for simplicity.) (1) Process queue: A front pointer and back pointer are maintained by the system, with the front pointer containing W for the first process in the queue, and the last process in the queue containing the address of the back pointer. Processes that swap out, as by timeslicing, go into (the end of) the queue, and thus it acts as a round robin. Processes that are descheduled, as for timer waits or IO, go to other queue(s) for a while and when ready are restored to the end of the process queue. (2) Channel: An IO channel, which is declared OUTSIDE the scope of the communicating processes, contains a queue with at most one process in it. When communication is ready in the two-process case, it places the queued (early) process into the process queue, while the other communicating process continues without descheduling. In the one-process case (hardware link IO or communication to a peripheral that is outside the CPU instruction stream), the IO channel is declared outside the scope of all process programming, i.e. it is a fixed system resource. "Hardware" (i.e. service code for the communication link in question) places the process into the process queue when the communication is done. (3) Timer: A process waiting on a timeout enters a standard timer queue, from which it is removed after the timeout either expires or is nullified (the latter case arises in an ALT or select that the timeout loses). If appropriate it is then placed in the process queue. ADDRESSING Transputers use a "falling stack" so that the most used local variables are at small positive offsets from W. The special process-wide communication and queueing usages are at small negative offsets. W[0] Used for scratch in ALT and short communications. W[-1] Set to next I when process is descheduled. W[-2] Set to W of next process in process queue. W[-3] Set to address of channel holding this process, or ALT state. W[-4] Set to W of next process in timer queue, or ALT time selection state. W[-5] Set to timeout. Not all these usages are necessary to the state machines of the queues; some are to aid post-mortem analysis. A compiler decides how many of them are needed for a given process, based, for instance, on whether timers or ALTs are used in that process. Notice also that singly linked lists are used; doubly linked lists, at least in the timer ALT case, could have speed advantages. CHANNEL COMMUNICATION This (together with Hardware Select, below) is the only state machine needed if timer and ALT (the select capability of occam) are not used, or are replaced with a timer-based event loop as (I believe) in Stackless. A channel is only one word long: C[0]=W where W is the address of the waiting process. Otherwise C[0]=NotProcess.p where NotProcess.p is a special value that can never be a workspace address. When the early process reaches the IO instruction, it finds NotProcess.p in the channel and replaces it with its own workspace address. Then it deschedules itself without placing itself in the process queue. When the late process reaches the communicating IO instruction, or the external communication finishes, it finds the workplace address of the early process in the channel. It places it in the process queue and writes NotProcess.p in the channel. More complex things happen in case of an ALT. TIMER After checking whether it needs to deschedule (i.e. whether timeout already happened), the process reaching a timer wait enters the queue of that timer and proceeds along it until it finds the place for its timeout. Then it inserts itself into that place in the timer queue and deschedules itself without placing itself in the process queue. Timer "hardware" checks the front of the queue at each tick and reschedules any processes whose timeouts have expired. More complex things happen in the case of an ALT. HARDWARE SELECT If "hardware" (external) communication(s) and/or timeout(s) are waiting, and no process is in the process queue, "hardware" (i.e. virtual machine code external to the instructions of all the processes) should do a blocking select on all the pending external communications, including the earliest timeout if any. The winner(s) then trigger channel or timer queue action. More than one winner is OK. All winners go into the process queue. If any process is in the process queue, and any other process is waiting on hardware communications, then, before branching to the next process in the process queue, a non-blocking select (i.e. timeout 0) should be used to add any ready processes to the process queue. If any process is waiting on a timeout, time should be checked. If these checks are too burdensome, the checks could take place once per queue cycle, or after a certain instruction count. If interrupt code is accessible to the "hardware" or virtual machine, this rescheduling code can be placed in the interrupt code for efficiency, eliminating these selects. ALT To be distinguished from the hardware select used to reschedule unconditional channel IO or timers, the ALT construct places first-past-the-post communication or timeout selection into the process code itself. The transputer offers channel ALT, timer ALT, and skip (immediately valid) ALT branches. They can be prioritized explicitly, which is a good idea, because fairness can be enforced in the process code by rotating the priorities to avoid starvation. Unlike C select, the transputer ALT offers only one winner upon which it branches, like an if-then-else or case statement. This could be modified at the cost of complexifying the code. Actions passing through an ALT are called "conditional" because they do not take place if they lose. The transputer allows ALT only on an input, because a conditional input communicating to a conditional output could lead to a standoff. There are three states within an ALT: enabling waiting ready These are distinguished by the special values Enabling.p, Waiting.p, and Ready.p, which are all distinguished from any channel address and go in W[-3]. The sequence is: start the ALT ('alt' or 'talt') enable each branch ('enbc' or 'enbt' or 'enbs') wait if necessary ('altwt' or 'taltwt') disable each branch ('disc' or 'dist' or 'diss') end the ALT ('altend') If, when enabling, any branch is found already ready, the state goes from Enabling.p to Ready.p and the wait is skipped. The order in which the branches are enabled does not matter, but the order in which they are disabled determines the priority. In the case where a C select is used in a virtual machine, it is probably easiest to treat the 'enbs' (if there) as a zero timeout and always do the wait for hardware channels, so that you do not have to call a lot of non-blocking selects to see if each 'enbt' or hardware 'enbc' branch is ready. In the transputer, all the branches are enabled and disabled even if there is an early winner or some are cancelled by booleans; but equivalent, more efficient strategies could be used in a virtual machine. If an output finds a process W in its channel and an ALT underway in that inputtimg process, it does not do the communication, but replaces the inputting process W with its own process W in the channel, deschedules itself, and forces the inputting process's ALT to go into disabling, so that this communication has a chance to win immediately. If it does not win, the outputting process has now put the channel into a ready-to-communicate state, so the next ALT will detect this during enabling. Timer ALTs are special. When enabling, W[-4] must take the state TimeNotSet.p until the first timer guard is enabled and then it gets the timeout in W[-5] and W[-4] goes to TimeSet.p. Further timer branches may cause this timeout to advance. Either it is already ready or it goes into one timer queue. If the ALT puts a process on the timer queue, it is pulled from the queue at the end of the wait, whether or not the timer wins. No matter what branch wins, the end of the wait places the process back on the process queue, and when it reaches the front the ALT disabling begins. This is to be distinguished from the actual communication, if applicable, which takes place after the 'altend' and the branching, and reschedules the other (outputting) process without any blocking, as is completely standard behavior since the outputter always counts as "early" by this point. REFERENCE Inmos Ltd: "Transputer Instruction Set, a Compiler Writer's Guide." Prentice-Hall, New York, 1988. See Section 10, "Architectural Notes."