Emulation of Multics Standard Tapes

-proposal-

Introduction:
The development of mainframe emulators on PCs encouraged the development of formats for emulating tape devices on disk, CD-ROM, etc. One of the first formats was AWS, introduced with the XT-370, which emulated IBM 3420 9-track tape devices. AWS allows for recording of tapemarks, segmenting of large blocks, and forward and backward reading. A recent extension to the AWS standard adds support for individual block compression and larger data segments (up to 64K).
[NOTE: At this time there are competing proposals for indicating compressed blocks. This proposal uses the most widely used one.]

Multics Standard Tapes:

[This standard uses the term "block", rather than "record" to describe the data on the tape since each block is the result of a single physical write operation. The term "segment" is used to describe the basic unit of I/O also known as a "chunk".]
The Multics Standard Tape format, a "checksummed, labeled, headered, trailered, fixed-length-record tape format with periodic tape marks" is a standard defining all "native" tapes used on Multics systems. Key features are a fixed-length block including a header and trailer on each block, and tape marks every 128 blocks to facilitate repositioning and recovery. The standard defines the size of a block as a modulo 1024-word (36864-bit) data space, plus sixteen words (576 bits) of control information. An earlier standard specified a 256-word data space. All blocks on a tape are the same length. Physically, two words are written as 9 eight-bit bytes ("frames") on tape. A block with a 1024-word data space has a length of 4680 bytes. The MST format description provides detail on the contents of an MST.
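
The block lengths quoted above follow directly from the word and frame sizes. The following is a minimal Python sketch of that arithmetic; the function and constant names are illustrative only and are not part of the MST standard.

```python
WORD_BITS = 36        # size of one Multics word
CONTROL_WORDS = 16    # control information: 8-word header plus 8-word trailer


def block_bytes(data_words):
    """Return the length, in 8-bit frames, of one MST block."""
    total_bits = (data_words + CONTROL_WORDS) * WORD_BITS
    if total_bits % 8:
        raise ValueError("block does not end on a frame boundary")
    return total_bits // 8


print(block_bytes(1024))  # 4680
print(block_bytes(256))   # 1224
```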

MST Emulation:
Multics Standard Tapes are emulated by files in compressed AWS format. AWS organizes data into segments. An AWS segment may contain an entire record or a portion of a record. For MST emulation, each MST block is completely contained in one segment. Each segment is prefixed with a 6-byte header as follows:


        1 AWSTAPE_BLKHDR     unaligned,
          2 curblkl unsigned fixed binary(16), /* Current segment length       */
          2 prvblkl unsigned fixed binary(16), /* Previous segment length      */
          2 flags1  bit(8),
            /* '80'bx AWSTAPE_FLAG1_NEWREC   - This segment is start of block  */
            /* '40'bx AWSTAPE_FLAG1_TAPEMARK - This segment is a tapemark      */
            /* '20'bx AWSTAPE_FLAG1_ENDREC   - This segment is end of block    */
            /* '02'bx AWSTAPE_FLAG1_BZLIB    - Segment uses BZLIB compression  */
            /* '01'bx AWSTAPE_FLAG1_ZLIB     - Segment uses ZLIB compression   */
            /* (if both bits are zero, segment is uncompressed)                */
          2 flags2  bit(8);
            /* (Currently unused)                                             */
      

The length fields are stored in little-endian (Intel) format.
AWSTAPE_BLKHDR.curblkl is the length of the current segment, not including the header. This value is zero for tapemarks, 1224 for blocks with a 256-word data space, 4680 for blocks with a 1024-word data space, etc. The maximum segment size allowed by the 16-bit length field is 65535 bytes. According to the MST definition, end-of-reel (EOR) blocks may not be completely written on a reel, so this length could conceivably be shorter.
AWSTAPE_BLKHDR.prvblkl is the length of the previous segment to facilitate "read backward" operations.
AWSTAPE_BLKHDR.flags1 is 'A0'bx, 'A1'bx, or 'A2'bx for data blocks (both NEWREC and ENDREC set, block completely contained in the current segment, compression optional), and '40'bx (TAPEMARK only) for tapemarks.
AWSTAPE_BLKHDR.flags2 is always '00'bx; it is reserved for future expansion.
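
As an illustration of the header layout described above, the following Python sketch packs and unpacks the 6-byte segment header. The constant names follow the PL/I comments; the class and function names are hypothetical and are not taken from any existing AWS library.

```python
import struct
from typing import NamedTuple

FLAG1_NEWREC = 0x80    # segment is start of block
FLAG1_TAPEMARK = 0x40  # segment is a tapemark
FLAG1_ENDREC = 0x20    # segment is end of block
FLAG1_BZLIB = 0x02     # segment uses BZLIB compression
FLAG1_ZLIB = 0x01      # segment uses ZLIB compression

# Two little-endian unsigned 16-bit lengths followed by two flag bytes.
_HDR = struct.Struct("<HHBB")


class AwsHeader(NamedTuple):
    curblkl: int
    prvblkl: int
    flags1: int
    flags2: int

    @property
    def is_tapemark(self):
        return bool(self.flags1 & FLAG1_TAPEMARK)

    @property
    def is_whole_block(self):
        both = FLAG1_NEWREC | FLAG1_ENDREC
        return (self.flags1 & both) == both


def pack_header(curblkl, prvblkl, flags1, flags2=0):
    """Build the 6-byte header that precedes a segment."""
    return _HDR.pack(curblkl, prvblkl, flags1, flags2)


def unpack_header(raw):
    """Decode the first 6 bytes of raw into an AwsHeader."""
    return AwsHeader(*_HDR.unpack(raw[:_HDR.size]))


hdr = unpack_header(pack_header(4680, 0, FLAG1_NEWREC | FLAG1_ENDREC))
print(hdr.curblkl, hex(hdr.flags1), hdr.is_whole_block)  # 4680 0xa0 True
```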

The first segment header on the "tape" has prvblkl=0. A tapemark (EOF segment) has curblkl=0, and the second of two EOF segments in a row has both curblkl and prvblkl=0. An emulated MST is always terminated by two EOF segments.
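
The chaining rules above can be checked mechanically. The sketch below builds a tiny uncompressed image in memory and walks it, verifying each prvblkl against the preceding curblkl and confirming the double-EOF terminator. All function names are hypothetical, and the 10-byte payloads stand in for real 4680-byte blocks.

```python
import struct

HDR = struct.Struct("<HHBB")   # curblkl, prvblkl, flags1, flags2
TAPEMARK = 0x40
WHOLE_BLOCK = 0xA0             # NEWREC + ENDREC, uncompressed


def build_image(items):
    """Lay out segments; an item of None stands for a tapemark."""
    out = bytearray()
    prev = 0
    for item in items:
        if item is None:
            out += HDR.pack(0, prev, TAPEMARK, 0)
            prev = 0
        else:
            out += HDR.pack(len(item), prev, WHOLE_BLOCK, 0) + item
            prev = len(item)
    return bytes(out)


def walk(image):
    """Yield ('mark', None) or ('data', payload) for each segment in turn."""
    pos, prev = 0, 0
    while pos < len(image):
        cur, prv, flags1, _flags2 = HDR.unpack_from(image, pos)
        if prv != prev:
            raise ValueError("prvblkl mismatch at offset %d" % pos)
        pos += HDR.size
        if flags1 & TAPEMARK:
            yield ("mark", None)
        else:
            yield ("data", image[pos:pos + cur])
        pos += cur
        prev = cur


def ends_with_double_eof(image):
    kinds = [kind for kind, _ in walk(image)]
    return kinds[-2:] == ["mark", "mark"]


image = build_image([b"L" * 10, None, b"D" * 10, None, None])
print(ends_with_double_eof(image))  # True
```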

Capacity:
On a 1600 bpi tape, one MST block occupies approximately 3.425 inches including the interrecord gap. A full 2400-foot (28,800 in.) reel of tape holds a maximum of approximately 8320 blocks, allowing for tapemarks every 128 blocks; this is approximately 37MB on disk without compression. A CD-ROM (mode 1) has a capacity of 673 megabytes, or about 18 full reels of tape. Most tape reels would not be completely filled. Contrary to my expectations, 36-bit data seems to compress fairly well using standard compression algorithms. Compression in the area of 30-40% is achievable.
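
The capacity figures can be reproduced with a few lines of arithmetic. In this sketch the 0.5-inch interrecord gap is an assumption, chosen because it reproduces the 3.425-inch block length quoted above; the 8320-block figure is taken from the text rather than derived, since the tape length consumed by a tapemark is not stated here. Megabytes are taken as 2**20 bytes.

```python
BLOCK_BYTES = 4680
DENSITY_BPI = 1600
GAP_INCHES = 0.5                  # assumed interrecord gap (see lead-in)
REEL_INCHES = 2400 * 12           # 28,800 inches
BLOCKS_PER_REEL = 8320            # estimate from the text, tapemarks allowed for
CD_MEGABYTES = 673

inches_per_block = BLOCK_BYTES / DENSITY_BPI + GAP_INCHES
raw_blocks = int(REEL_INCHES / inches_per_block)   # upper bound, no tapemarks
reel_bytes = BLOCKS_PER_REEL * BLOCK_BYTES
reel_megabytes = reel_bytes / 2**20
reels_per_cd = CD_MEGABYTES / reel_megabytes

print("inches per block: %.3f" % inches_per_block)
print("blocks ignoring tapemarks:", raw_blocks)
print("bytes per full reel:", reel_bytes)
print("megabytes per full reel: %.1f" % reel_megabytes)
print("full reels per CD-ROM: %.1f" % reels_per_cd)
```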

MST Format:
A detailed description of the format of an MST is contained in the Honeywell Multics Programmer's Manual: Peripheral Input/Output (AX49-01). The data portion of each label and data block is exactly 4608 eight-bit bytes (1024 words) long, padded if necessary, preceded by an eight-word (36-byte) header, and followed by an eight-word trailer, giving a total block length of 4680 bytes. The first block on the tape is a label block. Bootable and non-bootable tapes have different label formats. In case of tape errors the same-numbered block will be rewritten with an indication that this is a rewritten block. It is not necessary to copy the originals of rewritten blocks to emulated tape files.

The organization of data on an MST is:


      Label block
      TAPEMARK
      Data block 0
      ...
      Data Block 127
      TAPEMARK
      Data Block 128
      ...
      TAPEMARK
      EOR block
      TAPEMARK
      TAPEMARK
    
The data recorded on the tape is written in binary mode. It should be interpreted as two 36-bit words per nine eight-bit bytes. The data itself may be binary (machine instructions, etc.) or ASCII, recorded as four ASCII characters per word, each character right-adjusted in 9 bits.
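
The packing just described can be sketched in Python as follows. The sketch assumes that the first frame of each nine-frame group carries the most significant bits of the first word; this bit ordering is an assumption, not something stated above, and the function names are illustrative.

```python
MASK36 = (1 << 36) - 1


def unpack_words(frames):
    """Split frames (length a multiple of 9) into a list of 36-bit words."""
    if len(frames) % 9:
        raise ValueError("length must be a multiple of 9 frames")
    words = []
    for i in range(0, len(frames), 9):
        pair = int.from_bytes(frames[i:i + 9], "big")   # 72 bits
        words.append(pair >> 36)
        words.append(pair & MASK36)
    return words


def word_to_chars(word):
    """Return the four characters of a word, dropping the top bit of each 9."""
    return bytes((word >> shift) & 0xFF for shift in (27, 18, 9, 0))


# Two words holding "ABCD" and "EFGH", packed into nine frames.
w1 = (0x41 << 27) | (0x42 << 18) | (0x43 << 9) | 0x44
w2 = (0x45 << 27) | (0x46 << 18) | (0x47 << 9) | 0x48
frames = ((w1 << 36) | w2).to_bytes(9, "big")
print(b"".join(word_to_chars(w) for w in unpack_words(frames)))  # b'ABCDEFGH'
```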

Label blocks are differentiated from data blocks by bits in the block header. Information contained in the block header and trailer indicates the sequence number of this file on a reel, and/or of this reel in a multi-volume set.

The following are definitions of bootable and non-bootable MST label blocks for 32-bit systems. All character(n) fields have been changed to bit(n*9). To convert character fields to ASCII, the bit field must be unpacked by extracting the last eight of every nine bits. All fixed binary(n) fields have been changed to bit(n+1); if the data may be signed, special handling is required.
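
The two conversions mentioned above might look like this in Python. The helper names are hypothetical, bits are numbered from the most significant bit of the first byte (an assumption about packing order), and the "special handling" for signed fields is taken here to mean two's complement interpretation.

```python
def bit_field(raw, bit_offset, bit_len):
    """Extract bit_len bits starting bit_offset bits into raw, as an integer."""
    total_bits = len(raw) * 8
    value = int.from_bytes(raw, "big")
    shift = total_bits - bit_offset - bit_len
    return (value >> shift) & ((1 << bit_len) - 1)


def char_field(raw, bit_offset, nchars):
    """Decode a bit(nchars*9) field, keeping the last 8 of every 9 bits."""
    bits = bit_field(raw, bit_offset, nchars * 9)
    codes = [(bits >> (9 * (nchars - 1 - i))) & 0xFF for i in range(nchars)]
    return bytes(codes).decode("ascii", errors="replace")


def to_signed(value, nbits):
    """Interpret an nbits-wide unsigned value as two's complement."""
    if value >> (nbits - 1):
        return value - (1 << nbits)
    return value


# Example: a 32-character field placed right after an 8-word (288-bit) header.
packed = 0
for ch in "MIT".ljust(32):
    packed = (packed << 9) | ord(ch)
raw = (packed << (4680 * 8 - 288 - 288)).to_bytes(4680, "big")
print(repr(char_field(raw, 288, 32).rstrip()))  # 'MIT'
```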


dcl 1 stand_label_record   based unaligned,        /* Multics std tape label */
      2 head               like mstr_header,       /* tape record header */
      2 installation_id    bit (32*9),             /* inst. that created tape */
      2 tape_reel_id       bit (32*9),             /* tape reel name */
      2 volume_set_id      bit (32*9),             /* name of the volume set */
      2 pad (1000)         bit (36),               /* record body */
      2 trail              like mstr_trailer;      /* record trailer */

dcl 1 mst_label            based unaligned,        /* Bootable Tape Label    */
      2 xfer_vector        (4),
        3 lda_instr        bit(36),
        3 tra_instr        bit(36),
      2 head               like mstr_header,       /* tape record header */
      2 installation_id    bit (32*9),             /* inst. that created tape */
      2 tape_reel_id       bit (32*9),             /* tape reel name */
      2 volume_set_id      bit (32*9),             /* name of the volume set */
      2 fv_overlay   (0:31),
        3 scu_instr        bit(36),
        3 dis_instr        bit(36),
      2 fault_data      (8)bit(36),
      2 boot_pgm_path      bit(168*9),
      2 userid             bit(32*9),
      2 label_version      bit(36),
      2 output_mode        bit(36),
      2 boot_pgm_len       bit(36),
      2 copyright          bit(56*9),
      2 pad            (13)bit(36),
      2 boot_pgm      (840)bit(36),
      2 trail              like mstr_trailer;      /* record trailer */

dcl 1 mstr_header          based unaligned,        /* Multics std tape rec hdr */
      2 c1                 bit (36),               /* constant = '670314355245'B3 */
      2 uid                bit (72),               /* unique ID */
      2 rec_within_file    bit (18),               /* phys. rec. # within file */
      2 phy_file           bit (18),               /* phys. file # on tape */
      2 data_bits_used     bit (18),               /* # of bits of data in rec */
      2 data_bit_len       bit (18),               /* bit length of data space */
      2 flags,                                     /* record flags */
        3 admin            bit (1),                /* admin record flag */
        3 label            bit (1),                /* label record flag */
        3 eor              bit (1),                /* end-of-reel record flag */
        3 pad1             bit (11),
        3 set              bit (1),                /* ON if any of following set */
        3 repeat           bit (1),                /* repeated record flag */
        3 padded           bit (1),                /* record contains padding flg */
        3 eot              bit (1),                /* EOT reflector enc flg */
        3 drain            bit (1),                /* synchronous write flg */
        3 continue         bit (1),                /* continue on next reel flg */
        3 pad2             bit (4),
      2 header_version     bit (3),                /* current header version num */
      2 repeat_count       bit (9),                /* repetition count */
      2 checksum           bit (36),               /* checksum of header and trlr */
      2 c2                 bit (36);               /* constant = '512556146073'B3 */

dcl 1 mstr_trailer         based unaligned,        /* Multics std tape record trlr*/
      2 c1                 bit (36),               /* constant = '107463422532'B3 */
      2 uid                bit (72),               /* unique ID (matches header) */
      2 tot_data_bits      bit (36),               /* tot data bits wr on log tape*/
      2 pad_pattern        bit (36),               /* padding pattern */
      2 reel_num           bit (12),               /* reel sequence # */
      2 tot_file           bit (24),               /* phys. file number */
      2 tot_rec            bit (36),               /* phys. record # for log tape */
      2 c2                 bit (36);               /* constant = '265221631704'B3 */
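
For readers working outside PL/I, the following Python sketch decodes the 36-byte block header according to the mstr_header declaration above. Field names and the two octal constants are copied from that declaration; the function name is illustrative, and the most-significant-bit-first ordering is an assumption.

```python
HEADER_C1 = 0o670314355245
HEADER_C2 = 0o512556146073


def parse_mstr_header(raw36):
    """Decode a 36-byte (8-word) MST block header into a dict."""
    if len(raw36) != 36:
        raise ValueError("header must be exactly 36 bytes")
    bits = int.from_bytes(raw36, "big")
    pos = 288                       # bits not yet consumed

    def take(n):
        nonlocal pos
        pos -= n
        return (bits >> pos) & ((1 << n) - 1)

    hdr = {"c1": take(36), "uid": take(72)}
    for name in ("rec_within_file", "phy_file",
                 "data_bits_used", "data_bit_len"):
        hdr[name] = take(18)
    for name in ("admin", "label", "eor"):
        hdr[name] = bool(take(1))
    take(11)                        # pad1
    for name in ("set", "repeat", "padded", "eot", "drain", "continue"):
        hdr[name] = bool(take(1))
    take(4)                         # pad2
    hdr["header_version"] = take(3)
    hdr["repeat_count"] = take(9)
    hdr["checksum"] = take(36)
    hdr["c2"] = take(36)
    if hdr["c1"] != HEADER_C1 or hdr["c2"] != HEADER_C2:
        raise ValueError("header constants do not match")
    return hdr
```

A caller would pass the first 36 bytes of a segment's data, for example parse_mstr_header(block[:36]), and test the "label" or "eor" entries to classify the block.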

References:
  1. AWS format description; for a slightly different version, see Bus-Tech.
  2. MST format description.
  3. SIMH format description (PDF).

Procedure to compute checksum:

See Multics Programmer's Reference Manual (AG91-04), p.F-10 for checksum computation.
This procedure is a PL/I transliteration of the ALM code provided there.

/*-------------------------------------------------------*/
/* Validate Checksum of Multics Standard Tape Block      */
/* on 32-bit system.                                     */
/* Called with: addresses of MST header and trailer      */
/* Returns: '0'b if checksum error; '1'b if no error.    */
/*-------------------------------------------------------*/
checksum: proc(pHdr,pTrlr) returns( bit(1) );
  dcl   (pHdr,pTrlr)         ptr;
  dcl  1 block_bin           aligned based,
         5 block_wrd (0:1039)bit(36) unaligned;
  dcl    i                   fixed bin(31);
  dcl    checksum            bit(36);
  dcl    this_word           bit(36);
  dcl    carry               bit(1);

  checksum, carry = '0'b;
  do i=0 to 5, 7;
    this_word = pHdr->block_wrd(i);
    call awca(checksum,this_word,carry);
    call alr( checksum );
    end; /* do i */
  do i=0 to 7;        
    this_word = pTrlr->block_wrd(i);
    call awca(checksum,this_word,carry);
    call alr( checksum );
    end; /* do i */
  this_word='0'b;
  call awca(checksum,this_word,carry);
  call awca(checksum,this_word,carry);
  this_word = pHdr->block_wrd(6);   /* Get checksum word */
  if this_word ¬= checksum then do;
    return( '0'b );                 /* Error return      */
    end;  

  return( '1'b);                    /* Normal return     */

awca: proc(sum,n,carry);
  dcl    sum                 bit(36);
  dcl    n                   bit(36);
  dcl    carry               bit(1);
  dcl    b                (9)bit(4) based;
  dcl    i                   fixed bin(31);
  dcl    tmp                 fixed bin(31);
  dcl    bits          (0:15)bit(4) static init(
         '0000'b, '0001'b, '0010'b, '0011'b, '0100'b, '0101'b, '0110'b, '0111'b,
         '1000'b, '1001'b, '1010'b, '1011'b, '1100'b, '1101'b, '1110'b, '1111'b
                             );       
  do i=9 to 1 by -1;
    tmp = addr(sum)->b(i) + addr(n)->b(i) + carry;
    carry = '0'b;
    if tmp>15 then do;
      tmp=tmp-16;
      carry='1'b;
      end;
    addr(sum)->b(i) = bits(tmp);
    end; /* do i */
  end awca;

alr: proc(word);
  dcl    word                bit(36);
  word = substr(word,2,35) || substr(word,1,1);
  end alr;

  end checksum;
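
For comparison, here is a Python rendering of the same steps. It mirrors the PL/I procedure above line for line (add with carry, rotate left one bit, then two final carry adds) and has not been checked against the ALM original or against real tape data; the function names are illustrative. Inputs are sequences of eight 36-bit integers for the header and the trailer.

```python
MASK36 = (1 << 36) - 1


def mst_checksum(header_words, trailer_words):
    """Compute the checksum over header words 0-5 and 7 plus trailer words 0-7."""
    total, carry = 0, 0
    words = [header_words[i] for i in (0, 1, 2, 3, 4, 5, 7)]
    words += list(trailer_words[:8])
    for word in words:
        total += word + carry                   # awca: add with carry
        carry = total >> 36
        total &= MASK36
        total = ((total << 1) | (total >> 35)) & MASK36   # alr: rotate left 1
    for _ in range(2):                          # fold in any outstanding carry
        total += carry
        carry = total >> 36
        total &= MASK36
    return total


def checksum_ok(header_words, trailer_words):
    """True if header word 6 equals the computed checksum."""
    return mst_checksum(header_words, trailer_words) == header_words[6]


header = [1, 0, 0, 0, 0, 0, 0, 0]
trailer = [0] * 8
header[6] = mst_checksum(header, trailer)
print(header[6], checksum_ok(header, trailer))  # 32768 True
```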

E-Mail comments on this proposal to:
Peter Flass <Peter_Flass@Yahoo.com>
Revised 2 Feb, 2005
Change references to HET format to "Compressed AWS", add checksum procedure.