1 From: Charles Lasner <lasner@watsun.cc.columbia.edu>
2 To: PDP-8 Lovers Everywhere <pdp8-lovers@ai.mit.edu>
3 Subject: Announcement of additional KERMIT-12 utilities.
4
5 While no changes have been made to the body of KERMIT-12 itself,
6 several things have been changed/added.
7
8 At the request of the KERMIT distribution service (KERMSRV)
9 certain files have been slightly modified so they are acceptable to
10 that bitnet, etc. facility. (Seems to be a problem with LRECL>80.)
11 All files now have LRECL of 80 or less. Except for the .DOC file, all it took
12 was a little "cosmetic surgery" on a few lines. FTP'd copies are
13 mostly unaffected. Most of the problems have to do with
14 interpretation of the inter-page FF character being treated as the
15 first character of the "record" in this non-stream-oriented system.
16
17 At this time there is no actual doc file, as the file K12MIT.DOC
18 is merely a truncation of the listing of K12MIT.PAL as passed through
19 PAL8 and CREF. Anyone with a system big enough to support a 200K+
20 long source file can create this file themselves. In addition, due
21 to certain quirks within PAL8 and CREF "beating" against unix line
22 conventions, the file K12MIT.DOC at watsun.cc.columbia.edu was
23 slightly different from the precise output of the assembly process,
24 but again, only a cosmetic change.
25
26 Since this file greatly exceeded the KERMSRV restriction, it has
27 been withdrawn in favor of the source fragment equivalent to it taken
28 directly from K12MIT.PAL. This source fragment is short enough that
29 even an RX01-based OS/8 system can create the listing file from it
30 thus recreating the original K12MIT.DOC locally. All this will
31 disappear in the future when a "proper" doc file appears. In the
32 meantime, K12MIT.DOC in whatever form it is available contains
33 hardware hints and kinks, assembly options, and other info useful to
34 users and anyone interested in the "innards" of the program, as well
35 as an edit history of how K12MIT got to be where it is now starting
36 from its "grandfather" K08MIT. It ends at the first line of the code
37 in K12MIT.PAL, but includes all of the special purpose definitions
38 particular to the various devices supported, such as DECmate I,
39 DECmate II, etc. Any changes to customize KERMIT-12 are still
40 accomplished using the separate patch file K12PCH.PAL which is
41 unchanged.
42
43 New files cover two areas: 1) direct loading without KERMIT-12,
44 and 2) .BOO format support.
45
46 1) Many users have the hardware for running KERMIT-12, but don't
47 already have it or another suitable program to acquire it yet, a real
48 "catch-22" situation. Towards that end, a set of utilities has been
49 provided to directly load KERMIT-12 without already having it.
50
51 Most PDP-8 sites do have access to some other machine.
52 Hopefully, the serial connection to be used is fairly "clean" and
53 error-free, at least some of the time. These programs depend on
54 this fact. This could either be a connection to a remote multi-user
55 system or something like a null-modem connection to a nearby IBM-PC.
56 The programs assume only a few things:
57
58 a) The connection is error free.
59
60 b) The other end doesn't absolutely require anything be sent to
61 it to send data to the PDP-8 end. (The -8 end will not send ^S/^Q or
62 anything like that because this is unnecessary; all data goes only
63 into PDP-8 memory directly.)
64
65 c) The other end will send the data at a time controlled from
66 its end, or after at most one character sent from the PDP-8 end of
67 the link.
68
69 The first situation is illustrated by the example of a PC
70 connected to the -8. The -8 program is started, and it waits
71 indefinitely after the -8 user presses any one key. (The
72 corresponding character is sent to the PC where it is ignored.) The
73 PC end is initiated with a command such as COPY K12FL0.IPL AUX: and
74 the data goes to the -8.
75
76 The second situation is illustrated by a remote system where a
77 command such as TYPE K12FL0.IPL is available. The delimiting CR is
78 not typed at this time, and will be finished later by the loading
79 program. The initial connection up until the TYPE command is not
80 covered by the loading program itself, so the user must supply a
81 basic comm program, which is possible to accomplish in about 10 words
82 or less if the rates are "favorable", or worst-case, a terminal can
83 be used and the line switched over to the -8 at the appropriate time.
84 In any case, CR or other appropriate character is hit on the -8 and
85 the loading program echoes it down the line (and on the console) to
86 initiate the data down-load.
87
88 d) The other end is assumed to send the file verbatim without
89 insertion of <del> characters (octal 177) and upper-case/lower-case
90 is preserved.
91
92 If all of these assumptions are met, then the down-load
93 accomplishes a partial acquisition of K12MIT.SV, the primary binary
94 file of KERMIT-12. The process must be repeated several times to
95 acquire all portions. If a local compare utility is available that
96 can compare absolute binary files, perhaps the process can be totally
97 repeated to assure reliable results by comparing runs.
98
99 The method used is borrowed from the field-service use of a
100 medium-speed serial port reader on the -8 for diagnostic read-in.
101 This reader is *almost* compatible with the device 01 reader such as
102 the PC8E. The difference is that the *real* PC8E is fully
103 asynchronous, whereas the portable reader just spews out the
104 characters without any protocol. The PC8E can't drop any characters
105 in theory, although there are reports of misadjusted readers that
106 drop characters at certain crucial data rates. (The PC8E runs at
107 full speed if possible, and failing this falls back to a much slower
108 speed. All operations depend on the use of the hardware handshakes
109 of the IOTs etc., so nothing should be lost but throughput.
110 Misadjusted readers may drop characters when switching over to the
111 slower mode.)
112
113 The reason the field reader is acceptable is that it is used only
114 to load diagnostics directly into memory using the RIM and BIN
115 loaders. These minimal applications can't possibly fall behind the
116 reader running at full speed. This is the same principle used here
117 to down-load KERMIT-12.
118
119 The loading program is a 46 word long program suitable to be
120 toggled into ODT and saved as a small core-image program. The user
121 starts the program and then (at the appropriate time) presses one key
122 (usually CR if it matters) and the loader waits for remote input. As
123 the other end sends the data, it is directly loaded into memory.
124 There is a leader/trailer convention, just like paper-tape binary, so
125 at end-of-load the program exits to OS/8 at 07600. At this time the
126 user issues a SAVE command. This completes the down-load of a single
127 field of K12MIT.SV.
128
129 At the current time, there are actually two fields of K12MIT.SV,
130 namely 00000-07577 and 10000-17577, and there are two such loaders.
131 There is no check for proper field, so the proper loader must be used
132 with the proper data, else the fields will get cross-loaded and will
133 certainly fail.
134
135 Once the two fields are obtained as separate .SV files (named
136 FIELD0.SV and FIELD1.SV) they can be combined using ABSLDR.SV with
137 the /I switch (image mode) set. The resultant can be saved as
138 K12MIT.SV. This, if all went well, is identical in every way to the
139 distributed K12MIT.SV (which is only distributed in encoded form; see
140 below). Actual file differences will only exist in the extraneous
141 portions of the file representing the header block past all useful
142 information and the artifacts of loading which represent 07600-07777
143 and 17600-17777 which are not used. This is the normal case for any
144 OS/8 system when any file is saved. Merely saving an image twice
145 will cause this to happen. At this point, K12MIT.SV can be used as
146 intended, namely to acquire, via KERMIT protocol, the entire release.
147 It is recommended that the provisional copy of K12MIT.SV be abandoned
148 as soon as the encoded copy is decoded since the encoding process
149 provides some assurances of valid data (using checksumming, etc.).
150
151 This process can be accomplished on any KL-style -8 interface
152 including PT08, etc., or on the printer port of VT-78 and all
153 DECmates. When used on the DECmates, there may be some minor
154 problems associated with the down-load which may have to be done as
155 the first use of the printer port after power-on, or some other
156 restriction. The loader includes a suggested instruction for DECmate
157 use if problematic (and raises the program length to 47 words).
158 Also, due to observed bugs in the operating system (OS/278 only),
159 there are restrictions on the use of ABSLDR.SV that cause certain
160 command forms to fail while other seemingly equivalent forms succeed!
161 This is documented in the latest K12MIT.BWR file in the distribution.
162 The command form stated in the K12IPL.PAL file is the only known form
163 that works correctly on these flawed systems.
164
165 The format for down-load files is known as .IPL or Initial
166 Program Load format. It consists of a leader containing only
167 lower-case letters (code 141-177 only) followed by "printable" data
168 in the range 041 (!) through 140 (`). Each of the characters
169 represents six bits of data, to be read left to right as pairs, which
170 load into PDP-8 12-bit memory. The implied loading address is
171 always to start at 0000 of the implied field. The leader comment
172 contains documentation of which field of data from K12MIT.SV it is.
173 The trailer consists of one lower-case character followed by anything
174 at all. This is why it is crucial that DEL (177) not appear anywhere
175 in the body of the file.
176
177 Throughout the file, all codes 040 or less are ignored. This
178 allows for spaces in the lower-case leader for better readability,
179 and for CR/LF throughout the entire file. CR/LF is added every 32
180 words (64 characters) to satisfy certain other systems' requirements.
181 The trailer contains documentation on a suggested SAVE command for
182 the particular data just obtained.
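
As a rough illustration of the .IPL rules above, the following Python sketch decodes such a file into 12-bit words. It is not the 46-word loader itself. Two details are assumptions not stated in the text: that a data character maps to six bits by subtracting octal 041, and that the left character of each pair is the high-order half of the word. The name decode_ipl is hypothetical.

```python
def decode_ipl(text: str) -> list[int]:
    """Decode .IPL text into 12-bit words, in load order from address 0000."""
    words = []
    pending = None      # first six-bit half of the current word, if any
    in_body = False     # becomes True once the first data character is seen
    for ch in text:
        code = ord(ch) & 0o177
        if code <= 0o40:
            continue                # codes 040 or less (space, CR, LF) are ignored
        if code >= 0o141:           # lower-case range 141-177
            if in_body:
                break               # first such character after data starts the trailer
            continue                # still inside the leader comment
        in_body = True
        six = code - 0o41           # ASSUMPTION: 041 -> 0 ... 140 -> 63
        if pending is None:
            pending = six
        else:
            # ASSUMPTION: the left character of a pair is the high-order half
            words.append((pending << 6) | six)
            pending = None
    return words


sample = 'field zero leader\r\n!!"#\r\n`` z anything may follow the trailer'
decoded = decode_ipl(sample)
```

Under these assumptions, a full field of 7600 (octal) words would occupy 7936 data characters, not counting the CR/LF pairs. A DEL (177) inside the body would be taken for the start of the trailer, which is why assumption d) above matters.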
183
184 2) PDP-8 ENCODE format is the format of choice to obtain binary OS/8
185 image files because of the validation techniques employed, etc. This
186 is the standard method of distributing K12MIT.SV as well as other
187 "critical" files such as TECO macros and other image files. In the
188 MS-DOS world there exists another very popular format known as .BOO
189 encoding. It would be useful to support this format on the PDP-8 as
190 well.
191
192 .BOO format files are smaller because they use six-bit encoding
193 instead of five-bit encoding, or at least in theory. Both ENCODE and
194 .BOO use repeat compression techniques, but ENCODE can compress
195 12-bit words of any value, while .BOO only compresses zeroes and that
196 itself is based on a byte-order view of the data. PDP-8 programs
197 often include large regions of non-zero words such as 7402 (HLT)
198 which would not compress when looked at as bytes. Such files would
199 show compression ratios quite different from the norm.
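
A small Python sketch can make the point about 7402 concrete. It assumes the OS/8 "3 for 2" byte order (two 12-bit words carry three 8-bit bytes, with the third byte split across the high four bits of each word). The helper name words_to_bytes is hypothetical.

```python
def words_to_bytes(w0: int, w1: int) -> bytes:
    """Unpack two 12-bit words into three bytes, assuming OS/8 '3 for 2' order."""
    b0 = w0 & 0xFF                       # low 8 bits of the first word
    b1 = w1 & 0xFF                       # low 8 bits of the second word
    b2 = ((w0 >> 8) << 4) | (w1 >> 8)    # high 4 bits of each word form byte 3
    return bytes([b0, b1, b2])


HLT = 0o7402
fill = [HLT] * 8                         # a short run of identical HLT words
as_bytes = b"".join(
    words_to_bytes(fill[i], fill[i + 1]) for i in range(0, len(fill), 2)
)
zero_bytes = as_bytes.count(0)           # bytes a zero-only compressor could use
```

Viewed as words, the region is one long run of a single value. Viewed as bytes, it is the repeating pattern 02 02 FF with no zero bytes, so a scheme that compresses only zero bytes gains nothing on it.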
200
201 In any case, .BOO format is useful on the PDP-8 because it allows
202 inter-change with .BOO files created on other systems, such as PCs.
203 This allows the exchange of unusually formatted files, such as TECO
204 macros between PDP-8s and PCs. (Both systems support a viable
205 version of TECO.)
206
207 The new KERMIT-12 utilities include a .BOO encoder and .BOO
208 decoder, known as K12ENB.PAL (or ENBOO.PAL) and K12DEB.PAL (or
209 DEBOO.PAL) respectively. They use .BOO encoded files unpacked in the
210 standard OS/8 "3 for 2" order to preserve the original byte contents
211 when the files originate from other systems. (Technically, .BOO
212 format doesn't require this, but the obvious advantages dictate it.
213 Anything encoded into .BOO format must merely have a 24-bit data
214 structure encoded into four six-bit characters, so in theory any
215 encoding of two adjacent PDP-8 12-bit words would be acceptable.
216 Additionally supplying the bits in OS/8 pack/unpack order guarantees
217 the inter-system compatibility as well.)
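
The relationship between bytes, 12-bit words, and .BOO characters might be sketched as follows in Python. This is an illustration, not code from K12ENB.PAL or K12DEB.PAL. It assumes the customary MS-DOS .BOO convention of adding the code for "0" to each six-bit value, and the OS/8 convention that the third byte is split across the high four bits of the two words. All function names are hypothetical.

```python
def bytes_to_words(b0: int, b1: int, b2: int) -> tuple[int, int]:
    """Pack three bytes into two 12-bit words in OS/8 '3 for 2' order."""
    w0 = ((b2 >> 4) << 8) | b0        # high nibble of byte 3 over byte 1
    w1 = ((b2 & 0x0F) << 8) | b1      # low nibble of byte 3 over byte 2
    return w0, w1


def encode_triplet(b0: int, b1: int, b2: int) -> str:
    """Express a 24-bit field as four six-bit characters."""
    n = (b0 << 16) | (b1 << 8) | b2
    # ASSUMPTION: each six-bit value is offset by ord("0"), as in MS-DOS .BOO
    return "".join(chr(((n >> s) & 0o77) + ord("0")) for s in (18, 12, 6, 0))


def decode_quad(chars: str) -> bytes:
    """Inverse of encode_triplet: four characters back to three bytes."""
    n = 0
    for ch in chars:
        n = (n << 6) | ((ord(ch) - ord("0")) & 0o77)
    return n.to_bytes(3, "big")


quad = encode_triplet(0x41, 0x42, 0x43)       # the ASCII bytes "ABC"
words = bytes_to_words(0x41, 0x42, 0x43)
```

Because the characters are derived from the byte sequence and not from an arbitrary arrangement of the two words, a decoder on a byte-oriented system recovers the same three bytes that OS/8 would unpack.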
218
219 There is an inherent weakness in the original .BOO format which
220 must be addressed. .BOO format files always end on one of two data
221 fields: either a repeat-zero compression field, or on a 24-bit field
222 expressed as four characters. Should the data in a 24-bit field
223 consist of only two or even one bytes, there are one or two
224 extraneous null bytes encoded into the field to complete it.
225
226 Presumably the need to add the extra bytes is to allow validation
227 of the format. In any case, only the encoder knows just how many (0,
228 1, 2) bytes are extraneous. We can presume that if the last byte is
229 non-zero, it is significant. If the last two are both zero, then the
230 last or possibly both are extraneous with no way to tell.
231
232 On PC systems, the general trend is to ignore these one or two
233 extra bytes because so far there haven't been any complaints of
234 failure. I have personally discovered that a widely used PC .BOO
235 encoding program (written in C) erroneously adds two null bytes as
236 a short compression field beyond the data! This is not a .BOO format
237 issue, but rather a genuine program bug. Apparently few PC users are
238 concerned that encoding their files prevents transparent delivery to
239 the other end.
240
241 In the OS/8 world, the situation is quite different. Each OS/8
242 record is 256 words or 384 bytes. If even a single byte is added,
243 this creates an additional all-zeroes record. Besides wasting space,
244 it is conceivable that such a file could be dangerous to use under
245 OS/8 depending on content. (Certain files, such as .HN files are
246 partially identified by their length. File damage, such as
247 lengthening a file from two to three records will confuse the SET
248 utility, etc.) Many files cannot be identified as having been
249 artificially lengthened (and may be hard to shorten!), so this must be
250 avoided.
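
The record arithmetic can be checked with a few lines of Python. The figures of 256 words and 384 bytes per record are from the text; the function name records_needed is hypothetical.

```python
import math

WORDS_PER_RECORD = 256
BYTES_PER_RECORD = WORDS_PER_RECORD * 12 // 8   # 256 words of 12 bits = 384 bytes


def records_needed(n_bytes: int) -> int:
    """Number of whole OS/8 records required to hold n_bytes."""
    return math.ceil(n_bytes / BYTES_PER_RECORD)


exact = records_needed(2 * BYTES_PER_RECORD)         # a file of exactly two records
padded = records_needed(2 * BYTES_PER_RECORD + 1)    # same file plus one stray byte
```

A file that exactly fills two records grows to three as soon as a single extraneous byte is appended, which is the lengthening from two to three records described above.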
251
252 I have invented a fix for the problem: repeat compression fields
253 are expressed as ~ followed by a count. 2 means two null bytes and
254 is thus the smallest "useful" field to be found. (It takes two
255 characters to express what would take 2-2/3 characters in encoded
256 format. One null would only take 1-1/3 characters, not two, so this
257 case is vestigial, but must be supported for the benefit of
258 brain-dead encoders.) The value of 0 means a count of literally zero,
259 thus ~0 is a "NOP" to a decoder. I have successfully tested MS-DOS
260 programs written in BASIC and C that decode .BOO files successfully
261 even if ~0 is appended to the end with no ill effects. (They
262 correctly ignored the appended fields.)
263
264 In my encoding scheme, ~0 at the end of a data field containing
265 trailing zeroes means to "take back" a null byte. ~0~0 means to take
266 back two null bytes. Thus files encoded with ENBOO.PAL either end in
267 a repeat-compression field as before, or in a data encoding field
268 possibly followed by ~0 or ~0~0 if necessary. The corresponding
269 DEBOO.PAL decodes such files correctly.
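
A decoder honoring the "take back" convention might look like the Python sketch below. It is not a transcription of DEBOO.PAL. It handles only the encoded body (no file-name line), assumes the usual .BOO offset of the code for "0" on both data characters and repeat counts, and assumes that at most two bytes are taken back after a data field. The name deboo_body is hypothetical.

```python
def deboo_body(body: str) -> bytes:
    """Decode a .BOO body, treating ~0 after a data field as 'take back a null'."""
    out = bytearray()
    takeback = 0            # how many trailing nulls ~0 may still remove
    i = 0
    while i < len(body):
        ch = body[i]
        if ch in "\r\n":
            i += 1                      # line breaks carry no data
        elif ch == "~":
            count = ord(body[i + 1]) - ord("0")   # ASSUMPTION: count offset by "0"
            if count == 0:
                # ~0 is a NOP unless it follows a data field ending in a null
                if takeback > 0 and out and out[-1] == 0:
                    out.pop()
                    takeback -= 1
            else:
                out.extend(bytes(count))          # run of null bytes
                takeback = 0
            i += 2
        else:
            n = 0
            for c in body[i:i + 4]:               # four characters = 24 bits
                n = (n << 6) | ((ord(c) - ord("0")) & 0o77)
            out.extend(n.to_bytes(3, "big"))
            takeback = 2                          # ASSUMPTION: at most two nulls
            i += 4
    return bytes(out)


one_byte = deboo_body("@@00~0~0")     # the single byte "A", padding taken back
foreign = deboo_body("@@00")          # same field without correction fields
```

A decoder that treats ~0 as a plain NOP would produce the three-byte result for both inputs, which matches the "one or two bytes too long" behavior on foreign systems described in the next paragraph.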
270
271 Should files encoded with ENBOO reach "foreign" systems, they
272 will do what they always do, i.e., make files one or two bytes too
273 long occasionally, with no other ill effects. Files originating from
274 such systems will certainly be lacking any trailing correction fields
275 and will cause DEBOO to perform as foolishly as MSBPCT. Extraneous
276 null bytes will appear at the end of the file in OS/8 just as in
277 MS-DOS in this case. (Note that if the file length is not a multiple
278 of 384 bytes, additional bytes are added by DEBOO as well, but this
279 is not a design weakness of .BOO format. It is caused by the clash
280 of fixed record size and a variable size format.)
281
282 Hopefully, files originating on OS/8 will be decoded on OS/8 as
283 well, thus preserving file lengths. Most "foreign" files will
284 probably be ASCII, so the ^Z convention will allow removal of
285 trailing null bytes at either end. It is hoped that MS-DOS and other
286 systems "upgrade" their .BOO format files to be compatible with the
287 PDP-8 version.
288
289 All KERMIT-12 files are available via the normal distribution
290 "paths" of anonymous FTP and/or KERMSRV. The user is directed to the
291 file /ftp/pub/kermit/d/k12mit.dsk as a "roadmap" to the entire
292 distribution. Each .PAL file includes assembly instructions. Most
293 use non-default option switches and non-default loading and saving
294 instructions, so each must be carefully read. The development
295 support files (TECO macro, .IPL generator, recent copies of PAL8,
296 CREF, etc.) are included in the total collection. Development is not
297 possible on RX01 systems due to inadequate disk space, but RX02's are
298 barely adequate with a lot of disk exchanges. (Future versions may
299 require larger disks for development.)
300
301 Charles Lasner (lasner@watsun.cc.columbia.edu)