sw/kermit/k12/k12mit.bwr

   1 Date: Fri, 1 May 1992 17:00:00 EDT
   2 From: Charles Lasner <lasner@watsun.cc.columbia.edu>
   3 Subject: DECmate I problems and more patching problems
   4
   5 DECmate I problems.
   6
   7     Attempts to use the distributed Kermit-12 Version 10g on a
   8 DECmate (I) system will certainly fail.  The coding specific to the
   9 DECmate I was never tested until recently.  Two key routines wait for
  10 status flags that never raise because the affected registers do not
  11 generate flag changes/interrupts.  This is unrelated to general
  12 serial data handling which works as originally coded.
  13
  14     There is a simple patch to the program to alleviate the problem:
  15
  16 .LOAD SYS:KERMIT.SV/I$*$        [load the file in image mode and then
  17                                  ask for more input.  The $ which is
  18                                  printed signifies the use of <ESC>
  19                                  as the command terminator.  The * is
  20                                  printed by the system command
  21                                  decoder requesting further input.
  22                                  The second $ signifies the use of
  23                                  <ESC> to end input to the command
  24                                  decoder.  The loader program
  25                                  terminates and control returns to
  26                                  the keyboard monitor.]
  27
  28 .ODT                            [call in ODT to patch the program]
  29
  30 7/ 0007 0012                    [change default baud rate to 2400;
  31                                  this is optional]
  32
  33 353/ 5352 7000                  [make a JMP .-1 into a NOP]
  34 10302/ 5301 7000                [make a JMP .-1 into a NOP]
  35 12243/ 3607 3610                [bump the version number]
  36
  37 ^C                              [^C to exit to monitor]
  38
  39 .SAVE SYS KERMIT                [save patched file]
  40
  41 This also updates the release revision from 10g to 10h.  Future
  42 versions will eliminate the overhead of the now defunct routines.
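
The two NOP patches above can be sanity-checked by decoding the words
they replace.  What follows is a minimal Python sketch, not part of
Kermit-12.  The function name jmp_target is hypothetical, and the
sketch assumes the standard PDP-8 memory-reference layout (3-bit
opcode, indirect bit 0400, current-page bit 0200, 7-bit offset).  It
checks that 5352 at location 0353 and 5301 at location 0302 (in field
1) are both direct jumps to the preceding location, i.e. JMP .-1.

```python
def jmp_target(addr, word):
    """Return the target of a direct PDP-8 JMP stored at 12-bit address addr.

    Assumes the usual memory-reference format: opcode in the top three
    bits, 0400 = indirect, 0200 = current page, low seven bits = offset.
    """
    if (word >> 9) != 0o5:
        raise ValueError("not a JMP instruction")
    if word & 0o400:
        raise ValueError("indirect JMP; target depends on memory contents")
    offset = word & 0o177
    # Current-page references reuse the page bits of the instruction's address.
    page = (addr & 0o7600) if (word & 0o200) else 0
    return page | offset


# Both patched words jump to the location just before themselves.
for addr, word in ((0o353, 0o5352), (0o302, 0o5301)):
    print(f"{addr:04o}/ {word:04o} jumps to {jmp_target(addr, word):04o}")
```

Replacing either word with 7000 (NOP) therefore lets execution fall
through instead of spinning on a flag that never rises.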
  43
  44     The only DECmate versions remaining to be tested are:
  45
  46     DECmate I with DP278B (the system used for testing has DP278A)
  47     DECmate III without internal modem
  48     DECmate III with internal modem
  49     DECmate III+
  50
  51 Patching problems.
  52
  53     The restrictions placed on patching apparently stem from a bug
  54 going back at least as far as OS/8 V3D (likely further).  Apparently,
  55 when a JSW value of 1 is used (as Kermit-12 does), the GET command
  56 doesn't work.  Apparently, the system confuses the need to save the
  57 contents of 10000-11777 with the need to load it in the first place.
  58 Kermit-12 operates by first placing once-only code in the affected
  59 area, then discarding it in favor of a locked-in copy of the USR
  60 routine.  To avoid overhead, the JSW value of 0001 is set to indicate
  61 there is no need to save this dead code when the USR is swapped in
  62 over it.  Apparently, the GET command sets the JSW value of 1 too early in the
  63 load process, so the code that uses the USR to carry out the actions
  64 of the GET operation doesn't properly load the Kermit-12 code.
  65
  66 Consequently, the warnings documented in previous chapters of the
  67 .BWR file (below) apply in all cases.  The interaction with CCL cited
  68 below may not apply in all versions, but the GET command problem is
  69 apparently universal.
  70
  71 cjl
  72
  73 ------------------------------
  74
  75 Date: Mon, 28 October 1991 20:00:00 EST
  76 From: Charles Lasner <lasner@watsun.cc.columbia.edu>
  77 Subject: Kermit-12 patching restrictions revisited yet again
  78
  79     Even more operating system ills (will it ever end?):
  80
  81     Still further investigation into operating system bugs in OS/278
  82 V2 on DECmates reveals that the problem is even worse than realized
  83 two weeks ago (see previous .BWR article):
  84
  85     When a SAVE command is executed from OS/278 involving a loaded
  86 handler, the SAVE operation fails!  The contents of the files will be
  87 corrupted in general and will likely become (at least partially) all
  88 zeroes!  The exact scope of the problem has not been ascertained, but
  89 certain loading tests reveal that the command fails even when
  90 accessing additional memory beyond field zero and one.
  91
  92     All operations to SYS: or any device co-resident with SYS: (or
  93 when DSK:=SYS: which is typically the case in many systems but not a
  94 rule) are unaffected beyond the restrictions reported previously.
  95
  96     Until recently, SAVE commands were of little interest to the
  97 casual user of OS/278, since program execution and ordinary file
  98 creation are unaffected.  Since there are now several programs to be
  99 loaded and saved by users, the problem is more significant.  Users of
 100 the direct loading method of acquiring KERMIT-12 are also in the
 101 affected category.
 102
 103     Clearly all developers and anyone assembling any part of the
 104 KERMIT-12 package should be aware of this problem.  As a precaution,
 105 all persons using the SAVE command for any reason are advised to use
 106 the form involving SYS: only to avoid this problem.  (Advanced users
 107 can determine which handlers are possibly co-resident and are thus
 108 acceptable as well.) The resultant file can always be copied to any
 109 device as required after the fact.
 110
 111 cjl
 112
 113 ------------------------------
 114
 115 Date: Thu, 10 October 1991 05:00:00 EDT
 116 From: Charles Lasner <lasner@watsun.cc.columbia.edu>
 117 Subject: Kermit-12 patching restrictions revisited and .BOO problems
 118
 119     More operating system ills regarding file loading:
 120
 121     Further investigation into operating system bugs in OS/278 V2 on
 122 DECmates reveals that the problem is worse than first realized:
 123
 124     When using GET or LOAD (ABSLDR) commands, especially when loading
 125 image files such as FIELD0.SV, FIELD1.SV (the partial load files from
 126 direct memory loading of K12MIT), or K12MIT.SV with /I, the JSW and
 127 starting field/address can become "mangled" into unusable values.
 128 One particular case achieved the impossible value of 6303 for a
 129 starting field change instruction (legal values are 6203 through 6273
 130 by 10s).
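
A mangled starting-field word of this kind can be detected
mechanically.  The sketch below is an illustration in Python, not
something OS/278 or Kermit-12 provides; the function name is
hypothetical.  It relies only on the range stated above (6203 through
6273 in steps of 10 octal, one value per memory field 0-7).

```python
# Legal starting-field change instructions: 6203, 6213, ... 6273 (octal),
# one per memory field 0..7, as stated in the text.
LEGAL_FIELD_CHANGES = {0o6203 + (field << 3) for field in range(8)}


def is_legal_field_change(word):
    """True if word is one of the eight legal starting-field instructions."""
    return word in LEGAL_FIELD_CHANGES


print(is_legal_field_change(0o6213))  # field 1: legal
print(is_legal_field_change(0o6303))  # the mangled value reported above
```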
 131
 132     Consequently, the general recommendation for SAVE commands as
 133 used in various utilities throughout KERMIT-12 configuration etc., is
 134 to use explicit starting address and loading locations and JSW
 135 values.  In short, always give a complete description of the SAVE
 136 operation under OS/278.  For example, when direct-loading K12MIT
 137 through the printer port into the DECmate, the following commands
 138 should be used:
 139
 140 }LOAD FIELD0.SV/I,FIELD1.SV/I$*$
 141
 142 }SAVE SYS K12MIT.SV 00000-07577,10000-17577;00200=0001
 143
 144     As discussed earlier, the CCL form of the ABSLDR LOAD command
 145 works even though other seemingly equivalent forms don't.  The
 146 complete SAVE command forces all parameters to be taken explicitly
 147 from the command; no reliance on system "assumptions" or loading
 148 artifacts.  Always use the complete values for loading taken from the
 149 relevant program documentation.
 150
 151     Most users of KERMIT-12 are running OS/8 V3D, etc., where this
 152 sort of system bug isn't seen.  In the future, all KERMIT-12
 153 documentation will give the "verbose" form of the command to contain
 154 this OS/278 V2-specific problem.
 155
 156 Regarding .BOO format encoding:
 157
 158     The newest release of KERMIT-12 includes .BOO format encoding of
 159 all binary files and TECO macros as an alternative to ENCODE format.
 160 ENCODE format is still the preferred method of distribution, but .BOO
 161 format allows for use with other systems, such as MS-DOS.  For
 162 example, TECO macros used with OS/8 TECO can be interchanged in .BOO
 163 format with similar files used with MS-DOS TECO.  Intermediary sites,
 164 such as unix systems, will not destroy the "delicate" nature of such
 165 files, etc.
 166
 167     The KERMIT-12 .BOO utilities are NOT totally compatible with
 168 existing .BOO utilities on other systems! Just like OS/8 ENCODE and
 169 DECODE, ENBOO and DEBOO do a perfect encoding/decoding of OS/8 files
 170 into their original form.  When used with "foreign" .BOO decoders,
 171 some unpredictable things might occur.
 172
 173     Certain other .BOO encoders are known to throw in extraneous null
 174 bytes at the end of the file.  Further, there is a design weakness in
 175 the original .BOO format that causes more null bytes to possibly
 176 appear.  The KERMIT-12 programs utilize a superset of the original
 177 format to ensure correct encoding/decoding.  When passing these files
 178 which now contain "correction bytes" to older decoders, the files are
 179 decoded with inflated lengths because the older decoders don't
 180 recognize the length correction.  When passing files created by older
 181 encoders to the PDP-8, the resultant decoded files will also have
 182 inflated lengths because the older encoders failed to place
 183 correction bytes into the file.
 184
 185     The general rule for dealing with .BOO files originating from
 186 other systems is that they may have incorrect lengths.  The resultant
 187 files may be (falsely) padded out with extraneous null bytes.  In any
 188 case, since the files generally have no blocking structure, the files
 189 will be padded by OS/8 up to the nearest whole record or multiple of
 190 384 bytes anyway.  Unless the file is ASCII and has a ^Z at the end,
 191 there is no way to determine the original intended file length.
 192 Files may be padded by null bytes introduced by other systems' bugs,
 193 the inherent weakness of the original .BOO format, or ultimately by
 194 OS/8 padding requirements.
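
The effect of the final padding step can be worked out with simple
arithmetic.  The following Python sketch is only an illustration; the
function name is hypothetical, and the 384-byte record size is the
figure given above.

```python
OS8_RECORD_BYTES = 384  # one OS/8 record, per the text above


def padded_length(byte_count):
    """Round a byte count up to the next whole OS/8 record."""
    records = -(-byte_count // OS8_RECORD_BYTES)  # ceiling division
    return records * OS8_RECORD_BYTES


# A 1000-byte file occupies three records once stored under OS/8.
print(padded_length(1000))  # 1152
```

Any length between 769 and 1152 bytes yields the same stored size,
which is why the original length cannot be recovered from the OS/8
file alone.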
 195
 196     ASCII files from other systems may be adjusted by using an editor
 197 such as TECO which stops at the ^Z.  A second generation of the
 198 transferred file may be somewhat shorter when processed this way.
 199
 200     Should a file originating in OS/8 be intended for OS/8 use only
 201 (such as an encoding of a .SV file), it should not be decoded on an
 202 intermediate system, because a re-encoded version may differ from the
 203 encoded original because of ignored correction bytes, bugs, or the
 204 inability to insert correction bytes.  Violating any of these rules
 205 could lead to OS/8 files corrupted into being too long.  It is
 206 conceivable that these altered files are even dangerous to use under
 207 OS/8 because of their inflated lengths. (Certain files are validated
 208 by their restricted size, such as .HN files which must be exactly two
 209 or three blocks long depending on whether they are for one or two
 210 page handlers.  If a one-page handler became three blocks in file
 211 length, it could conceivably be confused with a two-page handler,
 212 etc.)
 213
 214 cjl
 215
 216 ------------------------------
 217
 218 Date: Sun, 7 October 1990 12:00:00 EDT
 219 From: Charles Lasner <lasner@watsun.cc.columbia.edu>
 220 Subject: Kermit-12 patching restrictions
 221
 222     All Kermit-12 configuration done according to the documentation
 223 works "as advertised." Users are tempted to patch the distributed
 224 image file K12MIT.SV as a "quick and dirty" method to make small
 225 modifications such as changing the default baud rate, etc.  There is
 226 "conventional wisdom" that this can be accomplished using GET, SAVE
 227 commands to allow the use of ODT; this method is ordinarily used with
 228 other OS/8 family programs.  It has been reported that this does NOT
 229 work on OS/278, the usual operating system for the DECmates.  The
 230 following method should be avoided (a work-around is offered later):
 231
 232 .GET SYS KERMIT                 [setup current image for patching]
 233
 234 .ODT                            [call in ODT to patch the program]
 235
 236 7/ 0007 0012                    [change default baud rate to 2400]
 237
 238 ^C                              [^C to exit to monitor]
 239
 240 .SAVE SYS KERMIT                [save patched file]
 241
 242 This method follows the exact procedure described in virtually every
 243 OS/8 document regarding patching of image files.  The cited example
 244 changes the default baud rate from 1200 Baud to 2400 Baud by
 245 replacing the value chosen from the DEC standard table for 1200 Baud
 246 with the applicable value for 2400 Baud.  This value is stored within
 247 Kermit-12 as the corresponding twelve-bit word with all high-order
 248 bits zeroed.  (The location used is 000007; this is valid for Version
 249 10g, but could change in later versions.)
 250
 251     This attempt to make changes the "conventional" way produces a
 252 corrupted image file of K12MIT.SV (renamed to KERMIT.SV in the above
 253 example) when using OS/278 Version 2, the usual operating system on
 254 the DECmate II, etc.  This method probably works in earlier (OS/8
 255 V3D, etc.) systems; however, no attempt has been made to trace this
 256 bug in prior systems.  A "fool-proof" method is required that works
 257 in spite of bugs in the operating system.
 258
 259     A work-around was attempted using OS/278 V2 on a DECmate II hard
 260 disk system:
 261
 262 .LOAD SYS:KERMIT.SV/I           [load the file in image mode]
 263
 264 .ODT                            [call in ODT to patch the program]
 265
 266 7/ 0007 0012                    [change default baud rate to 2400]
 267
 268 ^C                              [^C to exit to monitor]
 269
 270 .SAVE SYS KERMIT                [save patched file]
 271
 272 This also fails!
 273
 274     For reasons not understood yet, the following seemingly
 275 equivalent command DOES work:
 276
 277 .LOAD SYS:KERMIT.SV/I$*$        [load the file in image mode and then
 278                                  ask for more input.  The $ which is
 279                                  printed signifies the use of <ESC>
 280                                  as the command terminator.  The * is
 281                                  printed by the system command
 282                                  decoder requesting further input.
 283                                  The second $ signifies the use of
 284                                  <ESC> to end input to the command
 285                                  decoder.  The loader program
 286                                  terminates and control returns to
 287                                  the keyboard monitor.]
 288
 289 .ODT                            [call in ODT to patch the program]
 290
 291 7/ 0007 0012                    [change default baud rate to 2400]
 292
 293 ^C                              [^C to exit to monitor]
 294
 295 .SAVE SYS KERMIT                [save patched file]
 296
 297 This allows ODT commands to patch the file as intended, and also
 298 causes the subsequent SAVE command to work properly.  All OS/8 family
 299 systems support this command (as long as CCL is enabled), so it will
 300 "always" work.
 301
 302     For those users who run with CCL turned off, the following
 303 sequence will also work:
 304
 305 .R ABSLDR                       [run the loading program directly]
 306 *KERMIT.SV/I                    [load Kermit in image mode]
 307 *$                              [<ESC> is typed to terminate the
 308                                  loading process.]
 309
 310 .ODT                            [call in ODT to patch the program]
 311
 312 7/ 0007 0012                    [change default baud rate to 2400]
 313
 314 ^C                              [^C to exit to monitor]
 315
 316 .SAVE SYS KERMIT                [save patched file]
 317
 318     The newer OS/8 family systems generally can't turn off the CCL
 319 mechanism.  Since the R and RU commands are typically disabled on
 320 newer releases, only the CCL command work-around applies.  Users
 321 opting to disable CCL are likely running "older" systems, such as
 322 OS/8 V3D on DECtapes.  On these systems, ANY of the above methods
 323 should work, because the problematic bug didn't exist on those
 324 systems.  Had DEC not gone "backwards" we could have avoided this
 325 entire discussion!
 326
 327     It is assumed the user will make "correct" patches to KERMIT-12;
 328 at least there is a "safe and proper" mechanism available to
 329 accomplish it!
 330
 331 cjl
 332
 333 ------------------------------
 334
 335 Date: Thu, 6 September 1990 12:00:00 EDT
 336 From: Charles Lasner <lasner@watsun.cc.columbia.edu>
 337 Subject: Kermit-12 potential problems
 338
 339     A newly implemented ENCODE/DECODE method should eliminate the
 340 reported problems with regard to passing encoded binary files through
 341 problematic "paths." The method chosen is a variant on the 5-bit
 342 encoding algorithm suggested.  Encoded files now pass right through
 343 all of the WPS-related utilities.  It is necessary to acquire
 344 virtually all files of this re-release of KERMIT-12 since all ENCODed
 345 files are different, as well as the source programs for the
 346 ENCODing/DECODing utilities themselves.  Due to the file being
 347 "bare", the TECO macro K12GLB.TEC is possibly defective when it
 348 arrives at a user site; it will now be ENCODed as K12GLB.ENC to avoid
 349 this problem.
 350
 351     The KERMIT-12 source files are different due to maintenance work,
 352 requiring the user to obtain the re-released files.  The sources now
 353 include a file to "pre-clear" memory.  This aids in reducing the size
 354 of the ENCODed binary file K12MIT.ENC since undefined areas are no
 355 longer "relics" of random values, rather they are all set to 0000
 356 octal.  The long strings of identical words will be eliminated since
 357 the new encoding format does repeat compression.
 358
 359     KERMIT-12 has still not been tested on any DECmates other than
 360 the DECmate II, as no volunteers have come forward with the proper
 361 hardware:
 362
 363     DECmate I with DP278A
 364     DECmate I with DP278B
 365     DECmate III without internal modem
 366     DECmate III with internal modem
 367     DECmate III+
 368
 369     A tentative volunteer for the DECmate I with DP278A configuration
 370 has been contacted, but testing has not yet started.
 371
 372 cjl
 373
 374 ------------------------------
 375
 382 Date: Wed, 25 Jul 90 21:16:18 EDT
 383 From: Charles Lasner <lasner@cunixf.cc.columbia.edu>
 384 To: fdc@cunixf.cc.columbia.edu
 385 Subject: This was sent out to PDP8-LOVERS
 386 Message-Id: <CMM.0.88.648954978.lasner@cunixf.cc.columbia.edu>
 387
 388     I thought you might want to see this; it refers to the encoding
 389 problem I reported for a user with the problem (he has no net
 390 capability) in the programs using that encoding scheme we
 391 discussed...
 392
 393 From:   Charles Lasner (cjl)
 394 To:     PDP8-LOVERS
 395 Subj:   Feedback on encoding issues regarding archived files.
 396
 397     I have written a pair of OS/8 programs to ENCODE and DECODE
 398 binary files into an "ASCII-fied printable" format.  Those of you
 399 familiar with either uuencode/uudecode or .BOO format will understand
 400 my intentions.  They were originally written for the purpose of
 401 distribution of binary (.SV) files of KERMIT-12 by Columbia
 402 University in NY as part of the standard KERMIT collection (K12*.*).
 403 Columbia imposes a restriction on all files: they must be distributed
 404 in ASCII only.  This is to ensure proper distribution regardless of
 405 the "path" taken between Columbia and the end user.  Be advised that
 406 various problematic E-mailers, ASCII-EBCDIC EBCDIC-ASCII
 407 translations, filters for reserved codes, known problematic character
 408 substitutions, etc. are lurking out there! Consider yourself lucky if
 409 you get your sender's copy intact without some form of "cosmetic"
 410 reformatting.  By encoding the binary files into an appropriate
 411 subset of ASCII, these problems hopefully are avoided.  While we
 412 can't prevent ALL problems, we can usually tackle the most likely
 413 ones.
 414
 415     My original design was based on a discussion I had with Frank da
 416 Cruz of Columbia University (of KERMIT fame) regarding what to
 417 restrict ourselves to in a robust format.  He was "unhappy" with some
 418 of the vulnerabilities of the uuencode and .BOO formats, which while
 419 popular, are not impervious to some "real" problems that have come
 420 up.  We essentially designed an archiving format that was PDP-8
 421 oriented, but not limited to -8s only.  Some of the highlights of the
 422 format are:
 423
 424 a)  File format restricted to "printable" six-bit subset of ASCII
 425 only.  All else ignored; this was to minimize the "garble" factor,
 426 yet maintain a fairly high rate of efficiency: two ASCII characters
 427 equal one PDP-8 12-bit word. (This has proved to be problematic and
 428 is why we are here!)
 429
 430 b)  The archive file contains imbedded commands, not implied ones.
 431 By validating the commands, you can "trust" the contents.  Commands
 432 are available for whatever purpose arises.  Already implemented are
 433 commands to start ("(FILE filename.ext)") and end ("(END
 434 filename.ext)") the imbedded file, and an official comment command
 435 ("(REMARK anything)") to help document the contents of the rest of
 436 the file.  This is of course expandable.  My OS/8 programs create all
 437 three types of commands.  The start and end commands also
 438 theoretically allow multiple files in an archive, but I ignore the
 439 end command in the decoder and only allow one file per archive.  I do
 440 support the start command completely, which includes a suggested name
 441 for the file.  This name can be used at the user's option, or can be
 442 locally overridden.  The encoding program inserts the original file
 443 name in this field, as this is of course the most likely name for the
 444 file at the other end.
 445
 446 c)  The archive contains a checksum for its contents to ensure the
 447 validity of the file.
 448
 449 d)  All "white space" character considerations are ignored; imbedded
 450 extraneous space characters, formfeeds, extra CR/LF, etc. are
 451 harmless.  The CR/LF must be present at appropriate intervals, but
 452 this is only a problem with files passed through unix systems that
 453 delete the CR.  Since OS/8 requires the CR and LF to be considered
 454 "printable", this is not a problem;  the use of programs such as
 455 c-KERMIT will insert the CR if configured properly (SET FILE TYPE
 456 TEXT).  Programs such as Rahul Dhesi's FLIP program are available to
 457 correct the problem easily if necessary: FLIP -m *.* or equivalent
 458 will remedy this.
 459
 460 e)  There is an internal record length of 64 characters with framing
 461 characters, to ensure the validity of each record.  This prevents the
 462 file from getting "out of sync" with its original.  This causes each
 463 line to be 68 characters including CR and LF, which is usually
 464 reasonable to most systems.
 465
 466     Unfortunately, this scheme has proved to be flawed in an
 467 important way that "matters."  This format must deliver files to
 468 OS/278 systems by the prevailing paths of existent systems connected
 469 to DECmates containing only the normally present DEC release
 470 software.  This could include sending the files via DEC-DX through
 471 WPS8, or acquiring the files on either DECmate CP/M-80 or DECmate
 472 MS-DOS, possibly using KERMIT-80, or KERMIT-MS as appropriate.  If a
 473 file is received in the CP/M-80 environment, it can be converted to
 474 WPS8 format using a DEC-supplied program called WPSCONV.  If a file
 475 is received in the MS-DOS environment, it can be converted to WPS8
 476 format using a DEC-supplied program called CONVERT.  Incidentally,
 477 CONVERT can also convert CP/M-80 files as well, using MS-DOS format
 478 as an intermediary;  WPSCONV is known to have bugs, which were
 479 corrected in CONVERT (which requires the MS-DOS hardware, not just
 480 the CP/M-80 hardware).  These CP/M-80 and MS-DOS files can also come
 481 to the DECmate directly from a Rainbow as well, since the
 482 corresponding Rainbow systems are format compatible with the DECmate.
 483 DECmate MS-DOS additionally supports IBM-PC diskettes (160K or 180K
 484 single-sided only and read-only) as yet another source.  Thus there
 485 are many paths to WPS8 versions of our files.
 486
 487     The problem with these methods is that apparently there is a bug
 488 in OS/278 WPFLOP, the only distributed conversion program a user
 489 would already have on his OS/278 system. (We haven't actually
 490 isolated the problem to WPFLOP, as the complaining user was taking
 491 the files from MS-DOS via CONVERT then to OS/278 via WPFLOP;
 492 conceivably the problem is in CONVERT, but in any case the problem
 493 exists somewhere in this supported path.)
 494
 495     The internal encoding used is to break the 12-bit word into two
 496 six-bit halves.  Each half is in the range 00-77 octal.  Adding 041
 497 to the value yields characters in the range of ! through ` or 041
 498 through 140 octal.  The codes for 101-132 are A through Z and can be
 499 replaced by 141-172 for a through z if desired.  This prevents
 500 case-sensitivity which is another possible network anomaly.  We
 501 identified the DECmate problem as an anomaly regarding @ and `.  The
 502 character codes for 100 and 140 are not treated uniquely, so the
 503 resulting OS/278 file is an inaccurate representation of the file.
 504 The decoding program correctly failed the conversion on a checksum
 505 error, so at least the user was aware of the problem!
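
The six-bit scheme and its failure mode can be reproduced in a few
lines.  This is a minimal Python sketch, not the actual ENCODE/DECODE
source; the function names are hypothetical, and framing characters
and the checksum are omitted because their details are not given
here.  It follows the description above: split the 12-bit word into
two six-bit halves, add 041 octal to each, and accept lower-case
letters as equivalents of upper-case on input.

```python
def encode_word(word):
    """Encode one 12-bit word as two printable characters (! through `)."""
    high, low = (word >> 6) & 0o77, word & 0o77
    return chr(high + 0o41) + chr(low + 0o41)


def decode_pair(pair):
    """Decode two characters back into a 12-bit word, folding a-z to A-Z."""
    halves = []
    for ch in pair:
        code = ord(ch)
        if 0o141 <= code <= 0o172:  # a-z accepted in place of A-Z
            code -= 0o40
        halves.append(code - 0o41)
    return (halves[0] << 6) | halves[1]


original = 0o3777
encoded = encode_word(original)       # "@`": 037 -> @ (100), 077 -> ` (140)
folded = encoded.replace("`", "@")    # what a path that confuses ` with @ yields
print(f"{original:04o} -> {encoded!r} -> {decode_pair(folded):04o}")
```

The damaged pair decodes to 3737 instead of 3777, so only the
checksum reveals the corruption.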
 506
 507     As the PDP8-LOVERS, we will hopefully acquire an archive site for
 508 our files, so all of these considerations will apply.  We need a file
 509 format that is "bullet-proof" to avoid problems like this one.  I am
 510 soliciting suggestions for improvements on this encoding scheme (and
 511 any other overall file format suggestions as well) to provide an
 512 effective solution.  The resultant programs will be added to the
 513 KERMIT-12 collection freely distributed by Columbia as well as other
 514 sources (DECUS, etc.).
 515
 516     Some suggestions have already been made:
 517
 518 1.  Just "quick-fix" the problem by providing an alternate character
 519 to the ` to make it non-anomalous with @.  The available choices are
 520 { | } and ~ only.  The DEL character (octal 177) is unsuitable for
 521 other reasons; all other characters are either already used, or
 522 unprintable, or lower-case.  This has the advantage of being most
 523 compatible with the existing programs, since the original character
 524 code can be supported as well; the "preferred" character would be
 525 generated by all future versions of the ENCODE program, and existing
 526 files could be trivially edited for compatibility as needed.  This
 527 would have to be tested  -- it is possible that the bug would
 528 persist.  The choice is further narrowed to { and | only, since 175
 529 and 176 are sometimes treated as alternates to ESC.  It is likely
 530 that systems which "mangle" the case of a character which is
 531 alphabetic could also do the same to { | } and ~ making them [ \ ]
 532 and ^ respectively.  This makes the entire suggestion unworkable.
 533
 534 2.  Change the format to "Hexafied-ASCII" where each PDP-8 12-bit
 535 word becomes represented by three characters from a 16-character set
 536 such as 0-9,A-F or A-P.  The alphabetic codes would be immune to case
 537 conversion, and virtually every system supports this subset of ASCII.
 538 Instead of 64 characters on a line representing 32 12-bit words, each
 539 line would be 72 characters on a line representing 24 12-bit words
 540 (not counting framing characters and CR/LF).  This also allows for
 541 many additional codes if needed.  This scheme has the drawback of
 542 making the encoded file more inefficient, as the file will generally
 543 be 50% longer than those created by the original six-bit scheme.
 544 This robust scheme is workable.
 545
 546 3.  Modify 2. to include some form of compression.  The easiest is to
 547 incorporate repeat compression.  One simple scheme is to use an
 548 indicator character (R was suggested) as a prefix for an encoded
 549 count.  It could be followed by three characters encoding the value
 550 of the 12-bit word and two characters encoding the value of the
 551 repeat count.  Since this occupies six characters, as does two
 552 adjacent 12-bit encoded words, this scheme saves space when used for
 553 repeat compression lengths greater than two.  The compressed field is
 554 the same length as two uncompressed "triplets", so overall file
 555 validation techniques wouldn't require special-case checks, as long
 556 as trailing "fill" characters were allowed for the last record before
 557 the short checksum record (which is signalled by its length).  (T was
 558 suggested for this trailer character to be used to pad the last line
 559 with 0-69 characters.) This allows for compressing 3-258 repeated
 560 12-bit words into six characters.  This would benefit files
 561 containing large areas of zeroes or HLT instructions, etc., as this
 562 can be the actual contents of binary files.  If a .BN file created by
 563 PAL8, etc. is loaded and saved, then "junk" areas are created in the
 564 .SV file.  Unfortunately, this is the norm, and the junk increases
 565 the size of the encoded version of the file.  If the .BN file is
 566 loaded AFTER loading an all-zeroes file such as the binary output of:
 567
 568         *0
 569
 570         ZBLOCK  7600
 571
 572         $
 573
 574 or equivalent as necessary (extended memory zeroed if required,
 575 etc.), then the file has all-zeroes gaps in it.  These would repeat
 576 compress out using this scheme.  Incidentally, an additional
 577 advantage of this method is that the resulting "cleaner" core-image
 578 file is slightly easier to disassemble, in case the source is lost.
 579 (Anyone who ever disassembled a .SV file or equivalent understands
 580 what I mean!)  This also makes a binary papertape file (such as a
 581 diagnostic) loaded into a .SV file a little easier to follow when
 582 consulting the write-up, as memory is zeroed in between the locations
 583 referenced in the listing.  The .SV file is smaller when encoded than
 584 the .BN file due to elimination of the paper-tape encoding overhead.
 585 OS/8 files of diagnostics could therefore be more efficiently
 586 archived as .SV files (encoded) than .BN files.
 587
 588 4.  Change to a 5-bit encoding with compression.  This would use 32
 589 codes chosen from A-Z, 0-9 to encode the file five bits at a time per
 590 character.  Five PDP-8 12-bit words would be encoded in 12
 591 characters.  Since PDP-8 binary files are always multiples of 128
 592 12-bit word pages, there would need to be 4.8 "junk characters" at the end
 593 of each block to encode the implied length of 130 words/block.  Each
 594 line would be 78 characters (plus framing characters and CR/LF) so
 595 that four lines encodes a PDP-8 page, just as in the original six-bit
 596 scheme (the original scheme used 64 characters per line!).  The last
 597 line of the file would contain 0-77 padding characters as necessary
 598 to maintain the line width as before.  Repeat compression schemes can
 599 be expressed in any way that is a multiple of 12 characters; perhaps
 600 one or two adjacent expressions of repeat compression similar to
 601 above.  Expected efficiency of this scheme is similar to the original
 602 six-bit method, or possibly slightly better; if compression is NEVER
 603 useful, then the file is 1.2 times as large.
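
    The arithmetic of the five-bit proposal can be checked with a
short Python sketch.  This is only an illustration under stated
assumptions, not the format of any released ENCODE program: the
particular 32 codes (A-Z plus 0-5), the bit order within each
12-character group, and the names encode_page and decode_page are
all invented here.  Framing characters, CR/LF, and repeat compression
are left out.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"  # assumed choice of 32 codes
WORDS_PER_GROUP = 5      # 5 words * 12 bits = 60 bits
CHARS_PER_GROUP = 12     # 12 chars * 5 bits = 60 bits
LINE_WIDTH = 78          # 4 lines * 78 chars = 312 chars = 130 words

def encode_page(words):
    """Encode one 128-word page as four 78-character lines."""
    if len(words) != 128:
        raise ValueError("a page is 128 twelve-bit words")
    padded = list(words) + [0, 0]           # pad 128 words out to 130
    chars = []
    for g in range(0, len(padded), WORDS_PER_GROUP):
        bits = 0
        for w in padded[g:g + WORDS_PER_GROUP]:
            bits = (bits << 12) | (w & 0o7777)
        for shift in range(55, -1, -5):     # 12 five-bit fields, high first
            chars.append(ALPHABET[(bits >> shift) & 0o37])
    text = "".join(chars)
    return [text[i:i + LINE_WIDTH] for i in range(0, len(text), LINE_WIDTH)]

def decode_page(lines):
    """Inverse of encode_page; the padding words are discarded."""
    text = "".join(lines)
    words = []
    for g in range(0, len(text), CHARS_PER_GROUP):
        bits = 0
        for c in text[g:g + CHARS_PER_GROUP]:
            bits = (bits << 5) | ALPHABET.index(c)
        for shift in range(48, -1, -12):    # 5 twelve-bit words, high first
            words.append((bits >> shift) & 0o7777)
    return words[:128]

page = [(i * 37 + 5) & 0o7777 for i in range(128)]   # arbitrary test data
lines = encode_page(page)
print(len(lines), [len(line) for line in lines])
print(decode_page(lines) == page)
```

    Without compression a page costs 312 characters here against 256
in the six-bit scheme, which is where the factor of roughly 1.2
comes from.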
 604
 605     There is an implementation restriction placed on the DECODE
 606 program: it should be relatively short, since it must be distributed
 607 in source form.  It must also be written in a subset of PAL8
 608 compatible with the original PAL8 of the PS/8 days (ugh!) to ensure
 609 viability on any OS/8 family system.  PAL8 Version B0 from OS/278 is
 610 distributed in ENCODed form, so this restriction need not apply to
 611 any other programs such as the ENCODE program or KERMIT-12, etc.  It
 612 has been determined that PAL8 Version B0 and the companion CREF
 613 Version B0 will correctly function on any OS/8 family system on any
 614 PDP-8 member suitably configured to run the operating system the user
 615 already has running. (There is a minor anomaly when using input files
 616 from the TTY: handler; see K12MIT.DOC for a detailed explanation.)
 617 CPU extensions such as BSW and IAC RAL are not present in these
 618 programs, as was the original intention of OS/8 (which eventually was
 619 lost as newer members of DEC's programming staff were ignorant of
 620 this problem!).  It is acceptable to have a "bare-bones" subset of
 621 the DECODE program distributed in "old" PAL8-compatible source form,
 622 along with a "fancier" version written in a more modern PAL8
 623 language, as the binary could then be DECODed with the subset DECODE
 624 program, or the source could be assembled with PAL8 Version B0 to
 625 "bootstrap" the "full" version of the DECODE program as necessary.
 626
 627     For those of you who can't wait, and want these utilities as they
 628 stand (using the fallible six-bit method), they are available via
 629 anonymous FTP from Columbia University (watsun) as
 630 /w/kermit/d/k12dec.pal and /w/kermit/d/k12enc.pal for the DECODer and
 631 ENCODer respectively.  More information is available in
 632 /w/kermit/d/k12mit.doc or /w/kermit/d/k12mit.pal regarding use of
 633 PAL8 Version B0, other assemblers (such as PAL10 or P?S PAL) or other
 634 KERMIT-12 issues, etc.
 635
 636 Charles Lasner (lasner@cunixf)
 637 cjl
 638
 639 ------------------------------
 640
 641 Date: Fri, 4 May 90 13:55:02 EDT
 642 From: Charles Lasner <lasner@watsun.cc.columbia.edu>
 643 Subject: Kermit-12 problems
 644
 645     The release files of KERMIT-12 can be brought to DECmate MS-DOS
 646 via any of several paths (such as from a Rainbow in either CP/M RX50
 647 or MS-DOS RX50 format; in this particular case the reporting user
 648 obtained them on IBM-PC SSDD 180k 5-1/4" PC-DOS diskettes).  The
 649 files are then available as DECmate II MS-DOS or CP/M-80 files on
 650 one of its standard devices (a:,b:,c:,d: floppies or e:,f:,g:,h:
 651 hard disk volumes).
 652
 653     The ultimate goal is to get these files (un-scathed!) to DECmate
 654 II OS/278 for KERMIT-12 installation.  The standard DEC CONVERT
 655 program allegedly can convert any combination of MS-DOS or CP/M-80
 656 or WPS/8 from/to each other.  By converting the files to WPS/8
 657 documents, the files can be translated to OS/278 later (using the
 658 OS/278 WPFLOP utility).
 659
 660     There is a problem with DEC's CONVERT.EXE: it only CORRECTLY
 661 supports Rainbow/DECmate RX50 MS-DOS and CP/M diskettes, so the other
 662 formats (8" CP/M-80 diskettes and one-sided PC diskettes) have to be
 663 pre-converted with the appropriate copy commands to a supported
 664 diskette or hard disk volume first before using CONVERT.  This is not
 665 a big problem, as we are merely using standard procedures, but the
 666 point is that much of this is undocumented or obscure.  (I had to
 667 help the reporting user to copy his files to a "friendlier" device
 668 for CONVERT's benefit, which only delayed our discovery of the REAL
 669 problem!)
 670
 671     The CONVERT program allegedly supports ASCII/WPS format
 672 conversion from/to any of MS-DOS, CP/M-80, or WPS/8 (but only on
 673 a:,b:,c:,d:,e:,f:,g:,h: logical drives, not on the other hardware or
 674 media possibly hooked up to the DECmate!).  Our purpose is to move
 675 the K12MIT files to WPS/8 format.  This can be attempted with
 676 standard commands of CONVERT, but there apparently is a bug:
 677
 678     When you boot to OS/278 and retrieve the WPS/8 documents (via
 679 WPFLOP) which are the ENCODed files of KERMIT-12 as OS/278 files,
 680 there is a character anomaly between two encoding characters
 681 (specifically @ and `) that destroys the integrity of the affected
 682 file.  This is possibly due to a bug in OS/278 WPFLOP, but more
 683 likely is a problem with MS-DOS CONVERT.  Regardless of the
 684 perpetrator, this path is not viable to obtain the ENCODed files of
 685 the KERMIT-12 release.
 686
 687     Fortunately, the source files are not affected, as the anomalous
 688 characters are not part of the PDP-8 assembly language, and only
 689 comments could be affected. (As far as I can tell, there aren't any
 690 affected characters even in the comments!) It is therefore necessary
 691 to assemble KERMIT-12 directly from the sources when installing it on
 692 the DECmate II if obtaining it via any path which includes
 693 CONVERT/WPFLOP.  The other ENCODed files are for PAL8 Version B0 and
 694 CREF Version B0 which are already present on the DECmate II as part
 695 of the standard release of OS/278 for the DECmate II and are thus
 696 superfluous.  All ENCODed files can be recreated from OS/278 itself
 697 using the sources, etc., so the intended release files can be
 698 recreated for distribution to other OS/278 sites (bypassing the
 699 CONVERT/WPFLOP path).  Future versions of the DECODE program will
 700 obviate this problem once they properly support an alternative
 701 format that is immune to DEC's glitch.
 702
 703     A related problem surrounds the GLOBAL TECO macro K12GLB.TEC (aka
 704 GLOBAL.TEC).  Due to the "delicate" nature of TECO macros, they could
 705 get "mangled" by the time they get to a user site.  Future releases
 706 of KERMIT-12 will "protect" the macro by ENCODing it into K12GLB.ENC.
 707 It has also been reported that there are problems running the macro
 708 on certain releases of OS/8 family TECO and on other TECOs for other
 709 machines, and also problems running certain versions of OS/8 TECO on
 710 the DECmates.  The author will investigate this problem eventually,
 711 but the main usage of the macro is for KERMIT-12 source maintenance
 712 on an OS/8 V3D system using the corresponding version of TECO; it is
 713 beyond the scope of KERMIT-12 development to investigate the myriad
 714 releases of TECO and their hardware and operating system
 715 dependencies; perhaps some TECO hackers can assist us!
 716
 717     An obscure problem indeed!  Users give good feedback...
 718
 719     Can you suggest a fix for the CONVERT/WPFLOP-induced corruption?
 720 One possibility is to allow the current format as a subset, but use a
 721 substitution character for the garbled character.  Our character set
 722 is the 64 characters from ! through `, so the anomalous occurrences
 723 of @ are problematic.  If we change the preferred character for ` to
 724 a lower-case letter (only octal 141 up is available, so let's assume
 725 the use of a) we avoid the CONVERT/WPFLOP problem.  Newer released
 726 ENCODed files would then be immune to the treachery, but would
 727 require the newer DECODing program (or use TECO to change all
 728 occurrences of a to ` and then use the old DECODE program).
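
    The substitution idea amounts to a one-character translation on
the data lines.  The Python sketch below is only an illustration (the
real tools are PAL8 programs, and TECO would do the same job on an
OS/8 system); the function names are invented, and it assumes the
translation is applied only to encoded data lines, not to any framing
or command lines that might legitimately contain a lower-case a.

```python
OLD_CHAR = "`"   # octal 140, the character mangled by CONVERT/WPFLOP
NEW_CHAR = "a"   # octal 141, the assumed replacement

def to_new_format(line):
    """Rewrite an old-format data line so it avoids the ` character."""
    return line.replace(OLD_CHAR, NEW_CHAR)

def to_old_format(line):
    """Rewrite a new-format data line for use with the old DECODE.

    Old-format data never contains 'a' (the set stops at `), so this
    is also safe to apply to lines that are already in old format."""
    return line.replace(NEW_CHAR, OLD_CHAR)

sample = "!#`@`Z"                 # made-up data line, old format
print(to_new_format(sample))
print(to_old_format(to_new_format(sample)) == sample)
```

    Because to_old_format leaves old-format lines unchanged, a newer
decoder could apply it unconditionally and accept both formats.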
 729
 730     Should we abandon this inner format altogether?  We could use an
 731 even more robust format like ASCII hex: 0-9 and A-F (allowing a-f as
 732 well!) at the expense of longer files (currently 2 characters=12
 733 bits, but would become 3 characters=12 bits).  This would also hold
 734 up better through EBCDIC network conversion...
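
    For comparison, the ASCII hex alternative is simple enough to
sketch in a few lines of Python.  This is an illustration only; the
function names and the digit order (most significant first) are
assumptions, and line framing is omitted.

```python
def encode_hex(words):
    """Encode each 12-bit word as exactly three upper-case hex digits."""
    return "".join("%03X" % (w & 0o7777) for w in words)

def decode_hex(text):
    """Decode three hex digits per word; a-f are accepted as well as A-F."""
    if len(text) % 3 != 0:
        raise ValueError("hex text must be a multiple of 3 characters")
    return [int(text[i:i + 3], 16) for i in range(0, len(text), 3)]

words = [0o0000, 0o7777, 0o1234]
text = encode_hex(words)
print(text)
print(decode_hex(text.lower()) == words)
```

    A 128-word page becomes 384 characters this way instead of 256,
so the files grow by half again before any compression.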
 735
 736 cjl