Data compression on the HP48 (DAV: todo: implement pucrunch). From the Q&A48.48 file, archived as "Old HP48 FAQs" on hpcalc.org.

  From Jim Donnelly (jimd@cv.hp.com):

        A simple length-encoding technique can be put to use for a
        free-format, very compact multi-field data storage system.
        Two tiny programs, SUBNUM and STRCON, help the process and
        are listed near the end of this note.  At the end of the
        note is a directory containing the examples that may be
        downloaded into the HP 48.

        The principle is to store starting indices in the beginning
        of a string that point to fields stored subsequently in the
        string.  The indices are stored in field order, with one
        additional index at the end that marks where the last field
        ends.  There are several small points worth mentioning:

        1) Fields may be 0-length using this technique.
        2) The execution time is uniform across all fields.
        3) This technique saves about 4 bytes per field after
           the first field, because the string prolog and length
           are omitted for fields 2 -> n.


        EXAMPLE:
        --------

                         Indices  |          Fields
          Character               |     1 11111111 12222222222
          Position :   1  2  3  4 |567890 12345678 90123456789
                      +--+--+--+--+------+--------+-----------+
          String :    | 5|11|19|30|Field1| Field2 |  Field 3  |
                      +--+--+--+--+------+--------+-----------+

        This is a string that contains 3 fields, and therefore 4
        index entries.  The first field begins at character 5, the
        second field begins at character 11, and the third field
        begins at character 19.  To keep the pattern consistent,
        the fourth index is 30, which is one more than the length
        of the 29-character data string.

        To extract the second field, place the string on the stack,
        use SUBNUM on character 2 to extract the starting position,
        use SUBNUM on character 3 to extract the (ending position +1),
        subtract 1 from the (ending position+1), then do a SUB to
        get the field data.  NOTE: The index for field 1 is stored
        as character code 5, NOT "5"!  To place the field index for
        field 1 in the string, you would execute "data" 1 5 CHR REPL.


        PROGRAM:
        --------

        The following program accepts an encoded data string in
        level 2 and a field number in level 1:

        DECODE   "data"  field#  -->  "field"

        <<  --> f
          <<
            DUP f SUBNUM                ; "data" start -->
            OVER f 1 + SUBNUM           ; "data" start end+1 -->
            1 -                         ; "data" start end -->
            SUB                         ; "field" -->
          >>
        >>
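
        The lookup that DECODE performs can be modelled off the
        calculator.  The sketch below is Python, not HP 48 code; the
        names subnum, decode and example are chosen here for
        illustration, and Python's 0-based slices are adjusted to
        match the 1-based, inclusive positions that SUB uses.

```python
def subnum(s, pos):
    """Model of SUBNUM: character code at 1-based position pos."""
    return ord(s[pos - 1])


def decode(data, f):
    """Model of DECODE: return field f (1-based) of an encoded string."""
    start = subnum(data, f)        # position of the first character of field f
    end = subnum(data, f + 1) - 1  # one before the start of the next field
    return data[start - 1:end]     # same span as "data" start end SUB


# The three-field string from the EXAMPLE section: indices 5, 11, 19, 30
# stored as character codes, followed by the field text.
fields = ["Field1", " Field2 ", "  Field 3  "]
example = "".join(chr(c) for c in (5, 11, 19, 30)) + "".join(fields)

print(repr(decode(example, 2)))
```

        A zero-length field needs no special case: its index simply
        equals the next index, so the slice comes back empty.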


        DATA ENCODING
        -------------

        The following program expects a series of 'n' strings on
        the stack, followed by the count n, and encodes them into
        a data string suitable for reading by DECODE above.

        The programs SUBNUM and STRCON are used to assemble the
        indices.

        ENCODE      field n  ...  field 1   n   -->  "data"

        << DUP 2 + DUP 1 - STRCON --> n  data
          <<
            1 n
            FOR i
              data i SUBNUM OVER SIZE   ; ... field index fieldsize
              + data SWAP               ; ... field "data" index'
              i 1 + SWAP CHR REPL       ; ... field "data"'
              SWAP + 'data' STO         ; ...
            NEXT
            data                        ; "data"
          >>
        >>

        In this example, four strings are encoded:

        Input:  5: "String"
                4: "Str"
                3: "STR"
                2: "STRING"
                1:         4

        Output: "xxxxxSTRINGSTRStrString"      (23 character string)
        (The first five characters have codes 6, 12, 15, 18, and 24)
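
        The index arithmetic can be checked away from the calculator
        with a small model.  This is a Python sketch, not HP 48 code;
        the function name encode is chosen here for illustration, and
        the fields are passed with field 1 first (on the stack, field
        1 is the string closest to the count).

```python
def encode(fields):
    """Model of ENCODE: build the index characters, then append the fields."""
    n = len(fields)
    # Like STRCON in ENCODE: n+1 index entries, all preset to n+2,
    # which is where field 1 must start.
    index = [n + 2] * (n + 1)
    body = ""
    for i, field in enumerate(fields, start=1):
        # The next index is the current field's start plus its length.
        index[i] = index[i - 1] + len(field)
        body += field
    if index[-1] > 255:
        raise ValueError("final index does not fit in one character code")
    return "".join(chr(c) for c in index) + body


data = encode(["STRING", "STR", "Str", "String"])
print([ord(c) for c in data[:5]], data[5:])
```

        Run on the four strings above, the model gives the same five
        index codes (6, 12, 15, 18, 24) and the same 23-character
        result.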



        VARIATION:
        ----------

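        The technique above has a practical limit: the final index
        is one more than the string length and must fit in a single
        character code (at most 255), so the whole string, indices
        included, can hold at most 254 characters.  To overcome
        this, just allocate two characters for each field position.
        The code to extract the starting index for a field becomes
        a little more busy.  In this case, each index is stored as
        two characters: the first holds the index divided by 16 and
        the second holds the remainder, so the index is
        16 * first + second.  The diagram shows the indices in hex.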

                      Indices  |          Fields
       Character               | 11111 11111222 22222223333
       Position :   12 34 56 78|901234 56789012 34567890123
                   +--+--+--+--+------+--------+-----------+
       String :    |09|0F|17|22|Field1| Field2 |  Field 3  |
                   +--+--+--+--+------+--------+-----------+

           <<  --> f
             <<
                DUP f 2 * 1 -           ; "data" "data" indx1 -->
                SUBNUM 16 *             ; "data" 16*start_left_byte  -->
                OVER f 2 * SUBNUM +     ; "data" start
                OVER f 2 * 1 + SUBNUM   ; "data" start end_left_byte -->
                16 * 3 PICK f 1 + 2 *
                SUBNUM + 1 -            ; "data" start end -->
                SUB                     ; "field"  -->
             >>
           >>
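
        The two-character lookup can be modelled the same way.  This
        is a Python sketch, not HP 48 code.  The names read_index,
        write_index and decode2 are chosen here for illustration, and
        write_index is an assumption: the note gives no encoder for
        this variation, so the sketch simply inverts the
        16 * left + right arithmetic used by the program above.

```python
def subnum(s, pos):
    """Model of SUBNUM: character code at 1-based position pos."""
    return ord(s[pos - 1])


def read_index(data, k):
    """Index number k, stored in characters 2k-1 (left) and 2k (right)."""
    return 16 * subnum(data, 2 * k - 1) + subnum(data, 2 * k)


def write_index(value):
    """Assumed inverse of read_index: left = value // 16, right = value % 16."""
    left, right = divmod(value, 16)
    if left > 255:
        raise ValueError("index does not fit in two characters")
    return chr(left) + chr(right)


def decode2(data, f):
    """Model of the two-character DECODE: return field f (1-based)."""
    start = read_index(data, f)
    end = read_index(data, f + 1) - 1
    return data[start - 1:end]


# Three fields behind four two-character indices (8 index characters),
# so field 1 starts at position 9.
fields = ["Field1", " Field2 ", "  Field 3  "]
starts = [9]
for field in fields:
    starts.append(starts[-1] + len(field))
wide = "".join(write_index(s) for s in starts) + "".join(fields)

print(starts, repr(decode2(wide, 3)))
```

        If the left character is allowed its full 0 to 255 range,
        this arithmetic reaches indices up to 16 * 255 + 15 = 4095.
        That reading is an assumption; the note itself only shows
        indices small enough to write as two hex digits.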



        TWO VERY TINY HELPFUL PROGRAMS
        ------------------------------

        SUBNUM          "string"  position  -->  code

        << DUP SUB NUM >>



        STRCON          code  count  -->  "repeated string"

        << -->  code count
          << "" code CHR 'code' STO
             1 count START code + NEXT
          >>
        >>


        A DIRECTORY YOU CAN DOWNLOAD
        ----------------------------

        This is a directory object.  Cut after the === to the end of
        the file and download to your HP 48 using the ASCII transfer.

========================================================================
%%HP: T(3)A(D)F(.);
DIR
  DECODE
    \<< \-> f
      \<< DUP f
SUBNUM OVER f 1 +
SUBNUM 1 - SUB
      \>>
    \>>
  ENCODE
    \<< DUP 2 + DUP 1
- STRCON \-> n data
      \<< 1 n
        FOR i data
i SUBNUM OVER SIZE
+ data SWAP i 1 +
SWAP CHR REPL SWAP
+ 'data' STO
        NEXT data
      \>>
    \>>
  STRCON
    \<< \-> code count
      \<< "" code CHR
'code' STO 1 count
        START code
+
        NEXT
      \>>
    \>>
  SUBNUM
    \<< DUP SUB NUM
    \>>
END