UUCP, protocol, FAQ: UUCP `j' Protocol

13. UUCP `j' Protocol

The `j' protocol is a variant of the `i' protocol. It was also written by Ian Lance Taylor, and first appeared in Taylor UUCP version 1.04.

The `j' protocol is a version of the `i' protocol designed for communication links which intercept a few characters, such as XON or XOFF. It is not efficient to use it on a link which intercepts many characters, such as a seven bit link. The `j' protocol performs no error correction or detection; that is presumed to be the responsibility of the `i' protocol.

When the `j' protocol starts up, each system sends a printable ASCII string indicating which characters it wants to avoid using. The string begins with the ASCII character `^' (octal 136) and ends with the ASCII character `~' (octal 176). After sending this string, each system looks for the corresponding string from the remote system. The strings are composed of escape sequences: `\ooo', where `o' is an octal digit. For example, sending the string `^\021\023~' means that the ASCII XON and XOFF characters should be avoided. The union of the characters described in both strings (the string which is sent and the string which is received) is the set of characters which must be avoided in this conversation. Avoiding a printable ASCII character (octal 040 to octal 176, inclusive) is not permitted.
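The start-up exchange can be sketched in Python as follows. This is only an illustration, not the Taylor UUCP code: the function names are hypothetical, the sketch assumes each escape is exactly three octal digits, and the remote string is made up for the example.

```python
import re

def build_avoid_string(chars):
    """Return the start-up string naming the byte values to avoid."""
    for c in chars:
        # Printable ASCII (octal 040 to 176) may not be avoided.
        if 0o040 <= c <= 0o176:
            raise ValueError("cannot avoid printable character %03o" % c)
    return "^" + "".join("\\%03o" % c for c in sorted(set(chars))) + "~"

def parse_avoid_string(s):
    """Return the set of byte values named in a start-up string."""
    if not (s.startswith("^") and s.endswith("~")):
        raise ValueError("avoid string must start with ^ and end with ~")
    body = s[1:-1]
    # Assumption: every escape is a backslash plus three octal digits.
    escapes = re.findall(r"\\([0-7]{3})", body)
    if "".join("\\" + e for e in escapes) != body:
        raise ValueError("malformed escape sequence")
    return {int(e, 8) for e in escapes}

local = build_avoid_string([0o21, 0o23])   # XON and XOFF
remote = "^\\000~"                          # hypothetical remote string
# The conversation avoids the union of both sets.
avoid = parse_avoid_string(local) | parse_avoid_string(remote)
```

Here `avoid` holds the byte values 0, 021 and 023 (octal), the union of what each side asked for.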

After the exchange of characters to avoid, the normal `i' protocol start up is done, and the rest of the conversation uses the normal `i' protocol. However, each `i' protocol packet is wrapped to become a `j' protocol packet.

Each `j' protocol packet consists of a seven byte header, followed by data bytes, followed by index bytes, followed by a one byte trailer. The packet header looks like this:

`^'
     Every packet begins with the ASCII character `^', octal 136.

HIGH LOW
     These two characters give the total number of bytes in the packet. Both HIGH and LOW are printable ASCII characters. The length of the packet is `(HIGH - 040) * 0100 + (LOW - 040)', where `040 <= HIGH < 0177' and `040 <= LOW < 0140'. This permits a length of 6079 bytes, but there is a further restriction on packet size described below.

`='
     The ASCII character `=', octal 075.

DATA-HIGH DATA-LOW
     These two characters give the total number of data bytes in the packet. The encoding is as described for HIGH and LOW. The number of data bytes is the size of the `i' protocol packet wrapped inside this `j' protocol packet.

`@'
     The ASCII character `@', octal 100.
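The two-character length encoding used for HIGH/LOW and DATA-HIGH/DATA-LOW can be sketched as below. The function names are hypothetical and the example lengths are arbitrary; only the arithmetic and the ranges come from the description above.

```python
def encode_length(n):
    """Encode n as the two printable bytes HIGH and LOW."""
    high, low = divmod(n, 0o100)
    # HIGH must stay below 0177, which caps n at 6079.
    if n < 0 or high + 0o40 >= 0o177:
        raise ValueError("length out of range")
    return bytes([high + 0o40, low + 0o40])

def decode_length(pair):
    """Decode a HIGH, LOW byte pair back into a length."""
    high, low = pair[0], pair[1]
    if not (0o40 <= high < 0o177 and 0o40 <= low < 0o140):
        raise ValueError("not a valid length encoding")
    return (high - 0o40) * 0o100 + (low - 0o40)

def build_header(total_len, data_len):
    """Assemble the seven byte header: ^ HIGH LOW = DATA-HIGH DATA-LOW @."""
    return b"^" + encode_length(total_len) + b"=" + encode_length(data_len) + b"@"

header = build_header(120, 100)  # arbitrary example lengths
```

Both fields use the same encoding, so a single pair of helpers covers the whole header.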

The header is followed by the number of data bytes given in DATA-HIGH and DATA-LOW. These data bytes are the `i' protocol packet which is being wrapped in the `j' protocol packet. However, each character in the `i' protocol packet which the `j' protocol must avoid is transformed into a printable ASCII character (recall that avoiding a printable ASCII character is not permitted). Two index bytes are used for each character which must be transformed.

The index bytes immediately follow the data bytes. The index bytes are created in pairs. Each pair of index bytes encodes the location of a character in the `i' protocol packet which was transformed to become a printable ASCII character. Each pair of index bytes also encodes the precise transformation which was performed.

When the sender finds a character which must be avoided, it will transform it using one or two operations. If the character is 0200 or greater, it will subtract 0200. If the resulting character is less than 020, or is equal to 0177, it will xor by 020. The result is a printable ASCII character.

The zero based byte index of the character within the `i' protocol packet is determined. This index is turned into a two byte printable ASCII index, INDEX-HIGH and INDEX-LOW, such that the index is `(INDEX-HIGH - 040) * 040 + (INDEX-LOW - 040)'. INDEX-LOW is restricted such that `040 <= INDEX-LOW < 0100'. INDEX-HIGH is not permitted to be 0176, so `040 <= INDEX-HIGH < 0176'. INDEX-LOW is then modified to encode the transformation:

* If the character transformation only had to subtract 0200, then INDEX-LOW is used as is.

* If the character transformation only had to xor by 020, then 040 is added to INDEX-LOW.

* If both operations had to be performed, then 0100 is added to INDEX-LOW. However, if the value of INDEX-LOW was initially 077, then adding 0100 would result in 0177, which is not a printable ASCII character. For that special case, INDEX-HIGH is set to 0176, and INDEX-LOW is set to the original value of INDEX-HIGH.
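The sender's index encoding, including the special case, might look like this in Python. It is a sketch under stated assumptions: `encode_index` and its two boolean parameters are hypothetical names, and the caller is assumed to know which of the two operations it applied to the byte.

```python
def encode_index(index, subtracted_0200, xored):
    """Return the INDEX-HIGH, INDEX-LOW pair for one transformed byte."""
    if not (subtracted_0200 or xored):
        raise ValueError("an index pair must record at least one operation")
    high, low = divmod(index, 0o40)
    high += 0o40
    low += 0o40
    # INDEX-HIGH may not reach 0176; that value marks the special case.
    if index < 0 or high >= 0o176:
        raise ValueError("index out of range")
    if subtracted_0200 and xored:
        if low == 0o77:
            # 077 + 0100 would be 0177, which is not printable.
            high, low = 0o176, high
        else:
            low += 0o100
    elif xored:
        low += 0o40
    # If only 0200 was subtracted, INDEX-LOW is used as is.
    return bytes([high, low])

pair = encode_index(31, True, True)  # exercises the special case
```

With index 31 and both operations, INDEX-LOW starts as 077, so the pair becomes 0176 followed by the original INDEX-HIGH (040).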

The receiver decodes the index bytes as follows (this is the reverse of the operations performed by the sender, presented here for additional clarity):

* The first byte in the index is INDEX-HIGH, and the second is INDEX-LOW.

* If `040 <= INDEX-HIGH < 0176', the index refers to the data byte at position `(INDEX-HIGH - 040) * 040 + INDEX-LOW % 040'. The range of INDEX-LOW then selects the transformation to undo:

  * If `040 <= INDEX-LOW < 0100', then 0200 must be added to the indexed byte.

  * If `0100 <= INDEX-LOW < 0140', then the indexed byte must be xor'ed with 020.

  * If `0140 <= INDEX-LOW < 0177', then 0200 must be added to the indexed byte, and the indexed byte must be xor'ed with 020.

* If `INDEX-HIGH == 0176', the index refers to the data byte at position `(INDEX-LOW - 040) * 040 + 037'. 0200 must be added to the indexed byte, and 020 must be xor'ed to the indexed byte.
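The receiver's rules can be sketched as a small Python function that reports the byte position and which operations to undo, leaving the actual byte arithmetic to the caller. The name `decode_index` and the returned tuple layout are hypothetical; the range check on INDEX-LOW in the special case is an assumption derived from the sender's limit on INDEX-HIGH.

```python
def decode_index(pair):
    """Return (position, add_0200, xor) for one pair of index bytes."""
    high, low = pair[0], pair[1]
    if high == 0o176:
        # Special case: INDEX-LOW carries the sender's original INDEX-HIGH,
        # so it is assumed to obey the same limit.
        if not 0o40 <= low < 0o176:
            raise ValueError("invalid index pair")
        return (low - 0o40) * 0o40 + 0o37, True, True
    if not (0o40 <= high < 0o176 and 0o40 <= low < 0o177):
        raise ValueError("invalid index pair")
    position = (high - 0o40) * 0o40 + low % 0o40
    # 040..077: add 0200 only; 0100..0137: xor only; 0140..0176: both.
    add_0200 = low < 0o100 or low >= 0o140
    xor = low >= 0o100
    return position, add_0200, xor

position, add_0200, xor = decode_index(bytes([0o176, 0o40]))
```

The example pair is the special-case encoding, so it decodes to position 31 with both operations required.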

This means the largest `i' protocol packet which may be wrapped inside a `j' protocol packet is `(0175 - 040) * 040 + (077 - 040) == 3007' bytes.
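The two limits quoted in this section can be checked directly from the stated ranges. This is plain arithmetic with hypothetical variable names, using Python's octal literals.

```python
# Largest value a pair of index bytes can express in the ordinary case:
# INDEX-HIGH at most 0175, INDEX-LOW at most 077.
max_index = (0o175 - 0o40) * 0o40 + (0o77 - 0o40)

# Largest value the HIGH, LOW length field can express:
# HIGH at most 0176, LOW at most 0137.
max_length_field = (0o176 - 0o40) * 0o100 + (0o137 - 0o40)

print(max_index, max_length_field)
```

The index limit is the tighter of the two, which is why it, rather than the length field, bounds the size of the wrapped `i' protocol packet.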

The final character in a `j' protocol packet, following the index bytes, is the ASCII character `~' (octal 176).

The motivation behind using an indexing scheme, rather than escape characters, is to avoid data movement. The sender may simply add a header and a trailer to the `i' protocol packet. Once the receiver has loaded the `j' protocol packet, it may scan the index bytes, transforming the data bytes, and then pass the data bytes directly on to the `i' protocol routine.
