Thursday 27 April 2017

A method for encoding Egyptian quadrats in Unicode

A new document 'A method for encoding Egyptian quadrats in Unicode' is now available from the UTC document register as L2/17-122 [pdf]. The system described takes into account discussions held last July at the Informatique et Égyptologie meeting in Cambridge, and afterwards, about extending Unicode plain text support to handle vertical hieroglyphic text and various complex forms of quadrat structure.

The place for questions, discussion and suggestions is the Egyptian Hieroglyphs in the UCS mailing list (see Informatique et Égyptologie, Cambridge, 2016).

L2/17-122 contains a feasibility report based on three prototype OpenType font developments (Glass, Nederhof, and Richmond), which I hope goes a long way towards alleviating concerns raised last year by Egyptologists about the viability of flexible hieroglyphic font implementations in Unicode.

L2/17-122 identifies 9 controls as follows:

Basic quadrat structures

EGYPTIAN HIEROGLYPH VERTICAL JOINER
EGYPTIAN HIEROGLYPH HORIZONTAL JOINER

These two controls were proposed in L2/16-018 (January 2016). They are similar to the original Manuel de Codage (MdC85) ':' and '*' controls.

EGYPTIAN HIEROGLYPH SEGMENT START
EGYPTIAN HIEROGLYPH SEGMENT END

These two controls operate in a similar way to MdC85 brackets '(' and ')'.

L2/17-122 does not contain structure extensions such as the group joiners suggested in L2/16-214 [pdf] to simplify encoding of quadrats in vertical text and tall quadrats in horizontal text. Therefore for most applications the basic quadrat structures of L2/17-122 are encoded as exact equivalents to those of MdC85 (itself derived from the Buurman 1976 model).
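
As an informal illustration of this equivalence, the following Python sketch rewrites a simple MdC85 quadrat string as a sequence of sign identifiers and control names. It is only a sketch under stated assumptions: the function name and the table are hypothetical, the pairing of ':' with the vertical joiner and '*' with the horizontal joiner follows the usual MdC reading, and the controls are written as names because no code points are assumed here.

```python
# Hypothetical mapping from MdC85 structure characters to the control
# names listed above. Names only: no code points are assumed.
MDC_TO_CONTROL = {
    ":": "EGYPTIAN HIEROGLYPH VERTICAL JOINER",
    "*": "EGYPTIAN HIEROGLYPH HORIZONTAL JOINER",
    "(": "EGYPTIAN HIEROGLYPH SEGMENT START",
    ")": "EGYPTIAN HIEROGLYPH SEGMENT END",
}


def mdc_quadrat_to_sequence(mdc):
    """Split an MdC85 quadrat string into sign identifiers and control names.

    Any run of characters that is not a structure character is treated
    as one sign identifier (for example a Gardiner code such as 'A1').
    """
    sequence = []
    sign = ""
    for ch in mdc:
        if ch in MDC_TO_CONTROL:
            # A structure character ends the sign identifier being read.
            if sign:
                sequence.append(sign)
                sign = ""
            sequence.append(MDC_TO_CONTROL[ch])
        else:
            sign += ch
    if sign:
        sequence.append(sign)
    return sequence


if __name__ == "__main__":
    # One sign above a horizontal pair of signs.
    for item in mdc_quadrat_to_sequence("A1:(B1*C1)"):
        print(item)
```

The sketch covers only the four basic structure controls. It does not validate bracket nesting and does not attempt the MdC ligature or overlay notations.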

However, there are subtle differences from MdC85, most importantly (i) L2/17-122 has more clearly defined control behaviour and (ii) quadrat appearance is determined by a font (or equivalent) so there is more flexibility in handling issues such as hieroglyph sizing, kerning, etc. in plain text implementations.

Hieroglyph combinations

EGYPTIAN HIEROGLYPH STACK MIDDLE

This control overlays one hieroglyph on top of another - a direct equivalent of the MdC88 '#' control (encoded as '##' in JSesh).

EGYPTIAN HIEROGLYPH INSERT TOP START
EGYPTIAN HIEROGLYPH INSERT BOTTOM START
EGYPTIAN HIEROGLYPH INSERT TOP END
EGYPTIAN HIEROGLYPH INSERT BOTTOM END

These four geometrical ligature controls are proposed in place of the L2/16-018 EGYPTIAN HIEROGLYPH LIGATURE JOINER (which was based on an abstract ligature model for non-grid quadrat elements). The set originates from a consensus formed at the I&E 2016 meeting that four 'corner control' ligatures are sufficient to meet the anticipated plain text ligature needs of corpus projects such as Ramses and TLA, and that the Egyptologists present preferred geometrical to abstract ligatures. This is a new approach to ligatures, although the controls link fairly well to usage of the original MdC '&' ligature and the MdC extensions familiar to JSesh users.

Bob Richmond
