
Script information


11.2 Script information

Each chapter covering an Indic script has the following proposed structure:

Background Information

General information about the script that would be of interest to developers. This would include brief descriptions of the following:

  • The history of the script and its evolution.

  • The family that this script belongs to.

  • The salient typographical characteristics of this script.

  • The writing direction.

  • Geographical distribution of usage.

  • The languages that are expressed using this script.

  • Noteworthy variants of this script.

Character Repertoire

The list of basic ``characters'' of the script.[1]

We would list the following pieces of information in this section:

  • The minimum set of graphemes needed for everyday use of the script.

  • Historical and rarely used graphemes.

  • The complete list of graphemes, if such a list exists.

  • The classification of graphemes into alphabetics, numerics, punctuation and other symbols as appropriate.

  • Character set standards that encode this script.

  • Font standards that encode this script.
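
As a rough illustration of the classification step listed above, the sketch below groups the assigned characters of one Unicode block by general category, using Python's standard `unicodedata` module. The choice of the Devanagari block (U+0900 to U+097F), the helper names `classify` and `repertoire`, and the decision to count combining marks as alphabetic are assumptions made for this example, not part of the Handbook. Unicode characters are also not the same thing as graphemes, so the result is only a starting point for a real repertoire table.

```python
import unicodedata


def classify(ch: str) -> str:
    """Map a character's Unicode general category to a coarse class."""
    major = unicodedata.category(ch)[0]
    if major in ("L", "M"):
        # Assumption: combining marks (vowel signs etc.) count as alphabetic.
        return "alphabetic"
    if major == "N":
        return "numeric"
    if major == "P":
        return "punctuation"
    return "other"


def repertoire(first: int, last: int) -> dict[str, list[str]]:
    """Group the assigned characters in [first, last] by coarse class."""
    groups: dict[str, list[str]] = {}
    for cp in range(first, last + 1):
        ch = chr(cp)
        if unicodedata.category(ch) == "Cn":
            continue  # skip unassigned code points
        groups.setdefault(classify(ch), []).append(ch)
    return groups


# Example: the Devanagari block, chosen only for illustration.
devanagari = repertoire(0x0900, 0x097F)
for name, chars in sorted(devanagari.items()):
    print(name, len(chars))
```

The counts printed depend on the Unicode version bundled with the Python interpreter, so they are not quoted here.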

Standard character set encodings

We would then discuss any ``standard'' character set encodings that are in use for this script. For each such encoding we would cover:

  • The exact set of graphemes encoded.

  • Errors in the encoding, for example:

    • Graphemes that cannot be encoded.

    • Redundant graphemes.

    • Erroneous semantics in the standard.

  • The mapping between the graphemes of the script and the characters encoded by the character set encoding.

  • The organization that authored the encoding, the current revision of the encoding, and whether any revisions are planned.
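
The mapping between graphemes and encoded characters is often one-to-many. As a small sketch of what such a mapping table might record, the snippet below lists the Unicode code points behind a single written unit. Unicode is used here only because Python exposes it directly; the function name `code_points` and the sample conjunct are assumptions chosen for this example.

```python
import unicodedata


def code_points(grapheme: str) -> list[tuple[str, str]]:
    """Return (code point, character name) pairs for each encoded character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in grapheme
    ]


# The Devanagari conjunct "ksha": one written unit, three encoded characters.
ksha = "\u0915\u094d\u0937"
for cp, name in code_points(ksha):
    print(cp, name)
```

A per-encoding chapter could carry a table of this shape for every grapheme in the repertoire, which also makes unencodable and redundant graphemes easy to spot.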

Font encodings

We would then discuss any ``standard'' font encodings associated with this script. For each such encoding we would cover:

  • The organization that authored the encoding, the current revision of the encoding, and whether any revisions are planned.

  • The list of graphemes supported by the font encoding.

  • The process of mapping the glyphs provided by the font to the graphemes of the script.

  • Any typographical quirks that a developer needs to know about.

Keyboard layouts

A script may be supported by multiple keyboard layouts. For each layout, this section would cover:

  • A description of the layout itself.

  • Its prevalence.

  • Whether the layout has an ``owner'' or is a ``standard'', and if so, whether the layout is due for revision.

  • The completeness of the layout.

  • Metrics for the ease of use of the layout, if available.
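
One simple way to examine the completeness of a layout is to compare the graphemes it can produce against the script's required repertoire. The sketch below assumes a layout can be represented as a mapping from key (or key combination) to output string. The function `missing_graphemes` and the three-key `toy_layout` are hypothetical and do not describe any real keyboard layout.

```python
def missing_graphemes(required: set[str], layout: dict[str, str]) -> set[str]:
    """Return the required graphemes that no key in the layout produces."""
    producible = set(layout.values())
    return required - producible


# Hypothetical three-key layout, for illustration only.
toy_layout = {"k": "\u0915", "K": "\u0916", "g": "\u0917"}

# A hypothetical required set: KA, KHA, GA and GHA.
required = {"\u0915", "\u0916", "\u0917", "\u0918"}

print(missing_graphemes(required, toy_layout))
```

Here the toy layout cannot produce GHA (U+0918), so that is the one grapheme reported as missing. A real check would also have to account for graphemes typed as key sequences.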

Typographical Rules

This section would cover the basic typographical rules for the script that are not language- or region-specific. This list would include:

  • Line breaking rules.

  • Ligature formation.

  • Hyphenation rules.

  • Paragraph filling.

  • Grapheme formation from the basic elements of the script.

Region-specific and language-specific variations to the basic ruleset would then be covered.
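
To give a flavour of what a grapheme formation rule looks like in code, here is a deliberately simplified sketch that groups Devanagari text into written units: a combining mark attaches to the preceding unit, and a letter that follows a virama joins the preceding unit to form a conjunct. This is an illustrative approximation, not the Unicode text segmentation algorithm and not a rule taken from the Handbook. The function name `clusters` and the restriction to the Devanagari virama are assumptions.

```python
import unicodedata

VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA; other scripts use other code points


def clusters(text: str) -> list[str]:
    """Split text into approximate written units (simplified rule set)."""
    out: list[str] = []
    for ch in text:
        category = unicodedata.category(ch)
        is_mark = category.startswith("M")
        # A letter directly after a virama joins the previous unit (conjunct).
        joins_conjunct = (
            bool(out) and out[-1].endswith(VIRAMA) and category.startswith("L")
        )
        if out and (is_mark or joins_conjunct):
            out[-1] += ch
        else:
            out.append(ch)
    return out


# "namaste": NA, MA, then SA + VIRAMA + TA + VOWEL SIGN E as a single unit.
word = "\u0928\u092e\u0938\u094d\u0924\u0947"
print(clusters(word))
```

For this word the sketch yields three units. Region-specific and language-specific variations would show up as extra conditions in the loop or as per-language tables.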

References

References to additional information about this script.

Notes

[1]

We prefer to use the term ``grapheme'' in the Handbook.

This and other project documentation can be downloaded from http://indic-computing.sourceforge.net/documentation.html.


Copyright © 2001–2009 The Indic-Computing Project.
Contact: jkoshy