Studies in Computational Linguistics No. 1, The Recognition of Alphabets

Edward F. Storm, Syracuse University

SU-CIS-70-01

Description/Abstract

Formal parsing rules for programming languages often include machinery for recognizing identifiers, numerical constants, and other substrings whose internal structure is only marginally relevant to the structure of the language as a whole. This note introduces alphabets in which identifiers, constants, and similar substrings are regarded as single symbols. An alphabet is thus constructed from a finite set of characters and is identified as a regular language. A simple recognition algorithm is described that gives the language designer considerable latitude in his choice of alphabet.
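
As an illustration of the idea only, and not the algorithm described in the report, the following minimal sketch treats identifiers and numerical constants as single symbols of an alphabet defined by regular expressions over characters. The symbol classes (IDENT, NUMBER, OP), their patterns, and the function name recognize are all assumptions chosen for the example.

```python
import re

# Hypothetical symbol classes: each one is a regular language over characters.
# These patterns are illustrative assumptions, not taken from the report.
SYMBOL_CLASSES = [
    ("IDENT", r"[A-Za-z][A-Za-z0-9]*"),
    ("NUMBER", r"[0-9]+"),
    ("OP", r"[-+*/=()]"),
]

# One combined pattern; each alternative is a named group.
_MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in SYMBOL_CLASSES)
)


def recognize(text):
    """Split text into (class, lexeme) pairs, skipping whitespace."""
    symbols = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = _MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"no symbol starts at position {pos}: {text[pos]!r}")
        # lastgroup names the alternative that matched.
        symbols.append((match.lastgroup, match.group()))
        pos = match.end()
    return symbols
```

With this sketch, a later parsing stage would see the string "x1 = 42 + y" as five symbols rather than eleven characters, and changing the alphabet means editing only the table of symbol classes.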