Studies in Computational Linguistics No. 1, The Recognition of Alphabets
Formal parsing rules for programming languages often include machinery for recognizing identifiers, numerical constants, and other substrings whose internal structure is only marginally relevant to the structure of the language as a whole. This note introduces alphabets in which identifiers, constants, etc. are regarded as single symbols. An alphabet is thus constructed out of a finite set of characters and is identified as a regular language. A simple recognition algorithm is described, giving the language designer considerable latitude in his choice of alphabet.
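
The idea of treating identifiers and constants as single symbols of a regular alphabet can be illustrated with a minimal sketch. It is not the recognition algorithm of the note, which the abstract does not spell out; it simply uses Python's `re` module. The symbol classes (`IDENTIFIER`, `CONSTANT`, `OPERATOR`), their patterns, and the function name `recognize` are assumptions made for illustration only.

```python
import re

# Hypothetical alphabet: each entry names one class of symbols and gives a
# regular expression over characters. These classes are assumptions for
# illustration, not the alphabet defined in the note.
SYMBOL_CLASSES = [
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9]*"),
    ("CONSTANT", r"[0-9]+"),
    ("OPERATOR", r"[-+*/=()]"),
]

# One combined pattern; the named group that matched tells us the class.
SYMBOL_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in SYMBOL_CLASSES)
)


def recognize(text):
    """Split text into (class, symbol) pairs, skipping blanks between symbols."""
    symbols = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = SYMBOL_RE.match(text, pos)
        if match is None:
            raise ValueError(
                f"character {text[pos]!r} at position {pos} starts no symbol"
            )
        symbols.append((match.lastgroup, match.group()))
        pos = match.end()
    return symbols


if __name__ == "__main__":
    for symbol_class, symbol in recognize("x1 = y + 42"):
        print(symbol_class, symbol)
```

A parser built on top of such a recognizer would see `x1` and `42` each as one symbol, so its grammar need not describe how identifiers or constants are spelled. Changing the alphabet then amounts to editing the table of symbol classes.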