Stretching Conceptual Structures in Classifications Across Languages and Cultures

SUMMARY
The authors describe the difficulties of translating classifications from a source language and culture to another language and culture. To demonstrate these problems, kinship terms and concepts from native speakers of fourteen languages were collected and analyzed to find differences between their terms and structures and those used in English. Using the representations of kinship terms in the Library of Congress Classification (LCC) and the Dewey Decimal Classification (DDC) as examples, the authors identified the source of possible lack of mapping between the domain of kinship in the fourteen languages studied and the LCC and DDC. Finally, some preliminary suggestions for how to make translated classifications more linguistically and culturally hospitable are offered.


INTRODUCTION
Michèle Hudon points out that one of the problems traditionally associated with the construction of multilingual thesauri is that of stretching the language of the component vocabularies to make them fit a foreign conceptual structure to the point where they become barely recognizable to their own speakers. 1 In this paper we extend this problem to classification schemes in general.
Over the last few decades we have seen a move towards unification and standardization of bibliographic systems, not just in the United States, but also globally. This means that traditional classifications, originally designed in a particular country (such as the Dewey Decimal Classification), or even for a particular collection (such as the Library of Congress Classification), are now being stretched, in Hudon's words, to cover cultural and linguistic artefacts and concepts quite different from those originally intended.
As classification schemes are being expanded and translated to "go global," we are faced with many of the same problems encountered in translation in general: issues of vocabulary, syntax, and semantics. In addition to these concerns, however, when dealing with classifications it is also necessary to consider the differences in knowledge structures, that is, not only the way in which the classification scheme represents a set of terms and concepts, but also how it comprises a pattern of relationships among those concepts. These relationships reflect an overall view of how the concepts are construed by a given discourse community in a given context. Thus, in harmonizing classification schemes across languages and cultures, we must address not only the issues of the terms, but also the way in which these terms are bound up in knowledge representations.
Why is this important? First of all, we might consider the basic purpose of classification schemes, which is to provide pointers and access to a body of works, as well as to the ideas and knowledge recorded in those works. To do this effectively, a classification must reflect concepts in such a way that a searcher can make use of several strategies:
1. The first strategy is that of finding what one already knows is there. This is called a known-item search. For example: "I'm looking for a recipe for flan." I hope, therefore, that the classification incorporates a concept for flan, and by using the term flan, I will find a recipe for it.
2. The second strategy is to be able to find what one hopes or suspects is there, but which one is perhaps unable to articulate. For this, a classification is helpful by grouping similar things together so that a searcher can locate a promising "neighborhood" and explore it. So, I might look under desserts and find a recipe for flan.
For both of these strategies to work it is necessary that searchers know what the ideas and concepts are to begin with, and then how they might be grouped. Beyond these basic functions, classifications have a third very useful role in knowledge organization and retrieval, and that is to represent a field of knowledge in such a way that a great deal of information becomes evident through the classification structure itself. 2 For instance, if we learn from a classification of desserts that a flan is a type of custard, then we have gained a quick and efficient way of knowing quite a bit about flans (providing, of course, that we know what a custard is). In this way classifications are tools for learning and discovery, and not just for storage and retrieval of documents. For a classification to fulfill this particular role adequately, it must be a reflection of some sort of consensual meaning. That is, it must be reasonable and "true" for a user that a flan is a kind of custard. The problem arises, however, when we realize that we cannot take for granted that such a relationship of flan to custard is universally held, or that this will be the first or preferred way of construing the notion of flan.
Clare Beghtol 3 argues that making classifications culturally hospitable by including provisions for specific aspects of different cultures will enhance their appropriateness and utility for the purposes of worldwide information flow. For a classification designed from one perspective and for one culture to be hospitable to a different culture and language, it must take into account other possible relationships and other possible ways of identifying and labeling.
In this paper we provide one example of the differences in knowledge structures from language to language, culture to culture, and then suggest ways in which these differences can be accommodated in culturally hospitable translations. For our example we have chosen the culturally bound domain of kinship terms because notions of kinship are basic and universal (in that we all have relatives), but also unique to specific cultures (in that each culture integrates the concept of family differently). We explore the differences in kinship terms and relationships in fourteen languages and compare this to the representation of kinship terms and relationships in the Dewey Decimal Classification (DDC) 4 and the Library of Congress Classification (LCC). 5 The purpose of this inquiry is to demonstrate and describe the various kinds of problems that arise if one tries to extend, or stretch, the DDC and LCC for use in these languages, and the cultures in which they are embedded.

Our Informants
We interviewed fourteen informants (eight women and six men) of diverse language and cultural backgrounds. All but one are graduate students studying in the United States. We included four major Asian languages, two Slavic languages, and a single representative each from the following language groups: Indian, Dravidian, Negro-African, Oceanic Indonesian, Semitic, Turkic Altaic, Germanic, and Romance. See Table 1 for a summary of the informants' languages, language families, and countries of origin.

Data Collection Challenges
Our greatest challenge in designing the data-collection procedures was to be able to discover what terms and relationships are used in our informants' own languages. Since all of them are fairly proficient English speakers and familiar with U.S. culture, we did not want them to respond by anticipating our own understanding of kinship. We needed a technique that would elicit responses without overly influencing these responses to suit us, the interviewers. This was not to be an exercise in one-to-one translation. For example, we did not want to start with the English term uncle, and ask what the equivalent is in their language. Doing so would assume that there is in fact a term for uncle in their language, that the notion of uncle is more or less the same as ours, and that the term uncle extends to the same sorts of people as it does in our culture. Thus, we adapted a form of ethnographic interview suggested by Spradley,6 in which the interviewer and informant are co-researchers; that is, they explore the question together, but as much as possible from the informant's point of view. The researcher tries not to impose his or her own conceptual structures, but instead seeks to elicit terms and the meaning of the terms from the informants' narratives. It is an iterative process in which the researcher attempts to use and reuse the informant's terminology in subsequent questions and clarifies the meaning of relevant terms, again using the informant's own vocabulary. Put another way, the challenge was to avoid "putting words into our informants' mouths." We also wanted to collect information about the contextual nuances of the various terms and relationships.

The Interviews
The interviews were conducted as informal conversations, and did not follow a set format. We followed these general steps, with some minor differences from respondent to respondent:
1. To get things started and to provide a conceptual anchor, we asked the informants to imagine an important family gathering, e.g., a holiday dinner or a wedding, and to tell us who would typically be there.
2. As the informant described the family gathering, he (or she) identified terms used for various kin, both their personal name and then the generic name for that relative. For instance, he or she might mention that Aunt Theresa would be there, and then tell us the term for aunt. Each generic term was written in the informant's language on a sticky note and placed on a large sheet of newsprint. (Languages using other alphabets were transliterated.) In the center of the page the informant placed himself or herself. The various relatives were arranged around the "self" in whatever way the informant found useful. The purpose of physically laying things out was to provide the informant with a visual display, thus triggering other terms that should be included for completeness.
3. The terms generated by the initial question eventually suggested other ones, and gradually the informant filled in gaps in the structure. As prompts, we asked for other similar relationships and terms, and also for differences and distinctions.
4. As we went along, the informant sometimes drew lines between the terms to show special connections, or modified the structures as new terms came to mind.
5. Besides the terms themselves, and the relationships among them, the informants also offered many examples of language in use, as well as cultural background to explain the various terminology.
6. We also asked for extended uses of the terms.
All interviews were audiotaped for later reference. Several informants mentioned that exploring their own culture and language and laying the structure out on paper was quite revealing to them, which indicates to us that the process did in fact tap implicit knowledge.
The interviews yielded rich and informative descriptions of the domain of family and kinship in the fourteen languages we studied. Some languages, such as Dutch, seem compact, making relatively few distinctions between various types of kinfolk. Others, for example Chinese and Malay, have elaborate schemes with distinctions made among relatives along several dimensions: age, birth order, gender, mother's or father's side, and so on, with a separate term for each. Informants produced inventories of terms ranging from a low of about twenty terms to a high of over fifty.
Since the purpose of the study is to demonstrate certain issues, we did not attempt to be comprehensive in gathering the data. Respondents continued reporting terms more or less to the level of five generations, with themselves as the middle level, and to one or two layers of cousins and aunts and uncles. The extent of reporting was often determined by what the informant meant by family. Some cultures emphasize closeness in large, extended families. Others consider family to be only the very nearest of relatives.

ANALYSIS
The results of each of the interviews were compared to English kinship terminology and structures. In particular we looked for the following:
• differences in the scope of each term. Did it cover the same entities?
• empty lexical or conceptual categories, where the informant's language has no term for an English term or vice versa.
• differences in criteria for distinction.
• extended uses of terms.
• differences in how terms are used in practice.
This comparative analysis allowed us to identify patterns of issues, which we describe generically in the following section.

Insufficient Specificity
Some languages make distinctions that we do not make in English. For instance, we do not distinguish an aunt who is a mother's sister, from an aunt who is a father's sister, or from an aunt who is an uncle's wife. All are called "aunt." Such a distinction is made, however, in many other languages. Put another way, some languages have terms for concepts that we do not bother to name separately. A user would not be able to search using these more specific subject terms because they do not exist in English. Here is a partial list of such distinctions made by other cultures:
• uncles and aunts on mother's and father's side
• grandparents on mother's and father's side
• wives of uncles and husbands of aunts
• wives and husbands of siblings distinguished by siblings' age relationship to speaker
• siblings, cousins, aunts, uncles, distinguished by age, relative to the speaker, or based on birth order (e.g., [my] younger sister, mother's oldest brother, first-born son)

Terms Too Specific
Conversely, we make distinctions in English that are not made in some languages. For instance, we distinguish siblings and cousins, while in some languages, siblings and cousins are covered by the same word, but may be distinguished along some other criterion, such as age. Here are typical examples of English terms that are not always distinguished in other languages, and thus will have the same term in that language to cover both of them:
• parents and step-parents (both called parents)
• brothers-in-law and uncles; sisters-in-law and aunts (called by name given to uncle or aunt)
• sons and daughters (both called child)
• granddaughters and grandsons (both called grandchild)
• mothers and aunts (aunts, and sometimes older sisters are called mother)
Sometimes this lack of distinction is prompted by cultural attitudes. For instance, in cultures where divorce is rare, or is perhaps glossed over, there are no special terms for stepchildren and stepparents.

Missing Terms
A lexical hole occurs when there is no term in one language for a term that exists in another language. For example, in many cultures, the role of the godparents is very important, and there are terms not only for the godparents themselves, but also for the relationship between them. There is no term for this relationship in English, not because there is no notion of godparents, but rather because there is no notion of a unique kinship relationship between them. Conversely, there are terms in English that are so specific to our culture that they may be irrelevant, and therefore not named, in other cultures. An example of this is "gay marriage."
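The three patterns described so far (insufficient specificity, terms too specific, and missing terms) can be read as comparisons between the sets of relatives that each term covers. The following is a minimal sketch of that reading, not the procedure used in the study. The function name compare_scope, the plain-English kin descriptions, and the target vocabulary with its placeholder labels (term_1 to term_4) are all assumptions made for illustration and do not come from any informant's language.

```python
from typing import Dict, FrozenSet

Scope = FrozenSet[str]  # the set of kin types that one term covers


def compare_scope(source_scope: Scope, target_terms: Dict[str, Scope]) -> str:
    """Name the mapping pattern between one source term and a target vocabulary."""
    # Keep only the target terms that cover at least one of the same kin types.
    overlapping = [scope for scope in target_terms.values() if scope & source_scope]
    if not overlapping:
        return "missing term"
    if any(scope == source_scope for scope in overlapping):
        return "equivalent"
    if all(scope < source_scope for scope in overlapping):
        # Every overlapping target term covers a proper subset of the source
        # term, so the source term is too coarse for the target language.
        return "insufficient specificity"
    if any(scope > source_scope for scope in overlapping):
        # One target term covers everything the source term does, and more.
        return "term too specific"
    return "partial overlap"


# English "aunt" covers all of these relatives.
AUNT: Scope = frozenset({
    "mother's sister", "father's sister",
    "mother's brother's wife", "father's brother's wife",
})

# Hypothetical target vocabulary; the labels are placeholders, not real words.
TARGET: Dict[str, Scope] = {
    "term_1": frozenset({"mother's sister"}),
    "term_2": frozenset({"father's sister"}),
    "term_3": frozenset({"mother's brother's wife", "father's brother's wife"}),
    "term_4": frozenset({"son", "daughter"}),
}

print(compare_scope(AUNT, TARGET))                         # insufficient specificity
print(compare_scope(frozenset({"son"}), TARGET))           # term too specific
print(compare_scope(frozenset({"co-godparent"}), TARGET))  # missing term
```

Treating a term's scope as a set keeps the comparison independent of any particular language's labels, which is why the placeholder vocabulary suffices here.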

Misclassifications
A misclassification occurs when a concept in one language is classified in another language in a way that does not conform to how that concept is construed. For example, DDC places the notion of mistress under 306.736 (extramarital relations). Of the languages and cultures we studied there are several in which the mistress lives in the same home as the family and is considered part of it. The legitimacy of her place in the family is shown by names such as little mother. In yet other cultures, the mistress is considered "extramarital" but is considered legitimate for tax-deduction purposes, and thus falls somewhere between little mother and other woman.

Differing Classification Criteria
A common reason for a lack of mapping from one classification to another is the fact that entities are classed based on different criteria. For example, we classify our kin by generation (forbears and children), by marriage (in-laws) and by sex (daughters, sons, aunts, uncles). We do not distinguish siblings by birth order; we do not use father's and mother's side as criteria of distinction, and yet these are typical in other languages. Some languages distinguish by sex where we do not (cousins, for instance), or by marriage (uncles' wives being distinct from parents' sisters).
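One way to picture differing criteria is to describe each relative by a handful of attributes and then group relatives by whichever attributes a language treats as distinctive. The sketch below is an illustration under stated assumptions: the attribute names (generation, sex, side, link), their values, and the function partition are hypothetical and are not drawn from the interviews.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Relative = Dict[str, str]

# Four relatives that English groups under "aunt"; attribute names are illustrative.
RELATIVES: List[Relative] = [
    {"who": "mother's sister", "generation": "+1", "sex": "f",
     "side": "mother", "link": "blood"},
    {"who": "father's sister", "generation": "+1", "sex": "f",
     "side": "father", "link": "blood"},
    {"who": "mother's brother's wife", "generation": "+1", "sex": "f",
     "side": "mother", "link": "marriage"},
    {"who": "father's brother's wife", "generation": "+1", "sex": "f",
     "side": "father", "link": "marriage"},
]


def partition(relatives: List[Relative],
              criteria: Tuple[str, ...]) -> Dict[Tuple[str, ...], List[str]]:
    """Group relatives that share the same values on every chosen criterion."""
    groups: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for relative in relatives:
        key = tuple(relative[name] for name in criteria)
        groups[key].append(relative["who"])
    return dict(groups)


# Generation and sex alone leave one undivided group (the English "aunt").
coarse = partition(RELATIVES, ("generation", "sex"))
# Adding side of the family and blood-versus-marriage separates all four.
fine = partition(RELATIVES, ("generation", "sex", "side", "link"))
print(len(coarse), len(fine))  # 1 4
```

The same four relatives fall into one category or four depending only on the criteria chosen, which is the sense in which two classifications can fail to map onto each other even when they describe the same people.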

Extensions: The Case of "Aunties"
In almost all languages, kinship terms have a way of being extended to individuals who are not related by blood or marriage. What is interesting is that the term in one language can be affectionately extended (such as calling all close women friends of your mother's age auntie), or it can be rude in another language (such as insulting a woman by implying she is old by calling her auntie). The purpose of most of the extended meanings is to give kinship status to those not technically related, and is meant as a way of showing closeness. Sometimes, though, the extended meanings are used to smooth over disruptions in family life, for example extending the term mother to your stepmother, or the term father to the older brother of a deceased father (in Shona).

Language in Use
Each culture experiences shifts in kinship norms and values, and the classification and terminology eventually reflect this. Where once distinctions were made for political or social reasons such as royal inheritance, they may no longer be relevant. On the other hand, new social forms emerge requiring new labels. For instance, many Dutch couples do not marry, and yet there is no universally accepted term for the man and woman who live together and have children but are not married. There is a gap in the classification, or a shift. Thus, in The Netherlands, the term illegitimate child has very little meaning if a large proportion of children technically fall into this category, and yet are considered "legitimate" in every other way: legally and socially.

COMPARISON WITH LCC AND DDC
Once the basic differences were identified, we wanted to see how these issues played out in the LCC and the DDC. We looked for samples of representations of kinship terms in the schedules of each of these schemes and found a number of ways in which the classifications provided by our informants did not map well to LCC and DDC.

The Library of Congress Classification
This is the most widely used classification in academic and research libraries. It was originally a scheme devised to accommodate the collection of the United States Congress, hence its disproportionate coverage of certain topics, such as the military and political sciences. It is thus a document-centered (rather than subject-centered) classification. Since its inception, however, the LCC has grown to reflect much wider collections than those of the Congress of the United States, making it in many ways a de facto national classification. If a subject category does not already exist, it gets added as works get published; thus we can assume that there is at least one work for each subject category in the LCC. In this way we can say that LCC emerges from a strong cultural and literary warrant.
Kinship and family are covered mainly in the H schedules (the Social Sciences), as well as scattered throughout other schedules for various special topics such as psychological, legal, mythical, and religious aspects. For the sake of simplicity, we cover only the Hs in this discussion, since these sections provide the most straightforward treatment of family and will serve to show up the various issues.
We see in Figure 1 that the LCC representation of kinship and family contains a blend of straightforward kinship terms, terms representing social phenomena, as well as a few terms that seem out of place (e.g., HQ759.2: Mother's Day). Many of the problems discussed above are evident in this classification:
• insufficient specificity to describe, for instance, different terms for aunts and uncles on mother's and father's side;
• overly specific terms that may not be used in other languages, such as cousins;
• culturally significant terms that may not map accurately to all languages, such as working mothers;
• different criteria of distinction, in that there is no provision for distinguishing by age or birth order, which is critical in some languages (for instance, first son, little sister, and so on).
All in all, the LCC does not seem culturally hospitable. Because it is an enumerative system (that is, the main goal is to find a place for each subject, rather than to build a coherent structure), it is difficult to see how it could be altered easily to accommodate concepts and conceptual structures from other languages, except to add them here and there in the same arbitrary way the English terms seem to be added. In other words, the LCC's classification does not seem to do a particularly good job of describing our own kinship terms and structures. This might add to the problems of translating it into other languages.

Dewey Decimal Classification
This classification, developed by Melvil Dewey over a century ago, is based on a model of knowledge that reflects nineteenth-century academic disciplines in the United States. Even though it has undergone over twenty revisions, it still shows this bias in the distribution of classes and the relative difficulty of using it for non-Christian, non-Western works. The DDC is a deductive classification, which means that categories exist (or can be built) even if there is no work published on a given topic.
The main concepts of family and kinship are represented in two ways in the DDC. The first is in the main schedules in the 300's (Social Sciences). This deals with family as a social institution, and focuses on family relationships (see Figure 2). This section of the DDC does not really represent kinship per se, but rather relationships among kin, as well as the various ways in which families might be configured. It is a list of subjects, however, that is clearly embedded in our own culture. Various key relationships from other cultures are not included. For instance, the DDC does not include the notion of relationships with aunts and uncles as well as with cousins, which is critical in many cultures. As in the LCC, there are terms that might be perplexing or irrelevant to other cultures (e.g., suburban family). The conceptual structure, though, is generic and open, and thus, it is possible to add subjects without too much confusion. For instance, one could add the Chinese notion of "reverse marriage" where the husband joins the wife's family, and becomes part of it, rather than the other way around. Under grandparent-child relationships (306.8744), one could add the specific types of grandparents, such as those on one side of the family or the other.
The other way in which the DDC represents concepts of family is through Table 7, Groups of Persons (see Figure 3). The subject categories from the tables are not used alone, but rather are added as suffixes to subject categories from the main schedules in order to make them more specific. For example: 306.8 is the number for "Marriage and Family." We can add a suffix from Table 2, the geographic tables, -095, for instance, which would yield 306.8095, meaning "Marriage and family in Asia." Table 7 in the DDC is interesting because instead of classifying kinship relationships by using our own English terms (and therefore our criteria for distinction), the categories are described generically. So, instead of calling the category Sons and Daughters (our term for this concept), it is called Direct Descendants. It would be relatively easy to add the various specific names for first son, oldest daughter, even if a language did not have exactly equivalent terms for an undifferentiated son or daughter. It would also be relatively simple to add categories for "inside" and "outside" families, which differentiates by the mother's and father's sides of the family.
Another interesting feature of Table 7 is that it does address the issue of "age" in individuals, but age is construed as a phase in a person's life, such as being a teenager, rather than a permanent condition. In other languages, age is important as well, but it stays static. That is, once you are the first-born son, you are older brother to your younger siblings, and you remain in this category forever. You do not grow out of it, and even death does not alter your seniority in terms of labeling.
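The suffixing step described above, in which a table notation is appended to a schedule number, can be sketched as simple string concatenation. This is a deliberately simplified illustration: the function name build_number is hypothetical, and the sketch ignores the many instructions and exceptions that govern number building in the actual DDC.

```python
def build_number(base: str, suffix: str) -> str:
    """Append a table notation such as "-095" to a schedule number such as "306.8".

    Simplifying assumption: the digits of the suffix are appended after the
    decimal point with no further rules applied.
    """
    digits = suffix.lstrip("-")  # table notations are written with a leading dash
    if not digits.isdigit():
        raise ValueError(f"not a table notation: {suffix!r}")
    if "." not in base:
        base += "."  # a bare three-digit class number gets its decimal point first
    return base + digits


# The example from the text: "Marriage and Family" plus the suffix for Asia.
print(build_number("306.8", "-095"))  # 306.8095
```

Because the suffix carries its own meaning independently of the base number, the same notation can be attached to other schedule numbers, which is what makes the tables a natural place for generically described categories.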

A FRAMEWORK FOR CULTURALLY HOSPITABLE CLASSIFICATION TRANSLATION
Having demonstrated the various errors of classification mapping and difficulties of translation, in this section we present some preliminary suggestions for how a classification can more successfully be translated or extended to other cultures and languages.
• For errors of insufficiently precise terminology in the target language, add the appropriate terms.
• For errors of extraneous categories, prune the classification of terms that make no sense in the target language, or leave them "fallow" to be unused.
• In the case of one term in the target language being used for two or more terms in English, make sure to make cross references (see also). This is to ensure that the notion of cousins will not be lost to a person who searches
• For errors of conceptual structure, such as the misclassification of mistress, add modifiers or scope notes to clarify the terms and treat them each as a separate entity. Then classify each in its appropriate place in the scheme: Mistress (illicit extramarital relationship); Mistress (legitimate extramarital relationship).
• Describe categories as generically as possible so that a variety of terms can be logically classed in them.
This does not solve all of the problems, of course. An ideal translation that is 100 percent culturally and linguistically sensitive is probably not achievable because the criteria we would have to use are extremely complex, dynamic, and subjective. If we adopt culturally and linguistically hospitable practices, however, we will improve our classification-translation results in terms of making them more useful to their constituencies. Even if we are successful in this endeavor, though, we will still have to address problems that might arise from trying to be everything to everyone: these might include lack of clarity and cohesiveness or inability to incorporate diverse structures due to fundamental differences.
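Two of the suggestions above, see-also cross references and qualified headings, are mechanical enough to sketch in a few lines. The function names cross_references and qualified_headings are hypothetical, and the sketch assumes that an editor has already decided which terms share a target-language word and which senses a heading needs; it does not make those judgments itself.

```python
from typing import Dict, List


def cross_references(english_terms: List[str]) -> Dict[str, List[str]]:
    """Link each English term to the others covered by the same target-language term."""
    return {
        term: sorted(other for other in english_terms if other != term)
        for term in english_terms
    }


def qualified_headings(term: str, senses: List[str]) -> List[str]:
    """Split one ambiguous heading into separately classifiable, qualified headings."""
    return [f"{term} ({sense})" for sense in senses]


# Siblings and cousins share one word in some languages, so each points to the other.
print(cross_references(["cousins", "siblings"]))

# The mistress example: one heading per construal, each classed in its own place.
for heading in qualified_headings(
    "Mistress",
    ["illicit extramarital relationship", "legitimate extramarital relationship"],
):
    print(heading)
```

The qualifier in parentheses follows the form of the headings given in the list above, so the output matches the two Mistress entries proposed there.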

CONCLUSION
Many problems arise in the process of translation of a classification system from the source to another language and culture. Among these are finding corresponding terminology and being able to reflect the relationship between terms in the target language correctly. We presented evidence that in the process of translating classification structures there may be structural shifts. Some terms have broader definitions; others, narrower ones. There may be differences in how similar terms are construed. There may be additional criteria of distinction (such as birth-order).
We suggest that not only terms themselves but also inter-term relationships need to be preserved in cross-cultural cross-lingual classification translations. It is important to avoid merely translating the source classification word for word, structure by structure. Instead, it is necessary to understand the key classificatory dimensions in any given language. The domain of kinship terms provides a good example because even though it is universal in some ways, there are large differences in how people view family and kinship. In translating a classification scheme of kinship terms what is important to know? If the classification is translated, will it truly reflect the notions of kinship in that language?