The prevailing approach to language mapping, whether in printed atlases or on electronic maps of various kinds, is to use polygons to show the approximate boundaries of individual languages or language groups. This strategy turns out to be highly unsuitable for mapping the greater part of modern languages, for a number of reasons. The main ones concern visibility, especially for smaller languages; the accuracy of the indicated locations; and the way human languages are modeled from a geographic perspective. A detailed discussion of these issues was presented in GISLI (GIS in Linguistics): An Interactive Language Atlas, a research proposal submitted to the Swedish Research Council by one of the authors (Dahl) in 2005. A shortened version is included here.
Consider the following simple statistics from the latest edition of The Ethnologue: Languages of the World, an online reference database for the languages of the world.
Thus, most of the world's languages can be described as "small" languages in terms of the number of people who speak them. Small languages tend to be represented rather inefficiently on traditional maps and, more often than not, are completely absent. One of the better traditional maps (shown in Figure 1) illustrates this point. This map of the Caucasus was originally compiled by the Central Intelligence Agency and is reproduced here from the University of Texas Perry-Castañeda Map Library Collection.
This map successfully conveys the substantial variety of peoples and languages in the Caucasus. This region is rightly referred to as the Jigsaw Puzzle of Languages. However, the map is otherwise of limited use because
However, according to Vinogradov, there are also villages in Azerbaijan where Rutul is spoken. When those additional villages are mapped, the locations where Rutul is spoken do not form a continuous area. As pointed out in the previously cited paper by Dahl, languages with several thousand speakers or fewer ought to be mapped on the settlement level, something that GIS makes possible. Relating individual languages to specific populated places will hereafter be referred to as geocoding languages.
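The idea of geocoding a language can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the data model used by the authors: the gazetteer, the place identifiers, the settlement names, and the coordinates are all hypothetical placeholders, and only the fact that Rutul is spoken in villages on both sides of the border comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settlement:
    """A populated place, stored as a point rather than an area."""
    name: str
    country: str
    lat: float
    lon: float


# Hypothetical gazetteer: names and coordinates are placeholders,
# not real survey data.
GAZETTEER = {
    "place-001": Settlement("Settlement A", "Russia", 41.6, 47.4),
    "place-002": Settlement("Settlement B", "Russia", 41.5, 47.3),
    "place-003": Settlement("Settlement C", "Azerbaijan", 41.2, 47.2),
}

# Hypothetical language-to-place table: each language is linked to the
# settlements where it is spoken, never to a boundary polygon.
SPOKEN_IN = {
    "Rutul": ["place-001", "place-002", "place-003"],
}


def geocode_language(language, spoken_in=SPOKEN_IN, gazetteer=GAZETTEER):
    """Return the settlements (points) recorded for a language."""
    return [gazetteer[place_id] for place_id in spoken_in.get(language, [])]


def countries(points):
    """List the countries in which the given settlements lie."""
    return sorted({p.country for p in points})


rutul_points = geocode_language("Rutul")
```

Because each record is a point, a language spoken in several non-adjacent villages, including villages in more than one country, needs no special treatment: `countries(rutul_points)` simply reports every country in which a recorded settlement lies.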
The Limitations of Polygons
In addition to issues of location accuracy and displaying information about smaller languages, using polygons to map languages has another serious drawback. The map in Figure 2 is a snapshot of the Caucasus region from the mapping service by Global Mapping International (imf.geocortex.net/mapping/worldmap/launch.html). This map uses the same approach as the previous map. It gives the rather erroneous impression that dialects or languages are discrete entities with clearly definable boundaries. In fact, most linguists now recognize that setting such clear boundaries is not possible.
Another serious problem concerns the language-dialect distinction. Anyone who has taken a basic linguistics class knows that the distinction between language and dialect is one of the great unsolvables in language science. The definition of what constitutes a language typically involves political, cultural, and other factors. The distinction is rarely based only on linguistic features.
The level of detail offered by GIS tools makes it possible to design a language mapping system that is flexible enough to reflect not only different views of the language-dialect dichotomy for individual languages but also various levels of language groupings. Using point files, instead of polygons, for languages and dialects eliminates the need to set up unrealistic boundaries.
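One way such flexibility could be obtained from point data is sketched below. It is a minimal illustration, assuming that every point carries its full classification path from the broadest grouping down to the local variety; the class name, the field names, and all labels ("Family X", "Language A", and so on) are hypothetical and do not describe any real classification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LanguagePoint:
    """One settlement with the classification of the variety spoken there."""
    place: str
    lat: float
    lon: float
    # Ordered from broadest grouping to narrowest variety (assumed layout).
    classification: Tuple[str, ...]


# Placeholder records: names, coordinates, and labels are illustrative only.
POINTS = [
    LanguagePoint("Village 1", 41.0, 47.0,
                  ("Family X", "Group X1", "Language A", "Variety A-north")),
    LanguagePoint("Village 2", 41.1, 47.2,
                  ("Family X", "Group X1", "Language A", "Variety A-south")),
    LanguagePoint("Village 3", 41.3, 46.9,
                  ("Family X", "Group X1", "Language B")),
    LanguagePoint("Village 4", 42.0, 45.5,
                  ("Family Y", "Language C")),
]


def select(points: List[LanguagePoint], label: str) -> List[LanguagePoint]:
    """Return every point whose classification path contains the label."""
    return [p for p in points if label in p.classification]


family_x = select(POINTS, "Family X")        # a whole family
language_a = select(POINTS, "Language A")    # one language
a_north = select(POINTS, "Variety A-north")  # one local variety
```

The same set of points serves every level of grouping, so whether "Variety A-north" is treated as a dialect of "Language A" or as a language in its own right becomes a question of which label is queried, not of where a boundary is drawn.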
In summary, present-day language geography makes insufficient use of GIS tools. Moreover, when GIS tools are used, they are used to implement the traditional approach to language mapping (i.e., using polygons to represent languages). The maps thus generated may be inaccurate as far as language location is concerned, and erroneous in the sense that languages and their varieties are represented as discrete entities with clearly definable boundaries. In addition, these maps are not flexible enough to show different levels of language groupings, which range from local varieties to larger dialect groups and higher-level language groups or families. Finally, smaller languages, which make up the majority of the world's languages, are represented inefficiently or not mapped at all.