DIMACS Series in
Discrete Mathematics and Theoretical Computer Science

VOLUME SEVENTEEN
TITLE: "Language Computations"
EDITOR: Eric Sven Ristad
Published by the American Mathematical Society


Ordering Information

This volume may be obtained from the AMS or through bookstores in your area.

To order through AMS contact the AMS Customer Services Department, P.O. Box 6248, Providence, Rhode Island 02940-6248 USA. For Visa, Mastercard, Discover, and American Express orders call 1-800-321-4AMS.

You may also visit the AMS Bookstore and order directly from there. DIMACS does not distribute or sell these books.



PREFACE


A language computation is a computation that underlies the comprehension, production, or acquisition of human language. Language comprehension is a computation that maps a physical sensation to a representation of the language user's knowledge about that sensation. Language production is a computation that maps a language user's linguistic intention to the complex motor activity that expresses that intention. Language acquisition is a computation that maps a stream of linguistic evidence to an internal representation of the human language from which that evidence was culled. These computations lie at the very heart of human language. The goal of this volume is to advance our understanding of these language computations. Our focus is on computations related to the sounds and words of a language.

Our volume begins in part I with an investigation of the sensory-motor representation of speech sounds (phonetics). Osamu Fujimura opens this part with a revolutionary new theory of how the primitive sensory-motor goals of the speaker are expressed in an acoustic signal. Next, Andras Kornai presents an abstract mathematical framework for modeling the relation between the discrete representations attributed to the language user and the continuous physical events directly observable by the scientist. Together, these two articles present a comprehensive view of the relation between the phonetics and the acoustic signal.

In part II, we examine phonological stress, a phonological subsystem used to indicate constituency and hierarchy. Morris Halle and William Isdardi begin by presenting an explicit theory of the language user's knowledge of phonological stress. Their theory explains how stress is represented and computed in the language user, and it enumerates the humanly-possible stress systems. Next, Elan Dresher proposes a resource-limited algorithm capable of acquiring the Halle and Idsardi stress system on the basis of limited phonological evidence. Finally, Luigi Burzio presents a radically different theory of phonological stress, that relates the possible stress patterns of a language to the morphology of that language. These three articles each represent a significant advance in our understanding of phonological stress.

In part III, we consider the abstract question of how it is possible to reliably learn a human language in the face of noisy, incomplete, and inconsistent evidence. Ming Li and Paul Vitanyi present a general theory of how to reassign probability to a hypothesis space on the basis of empirical evidence. Next, Jorma Rissanen and Eric Ristad review the philosophy and practice of the Minimum Description Length (MDL) principle, and demonstrate its application to the metrical acquisition problem considered in part II. The central contribution of these two articles is to provide a powerful conceptual framework within which to address the logical problem of language acquisition.

In part IV, we study the relation between the sound and the meaning of words (morphology). Stephen Anderson introduces this part with a detailed explanation of the intricacy of morphological computations. Eric Ristad concludes with an explicit statement of the morphological acquisition problem, and a proof that the intricacies of this computation -- as detailed by Anderson -- straighforwardly give rise to NP-completeness. Traditionally, morphology has been viewed as a trivial component of linguistic knowledge, of little scientific or engineering importance. By detailing the empirical complexity of the morphology and demonstrating its corresponding computational demands, these two articles offer a serious challenge to traditional views of morphology.

This volume constitutes the official, refereed record of the DIMACS Workshop on Human Language, which was held March 20-22, 1992 at Princeton University. The workshop drew together many of the world's most prominent linguists, computer scientists, and learning theorists with the common goal of advancing our understanding of language computations. The workshop was sponsored by the NEC Research Institute and by the National Science Foundation, via grant number 91-19999 to the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS).

The articles in this volume are directed toward the technically-competent researcher with an interest in human language and in computation. No article in this volume requires an expertise in either linguistics or computer science. However, the reader may wish to strengthen his background in these areas by consulting the following introductory material: phonology and phonetics [3]; morphology [1]; computer science [4], computational complexity theory [2]; computational theory of language [8], inductive reasoning [5]; and minimum description length principle [6,7].


TABLE OF CONTENTS




Forward								  xi

Preface                                                         xiii

Part I. Phonetics

	C/D model: A computational model of phonetic
	  implementation
		Osamu Fujimura					   1

	Relating phonetic and phonological categories
		Andras Kornai					  21


Part II. Metrical Phonology

	General properties of stress and metrical structure
		Morris Halle and William Idsardi		  37

	Acquiring stress systems
		B. Elan Dresher				          71

	Metrical consistency
		Luigi Burzio			                  93



Part III. Learning Frameworks

	Inductive reasoning
		Ming Li and Paul Vitanyi		         127

	Language acquisition in the MDL framework
		Jorma Rissanen and Eric Sven Ristad


Part IV. Morphology

	Parsing morphology:  "Factoring" words
		Stephen R. Anderson			         167

	Complexity of morpheme acquisition
		Eric Sven Ristad		                 185
                                		

Index Index of Volumes
DIMACS Homepage
Contacting the Center
Document last modified on October 28, 1998.