Non-Transformational Syntax

Description

This authoritative introduction explores the four main non-transformational syntactic frameworks: Head-driven Phrase Structure Grammar, Lexical-Functional Grammar, Categorial Grammar, and Simpler Syntax. It also considers a range of issues that arise in connection with these approaches, including questions about processing and acquisition.

* An authoritative introduction to the main alternatives to transformational grammar
* Includes introductions to three long-established non-transformational syntactic frameworks: Head-driven Phrase Structure Grammar, Lexical-Functional Grammar, and Categorial Grammar, along with the recently developed Simpler Syntax
* Brings together linguists who have developed and shaped these theories to illustrate the central properties of these frameworks and how they handle some of the main phenomena of syntax
* Discusses a range of issues that arise in connection with non-transformational approaches, including processing and acquisition




Contents

Contributors

Introduction

1 Elementary Principles of Head-Driven Phrase Structure Grammar

1.1 Introduction

1.2 Grammars, Types, and Constraints

1.3 Feature Structures and Feature Structure Descriptions

1.4 Signs and Their Attributes

1.5 Constraints and Structure-Sharing

1.6 Selection and Agreement

1.7 Semantics

1.8 Constituent Structure

1.9 Constituent Order

1.10 The Lexicon, Lexical Relations, and Lexical Rules

1.11 Complementation

1.12 Extraction

1.13 Binding

1.14 Further directions

2 Advanced Topics in Head-Driven Phrase Structure Grammar

2.1 Introduction

2.2 Argument Structure

2.3 Phrase Structure and Linear Order

2.4 Syntactic Abstractness and Reductionism

2.5 Meaning in HPSG

2.6 Issues in Morphosyntax

2.7 Advances in Logical Foundations

2.8 Conclusions

3 Lexical-Functional Grammar: Interactions between Morphology and Syntax

3.1 Introduction

3.2 Basic LFG

3.3 Nonconfigurationality

3.4 Variable Head Positioning and Distributed Exponence

3.5 Conclusions

4 Lexical-Functional Grammar: Functional Structure

4.1 Introduction

4.2 Grammatical Functions

4.3 Lexical Mapping Theory (LMT)

4.4 Control and Secondary Predication

4.5 Unbounded Dependencies

4.6 Binding

4.7 Expletives and F-Structure

4.8 Conclusions

5 Combinatory Categorial Grammar

5.1 Introduction

5.2 The Crisis in Syntactic Theory

5.3 Combinatory Categorial Grammar

5.4 The Combinatory Projection Principle

5.5 The Bounded Constructions

5.6 The Unbounded Constructions

5.7 Scrambling

5.8 Gapping and the Order of Constituents

5.9 Intonation Structure and Parentheticals

5.10 Implications for Performance: the Strict Competence Hypothesis

5.11 Computational Applications

5.12 Conclusion

6 Multi-Modal Type-Logical Grammar

6.1 Introduction

6.2 Grammatical Composition as Inference

6.3 Structural Constraints on Inference

6.4 Implicit Grammatical Concepts

6.5 Implicational Reasoning

6.6 The Curry–Howard Correspondence

6.7 Hypothetical Reasoning with Higher-Order Types

6.8 Higher-Order Types and “Empty Categories”

6.9 Surface Polymorphism

6.10 Modal Control I: Licensing Behavior

6.11 Modal Control II: Exacting Behavior

6.12 Surface Polymorphism Revisited I: Dutch

6.13 Surface Polymorphism Revisited II: English

6.14 Conclusions

7 Alternative Minimalist Visions of Language

7.1 Introduction: Goals and Constraints

7.2 Two Kinds of Minimalism

7.3 Ways in which Covert Syntax is Not Minimal

7.4 Basic Mechanisms for Building Syntactic Structure

7.5 Addressing Acquisition: What Does the Child Have to Learn?

7.6 Can These Examples Be Disregarded as “Peripheral”?

7.7 Learning and Innateness: Satisfying the Evolutionary Constraint

8 Feature-Based Grammar

8.1 Introduction

8.2 Features and Values

8.3 Feature Compatibility

8.4 A Subsumption-Based Alternative

8.5 Foundations and Implications

8.6 Conclusions

9 Lexicalism, Periphrasis, and Implicative Morphology

9.1 Introduction

9.2 A Taxonomy of Lexicalist Approaches to Periphrasis

9.3 Inflectional Periphrasis: Compound Tenses

9.4 Derivational Periphrasis: Phrasal Predicates

9.5 Conclusions

10 Performance-Compatible Competence Grammar

10.1 Introduction: Competence and Performance

10.2 Contemporary Psycholinguistics

10.3 Constraint-Based Grammar

10.4 CBL Grammars and Sentence Processing

10.5 A Minimalist Alternative

10.6 Conclusions

11 Modeling Grammar Growth: Universal Grammar without Innate Principles or Parameters

11.1 Introduction

11.2 Head-Driven Phrase Structure Grammar

11.3 The “One-Word” Stage: Learning Words

11.4 Getting Syntax: Multi-Word Utterances

11.5 The Projection of Lexical Properties to Phrases

11.6 Interim Summary

11.7 The Acquisition of Questions

11.8 Conclusions

12 Language Acquisition with Feature-Based Grammars

12.1 Introduction

12.2 Language Learning and the Poverty of the Stimulus

12.3 Possible UGs

12.4 Success in Learning a Grammar

12.5 Learners

12.6 Conclusions

Index of Subjects

Index of Languages

This edition first published 2011. © 2011 Blackwell Publishing Ltd

Blackwell Publishing was acquired by John Wiley & Sons in February 2007. Blackwell’s publishing program has been merged with Wiley’s global Scientific, Technical, and Medical business to form Wiley-Blackwell.

Registered Office: John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom

Editorial Offices: 350 Main Street, Malden, MA 02148-5020, USA; 9600 Garsington Road, Oxford, OX4 2DQ, UK; The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

For details of our global editorial offices, for customer services, and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com/wiley-blackwell.

The right of Robert D. Borsley and Kersti Börjars to be identified as the editors of the editorial material in this work has been asserted in accordance with the UK Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Library of Congress Cataloging-in-Publication data is available for this book.

ISBN hbk: 9780631209652

A catalogue record for this book is available from the British Library.

This book is published in the following electronic formats: ePDF 9781444395013; Wiley Online Library 9781444395037; ePub 9781444395020

Contributors

Farrell Ackerman, University of California, San Diego

Jason Baldridge, University of Texas at Austin

James P. Blevins, University of Cambridge

Kersti Börjars, University of Manchester

Robert D. Borsley, University of Essex

Joan Bresnan, Stanford University

Georgia M. Green, University of Illinois

Ray Jackendoff, Tufts University

Andreas Kathol, SRI International

Helge Lødrup, University of Oslo

Rachel Nordlinger, The University of Melbourne

Richard T. Oehrle, Cataphora, Inc.

Adam Przepiórkowski, Polish Academy of Sciences

Ivan A. Sag, Stanford University

Mark Steedman, University of Edinburgh

Gregory T. Stump, University of Kentucky

Jesse Tseng, CNRS and University of Toulouse

Aline Villavicencio, Federal University of Rio Grande do Sul

Thomas Wasow, Stanford University

Gert Webelhuth, University of Göttingen

Introduction

Robert D. Borsley and Kersti Börjars

In his Syntactic Structures (Chomsky 1957), Noam Chomsky introduced two very important ideas to the linguistics community: generative grammar and transformational grammar. These are rather different ideas but it is not uncommon for them to be confused and used interchangeably. Generative grammar is a view of how to model language. It emphasizes the importance of precise and explicit analyses. Thus, Chomsky (1965: 4) remarks that “If the grammar is … perfectly explicit … we may … call it a generative grammar,” and Chomsky (1995a: 162, fn.1) comments that “I have always understood a generative grammar to be nothing more than an explicit grammar.”1 In contrast, transformational grammar is a specific type of theory developed within this view. Its hallmark is the assumption that grammars involve movement processes so that the superficial position of a word or phrase may be quite different from its underlying position.

It is not uncommon for the term “generative grammar” to be used to mean transformational grammar, which has developed through different stages and is currently known as the Minimalist Program or Minimalism. It is quite possible, however, for approaches that eschew the movement processes of transformational grammar to be precise and explicit. The three theoretical frameworks dealt with in the first six chapters of this book all fall into that category. Equally it is possible for transformational approaches to be imprecise and inexplicit, and in fact this is one of the main criticisms that outsiders have of contemporary transformational work.

Until the mid-1970s, generative grammar and transformational grammar were more or less coextensive. However, at that time, a number of influential linguists developed concerns about aspects of transformational grammar. For instance, concern about the over-reliance on structural factors in explanations led to the development in the 1970s of Relational Grammar, an approach within which direct reference can be made to grammatical relations such as subject and object, allowing generalizations to be stated as changes in relational status rather than as structural transformations (see Perlmutter 1983; Perlmutter & Rosen 1984; Blake 1990; Postal & Joseph 1990).2 As we shall see, within Lexical-Functional Grammar it is also possible to make direct reference to grammatical relations, though they are no longer primitives in recent versions of LFG (see Lødrup, this volume).

We are grateful to all those who acted as referees during the lengthy preparation of this volume: Miriam Butt, Harald Clahsen, Janet Fodor, Georgia Green, Geert-Jan Kruijff, Bob Levine, Helge Lødrup, Michael Moortgat, Dick Oehrle, John Payne, Carl Pollard, Mark Steedman, Nigel Vincent, and Mary McGee Wood. We are also grateful to Anna Oxbury for her meticulous work as Project Manager and to Fiona Sewell for her careful and perceptive copy-editing. We would also like to thank Ewa Jaworska for help with the index.

Around the same time, some linguists also noted problems relating to the lack of clear, formal, and explicit descriptions within transformational grammar. With implicit reference to then-current work in transformational grammar, Gazdar et al. (1985: ix) stated that one cannot “evade the entire enterprise of generative grammar by announcing ‘We assume some recursive function that assigns to each grammatical and meaningful sentence of English an appropriate structure and interpretation.’ One must set about constructing such a function, or one is not in the business of theoretical linguistics.” In this spirit, Generalized Phrase Structure Grammar (GPSG) was developed: a precise, monostratal framework in which the generalizations expressed through movement in transformational frameworks are captured instead through complex categories and through meta-rules that map specific phrase structure rules to other phrase structure rules.

In the 1960s, psycholinguistic work was carried out to test the psychological reality of assumptions made within transformational theory at the time. In particular the Derivational Theory of Complexity was tested. According to this theory, transformational complexity should lead to psychological complexity; that is, we would expect a sentence to take longer to process or produce the more transformations its derivation involves. The early psycholinguistic work found evidence for syntactic categories and syntactic structures, but not for transformations. Bresnan (1978: 2) accordingly described transformational grammar as presented in Chomsky’s work as “psychologically unrealistic.” Her desire to develop a more realistic theory of grammar resulted in LFG in the early 1980s. Like GPSG, LFG uses mapping rules to account for some of the relations that were captured by transformations; however, unlike GPSG, within early LFG the rules relate lexical elements in the lexicon.3

The development of Categorial Grammar (CG) differs from that of the approaches described so far: its origins predate transformational grammar, and it did not develop in reaction to transformational grammar in the way the other approaches did. CG can be traced back to Ajdukiewicz’s work in the 1930s (Ajdukiewicz 1935) and has been developed in various ways since the 1950s. The early developments were closely linked to the first attempts at computational linguistic work (e.g. Bar-Hillel 1953). In some earlier work, CG was in fact combined with transformational grammar (e.g. Lyons 1968; Partee 1975), but this line of research never took off. However, there are aspects of Minimalism that bring it closer to CG (see e.g. Vermaat 2004, 2005).

In the mid-1980s, Head-Driven Phrase Structure Grammar (HPSG) appeared as a monostratal theory exploiting the potential of complex categories even more fully than GPSG had done and incorporating ideas from CG, LFG, and other approaches. Since the mid-1990s, it has utilized hierarchies of constructions to capture generalizations of various kinds.

Since the mid-1980s, LFG, CG, and HPSG have developed into well-worked-out alternatives to transformational grammar, and they have been joined in recent years by the Simpler Syntax framework developed by Peter Culicover and Ray Jackendoff, which has a lot in common with HPSG. This is the subject of Jackendoff (this volume).

The aim of this book is to give an insight into some of the well-developed alternatives to transformational grammar. This is done in two parts. The first part (chapters 1–6) contains introductions to HPSG, LFG, and CG. As one would expect, slightly different analyses have developed within the theories. In the case of HPSG and LFG, the chapters included here (Green, ch. 1; Kathol et al.; Nordlinger & Bresnan; Lødrup) indicate alternatives where they exist, but present a generally accepted core. In the case of CG, the chapters (Steedman & Baldridge; Oehrle) present two different versions of the general approach. The second part of this book (chapters 7–12) discusses specific or general issues from a non-transformational perspective. There are many other approaches we could have included in the first part, among them Functional Grammar (Dik 1978, 1983; Siewierska 1991), Role and Reference Grammar (Van Valin 1993; Van Valin & La Polla 1997), and Dynamic Syntax (Cann et al. 2005). The motivation for our choice is partly one of personal preference, but the three theories discussed share a number of properties, for instance in being feature based and involving some form of unification, while also involving interesting differences. Though there are many plausible and well-developed alternatives, transformational grammar remains the most influential approach to syntax. In our view, these alternatives deserve to be more influential. One thing we want to achieve with this book, then, is to make some of the alternatives more accessible.

When non-transformational approaches were developing in the 1980s, transformational grammar in the form of Government and Binding (GB) theory was very influential. As Newmeyer (1986: 224) notes, a large number of syntacticians found “its premises convincing and its results impressive,” and as a result a large number of linguists turned to it both for theoretical analysis and for essentially descriptive work. It also won some converts from other theories. In recent years, however, there has been evidence of a growing disenchantment with the Minimalist Program (Chomsky 1995a). It has been subject to criticism not only from long-standing critics of transformational grammar, such as Postal, but also from syntacticians once quite close to the transformational mainstream, such as Culicover, Jackendoff, Newmeyer, and Webelhuth.4

What many see in the framework is a great deal of rhetoric but little in the way of real achievement. Thus, Newmeyer (2003: 586) remarks that “one is left with the feeling that Chomsky’s ever-increasingly triumphalistic rhetoric is inversely proportional to the actual empirical results that he can point to.” Expanding on this observation, Newmeyer (2003: 589, fn. 7) notes that when Chomsky is asked in an interview what the “results” of our field are, “he responds by citing descriptive generalizations uncovered in pre-minimalist work, such as the distinction between strong and weak islands, rather than pointing to concrete empirical problems solved under the [Minimalist Program]” (see Chomsky 2002: 151, 153). Occasionally it is claimed that there are some important results, but then qualifications are made, which suggest that the claims should not be taken very seriously. Thus, Chomsky (1995a: 249) suggests that “phrase structure theory can be eliminated entirely, it seems, on the basis of the most elementary assumptions,” but then he remarks later that “we still have no good phrase structure theory for such simple matters as attributive adjectives, relative clauses, and adjuncts of different types” (1995a: 382, fn. 22). In an apparent attempt to justify the absence of results, proponents of Minimalism insist that it is “just a program.” But if it is only a program, it is not clear why it should not be considered less advanced than other frameworks, for example those represented here, which have precise and detailed analyses of many syntactic phenomena.

Turning to the rhetoric of Minimalism, a central feature is the idea that language may be “perfect.” Thus, Chomsky (2002: 58) remarks that “it has become possible to pose in a productive way the question of the ‘perfection of language’: specifically, to ask how closely human language approaches an optimal solution to design conditions that the system must meet to be usable at all.” This idea does not fit very comfortably with another central Chomskyan idea, the idea that linguistics is “an approach to the mind that considers language and similar phenomena to be elements of the natural world, to be studied by ordinary methods of empirical inquiry” (Chomsky 1995b: 1). We are not aware of any other element of the natural world where the central research question is: how perfect is it? Moreover, Minimalists do not appear to take this question very seriously. Thus, one textbook introduction, Radford (2004), mentions the idea on p. 9 but ignores it thereafter, while another, Adger (2003), ignores it altogether and rightly in our opinion.

Another feature of Minimalist rhetoric, which it inherits from earlier transformational work, is the claim that transformational analyses explain while non-transformational analyses only describe. Thus, Chomsky (2000) remarks that the Minimalist Program “encourages us to distinguish genuine explanations from ‘engineering solutions’ – a term I do not mean in any disparaging sense.” It seems to us that there is absolutely no basis for this idea. Let us consider a concrete example, namely English non-finite relative clauses. Unlike finite relative clauses, they allow only a PP and not an NP as the clause-initial wh-constituent. Thus, we have the following contrast:

(1)

        

(2)

        

These data raise the following question:

(3) Why do non-finite relatives allow only a PP in this position?

In a detailed HPSG discussion of relative clauses, Sag (1997) proposes that non-finite relatives are instances of a phrase type whose non-head daughter is required to be a PP. Thus, HPSG gives the following answer to (3):

(4) Because the relevant phrase type allows only a PP as a non-head daughter.

Minimalism assumes just three main syntactic mechanisms: Merge, Agree, and Move. Since these mechanisms are completely general, the properties of phrases must, for Minimalism, be a consequence of the feature makeup of their heads. In the case of relative clauses, the head is a complementizer that is phonologically empty when there is an overt filler. Thus, Minimalism must give the following answer:

(5) Because the relevant phonologically empty complementizer allows only a PP as its specifier.

These are different answers but there is absolutely no reason to think that one offers a description and the other an explanation.5

A further aspect of the rhetoric is the suggestion that a transformational approach is “a kind of conceptual necessity, given the undeniable existence of the displacement phenomena” (Chomsky 2001: 8–9, n. 29). Clearly, if transformational grammar were conceptually necessary, a book on non-transformational approaches would make no sense. It would be akin to a book on round squares. In fact, however, transformational processes only appear necessary because of two quite dubious assumptions. The first is that sentences have multiple levels of syntactic structure, which arises from the assumption that different types of information, such as constituent structure, grammatical relations, and semantic relations, are all represented in the same way, namely as a binary branching tree structure. As Culicover and Jackendoff (2005) show, it is only certain unargued “uniformity” assumptions that necessitate this view of syntactic structure. The second assumption is that grammars are sets of procedures. As Postal (2003b) shows, this is not at all necessary. Instead grammars can be viewed as sets of constraints. All the approaches that are the focus of the present volume reject the first of the assumptions that lead to transformational operations, and all except CG reject the second as well. Hence, they have no need for transformational operations.

There are many other features of Minimalism that lead to the skepticism of outsiders. One is the absence of the kind of detailed and precise analyses that one would expect within generative grammar. There is a sharp contrast here with the approaches represented in this book. It is not uncommon in HPSG work in particular to find appendices setting out formal analyses. Ginzburg and Sag (2000) has a 50-page appendix and Sag et al. (2003) a 34-page appendix. Such appendices are unheard of in Minimalist work. It is also common for these approaches to be utilized in computational work. In HPSG, the LKB (Linguistic Knowledge Builder) grammar engineering system (Copestake 2002) has allowed the development of broad-coverage computational grammars. Perhaps the most notable is the LinGO English Resource Grammar (ERG) developed at Stanford. Within the projects ParGram and ParSem, LFG is used to produce computational grammars of a wide range of languages.6 There is little computational work drawing on Minimalism. Curiously, though, the word “computational” is used extensively in Minimalism.

An important example of the lack of precision is the lexicon. As indicated above, the features of lexical items, especially those of phonologically empty functional heads, are of fundamental importance for Minimalism in that they are the main source of syntactic variation. One might think, then, that the nature of the lexicon would be a central concern. Surprisingly, however, it seems to have received very little attention. Newmeyer (2005: 95, fn. 9) comments that “in no framework ever proposed by Chomsky has the lexicon been as important as it is in the MP. Yet in no framework proposed by Chomsky have the properties of the lexicon been as poorly investigated.” This is in contrast to work within the theories on which this book focuses, where there are formal and explicit descriptions of the features of words and the role they play in the construction of phrases (for a discussion, see Blevins, this volume).

Connected to features and the lexicon is the role of morphology. The features that are crucial to syntactic movement are also in many cases responsible for the shape of the word, that is, they are morphological features. In spite of this central role, morphology has received little attention within any version of transformational grammar. Essentially the assumption has been that words are constructed in much the same way as phrases, and morphological phenomena that cannot be accounted for under this assumption, such as suppletion or defectiveness, have largely been ignored. Since the early 1990s, an approach to morphology has been developed within the general assumptions of the Minimalist Program that takes such phenomena more seriously: Distributed Morphology (DM; Halle & Marantz 1993). DM very explicitly rejects the Lexicalist Hypothesis, which essentially assumes a distinction and separation between morphological and syntactic processes. As in previous versions of transformational grammar, DM instead assumes that the processes that are traditionally associated with the lexicon and with morphology are distributed over other components. The theories included in this volume assume some version of the Lexicalist Hypothesis and are more natural associates of approaches to morphology such as A-morphous Morphology (Anderson 1992) or Paradigm Function Morphology (Stump 2001). For a thorough discussion of versions of the Lexicalist Hypothesis and its role particularly within LFG and HPSG, see Ackerman et al. (this volume).

A further dubious feature of the transformational tradition is the tendency to treat speculative ideas as if they were firmly established facts. A typical example is the idea that language variety is the product of a relatively small set of innate parameters. This is standardly presented as a well-established result. Thus, for example, Boeckx (2006: 59) writes that “grammarians came to the conclusion [in the 1980s] that something like a P&P [Principles and Parameters] account of the language faculty was essentially correct.” More recently, however, Boeckx (forthcoming) concedes that “empirically the expectations of the traditional Principles and Parameters model have not been met. GB theorists expected a few points of variations each with lots of automatic repercussions throughout the grammar of individual languages (‘macro-parameters’), but they found numerous, ever more fine-grained, independent micro-parameters.” We would agree with Newmeyer (2006: 9) when he writes that “After a quarter-century of its well-documented failures and retreats, one is forced to conclude that the parametric program … is little more than an exercise in wishful thinking.”

The approaches dealt with in this book differ in various ways, but it is important not to exaggerate the differences. Similarities tend to be obscured by differences of notation. However, it is possible to represent the analyses of the various frameworks in other notations. For example, LFG or CG ideas can be represented in HPSG notation. Ackerman and Webelhuth (1998) might be seen as a version of LFG in HPSG notation. The general point is demonstrated in Shieber (1986).

All the approaches considered here have simpler and more concrete syntactic structures than transformational grammar in its various manifestations. They all reject movement processes and do not have multiple levels of structure.7 They also make little or no use of the empty elements that have been central to transformational work since 1980. One consequence of this is that these approaches fit more easily into a model of linguistic performance than do transformational approaches. This point is developed in the present volume by Sag and Wasow.

Of course these approaches also have their limitations. An important one is that they have been largely concerned with synchronic syntax and semantics and related computational work. There has been very little diachronic work (though see, e.g., Butt & King 2001) and also very little work on acquisition. In both areas mainstream Chomskyan work has been largely unchallenged. Two chapters in the present volume, Green’s chapter 11 and Villavicencio’s chapter 12, consider acquisition from a non-transformational perspective. We hope they may provide a stimulus to further work.

In the preceding paragraphs we have outlined some of the background to the chapters that follow. We have said something about the origins of the approaches that are presented here. We have also drawn attention to the weaknesses of the Chomskyan approach. We do of course recognize the crucial positive impact the work of Chomsky and other transformationalists has had on the development of formal linguistics and the way it is viewed by those outside the field. However, we are concerned about the dominance that the transformational approaches have enjoyed over the last fifty years or so. It seems to us that the weaknesses raised in this introduction suggest that alternatives merit very serious consideration. This book is intended as a contribution to the accessibility of some of the alternative approaches.

Notes

1 The term “generative” (or “generative-enumerative”) is sometimes used, e.g. by Pullum and Scholz (2001), to refer to procedural approaches to grammar and not declarative (or model-theoretic) approaches. We prefer the broader usage.

2 A slightly earlier alternative to transformational grammar was Systemic Grammar, as presented in Hudson (1971). A non-transformational version of generative grammar was also sketched in Harman (1963).

3 The exact nature of the mapping relations has changed as the theory has developed.

4 See Postal (2003a); Ackerman & Webelhuth (1998); Culicover & Jackendoff (2005); Pinker & Jackendoff (2005); Newmeyer (2003, 2005).

5 Postal (2003a: 5) argues that the Minimalist rhetoric about explanation and description displays a “thinly disguised contempt for getting the facts right” and involves “the fantastic and unsupported notion that descriptive success is not really that hard and so not of much importance.” Many outsiders would agree.

6 See the projects’ web page at www2.parc.com/isl/groups/nltt/pargram.

7 Rejecting movement processes does not automatically lead to simple and concrete syntactic structures. Relational Grammar and its relative Arc Pair Grammar reject movement processes but assume a variety of relation-changing processes and have structures similar in complexity to those of transformational grammar.

References

Ackerman, Farrell and Gert Webelhuth. 1998. A Theory of Predicates. Stanford: CSLI.

Adger, David. 2003. Core Syntax: A Minimalist Approach. Oxford: Oxford University Press.

Ajdukiewicz, Kazimierz. 1935. Die syntaktische Konnexität. Studia Philosophica 1: 1–27. English trans. in Storrs McCall (ed.). 1967. Polish Logic: 1920–1939. Oxford: Oxford University Press, 207–31.

Anderson, Stephen R. 1992. A-Morphous Morphology. Cambridge: Cambridge University Press.

Bar-Hillel, Yehoshua. 1953. A quasi-arithmetical notation for syntactic description. Language 29: 47–58.

Blake, Barry J. 1990. Relational Grammar. London: Routledge.

Boeckx, Cedric. 2006. Linguistic Minimalism: Origins, Concepts, Methods, and Aims. Oxford: Oxford University Press.

Boeckx, Cedric. Forthcoming. Approaching parameters from below. In Anna Maria Di Sciullo & Cedric Boeckx (eds.), Biolinguistic Approaches to Language Evolution and Variation. Oxford: Oxford University Press.

Bresnan, Joan. 1978. A realistic transformational grammar. In Morris Halle, Joan Bresnan, & George A. Miller (eds.), Linguistic Theory and Psychological Reality. Cambridge, MA: MIT Press, 1–59.

Butt, Miriam & Tracy Holloway King (eds.). 2001. Time over Matter: Diachronic Perspectives on Morphosyntax. Stanford: CSLI.

Cann, Ronnie, Ruth Kempson & Lutz Marten. 2005. The Dynamics of Language: An Introduction. Amsterdam: Elsevier.

Chomsky, Noam. 1957. Syntactic Structures. The Hague: Mouton.

Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.

Chomsky, Noam. 1995a. The Minimalist Program. Cambridge, MA: MIT Press.

Chomsky, Noam. 1995b. Language and nature. Mind 104: 1–61.

Chomsky, Noam. 2000. Minimalist inquiries: the framework. In Robert Martin, David Michaels, & Juan Uriagereka (eds.), Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik. Cambridge, MA: MIT Press, 89–155.

Chomsky, Noam. 2001. Beyond explanatory adequacy. MIT Occasional Papers in Linguistics 20. Cambridge, MA: MIT.

Chomsky, Noam. 2002. On Nature and Language. Cambridge: Cambridge University Press.

Copestake, Anne. 2002. Implementing Typed Feature Structure Grammars. Stanford: CSLI.

Culicover, Peter & Ray Jackendoff. 2005. Simpler Syntax. New York: Oxford University Press.

Dik, Simon. 1978. Functional Grammar. London: Academic Press.

Dik, Simon (ed.). 1983. Advances in Functional Grammar. Dordrecht: Foris.

Gazdar, Gerald, Ewan Klein, Geoffrey K. Pullum, and Ivan A. Sag. 1985. Generalized Phrase Structure Grammar. Oxford: Blackwell.

Ginzburg, Jonathan and Ivan A. Sag. 2000. Interrogative Investigations: The Form, Meaning and Use of English Interrogatives. Stanford: CSLI.

Halle, Morris & Alec Marantz. 1993. Distributed morphology and the pieces of inflection. In Ken Hale & Samuel J. Keyser (eds.), The View from Building 20: Essays in Honor of Sylvain Bromberger. Cambridge, MA: MIT Press, 111–76.

Harman, Gilbert. 1963. Generative grammars without transformation rules: a defense of phrase structure. Language 39: 597–626.

Hudson, Richard. 1971. English Complex Sentences: An Introduction to Systemic Grammar. Amsterdam: North-Holland.

Lyons, John. 1968. Introduction to Theoretical Linguistics. Cambridge: Cambridge University Press.

Newmeyer, Frederick J. 1986. Linguistic Theory in America. 2nd edn. New York: Academic Press.

Newmeyer, Frederick J. 2003. Review article. Language 79: 583–600.

Newmeyer, Frederick J. 2005. Possible and Probable Languages. Oxford: Oxford University Press.

Newmeyer, Frederick J. 2006. A rejoinder to “On the role of parameters in Universal Grammar: a reply to Newmeyer” by Ian Roberts and Anders Holmberg. Available at http://ling.auf.net/lingBuzz.

Partee, Barbara. 1975. Montague grammar and transformational grammar. Linguistic Inquiry 6: 203–300.

Perlmutter, David M. (ed.). 1983. Studies in Relational Grammar 1. Chicago: University of Chicago Press.

Perlmutter, David M. & Carol G. Rosen (eds.). 1984. Studies in Relational Grammar 2. Chicago: University of Chicago Press.

Pinker, Steven & Ray Jackendoff. 2005. The faculty of language: what’s special about it? Cognition 95: 201–36.

Postal, Paul M. 2003a. Skeptical Linguistic Essays. Oxford: Oxford University Press.

Postal, Paul M. 2003b. (Virtually) conceptually necessary. Journal of Linguistics 39: 599–620.

Postal, Paul M. & Brian D. Joseph (eds.). 1990. Studies in Relational Grammar 3. Chicago: University of Chicago Press.

Pullum, Geoffrey K. and Barbara C. Scholz. 2001. On the distinction between model-theoretic and generative-enumerative syntactic frameworks. Paper presented at the Fourth Conference on Logical Aspects of Computational Linguistics, Le Croisic.

Radford, Andrew. 2004. Minimalist Syntax: Exploring the Structure of English. Cambridge: Cambridge University Press.

Sag, Ivan A. 1997. English relative clauses. Journal of Linguistics 33: 431–83.

Sag, Ivan A., Thomas Wasow & Emily M. Bender. 2003. Syntactic Theory. 2nd edn. Stanford: CSLI.

Shieber, Stuart. 1986. An Introduction to Unification-Based Approaches to Grammar. Stanford: CSLI.

Siewierska, Anna. 1991. Functional Grammar. London: Routledge.

Stump, Gregory T. 2001. Inflectional Morphology: A Theory of Paradigm Structure. Cambridge: Cambridge University Press.

Van Valin, Robert D. Jr (ed.). 1993. Advances in Role and Reference Grammar. Amsterdam: John Benjamins.

Van Valin, Robert D. Jr & Randy La Polla. 1997. Syntax: Structure, Meaning and Function. Cambridge: Cambridge University Press.

Vermaat, Willemijn. 2004. The Minimalist Move operation in a deductive perspective. Research on Language and Computation 2: 69–85.

Vermaat, Willemijn. 2005. The logic of variation: a cross-linguistic account of wh-question formation. PhD thesis, Utrecht University. Available at www.lotpublications.nl/publish/issues/Vermaat/index.html

1

Elementary Principles of Head-Driven Phrase Structure Grammar

Georgia M. Green

This work was supported in part by the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Some parts are reworked versions of material that appears in Green and Levine (1999). I am grateful to Ash Asudeh, Bob Borsley, Jong-Yul Cha, Bob Levine, Carl Pollard, and Ivan Sag for comments on earlier versions, and useful advice about consistency and clarity in describing a theory that (like all theories, and maybe all organisms) evolves piecemeal, a few systems at a time.

1.1 Introduction

This chapter describes the theoretical foundations and descriptive mechanisms of Head-Driven Phrase Structure Grammar (HPSG), as well as proposed treatments for a number of familiar grammatical phenomena. The anticipated reader has some familiarity with syntactic phenomena and the function of a theory of syntax, but not necessarily any expertise with modern theories of phrase structure grammar. The goal of this chapter is not so much to provide a tutorial in some consistent (and inevitably dated) version of HPSG as to explicate the philosophy and techniques of HPSG grammars, and to familiarize readers with the foundations and techniques of HPSG accounts of grammatical phenomena so that they can access the primary literature.

In my opinion, the best means to fully understanding this approach, and to being able to write and read HPSG grammars, is to build an HPSG grammar from scratch, inventing and revising the details as one goes along, in accordance with the constraints imposed by the formal model (but not necessarily by every constraint ever proposed in the language of that model).

This chapter assumes the reader is curious about HPSG, perhaps attracted by claims that it aims for psychological plausibility, or that it is computationally tractable and adaptable for computational implementations in both research and practical applications, or perhaps merely interested in seeing how HPSG accounts for the properties of natural languages that any adequate theory of natural language must account for. I have sought to provide an in-depth introduction to the guiding principles and the nuts and bolts, as well as to the notation, and to forgo the hard sell. Section 1.2 describes the character of HPSG grammars, and the elements and axioms of the system. Section 1.3 describes how linguistic entities are modeled, and how grammars describe the modeled entities. Section 1.4 describes the ontology of feature structure descriptions in HPSG, and section 1.5 deals with the expression of constraints, especially those involving the notion “same” or “matching.” Section 1.6 discusses issues relating to selection, including the treatment of agreement. Section 1.7 describes the compositional treatment of semantics in HPSG. Section 1.8 discusses the representation of constituent structure, and section 1.9 addresses the treatment of the order of elements within constituents. HPSG is very much a lexicon-driven theory, and section 1.10 describes the organization of the lexicon, relations among lexical items, and the nature of lexical rules relating them. Section 1.11 describes treatments of complementation, including the treatment of Equi and Raising constructions, and their interaction with expletive noun phrases. Section 1.12 describes variations on the treatment of so-called extraction constructions and other unbounded dependencies (e.g. pied piping), with some attention to multiple extractions and so-called parasitic gaps, as well as the nature of alleged empty categories like traces and zero pronouns. It concludes with a discussion of constraints on where extraction gaps can occur. Section 1.13 describes the HPSG account of the binding of pronouns and anaphors, and the final section indicates further directions. Two appendices summarize salient aspects of the sort inheritance hierarchies discussed, and the constraints embedded within them.

1.2 Grammars, Types, and Constraints

Two assumptions underlie the theory defining HPSGs. The first is that languages are systems of sorts of linguistic objects at a variety of levels of abstraction, not just collections of sentence(-type)s. Thus, the goal of the theory is to be able to define the grammars (or I-languages) that generate the sets of linguistic expressions (e.g. English You’ve got mail, seeks a unicorn, the, and so forth) that represent the set of natural human languages, assigning empirically satisfactory structural descriptions and semantic interpretations, in a way that is responsive to what is known about human sentence processing.1 The other is that grammars are best represented as process-neutral systems of declarative constraints (as opposed to constraints defined in terms of operations on objects, as in transformational grammar). Thus, a grammar (and for that matter, a theory of Universal Grammar) is seen as consisting of an inheritance hierarchy of sorts (an is-a hierarchy), with constraints of various kinds on the sorts of linguistic object in the hierarchy. More exactly, it is a multiple-inheritance hierarchy, which simply means that sorts can inherit properties from more than one “parent.”

A simple sort hierarchy can be represented as a taxonomic tree representing the sort to which belong all the linguistic entities with which the grammar deals. For each local tree in the hierarchy, the sort names that label the daughter nodes partition the sort that labels the mother; that is, they are necessarily disjoint subsorts that exhaust the sort of the mother. For example, subsorts of the sort head can be “parts of speech” (not words!) of various kinds. (Words have phonological and morphological properties, but parts of speech are abstractions, and do not.) Some of the subsorts of part-of-speech are further partitioned, as illustrated in (1).

(1)A partial inheritance hierarchy for “parts of speech”:

        

A multiple-inheritance hierarchy is an interlocking set of simple hierarchies, each representing a dimension of analysis that intersects with other dimensions. The need for this sort of cross-classifying inheritance has long been obvious in the case of the lexicon: verbs have to be classified by the number and syntactic characteristics of the arguments they require, but they may also need to be classified according to inflectional class (conjugation), by semantic properties of the relations they describe (e.g. whether they represent states or properties or events, whether their subjects represent agents or experiencers, and so on (Green 1974; Levin 1993)), and of course by mood, voice, tense, and the person and number of their subjects. But comprehensive and detailed analyses of many phrasal constructions also demand the variety of perspectives that multiple inheritance reflects, as exemplified in work on inverted clauses (Green and Morgan 1996) and relative clauses (Sag 1997).
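
To make the cross-classifying character of such hierarchies concrete, the following Python fragment (not part of the chapter; all class and attribute names are invented for illustration) sketches two dimensions of verb classification, with maximal sorts inheriting constraints from one sort in each dimension.

# Minimal sketch of a multiple-inheritance sort hierarchy (illustrative names only).
# Each "dimension" of classification is a separate class hierarchy; a maximal
# lexical class inherits constraints from one sort in each dimension.

class Verb:
    """Top of the (toy) verb hierarchy."""

# Dimension 1: valence (number of required complements)
class Intransitive(Verb):
    comps = 0          # no complements required

class Transitive(Verb):
    comps = 1          # one NP complement required

# Dimension 2: semantic class of the described relation
class StateVerb(Verb):
    aktionsart = "state"

class EventVerb(Verb):
    aktionsart = "event"

# Cross-classified maximal sorts inherit from one sort in each dimension.
class Sleep(Intransitive, EventVerb):
    pass

class Know(Transitive, StateVerb):
    pass

print(Sleep.comps, Sleep.aktionsart)   # 0 event
print(Know.comps, Know.aktionsart)     # 1 state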

A grammar is thus a system of constraints, both unique and inherited, on sorts of linguistic objects. It would be naive to assume that all grammars have the same sorts or the same constraints on whatever sorts they might have in common. Nevertheless, all grammars are hierarchies of sorts of phrases and words and the abstract linguistic entities that need to be invoked to define them. As detailed in appendices A and B, there are quite a few intermediate-level linguistic objects like part-of-speech that have subsorts, some of which have subsorts of their own (e.g. index, synsem, person, number, gender). All grammars constrain these various sorts in terms of properties of their component parts. One may speculate that grammars of natural languages are as alike as they are, and thus exhibit the limited range of variation that makes them learnable, because there are only a small number of economical solutions to the problems posed by competing forces generally present in languages. For example, languages with free word order enable subtle (non-truth-conditional) distinctions to be expressed by variation in phrase order, while languages with fixed word order simplify the task of parsing by limiting the possibilities for subsequent phrases. An elaborate inflectional system reduces ambiguity (especially temporary ambiguity), while relatively uninflected languages simplify the choices that have to be made in speech production. At the same time, whatever psychological properties and processes guide the incremental learning about the world that is universal among human beings in their first years of life must contribute to constraining grammars to be systems of information that can be learned incrementally.2

Sorts can be atomic (unanalyzed) like acc, fem, +, and sg, or they can be complex. Complex sorts of linguistic objects are defined in terms of the attributes they have (represented as features), and by the value-types of those features. In HPSG, a feature’s value may be defined to be one of four possible types:

an atomic sort (like +, or finite);
a feature structure of a particular sort;
a set of feature structures;3
a list of feature structures.4

If a value is not specified in a feature structure description, the value is still constrained by the sort-declarations to be one of the possible values for that feature. That is, it amounts to specifying a disjunction of the possible values. Thus, if the possible values for the feature NUM are the atomic sorts sg and pl, then specifying either NP[NUM] or NP amounts to specifying NP[NUM sg ∨ pl], and similarly for all the other possible attributes of NPs (i.e. all the features they can have).
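
The point that an unspecified value is implicitly a disjunction over the declared possibilities can be sketched as follows; the table of appropriate values and the feature names are illustrative assumptions, not the chapter's own notation.

# Sketch: an unspecified feature value amounts to a disjunction of the values the
# sort declaration allows for that feature (feature and value names are invented).

APPROPRIATE_VALUES = {"NUM": {"sg", "pl"}, "PER": {"1st", "2nd", "3rd"}}

def possible_values(description, feature):
    """Return the set of values the description allows for a feature."""
    if feature in description:                 # value explicitly specified
        return {description[feature]}
    return APPROPRIATE_VALUES[feature]         # unspecified: any appropriate value

np_underspecified = {}                 # "NP": says nothing about NUM
np_singular = {"NUM": "sg"}            # "NP[NUM sg]"

print(possible_values(np_underspecified, "NUM"))   # {'sg', 'pl'} (order may vary), i.e. sg ∨ pl
print(possible_values(np_singular, "NUM"))         # {'sg'}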

Sort declarations are expressed in formulae of a logic for linguistic representations (King 1989; Pollard 1999), and can be perspicuously abbreviated in labeled attribute–value matrices (AVMs) as in (2), where F1, …, Fn are feature names and sort1, …, sortn are sort names.

(2)

        

Sort definitions thus specify what attributes an instance of the sort has, and what kinds of things the values of those attributes can be, and sometimes what particular value an attribute must have (either absolutely, or relative to the value of some other attribute5). Sorts inherit all of the attributes of their supersorts and all of the restrictions on the values of those attributes. The set of feature structures defined by a grammar is partially ordered by subsumption, a transitive, reflexive, and anti-symmetric relation. (A description X is said to subsume a description Y if all of the objects described by Y are also described by X; under this ordering, some (relevant) classes of objects include other (relevant) classes.) Thus, linguistic expressions, or signs, are words or phrases, and this is reflected in the fact that the sort sign subsumes both phrase and word and no other sort. In fact, since the specifications for phrase and word are mutually exclusive (phrases have attributes that specify their immediate constituents, and words don’t), the sorts phrase and word partition the sort sign. Sorts that have no subsorts are termed “maximal sorts” because they are maximally informative or specific.
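
As a rough illustration of subsumption, here is a minimal sketch using flat Python dictionaries in place of typed feature structure descriptions; it checks only atomic values, which is enough to show that less specific descriptions subsume more specific ones.

# Sketch: subsumption over flat attribute-value descriptions.
# X subsumes Y if everything X specifies is specified compatibly in Y,
# so X describes a superset of the objects Y describes.

def subsumes(x, y):
    return all(feat in y and y[feat] == val for feat, val in x.items())

np     = {"HEAD": "noun"}                               # "NP"
np_sg  = {"HEAD": "noun", "NUM": "sg"}                  # "NP[NUM sg]"
np_3sg = {"HEAD": "noun", "NUM": "sg", "PER": "3rd"}    # "NP[NUM sg, PER 3rd]"

print(subsumes(np, np_sg))      # True:  NP subsumes NP[NUM sg]
print(subsumes(np_sg, np))      # False: the more specific description does not subsume the less specific one
print(subsumes(np_sg, np_3sg))  # True:  subsumption is transitive down the ordering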

1.3 Feature Structures and Feature Structure Descriptions

All linguistic entities (including both expression types and the abstract objects that are invoked to describe them – indices, categories, cases, synsems, locals, and so on) are modeled in HPSG as feature structures.6 A feature structure is a complete specification of all the properties of the object it models.

To keep the distinction clear between a feature structure, which models a maximal sort, and the feature structure descriptions that are used to describe grammatically relevant classes of feature structures in the generalizations that constitute the statements of the grammar, feature structures themselves are represented as directed graphs. A partial feature structure for a simplified account of the English verb phrase sleeps is given in (3); for simplicity’s sake, the directed acyclic graph (DAG) for the non-empty-synsem-list that is the value of the two SUBJ features is not represented.

(3)

        

The feature structure in (3) reflects the following information: the phrase in question has syntactic and semantic properties represented by the feature SYNSEM, as well as the property of having a head daughter (HEAD-DTR) but no other subconstituents; its NON-HD-DTRS (non-head daughters) attribute is an empty list. Its “part of speech” (HEAD) value is of subsort v, has finite inflectional form, and agrees with something whose agreement (AGR) value is 3rd person and singular, and its head daughter’s part of speech (HEAD) value is exactly the same. In addition, the phrase subcategorizes for (i.e. requires) a subject, but no complements, and the phrase it subcategorizes for has precisely the syntactic and semantic properties of the phrase its head daughter subcategorizes for.

As is clear from this example, the directed graphs that represent feature structures differ from the directed graphs conventionally used to represent constituent structure, in that distinct nodes can be the source for paths to (i.e. can “dominate”) a single node. This situation, as indicated by the convergence of the arrows in (3), represents the fact that the part-of-speech (of subtype v) of the head daughter is the same feature structure as the part-of-speech of the phrase itself.
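
The token-identity that re-entrancy encodes can be mimicked with Python object identity. In this sketch (attribute names loosely follow (3) and are otherwise invented), the phrase's HEAD value and its head daughter's HEAD value are literally one object, not two equal copies.

# Sketch: structure-sharing (re-entrancy) as token-identity of values.
# Two paths in the same structure lead to the very same object, not to equal copies.

head_value = {"pos": "v", "VFORM": "finite"}         # a single "part of speech" object

phrase = {
    "HEAD": head_value,                              # the phrase's HEAD value ...
    "HEAD-DTR": {"HEAD": head_value},                # ... is the SAME object as its head daughter's
}

print(phrase["HEAD"] is phrase["HEAD-DTR"]["HEAD"])  # True: token-identity (structure-sharing)

copy = dict(head_value)                              # an equal but distinct object
print(copy == head_value, copy is head_value)        # True False: type-identical, not token-identical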

Graphic representations of feature structures (like (3)) are awkward both to display and to read, so descriptions of feature structures in the form of AVMs are commonly used instead. Attribute or feature names are typically written in small capitals in AVMs, and values are written to the right of the feature name, in lower case italics if they are atomic, as in (4).

(4)

        

Feature structures are the entities constrained by the grammar. It is crucially important to distinguish between feature structures (fully specified objects that model linguistic expressions) and feature structure descriptions, representations (usually underspecified) that (partially) describe feature structures, and that feature structures allowed by a grammar must satisfy. Feature structure descriptions characterize classes of objects. For example, the NP she could be represented by a fully specified feature structure (representable as a directed graph), but “NP” is (an abbreviation for) a feature structure description, and could not be so represented. Put another way, a partial description such as a feature structure description represented by an AVM is a constraint on members of a class of feature structures, while a total description is a constraint that limits the class to a single member. For the most part, grammar specification deals with generalizations over classes of words and phrases, and therefore with (partial) feature structure descriptions.

1.4 Signs and Their Attributes

HPSGs describe languages in terms of the constraints on linguistic expressions (signs) of various types. Signs are, as in the Saussurean model, associations of form and meaning, and have two basic subsorts: phrases, which have immediate constituents; and words, which don’t. Signs are abstractions, of course; an act of uttering a linguistic expression that is modeled by a particular sign amounts to intentionally producing a sound, gesture, or graphical object that satisfies the phonological constraints on that sign, with the intent that the product of that act be understood as intended to have syntactic, semantic, and contextual properties that are modeled by the respective attributes of that sign. For more on the nature of this modeling, see Pollard and Sag (1994: 6–10, 58).

Signs have phonological, syntactico-semantic, and contextual properties, each represented by the value of a corresponding feature. Thus all signs have PHON and SYNSEM attributes, recording their phonological and syntactico-semantic structures, respectively. PHON values are usually represented in standard orthography, solely for the sake of convenience and readability. The value of the SYNSEM attribute is a feature structure that represents the constellation of properties that can be grammatically selected for. It has a LOC(AL) attribute, whose value (of type local) has CAT(EGORY), CONT(ENT), and CONTEXT attributes. Local values are what is shared by filler and gap in so-called extraction constructions like whom you trust and Those, I watch. The SYNSEM value also has a NONLOC(AL) attribute, which in effect encodes information about all types of unbounded dependency constructions (UDCs), including “missing” wh-marked subconstituents. (For the analysis of unbounded dependency constructions, see section 1.12.)

The CATEGORY attribute takes as its value an entity of the sort category, whose attribute HEAD has a part-of-speech as its value and whose SUBJ, COMPS, and SPR attributes have as their values lists of synsems representing the subcategorized-for arguments. The valence attributes of a sign (SUBJ, SPR, COMPS) record the subject, specifier, and complements that the sign subcategorizes for. These attributes take lists of synsems as their values; in S’s and referential NPs, all the lists are empty lists. These type declarations, and others discussed in this chapter, are summarized in appendix A. The partition of part of speech in (1) classifies objects having a HEAD attribute into nouny (N, N′, NP), verby, and so on.

In the case of words, categories also have an argument structure (ARG-ST) feature whose value is a list of the synsems of the sign’s arguments. As with the valence features SUBJ, COMPS, and SPR, the synsems in the list are ordered by the obliqueness of the grammatical relations they bear, and the ARG-ST list represents the obliqueness record that is invoked in constraining binding relations (cf. Pollard & Sag 1994: ch. 6, or Sag et al. 2003: ch. 7).7 Arguments are ordered from least oblique to most oblique on the ranking familiar since Keenan and Comrie (1977): subject < direct object < secondary object < oblique argument. In most cases, the ARG-ST list is the concatenation of the contents of the SUBJ, SPR, and COMPS lists, in that order. Exceptions are provided by the pro-synsems, which represent null pronouns in null-subject languages, and the gap-synsems, which represent “extracted” elements. Both appear in ARG-ST lists, but not in valence lists (see section 1.11 for discussion of the latter). The null subjects of controlled infinitives appear in both ARG-ST lists and SUBJ lists, though not in constituent structures (see section 1.11).
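
The default relation between the valence lists and ARG-ST can be sketched as a simple concatenation; the lexical entry below is invented for illustration and abbreviates synsems as strings.

# Sketch: in the default case, ARG-ST = SUBJ + SPR + COMPS (least to most oblique).

def default_arg_st(entry):
    return entry["SUBJ"] + entry["SPR"] + entry["COMPS"]

gives = {                                 # invented entry for the verb "gives"
    "PHON": "gives",
    "SUBJ":  ["NP[nom]"],                 # subject synsem
    "SPR":   [],                          # verbs select no specifier
    "COMPS": ["NP[acc]", "PP[to]"],       # direct object, then the more oblique PP
}

print(default_arg_st(gives))              # ['NP[nom]', 'NP[acc]', 'PP[to]']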

In Pollard and Sag (1994), the value of the CONTENT attribute is a nominal-object if the sign is a referring expression like them or the bagels they ate, but a parameterized state-of-affairs (or psoa) if it is a predicative expression like talks or angry or in, or a quantifier. (Psoa is not a Greek word for anything, but just a funny-looking name for a representation of propositional content as a feature structure). Psoas are subtyped by the relation they express, and have attributes for the roles of their arguments. Nominal-objects in that theory have index-valued INDEX attributes and psoa-set valued RESTR(ICTION) attributes. More current versions (e.g. Ginzburg and Sag 2000; Sag et al. 2003; Copestake et al. 2006) of the theory include an INDEX and a RESTR attribute for predicative as well as referring expressions,8 as illustrated in (5).

(5)


The details of this representation of propositional content do not reflect an essential property of HPSG. It would make no difference if some other kind of coherent representation of a semantic analysis were substituted, as long as it provided a way of indicating what properties can be predicated of which arguments, how arguments are linked to individuals in a model of the universe of discourse, and how the meaning of each constituent is a function of the meaning of its parts. In other words, the exact form of the representation is not crucial as long as it provides a compositional semantics.
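As a purely illustrative sketch of the kind of representation at issue (the encoding, and the role names EATER and EATEN, are assumptions made for the example, not the official notation), a nominal-object for the bagels and a psoa for ate might be rendered as follows:

bagels_index = {"PER": "3rd", "NUM": "pl"}
they_index = {"PER": "3rd", "NUM": "pl"}

bagels_content = {
    "INDEX": bagels_index,
    "RESTR": [{"RELN": "bagel", "INSTANCE": bagels_index}],
}

ate_content = {
    "RELN": "eat",          # the relation that subtypes this psoa
    "EATER": they_index,    # role attributes link the relation's arguments ...
    "EATEN": bagels_index,  # ... to indices in the model of the discourse
}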

Indices for expressions that denote individuals are of the subsort indiv-ind, while indices for expressions that denote properties or propositions (situations) are of the sort sit-ind.9 Individual-indices in turn have attributes for PER(SON), NUM(BER), and GEN(DER). For perspicuity, in abbreviated AVMs, index values are often represented as letter subscripts on category designations: NPi, for example. The CONTENT specification is abbreviated as a tag following a colon after a category designation; VP:[1], for example, refers to a VP with the CONTENT value [1].

Finally, the CONTEXT attribute records indexical information (in the values of the SPEAKER, ADDRESSEE, and LOCATION features), and is supposed to represent, in the value of its BACKGROUND attribute, linguistically relevant information that is generally considered pragmatic. For some discussion, see Green (1995, 2000).

1.5 Constraints and Structure-Sharing

As indicated in section 1.3, constraints on feature structures are expressed in terms of feature structure descriptions. The more underspecified a description is, the larger the class of objects that satisfy it, and the greater the generalization it expresses. Anything that is entailed in sort definitions (including lexical representations) or in universal or language-specific constraints10 does not have to be explicitly mentioned in the constraints on (i.e. descriptions of) classes of linguistic objects. For example, since the (presumably universal) HEAD Feature Principle requires that the HEAD value of the head daughter of a phrase be the same as the HEAD value of the phrase itself, the details of this value need to be indicated only once in each representation of a phrase.

The notion of the values of two attributes being the same is modeled in feature structures as the sharing of structure. This is represented in feature structures by means of distinct paths of arcs terminating at the same node, as illustrated in (3); for this reason, this property is sometimes referred to as re-entrancy. Structure-sharing is represented in descriptions by means of identical boxed integers (TAGS like [1]) prefixed to feature structure descriptions, denoting that they are constrained to describe the same structure, as illustrated above in (5c). Technically, a tag refers to a feature structure description that unifies all of the feature structure descriptions with the same tag. The unification of two feature structure descriptions is a consistent feature structure description that contains all of the information in each one. STRUCTURE-SHARING is a crucial concept in HPSG. Because it refers to token-identity, and not just type-matching, it does not have a direct counterpart in transformational theories. Structure-sharing amounts to the claim that the value of some instance of an attribute is the same feature structure as the value of some other instance of an attribute, that is, it is the SAME THING – not something that just shares significant properties with it, or a different thing that happens to have all the same properties.
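The following sketch, in the same illustrative Python encoding (simplified: no sorts, no cycles, and tags modeled only indirectly, by letting two attributes hold the very same object; the attribute names HEAD-AGR and SUBJ-AGR are ad hoc), shows what unification of descriptions and token-identity amount to:

def unify(a, b):
    """Return a description containing all the information in a and b,
    or None if they are inconsistent. (Simplified: no sorts, no cycles.)"""
    if not isinstance(a, dict) or not isinstance(b, dict):
        return a if a == b else None
    result = dict(a)
    for attr, val in b.items():
        if attr in result:
            sub = unify(result[attr], val)
            if sub is None:
                return None            # conflicting information
            result[attr] = sub
        else:
            result[attr] = val         # information present in only one description
    return result

print(unify({"PER": "3rd"}, {"NUM": "sg"}))   # {'PER': '3rd', 'NUM': 'sg'}
print(unify({"NUM": "sg"}, {"NUM": "pl"}))    # None: inconsistent descriptions

# Structure-sharing is token identity, not mere equality of separate copies:
shared = {"PER": "3rd", "NUM": "sg"}
phrase = {"HEAD-AGR": shared, "SUBJ-AGR": shared}       # the same object twice
lookalike = {"PER": "3rd", "NUM": "sg"}
assert phrase["HEAD-AGR"] is phrase["SUBJ-AGR"]         # structure-sharing
assert lookalike == shared and lookalike is not shared  # equal, but a different thing

The last two assertions make the contrast explicit: the two values in phrase are one and the same object, whereas lookalike merely happens to have all the same properties.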

Because of this, the three AVMs in (6) are equivalent descriptions of the same feature structure, which in this case is a representation of a third person singular noun phrase consisting of a single head noun like Joan or she. (In the following AVMs, sort annotations are omitted, and feature-name pathways like [A [B [C x]]] are represented as [A | B | C x], as is conventional. For perspicuity, sometimes values are labeled with the name (in italics) of the sort that structures their content, but such information is usually omitted when predictable. The attributes HEAD-DTR and NON-HD-DTRS organize information about the constituent structure of phrases. Their properties are described in subsequent paragraphs and elaborated in section 1.8.)

(6)


Because identically tagged descriptions unify, all three descriptions convey exactly the same information, and there is only one way to satisfy the token-identities in the three descriptions.

Structure-sharing is a key descriptive mechanism in HPSG. For example, the structure-sharing required by the description of a topicalization structure like whom they admired requires the SYNSEM | LOCAL value of the filler daughter (e.g. whom) to be the same as the single member of the SLASH value of the head daughter (in this case, they admired), as explained in section 1.12. The sort declaration in (7) constrains the properties of the mother and the daughters in a phrase of this type; it does the work of an I(mmediate)D(ominance)-rule in GPSG (an ID-schema in Pollard & Sag 1987, 1994), comparable to a phrase structure rule that does not impose linear order on the daughters.

(7)


A variety of restrictions generalize across subtypes of phrases. These are, in general, stated as constraints on the highest type in the phrase-type hierarchy to which they apply. Several depend on the notion of structure-sharing to constrain feature-value correspondences between sisters, or between mother and some daughter, for particular features. These include principles like the HEAD-Feature Principle (HFP, which constrains the HEAD value of a phrase to be the same as the HEAD value of its head daughter) and the Valence Principle (see section 1.6), as well as a principle that governs the projection of the unbounded dependency features (SLASH, REL, and QUE), such as the Nonlocal Feature Principle of Pollard and Sag (1994), or the SLASH Inheritance Principle and WH-Inheritance Principle of Sag (1997). Semantic compositionality principles that constrain the CONTENT value of a phrase to have a certain relation to the CONTENT values of its daughters, depending on what subtype of phrase it is, are specified in the sort declarations for high-level subsorts of phrase.

1.6 Selection and Agreement

An example of a principle that is represented as part of the description of a subsort of phrase is the Valence Principle (a reformulation of the Subcategorization Principle of Pollard and Sag 1994). It constrains subcategorization relations of every object of the sort headed-phrase so that the value of each valence feature corresponds to the respective valence value of its head daughter, minus elements that correspond to elements with the same SYNSEM values in the NON-HD-DTRS list for that phrase. In other words, the Valence Principle says that the SUBJ, COMPS, and SPR values of a phrase correspond to the respective SUBJ, COMPS, and SPR values of its head daughter except that the synsems on those lists that correspond to phrases constituting any non-head daughters are absent from the valence attributes of the mother. Thus, while wants in wants a lollipop has a singleton list containing a nouny synsem as the value of both its SUBJ and its COMPS features, wants a lollipop has such a value for its SUBJ attribute, but its COMPS value is the empty list, because it has an NP daughter. In versions of HPSG with defaults (such as Sag 1997 or Ginzburg & Sag 2000), the Valence Principle can be formulated as a constraint on headed phrases to the effect that the values of the valence features of a phrase are the same as those of its head daughter, except where specified to be different, as in (8).11

(8)


Valence features of the phrase would be specified to be different in sort declarations for particular headed-phrase types only where the synsems of the signs that are sisters to the head are absent from the appropriate valence feature on the phrase, as discussed in section 1.8.
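The effect of the Valence Principle on COMPS can be sketched as follows, again in the illustrative Python encoding, with the simplifying assumption that realized synsems are identified by structure-sharing (token identity):

def mother_comps(head_dtr_comps, non_hd_dtr_synsems):
    """COMPS of the phrase = head daughter's COMPS minus the synsems that are
    structure-shared with (token-identical to) realized non-head daughters."""
    return [ss for ss in head_dtr_comps
            if not any(ss is realized for realized in non_hd_dtr_synsems)]

np_synsem = {"HEAD": "noun"}                 # stands in for "a lollipop"
subj_synsem = {"HEAD": "noun", "NUM": "sg"}  # stands in for the subject

wants_comps = [np_synsem]                    # what "wants" still needs as a complement

# "wants a lollipop": the NP complement is realized, the subject is not.
vp_comps = mother_comps(wants_comps, [np_synsem])
assert vp_comps == []                        # COMPS of the VP is the empty list
vp_subj = [subj_synsem]                      # SUBJ is carried up unchanged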

Other aspects of selection also rely on the notion of structure-sharing. Adjuncts (e.g. attributive adjectives and temporal, locative, and manner adverbs) select heads via a HEAD feature MOD, and determiners select heads via a HEAD feature SPEC in very much the same way as heads select arguments by valence features. Structure-sharing is the essence of the HFP. The HFP is described in Pollard and Sag (1994) as an independent constraint, but is perhaps more perspicuously represented as part of the sort declaration for the sort headed-phrase, as shown in (9).

(9)

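What the HFP requires can be sketched in the same illustrative encoding, with structure-sharing once more modeled as token identity (the POS and VFORM labels are ad hoc stand-ins for real head values):

def satisfies_hfp(phrase):
    """A headed phrase satisfies the HFP iff its HEAD value is the very same
    object as the HEAD value of its head daughter."""
    return phrase["HEAD"] is phrase["HEAD-DTR"]["HEAD"]

verb_head = {"POS": "verb", "VFORM": "fin"}              # an ad hoc head value

vp = {
    "HEAD": verb_head,                                   # shared with ...
    "HEAD-DTR": {"PHON": ["wants"], "HEAD": verb_head},  # ... the head daughter
    "NON-HD-DTRS": [{"PHON": ["a", "lollipop"], "HEAD": {"POS": "noun"}}],
}

assert satisfies_hfp(vp)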

HPSG licenses phrase types either through Immediate Dominance Schemata (Pollard and Sag 1994), or through sort declarations for particular phrasal constructions (Sag 1997). Constituent-structure trees have no formal status in HPSG (as indeed they did not in the versions of transformational grammar that relied on rewriting rules), although immediate constituents are represented by the various daughters attributes of phrasal signs, and trees are used as a convenient graphic representation of the immediate constituents and linear order properties of phrasal signs. In informal arboreal representations like (10), nodes are labeled by analyzable category names in the form of AVMs (or very underspecified abbreviations for them, like NP or NP[3SG]), linear order is imposed, and branches may be annotated to indicate the relation (e.g. head, adjunct, complement) of daughter to mother. The AVMs are usually abbreviated, with predictable parts of paths suppressed as shown here.

Agreement in person, number, and gender has always (cf. Pollard & Sag 1987, 1994) been treated as a correspondence between the person, number, and gender properties of the INDEX of a nominal expression and the (inflected) form of a word that selects it. Thus, the valence features of a verb may require, depending on its morphological type (with implications for its phonological form), that a certain argument (e.g. the subject) have certain INDEX specifications (e.g. [PER 3rd, NUM sg]), or the MOD feature of an attributive adjective may require that the noun it modifies have certain number, gender, and/or case properties. Thus, wants is a 3rdsg-verb and the first synsem on its ARG-ST list includes the specification [INDEX [PER 3rd, NUM sg]], and the lexical specification of the word these includes [MOD [HEAD noun, NUM pl]]. Section 1.10 discusses how so-called lexical rules relate the PHON value and the agreement properties of selected elements. Empirically, case concord is independent of index agreement, and this is reflected in the architecture of HPSG: INDEX is an attribute of content, while CASE is a syntactic HEAD feature.
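A sketch of index agreement in the same illustrative encoding (the compatibility check below is a stand-in for unifiability of INDEX values; the lexical constraint shown for wants is the [PER 3rd, NUM sg] specification just mentioned):

def compatible(required, actual):
    """True iff every required feature-value pair is also present in actual
    (a stand-in for unifiability of two INDEX descriptions)."""
    return all(actual.get(attr) == val for attr, val in required.items())

wants_subject_index = {"PER": "3rd", "NUM": "sg"}  # imposed by the 3rdsg-verb "wants"

kim_index = {"PER": "3rd", "NUM": "sg", "GEND": "fem"}
they_index = {"PER": "3rd", "NUM": "pl"}

assert compatible(wants_subject_index, kim_index)       # "Kim wants a lollipop"
assert not compatible(wants_subject_index, they_index)  # * "They wants a lollipop"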

More recently, Kathol (1999) and Bender and Flickinger (1999) have found reasons why index agreement must be mediated by an index