Docs: What does the MRS "Index" identify?

I want to make sure I have my head around (so to speak) what the Index clause of an MRS is identifying. I suspect it is such a basic topic that it really doesn’t get defined anywhere in the wiki.

Cutting to the chase, I believe the following is correct (if not, please comment!):

The Index in an MRS contains the variable that was introduced by the predication that represents the syntactic head of the phrase. That predication will be what the phrase is “about”, or what the phrase is “built around”.
Notes:

  • It can be a conjunction like “and” in a phrase such as “he ran and cried”
  • It will not always be a predication that represents a word in the actual sentence, because the semantics sometimes require the ERG to insert something that was only implied (example needed)
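To make the definition concrete, here is a minimal sketch that finds the predication the Index points to. It uses a hand-written, simplified stand-in for an MRS: the `Predication` and `ToyMRS` classes, the variable names, and the role names are all made up for illustration and are not copied from real ERG output or from any library. The `_q` suffix check is an assumption, following the ERG convention for naming quantifier predicates.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Predication:
    predicate: str                      # e.g. "_run_v_1"
    arg0: str                           # the ARG0 variable of this predication
    args: Dict[str, str] = field(default_factory=dict)  # remaining roles


@dataclass
class ToyMRS:
    index: str                          # the variable named by the Index
    rels: List[Predication]


def index_predications(mrs: ToyMRS) -> List[Predication]:
    """Return the predications that introduce the Index variable.

    Quantifiers (names ending in "_q", an assumed naming convention) are
    skipped: they bind their ARG0 rather than introduce it.
    """
    return [
        p for p in mrs.rels
        if p.arg0 == mrs.index and not p.predicate.endswith("_q")
    ]


# Hypothetical, simplified analysis of "he ran and cried".
he_ran_and_cried = ToyMRS(
    index="e2",
    rels=[
        Predication("pron", "x3"),
        Predication("pronoun_q", "x3"),
        Predication("_run_v_1", "e9", {"ARG1": "x3"}),
        Predication("_and_c", "e2", {"ARG1": "e9", "ARG2": "e10"}),
        Predication("_cry_v_1", "e10", {"ARG1": "x3"}),
    ],
)

index_preds = [p.predicate for p in index_predications(he_ran_and_cried)]
print(index_preds)
```

In this toy analysis the Index variable is introduced by the conjunction, not by either verb, which matches the first note above.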

What I found that led me to that definition (besides my own experience using the system):

The one place I found a definition is:

INDEX always comes from the syntactic head (see comments at top of page about ambiguity in terminology in DMRS).
StanfordAlgebraExperiment

Digging around the Internet I found a few places that define “syntactic head” such as:

The core of every phrase is its head
– In the VP walk the pugs, the verb walk is the head
Syntax: The Sentence Patterns of Language

The ‘head’, the word around which the constituent is built, determines the grammatical properties of its constituent. In the example phrase ‘the Cheshire cat’, ‘cat’ is the word around which the phrase is built. It is the head of the phrase. Since cat is a noun, ‘the Cheshire cat’ is a noun phrase, or NP. A head can be one word such as ‘Harriet’.
Syntactic Constituency

I also found a couple of places that talk about the intricacies of Index:

I would add that the point of INDEX is to constrain composition. In brief: when two phrases are composed, the MRSs are combined (nothing is thrown away), and some parts can be identified with each other. One MRS acts as a “functor” and the other as an “argument”. The only parts of the argument that can be identified with something in the functor are the elements of the HOOK (INDEX, LTOP, XARG) and the SLASH list.

INDEX is also sometimes used for other purposes, but composition is why it’s defined in the first place.
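The composition idea described above can be sketched roughly as follows. This is only a toy model under stated assumptions: the `Hook` and `Phrase` classes, the single `slot` field, and the rule that the result keeps the functor's hook are simplifications invented for this sketch, not the actual algebra or grammar implementation. Only INDEX is identified here; LTOP, XARG, and SLASH are not modelled beyond being stored.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Rel = Tuple[str, Dict[str, str]]        # (predicate, {role: variable})


@dataclass
class Hook:
    index: str
    ltop: str
    xarg: Optional[str] = None


@dataclass
class Phrase:
    hook: Hook
    rels: List[Rel]
    slot: Optional[str] = None          # hypothetical: variable the functor wants filled


def compose(functor: Phrase, argument: Phrase) -> Phrase:
    """Combine two phrases, keeping every predication from both.

    The functor's slot variable is identified with the argument's INDEX by
    renaming it. Nothing inside the argument is touched except via its hook.
    """
    def rename(var: str) -> str:
        return argument.hook.index if var == functor.slot else var

    functor_rels = [
        (pred, {role: rename(var) for role, var in roles.items()})
        for pred, roles in functor.rels
    ]
    # Assumption for this sketch: the result exposes the functor's hook.
    return Phrase(hook=functor.hook, rels=functor_rels + list(argument.rels))


# Hypothetical fragments for "he ran".
ran = Phrase(
    hook=Hook(index="e2", ltop="h1"),
    rels=[("_run_v_1", {"ARG0": "e2", "ARG1": "u_slot"})],
    slot="u_slot",
)
he = Phrase(
    hook=Hook(index="x3", ltop="h4"),
    rels=[("pron", {"ARG0": "x3"})],
)

he_ran = compose(ran, he)
print(he_ran.hook.index, he_ran.rels)
```

The point of the sketch is that the functor can only "see" the argument through its hook, which is why the choice of INDEX constrains what later composition can do.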

I broadly agree with what you’ve written, but there’s a contradiction between “the predication that represents the syntactic head” and “not always… a predication representing a word”. I think the problem is that a single token can introduce multiple predicates, so “represent” is too vague.

Wonderful, that is a very helpful insight. I feel like I might need to pull a summary of HPSG parsing into the summary somewhere to help describe this. Understanding how the MRS is constructed (at least at a conceptual level) would give more depth to understanding and dissecting them.

Does the following address your top-level point about the vagueness?

The MRS will represent the syntactic head of the phrase with one or more predications. Index will point to the one that could (in principle) be used to further compose the phrase with other phrases [See HPSG Backgrounder]. In general, the index predication can be used to determine what the phrase is “about”, or what the phrase is “built around”.

Notes:

  • It can be a conjunction like “and” in a phrase such as “he ran and cried”
  • Since the syntactic head might be represented by multiple predications (some implied) the index predication will not always map directly to a word in the actual sentence

This sounds reasonable.

The problem is mapping the other way round, surely? Each predication either maps to a word (when introduced lexically) or to a phrase (when introduced by a phrase, e.g. compound). If there are any cases where the INDEX maps to a multiple-word phrase (I can’t think of any off the top of my head), that would presumably be some kind of non-headed phrase, contradicting the assumption that the phrase has a head.

Sorry, I meant that the index may reference a predication that doesn’t start with “_”, i.e. isn’t represented directly by a word in the sentence. But I see how this is confusing. Hopefully this makes it clearer?

  • The syntactic head might be represented by multiple predications, some of which could be abstract and not map 1-1 with a word in the actual sentence.
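As a small illustration of the "starts with “_”" test mentioned above, the sketch below labels predicate names as surface or abstract. The helper names are hypothetical, and the predicate names in the example are used purely for illustration rather than taken from a real parse.

```python
from typing import Dict, Iterable


def is_surface_predicate(predicate: str) -> bool:
    """True if the predicate name starts with "_", i.e. it comes from a word."""
    return predicate.startswith("_")


def classify_predicates(predicates: Iterable[str]) -> Dict[str, str]:
    """Label each predicate name as "surface" or "abstract"."""
    return {
        p: "surface" if is_surface_predicate(p) else "abstract"
        for p in predicates
    }


# Illustrative predicate names only.
labels = classify_predicates(["_and_c", "implicit_conj", "compound"])
print(labels)
```

If the predication that the Index points to is labelled "abstract" by a check like this, it will not map one-to-one to a word in the sentence.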

Yes, I think “not mapping one-to-one” makes it clear.

We find this in coordination in languages that don’t use separate coordinator words, for example.


I think the best place to start for this is still Copestake et al 2005:

Great, thank you!