Doubts on scopal combination and passing up of INDEX

I apologize if the following is trivial or has already been asked before, although I searched and did not find any post on this.
From here, we have this slide:


In MRS composition, “The hook of a phrase is the hook of the semantic head daughter, which is determined uniquely for each construction type.” (Copestake, 2005).

I have the following questions:

  1. Here I think “probably” should be the semantic head daughter of the phrase “probably sleeps”, because it supplies a slot to be filled by the other daughter, “sleeps”. However, the INDEX of “probably sleeps” is shown to be e_s, which is the INDEX of the non-head daughter. So, I guess I have misunderstood something here?
  2. What if the semantic head daughter has no INDEX? What would be the INDEX of the phrase?
  3. Furthermore, it seems that all non-quantifier EPs now have an intrinsic variable of type e or x, so it seems to me that INDEX could then be identified with the intrinsic variable of a scopal modifier, e.g. that of “probably”, which does not sound correct to me. How can one identify the EPs whose intrinsic variable can be used as INDEX, if not all are allowed?

I hope the questions are not too trivial, since I have gone through quite a number of readings and tutorials on MRS already. Any reading suggestions are welcome! Thanks in advance for the help!

There’s no need to apologise!

You’re right that probably is the semantic head. The answer to your puzzle is that its INDEX is identified with its argument’s INDEX. If you look at slide 144, you can see the re-entrancy labelled [e].

This re-entrancy is easy to miss, so I can understand your confusion! It’s also a subtle point, because the INDEX isn’t part of anything in the RELS list.

Thank you for the clarification! However, I am still not very sure about the passing up mechanism of INDEX.
For instance, when I experiment with the phrase “run fast” (as in “I run fast”) in ACE with LUI, the phrase has its INDEX identified with the INDEX of “run”, which is not the semantic head daughter. I also do not see a coindexation in the lexical entry of “fast”, similar to the one in “probably”, that would allow the INDEX of the argument of “fast” to become the phrase’s INDEX. What did I miss here?

There are two distinct but related questions here:
(1) what are the rules for semantic composition with MRS?
(2) how is this done in the ERG?

For (1), there is unfortunately no definitive up-to-date document. Copestake (2007) specifically discusses composition, but some details have changed. In particular, adjectives and adverbs (including “fast”) are now analysed as having their own event variables, even when acting as modifiers. This means that they always have an intrinsic variable (e.g. there is an event variable for “furry” in both “the furry dog” and “the dog is furry”). It also allows us to model modifiers being modified (e.g. “the surprisingly furry dog”).

However, this also means that the HOOK can’t be directly passed up. (This is discussed here.) When “fast” modifies “run”, “fast” is the functor and “run” is the argument. Each of them has an event variable, which is their own INDEX. The INDEX of the composed phrase comes from “run”. The difference between scopal modification (e.g. “probably”) and non-scopal modification (e.g. “fast”) is that only scopal modifiers provide a new LTOP.
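To make the contrast concrete, here is a minimal Python sketch of the two modification cases. It is only an illustration under simplifying assumptions: the `Hook` class, the `modify` function and the variable names (`h_run`, `e_fast`, and so on) are hypothetical and not part of any DELPH-IN tool, and handle constraints are ignored entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hook:
    """Hypothetical, stripped-down HOOK: just an LTOP label and an INDEX variable."""
    ltop: str
    index: str


def modify(functor: Hook, argument: Hook, scopal: bool) -> Hook:
    """Hook of a modifier (functor) combined with the phrase it modifies (argument).

    In both cases the INDEX comes from the argument.
    Only a scopal modifier provides a new LTOP. For a non-scopal modifier the
    two labels are identified, which this sketch represents by reusing the
    argument's label as the shared one.
    """
    ltop = functor.ltop if scopal else argument.ltop
    return Hook(ltop=ltop, index=argument.index)


# Each word has its own event variable as its INDEX.
run = Hook(ltop="h_run", index="e_run")
fast = Hook(ltop="h_fast", index="e_fast")
sleeps = Hook(ltop="h_sleeps", index="e_sleeps")
probably = Hook(ltop="h_probably", index="e_probably")

run_fast = modify(fast, run, scopal=False)
probably_sleeps = modify(probably, sleeps, scopal=True)

print(run_fast)
print(probably_sleeps)
```

In both results the INDEX is the event variable of the modified verb, and only “probably sleeps” gets its LTOP from the modifier.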

There has been some discussion about producing an updated semantic algebra (see also here), and I will try to point you to an updated document when I can. But as a brief answer to your question: the INDEX and LTOP will come from either the functor or the argument, depending on the construction.

For part (2) of your question, for the ERG v2018, the passing up of the HOOK is controlled by the syntactic rule, rather than the lexical entry. When “fast” modifies “run”, it uses the hd-aj_int-unsl_c rule, which is a subtype of phrase_or_lexrule and head_compositional. The phrase_or_lexrule type identifies SYNSEM.LOCAL.CONT.HOOK with C-CONT.HOOK, and the head_compositional type identifies C-CONT.HOOK with HD-DTR.SYNSEM.LOCAL.CONT.HOOK. So together, this means that the hook of the phrase (“run fast”) is the hook of the syntactic head daughter (“run”).
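The chain of identifications can be pictured with a small Python sketch, where “identified” is modelled as two paths holding the very same object. This is a toy model and not how the ERG or any processor is implemented: the `Hook` class, the function name and the flat dictionary of paths are assumptions made for illustration, and the `adjunct_hook` key is a hypothetical placeholder rather than a real feature path.

```python
class Hook:
    """Toy HOOK. Two paths are re-entrant when they hold the same Hook object."""

    def __init__(self, ltop, index):
        self.ltop = ltop
        self.index = index


def build_head_adjunct_phrase(head_dtr_hook, adjunct_hook):
    """Toy stand-in for a headed rule such as hd-aj_int-unsl_c."""
    phrase = {
        "HD-DTR.SYNSEM.LOCAL.CONT.HOOK": head_dtr_hook,
        "adjunct_hook": adjunct_hook,  # hypothetical key, for illustration only
    }
    # Constraint from head_compositional:
    # C-CONT.HOOK is identified with the head daughter's HOOK.
    phrase["C-CONT.HOOK"] = phrase["HD-DTR.SYNSEM.LOCAL.CONT.HOOK"]
    # Constraint from phrase_or_lexrule:
    # the phrase's own HOOK is identified with C-CONT.HOOK.
    phrase["SYNSEM.LOCAL.CONT.HOOK"] = phrase["C-CONT.HOOK"]
    return phrase


run_hook = Hook(ltop="h_run", index="e_run")
fast_hook = Hook(ltop="h_fast", index="e_fast")
run_fast = build_head_adjunct_phrase(run_hook, fast_hook)

# The phrase's HOOK is the same object as the head daughter's HOOK.
print(run_fast["SYNSEM.LOCAL.CONT.HOOK"] is run_hook)
print(run_fast["SYNSEM.LOCAL.CONT.HOOK"].index)
```

Nothing in the sketch looks inside the entry for “fast”: the two inherited constraints on the rule are enough to fix the phrase’s hook.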

Thanks for the informative reply! The links are on point. You demonstrated the passing up of the HOOK with an example from ERG v2018. As I am currently experimenting with ERG v1214, is the passing up of the HOOK implemented very differently in v2018 than in v1214? It would be nice to have a brief idea of what has changed or improved in v2018. Any URLs would be good, to save your time on answering. Do you suggest that I switch to v2018?

Besides, after some reading, I have the following “summary” of the mechanism of the passing up of HOOK. Forgive me for the imprecision in the terminology. Would you take a look and let me know if I got anything wrong?

First, consider the scenario where:

A word/phrase X and a word/phrase Y are unified with a syntactic rule R to give a phrase Z

From here,

INDEX always comes from the syntactic head. LTOP normally comes from the syntactic head, but comes from the modifier in scopal modifier examples (‘probably sleeps’) in the dmrscomp grammar and I expect there are other cases where this will be necessary.

Given a rule R, the following are uniquely determined:

  1. Whether the INDEX of Z comes from X or Y, whichever is the syntactic head as licensed by R

  2. If each of X and Y contains at least one EP (to guarantee semantic composition): whether the semantic composition is scopal (/H as in DMRS), or intersective (/EQ), or neither (/NEQ), and furthermore:

    • If intersective: LTOP of Z is equated with LTOP of X and Y
    • Else if scopal: LTOP of Z is equated with LTOP of the semantic head daughter, i.e. the scopal modifier
    • Else: LTOP of Z is equated with LTOP of either X or Y, whichever is the syntactic head as licensed by R

In other words, the mechanism of the passing up of INDEX and LTOP is encoded in R, and is independent of what is in X and Y, which is different from what is described in the slide of my first post. (Or does “probably” still encode such information, i.e. that its INDEX is identified with its argument’s INDEX, which would remain essential, together with R, for guiding the passing up of the HOOK’s information?)

I find it hard to locate relevant and concise documentation regarding my doubts, so I could only resort to this forum (luckily this exists!). I hope that the discussion is meaningful. Sorry for the lengthy post. Thank you for your patience!

I was referring to v2018 to be concrete, but what I said would also apply to v1214. The 2018 version is newer, so I would suggest using that if you can. For changes, see the README.

Your summary sounds correct, except that the rule R is not “independent” of X and Y. Each rule can only apply to certain kinds of phrases.

When comparing the slides you linked to and the ERG, remember that they don’t need to have exactly the same feature structures in order to perform the same semantic composition. This is why I said that there are two distinct but related questions here.

When comparing the slides you linked to and the ERG, remember that they don’t need to have exactly the same feature structures in order to perform the same semantic composition.

I guess by this you mean that different implementations of the feature structures could lead to the same unification result, and that the ERG encodes the information about passing up the HOOK only in the syntactic rule applied, R, whereas the slides I referred to choose to encode such information in the lexical entries X and Y for demonstration. Does that sound right? Otherwise, if both R and (X and/or Y) stored such information and contained coindexation at the HOOK, the passing-up mechanism would be unclear, and cycles might even occur, I guess?

As an aside, my earlier description seems to have been inaccurate in saying that a syntactic rule R determines only one type of semantic composition, as it should be possible for intersective, scopal, and neither-type combinations to occur at the same time during the application of R. For instance, n_n_cmpnd_phr introduces both intersective and neither-scopal-nor-intersective combinations, where the arguments of a newly introduced predicate compound_rel are identified with the variables in X and Y through both types of combination.

Yes that sounds right.

As for types of semantic composition, it may be helpful to think of a noun-noun compound as having two semantic composition steps (one step for each argument of compound_rel). These two steps can be bundled together, for efficiency.

At the recent Delph-in summit, I gave a talk on the DMRS algebra, which might be of interest.

Nice materials to catch up with the latest development! Thank you for the links!

Is there exactly one syntactic head daughter for each binary construction rule of ERG?

As described here, the form of such rule names is <DaughterSequence>_<Annotations>_c. I can comprehend rules such as sb-hd_mc_c and hd-cmp_u_c, where there’s hd within the rule name. For mrk-nh_cl_c, I am guessing that the nh of the right daughter means it’s a non-head, which implies that the left daughter is the syntactic head daughter? Furthermore, some rule names do not contain any information about the head, e.g. mrk-nh_cl_c and vp_rc-redrel_c.

I do not know much about HPSG and I am a bit confused here. Any help regarding this is much appreciated. Thanks!

Hi,

Not all rules are headed. You can see a table showing all the rules, whether they are headed, and if so whether they are unary, left-headed or right-headed, here: <https://lr.soh.ntu.edu.sg/~bond/cgi-bin/ERG_mo/rules.cgi>
Hover your mouse over the triangles for an explanation.

Yours,

I have experimented with some more sentences and rules and I am a bit confused. Two questions here:

  1. At #5, I wrote: given a rule R, the following are uniquely determined:

    1. Whether the INDEX of Z comes from X or Y, whichever is the syntactic head as licensed by R

    For headed rules, I guess it is clear how the INDEX of the phrase is determined. But what happens to the INDEX (and LTOP?) of a phrase that is built with a non-headed rule, in each of two cases: (a) a binary non-headed rule and (b) a unary non-headed rule? Is there a general principle for such cases?

  2. [image: tree]
    Moreover, for the sentence “I hit and run.”, mrk-nh_evnt_c is listed as a binary right-headed rule, so I would expect the INDEX of “run” to become the INDEX of the phrase “and run”, with the INDEX of “and” no longer accessible from then on. Contrary to my expectation, the INDEX of the whole sentence is the INDEX of “and”. I also notice that vp-vp_crd-fin-t_c is a non-headed rule, so I believe this is where the INDEX of “and” is somehow made accessible again?

It seems that the two-point summary I made in #5 was too general, and that there are still many exceptions that I am not aware of. Any guidance on these? Thank you!

Hi @guyemerson, do you have anything published about this work?

Besides, here at IBM, after some presentations that I gave showing MRSs from the WordNet glosses, people got very interested in MRS as a possible general semantic representation language to use with the neuro-symbolic tools we are implementing. I was asked to give a presentation about MRS and maybe the related representations like DMRS.

Does anyone have good directions and references that I should use to prepare my presentation?

For a non-headed rule, the rule defines the INDEX and LTOP. In a binary rule, there are three “input” MRSs: the first daughter, the second daughter, and the rule’s CCONT. For a unary rule, there are two: the daughter, and the rule’s CCONT. However, in terms of the composition algebra, we can think of the CCONT MRS in the same way as any other MRS. This might be easiest to explain with an example. I mentioned noun-noun compounds before, and perhaps I should explain this in more detail, comparing a noun-noun compound against a noun modified by a prepositional phrase:

(1) biology research
(2) research into biology

In (2), we have the predicates _research_n_1, _into_p, and _biology_n_1, each of which comes from the lexical entry of one word. (I’m ignoring the quantifiers, for simplicity.) We first compose into’s MRS with biology’s MRS. We then compose the resulting MRS with research’s MRS.

In (1), we have a compound predicate instead of _into_p, but otherwise the semantics is the same. The compound predicate comes from the phrase, rather than a lexical entry. However, we can see the composition in (1) in the same way as in (2). We first compose the phrase’s MRS with biology’s MRS. We then compose the resulting MRS with research’s MRS.

In the ERG, the two MRS composition steps in (1) are done in a single syntactic rule.
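As a rough illustration of the parallel between (1) and (2), here is a small Python sketch in which the functor predicate has two open argument slots that are filled one step at a time. It is a simplification under stated assumptions: the `EP` class and both helper functions are hypothetical, labels and quantifiers are left out, the variable names are made up, and which noun fills ARG1 and which fills ARG2 is assumed for the example.

```python
from dataclasses import dataclass


@dataclass
class EP:
    """Hypothetical elementary predication: a predicate name plus its arguments."""
    pred: str
    args: dict


def fill(ep: EP, role: str, variable: str) -> EP:
    """One composition step: fill a single open argument slot of the functor."""
    return EP(ep.pred, {**ep.args, role: variable})


def compose_two_steps(functor_pred: str, head_var: str, dependent_var: str) -> EP:
    """Compose a two-place functor with its two arguments, one step at a time."""
    functor = EP(functor_pred, {"ARG0": "e", "ARG1": None, "ARG2": None})
    step1 = fill(functor, "ARG2", dependent_var)  # functor + "biology"
    step2 = fill(step1, "ARG1", head_var)         # result + "research"
    return step2


# (2) "research into biology": the functor comes from the lexical entry of "into".
research_into_biology = compose_two_steps("_into_p", "x_research", "x_biology")
# (1) "biology research": the functor comes from the phrase (the rule's C-CONT).
biology_research = compose_two_steps("compound", "x_research", "x_biology")

print(research_into_biology)
print(biology_research)
```

The two results differ only in the predicate name, which is the sense in which the compound can be treated like any other functor, even though a single syntactic rule performs both steps at once.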

Coordination is complicated, because there are many types of coordination. I didn’t cover coordination in my talk at the last Delph-in summit, and as far as I’m aware, previous work on an MRS algebra didn’t cover coordination either. The short answer is that and’s MRS acts as the functor. It takes two arguments (in your example, hit and run). Semantic composition can proceed in the same way as we’ve seen in other examples.

I haven’t published anything on this, and Ann’s draft DMRS algebra wasn’t published either. My talk at the last summit is the most up-to-date thing to point to, I’m afraid! I think the links so far in this thread have already covered the main references.

Thanks for the detailed explanation! Then, strictly speaking, I should regard the summary as false, i.e. the INDEX does not necessarily come from the syntactic head daughter (even for binary rules without CCONT predicates).
Then, naturally, a follow-up question would be whether coordination is the exception (or one of the few exceptions) here, and whether most other rules still preserve the property that the syntactic head daughter provides the INDEX. I am currently researching surface realization from DMRS, where the ERG derivation recipes would provide the information regarding the semantic composition process of DMRS. Therefore, I am curious whether the headedness of the rules or some simple principles would characterize the propagation of the INDEX/LTOP, as such information about the availability of an index would help prune the search space. Otherwise, if such decisions are highly specific to each rule, then I really have to consult the ERG about them.

I think you can see coordination as an exception. In fact, at a discussion at the last summit, Dan Flickinger said that coordination is one of the hardest things to implement in a compositional grammar.

If you’re interested in using ERG derivations to help with realisation, a relevant paper to look at might be: Chen et al. (2018). They use derivation trees to guide semantic composition, but they also allow the trained model to deviate from the exact composition used in the ERG. Instead of the MRS algebra, they use hyperedge replacement grammar to formalise composition. (The AM algebra, which I’m using to formalise a new DMRS algebra, ultimately also uses hyperedge replacement grammar.)

I believe another exception, where the INDEX does not come from the syntactic head, is the inverted sentence?

image

The INDEX here is e2, which comes from the subject rather than from the head, as it would with sb-hd_mc_c.

This is a nice example. It isn’t complicated in the way that coordination is complicated, but it shows how the syntactic head is not always the same as the “semantic head”. You’ll notice in my talk, I used the term “functor” rather than “(semantic) head”. I find it’s helpful to use a different term, to avoid confusing them.

This example is interesting because it combines subject-dependent inversion and a semantically empty copula. (Subject-dependent inversion is quite different from subject-verb inversion. If you have access, Ward et al. (2002) give a 6-page summary of the construction.) So it’s helpful to consider these two phenomena separately.

Unlike an identity copula (e.g. “Kim is a doctor”), a semantically empty copula does not contribute a predicate to the MRS. For example:

  • Kim is happy.
  • Kim is in the building.

In these cases, the MRS of the verb phrase (“is happy”, “is in the building”) is the same as the copula’s complement (“happy”, “in the building”). Syntactically there’s a difference, but semantically there isn’t (if we ignore tense). After that, the MRS for the verb phrase acts as the functor, and the subject as the argument.

Subject-dependent inversion can also occur with other verbs. For example:

  • An unexpected announcement came three days later.
  • Three days later came an unexpected announcement.

These two sentences have exactly the same semantics (if we ignore information structure). In terms of semantic composition, they’re also exactly the same (if we ignore the order that we combine things). Syntactically, subject-dependent inversion has important constraints, but semantically, there’s nothing special.

So both of these seem fairly “boring” semantically. But when we combine them, things get interesting. The copula “was” is semantically empty, waiting for a complement (like “equally important”). But with inversion, we want to compose it with the subject first! This is what I referred to as “eager composition” in my talk. The bookkeeping becomes a little bit more complicated but the basic principles are the same.

So, to conclude – in this example the syntactic head still gets to say what the INDEX is, but in this case it says that the INDEX is the subject’s INDEX.