Docs: Help understanding underspecified variable types: dropped arguments, shared 'i' variables and 'u'

[This question is part of the working group effort to update the Delph-In docs]

I’ve never fully understood how to interpret argument types beyond x and e, and I’m hoping someone can explain some scenarios I’ve encountered. I’ve posted about this before and there’s a good discussion here, but I’m trying to get to the next level of understanding.

For reference, here are the docs from Ergsemantics_basics, which I’m happy to update once I gather more information here:

In addition to these most specific variable types, there are underspecifications as follows: i (for individual) is a generalization over eventualities and instances; p (the half-way mark in the alphabet between h and x) is a generalization over labels and instances; and u (for unspecific or maybe unbound) generalizes over all of the above.
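To make that hierarchy concrete for myself, here is a minimal Python sketch of it. The names PARENTS and subsumes are my own illustrative choices and are not part of any DELPH-IN tool; the parent relationships are taken only from the paragraph quoted above.

```python
# Variable-type hierarchy as described in the quoted docs:
# u generalizes over everything, i over e and x, p over h and x.
# PARENTS and subsumes() are hypothetical names used for illustration only.
PARENTS = {
    "u": [],
    "i": ["u"],
    "p": ["u"],
    "e": ["i"],
    "x": ["i", "p"],
    "h": ["p"],
}


def subsumes(general, specific):
    """Return True if `general` is `specific` or one of its ancestors."""
    if general == specific:
        return True
    return any(subsumes(general, parent) for parent in PARENTS[specific])


print(subsumes("i", "x"))  # True: an x is a kind of i
print(subsumes("i", "h"))  # False: i never covers labels
print(subsumes("u", "h"))  # True: u covers everything
```

Read this way, an argument typed i only says "this is an e or an x"; it makes no claim about which one.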

i arguments meaning dropped (“unused” or “unknown”)

99.9% of the MRSs that I’ve encountered so far that have anything beyond an x or e variable seem to use the i, p or u variable as a placeholder meaning this argument is “ignored” or “unknown”, which changes the predicate meaning (often making it passive).

For example: I have interpreted the ARG1 of _lock_v_cause(exx) as being the “actor”. When ARG1 is of type i, I’ve interpreted the i variable to mean “unknown” or “unused”, kind of like passing None to a function argument in Python. This clearly changes the meaning of a predicate like _lock_v_cause(exx), “someone (x) locked a thing (x)”, to _lock_v_cause(eix), “a thing (x) is locked”. For example, the phrase “Is the safe locked?” produces this MRS:

[ TOP: h0
INDEX: e2
RELS: <
[ _the_q LBL: h4 ARG0: x3 [ x PERS: 3 NUM: sg IND: + ] RSTR: h5 BODY: h6 ]
[ _safe_n_1 LBL: h7 ARG0: x3 [ x PERS: 3 NUM: sg IND: + ] ]
[ _lock_v_cause LBL: h1 ARG0: e2 [ e SF: ques TENSE: pres MOOD: indicative PROG: - PERF: - ] ARG1: i8 ARG2: x3 ]
>
HCONS: < h0 qeq h1 h5 qeq h7 > ]

This makes sense to me, except when I try to map it to the docs and interpret it as “a generalization over eventualities and instances”. What does that mean in this case? It doesn’t seem like anything is being “generalized”. And why would it be generalized over an x and an e variable type? Does _lock_v_cause sometimes take an event as its ARG1, so that the MRS is saying, “it could normally be an e or an x but it isn’t specified, so we’ll indicate the parent type that could be either”?

Looking back over my previous post, @goodmami stated:

A convention in MRS is that when an argument is dropped (e.g., I left. vs. I left Oregon.) there is no EP introduced for the missing argument and its variable becomes underspecified (an i instead of an x). This is, I guess, to avoid requiring a quantifier for the variable. However i variables are also used for some scopal modifiers (e.g., for _never_a_1 in I never left.) and things like compositional numbers (for the EPs for “20” and “8” in Kim drove for 28 hours). I brought up this issue at our last research summit in Cambridge (see Section 3 of my presentation notes and the minutes of the discussion). There were some proposals for resolving the various meanings of i variables in MRSs but I think the current situation still stands.

So it sounds like the description of i variables in the docs needs to be updated to say that i can also mean “dropped”, and that in those cases it has nothing to do with the argument possibly being an event or an instance (so that we picked the parent type). Instead, we just needed something besides x to put there?
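Here is a rough sketch of how I currently detect the “dropped” case in my own processing. It assumes a heuristic of mine, not an official rule: an i variable counts as dropped if it appears exactly once in the whole MRS and not as anyone’s ARG0. The RELS list is hand-transcribed from the “Is the safe locked?” MRS above, and dropped_arguments is a hypothetical helper name.

```python
from collections import Counter

# Hand-transcribed from the "Is the safe locked?" MRS: (predicate, {role: variable}).
RELS = [
    ("_the_q", {"ARG0": "x3", "RSTR": "h5", "BODY": "h6"}),
    ("_safe_n_1", {"ARG0": "x3"}),
    ("_lock_v_cause", {"ARG0": "e2", "ARG1": "i8", "ARG2": "x3"}),
]


def dropped_arguments(rels):
    """Find i variables used exactly once, in a non-ARG0 role.

    This is an assumed heuristic for "dropped argument", not a rule
    taken from the ERG documentation.
    """
    counts = Counter(var for _, args in rels for var in args.values())
    return [
        (pred, role, var)
        for pred, args in rels
        for role, var in args.items()
        if var.startswith("i") and role != "ARG0" and counts[var] == 1
    ]


print(dropped_arguments(RELS))  # [('_lock_v_cause', 'ARG1', 'i8')]
```

Under that assumption, i8 is the only variable flagged, which matches my reading of ARG1 as the unexpressed “actor”.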

Shared i arguments

An example I’ve found that is clearly not meant to mean “ignore this argument” is: “yell ‘I am free’”:

[ TOP: h0
INDEX: e2
RELS: <
[ pronoun_q LBL: h4 ARG0: x3 [ x PERS: 2 PT: zero ] RSTR: h5 BODY: h6 ]
[ pron LBL: h7 ARG0: x3 [ x PERS: 2 PT: zero ] ]
[ proper_q LBL: h9 ARG0: x8 [ x PERS: 3 NUM: sg ] RSTR: h10 BODY: h11 ]
[ fw_seq LBL: h12 ARG0: x9195 [ x ] ARG1: x13 ARG2: i14 ]
[ fw_seq LBL: h12 ARG0: x13 [ x ] ARG1: i15 ARG2: i16 ]
[ quoted LBL: h12 CARG: "I" ARG0: i15 [ i ] ]
[ quoted LBL: h12 CARG: "am" ARG0: i16 [ i ] ]
[ quoted LBL: h12 CARG: "free" ARG0: i14 [ i ] ]
[ fw_seq_end_z LBL: h12 ARG0: x8 [ x PERS: 3 NUM: sg ] ARG1: x9195 ]
[ _yell_v_1 LBL: h1 ARG0: e2 [ e SF: comm TENSE: pres MOOD: indicative PROG: - PERF: - ] ARG1: x3 ARG2: x8 ]
>
HCONS: < h0 qeq h1 h5 qeq h7 h10 qeq h12 > ]

There are shared i variables in the above MRS. i14, i15 and i16 all appear in more than one predicate, clearly indicating that they need to hold values. The only way I have found to process this logically is to treat them as existentially quantified x variables. This is the only place I’ve seen this phenomenon so far.
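For reference, this is roughly how I spot the shared case: collect every place each i variable is used and keep the ones used more than once. The RELS list below is hand-transcribed from just the fw_seq and quoted predicates above, and shared_i_variables is a hypothetical helper name of mine.

```python
from collections import defaultdict

# Hand-transcribed subset of the "yell 'I am free'" MRS: (predicate, {role: variable}).
RELS = [
    ("fw_seq", {"ARG0": "x9195", "ARG1": "x13", "ARG2": "i14"}),
    ("fw_seq", {"ARG0": "x13", "ARG1": "i15", "ARG2": "i16"}),
    ("quoted", {"ARG0": "i15"}),
    ("quoted", {"ARG0": "i16"}),
    ("quoted", {"ARG0": "i14"}),
]


def shared_i_variables(rels):
    """Map each i variable used more than once to its (predicate, role) uses."""
    uses = defaultdict(list)
    for pred, args in rels:
        for role, var in args.items():
            if var.startswith("i"):
                uses[var].append((pred, role))
    return {var: where for var, where in uses.items() if len(where) > 1}


for var, where in sorted(shared_i_variables(RELS).items()):
    print(var, where)
```

Each of the three i variables turns out to be the ARG0 of one quoted predicate and an argument of one fw_seq, so none of them can be read as “ignore this argument”.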

Why are the instances represented by i variables in this MRS? Why not plain old instance variables?

A way underspecified example

This is the MRS that caused me to create this post since it breaks all kinds of assumptions I have and causes me to simply hack around my processing model to get it to work. Ironically, it is the simplest phrase ever: “hi”

[ TOP: h0
INDEX: e2
RELS: <
[ discourse LBL: h1 ARG0: i9 [ i ] ARG1: h6 ARG2: h10 ]
[ greet LBL: h6 CARG: "hi" ARG0: i8 [ i ] ]
[ unknown LBL: h4 ARG0: e2 [ e SF: prop-or-ques ] ARG: u5 ]
>
HCONS: < h0 qeq h1 > ]

I’ve posted about the discourse predicate before and think I understand that it is being used as a kind of conjunction, and that greet(i) is saying “It isn’t specified who is being greeted” (using the same “i as ignored or unspecified” pattern as above), and I think unknown(u) is saying “and we don’t know what the other part of the conjunction is”.

But… why is u being used for unknown’s first argument?


Thanks for documenting this issue so thoroughly. I don’t really have any answers or thoughts beyond what was mentioned in those other threads, but maybe somebody else does. However, I think the following points are useful to consider:

  1. The HPSG / MRS framework is not “complete”, whatever that may mean. Some thorny issues have been ignored, sidestepped, or addressed only with hacks (such as the overloading of i for dropped xs and things that are not clearly x or e), so it may be the case that there is no answer to be reasoned out. You may need to create an answer.
  2. The ERG (as with any grammar) may be lacking analyses or may be buggy. Not every output from the ERG should be considered correct or ideal.

Understood. My goal is to try to articulate the best model that is intended, and then point out bugs or workarounds that happen in practice.

In case this helps some, I’d say that i means either e or x, but definitely not h. (And in the case of dropped arguments, it’s generally “e or x, but actually x, but we don’t want x unless we have a quantifier, so we’ll be vague”.) u means any of e, x, or h. I’m not quite sure what “unknown” is doing in your example, but its argument could be any of those.
