Help understanding the variable types 'i' and 'p' in the ERG

I see in the SEM-I that some predicates use p as a variable type, for example:

_absorb_v_1 : ARG0 e, ARG1 i, ARG2 p, ARG3 h.
_install_v_1 : ARG0 e, ARG1 i, ARG2 p, ARG3 h.
_send_v_1 : ARG0 e, ARG1 i, ARG2 p, ARG3 h.

And I see on the semantics basics page it says that p is an underspecification between instances and labels, but I can’t quite grasp what it means to underspecify between these two types and why some verbs allow for either.

Also, while underspecifying between e and x feels more natural at first glance, I'm struggling to understand why most verbal predicates seem to call for i on their arguments instead of x.

Also, for those entries, why is the ARG3 of type h?

Lastly, I see that some arguments do use the top-level u type:

_run_v_1 : ARG0 e, ARG1 u, [ ARG2 i ].

Why would it need to be that unspecific?

Some example sentences motivating the need for the underspecification would be greatly appreciated!

@ecconrad one thing to understand is that the SEM-I is not a hand-curated resource but one generated from the grammar. I forget the specifics of how that happens, but sometimes what looks like a surprising annotation decision may just be a reflection of the grammar that is not immediately intuitive (or perhaps a grammar bug).

Some verbs can take an instance or a scope handle as their argument, such as believe (here and below I’m using [ and ] to group constituents corresponding to arguments, not for optionality):

  • I believe [her].
  • I believe [she told the truth].

In this case the ARG2’s type is u, but I think it may have been p in a previous version (I used this example in my dissertation, where I said it was p):

_believe_v_1 : ARG0 e, ARG1 i, ARG2 u, [ ARG3 h ].

i usually means that the instance can be dropped. E.g., A species believed to be extinct. Here, ARG1 (who believes) is dropped.
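For anyone trying to keep the types straight, here is a minimal sketch of the hierarchy as it is described in this thread: u is the top type, i underspecifies between e and x, and p underspecifies between x and h. The names `LEAVES` and `subsumes` are made up for illustration and are not part of the ERG or any DELPH-IN tool.

```python
# Toy model of the variable type hierarchy (illustrative names only).
# Each type maps to the set of most-specific types it can resolve to.
LEAVES = {
    "u": {"e", "x", "h"},  # top-level, fully underspecified
    "i": {"e", "x"},       # eventuality or instance
    "p": {"x", "h"},       # instance or label (handle)
    "e": {"e"},
    "x": {"x"},
    "h": {"h"},
}


def subsumes(general, specific):
    """Return True if `general` covers everything `specific` can be."""
    return LEAVES[specific] <= LEAVES[general]


# An ARG2 of type p accepts an instance ("her") or a handle
# ("she told the truth"), but not an eventuality directly.
print(subsumes("p", "x"), subsumes("p", "h"), subsumes("p", "e"))
```

Under this picture, an argument typed u accepts anything, which is why it reads as so unspecific.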

That’s a good question. I think @Dan might need to answer that.

I wonder if it’s just an artifact of how the SEM-I is produced? Consider _advise_v_1:

  _advise_v_1 : ARG0 e, ARG1 i, ARG2 i, [ ARG3 h ].
  _advise_v_1 : ARG0 e, ARG1 i, ARG2 p, ARG3 h.
  _advise_v_1 : ARG0 e, ARG1 i, ARG2 h.

These are used in the following (in order):

  • [I] advise [you] [to get a second opinion].
  • ???
  • [I] advise [that you get a second opinion].

The middle one I don’t have a good example for. It also matches the first sentence, but I can’t think of an example where ARG2 is an h along with an ARG3 that is an h. This could just be a lack of knowledge/creativity on my part.
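To compare frames like these side by side, the synopsis lines can be read into a simple structure. This is a rough sketch that only handles the one-line format quoted in this thread, and it assumes that square brackets in a SEM-I line mark optional roles. `parse_synopsis` is a hypothetical helper written for this post, not an existing API.

```python
import re


def parse_synopsis(line):
    """Split a line such as '_advise_v_1 : ARG0 e, [ ARG3 h ].' into
    (predicate, [(role, variable_type, is_optional), ...])."""
    pred, roles_text = line.strip().rstrip(".").split(" : ", 1)
    roles = []
    optional = False
    # Tokens are an opening bracket, a closing bracket, or 'ROLE type'.
    for token in re.findall(r"\[|\]|[A-Z0-9]+ [a-z]", roles_text):
        if token == "[":
            optional = True   # assumption: brackets mean optional roles
        elif token == "]":
            optional = False
        else:
            role, var_type = token.split()
            roles.append((role, var_type, optional))
    return pred.strip(), roles


frames = [
    "_advise_v_1 : ARG0 e, ARG1 i, ARG2 i, [ ARG3 h ].",
    "_advise_v_1 : ARG0 e, ARG1 i, ARG2 p, ARG3 h.",
    "_advise_v_1 : ARG0 e, ARG1 i, ARG2 h.",
]
for frame in frames:
    print(parse_synopsis(frame))
```

Laid out this way, the second frame differs from the first only in the ARG2 type (p rather than i) and in ARG3 being obligatory.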

Having type u means that the value could also be an e. Often an argument selects a scope handle, but sometimes it takes the eventuality EP directly. One example is _active_a_1:

  _active_a_1 : ARG0 e, ARG1 u.
  • I actively [sought] an example.

Thanks!

Looking at this again as I try to write about the types in my paper.

Is it the case for _run_v_1 that ARG1 is u because it can take arguments of all three of the most specific types? Or is it just u as some side effect of the way the SEM-I was produced? What might cause so many entries to have u when it’s not necessarily required to be that general?

The variable type can be specified by a lexical entry or by a construction. It could be that it’s underspecified by the lexical entry, but it’s specified by all constructions it could appear in.
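One way to picture this is as taking the most general type that satisfies both the lexical entry's constraint and the construction's constraint. The sketch below is a toy model over the six variable types only. The grammar itself does this through unification of typed feature structures, and the names `LEAVES` and `combine` are hypothetical.

```python
# Toy model: each type maps to the most-specific types it can resolve to.
LEAVES = {
    "u": {"e", "x", "h"},
    "i": {"e", "x"},
    "p": {"x", "h"},
    "e": {"e"},
    "x": {"x"},
    "h": {"h"},
}


def combine(lexical_type, construction_type):
    """Return the type compatible with both constraints,
    or None if the two constraints clash."""
    common = LEAVES[lexical_type] & LEAVES[construction_type]
    for name, leaves in LEAVES.items():
        if leaves == common:
            return name
    return None  # no shared resolution, so the combination fails


# A lexical entry that leaves ARG1 as u still ends up with x
# whenever the construction it appears in requires x.
print(combine("u", "x"))
print(combine("i", "p"))
print(combine("e", "h"))
```

So an entry can look very general in the SEM-I even if every analysis that uses it ends up with a specific type.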

I’m surprised to see _run_v_1 has ARG1 u, but “run” is highly polysemous so maybe I’m just not thinking of the right construction. Maybe it’s something to do with subject-verb inversion, like “in the room ran a mouse”.

The reason _run_v_1 has ARG1 u in the SEM-I is that one of the lexical entries that introduces that predicate is (as @guyemerson suspected) of the type v_np_locinv-mv_le, used in locative inversion constructions, maybe more clearly seen in “into the room ran a mouse”. The grammar writer, in defining that lexical type, failed to impose any constraint on the semantic type of its ARG1, but should have constrained it to type ‘x’, since even in the locative inversion use, it is still the mouse who is the ARG1 (the one doing the running), not the location. So here the SEM-I falls victim to a missing constraint in the definition of the lexical type (essentially a grammar bug), and I fear there are many such oversights in the ERG lexicon, since the absence of such constraints does not affect parsing, where most of the quality assurance is done. I’ll add the task of tightening up the SEM-I’s constraints on argument types to the to-do list for the soon-to-be-released 2024 version of the ERG, but alas you’re left with these shortcomings in the versions of 2023 and earlier.
