Mapping ERG predicates to propbank/verbnet

Does anyone know of previous work on mapping ERG predicates to the PropBank and/or VerbNet datasets?

In particular, I am curious about copula constructions. ERG gives me two analyses for This balloon is red. and one analysis for This is a red balloon. It seems we have two lexical entries for the verb be: be_c_is and be_id_is. So they may correspond to the two different senses be.01 and be.02 in

Does it make sense?

be_c_is is semantically empty (contributes no EP); be_id_is contributes an EP…

Yes. But what is the syntactic analysis of these two cases? Are they both copulas? If so, they are probably different kinds of copula, since they have different analyses, right? Or maybe one is a copula and the other is not?

In UD, these are called

One of the cases above is not considered a copula.

In the ERG, both are verbs that take non-verbal complements. The be_id_is is an instance of the ‘identity copula’, which takes an NP complement and introduces an EP that relates that NP complement to the subject. We use this both for identity cases (like The author of that book is Kim) and cases that look more predicative (Kim is an author). (I think there might be even more semantic categories of NP predicates, but I don’t know off the top of my head.) The key commonality is that there is no argument role in the NP predicate for the subject to fill, so the verb introduces a predication that links them.

be_c_is, on the other hand, is an ‘ordinary’ copula that supports a non-verbal predicate that does have an argument position for the subject. Syntactically it is a raising verb, identifying its subject’s index with its complement’s XARG (which in turn points to the complement’s subject’s index). From our perspective, it is incorrect to say this copula has a ‘sense’ because it doesn’t mean anything.
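To make the contrast concrete, here is a minimal sketch that asks which predication the subject variable ends up filling in each case. The EP lists are hand-written toys, not parser output; the predicate names (`_be_v_id`, `_red_a_1`, `generic_entity`, ...) and variable names are assumptions for illustration, and handles are left out.

```python
# Hand-written toy EPs as (predicate, {role: variable}); not real ERG output.
BALLOON_IS_RED = [
    ("_this_q_dem", {"ARG0": "x3"}),
    ("_balloon_n_1", {"ARG0": "x3"}),
    # be_c_is adds no EP: the adjective takes the subject directly.
    ("_red_a_1", {"ARG0": "e2", "ARG1": "x3"}),
]
THIS_IS_A_RED_BALLOON = [
    ("generic_entity", {"ARG0": "x3"}),
    ("_this_q_dem", {"ARG0": "x3"}),
    # be_id_is adds an EP linking subject (x3) and NP complement (x8).
    ("_be_v_id", {"ARG0": "e2", "ARG1": "x3", "ARG2": "x8"}),
    ("_a_q", {"ARG0": "x8"}),
    ("_red_a_1", {"ARG0": "e9", "ARG1": "x8"}),
    ("_balloon_n_1", {"ARG0": "x8"}),
]


def subject_predications(eps, subject):
    """Return (predicate, role) pairs where `subject` fills a non-ARG0 role."""
    return [
        (pred, role)
        for pred, args in eps
        for role, value in args.items()
        if role != "ARG0" and value == subject
    ]


print(subject_predications(BALLOON_IS_RED, "x3"))
print(subject_predications(THIS_IS_A_RED_BALLOON, "x3"))
```

In the first toy the subject is an argument of the adjective itself, so there is nothing to map a "be" sense onto; in the second it is an argument of the identity EP.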


Does anyone have the contact of Sergio Roa-Ovalle? The link to his thesis is broken at

Still related to this thread: I am trying to compare ERG analyses with UD/PropBank, with the intention of evaluating how reasonable it would be to use ERG to produce data for training an SRL system.

I am curious about the MOD relation introduced in the DMRS. For instance, this is a simplified sentence from the EWT corpus

In UD, on Tuesday is an OBL dependent of nominated (syntax). ERG makes Bush and the event arguments of the preposition on. In the DMRS, there is a MOD edge between the nominated event and the preposition. What is that? MOD means modification, but how was it introduced?

_on_p_temp<15:17> LBL: h7 ARG0: e16 ARG1: x3 ARG2: x17

Interesting, none of the syntactic trees from the sentence above take on Tuesday as a modifier of the verb “nominated” as


MOD/EQ means a modifier without a direct ARG relation.

When converting from MRS, it is introduced when nodes share a label, but without an ARG relation between them (or a chain of ARG relations).

This case looks like a weird analysis. I can’t think of how to justify it.

A sentence where MOD/EQ is needed is “The dog whose tail wagged barked”. Here, “whose tail wagged” modifies “dog”, so the head of the clause “wagged” needs to share a label with “dog”. This is necessary to get the right quantifier scope (there could be many dogs, but just one dog whose tail wagged). But the only argument of “wagged” is “tail”, which doesn’t share a label with “dog” (tails are quantified separately from dogs). So “wagged” has two outgoing links, ARG1/NEQ to “tail”, and MOD/EQ to “dog”.
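The rule can be sketched on a toy version of this sentence. This is a simplified illustration, not the actual MRS-to-DMRS conversion: the EPs are hand-written (quantifiers and the possessive are omitted), handle and variable names are made up, predicates are assumed to be unique, and the direction of the MOD/EQ link (towards the instance-denoting node) is a simplifying assumption.

```python
from collections import defaultdict

# Toy EPs: (predicate, label, ARG0, other arguments). Hypothetical values.
EPS = [
    ("_dog_n_1", "h7", "x3", {}),
    ("_tail_n_1", "h11", "x8", {}),
    ("_wag_v_1", "h7", "e12", {"ARG1": "x8"}),  # shares its label with "dog"
    ("_bark_v_1", "h1", "e2", {"ARG1": "x3"}),
]


def reachable(start, adjacency):
    """Nodes connected to `start` through ARG links inside one label set."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in adjacency[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen


def dmrs_links(eps):
    owner = {arg0: (pred, label) for pred, label, arg0, _ in eps}
    links = []
    # 1. Ordinary argument links: EQ if the labels match, NEQ otherwise.
    for pred, label, _, args in eps:
        for role, value in args.items():
            if value in owner:
                target, target_label = owner[value]
                post = "EQ" if label == target_label else "NEQ"
                links.append((pred, target, f"{role}/{post}"))
    # 2. MOD/EQ: same label, but no (chain of) ARG links between the nodes.
    by_label = defaultdict(list)
    for pred, label, arg0, _ in eps:
        by_label[label].append((pred, arg0))
    for members in by_label.values():
        names = [pred for pred, _ in members]
        adjacency = {name: set() for name in names}
        for source, target, _ in list(links):
            if source in adjacency and target in adjacency:
                adjacency[source].add(target)
                adjacency[target].add(source)
        anchor = next((p for p, v in members if v.startswith("x")), names[0])
        seen = reachable(anchor, adjacency)
        for name in names:
            if name not in seen:
                links.append((name, anchor, "MOD/EQ"))
                seen |= reachable(name, adjacency)
    return links


for link in dmrs_links(EPS):
    print(link)
```

On this toy input the only label shared without an ARG link is the one on "dog" and "wagged", so that is the only place a MOD/EQ link appears.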


Thank you @guyemerson, I will wait for some feedback from @Dan

Finally, I managed to prepare a complete experiment of parsing and analyzing all examples of PropBank rolesets with ERG. I was able to parse ~85% of the examples with ERG after many edits to the examples (see fixing examples and null elements by arademaker · Pull Request #10 · propbank/propbank-frames · GitHub).

Besides the additional edits that I may still need to make, I am starting to think about how to make a consistent comparison. The first example in Frameset - avoid is

[She]-2 wanted trace-2 to avoid the morale-damaging public disclosure that a trial would bring.
Arg0: trace-2
Rel: avoid
Arg1: the morale-damaging public disclosure that a trial would bring.

ERG 2018 gives me

So _avoid_v_1 maps to avoid.01 and ERG.ARG1 maps to PB.ARG0. But I will only be able to figure this out if I use the trace-2 and [She]-2 marks, which is not trivial since the PropBank examples are quite messy. For PB.ARG1, can I recover the span from the MRS ERG.ARG2, @goodmami? Or is it easier to use the derivation tree?
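For the coindexation marks, a rough sketch of one way to resolve a trace back to its antecedent and to recover the plain sentence to send to the parser. The `[text]-N` / `trace-N` notation is inferred from this single example only, so treat the regular expressions as assumptions; other frame files may well use different conventions.

```python
import re

# Assumed notation, taken from the avoid example: "[She]-2" ... "trace-2".
ANTECEDENT = re.compile(r"\[([^\]]+)\]-(\d+)")
TRACE = re.compile(r"\btrace-(\d+)\b")


def resolve_traces(example, argument):
    """Replace each trace-N in `argument` by the text of the matching [..]-N."""
    table = {index: text for text, index in ANTECEDENT.findall(example)}
    return TRACE.sub(lambda m: table.get(m.group(1), m.group(0)), argument)


def surface_text(example):
    """Strip the marks so the string can be given to the parser."""
    text = ANTECEDENT.sub(r"\1", example)  # "[She]-2" -> "She"
    text = TRACE.sub("", text)             # drop null elements
    return re.sub(r"\s+", " ", text).strip()


EXAMPLE = ("[She]-2 wanted trace-2 to avoid the morale-damaging public "
           "disclosure that a trial would bring.")

print(resolve_traces(EXAMPLE, "trace-2"))
print(surface_text(EXAMPLE))
```

With the trace resolved, PB.ARG0 becomes a substring of the cleaned sentence, which can then be compared with the span of whatever fills ERG.ARG1 of _avoid_v_1.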

PS: I know this experiment is only an approximation and that the theories differ. My question is whether it makes sense to have some measurement of the difference. In particular, I am aware of @AnnC's article Invited Talk: Slacker Semantics: Why Superficiality, Dependency and Avoidance of Commitment can be the Right Way to Go - ACL Anthology

Any comment, suggestion, or criticism is welcome! :wink:

Sorry I don’t follow. You want the character span of the entire subgraph linked through _avoid_v_1's ARG2?

Yes, I want to match the analysis produced by ERG against the data available in the frame files (see here). If I can get the span, I can compare the substring of the example with the content of the XML tag associated with ARG1 (line 21 of the XML file).
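A minimal sketch of what such a span lookup could look like, assuming the EPs carry character offsets (cfrom/cto). The `EP` class, the `argument_span` helper and the toy MRS for a shortened sentence are all hypothetical and hand-written; this is not an existing API, and the traversal (shared labels, ARG0 matches, qeq-resolved handles) is a heuristic that has not been checked against control or coordination.

```python
from dataclasses import dataclass


@dataclass
class EP:
    pred: str
    label: str
    args: dict
    cfrom: int
    cto: int


def argument_span(eps, hcons, head, role):
    """Approximate (cfrom, cto) of everything reachable from head.args[role]."""
    variables, labels = set(), set()

    def follow(value):
        if value.startswith("h"):
            labels.add(hcons.get(value, value))  # resolve qeq when known
        else:
            variables.add(value)

    follow(head.args[role])
    selected = []
    changed = True
    while changed:
        changed = False
        for ep in eps:
            if ep is head or any(ep is s for s in selected):
                continue
            if ep.label in labels or ep.args.get("ARG0") in variables:
                selected.append(ep)
                labels.add(ep.label)
                for value in ep.args.values():
                    follow(value)
                changed = True
    if not selected:
        return None
    return min(ep.cfrom for ep in selected), max(ep.cto for ep in selected)


# Hand-written toy MRS (not parser output) for a shortened sentence.
SENTENCE = "She avoided the public disclosure."
EPS = [
    EP("pron", "h4", {"ARG0": "x3"}, 0, 3),
    EP("pronoun_q", "h5", {"ARG0": "x3", "RSTR": "h6", "BODY": "h7"}, 0, 3),
    EP("_avoid_v_1", "h1", {"ARG0": "e2", "ARG1": "x3", "ARG2": "x8"}, 4, 11),
    EP("_the_q", "h9", {"ARG0": "x8", "RSTR": "h10", "BODY": "h11"}, 12, 15),
    EP("_public_a_1", "h12", {"ARG0": "e13", "ARG1": "x8"}, 16, 22),
    EP("_disclosure_n_1", "h12", {"ARG0": "x8"}, 23, 34),
]
HCONS = {"h6": "h4", "h10": "h12"}  # hole -> label (qeq)

avoid = EPS[2]
start, end = argument_span(EPS, HCONS, avoid, "ARG2")
print(SENTENCE[start:end])
```

The resulting substring can then be compared with the text of the corresponding argument in the frame file, after normalizing punctuation and whitespace on both sides.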

Another difference between the ERG lexicon and Verbnet/Propbank

put.01 has 3 arguments in both PropBank and VerbNet. In ERG it also has 3 arguments, but the last one is a handle (see here). I am curious to understand the reason for that. Probably for a uniform analysis of prepositional phrases, right?
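For the comparison, a handle-valued argument just needs one extra step: resolve the hole through its qeq constraint and look at the EPs that carry the resulting label. A small sketch on a hand-written toy follows; the predicate names, handles and the `eps_under_handle` helper are assumptions for illustration, not parser output or an existing API.

```python
# Toy fragment for something like "... put the book on the table".
EPS = [
    {"pred": "_put_v_1", "label": "h1",
     "args": {"ARG0": "e2", "ARG1": "x3", "ARG2": "x8", "ARG3": "h13"}},
    {"pred": "_on_p_loc", "label": "h14",
     "args": {"ARG0": "e15", "ARG1": "x8", "ARG2": "x16"}},
    {"pred": "_table_n_1", "label": "h17", "args": {"ARG0": "x16"}},
]
HCONS = {"h13": "h14"}  # hole -> label (qeq)


def eps_under_handle(eps, hcons, head_pred, role):
    """Return the EPs whose label is the qeq target of head_pred's `role`."""
    head = next(ep for ep in eps if ep["pred"] == head_pred)
    hole = head["args"][role]
    label = hcons.get(hole, hole)
    return [ep for ep in eps if ep["label"] == label]


for ep in eps_under_handle(EPS, HCONS, "_put_v_1", "ARG3"):
    print(ep["pred"], ep["args"])
```

In this toy the third argument of the verb is the whole prepositional predication rather than an entity, so the candidate filler for the third PropBank role would have to be read off the preposition's own ARG2.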

Hmm… Is this related to the last reason listed by @Dan in Events for adverbs, adjectives and prepositions - #4 by Dan?