Node identifiers of quantifiers in EDS

What is the name for the node identifiers starting with an underscore:

{e2:
 _1:udef_q<0:3>[BV x3]
 e9:card<0:3>("2"){e SF prop, TENSE untensed, MOOD indicative, PROG -, PERF -}[ARG1 x3]
 x3:_dog_n_1<4:8>{x PERS 3, NUM pl, IND +, PT pt}[]
 e2:_fight_v_1<13:21>{e SF prop, TENSE pres, MOOD indicative, PROG +, PERF -}[ARG1 x3]
}

I mean the _1 above.

What do you mean, “the name for the node identifiers”?

Do note that in EDS and DMRS, node identifiers are not variables. For convenience, and at the risk of some confusion, EDS uses the form of the ARG0 variable of the corresponding MRS EP as the node’s identifier, but these identifiers do not behave exactly like variables. Furthermore, identifiers must be unique and every node must have one, so nodes whose variable is missing or non-unique get an underscore instead of an e, i, x, etc. Most prominently, you’ll notice this on quantifiers, which do not have their own intrinsic variables. You’ll also see these on (arguably bad) MRSs where EPs have no ARG0 or share their ARG0 with some other non-quantifier EP.

No meaning should be attributed to the letter (or underscore) portion of an identifier or to its numeric portion, and neither is required. The identifier is just a unique string of valid identifier characters.
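
As a minimal sketch of this scheme (a toy illustration of the logic described above, not the actual converter code):

def assign_node_ids(eps):
    """eps: (predicate, arg0, is_quantifier) triples, in EP order.
    arg0 may be None for (arguably bad) MRSs lacking one."""
    taken = set()
    ids = []
    n = 0
    for pred, arg0, is_q in eps:
        if not is_q and arg0 and arg0 not in taken:
            node_id = arg0              # reuse the ARG0's form, e.g. "x3"
        else:                           # quantifier, missing, or shared ARG0
            n += 1
            node_id = '_%d' % n
        taken.add(node_id)
        ids.append((node_id, pred))
    return ids

# A quantifier's ARG0 is its bound variable, not an intrinsic one,
# so udef_q falls through to "_1", matching the EDS above:
print(assign_node_ids([('udef_q', 'x3', True), ('card', 'e9', False),
                       ('_dog_n_1', 'x3', False), ('_fight_v_1', 'e2', False)]))
# [('_1', 'udef_q'), ('e9', 'card'), ('x3', '_dog_n_1'), ('e2', '_fight_v_1')]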


Thank you, my original question was what to call those identifiers with an underscore… but your answer was even better: now I understand where they come from in the translation from MRS to EDS.

Despite all my reading about MRS variables vs. EDS identifiers (constants), I still have some doubts about the difference. I understand that MRS handles must be understood as variables, otherwise the HCONS wouldn’t make sense: they are constraints that assert possible equalities. So handles are for sure variables (in the computational and mathematical sense). But what about the x or e or i MRS variables? The ICONS constraints are never really used, right? At least for some sentences, ACE didn’t give me values for the ICONS.

I am reading now http://www.lrec-conf.org/proceedings/lrec2006/pdf/364_pdf.pdf and trying to understand the variable-free motivation. Parts such as the one quoted below confused me: it seems that the variables-vs-constants distinction is related to the preservation of identifiers between possible analyses of a sentence. But aren’t two distinct possible analyses two distinct MRSs? If EDS is derived from MRS, this quote doesn’t make sense to me.

the EPs associated to an NP constituent shared among two analyses might well internally end up using distinct (albeit abstractly equivalent) semantic variables.

MRS variables are logical variables. This means that they specifically stand for individuals in a model structure. As you say, handles are mathematical variables, in the sense that each handle must be equated with a label in a fully scoped MRS. However, a label/handle only ever refers to a linguistic object and never an individual in a model structure. This is really important if we want a well-defined logic, where we can have a model structure to represent a situation, and an MRS to represent a sentence, and then evaluate the MRS on the model structure.
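
To make “evaluate the MRS on the model structure” concrete, here is a toy example using NLTK’s model checker (invented individuals and predicates, nothing MRS-specific):

from nltk.sem import Valuation, Model, Assignment

# A tiny model structure: two individuals, both dogs, both fighting.
val = Valuation([('dog', {'d1', 'd2'}),
                 ('fight', {'d1', 'd2'})])
m = Model(val.domain, val)
g = Assignment(val.domain)

# Logical variables range over the individuals of the model:
print(m.evaluate('exists x.(dog(x) & fight(x))', g))  # True
print(m.evaluate('all x.(dog(x) -> fight(x))', g))    # True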

However, for many applications, we don’t need a well-defined logic and so the logical variables can seem cumbersome. Hence the motivation for a simpler variable-free representation.

As for the specific quote, I think it refers to a situation where there are two analyses for a sentence but they share the same analysis for a particular NP. In assigning a unique identifier to each variable, the identifiers might end up being different because the analysis of the rest of the sentence is different, even though the analysis of the NP is the same. If the identifier instead comes from the token, it will be the same identifier across all analyses.
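
For example (invented numbering, just to illustrate):

# The same NP analysis embedded in two different analyses of the
# sentence: the variable numbering depends on the rest of the parse.
analysis_1 = {'_dog_n_1': 'x3'}   # the NP's variable happens to be x3 here
analysis_2 = {'_dog_n_1': 'x5'}   # ...and x5 here, though the NP is the same

# An identifier derived from the token's character span is instead the
# same in every analysis containing that token:
def token_id(start, end):
    return 'n%d:%d' % (start, end)

print(token_id(4, 8))  # 'n4:8' for "dogs" in both analyses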


Reading this thread more carefully, I see that the x or e prefix from the MRS variables loses its meaning in EDS. They are just symbols used as identifiers. This is an important point, thank you.

Hi @guyemerson, the “for many applications … logical variables can seem cumbersome” point is one I have read in some papers, but I am still having trouble actually understanding it. My best bet for now is that variable-free representations should be used as data structures, right? We can unify two EDSs, for example. For entailment detection, we would need more than unification, since we would like to have some kind of ordering between two structures. Am I on the right path?

On the other hand, MRSs are representations that hold logical fragments. Basically, I like to think that each handle holds a logical sentence (a conjunction of predicates). How to calculate, for example, entailment of two MRSs is something that I am trying to understand… We would have to unify the handles AND, for each unified handle, run the logical entailment test on its logical fragment, something like that, right?

Does it make sense?

Yes, as you say, entailment has an order but unification does not.

If the task is the kind of entailment you might find in a logic textbook, using fully scoped MRSs would work very well. But if the task is intuitive reasoning, a formal logic can be very brittle. For example, we might want to say that “yesterday evening I made fried rice and it was both filling and delicious” intuitively entails “I cooked dinner”… but this requires world knowledge: dinner is eaten in the evening, you have to eat something to know it’s delicious, a meal should fill you up, making fried rice involves cooking, etc. Trying to add world knowledge as a huge set of logical forms is difficult.

So one approach is to give up on using a formal logic, but still use semantic representations as providing useful features for a machine learning model. But in that case, we want a representation that is easy to work with from the point of view of the machine learning model. Logical variables are not easy to work with, and so a variable-free representation is helpful.

This isn’t the only possible approach to entailment, of course! (I think logic is useful!) But I hope that helps explain the motivation.


Thank you @guyemerson, as always you provided a very clear and illustrative example. I will take your answer as confirming that a logical use of MRS would be something along the lines I wrote in my last comment. I am still doing a literature review on the uses of MRS for text entailment. So far I have found this one using EDS converted to RDF and augmented with WordNet information. I am now reading the paper https://www.aclweb.org/anthology/E14-3009.pdf, which uses MRS without actually translating it to a full logical sentence or employing a theorem prover.

I just found your thesis, with a very clear presentation of the DMRS/EDS motivation and rationale. It makes sense of the node identifiers as unifying the predicate and its intrinsic variable.

Regarding the quote below, it is not completely clear to me why logical variables are not easy to work with from the point of view of the ML model. I am not an ML person, but I am following some works exploring FOL provers in ML. Anyway, I really appreciate your attention to my questions. Thank you.

BTW, the corpus I am using has much simpler examples compared to your example above. For instance, I expect logical reasoning (with some WordNet information) would be enough for showing the entailments below; a quick WordNet check for the lexical step is sketched after the examples:

  1. Two dogs are playing by a tree |= Two dogs are playing by a plant
  2. A woman is wearing an Egyptian hat on her head |=
    A woman is wearing an Egyptian headdress
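
For the lexical step in example 1, something like this WordNet check (via NLTK, and assuming these synset choices) should be enough:

from nltk.corpus import wordnet as wn

# Is "tree" a kind of "plant"?
tree = wn.synset('tree.n.01')     # the botanical "tree"
plant = wn.synset('plant.n.02')   # the botanical "plant"

# Collect every hypernym reachable from "tree":
hypernyms = set(h for path in tree.hypernym_paths() for h in path)
print(plant in hypernyms)  # True, so tree |= plant at the lexical level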

I’m glad you found my thesis helpful :slight_smile:

And yes, the type of ML model is of course crucial here. As you say, there are some people working on theorem proving in ML, and in that case the classic MRS might be easier to work with than DMRS/EDS! But the “mainstream” paradigm in ML is more like: define your inputs and outputs, define a big neural net connecting them, and train by gradient descent. So in this paradigm, we want inputs that interface nicely with neural methods. A dependency graph could be an input to a graph-convolutional network (or we could linearise the graph and use sequence models like LSTMs or Transformers). Building a network on a classic MRS could be done, but you would have to carefully deal with the different types of object (predicates, handles, variables), while DMRS lets you just deal with nodes.
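
As an invented sketch of the linearisation idea (not a standard DELPH-IN serialisation, just the general shape):

# Flatten a small DMRS-like graph into a token sequence that a sequence
# model (LSTM, Transformer) could consume.
graph = {
    'nodes': {'e2': '_fight_v_1', 'x3': '_dog_n_1', '_1': 'udef_q'},
    'edges': [('e2', 'ARG1', 'x3'), ('_1', 'BV', 'x3')],
    'top': 'e2',
}

def linearise(g):
    tokens = ['<top>', g['nodes'][g['top']]]
    for src, role, tgt in g['edges']:
        tokens += [g['nodes'][src], ':' + role, g['nodes'][tgt]]
    return tokens

print(linearise(graph))
# ['<top>', '_fight_v_1', '_fight_v_1', ':ARG1', '_dog_n_1',
#  'udef_q', ':BV', '_dog_n_1']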

For those two examples, I agree that classical logic and WordNet should do the trick. I haven’t looked at the SICK dataset in detail, but I hope you can get something to work :slight_smile:


This thread started in 2020 and I am still playing with text entailment on the SICK dataset using logical reasoning. Actually, the main motivation is that proofs can eventually be constructed with neuro-symbolic provers like ([2006.13155] Logical Neural Networks) and huge knowledge bases. Given that goal, I started working with MRS to FOL before going to more expressive approaches (e.g. A Type-Theoretical system for the FraCaS test suite: Grammatical Framework meets Coq - ACL Anthology).

For a simplified sentence from the SICK dataset:

'A boy is playing piano'
[ TOP: h0
  INDEX: e2
  RELS: < [ _a_q<0:1> LBL: h4 ARG0: x3 RSTR: h5 BODY: h6 ]
          [ _boy_n_1<2:5> LBL: h7 ARG0: x3 ]
          [ _play_v_1<9:16> LBL: h1 ARG0: e2 ARG1: x3 ARG2: x8 ]
          [ udef_q<17:22> LBL: h9 ARG0: x8 RSTR: h10 BODY: h11 ]
          [ _piano_n_1<17:22> LBL: h12 ARG0: x8 ] >
  HCONS: < h0 qeq h1 h5 qeq h7 h10 qeq h12 > ]

'No boy is playing piano'
[ TOP: h0
  INDEX: e2
  RELS: < [ _no_q<0:2> LBL: h4 ARG0: x3 RSTR: h5 BODY: h6 ]
          [ _boy_n_1<3:6> LBL: h7 ARG0: x3 ]
          [ _play_v_1<10:17> LBL: h1 ARG0: e2 ARG1: x3 ARG2: x8 ]
          [ udef_q<18:23> LBL: h9 ARG0: x8 RSTR: h10 BODY: h11 ]
          [ _piano_n_1<18:23> LBL: h12 ARG0: x8 ] >
  HCONS: < h0 qeq h1 h5 qeq h7 h10 qeq h12 > ]

My translator from MRS to FOL gives me F1 for the first MRS above and F2 for the second:

[F1] ∃ e2, ∃ x8, _piano_n_1 x8 ∧ (∃ x3, _boy_n_1 x3 ∧ _play_v_1 e2 x3 x8)
[F2] ∃ e2, ∃ x8, _piano_n_1 x8 ∧ (∀ x3, _boy_n_1 x3 → ¬_play_v_1 e2 x3 x8)

Does it make sense so far? My transformation is based on http://svn.delph-in.net/lkb/branches/fos/src/tproving/gq-to-fol.lisp, but I consider more quantifiers and I allow higher-order predicates (with formulas as arguments).
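
To show the idea, here is a minimal sketch of the quantifier step (my reading of the gq-to-fol approach, applied to an already scope-resolved tree; the rule table is illustrative, not complete):

# Each generalized quantifier maps to a FOL quantifier plus connective.
FOL_RULES = {
    '_a_q':     lambda v, r, b: '∃ %s, %s ∧ %s' % (v, r, b),
    'udef_q':   lambda v, r, b: '∃ %s, %s ∧ %s' % (v, r, b),
    '_no_q':    lambda v, r, b: '∀ %s, %s → ¬%s' % (v, r, b),
    '_every_q': lambda v, r, b: '∀ %s, %s → %s' % (v, r, b),
}

def to_fol(node):
    if node[0] in FOL_RULES:           # (quantifier, var, restr, body)
        q, v, r, b = node
        return FOL_RULES[q](v, to_fol(r), to_fol(b))
    pred, *args = node                 # (predicate, arg, ...)
    return '(%s %s)' % (pred, ' '.join(args))

# 'No boy is playing piano', with udef_q scoping over _no_q; the event
# variable e2 would still receive existential closure, as in F2:
scoped = ('udef_q', 'x8', ('_piano_n_1', 'x8'),
          ('_no_q', 'x3', ('_boy_n_1', 'x3'),
           ('_play_v_1', 'e2', 'x3', 'x8')))
print(to_fol(scoped))
# ∃ x8, (_piano_n_1 x8) ∧ ∀ x3, (_boy_n_1 x3) → ¬(_play_v_1 e2 x3 x8)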

For these two sentences, I would expect to be able to find a proof of the contradiction using a FOL prover, that is, a proof that not(and(F1, F2)) is a tautology, or that and(F1, F2) is unsatisfiable.

But, of course, the problem is that the existential variables are different, so there is no logical contradiction. I imagine I would need to unify some existential variables in the two MRSs before the FOL transformation. But I suspect people have already investigated approaches to that, right? Any obvious direction?

¬((∃ e2, ∃ x8, _piano_n_1 x8 ∧ (∃ x3, _boy_n_1 x3 ∧ _play_v_1 e2 x3 x8)) ∧ 
  (∃ e2, ∃ x8, _piano_n_1 x8 ∧ (∀ x3, _boy_n_1 x3 → ¬_play_v_1 e2 x3 x8)))
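
To make the issue concrete, here is a minimal sketch with NLTK’s resolution prover (predicate names simplified; the shared constants ev, pno, bob encode exactly the identification of existentials I am asking about):

from nltk.sem import Expression
from nltk.inference import ResolutionProver

read = Expression.fromstring

# Hypothetical move: Skolemise the existentials of both formulas to
# *shared* constants, i.e. assume the events and the pianos corefer.
f1 = read('piano(pno) & boy(bob) & play(ev, bob, pno)')
f2 = read('piano(pno) & all x.(boy(x) -> -play(ev, x, pno))')

# Once the variables are identified, the conjunction is refutable:
print(ResolutionProver().prove(read('-((%s) & (%s))' % (f1, f2))))  # True

# With distinct existentials, as in the formula displayed above, the
# conjunction is satisfiable, so no prover can derive a contradiction.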