"a part of" and "part of" parse pretty differently

Maybe this is a “colloquial English” problem, but I’m getting very different parses for "Is the lock a part of the safe?" vs "Is the lock part of the safe?". One of my early users and I interpret them the same, FWIW.

Here’s the first parse of "is the lock a part of the safe?". It looks like what I’d expect:

                         ┌_safe_n_1__x:x13
  _the_q__xhh:x13,h15,h16┤
                         │                    ┌_lock_n_1__x:x3
                         └_the_q__xhh:x3,h6,h7┤
                                              │                    ┌_part_n_of__xx:x4,x13
                                              └_a_q__xhh:x4,h10,h11┤
                                                                   └_be_v_id__exx:e2,x3,x4
 Logic: _the_q__xhh(x13, _safe_n_1__x(x13), _the_q__xhh(x3, _lock_n_1__x(x3), _a_q__xhh(x4, _part_n_of__xx(x4, x13), _be_v_id__exx(e2, x3, x4))))

Here is the first parse of "is the lock part of the safe?" (missing the "a"):

                         ┌_lock_n_1__x:x10
  udef_q__xhh:x10,h12,h13┤
                         │                       ┌_safe_n_1__x:x15
                         └_the_q__xhh:x15,h17,h18┤
                                                 │                    ┌_be_v_id__eix:e2,i3,x4
                                                 └_the_q__xhh:x4,h6,h7┤
                                                                      │   ┌compound__exx:e9,x4,x10
                                                                      └and┤
                                                                          └_part_n_of__xx:x4,x15
Logic: udef_q__xhh(x10, _lock_n_1__x(x10), _the_q__xhh(x15, _safe_n_1__x(x15), _the_q__xhh(x4, and(compound__exx(e9, x4, x10), _part_n_of__xx(x4, x15)), _be_v_id__eix(e2, i3, x4))))

It looks like this is saying “x4 is a part of the safe(x15) and x4 is also a compound noun-noun type term with the lock (x10)”? I may just be misunderstanding what compound() is doing here.

What is the right interpretation of this second parse?

Update: Looking at it more, is it interpreting “lock part” as a compound as in: “is the [lock part] of the safe?” or something like that?

I don’t think you want the parse with the compound rel :wink:. It would make sense for something like “The safe has two parts: a lock part and a box part. If you only had the box part of the safe, it wouldn’t work as a safe. So the lock part of the safe is very important.”

But something’s strange about that parse anyway, because _be_v_id's ARG1 (i3) doesn’t have a predicate. (In your scope trees, is the restriction supposed to be the top branch each time? If so, _part_n_of is in the wrong place in the second tree.) Supposing that dropping the subject is okay, the parse would mean something like “is this the lock part of the safe”?

The parse you want will look just like the first parse, but with part_q instead of _a_q.
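
If it helps to do this programmatically, here is a rough PyDelphin sketch (untested; 'erg.dat' stands in for a compiled ERG grammar image on your machine) that skips any reading whose MRS contains a compound EP:

from delphin import ace

# Parse and keep only readings without a compound EP (sketch; 'erg.dat'
# is a placeholder path to a compiled ERG grammar image).
response = ace.parse('erg.dat', 'Is the lock part of the safe?')
for result in response.results():
    predicates = {ep.predicate for ep in result.mrs().rels}
    if 'compound' in predicates:
        continue  # skip the noun-noun compound reading
    print(sorted(predicates))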


This may be too big for this thread, but I’ve been wondering this myself too, so I will ask it here and we can start a new thread if need be. @EricZinda has found several places where the reranker is giving a parse for the sentence which is not the one he wants (or arguably anyone wants, but let’s ignore that for now).

I could see several ways to fix these problems. In Eric’s situation, in a game environment with a fairly limited domain, he could:

  1. write some code to try to select the parse he expects
  2. write a grammar extension to start constraining away those parses before they are realized
  3. do some tree banking to make a custom reranking model to try to improve the model for his domain

Do we have any guides or wikis on how to do any of these things (in particular tree banking and writing grammar extensions)? Or are there other avenues?

G’day,

trimblet (June 19):

    3. do some tree banking to make a custom reranking model to try to improve the model for his domain

I think this would help a lot, although we do not really know how many new sentences are needed to make a change. Our current best guess is 5,000 or so.

    Do we have any guides or wikis on how to do any of these things (in particular tree banking and writing grammar extensions)? Or are there other avenues?

I have gathered what I know about treebanking here: http://moin.delph-in.net/TreebankingTop

We are treebanking with the ERG and Zhong at the moment.


This is a great question @trimblet, and one that I’ve had in my queue for a while to investigate more deeply. Thanks for bringing it up. I should note that there are actually two issues, which I suspect have similar answers (i.e., treebanking):

  1. Am I getting the parse that is expected by the user?
  2. Am I getting the fully scoped tree that is expected by the user?

Right now the ERG at least attempts to solve #1, even though it doesn’t always work for my scenario. #2 has no solution yet that I’ve found.

I’d also be very interested in finding out if there are other solutions, as you asked.

Hey @EricZinda, I think, as others have pointed out, what you refer to as a fully scoped tree is not something that DELPH-IN folks usually care about; it is an extension of the MRS (in fact, while I’m sure there are folks who work on this, I know many academic and industrial systems completely ignore scope). I recall reading about your scope resolution logic at some point, and I agree with what I take to be your point: the ERG does not attempt to resolve scope, so options 2 and 3 are relatively unhelpful for you (unless you did a lot of work on a grammar extension…). So, yes, trying to match the exact resolved scope tree for the user will probably require some additional insight on your end.

However, there is another distinction that is relevant both to DELPH-IN folks and to you: are you getting the right MRS, and not just the right parse tree? In DELPH-IN land, “parse” and “MRS” are sometimes used interchangeably when talking about the output of parsing, but parsing generates both a syntactic tree (a parse, or parse tree) and a semantic graph (the MRS). You could attack the problem from both angles: discarding trees you don’t want and MRSs you don’t want. Since the MRS is derived from the tree (it’s hard, but not impossible, to get a desired MRS from an undesired tree), and your scoped tree is derived from the MRS, I would think pruning trees you don’t care about for your domain is the lowest-hanging fruit (pun not intended), then pruning MRSs, and then working on your scope resolution logic.
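
For what it’s worth, a rough sketch of that two-stage pruning with PyDelphin might look like this (untested; 'erg.dat' is a placeholder grammar image, and the rule and predicate names are hypothetical examples you would replace for your domain):

from delphin import ace

UNWANTED_RULES = {'n-hdn_cpd_c'}  # hypothetical: tree-level rules to prune
UNWANTED_PREDS = {'compound'}     # hypothetical: MRS-level predicates to prune

def rule_names(node):
    # Recursively yield the rule/entity name of every nonterminal node in a
    # derivation tree; terminal nodes have no 'daughters' attribute.
    if hasattr(node, 'daughters'):
        yield node.entity
        for daughter in node.daughters:
            yield from rule_names(daughter)

response = ace.parse('erg.dat', 'Is the lock part of the safe?')
kept = []
for result in response.results():
    if UNWANTED_RULES & set(rule_names(result.derivation())):
        continue  # prune at the syntactic-tree level
    if UNWANTED_PREDS & {ep.predicate for ep in result.mrs().rels}:
        continue  # prune at the MRS level
    kept.append(result)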

One benefit of this method is that, as you say, finding an agreed-upon tree or MRS should be much easier than finding an agreed-upon fully resolved scope tree. For instance, you could compare the trees to the output of other syntactic parsers.

Sorry, now I’m just rambling!

@trimblet, a couple of follow-on questions:

First, could you point me to examples of academic or industrial systems that use DELPH-IN and ignore scope while doing “deep” semantic analysis? What I mean by “deep” is “a system that interprets and uses the semantics of every word in the phrase to accomplish its goal”. I contrast this with systems that cherry-pick keywords to make a best guess, for example. I’d love to see how they are accomplishing it.

Second, I’m not sure I understand what you’re getting at with your point about syntactic trees. Is this something I can look at as an output of ACE?

Thanks for the help interpreting, @guyemerson. That makes sense.

The parse diagrams I show don’t keep the order of RSTR on top consistent; I should fix that. You can tell from the “Logic:” statement at the bottom that the RSTR of the_q that contains be_v_id is actually the and() predicate, since it comes first.

  1. To meet this requirement, I suppose it depends heavily on what type of system we’re concerned with. Arguably, DELPH-IN grammars don’t do much of anything with lexical semantics, and so don’t meet that definition. Stanford’s rule-based coreference resolution system is arguably a deep system in your sense that doesn’t use scope, but maybe that’s not the type of system we’re concerned with. For modern use cases, I think your definition of a deep system is very rare, and consequently I think one is hard pressed to find positive or negative examples of scope-aware systems. Some other examples which might interest you are listed here (note: I haven’t double-checked that none of these attempt to resolve scope, but the ones I’m familiar with do not): http://moin.delph-in.net/DelphinApplications; another, more recent example meeting your criteria is @goodmami’s dissertation on MT, which in section 2.1 discusses some reasons not to handle scope.
  2. Yes! I don’t recall exactly how you’re using ACE (with PyDelphin, I think?). If so, the syntactic parse trees are called derivations. From the ACE CLI, it looks like this:
$ echo "hello world" | ace -g erg.2018.dat
SENT: hello world
[ LTOP: h0 INDEX: e2 [ e SF: prop-or-ques ] RELS: < [ unknown<0:11> LBL: h1 ARG0: e2 ARG: x4 [ x PERS: 3 NUM: sg IND: + ] ]  [ udef_q<0:11> LBL: h5 ARG0: x4 RSTR: h6 BODY: h7 ]  [ compound<0:11> LBL: h8 ARG0: e9 [ e SF: prop TENSE: untensed MOOD: indicative PROG: - PERF: - ] ARG1: x4 ARG2: x10 [ x PT: notpro ] ]  [ udef_q<0:5> LBL: h11 ARG0: x10 RSTR: h12 BODY: h13 ]  [ _hello_n_1<0:5> LBL: h14 ARG0: x10 ]  [ _world_n_of<6:11> LBL: h8 ARG0: x4 ARG1: i15 ] > HCONS: < h0 qeq h1 h6 qeq h8 h12 qeq h14 > ICONS: < > ] ;  (539 np_nb-frg_c -1.825700 0 2 (538 n-hdn_cpd_c -1.115970 0 2 (48 hello_n1 -0.876697 0 1 ("hello" 37 "token [ +FORM \"hello\" +FROM \"0\" +TO \"5\" +ID *diff-list* [ LIST *cons* [ FIRST \"0\" REST *list* ] LAST *list* ] +TNT null_tnt [ +TAGS *null* +PRBS *null* +MAIN tnt_main [ +TAG \"NN\" +PRB \"1.0\" ] ] +CLASS alphabetic [ +CASE non_capitalized+lower +INITIAL + ] +TRAIT token_trait [ +UW - +IT italics +LB bracket_null [ LIST *list* LAST *list* ] +RB bracket_null [ LIST *list* LAST *list* ] +LD bracket_null [ LIST *list* LAST *list* ] +RD bracket_null [ LIST *list* LAST *list* ] +HD token_head [ +TI \"<0:5>\" +LL ctype [ -CTYPE- string ] +TG string ] ] +PRED predsort +CARG \"hello\" +TICK + +ONSET c-or-v-onset ]")) (537 hdn_optcmp_c -0.890894 1 2 (536 n_sg_ilr -1.738718 1 2 (45 world_n1 -2.916444 1 2 ("world" 35 "token [ +FORM \"world\" +FROM \"6\" +TO \"11\" +ID *diff-list* [ LIST *cons* [ FIRST \"1\" REST *list* ] LAST *list* ] +TNT null_tnt [ +TAGS *null* +PRBS *null* +MAIN tnt_main [ +TAG \"NN\" +PRB \"1.0\" ] ] +CLASS alphabetic [ +CASE non_capitalized+lower +INITIAL - ] +TRAIT token_trait [ +UW - +IT italics +LB bracket_null [ LIST *list* LAST *list* ] +RB bracket_null [ LIST *list* LAST *list* ] +LD bracket_null [ LIST *list* LAST *list* ] +RD bracket_null [ LIST *list* LAST *list* ] +HD token_head [ +TI \"<6:11>\" +LL ctype [ -CTYPE- string ] +TG string ] ] +PRED predsort +CARG \"world\" +TICK + +ONSET c-or-v-onset ]"))))))
NOTE: 1 readings, added 373 / 40 edges to chart (20 fully instantiated, 16 actives used, 9 passives used)	RAM: 864k

The tree is the part from (539 np_nb-frg_c to )))))). You can also see the trees at erg.delph-in.net
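
And if you are using PyDelphin, something like this (an untested sketch; 'erg.dat' again stands in for a compiled grammar image) gets the derivation programmatically:

from delphin import ace

response = ace.parse('erg.dat', 'hello world')
derivation = response.result(0).derivation()     # the syntactic tree
print(derivation.entity)                         # root rule, e.g. np_nb-frg_c
print([t.form for t in derivation.terminals()])  # the surface tokens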

Thanks for the mention, but I’m not sure even my work meets the rather narrow criteria, as my system didn’t “interpret” the semantics (as in NLU), though it certainly did more than “cherry-pick keywords”. Cherry-pick semantic fragments, maybe? I worked with the full-sentence semantic representations, but then broke them down into pieces, matched them against a model, and reassembled full-sentence semantics for the target. I think that having a way to incorporate scoped readings could theoretically lead to better translations, but…

  1. I can’t recall any work that looks at scopal divergences in translation (like Dorr 1994’s list, which I put in section 1.1 of my dissertation), so I don’t know the shape of the problem
  2. I doubt any gains in translation scores would outweigh the cost in effort to make it work
  3. I doubt it would practically lead to gains; it might increase data sparsity and lead to a less powerful model
  4. It also increases the need for good scope resolvers, probably for both languages

My “reasons not to handle scope” in section 2.1 is just the standard argument about computational tractability, which is one of the desiderata for MRS in the first place.

The translation task also allowed me to ignore lexical semantics because, as long as I had decent parses for both sides, the mapping of a source to target predicate (or subgraph) already limited the word senses quite a bit.


Thanks for the pointers and details @trimblet and @goodmami. I did a pretty exhaustive read-through of anything that looked even close to what I was doing on your DelphinApplications link. The one that is closest to what I’m doing was “Packard, W. 2014. UW-MRS: Leveraging a Deep Grammar for Robotic Spatial Commands”. I guess that is not surprising, since you can think of interacting with the computer in a text-based game very much like telling a robot what to do. I’d love to know if you know of other projects like that one. I’ll certainly use it as a basis for searching for other approaches.

One of the goals of my prototype is for the player to have high trust in the game’s ability to deeply understand what they said. I want to get away from the pattern-matching type of processing that (surprisingly to me, at least) still seems to be state of the art in the Interactive Fiction (IF) world. To this end, I’ve designed it to fail if there is anything about the utterance that it doesn’t understand. I want users to get used to the idea that success means it actually worked and understood all the words and their semantics. This is for a few reasons:

  • There is a long running debate about whether having higher precision understanding of the semantics of commands will add anything to the game (and whether it may, in fact, be a distraction). I believe it will enable new types of interesting games, but we’ll see.
  • One of the commonly held criticisms of text-based games is that they devolve into “guess the verb/term” type puzzles, where the challenge is more about figuring out how to tell the computer what you want it to do than actually solving the problem. This is really annoying to players and I’d love to find a solution that avoids it. Here’s a good blog post from Emily Short that investigates some of these challenges.
  • I’d love to have an engine that is the text equivalent of the physics engines that have transformed the 3D game world: something that comes prebuilt with a great understanding of the real world and an ability to interact with it, that you can fill with your own situation as a game designer and just go!

Anyway, I’m just rambling now but thought it would clarify a bit what I’m looking for and why I’ve been bugging you guys with so many questions :slight_smile:

Regarding applications: much recent work in QA uses AMR. I am interested in text entailment first, to make the case for the DELPH-IN compositional approach from surface to semantics.