Emerson (append) lists for subject extraction

I am now looking at @guyemerson’s wrapper-grammar, which is a Matrix-generated English grammar for 567, modified to use Emerson (append) lists.

My goal is to consider using this solution for (1) my Russian development grammar, to facilitate extraction of both subject and object; (2) potentially for the whole wh-questions library, permanently changing the matrix core.

I copied the wrapper-grammar from here over to my matrix core. Everything loads and sentences generally parse, but it seems like the subject extraction rule doesn’t work yet.

basic-extracted-subj-phrase := basic-extracted-arg-phrase & head-compositional &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < >,
                           SPR < >,
                           COMPS < > ],
    HEAD-DTR.SYNSEM [ LOCAL.CAT [ VAL [ SUBJ < gap &
                                             [ LOCAL #local & local &
                                               [ CONT.HOOK.INDEX ref-ind ] ] >,
                                        COMPS olist ],
                                  MC na ],
                      NON-LOCAL.SLASH.LIST < #local > ],
    C-CONT [ RELS.LIST < >,
             HCONS.LIST < >,
             ICONS.LIST < > ] ].

I have a version of that in my Russian grammar:

my-extracted-subj-phrase := basic-extracted-arg-phrase & head-compositional &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < >,
                           SPR < > ,
                           COMPS #comps ],
    HEAD-DTR.SYNSEM [ LOCAL.CAT [ VAL [ SUBJ < gap &
                                             [ LOCAL #local & local &
                                               [ CONT.HOOK.INDEX ref-ind ] ] >,
                                        COMPS #comps ],
                                  MC na ],
                      NON-LOCAL.SLASH.LIST < #local > ],
    C-CONT [ RELS.LIST < >,
             HCONS.LIST < >,
             ICONS.LIST < > ] ].

and trying to apply it to a VP results in some unification errors. Here’s an example; this is the mother’s NON-LOCAL:

I am not yet familiar enough with Emerson lists to interpret this effectively, so I’d appreciate some help :).

What is this saying? I think it is saying that it can’t unify an empty append-list on the VP daughter with the non-empty append-list on the mother? How do I make it work though?

The rule says the sole element on its SLASH.LIST is its daughter’s LOCAL value. Good so far, in terms of types? (I expect there will be some confusion for a little while about which type is which and what should go where, as we transition.)

Oh, so, the original matrix core type looks like this:

basic-extracted-subj-phrase := basic-extracted-arg-phrase & head-compositional &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < >,
                           SPR < >,
                           COMPS < > ],
    HEAD-DTR.SYNSEM [ LOCAL.CAT [ VAL [ SUBJ < gap &
                                             [ LOCAL #local & local &
                                               [ CONT.HOOK.INDEX ref-ind ] ] >,
                                        COMPS olist ],
                                  MC na ],
                      NON-LOCAL.SLASH.LIST < #local > ],
    C-CONT [ RELS <! !>,
             HCONS <! !>,
             ICONS <! !> ] ].

This indicates perhaps that, if the original constraint was on SLASH.LIST, it should be something else with the new a-list type?

Basically, if the APPEND constraints are correct (which should hopefully be easy to check; this was the motivation for doing all of this in the first place), then the only thing you should ever need to look at is the LIST. Just ignore what’s inside the APPEND feature, because it’s just tedious bookkeeping. The unification failure under LIST shows that it’s constrained to be both empty and of length 1 at the same time. The basic-extracted-subj-phrase type clearly constrains it to be of length 1 (it’s < #local >), so something else is constraining it to be empty.
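
To make that kind of clash concrete, here is a minimal sketch of two constraints that cannot unify. Both type names are hypothetical and exist in neither the Matrix nor the wrapper grammar; headed-phrase is assumed to be the usual Matrix supertype that introduces HEAD-DTR.

```tdl
; Hypothetical types, for illustration only.

; Like basic-extracted-subj-phrase: exactly one element on the
; head daughter's SLASH.LIST.
one-slash-example := headed-phrase &
  [ HEAD-DTR.SYNSEM.NON-LOCAL.SLASH.LIST < [ ] > ].

; Some other constraint: the same list must be empty.
no-slash-example := headed-phrase &
  [ HEAD-DTR.SYNSEM.NON-LOCAL.SLASH.LIST < > ].

; A rule inheriting from both would need LIST to be cons and null
; at once, so the two types have no unifier.
```

In a real grammar the second constraint is rarely written this directly; it usually arrives through a reentrancy, which is why it is hard to spot.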


Hmm…

It has to be something to do with the head-complement rule or the transitive lexeme then, because I have no problem extracting subjects of intransitive verbs.

Yeah so when I look at the VP formed with an intransitive verb, I see that the SLASH is a 0-1-alist and it is identified with that of the SUBJ (such sentences parse):

[screenshot]

When I look at a VP formed with a transitive verb in my grammar, I see the same thing except indeed the SUBJ’s SLASH is constrained to be empty:

Will try to find what is doing that…

Here are the terminal nodes:

[screenshot]

intran-verb-lex, pointing to its SUBJ’s SLASH (this one works):

So if the above track is the right one, then the difference must lie here:

basic-one-arg := lex-item &
  [ ARG-ST < [ NON-LOCAL [ SLASH #slash,
                           REL #rel,
                           QUE #que ] ] >,
    SYNSEM.NON-LOCAL [ SLASH #slash,
                       REL #rel,
                       QUE #que ] ].


basic-two-arg := lex-item &
  [ ARG-ST < [ NON-LOCAL [ SLASH #s2,
                           REL #r2,
                           QUE #q2 ] ],
             [ NON-LOCAL [ SLASH #s1,
                           REL #r1,
                           QUE #q1 ] ] >,
    SYNSEM.NON-LOCAL [ SLASH.APPEND < #s1, #s2 >,
                       REL.APPEND < #r1, #r2 >,
                       QUE.APPEND < #q1, #q2 > ] ].

Why does the first type work (allowing subject extraction) while the other one does not?

Except, of course both of them can be observed working in declarative sentences. So, something else then, something to do with the wh-word? But then why would it work with intransitive verbs but not with transitive ones?

The first SLASH list here is underspecified.

Looking at SLASH – if it’s empty, it should be a 0-alist, and if there’s something on the list, it should be a 1-alist. A 0-1-alist is almost always wrong. (And with diff-lists, a 0-1-dlist is almost always wrong.)

Looking at SLASH.APPEND – it should be either a cons-of-alists (non-empty) or a null-of-alists (empty). So this rule isn’t appending the alists, probably because it hasn’t inherited from something it needs to.

@guyemerson Thank you! Are you referring to the second type, or the first one (or both of them)?

I was referring to the first image in the fifth post.

On further reflection, I made a mistake when I said that a 0-1-alist or a 0-1-dlist is almost always wrong, because those are the containers. It’s a 0-1-list that is almost always wrong.

The second image in the fifth post also has a suspicious APPEND (underspecified list-of-alists).

In the first image in the sixth post, I can’t see the value of [8], which is presumably displayed somewhere else.

The second image in the sixth post has a suspicious SLASH.LIST (underspecified 0-1-list) and a suspicious APPEND (underspecified list-of-alists).

I don’t think there’s any problem with the TDL in the seventh post.

To be more concrete in terms of debugging:

If the APPEND is an underspecified list-of-alists, nothing is being appended. The type needs to inherit from something that says what to append.

If a LIST is an underspecified 0-1-list (or list), the append is broken. This is either because nothing was being appended (see above), or because a list somewhere is underspecified (as discussed in the other thread).

Both of these sources of bugs are shared with diff-list appends. Switching to append-lists can’t help with that.
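
As a rough sketch of what "saying what to append" can look like on a phrase, modelled on the SLASH.APPEND < #s1, #s2 > constraint in basic-two-arg above: the type name is hypothetical, and basic-binary-phrase with a two-element ARGS is assumed from the Matrix. The wrapper grammar's own phrase types may well be set up differently.

```tdl
; Hypothetical type, for illustration only.
; The mother's SLASH is declared as the append of both daughters'
; SLASH values; only APPEND is written by hand, LIST follows from it.
append-slash-example := basic-binary-phrase &
  [ SYNSEM.NON-LOCAL.SLASH.APPEND < #s1, #s2 >,
    ARGS < [ SYNSEM.NON-LOCAL.SLASH #s1 ],
           [ SYNSEM.NON-LOCAL.SLASH #s2 ] > ].
```

A type that inherits no constraint of this kind leaves APPEND as an underspecified list-of-alists, which is the first symptom described above.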


Thank you, Guy.

Do you mean, in a particular kind of node? Say, for a top S node where SLASH is supposed to be empty, is it OK to have list-of-alists as the type of APPEND?

[screenshot]

Or I guess, you were saying, it should be null-of-alists?

Here’s a screenshot from your grammar (the 567-based English):

I should be using that as a model, right? (As for my Russian grammar from which I posted screenshots above, I guess so far all of them look suspicious, regardless of whether the sentence is a question or whether it parses.)

If the SLASH is supposed to be empty and the SLASH is supposed to be appending the daughters’ SLASHes, we should have SLASH.LIST being null and SLASH.APPEND being a cons-of-alists (which contains the daughters’ SLASHes). APPEND will only be an underspecified list-of-alists if nothing is appended – this would be expected in the leaves of the parse tree, and it would be expected if the daughters’ lists are being deliberately discarded.

If an APPEND has value null-of-alists, that means the LIST is the result of appending no lists (therefore LIST will be null). For any list that is supposed to be appending the daughters’ lists, APPEND should always be a cons-of-alists (since a non-terminal node can’t have 0 daughters).

In the last screenshot, you can see that all of the LISTs have value null (they are all empty) and all of the APPENDs have value cons-of-alists (they are appending some other lists).
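
The difference between appending nothing and saying nothing can be sketched as follows. Both type names are hypothetical, and the sketch assumes the append-list types from the wrapper grammar are loaded, so that an empty APPEND resolves to null-of-alists.

```tdl
; Hypothetical types, for illustration only.

; Appends no lists: APPEND is null-of-alists, so LIST comes out null.
empty-slash-example := lex-item &
  [ SYNSEM.NON-LOCAL.SLASH.APPEND < > ].

; Says nothing about SLASH: APPEND stays an underspecified
; list-of-alists and LIST is not computed from anything.
unconstrained-slash-example := lex-item.
```

Neither of these is what a non-terminal node that collects its daughters' lists should look like; there APPEND should be a cons-of-alists.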


This config (append of daughters and empty) makes sense to me for an instantiated feature structure (instance of subj-head from some tree) but not for the definition of subj-head itself…

Yes, I agree. I think Olga’s screenshot is of an instantiated feature structure taken from a parse tree. (@Olga please correct me if I’m wrong!)

Yes the screenshots are from specific parses (or parse chart), not just grammar rules.

OK, I decided to start from @guyemerson’s grammar which we know is doing the right thing, and try to add a subject extraction rule, a head-filler rule, and a wh-pronoun there, and see if it will work.

Here’s what I added:

wh-ques-phrase := basic-head-filler-phrase & interrogative-clause &
                  head-final &
  [ SYNSEM.LOCAL.CAT [ MC bool,
                       VAL #val,
                       HEAD verb ],
    HEAD-DTR.SYNSEM.LOCAL.CAT [ MC na,
                                VAL #val & [ SUBJ < >,
                                             COMPS < > ] ],
    NON-HEAD-DTR.SYNSEM.NON-LOCAL.QUE.LIST < ref-ind > ].
     
extracted-subj-phrase := basic-extracted-subj-phrase &
  [ SYNSEM.LOCAL.CAT.HEAD verb,
    HEAD-DTR.SYNSEM.LOCAL.CAT.VAL.COMPS < > ].

wh-word-lex := norm-ltop-lex-item & basic-icons-lex-item & 
  [ SYNSEM [ LOCAL [ CAT [ VAL [ SPR < >,
                                 SUBJ < >,
                                 COMPS < >,
                                 SPEC < > ] ],
                     CONT [ RELS.LIST < [ LBL #larg,
                                          ARG0 #arg0 ],
                                        [ PRED "which_q_rel",
                                          ARG0 #arg0,
                                          RSTR #harg ] >,
                            HCONS.LIST < [ HARG #harg,
                                           LARG #larg ] > ] ],
             NON-LOCAL.QUE.LIST < #arg0 > ] ].

wh-pronoun-noun-lex := wh-word-lex & norm-hook-lex-item &
                       non-mod-lex-item & basic-one-arg &
  [ SYNSEM [ LOCAL [ CAT.HEAD noun,
                     CONT [ HOOK.INDEX.PNG.PER 3rd,
                            RELS.LIST < [ ARG0 ref-ind ], [ ] > ] ] ] ].

wh-noun-lex := wh-pronoun-noun-lex &
  [ SYNSEM.LOCAL.CAT.HEAD.CASE nom ].

Now I can’t apply the subject extraction rule to the VP for this reason (this is the SYNSEM’s own NON-LOCAL):

[screenshot]

Any tips? Here’s basic-extracted-subj-phrase, for reference:

basic-extracted-subj-phrase := basic-extracted-arg-phrase & head-compositional &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < >,
                           SPR < >,
                           COMPS < > ],
    HEAD-DTR.SYNSEM [ LOCAL.CAT [ VAL [ SUBJ < gap &
                                             [ LOCAL #local & local &
                                               [ CONT.HOOK.INDEX ref-ind ] ] >,
                                        COMPS olist ],
                                  MC na ],
                      NON-LOCAL.SLASH.LIST < #local > ],
    C-CONT [ RELS.LIST < >,
             HCONS.LIST < >,
             ICONS.LIST < > ] ].

To clarify, I don’t understand where the conflicting constraints come from. Does it seem like something in the head-complement rule is insisting that the SLASH be null?

Hmm – I’m actually a bit surprised to see that basic-extracted-subj-phrase specifically talks about the head-daughter’s NON-LOCAL. It seems to me that with the lexical threading analysis and the constraints on gap, just saying the SUBJ is a gap should be enough. Looking at the ERG, Dan certainly doesn’t say anything about the NON-LOCAL there.

Right – and I think we even discussed this bit before… But kind of left it in “if it works, let’s not touch it” state at that point (not quite like that, but at any rate, decided not to change that at the time).

So if I do remove that constraint:

basic-extracted-subj-phrase := basic-extracted-arg-phrase & head-compositional &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < >,
                           SPR < >,
                           COMPS < > ],
    HEAD-DTR.SYNSEM [ LOCAL.CAT [ VAL [ SUBJ < gap &
                                             [ LOCAL local &
                                               [ CONT.HOOK.INDEX ref-ind ] ] >,
                                        COMPS olist ],
                                  MC na ] ],
    C-CONT [ RELS.LIST < >,
             HCONS.LIST < >,
             ICONS.LIST < > ] ].

Then half of the unification failure trying to extract a subject from a VP goes away but I still get this: