Emerson (append) lists for subject extraction

@guyemerson Thank you! Are you referring to the second type, or the first one (or both of them)?

I was referring to the first image in the fifth post.

On further reflection, I made a mistake when I said that a 0-1-alist or a 0-1-dlist is almost always wrong, because those are the containers. It’s a 0-1-list that is almost always wrong.

The second image in the fifth post also has a suspicious APPEND (underspecified list-of-alists).

In the first image in the sixth post, I can’t see the value of [8], which is presumably displayed somewhere else.

The second image in the sixth post has a suspicious SLASH.LIST (underspecified 0-1-list) and a suspicious APPEND (underspecified list-of-alists).

I don’t think there’s any problem with the TDL in the seventh post.
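To spell out the container/content distinction mentioned above, the container types presumably restrict the LIST inside the wrapper along the following lines (a sketch; the actual definitions in the append-list library should be checked):

; Presumed shape of the "container" types: they constrain the LIST
; inside the diff-list / append-list wrapper.
0-1-dlist := diff-list & [ LIST 0-1-list ].
0-1-alist := append-list & [ LIST 0-1-list ].
; A wrapper of type 0-1-alist or 0-1-dlist is therefore unremarkable;
; a computed LIST value that is still an underspecified 0-1-list is
; the suspicious thing.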

To be more concrete in terms of debugging:

If the APPEND is an underspecified list-of-alists, nothing is being appended. The type needs to inherit from something that says what to append.

If a LIST is an underspecified 0-1-list (or list), the append is broken. This is either because nothing was being appended (see above), or because a list somewhere is underspecified (as discussed in the other thread).

Both of these sources of bugs are shared with diff-list appends. Switching to append-lists can’t help with that.
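To make the first point concrete: in this encoding, “saying what to append” means a constraint along the following lines (a sketch with hypothetical type names; only the LIST/APPEND pattern matters):

; Hypothetical binary rule type that actually appends something: the
; mother's SLASH.APPEND holds the two daughters' SLASH values, so it
; is a cons-of-alists rather than an underspecified list-of-alists.
slash-appending-binary-phrase := binary-headed-phrase &
  [ SYNSEM.NON-LOCAL.SLASH.APPEND < #slash1, #slash2 >,
    HEAD-DTR.SYNSEM.NON-LOCAL.SLASH #slash1,
    NON-HEAD-DTR.SYNSEM.NON-LOCAL.SLASH #slash2 ].

A type whose SLASH should be built from its daughters, but which doesn’t inherit from something like this, will show up with an underspecified list-of-alists APPEND.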


Thank you, Guy.

Do you mean in a particular kind of node? Say, for a top S node where SLASH is supposed to be empty, is it OK to have list-of-alists as the type of APPEND?

[screenshot]

Or, I guess, you were saying it should be null-of-alists?

Here’s a screenshot from your grammar (the 567-based English):

I should be using that as a model, right? (As for my Russian grammar from which I posted the screenshots above, I guess so far all of them look suspicious, regardless of whether the sentence is a question or whether it parses.)

If the SLASH is supposed to be empty and it is also supposed to be appending the daughters’ SLASHes, we should have SLASH.LIST being null and SLASH.APPEND being a cons-of-alists (containing the daughters’ SLASHes). APPEND will only be an underspecified list-of-alists if nothing is appended – this would be expected at the leaves of the parse tree, and also wherever the daughters’ lists are being deliberately discarded.

If an APPEND has value null-of-alists, that means the LIST is the result of appending no lists (therefore LIST will be null). For any list that is supposed to be appending the daughters’ lists, APPEND should always be a cons-of-alists (since a non-terminal node can’t have 0 daughters).

In the last screenshot, you can see that all of the LISTs have value null (they are all empty) and all of the APPENDs have value cons-of-alists (they are appending some other lists).
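In TDL terms, the combination being described is the one sketched below (hypothetical type name; as the next two posts discuss, this is what you expect to see on an instantiated node in a parse tree, not necessarily in the rule definition itself):

; Schematic: SLASH is still computed by appending the daughters'
; SLASHes (APPEND is a cons), but the appended result is required
; to be empty (LIST is null).
slash-empty-node := sign &
  [ SYNSEM.NON-LOCAL.SLASH [ LIST null,
                             APPEND < [ ], ... > ] ].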


This config (append of daughters and empty) makes sense to me for an instantiated feature structure (instance of subj-head from some tree) but not for the definition of subj-head itself…

Yes, I agree. I think Olga’s screenshot is of an instantiated feature structure taken from a parse tree. (@Olga please correct me if I’m wrong!)

Yes, the screenshots are from specific parses (or the parse chart), not just from grammar rules.

OK, I decided to start from @guyemerson’s grammar, which we know is doing the right thing, try to add a subject extraction rule, a head-filler rule, and a wh-pronoun there, and see if it works.

Here’s what I added:

wh-ques-phrase := basic-head-filler-phrase & interrogative-clause &
                  head-final &
  [ SYNSEM.LOCAL.CAT [ MC bool,
                       VAL #val,
                       HEAD verb ],
    HEAD-DTR.SYNSEM.LOCAL.CAT [ MC na,
                                VAL #val & [ SUBJ < >,
                                             COMPS < > ] ],
    NON-HEAD-DTR.SYNSEM.NON-LOCAL.QUE.LIST < ref-ind > ].
     
extracted-subj-phrase := basic-extracted-subj-phrase &
  [ SYNSEM.LOCAL.CAT.HEAD verb,
    HEAD-DTR.SYNSEM.LOCAL.CAT.VAL.COMPS < > ].

wh-word-lex := norm-ltop-lex-item & basic-icons-lex-item &
  [ SYNSEM [ LOCAL [ CAT [ VAL [ SPR < >,
                                 SUBJ < >,
                                 COMPS < >,
                                 SPEC < > ] ],
                     CONT [ RELS.LIST < [ LBL #larg,
                                          ARG0 #arg0 ],
                                        [ PRED "which_q_rel",
                                          ARG0 #arg0,
                                          RSTR #harg ] >,
                            HCONS.LIST < [ HARG #harg,
                                           LARG #larg ] > ] ],
             NON-LOCAL.QUE.LIST < #arg0 > ] ].

wh-pronoun-noun-lex := wh-word-lex & norm-hook-lex-item &
                       non-mod-lex-item & basic-one-arg &
  [ SYNSEM [ LOCAL [ CAT.HEAD noun,
                     CONT [ HOOK.INDEX.PNG.PER 3rd,
                            RELS.LIST < [ ARG0 ref-ind ], [ ] > ] ] ] ].

wh-noun-lex := wh-pronoun-noun-lex &
  [ SYNSEM.LOCAL.CAT.HEAD.CASE nom ].

Now I can’t apply the subject extraction rule to the VP, for this reason (this is the SYNSEM’s own NON-LOCAL):

[screenshot]
Any tips? Here’s basic-extracted-subj, for reference:

basic-extracted-subj-phrase := basic-extracted-arg-phrase & head-compositional &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < >,
                           SPR < >,
                           COMPS < > ],
    HEAD-DTR.SYNSEM [ LOCAL.CAT [ VAL [ SUBJ < gap &
                                             [ LOCAL #local & local &
                                               [ CONT.HOOK.INDEX ref-ind ] ] >,
                                        COMPS olist ],
                                  MC na ],
                      NON-LOCAL.SLASH.LIST < #local > ],
    C-CONT [ RELS.LIST < >,
             HCONS.LIST < >,
             ICONS.LIST < > ] ].

To clarify, I don’t understand where the conflicting constraints come from. Does it seem like something in the head-complement rule is insisting that the SLASH be null?

Hmm – I’m actually a bit surprised to see that basic-extracted-subj-phrase specifically talks about the head-daughter’s NON-LOCAL. It seems to me that with the lexical threading analysis and the constraints on gap, just saying the SUBJ is a gap should be enough. Looking at the ERG, Dan certainly doesn’t say anything about the NON-LOCAL there.

Right – and I think we even discussed this bit before… but we kind of left it in an “if it works, let’s not touch it” state at that point (not quite like that, but at any rate, we decided not to change it at the time).

So if I do remove that constraint:

basic-extracted-subj-phrase := basic-extracted-arg-phrase & head-compositional &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < >,
                           SPR < >,
                           COMPS < > ],
    HEAD-DTR.SYNSEM [ LOCAL.CAT [ VAL [ SUBJ < gap &
                                             [ LOCAL local &
                                               [ CONT.HOOK.INDEX ref-ind ] ] >,
                                        COMPS olist ],
                                  MC na ] ],
    C-CONT [ RELS.LIST < >,
             HCONS.LIST < >,
             ICONS.LIST < > ] ].

Then half of the unification failure when trying to extract a subject from a VP goes away, but I still get this:

I didn’t test all aspects of my append-list version of the Matrix – and that includes the behaviour of the NON-LOCAL lists! So don’t assume it’s doing the right thing here… I know it does the right thing for RELS, HCONS, ICONS, but I won’t promise more than that without looking more closely!

There are some edge cases where converting a grammar from diff-lists to append-lists would lead to different behaviour, but these are cases where the use of diff-lists is fragile – for example, if a grammar exploits the fact that a diff-list can have extra elements on the LIST after the LAST (e.g. a 0-dlist identifies the LIST and LAST but the LIST can still be non-empty, so that diff-list appends are possible). May or may not be relevant here, but I thought I’d mention it.
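For reference, the fragile behaviour mentioned above comes from the way diff-list “emptiness” is defined (roughly as in the Matrix):

; An "empty" diff-list only identifies LIST and LAST; nothing forces
; the shared value to be null, so LIST can still carry material
; "beyond" LAST for a later diff-list append to pick up.
0-dlist := diff-list & [ LIST #last,
                         LAST #last ].
; An append-list has no such loophole: LIST null really means empty.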

@guyemerson would you be interested in looking more closely? :wink: I am not sure at this point how to proceed with this. (My ultimate goal is to use these lists to model multiple extraction).

Right now the only thing I can guess (but it is pure guessing as I don’t yet have a good understanding of the mechanics) is that the VP somehow ended up hard-constrained to be SLASH-empty… and so it cannot be the daughter of subject extraction. It sounds weird but that’s the only direction I can come up with for now.

Maybe after the ACL deadline :wink:

The specific problem that I was having here was with the types that were created in parallel to the diff-list types (0-1-alist, 1-alist). @ebender suggested that I remove them entirely for now and just have SLASH be of type append-list everywhere.
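A sketch of what that amounts to (only the SLASH declaration is shown; the real NON-LOCAL definition has more features, and the supertype here is an assumption):

; Hypothetical sketch: declare SLASH as a plain append-list rather
; than a length-restricted subtype such as 0-1-alist.
non-local := avm &
  [ SLASH append-list ].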

This fixed the problem and I can now extract the subject and form a wh-question properly. Thank you, @ebender!

(The question of how to do something like adjunct extraction without specific constraints on list length remains open for now.)

I can see one problem – I didn’t create a subtype inheriting from both 1-list and cons-copy. We would need to add 1-list-copy := 1-list & cons-copy. But then 0-1-list and list-copy will be assigned a glb type… which is probably okay, but worth checking in practice, and maybe worth defining the glb explicitly just for clarity.
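That is, roughly (the name of the explicit glb type here is just a suggestion):

; The missing subtype:
1-list-copy := 1-list & cons-copy.
; Optionally, an explicit glb of 0-1-list and list-copy:
0-1-list-copy := 0-1-list & list-copy.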

The above omission might explain the strange unification failures, because my wrapper-type grammar has the glb of 0-1-list and list-copy being null-copy. This might be enough to explain the null-copy types in your screenshots.

Could you share the grammars with and without the 0-1-alist types? I’d like to check whether anything else is going wrong.


By the way, what is the right way of saying: an append-list of length one or more?

I am finding that simply underconstraining SLASH lists is very confusing. At the very least, I probably need to specify whether the list is empty or non-empty in my filler-head rules… (see the related thread).

How about:

[ SLASH.LIST < [ ], ... > ]

?

cons is a list of length one or more, so the simplest way is: LIST cons.
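For example, mirroring the constraint above:

[ SLASH.LIST cons ]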

My impression from the related thread is that some of the confusion comes from lexical threading rather than from the behaviour of append-lists. So this may not be the right solution for your use case.