Extracting both subject and object

Here’s where I am with extracting both subject and object (for my Russian grammar).

I am considering the following analysis for (1):

(1) Kto     chto     vidit?
    Who.NOM what.ACC sees
`Who sees what?' [rus]

Right now, here’s what I have:

[Image: 2trees]

I think generally that’s good because I do want to also license (2):

(2) Chto     kto     vidit?
    what.ACC who.NOM sees
`Who sees what?' [rus]

I am assuming that gaps are filled in the order in which they were extracted?

experimental-filler-phrase := binary-phrase & phrasal &
  [ SYNSEM [ LOCAL.CAT.VAL [ 
                             COMPS < >,
                             SPR < > ] ],
    ARGS < [ SYNSEM [ LOCAL #slash & local &
			    [ CAT.VAL [ SUBJ olist,
					COMPS olist,
					SPR olist ],
			      CTXT.ACTIVATED + ],
		      NON-LOCAL.SLASH 0-alist ] ],
	   [ SYNSEM [ LOCAL.CAT [ VAL.COMPS olist ],
		      NON-LOCAL [ SLASH append-list &
					[ LIST.FIRST #slash   ],    ; NOTE THIS: is this doing what I want?
				  REL 0-alist ] ] ] > ].

What I am confused about is the order of the elements on the SLASH list and what is going on with the CASE values.

In the tree on the left, subject extraction happens first, then object extraction; in the tree on the right, vice versa. Why then, when I look at the lowest S node in both trees, is the first element of the SLASH list [CASE acc]? It looks like something is underconstrained somewhere; could that be connected to the removal of the #local identity between the gap and the head-daughter’s SLASH (again, as discussed in the thread “Emerson (append) lists for subject extraction”)?

The first element on the SLASH list in the left tree, lowest S node:

The first element on the SLASH list in the lowest S node in the right tree:

Are those local avms? That is, you want to see what the SLASH list looks like before any information is unified in.

(You are right that I was again looking at the full avm by mistake, but also:) Yes, that is also what the SLASH lists on both lower S nodes look like in the local avms. Despite the different order of the extraction rules, the first element on the list ends up being [CASE acc+part].

A shot in the dark, but is the order of the arguments to the append actually correct on both rules?

By both rules, do you mean extracted subject and extracted complement?

The extracted-comp rule says gap is first and then the rest is appended:

basic-extracted-comp-phrase := basic-extracted-arg-phrase &
                               head-compositional &
  [ SYNSEM canonical-synsem &
       [ LOCAL.CAT [ VAL [ SUBJ #subj,
                           SPR #spr,
                           COMPS #comps ],
                     MC #mc ] ],
    HEAD-DTR [ SYNSEM
               [ LOCAL.CAT [ VAL [ SUBJ #subj,
                                   SPR #spr,
                                   COMPS < gap . #comps > ],
                             MC #mc ] ] ],
    C-CONT [ RELS.LIST < >,
             HCONS.LIST < >,
             ICONS.LIST < > ] ].

The extracted-subj simply says SUBJ contains a gap:

my-extracted-subj-phrase := basic-extracted-arg-phrase & head-compositional &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ < >,
                           SPR < > ,
                           COMPS #comps ],
    HEAD-DTR.SYNSEM [ LOCAL.CAT [ VAL [ SUBJ < gap &
                                             [ LOCAL local &
                                               [ CONT.HOOK.INDEX ref-ind ] ] >,
                                        COMPS #comps ], MC na ]],
    C-CONT [ RELS.LIST < >,
             HCONS.LIST < >,
             ICONS.LIST < > ] ].

They both inherit from:

basic-extracted-arg-phrase := head-valence-phrase & head-only &
  [ SYNSEM.LIGHT - ].

and from head-compositional, which has nothing to do with ARGS.

I am also pretty sure my Head-Filler Rule is written incorrectly. That’s probably where the problem lies… I am not sure how to write it so that the first element is taken from the SLASH list. That’s what I have right now:

experimental-filler-phrase := binary-phrase & phrasal &
  [ SYNSEM [ LOCAL.CAT.VAL [ 
                             COMPS < >,
                             SPR < > ] ],
    ARGS < [ SYNSEM [ LOCAL #slash & local &
			    [ CAT.VAL [ SUBJ olist,
					COMPS olist,
					SPR olist ],
			      CTXT.ACTIVATED + ],
		      NON-LOCAL.SLASH.LIST #slashrest ] ],
	   [ SYNSEM [ LOCAL.CAT [ VAL.COMPS olist ],
		      NON-LOCAL [ SLASH append-list &
					[ LIST < #slash . #slashrest > ],
				  REL 0-alist ] ] ] > ].

On a side note: when I am inspecting the local avm for the head-filler phrases (say, the lower one, which is the middle S node), it doesn’t seem helpful as it just shows me a list as a value of the SLASH. Full avm however shows me that one item from the SLASH list of the head daughter has been discharged and now there is one more item left.

I think I am still confused about full vs. local…

Perhaps it is the gap that is currently written incorrectly, then?

gap := expressed-non-canonical &
  [ LOCAL #local,
    NON-LOCAL [ REL 0-alist,
		QUE 0-alist,
		SLASH append-list &
		    [ LIST #local ] ] ].

Perhaps APPEND should be used here somehow?

I think the SLASH value of a gap should be an append-list containing just one element, the LOCAL value of the gap itself. But I don’t know how to write that in append-list land.

Full has info unified in from outside. If you want to see the edge as it is built up, either grab it from the chart (rather than the tree) or pick local avm.
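To make the full vs. local distinction concrete, here is a minimal Python sketch (not LKB code). It assumes feature structures can be modelled as nested dicts with `None` for an unconstrained value; the `unify` helper and the example structures are hypothetical and only illustrate the idea that the local avm is the edge as its own rule built it, while the full avm also contains whatever higher nodes unified in.

```python
def unify(a, b):
    """Unify two toy feature structures (nested dicts, atomic strings, or None).

    None stands for an unconstrained value. Raises ValueError on a clash.
    Returns a new structure; the inputs are not modified.
    """
    if a is None:
        return b
    if b is None:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, value in b.items():
            out[key] = unify(a.get(key), value)
        return out
    if a == b:
        return a
    raise ValueError(f"clash: {a!r} vs {b!r}")


# Hypothetical edge as built by its own rule: what "local avm" would show.
local_edge = {"SLASH": {"FIRST": {"CASE": None}}}

# Hypothetical constraint contributed later by a higher node in the tree.
from_above = {"SLASH": {"FIRST": {"CASE": "acc"}}}

# What "full avm" would show for the same node inside the finished tree.
full_edge = unify(local_edge, from_above)

print("local:", local_edge)
print("full: ", full_edge)
```

In this toy model the local view still has an unconstrained CASE, while the full view shows `acc`, which is why the local avm is the one to inspect when checking what a rule itself contributes.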

So, to re-summarize:

For some reason, whichever order extracted-comp and extracted-subj rules are in the tree, the top rule (of the two) will have a SLASH list with the complement first and the subject second. This results in: (1) two trees for the SOV order; and (2) no trees for OSV order.

So far I don’t have an idea about what’s responsible…

Aha, here’s something interesting.

When I look at the (local) AVMs for the extracted-subj node here (see circle on the left):

[Screenshot: parse tree with the extracted-subj node circled]

What I see is the following, here’s its SLASH:

[Screenshot: SLASH value of the extracted-subj node]

But when I look at the definition of 7, I see that it is actually identified with SYNSEM.LOCAL.CAT.VAL.COMPS.FIRST.NON-LOCAL.SLASH.LIST.

I don’t immediately know why this is the case but I doubt it could be correct?

…to re-summarize again, it appears that I am not using the syntax correctly to ensure that items properly get onto the SLASH list when more than one extraction rule is used. Things do get appended, but apparently only by accident, so they are certainly not being appended properly.

Something somewhere should say that SLASH can already have something on it; either gap or the extraction rules?.. @guyemerson perhaps you could help; please let me know what you’d need from me to make it easier for you…

The “lexical threading” aka the appends in the types such as basic-two-arg etc. should be doing this for you, if the extraction rules are filling in the info that one of the arguments has a non-empty SLASH value.

Oh, right. Back to square one :)

Sorry if this is obvious, as I haven’t been following this whole multiple wh thing closely, but:

Lexical threading of SLASH means the verb’s lexical entry determines the order in which slashed arguments’ SLASH values show up on the VP’s SLASH list. The extraction rules just make one or more of those nonempty (vs empty when a hdcmp or hdsubj rule is used). You will never get a difference in the order of the elements on the SLASH list as long as pure lexical threading is used to agglomerate the arguments’ SLASHes.
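The point can be illustrated with a small Python sketch (a toy model, not TDL and not the Matrix types). All names here are hypothetical, and the threading order "complement first, subject second" is an assumption chosen only because it matches what was observed earlier in this thread. The extraction rules merely make one argument's SLASH non-empty; the order of the resulting list comes from the lexical entry alone.

```python
from itertools import permutations

# Assumption: the order in which the verb's lexical entry appends its
# arguments' SLASH lists (chosen to match the order observed in this thread).
THREADING_ORDER = ("comp", "subj")

# Hypothetical LOCAL values of the two gaps.
GAP_LOCAL = {"subj": "NP[nom]", "comp": "NP[acc]"}


def thread_slash(arg_slash):
    """Concatenate the arguments' SLASH lists in the lexically fixed order."""
    result = []
    for role in THREADING_ORDER:
        result.extend(arg_slash[role])
    return result


def apply_extractions(rule_order):
    """Apply extraction rules in the given order and return the VP's SLASH.

    Each extraction rule only marks one argument as a gap, i.e. gives that
    argument a SLASH list containing its own LOCAL value.
    """
    arg_slash = {"subj": [], "comp": []}  # canonical arguments: empty SLASH
    for role in rule_order:
        arg_slash[role] = [GAP_LOCAL[role]]
    return thread_slash(arg_slash)


for order in permutations(["subj", "comp"]):
    print(order, "->", apply_extractions(order))
```

Both rule orders give the same list in this model, which is the spurious ambiguity plus the fixed filler order described above.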

You might need to either (1) abandon lexical threading and make the SLASH list reflect the order you want, or (2) block the (currently spurious) ambiguity of extraction order and add a head_2nd_filler rule to realize the second element of the SLASH list before the first has been discharged, if the language in question actually allows that. Worse yet, I could imagine the choice between (1) and (2) might depend on language, though hopefully not! There might also be a hybrid where a rule appends something explicitly to SLASH in some cases but using the gap type on valence lists in others…

Or maybe I am looking at this wrong and giving you bad advice :-).

Best, Woodley

No it was not obvious to me, thanks, Woodley.

Right; initially I did not have the ambiguity and was thinking about a second rule, but for some reason decided that that wasn’t the right way to go. Particularly because I also have ditransitives in my grammar (although I suppose I could cut them, so to speak).

@ebender What are your thoughts on option (1)?

Thanks, @sweaglesw — that sounds indeed exactly right. The lexical threading analysis, especially when paired with phrase structure extraction rules, is slippery!

But, it is pretty deeply embedded in the system, and I think moving away from it would be a big step. Let me start with an attempt at summarizing what I think we want:

(1) No ambiguity in the order of application of extraction rules (since this is just spurious). This can probably be achieved by making the daughter of the subject extraction rule COMPS < >, which you probably want anyway to block spurious ambiguity between that rule and opt-comp.

(2) The ability to realize the extracted arguments in any order, in at least some languages. This can probably be achieved by creating a head-2nd-filler rule that grabs the second thing off that list (and a 3rd if need be…).

(3) Maybe: The ability to parameterize the order of elements on the SLASH list, for languages that allow only one order of fillers in multiply extracted sentences. What does the typological literature say on this point? If the order is never fixed, or if it is always subj > comps whenever it is fixed, then the current lexical threading should be just fine. If some languages fix it as comps > subj, and we stick with lexical threading, you’ll need to take basic-two-arg etc. out of matrix.tdl and have them be output by your library.
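Points (1) and (2) can be sketched in Python as a toy model (not TDL, and not a proposed implementation). It assumes the extraction-order ambiguity is already blocked, so there is a single SLASH list, and that the list has the complement first and the subject second, as observed in this thread. The function names `head_filler`, `head_2nd_filler`, and `count_parses` are hypothetical.

```python
def head_filler(slash, filler):
    """Discharge the first SLASH element if the filler matches it."""
    if slash and slash[0] == filler:
        return slash[1:]
    return None


def head_2nd_filler(slash, filler):
    """Discharge the second SLASH element, leaving the first in place."""
    if len(slash) >= 2 and slash[1] == filler:
        return [slash[0]] + slash[2:]
    return None


def count_parses(fillers, slash, rules):
    """Count derivations for fillers listed left to right.

    The filler closest to the verb (the last one) attaches first.
    A derivation succeeds only if SLASH is empty once all fillers attach.
    """
    if not fillers:
        return 1 if not slash else 0
    total = 0
    for rule in rules:
        rest = rule(slash, fillers[-1])
        if rest is not None:
            total += count_parses(fillers[:-1], rest, rules)
    return total


# Assumption: lexical threading yields complement first, subject second.
SLASH = ["acc", "nom"]

KTO_CHTO = ["nom", "acc"]  # Kto chto vidit?
CHTO_KTO = ["acc", "nom"]  # Chto kto vidit?

ONLY_FIRST = [head_filler]
BOTH = [head_filler, head_2nd_filler]

for name, fillers in (("Kto chto vidit", KTO_CHTO), ("Chto kto vidit", CHTO_KTO)):
    print(name, "| first-only:", count_parses(fillers, SLASH, ONLY_FIRST),
          "| with 2nd-filler:", count_parses(fillers, SLASH, BOTH))
```

With only the first-element rule, the toy model gives one derivation for the subject-first order and none for the object-first order; adding the second-element rule gives exactly one for each.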

I believe that would be the “superiority effects”. So the literature generally tends to say this exists but we’ve been planning to follow the HPSG tradition in assuming that this is not a purely syntactic phenomenon.

And yes, if it does exist, it should be subj > comp, I am pretty sure.

Okay, then I vote for head-2nd-filler :)