Does the Grammar Matrix main verb type support object control?

Hi everybody! In the course of my grammar development project (joint work with @arademaker), I’ve been creating various new types to handle phenomena not covered by the Grammar Matrix customization questionnaire. I’ve been reporting here on the difficulties I encountered while creating subtypes of the general types provided by the Grammar Matrix core file, matrix.tdl. Thanks to the kind help of @ebender and @olzama, these difficulties have been resolved, enabling drastic improvements in grammar coverage.
Since our first goal is to parse the MRS testsuite, the main grammatical phenomena that need to be manually implemented now relate to verb valency. I’ve already succeeded in manually implementing ditransitive verbs with prepositional objects, subject raising verbs with infinitives headed by complementizers, transitive and ditransitive subject control verbs, etc.
The implementation of object control verbs, however, was a puzzle for some time. In order to help other Grammar Matrix users, I’d like to share my experience with creating a subtype of the general type ditrans-second-arg-control-lex-item from the matrix.tdl file.
For Portuguese, I first tried the following (I explain the type noninh-refl-verb-lex in my previous topic):

ditrans-second-arg-control-verb-lex := main-verb-lex & noninh-refl-verb-lex & ditrans-second-arg-control-lex-item &
  [ SYNSEM.LOCAL [ CAT.VAL [ SPEC < >,
                             COMPS < #comp1, #comp2 >,
                             SUBJ < #subj > ],
                   CONT.HOOK.XARG #xarg ],
    ARG-ST < #subj &
             [ LOCAL.CAT [ HEAD noun &
                                [ CASE nom ],
                           VAL [ SUBJ < >,
                                 SPR < >,
                                 SPEC < >,
                                 COMPS < > ] ] ],
             #comp1 &
             [ LOCAL [ CAT [ VAL [ SPR < >,
                                   COMPS < > ],
                             HEAD noun & [ CASE acc ] ],
                       CONT.HOOK.INDEX #xarg ] ],
             #comp2 &
             [ LOCAL.CAT [ VAL [ SUBJ < unexpressed >,
                                 COMPS < >,
                                 SPR < >,
                                 SPEC < > ],
                           HEAD verb ] ] > ].
              

inf-ditrans-second-arg-control-verb-lex := ditrans-second-arg-control-verb-lex &
  [ ARG-ST.REST.REST.FIRST.LOCAL.CAT.HEAD.FORM infinitive ].

This didn’t work. I was not able to parse examples in Portuguese analogous to (1).

(1) the dog persuaded the lion to sleep

(In Portuguese, the infinitive must be headed by a complementizer, in the case at hand, a, but I abstracted away from this complexity, postponing its implementation to a later stage.)
Since the source of the problem seemed quite mysterious, I moved on to experiment with the ditrans-second-arg-control-lex-item type using one of my toy grammars of English. Example (1) was parsed, but the MRS generated was not correct:

Inspecting the main-verb-lex type created by the customization system, I spotted the source of the problem:

main-verb-lex := verb-lex & basic-verb-lex & basic-non-wh-word-lex &
  [ SYNSEM [ L-QUE -,
             LOCAL [ CAT [ HEAD.AUX -,
                           VAL [ SPEC < >,
                                 SUBJ < #subj > ] ],
                     CONT.HOOK.XARG #xarg ] ],
    ARG-ST.FIRST #subj &
                 [ LOCAL [ CAT cat-sat &
                               [ VAL [ SPR < >,
                                       COMPS < > ] ],
                           CONT.HOOK.INDEX #xarg ] ] ].    

As we can see, this type is tailored to subject control. Therefore, I adapted it to suit object control:

main-verb-lex-2 := verb-lex & basic-verb-lex & basic-non-wh-word-lex &
  [ SYNSEM [ L-QUE -,
             LOCAL [ CAT [ HEAD.AUX -,
                           VAL [ SPEC < >,
                                 COMPS.FIRST #obj ] ],
                     CONT.HOOK.XARG #xarg ] ],
    ARG-ST.REST.FIRST #obj &
                 [ LOCAL [ CAT cat-sat &
                               [ VAL [ SPR < >,
                                       COMPS < > ] ],
                           CONT.HOOK.INDEX #xarg ] ] ].

The MRS now being generated is the following:

[Image: correct-mrs]

It seems correct to me, but maybe I’ve overlooked something. Is the additional main verb type a viable solution, or is there a more elegant one?

Thanks for the thorough reporting, @leonel !

I think you’ve gone down a slightly strange path and I don’t recommend your additional main-verb type. The #xarg identity in main-verb-lex stipulates that the XARG of any given main verb (i.e. the argument that could be controlled by something else) points to the INDEX of that verb’s subject. We believe this constraint to be true across languages (things do get interesting with syntactically ergative languages, but that is not what is at issue here). This is true even when we’re talking about object control verbs: the ‘object’ in object control is the role with respect to the matrix verb. With respect to the embedded verb, it’s still the would-be subject that is controlled.

The problem you show in that broken MRS is one of over-identification, which has to do with the #xarg identity in your ditrans-second-arg-control-verb-lex type. Specifically, the INDEX of comp1 should not be identified with the XARG of the verb itself. Rather you want the INDEX of comp1 to be the XARG of comp2. And in fact, that constraint should already be provided by the type ditrans-second-arg-control-lex-item which you have as a supertype.
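
A minimal sketch of the identity described above, not the actual matrix.tdl definition: the type name is hypothetical, only the one re-entrancy at issue is shown, and lex-item is assumed as the supertype purely so that ARG-ST is available.

```tdl
; Hypothetical type, for illustration only.
; Object control: the INDEX of the first complement (the controller)
; is identified with the XARG of the second complement (the embedded
; predicate), and with nothing else.
obj-control-sketch-lex-item := lex-item &
  [ ARG-ST < [ ],
             [ LOCAL.CONT.HOOK.INDEX #controller ],
             [ LOCAL.CONT.HOOK.XARG #controller ] > ].
```

Nothing in the sketch mentions the verb’s own HOOK.XARG, so that value stays identified with the subject’s INDEX through main-verb-lex.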

@ebender, thank you for your remarks. I modified the type definition for persuade-type verbs in my Portuguese grammar following your suggestion (as far as I could understand it, being a novice to the HPSG and Grammar Matrix universe). This is the new definition:

ditrans-second-arg-control-verb-lex := main-verb-lex & noninh-refl-verb-lex & ditrans-second-arg-control-lex-item &
  [ SYNSEM.LOCAL [ CAT.VAL [ SPEC < >,
                             COMPS < #comp1, #comp2 >,
                             SUBJ < #subj > ] ],
    ARG-ST < #subj &
             [ LOCAL.CAT [ HEAD noun &
                                [ CASE nom ],
                           VAL [ SUBJ < >,
                                 SPR < >,
                                 SPEC < >,
                                 COMPS < > ] ] ],
             #comp1 &
             [ LOCAL [ CAT [ VAL [ SPR < >,
                                   COMPS < > ],
                             HEAD noun & [ CASE acc ] ],
                       CONT.HOOK.INDEX #xarg ] ],
             #comp2 &
             [ LOCAL [ CAT [ HEAD comp ],
                       CONT.HOOK.XARG #xarg ] ] > ].

The grammar generates the MRS below for example (1):

(1) a estudante convenceu o artista a matar a ratazana
the student convinced the artist to kill the rat

[Image: object-control]

This MRS seems not to be correct, however, because there’s only one event argument e2, which is shared between the verbs convencer ‘convince’ and matar ‘kill’. These two verb predicates should have their own event variables, according to the ERG analysis of the translation of (1).
I then tried the following definition:

ditrans-second-arg-control-verb-lex := main-verb-lex & noninh-refl-verb-lex & ditrans-second-arg-control-lex-item &
  [ SYNSEM.LOCAL [ CAT.VAL [ SPEC < >,
                             COMPS < #comp1, #comp2 >,
                             SUBJ < #subj > ] ],
    ARG-ST < #subj &
             [ LOCAL.CAT [ HEAD noun &
                                [ CASE nom ],
                           VAL [ SUBJ < >,
                                 SPR < >,
                                 SPEC < >,
                                 COMPS < > ] ] ],
             #comp1 &
             [ LOCAL.CAT [ VAL [ SPR < >,
                                 COMPS < > ],
                           HEAD noun & [ CASE acc ] ] ],
             #comp2 &
             [ LOCAL.CAT.HEAD comp ] > ].

The MRS seems to be identical to the previous one:

[Image: object-control-2]

The question therefore arises as to what is missing in the type definition, in order for the embedded predicate to have its own event variable.

If you have an unwanted identity, it means that there is something extra (enforcing the identity) rather than something missing in the types.

The way to debug this is to trace up the supertypes and look for constraints that concern the INDEX of the second complement…

I’d start by looking at the “expanded type” for ditrans-second-arg-control-verb-lex through the LKB (View | Expanded Type, I think) and seeing what the complement’s INDEX is identified with. Then look at each of the supertypes to see where the offending constraint seems to be coming from. (You can click on the supertypes in the expanded type view and select their expanded type.) If you find the culprit that way, you can go back to the tdl to change it…

Thanks, @ebender. I’ve found the reason for the error, which occurred only in the Portuguese grammar and did not arise in the English toy grammar. Inspecting the ancestors of the types inherited by the ditrans-second-arg-control-verb-lex type, I couldn’t discover the source of the error. But I suspected it was caused by the fact that I had modeled the embedded VP in Portuguese as a clause headed by a complementizer, following the approach of Gabriel & Müller (2008, p. 39) for Spanish, French and Italian. So I adapted the matrix.tdl type clausal-third-arg-ditrans-lex-item to handle second argument control verbs (e.g., convencer ‘convince’). This is the original matrix.tdl type:

clausal-third-arg-ditrans-lex-item := non-local-none-lex-item & one-icons-lex-item &
   [ ARG-ST < [ LOCAL [ CAT cat-sat,
                        CONT.HOOK [ INDEX ref-ind & #ind1,
                                    ICONS-KEY.IARG1 #clause ] ] ],
              [ LOCAL [ CAT cat-sat,
                        CONT.HOOK [ INDEX ref-ind & #ind2,
                                    ICONS-KEY.IARG1 #clause ] ] ],
              [ LOCAL.CONT.HOOK [ LTOP #larg,
                                  INDEX #target ] ] >,
     SYNSEM [ LOCAL.CONT [ HOOK.CLAUSE-KEY #clause,
                           HCONS.LIST < qeq & [ HARG #harg,
                                            LARG #larg ] >,
                           ICONS.LIST < [ IARG1 #clause, IARG2 #target ] > ],
               LKEYS.KEYREL [ ARG1 #ind1,
                              ARG2 #ind2,
                              ARG3 #harg ] ] ].

The only change to the above type was adding XARG #ind2 to the LOCAL.CONT.HOOK feature structure of the third argument:

cl-3rd-arg-ditrans-2nd-arg-control-lex-item := non-local-none-lex-item & one-icons-lex-item &
   [ ARG-ST < [ LOCAL [ CAT cat-sat,
                        CONT.HOOK [ INDEX ref-ind & #ind1,
                                    ICONS-KEY.IARG1 #clause ] ] ],
              [ LOCAL [ CAT cat-sat,
                        CONT.HOOK [ INDEX ref-ind & #ind2,
                                    ICONS-KEY.IARG1 #clause ] ] ],
              [ LOCAL.CONT.HOOK [ XARG #ind2,
                                  LTOP #larg,
                                  INDEX #target ] ] >,
     SYNSEM [ LOCAL.CONT [ HOOK.CLAUSE-KEY #clause,
                           HCONS.LIST < qeq & [ HARG #harg,
                                                LARG #larg ] >,
                           ICONS.LIST < [ IARG1 #clause, IARG2 #target ] > ],
              LKEYS.KEYREL [ ARG1 #ind1,
                             ARG2 #ind2,
                             ARG3 #harg ] ] ].
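
An untested alternative, sketched only to make the one-line difference explicit: since the new type differs from clausal-third-arg-ditrans-lex-item solely in the added XARG identity, it could presumably inherit from that type instead of copying it. The type name with the -alt suffix is hypothetical, and the sketch assumes the matrix.tdl definition is exactly as quoted above.

```tdl
; Hypothetical subtype, for illustration only (not tested in any grammar).
; Everything else is inherited from clausal-third-arg-ditrans-lex-item;
; the only addition is that the second argument's INDEX is also the
; XARG of the third (clausal) argument.
cl-3rd-arg-ditrans-2nd-arg-control-lex-item-alt := clausal-third-arg-ditrans-lex-item &
  [ ARG-ST < [ ],
             [ LOCAL.CONT.HOOK.INDEX #ind2 ],
             [ LOCAL.CONT.HOOK.XARG #ind2 ] > ].
```

The results reported in this post were obtained with the full definition given before this sketch, not with the sketch itself.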

The MRS now being produced for (1) seems correct, as far as the identification of the event argument of the embedded verb is concerned:

(1) a estudante convenceu o artista a matar a ratazana
the student convinced the artist to kill the rat

[Image: convencer]

However, the ICONS list differs both from the one produced for (2) and from the ERG analysis of the translation of (1), which shows no ICONS list at all:

(2) a estudante contou a o gato que o artista matou a ratazana
the student told the cat that the artist killed the rat

MRS of ditransitive verbs subcategorizing for a clausal complement, e.g., contar ‘tell’:

[Image: contar]

Is this discrepancy expected, or have I overlooked something?

MRS generated by the ERG for the student convinced the artist to kill the rat:

By the way, @arademaker, what do you think about it?

@leonel I am not sure but I think ICONS aren’t always displayed. I think some tools display them and some don’t. You probably used different tools in these two cases, right?

@olzama, the structures for the Portuguese sentences were produced by the LKB; the ERG’s analysis was generated by the online demo tool.

The tree produced for example (1) with the analysis of the main verb’s second complement as a CP:

(1) a estudante convenceu o artista a matar a ratazana
the student convinced the artist to kill the rat

[Image: tree]

I suspect the online demo simply does not visualize ICONS.

@olzama, it seems that the online demo tool doesn’t display ICONS. If I’m not mistaken, the ERG doesn’t model ICONS at all. Is that true, @arademaker? But my main question concerns the discrepancy between the ICONS lists of the two sentences:

(1) a estudante convenceu o artista a matar a ratazana
the student convinced the artist to kill the rat
(2) a estudante contou ao gato que o artista matou a ratazana
the student told the cat that the artist killed the rat

For (1), PorGram generates an MRS with ICONS: <e16 NON-FOCUS x9 e2 INFO-STR e16>, while the ICONS value for (2) reads <e2 INFO-STR e21>, according to the structures presented previously. I wonder whether this discrepancy is expected or whether it was caused by a modeling error.

It’s possible that the additional ICONS in (1) has to do with there being an unexpressed argument (the ARG1 of matar, controlled by the object of convenceu). Does that explain the particular variables involved? @sanghoun do you have any thoughts here?

Thanks, @ebender, for your suggestion. I’ve modified the type of complementizers heading infinitival clauses, specifying that the subject of their complement be < unexpressed >:

complementizer-lex-item-1 := raise-sem-lex-item & non-local-none-lex-item & basic-icons-lex-item &
  [ SYNSEM.LOCAL.CAT [ HEAD comp &
                            [ MOD < > ],
                       VAL [ SPR < >,
                             SPEC < >,
                             SUBJ < >,
                             COMPS < #comp > ] ],
    ARG-ST < #comp &
             [ LOCAL.CAT [ HEAD verb,
                           VAL [ SUBJ < unexpressed >,
                                 COMPS < > ] ] ] > ].

This type is based on the complementizer-lex-item produced by the Grammar Matrix. The change made the specification e16 NON-FOCUS x9 disappear from the MRS of all verbs subcategorizing for an infinitival complement headed by one of the prepositional complementizers a or de. Now the ICONS lists in the MRS structures of sentences with contar ‘tell’ and convencer ‘convince’ are parallel: only INFO-STR is specified.
I still wonder what the specification INFO-STR is about or whether it is necessary for convince-type verbs. The abbreviation surely means information structure, but I don’t see any linguistic reason why it shouldn’t appear in the MRS of promise-type verbs:

(1) a artista prometeu a o estudante matar a ratazana
the artist promised the student to kill the rat

I’m aware that this asymmetry in the INFO-STR specification is triggered by the fact that I’ve modeled the second complement of convince-type verbs as a clause, while the second complement of promise-type verbs is a VP.
Should this specification perhaps be dropped from the MRS of both verb types?
Any ideas about this are welcome.

I believe INFO-STR in the ICONS is an underspecified relation. I suspect that means somewhere you either want to specify it or get rid of it, but I don’t see how off the top of my head here.

Good point, @trimblet! I’d be interested to know which rule (or lex type) is responsible for adding that constraint. I wonder if it’s from an analysis that expected some subtype to further specify that ICONS (and was just responsible for linking up the two arguments to it). It’s also the same between both clauses, so I think the one that @leonel is concerned with is the NON-FOCUS one.
