How to implement ditransitive verbs in a Grammar Matrix-based grammar?

Hi! In collaboration with @arademaker, I’m developing a new HPSG grammar for Portuguese, which will be distributed under a free open-source software license.
I’m using the Grammar Matrix customization questionnaire to build the first version of the grammar. I’m impressed with how much I’ve accomplished with this system in a few months, since I’m a beginner in HPSG. Practically all my previous work in grammar engineering was carried out within LFG using the Xerox Linguistic Environment.
While the customization questionnaire covers a significant portion of Portuguese syntax, the available argument structure options only cover a small number of the valency patterns of the language. For example, it doesn’t support the implementation of ditransitive verbs. Quite fortunately for HPSG beginners, the core grammar component matrix.tdl, which is generated by the system for every grammar, does contain a wider range of valency pattern types, including ditransitive-lex-item. However, the code seems overwhelmingly complex to me, and I didn’t succeed in creating a functioning subtype for ditransitive verbs in an English minigrammar I’ve used to learn the system and the underlying formalism. I guess this customization would be a trivial task for experienced developers of HPSG grammars, so I’m asking for help here.
Given the definitions of the general transitive-lex-item and its customized version transitive-verb-lex:

transitive-lex-item := non-local-none-no-hcons & basic-icons-lex-item &
   [ ARG-ST < [ LOCAL [ CAT cat-sat,
                        CONT.HOOK [ INDEX ref-ind & #ind1,
                                    ICONS-KEY.IARG1 #clause ] ] ],
              [ LOCAL [ CAT cat-sat,
                        CONT.HOOK [ INDEX ref-ind & #ind2,
                                    ICONS-KEY.IARG1 #clause ] ] ]>,
     SYNSEM [ LKEYS.KEYREL [ ARG1 #ind1,
                             ARG2 #ind2 ],
              LOCAL.CONT.HOOK.CLAUSE-KEY #clause ] ].


transitive-verb-lex := main-verb-lex & transitive-lex-item &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS < #comps >,
    ARG-ST < [ LOCAL.CAT.HEAD noun ],
             #comps &
             [ LOCAL.CAT cat-sat &
                         [ VAL [ SPR < >,
                                 COMPS < > ],
                           HEAD noun ] ] > ].

what would the type for ditransitive verbs look like? The general type for ditransitive lexical items is defined as follows:

ditransitive-lex-item := non-local-none-no-hcons & basic-icons-lex-item &
   [ ARG-ST < [ LOCAL [ CAT cat-sat,
                        CONT.HOOK [ INDEX ref-ind & #ind1,
                                    ICONS-KEY.IARG1 #clause ] ] ],
              [ LOCAL [ CAT cat-sat,
                        CONT.HOOK [ INDEX ref-ind & #ind2,
                                    ICONS-KEY.IARG1 #clause ] ] ],
              [ LOCAL [ CAT cat-sat,
                        CONT.HOOK [ INDEX ref-ind & #ind3,
                                    ICONS-KEY.IARG1 #clause ] ] ] >,
     SYNSEM [ LKEYS.KEYREL [ ARG1 #ind1,
                             ARG2 #ind2,
                             ARG3 #ind3 ],
               LOCAL.CONT.HOOK.CLAUSE-KEY #clause ] ].

Hi Leonel,

The basic ditransitive-lex-item type will give you a lexical item which has three arguments (one subject and two complements). I think for the very start, you don’t need to worry too much about the details of that type. Suffice it to say that there are three items on the ARG-ST list.

So generally, what you want to do is create subtypes of this ditransitive-lex-item which suit your needs.

If you look at how transitive and intransitive verbs are done in your grammar, you will usually see sets of types of this sort:

intransitive-verb-lex := verb-lex & intransitive-lex-item &
  [ SYNSEM.LOCAL.CAT [ VAL.COMPS < > ],
    ARG-ST.FIRST.LOCAL.CAT.HEAD noun ].

nom-intransitive-verb-lex := intransitive-verb-lex &
  [ ARG-ST.FIRST.LOCAL.CAT.HEAD noun &
                                [ CASE nom ] ].

The second of the above types was created by the case library (I think; or you could say it was created automatically by the case and the lexicon libraries), which knew from the grammar specification (the choices file) that the language has a nominative-accusative system. So it added the appropriate CASE constraint.

For transitive verbs, you will probably have, for a nom-acc language:

transitive-verb-lex := verb-lex & transitive-lex-item &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS < #comps >,
    ARG-ST < [ LOCAL.CAT.HEAD noun ],
             #comps &
             [ LOCAL.CAT cat-sat &
                         [ VAL [ SPR < >,
                                 COMPS < > ],
                           HEAD noun ] ] > ].

nom-acc-transitive-verb-lex := transitive-verb-lex &
  [ ARG-ST < [ LOCAL.CAT.HEAD noun &
                              [ CASE nom ] ],
             [ LOCAL.CAT.HEAD noun &
                              [ CASE acc ] ] > ].

Accordingly, in the lexicon file, you can have lexical entries inheriting from the more specific of the above subtypes (the example below is from my Russian grammar):

ид := nom-intransitive-verb-lex &
  [ STEM < "ид" >,
    SYNSEM.LKEYS.KEYREL.PRED "_go_v_rel" ].

I do have ditransitive verbs in one of the versions of my Russian grammar, and here’s what they look like:

да := ditran-nom-acc-dat-verb-lex &
  [ STEM < "да" >,
    SYNSEM.LKEYS.KEYREL.PRED "_give_v_rel" ].

ditran-verb-lex := verb-lex & ditransitive-lex-item & 
  [ SYNSEM.LOCAL.CAT.VAL.COMPS < #comp1, #comp2 >,
    ARG-ST < [ LOCAL.CAT.HEAD noun ],
             #comp1 &
             [ LOCAL.CAT [ VAL [ SPR < >,
                                 COMPS < > ],
                           HEAD noun ] ],
             #comp2 &
             [ LOCAL.CAT [ VAL [ SPR < >,
                                 COMPS < > ],
                           HEAD noun ] ] > ].


ditran-nom-acc-dat-verb-lex := ditran-verb-lex & 
  [ ARG-ST < [ LOCAL.CAT.HEAD noun &
                              [ CASE nom ] ],
             [ LOCAL.CAT.HEAD noun &
                              [ CASE acc ] ],
             [ LOCAL.CAT.HEAD noun &
                              [ CASE dat ] ] > ].

You could have other case constraints of course.

Does this help?

P.S.: I am not sure why some of the [ HEAD noun ] constraints are repeated in the types for intransitive and transitive verbs (in the ditransitive, I could have written it like that by hand). That looks unnecessary to me. @ebender does that look like a matrix bug to you? I just confirmed that that’s what comes out of the customization system (e.g. for the mini-japanese), not something I added manually in my Russian grammar.

Indeed, the [ HEAD noun ] information can just go on the supertypes. I’m not sure why it’s repeated on the case-bearing subtypes.

Right. I opened an issue on GitHub.

So, really, it should be like this:

nom-intransitive-verb-lex := intransitive-verb-lex &
  [ ARG-ST.FIRST.LOCAL.CAT.HEAD.CASE nom ].

etc.

So when you write a ditransitive type by hand, you don’t need to repeat the [ HEAD noun ] constraints; it seems that in the automatically customized grammars, that’s just a bug. Only include new constraints which aren’t present in the supertype.
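
Applied to the ditransitive types above, the same simplification could look like the following sketch. It assumes that ditran-verb-lex already constrains all three arguments to [ HEAD noun ], as in the definition given earlier in this thread, so the case-bearing subtype only needs to add the CASE values.

```tdl
; Sketch only: relies on ditran-verb-lex (defined earlier in the thread)
; for the HEAD noun constraints, so only CASE is added here.
ditran-nom-acc-dat-verb-lex := ditran-verb-lex &
  [ ARG-ST < [ LOCAL.CAT.HEAD.CASE nom ],
             [ LOCAL.CAT.HEAD.CASE acc ],
             [ LOCAL.CAT.HEAD.CASE dat ] > ].
```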

Thanks a lot, @olzama! Using the code from your Russian grammar, I’ve managed to implement ditransitive verbs in my English toy grammar. This has brought me a step closer to understanding the TDL syntax and the Grammar Matrix code. I’ll try to adapt the solution to Portuguese, where the recipient argument is realized by a PP headed by a (or also by para with ditransitive verbs in Brazilian Portuguese) or by a dative clitic, analogously to other Romance languages. Because it can be pronominalized by a dative clitic, this preposition has been analyzed in LFG (among other frameworks) as a dative marker.
Some two-place verbs require or allow the object to be introduced by this preposition:

(1) o cachorro obedec-eu a-o homem
the dog obey-PST;PRF;3s DAT-DEF;3ms man
‘the dog obeyed the man’
(2) o cachorro lhe=obedec-eu
the dog 3s;DAT=obey-PST;PRF;3s
‘the dog obeyed him’

Many two-place verbs subcategorizing for a dative-marked object also license accusative marking on the object and are passivizable, e.g., besides obedecer ‘obey’, ajudar ‘help’.
I’ve tried to implement the dative case analysis of (1) with the questionnaire, but it didn’t work. I’ll try it again, following Drellishak’s 2009 dissertation.
Do you have a similar construction (with two-place or three-place verbs) in your Russian grammar? (Argument realization by means of adpositions may deserve another post.)

I would suggest making a toy grammar with transitive verbs that take dative PP complements — where the dative case is contributed by a case-marking preposition. Then you can examine the tdl from this grammar to create an analogous ditransitive type.

The case marking library and the analyses it provides are described in Scott Drellishak’s dissertation:

See especially Chapter 3, section 3.2.

@ebender, thank you for your reply. I’ve succeeded in implementing transitives and ditransitives governing a prepositional object in a toy grammar of German. This grammar is an extended version of Drellishak’s fragment of German, which only handles quirky inflectional case on the object of transitive verbs, namely dative case on the object of helfen ‘help’.
First, I used the case library of the questionnaire to create the cases auf, um, an, von, mit etc., corresponding to different prepositions introducing prepositional objects. Then, I used the Lexicon page of the questionnaire to create the verb types for the transitive verbs subcategorizing for a prepositional object, e.g., auf-obj for warten ‘wait’, requiring the object to be marked with auf ‘for’.
The case library creates the type case-marking-adp-lex for case-marking adpositions, which enforces identity of CASE values between the adposition head and the governed noun head. In German, however, the verb selects a specific preposition which, in turn, selects a specific case. While auf requires the accusative with warten, mit requires the dative, e.g., with beginnen ‘start’. Therefore, I manually reformulated the type definition of case-marking-adp-lex:

case-marking-adp-lex := non-local-none-lex-item & raise-sem-lex-item &
  [ SYNSEM.LOCAL.CAT [ HEAD adp &
                            [ CASE case,
                              MOD < > ],
                       VAL [ SPR < >,
                             SUBJ < >,
                             COMPS < #comps >,
                             SPEC < > ] ],
    ARG-ST < #comps &
             [ LOCAL.CAT [ VAL.SPR < >,
                           HEAD noun &
                                [ CASE case ] ] ] > ].

Note that I have also eliminated the feature CASE-MARKED, which is explained in Drellishak’s dissertation. This feature doesn’t seem to be necessary in my approach.
For each type of preposition, I created a specific type:

dat-case-marking-adp-lex := case-marking-adp-lex & [ ARG-ST.FIRST.LOCAL.CAT.HEAD.CASE dat ].

acc-case-marking-adp-lex := case-marking-adp-lex & [ ARG-ST.FIRST.LOCAL.CAT.HEAD.CASE acc ].

Eliminating the CASE-MARKED specification in the types generated by the Matrix, I redefined the different transitive verb types created by the Matrix, e.g.:

mit-obj-verb-lex := transitive-verb-lex &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS.FIRST.LOCAL.CAT.HEAD adp & [ CASE mit ] ].

auf-obj-verb-lex := transitive-verb-lex &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS.FIRST.LOCAL.CAT.HEAD adp & [ CASE auf ] ].

In the lexicon, I also redefined the entries for the prepositions created by the Matrix, e.g.:

mit-marker_mit := dat-case-marking-adp-lex &
  [ STEM < "mit" >,
    SYNSEM.LOCAL [ CONT [ HOOK [ ICONS-KEY.IARG1 #clause,
                                 CLAUSE-KEY #clause ],
                          ICONS.LIST < > ],
                   CAT.HEAD [ CASE mit ] ] ].

auf-marker_auf := acc-case-marking-adp-lex &
  [ STEM < "auf" >,
    SYNSEM.LOCAL [ CONT [ HOOK [ ICONS-KEY.IARG1 #clause,
                                 CLAUSE-KEY #clause ],
                          ICONS.LIST < > ],
                   CAT.HEAD [ CASE auf ] ] ].

Against this background, the implementation of ditransitives governing a prepositional object was straightforward. First, I redefined @olzama’s ditransitive verb type as ditransitive-verb-lex, assigning the second complement’s HEAD the value +np (compatible with both nouns and adpositions):

ditransitive-verb-lex := verb-lex & ditransitive-lex-item &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS < #comp1, #comp2 >,
    ARG-ST < [ LOCAL.CAT.HEAD noun ],
             #comp1 &
             [ LOCAL.CAT cat-sat &
                         [ VAL [ SPR < >,
                                 COMPS < > ],
                           HEAD noun ] ],
             #comp2 &
             [ LOCAL.CAT cat-sat &
                         [ VAL [ SPR < >,
                                 COMPS < > ],
                           HEAD +np ] ] > ].

Then, I created subtypes for the different three-place verbs subcategorizing for a prepositional object, e.g., for verbs like bitten ‘ask’, which require the entity asked for to be introduced by um ‘for’:

nom-acc-um-ditransitive-verb-lex := ditransitive-verb-lex &
  [ ARG-ST < [ LOCAL.CAT.HEAD noun &
                              [ CASE nom ] ],
             [ LOCAL.CAT.HEAD noun &
                              [ CASE acc ] ],
             [ LOCAL.CAT.HEAD adp &
                              [ CASE um ] ] > ].
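
A lexical entry using this type could follow the pattern of the Russian entries earlier in the thread. The sketch below is an assumption, not taken from the actual German fragment: the entry name, the stem spelling, and the predicate name are all hypothetical.

```tdl
; Hypothetical entry: the name, STEM value, and PRED value are illustrative assumptions.
bitten := nom-acc-um-ditransitive-verb-lex &
  [ STEM < "bitten" >,
    SYNSEM.LKEYS.KEYREL.PRED "_ask_v_rel" ].
```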

The grammar behaved as expected for a test set of positive and negative sentences with one exception, which reveals a caveat of the implementation: the grammar overgenerates with the so-called Wechselpräpositionen, which license both accusative and dative, depending on the verb, e.g. erkennen an ‘recognize’ and erinnern an ‘remember’, which require dative and accusative, respectively.
I’ve tried to enforce the case of the objects of the prepositions in the lexical entries of the verbs, but my present knowledge of TDL was not sufficient. Does anyone have an idea? In LFG, this would be quite trivial, e.g., (^ OBL OBJ CASE)= acc, if I remember correctly.
A simpler solution to avoid this overgeneration seems to be to create different case variants for each of the Wechselpräpositionen, e.g., an_a and an_d for accusative and dative an, respectively. This doesn’t seem very elegant, however.
Another possible solution would be to use a PCASE or PFORM (i.e., preposition form) feature, which some people have used in LFG-based grammars. In this approach, the preposition assigns a CASE value to its object, which is also assigned to the PP and selected by the verb (along the lines of the original case-marking-adp-lex type). The verb, however, also requires its prepositional object to have a specific PCASE value, thus enforcing selection of a specific preposition.
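
A rough, untested sketch of the PFORM idea in TDL might look as follows. Everything named here is an assumption for illustration: the PFORM feature, the pform hierarchy and its values, and the types pform-adp-lex, an-adp-lex, an-dat-obj-verb-lex and an-acc-obj-verb-lex are not part of the Matrix core or of the grammar described above. The sketch also assumes that CASE is already declared on adp, and that in this variant CASE takes only inflectional values such as acc and dat.

```tdl
; Hypothetical PFORM hierarchy; these types are not in matrix.tdl.
pform := *top*.
pform-an := pform.
pform-mit := pform.

; Declare PFORM on adpositions with a type addendum.
adp :+ [ PFORM pform ].

; The adposition shares its CASE value with its object,
; so the object's case is visible on the PP.
pform-adp-lex := case-marking-adp-lex &
  [ SYNSEM.LOCAL.CAT.HEAD.CASE #case,
    ARG-ST.FIRST.LOCAL.CAT.HEAD.CASE #case ].

; One type for "an": PFORM is fixed, CASE is left to the selecting verb.
an-adp-lex := pform-adp-lex &
  [ SYNSEM.LOCAL.CAT.HEAD.PFORM pform-an ].

; Verbs selecting an + dative (cf. erkennen an).
an-dat-obj-verb-lex := transitive-verb-lex &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS.FIRST.LOCAL.CAT.HEAD adp &
                                                    [ CASE dat,
                                                      PFORM pform-an ] ].

; Verbs selecting an + accusative (cf. erinnern an).
an-acc-obj-verb-lex := transitive-verb-lex &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS.FIRST.LOCAL.CAT.HEAD adp &
                                                    [ CASE acc,
                                                      PFORM pform-an ] ].
```

Because the CASE value is shared between the PP and the preposition’s object, the verb’s CASE requirement on the PP would reach the noun, while PFORM picks out the preposition. The same constraints could be placed on the second complement of a ditransitive type.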

Thanks for reporting back, @leonel! The CASE of the complement of the adposition can be constrained by adding a constraint to the COMPS list of the adposition’s type. If the adposition always takes a noun with the same case value, this is straightforward.
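
As a minimal sketch of this suggestion, assuming the reformulated case-marking-adp-lex from the post above (where the adposition’s CASE and its complement’s CASE are not identified) and a preposition that always governs the dative, such as German mit, the type could look like this. The type name mit-adp-lex is hypothetical.

```tdl
; Hypothetical type: the adposition carries CASE mit on its own head
; and requires dative on the single element of its COMPS list.
mit-adp-lex := case-marking-adp-lex &
  [ SYNSEM.LOCAL.CAT [ HEAD.CASE mit,
                       VAL.COMPS < [ LOCAL.CAT.HEAD.CASE dat ] > ] ].
```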

Is this for Portuguese though? So the independent nouns (not clitics) actually contrast in case?

@ebender, as in French, NPs are not marked for case in Portuguese, where case survives only in the pronominal system. The pronominal clitics show a two-way case distinction in the third person (e.g. accusative masculine singular o, dative singular lhe), which collapses in the other two persons (e.g. me 1s and te 2s). Non-clitic first and second person singular personal pronouns have two forms: nominative eu and tu, and non-nominative mim and ti, used as objects of prepositions:

O cachorro correu para mim.
‘The dog ran to me.’

Since all prepositions assign the same case to their object, it was much easier to implement prepositional objects in Portuguese than it was in German, where one and the same preposition can assign different cases to its object, depending on the verb (which makes it necessary to constrain the required case of the preposition’s object in the lexical entry of the verb). Besides, in the Portuguese grammar, it wasn’t necessary to handle case in the morphology: since there are just a few forms, I’ve implemented them in the lexicon.
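
For illustration, the Portuguese pattern might be sketched along the lines of the German types above. This is a guess at the shape of the analysis, not the actual code of the Portuguese grammar: the type names and the case values dat and nonnom are assumptions, and the sketch presupposes the reformulated case-marking-adp-lex and a transitive-verb-lex whose complement is compatible with adpositions.

```tdl
; Hypothetical type for the preposition a: the PP bears dat,
; while the governed (pro)noun bears the non-nominative case.
a-case-marking-adp-lex := case-marking-adp-lex &
  [ SYNSEM.LOCAL.CAT.HEAD.CASE dat,
    ARG-ST.FIRST.LOCAL.CAT.HEAD.CASE nonnom ].

; Hypothetical type for two-place verbs like obedecer 'obey'
; that select a PP headed by a.
dat-obj-verb-lex := transitive-verb-lex &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS.FIRST.LOCAL.CAT.HEAD adp & [ CASE dat ] ].
```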
By the way, the behavior of the validation system was puzzling to me when I was trying to implement the case-marking adpositions a and de (corresponding to dative and genitive case) with the respective section of the lexicon page of the questionnaire. Unlike with the German grammar, the validation system required the prepositions to be marked as optional, displaying the following message and preventing the grammar from being created:

You have case-marking adpositions marked non-optional, but not all core cases are contributed by adpositions, inflection, or lexical types. Please account for the cases: non, acc, trans.

After I had created entries for nominative and accusative clitics, the validation system still required that I account for the “trans” case, which doesn’t appear in my list of cases. So I was forced to mark the two prepositions a and de as optional in order to create the grammar (although they aren’t optional, in the sense that their realization is required by the verbs that select them).
The Portuguese grammar fragment now handles different types of verbs with prepositional objects, including ditransitives, after I manually edited the portuguese.tdl file along the lines of the German grammar fragment implementation described in my previous comment, i.e. verbs with prepositional objects select either a or de, and both prepositions assign nonnominative case to their object. The code generated by the Grammar Matrix shows the feature CASE-MARKED in some places, which doesn’t seem to be necessary anymore:

nom-intransitive-verb-lex := intransitive-verb-lex &
  [ ARG-ST.FIRST.LOCAL.CAT.HEAD noun &
                                [ CASE nom ],
    SYNSEM.LOCAL.CAT.VAL.SUBJ < [ LOCAL.CAT.HEAD.CASE-MARKED + ] > ].

Thanks and I’m glad you’re getting this working. Regarding the mysterious “trans” case, would you be willing to share the choices file so I can try to see where that is coming from?

As for the feature CASE-MARKED, that might have to do with making sure that the prepositions are there when they are necessary, but I don’t remember all the details of the analysis. It should be documented in Drellishak’s dissertation.

@ebender, I’ve just sent you the choices file by e-mail (if you prefer, I would also be glad to grant you access to the grammar project’s private GitHub repository).
Yes, the CASE-MARKED feature is very well documented in Drellishak’s dissertation. Its purpose is to avoid redundant marking in languages with mixed marking (p. 57). Since Portuguese doesn’t fit into this category, I’ve manually removed all occurrences of this feature from the TDL files generated by the Grammar Matrix. This didn’t affect the coverage of the grammar in relation to the positive and negative test files.
While the diversity of case-marking patterns handled by Drellishak is amazing, it seems to me that the typologically more trivial patterns of case-marking adpositions represented by languages such as Portuguese, German, and French cannot be implemented, or are very difficult to implement, with the questionnaire. As I said, German is one of his test languages, but only quirky case marking of a verb’s complement with the inflectional dative was accounted for in his test choices file. Despite that, the questionnaire does provide an excellent backbone that one can manually tweak to account for the patterns of German and Portuguese.