Constraining a unary rule to only apply to temporal expressions

In the Spanish Resource Grammar, there is a rule whose purpose is clearly to apply only to phrases like hace tres semanas (and similar), which means “for three weeks” (I have been here for three weeks, and similar).

pp_npadv_phrase := basic-npadv-phrase &
  [  SYNSEM.LOCAL.CAT.HEAD.MOD < [ LOCAL.CAT.HEAD verb ] >,
     ARGS < [ SYNSEM.LOCAL [ CAT [ HEAD verb & 
                                        [ VFORM fin ] ],
			     CONT [ HOOK.INDEX event,
                                    RELS.LIST.FIRST arg2-ev-relation & 
                                                    [ PRED "_hacer_v_rel",
                                                      ARG2 #arg2 & [ SORT tmp ] ] ] ] ] >,
    C-CONT.RELS <! prep-relation &
                   [ ARG2 #arg2 ] !> ]
"""
; hace tres semanas ...
""".

The constraints on ARGS make it clear that the rule is meant for this one construction (the comment says the same, but the ARGS constraints leave no doubt).

However, this rule causes massive overgeneration of edges and also of derivations. So something in it is not working: it happily applies in many cases where no PRED "_hacer_v_rel" or SORT tmp is involved.

For example, the one-word sentence imaginense (Imagine!) gets edges like these:

[Screenshot from 2024-02-29 16-42-12: chart edges for imaginense]

The top node there is the pp_npadv rule, applied to optsubs, applied to optcomp, applied to the adverb extraction rule, applied to the string of lexical rules.

I am assuming that the problem is some mess with the RELS list? If I look at the pp_npadv rule separately, I indeed see that it expects, as its daughter:

If I look at the daughter of the ADV rule in the edge and the lower daughters, I don’t see any RELS list until I reach the lexical rules:

Extracted adverb rule:

The highest lexical rule:

[Screenshot from 2024-02-29 16-55-28]

So, perhaps the extracted adverb rule loses the RELS and that messes up the grammar?

Here are the relevant types from the grammar:

basic-extracted-adj-phrase := head-mod-phrase & head-only & phrasal & 
  [ SYNSEM [ LOCAL.CAT [ POSTHEAD #ph,
                         MC #mc,
                         VAL.SPR #spr ],
	     NON-LOCAL.SLASH 1-dlist &
                             <! [ COORD -,
                                  CAT [ HEAD.MOD < [ LOCAL intersective-mod &
                                                           [ CAT [ POSTHEAD #ph,
                                                                   MC #mc,
                                                                   HEAD #head,
                                                                   VAL #val ],
                                                             CONT.HOOK #hook,
                                                             CTXT #ctxt ] ] >,
                                        VAL [ SUBJ olist,
                                              COMPS olist ] ] ] !> ],
    HEAD-DTR.SYNSEM [ LIGHT +,
                      LOCAL [ CAT [ POSTHEAD #ph,
                                    MC #mc,
                                    HEAD #head,
                                    VAL #val & [ SPR #spr ] ],
                              CONT.HOOK #hook,
                              CTXT #ctxt ],
                      NON-LOCAL.SLASH 0-dlist ],
    C-CONT [ HOOK #hook,
	     RELS <! !>,
	     HCONS <! !> ] ]
"""
; verbal adjunct are extracted before canceling the complement because "ayer cumplió usted años"
""".

extadj-v_phrase := basic-extracted-adj-phrase &
  [ SYNSEM [ SLSHD -, 
             LOCAL.CAT [ HEAD.KEYS.KEY v_event_rel,
                         VAL.SUBJ #subj ],
             NON-LOCAL.SLASH <! [ CAT.HEAD prep_or_modnp & 
                                           [ CASE not-nom,
                                             MOD < [ LOCAL.CAT.HEAD verb ] > ] ] !> ],
    HEAD-DTR.SYNSEM [ MODIFIED notmod,
                      LOCAL.CAT [ HEAD verb & [ AUX -, MOD < > ],
                                  VAL.SUBJ #subj ] ] ].

Comparing it to the similar Matrix types, I don’t see any missing constraints related to RELS. So, does it seem like I am on the right track?

Several issues arise here.
(1) The pp_npadv_phrase rule wrongly (and fatally) tries to constrain its daughter by requiring that the first element of its RELS list be an EP with a specific PRED value. This will never work: for efficiency, the ACE parser packs edges, and packing necessarily ignores the RELS lists of the edges being packed. So you rightly won’t see anything in the RELS lists of edges in the chart, which are all packed. The rule is simply badly constructed and is doomed to overgenerate with either the LKB or ACE when parsing with packing, which both always do, because parsing would be far too inefficient otherwise. (You can of course turn packing off explicitly to debug something in the grammar, but at a high cost in time and space.) You will need to redesign this rule, but comment it out for now, since it will always apply to every verb phrase.
(2) You say one of the rules in your derivation tree is an adverb extraction rule, which presumably produces a phrase with a non-empty SLASH value. But your root constraints (which determine legal outputs from the parser) should all require that SLASH be empty, so it would seem that maybe the pp_npadv_phrase rule (or more generally the basic-npadv-phrase type) neglects to propagate its daughter’s SLASH value up to the mother. You might check that parent phrase type, which you will want to be right for its other instances even after you comment out the pp_npadv_phrase rule.
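The packing point in (1) can be illustrated with a toy sketch. It is only a simplified model, not how ACE or the LKB actually represent or unify feature structures: feature structures are nested Python dicts, atomic values are plain strings with no type hierarchy, and "ignoring RELS" is modelled as deleting the RELS feature before unification. The names unify, restrict and the PRED "_imaginar_v_rel" are hypothetical and exist only for this illustration.

```python
# Toy model only: nested dicts stand in for feature structures.

def unify(a, b):
    """Return the unification of two toy feature structures, or None on a clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, val in b.items():
            if key in out:
                sub = unify(out[key], val)
                if sub is None:
                    return None
                out[key] = sub
            else:
                out[key] = val
        return out
    # Atomic values (or a dict against an atom) must be identical to unify.
    return a if a == b else None

def restrict(fs, dropped=("RELS",)):
    """Copy fs with the listed features removed at every level (toy restrictor)."""
    if not isinstance(fs, dict):
        return fs
    return {k: restrict(v, dropped) for k, v in fs.items() if k not in dropped}

# What the rule demands of its daughter (heavily simplified from pp_npadv_phrase).
daughter_constraint = {
    "HEAD": "verb",
    "RELS": {"FIRST": {"PRED": "_hacer_v_rel"}},
}

# A hypothetical edge for "imaginense"; the PRED name is made up.
imaginar_edge = {
    "HEAD": "verb",
    "RELS": {"FIRST": {"PRED": "_imaginar_v_rel"}},
}

# With RELS visible, the two PRED values clash and the rule is blocked.
full = unify(daughter_constraint, imaginar_edge)
# With RELS removed from the edge, nothing is left to clash with.
packed = unify(daughter_constraint, restrict(imaginar_edge))

print("blocked with RELS visible:", full is None)
print("blocked with RELS restricted:", packed is None)
```

In this model the PRED constraint only has an effect when the daughter's RELS is still present; once it is restricted away, any finite verb satisfies the rule, which matches the overgeneration described in the thread.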

Thank you, @Dan ! That’s what I suspected (that it doesn’t even work as intended). I will disable it.

The basic np-adv phrase does copy up all nonlocal features, but the edge above doesn’t result in a parse either. It’s one of those edges that never result in anything.

Going back to the original rule, how would you design something like “hace tres semanas”? I suppose I can create a lexical entry that is a preposition, and then that could form a normal PP modifier. Freeling analyzes “hace” as a verb in all cases but I can override that.

Update: I have added such a lexical entry, and it works, however it slows down the parsing a lot. Probably because it creates ambiguity with a very frequent verb “hacer”…

It turned out that the performance drop was simply because I had commented out the deleted daughters setting in ACE’s config file.

So, I implemented the following solution:

  1. removed the broken rule pp_npadv_phrase altogether;

  2. added the following lexical entry:

hace_p := p_np_i-tmp-vm_native_le & 
[ STEM < "hacer" > ]
"""
"hace tres semanas"
""".
  3. added the following exception to the SRG-Freeling interface (srg-freeling.dat), in the ReplaceAll section:
hace hace SP hacer VMIP3S0

This just means that any occurrence of hace will be given both the SP and VMIP3S0 tags (preposition and verb, 3sg present), and Freeling’s actual output will be ignored.
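To make the reading of that line concrete, here is a small sketch that splits it into a surface form and its analyses. The layout "form lemma tag lemma tag ..." is an assumption inferred from this single example, not a documented specification of srg-freeling.dat, and parse_replace_all_entry is a hypothetical helper, not part of the SRG or Freeling.

```python
def parse_replace_all_entry(line):
    """Split 'form lemma1 tag1 lemma2 tag2 ...' into (form, [(lemma, tag), ...]).

    The field layout is assumed from the single 'hace' example in this thread.
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty entry")
    form, rest = fields[0], fields[1:]
    if not rest or len(rest) % 2 != 0:
        raise ValueError(f"expected lemma/tag pairs after the form: {line!r}")
    # Pair up alternating lemma and tag fields.
    analyses = list(zip(rest[0::2], rest[1::2]))
    return form, analyses

form, analyses = parse_replace_all_entry("hace hace SP hacer VMIP3S0")
for lemma, tag in analyses:
    print(form, "->", lemma, tag)
```

Under that assumption, the entry gives hace two analyses: the preposition (lemma hace, tag SP) and the verb (lemma hacer, tag VMIP3S0).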

This results in better analyses for constructions with hace and in about a 25% performance improvement on TIBIDABO sections 5 and 6. I did not test on the other sections.

Your solution looks reasonable, and the performance improvement is great. Maybe you can find one or two more such cases :).
