Invalid MRS produced by the Spanish grammar

This started in the thread "MRS, DMRS not displayed in LTDB" (post #6 by goodmami).

So the Spanish grammar produces invalid MRSs.

I don’t really know why this is; is this something to do with the original version of the grammar or with the token mapping that we added?

@goodmami is saying top and list are a problem here. I think CTO and CFROM as well as CARG come from the token mapping machinery. I don’t know what WLINK is… Help, anyone? :slight_smile:

>>> from delphin.codecs import simplemrs
>>> s = """[ LTOP: h0 INDEX: e2 [ e SF: prop TENSE: untensed MOOD: indicative ] RELS: < [ named_rel<-1:-1> LBL: h4 CARG: "pitágoras" WLINK: list CFROM: *top* CTO: *top* ARG0: x3 ARG1: u9 ]  [ "_ladrar_v_rel"<-1:-1> LBL: h1 WLINK: list CFROM: *top* CTO: *top* ARG0: e2 ARG1: x3 ] > HCONS: < h0 qeq h1 > ]"""
>>> m = simplemrs.loads(s)
>>> m
[<MRS object (named _ladrar_v) at 140321187189856>]
>>> print(simplemrs.dumps(m))
[ TOP: h0 INDEX: e2 [ e SF: prop TENSE: untensed MOOD: indicative ] RELS: < [ named<-1:-1> LBL: h4 ARG0: x3 ARG1: u9 CFROM: *top* CTO: *top* WLINK: list CARG: "pitágoras" ] [ _ladrar_v<-1:-1> LBL: h1 ARG0: e2 ARG1: x3 CFROM: *top* CTO: *top* WLINK: list ] > HCONS: < h0 qeq h1 > ]
>>> from delphin import dmrs
>>> dmrs.from_mrs(m[0])
Traceback (most recent call last):
  [...]
ValueError: Invalid variable string: list
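
One way to see why this fails: if we assume, as a rough approximation of what PyDelphin expects, that every non-CARG role value must look like a variable (a type prefix followed by a numeric id, such as `h0`, `e2`, `x3`), then `list` and `*top*` cannot be parsed. The sketch below is not PyDelphin's actual code. `VARIABLE_RE`, `ROLE_RE`, and `suspicious_roles` are hypothetical names used only to illustrate the check.

```python
import re

# Assumed shape of a well-formed MRS variable: a type prefix followed by digits.
# This only approximates PyDelphin's check and is not its actual implementation.
VARIABLE_RE = re.compile(r"^[A-Za-z]+\d+$")

# Matches "ROLE: value" pairs where the value is a bare (unquoted) token,
# so quoted CARG values are skipped.
ROLE_RE = re.compile(r"\b([A-Z][A-Z0-9-]*):\s+([^\s\"\[\]<>]+)")


def suspicious_roles(ep: str) -> list[tuple[str, str]]:
    """Return (role, value) pairs in one EP whose value is not variable-like."""
    return [
        (role, value)
        for role, value in ROLE_RE.findall(ep)
        if not VARIABLE_RE.match(value)
    ]


ep = '[ "_ladrar_v_rel"<-1:-1> LBL: h1 WLINK: list CFROM: *top* CTO: *top* ARG0: e2 ARG1: x3 ]'
print(suspicious_roles(ep))
```

On the EP above this flags WLINK, CFROM and CTO, which are exactly the roles that do not appear in the ERG output.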

Currently the grammar has this in tmt.tdl:

relation :+
  [ CFROM *top*,
    CTO *top*  ].

In fundamentals.tdl:

; WLINK links semantic relation to input string elements, more or 
; less. This becomes useful whenever a grammar is used in some 
; application.

relation := avm &
  [ LBL handle,
    PRED predsort,
    WLINK list, CFROM *top*, CTO *top* ].

How should I change that?

The ERG seems similar:

; LNK links semantic relation to input string elements
; CFROM and CTO used for characterization

relation_min := *avm*.
relation := relation_min &
  [ PRED predsort,
    LBL handle,
    LNK *list*,
    CFROM *top*,
    CTO *top* ].

So I'm not sure why this is fine in the ERG but not in the SRG.

In the ERG treebanks, I am not seeing that LNK, CTO, CFROM stuff in the MRS:

RELS: < [ proper_q<0:6> LBL: h4 ARG0: x3 [ x PERS: 3 NUM: sg IND: + ] RSTR: h5 BODY: h6 ]  [ named<0:6> LBL: h7 CARG: "Abrams" ARG0: x3 ]  [ _bark_v_1<7:13> LBL: h1 ARG0: e2 ARG1: x3 ] > HCONS: < h0 qeq h1 h5 qeq h7 > 

How do I suppress it in the SRG in the same way?

The following looks relevant in the ERG’s ace/config.tdl:

mrs-deleted-roles := 
  IDIOMP LNK CFROM CTO --PSV
;; starting here, mrs deleted roles left over from old ACE config file
  WLINK PARAMS.

So, adding the following to the SRG’s ace/config.tdl and then recompiling the grammar and the LTDB database fixes the problem:

mrs-deleted-roles := 
  IDIOMP WLINK CFROM CTO --PSV
;; starting here, mrs deleted roles left over from old ACE config file
  WLINK PARAMS.

I don’t know the meaning of the last line though; not sure if it is relevant.
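
For MRS strings that were already produced without this setting, a similar clean-up could be approximated on the SimpleMRS text before parsing it. This is a rough sketch and not a replacement for the ACE option. `strip_roles` is a hypothetical helper, and it assumes the unwanted roles always have a single bare, unquoted value such as `list` or `*top*`.

```python
import re


def strip_roles(simplemrs: str, roles=("WLINK", "CFROM", "CTO")) -> str:
    """Remove 'ROLE: value' pairs for the given roles from a SimpleMRS string.

    Assumes each unwanted role has a single bare (unquoted) token as its value.
    """
    names = "|".join(re.escape(role) for role in roles)
    # Consume the role name, its value, and the whitespace that follows it.
    pattern = re.compile(rf"\b(?:{names}):\s+\S+\s*")
    return pattern.sub("", simplemrs)


ep = '[ named_rel<-1:-1> LBL: h4 CARG: "pitágoras" WLINK: list CFROM: *top* CTO: *top* ARG0: x3 ARG1: u9 ]'
print(strip_roles(ep))
```

The cleaned string can then be passed to `simplemrs.loads` as in the session above. Fixing the grammar configuration is still the better route, since it keeps every consumer of the MRS output consistent.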

It would be worth checking whether the LKB has a similar post-processing option.

I remember some discussion about the importance of consolidating the config options used by the different processors (ACE and the LKB are the only two actually in use, right?).

I wonder at which step the link information gets projected from the EP arguments to the <n:m> pair right next to the predicate name.
