Cannot mrs-compare things with CARG time

We are using PyDelphin’s compare command (the compare function in delphin.commands) to compare profiles, and we are hitting the following problem:

delphin compare ~/delphin/GAUSS/treebanks/experiment/test-full-forest/test-gender-overt/test-long/cow10_test/ ~/delphin/GAUSS/treebanks/experiment/test-1flag/test-gender-overt/test-long/cow10_test/
1529020	<1,0,1>

  line 1, character 951
    [ LTOP: h0 INDEX: e2 [ e SF: prop TENSE: untensed MOOD: indicative ] RELS: < [ focus_d_rel<-1:-1> LBL: h1 ARG0: e4 [ e SF: prop TENSE: untensed MOOD: indicative ] ARG1: e2 ARG2: x5 [ x GEND: m ] ]  [ _su_q_rel<-1:-1> LBL: h6 ARG0: x5 RSTR: h7 BODY: h8 ]  [ poss_rel<-1:-1> LBL: h9 ARG0: i10 ARG1: x5 ARG2: x11 ]  [ pronoun_q_rel<-1:-1> LBL: h12 ARG0: x11 RSTR: h13 BODY: h14 ]  [ pron_rel<-1:-1> LBL: h15 ARG0: x11 ]  [ ord_rel<-1:-1> LBL: h9 CARG: "1" ARG0: e17 [ e SF: prop TENSE: untensed MOOD: indicative ] ARG1: x5 ]  [ "_esposo_n_rel"<-1:-1> LBL: h9 ARG0: x5 ARG1: x18 ]  [ _be_v_id_rel<-1:-1> LBL: h19 ARG0: e2 ARG1: x20 ARG2: x5 ]  [ named_rel<-1:-1> LBL: h21 CARG: "kaynette_williams" ARG0: x20 ARG1: u23 ]  [ _durante_p_temp_rel<-1:-1> LBL: h19 ARG0: i24 ARG1: e2 ARG2: x25 [ x GEND: m ] ]  [ undef_q_rel<-1:-1> LBL: h26 ARG0: x25 RSTR: h27 BODY: h28 ]  [ card_q_rel<-1:-1> LBL: h29 ARG0: i30 ARG1: x25 ]  [ time_n_rel<-1:-1> LBL: h29 CARG: string ARG0: x25 ARG1: u32 ] > HCONS: < h0 qeq h1 h7 qeq h9 h13 qeq h15 h27 qeq h29 > ]
MRSSyntaxError: expected: a string

It looks to me like the problem may be in the CARG value in the MRS for the word anno (year):

[Screenshot from 2024-06-06 15-05-46]

Could this be the problem? If so, on which side: the grammar’s or PyDelphin’s? Does this CARG value look valid? (I have not been able to track down where it gets set, by the way, but I am sure I could if I really needed to.)

It looks like you tracked it down pretty well. This isn’t a bug in PyDelphin, but an intentional decision to expect CARG values to be double-quoted strings. I don’t think this is wrong, and it hasn’t been a problem so far, but I notice that the MrsRfc wiki differs in its grammar of SimpleMRS:

Value        := Token | QuotedString
Carg         := "CARG" ":" Value

So we could question whether PyDelphin should allow non-string values. Whether or not we think grammars should produce non-string CARG values, it might make sense for PyDelphin to accept them, since I don’t think ACE or the LKB are this strict. One complication is whether unquoted symbol values and quoted string values should be treated differently. For example, case sensitivity: are CARG: time and CARG: TIME equivalent? It also means we’d need to remember which form was read so that the value is serialized the same way again on encoding.
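To make the round-trip concern concrete, here is a minimal sketch of reading a value under the wiki's Value := Token | QuotedString rule while remembering which form it had. This is not PyDelphin's implementation; the CargValue class and parse_carg_value function are hypothetical names, and treating unquoted tokens as case-sensitive is an assumption made only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CargValue:
    """Hypothetical holder for a CARG value that remembers its surface form."""
    text: str
    quoted: bool

    def serialize(self) -> str:
        # Quoted values get their quotes (and escapes) back; tokens stay bare.
        if self.quoted:
            escaped = self.text.replace('\\', '\\\\').replace('"', '\\"')
            return f'"{escaped}"'
        return self.text


# Either a double-quoted string (with backslash escapes) or a bare token.
_VALUE = re.compile(r'"((?:[^"\\]|\\.)*)"|([^\s"\]>]+)')


def parse_carg_value(s: str) -> CargValue:
    m = _VALUE.fullmatch(s.strip())
    if m is None:
        raise ValueError(f'not a valid CARG value: {s!r}')
    if m.group(1) is not None:
        # Undo backslash escapes inside the quoted string.
        return CargValue(re.sub(r'\\(.)', r'\1', m.group(1)), quoted=True)
    return CargValue(m.group(2), quoted=False)
```

With this design, CARG: time and CARG: "time" compare as different values, and so do CARG: time and CARG: TIME. Whether that is the right behavior is exactly the open question.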

On your side of things, do you think that time is a valid value for the CARG of the time_n predication? Does that capture the meaning of the sentence? To me it looks like an underspecified argument value, like maybe token mapping didn’t work correctly.
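If it helps with tracking this down, the following rough sketch lists the predications in a SimpleMRS string whose CARG value is not double-quoted. It uses only regular expressions from the standard library and does not call PyDelphin. The function name unquoted_cargs is made up, and the pattern assumes predicates are written with a <from:to> span as in the output above, so treat it as a quick diagnostic rather than a real parser.

```python
import re

# Matches either a predicate with its <from:to> span, or a CARG role and value.
_TOKEN = re.compile(
    r'(?P<pred>"?[^\s"<\[]+"?)<-?\d+:-?\d+>'
    r'|CARG:\s+(?P<carg>"(?:[^"\\]|\\.)*"|[^\s\]]+)'
)


def unquoted_cargs(mrs: str) -> list:
    """Return (predicate, value) pairs whose CARG value is a bare token."""
    found = []
    pred = None
    for m in _TOKEN.finditer(mrs):
        if m.group('pred') is not None:
            # Remember the most recent predicate seen before a CARG.
            pred = m.group('pred')
        elif not m.group('carg').startswith('"'):
            found.append((pred, m.group('carg')))
    return found


sample = (
    '[ named_rel<-1:-1> LBL: h21 CARG: "kaynette_williams" ARG0: x20 ] '
    '[ time_n_rel<-1:-1> LBL: h29 CARG: string ARG0: x25 ARG1: u32 ]'
)
print(unquoted_cargs(sample))
```

Running it over the mrs field of each result in a profile should show which predications carry a bare value, which may narrow down where in the grammar it comes from.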


You are right, it’s probably an underspecified value, which is why I also couldn’t track down how it got set. I will look into it. Thanks, Michael!