ERG not generating the string used for parsing on the same MRS

I was trying to generate a string like “the unlit candle,” but I was getting “the unlighted candle.”

I used the ERG to parse “the unlit candle” and got this:

[ TOP: h0
  INDEX: e2 [ e SF: prop ]
  RELS: < [ unknown<0:16> LBL: h1 ARG: x4 [ x PERS: 3 NUM: sg IND: + ] ARG0: e2 ]
          [ _the_q<0:3> LBL: h5 ARG0: x4 RSTR: h6 BODY: h7 ]
          [ _light_v_cause<4:9> LBL: h8 ARG0: e9 [ e SF: prop TENSE: untensed MOOD: indicative PROG: bool PERF: - ] ARG1: i10 ARG2: x4 ]
          [ _un-_a_neg<4:9> LBL: h8 ARG0: i11 ARG1: e9 ]
          [ _candle_n_1<10:16> LBL: h8 ARG0: x4 ] >
  HCONS: < h0 qeq h1 h6 qeq h8 >
  ICONS: < e9 topic x4 > ]

But when I pass this exact MRS back to the ERG for generation, I still get “the unlighted candle” and never “the unlit candle.” Why doesn’t the generator produce the string that gave me this MRS in the first place?

This surprising behavior comes from a quirk in how both the ACE and LKB generators handle irregularly inflected verbs such as “light”, which has two valid variants for the same inflected form. It seems that the generator searches the file erg/irregs.tab until it finds the first entry matching the pair of stem and lexical rule, then stops and produces the associated surface form. Since irregs.tab is organized more or less alphabetically by the spelling of the inflected form, “lighted” is listed before “lit”, so the generator produces “lighted” and never goes on to find that “lit” is also a valid form.

If you really want “lit” (and “unlit”) instead of “lighted”, you can, as a hack, move the “lighted” lines in irregs.tab so that they appear below the “lit” ones. After recompiling the grammar, you’ll get “lit” but not “lighted”.
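To make the lookup behavior concrete, here is a minimal Python sketch of a first-match search over an irregulars table. It is only an illustration of the behavior described above, not the actual ACE or LKB code. The three-column "form RULE stem" layout, the rule name PAST_PART, and all function names are assumptions made for the example; check your own copy of irregs.tab for the real rule names.

```python
# Hypothetical irregs.tab excerpt. The "form RULE stem" column layout and the
# rule name PAST_PART are assumptions made for this sketch.
SAMPLE = """\
abode PAST_PART abide
lighted PAST_PART light
lit PAST_PART light
"""


def parse_entries(text):
    """Return (form, rule, stem) triples, skipping lines without three fields."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append(tuple(fields))
    return entries


def first_match(entries, stem, rule):
    """Mimic the described behavior: stop at the first matching entry."""
    for form, entry_rule, entry_stem in entries:
        if (entry_stem, entry_rule) == (stem, rule):
            return form
    return None


def all_matches(entries, stem, rule):
    """Alternative behavior: collect every valid variant."""
    return [f for f, r, s in entries if (s, r) == (stem, rule)]


def prefer(entries, form):
    """The reordering hack: a stable sort that moves entries for `form` to the
    front while leaving the relative order of all other entries unchanged."""
    return sorted(entries, key=lambda entry: entry[0] != form)


entries = parse_entries(SAMPLE)
print(first_match(entries, "light", "PAST_PART"))                 # lighted
print(all_matches(entries, "light", "PAST_PART"))                 # ['lighted', 'lit']
print(first_match(prefer(entries, "lit"), "light", "PAST_PART"))  # lit
```

With the entries in file order, the first-match search only ever returns “lighted”. After reordering, it only ever returns “lit”. A generator that used something like all_matches would offer both.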

We’ll have to discuss this issue with the developers to see if we can agree upon a spec for what the right behavior should be in these cases.