When running the newer python3 version of the matrix on a choices file, there are some errors in lexicon.tdl. It attempts to merge adposition types and then produces TDL with invalid syntax, as shown:
;;; Case-marking adpositions
in-marker := case-marking-adp-lex &
[ STEM < "mia" & "mi-ng" & "mia" >,
SYNSEM.LOCAL [ CONT [ HOOK [ ICONS-KEY.IARG1 #clause,
CLAUSE-KEY #clause ],
ICONS.LIST < > ],
CAT.HEAD [ CASE in,
CASE-MARKED + ] ] ].
Here the STEM value is invalid: the three orthographies have been conjoined with & into a single list element.
I compared the python2 matrix version to the python3 version, and I found the difference that’s causing this problem:
adp_type = TDLencode(abbr + '-marker')
if adp_type in adp_type_names:
    adp_type = adp_type + '_2'
    while True:
        if adp_type not in adp_type_names:
            break
        adp_type = adp_type[:-1] + str(int(adp_type[-1])+1)
adp_type_names.append(adp_type)
typedef = \
adp_type + ' := ' + super_type + ' & \
[ STEM < "' + orth + '" > ].'
lexicon.add(typedef)
Everything from the second line (if adp_type in adp_type_names) to the line just before the typedef is set (adp_type_names.append(adp_type)) has been removed in the new version. All this snippet does is give each additional adposition type a unique name by appending an underscore and an index, which prevents it from passing the “if TDLmergeable” test that occurs later and tries to merge the types. In the new version the types have identical names, so they pass the mergeable test and are merged into a type with invalid syntax.
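To illustrate the renaming step described above, here is a minimal standalone sketch. It is not the matrix code itself: the helper name unique_type_name is hypothetical, and it keeps the index as an integer instead of rewriting the last character of the name.

```python
def unique_type_name(base, used_names):
    """Return base if it is unused, otherwise base_2, base_3, ...

    The chosen name is appended to used_names, mirroring what the
    old snippet did with adp_type_names. (Hypothetical helper.)
    """
    candidate = base
    index = 2
    while candidate in used_names:
        candidate = f"{base}_{index}"
        index += 1
    used_names.append(candidate)
    return candidate


# Three case-marking adpositions that share the abbreviation "in".
names = []
results = [unique_type_name("in-marker", names) for _ in range(3)]
```

Because every name ends up distinct, no two definitions would look mergeable by name later on.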
So a potential solution would be to just put that snippet back in, but when that snippet is in the code, this is the resulting output:
in-marker := case-marking-adp-lex &
[ STEM < "mia" >,
SYNSEM.LOCAL [ CONT [ HOOK [ ICONS-KEY.IARG1 #clause,
CLAUSE-KEY #clause ],
ICONS <! !> ],
CAT.HEAD [ CASE in,
CASE-MARKED + ] ] ].
in-marker_2 := case-marking-adp-lex &
[ STEM < "mi-ng" >,
SYNSEM.LOCAL [ CONT [ HOOK [ ICONS-KEY.IARG1 #clause,
CLAUSE-KEY #clause ],
ICONS <! !> ],
CAT.HEAD [ CASE in,
CASE-MARKED + ] ] ].
in-marker_3 := case-marking-adp-lex &
[ STEM < "mi-a" >,
SYNSEM.LOCAL [ CONT [ HOOK [ ICONS-KEY.IARG1 #clause,
CLAUSE-KEY #clause ],
ICONS <! !> ],
CAT.HEAD [ CASE in,
CASE-MARKED + ] ] ].
It works, but it seems a little “improper” to me to have these three types, as opposed to three lexical entries for the markers that all inherit from one “in-marker” type. I have no idea how to go about making an improvement like that, though. Is it worth bothering? Or should I put the old snippet back in that adds the “_2” etc. and leave it at that for now?
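For what it's worth, the alternative I have in mind could look roughly like the sketch below. This is only an illustration under my own assumptions: the function shared_type_with_entries and the "-1", "-2" entry naming are hypothetical, it builds plain strings rather than going through the matrix's lexicon.add, and I have not checked how it would interact with the code that later adds the SYNSEM constraints.

```python
def shared_type_with_entries(abbr, super_type, orths):
    """Build one shared marker type plus one entry per orthography.

    Returns a list of TDL strings. Hypothetical sketch: shared
    constraints such as CASE would belong on the first definition.
    """
    type_name = abbr + "-marker"
    definitions = [f"{type_name} := {super_type}."]
    for index, orth in enumerate(orths, start=1):
        definitions.append(
            f'{type_name}-{index} := {type_name} &\n'
            f'  [ STEM < "{orth}" > ].'
        )
    return definitions


tdl = shared_type_with_entries(
    "in", "case-marking-adp-lex", ["mia", "mi-ng", "mi-a"]
)
```

The idea is that only the STEM differs between the entries, so everything else would be stated once on the shared type.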