I have two functions that use pydelphin to parse the items in a tsdb profile. The goal is to compare the resulting MRSs with the gold MRSs (from the ERG tsdb/gold).
The two functions give me different exact-match counts. Could someone help me spot a bug or offer a possible explanation?
Method 1, with TestSuite.process:
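from delphin import ace, itsdb
from delphin.codecs import simplemrs

def run_ace_on_ts(tsuite, grammar, ace_exec, cmdargs, output_path):
    ts = itsdb.TestSuite(tsuite)
    # Parse every item in the profile with ACE; ACE's stderr goes to a log file.
    with open(output_path + '/ace_err.txt', 'w') as errf:
        with ace.ACEParser(grammar, cmdargs=cmdargs, executable=ace_exec, stderr=errf) as parser:
            ts.process(parser)
    id2mrs = {}
    # Pair the i-th row of 'result' with the i-th row of 'item'.
    for i, res in enumerate(ts['result']):
        id = ts['item'][i]['i-id']
        id2mrs[id] = simplemrs.decode(res['mrs'])
    return id2mrs, len(ts['item'])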
This gives me, for a particular experiment:
Parsed 11/25 sentences
2 same, 23 different, 0.08% exact match, 7.843531713485718 sec/sen
Method 2, with ACEParser.interact, for each item separately:
from delphin import ace, itsdb, mrs
from delphin.codecs import simplemrs

def run_ace(tsuite, grammar, ace_exec, cmdargs, output_path, id2gold_mrs):
    ts = itsdb.TestSuite(tsuite)
    id2mrs = {}
    items = list(ts['item'])
    responses = []
    no_result = []
    with open(output_path + '/ace_err.txt', 'w') as errf:
        with ace.ACEParser(grammar, cmdargs=cmdargs, executable=ace_exec, stderr=errf) as parser:
            # Send each item's input string to ACE one at a time.
            for item in items:
                response = parser.interact(item['i-input'])
                if len(response['results']) == 0:
                    no_result.append(item['i-input'])
                    print('*** No parse. ***')
                else:
                    responses.append(response)
                    id = item['i-id']
                    # Keep only the first (top-ranked) result for this item.
                    id2mrs[id] = simplemrs.decode(response['results'][0]['mrs'])
                    if id in id2gold_mrs:
                        if not mrs.is_isomorphic(id2gold_mrs[id], id2mrs[id]):
                            print('*** Different MRS ***')
                        else:
                            print('*** Same MRS ***')
    print("Parsed {}/{} sentences".format(len(responses), len(items)))
    return id2mrs, len(items)
This way I get 5 exact matches instead of 2 (note the DMRSWarning in the output):
Parsed 11/25 sentences
/home/olga/delphin/parsing_with_supertagging/venv/lib/python3.8/site-packages/delphin/dmrs/_operations.py:81: DMRSWarning: unusable TOP: h0
warnings.warn(f'unusable TOP: {top_var}', dmrs.DMRSWarning)
5 same, 20 different, 0.2% exact match, 7.885234594345093 sec/sen
The gold MRSs are definitely the same in both cases (they are loaded by the same code from the same location).
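One thing I have not ruled out: Method 1 pairs ts['result'][i] with ts['item'][i] by position. Only 11 of the 25 items parsed, and an item can have more than one row in the result table, so the i-th result row need not belong to the i-th item. The sketch below is only an illustration of that suspicion. It does not call pydelphin; the lists of dictionaries are hypothetical stand-ins for rows of the item, parse, and result tables, and the MRS strings are made-up placeholders.

```python
# Hypothetical stand-ins for rows of the 'item', 'parse', and 'result' tables.
items = [{'i-id': 10}, {'i-id': 20}, {'i-id': 30}]
parses = [
    {'parse-id': 0, 'i-id': 10},
    {'parse-id': 1, 'i-id': 20},
    {'parse-id': 2, 'i-id': 30},
]
# Item 10 has no result, item 20 has two readings, item 30 has one.
results = [
    {'parse-id': 1, 'result-id': 0, 'mrs': 'mrs-20-a'},
    {'parse-id': 1, 'result-id': 1, 'mrs': 'mrs-20-b'},
    {'parse-id': 2, 'result-id': 0, 'mrs': 'mrs-30-a'},
]


def pair_by_position(items, results):
    """Pair the i-th result row with the i-th item row, as Method 1 does."""
    return {items[i]['i-id']: res['mrs'] for i, res in enumerate(results)}


def pair_by_parse_id(parses, results):
    """Join result rows to items via parse-id, keeping the first row per item.

    Assumes result rows for one parse appear in result-id order.
    """
    parse_to_item = {p['parse-id']: p['i-id'] for p in parses}
    paired = {}
    for res in results:
        item_id = parse_to_item[res['parse-id']]
        paired.setdefault(item_id, res['mrs'])
    return paired


by_position = pair_by_position(items, results)
by_parse_id = pair_by_parse_id(parses, results)
print(by_position)
print(by_parse_id)
```

In this toy data, positional pairing assigns item 10 an MRS that belongs to item 20, while the join leaves item 10 without an MRS. If the same thing happens in my profile, Method 1 would be comparing some MRSs against the gold for the wrong item, which might account for the lower count.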