I’d like to report a partial-match DMRS comparison metric, and I understand that pydelphin’s edm module is meant for exactly that.
I am trying to use it and I am a bit confused; perhaps I am doing something wrong or misinterpreting the results.
I am trying two ways: the command line and the API. I have some experimental results, and the gold is the ERG treebanks under trunk/tsdb/gold. I am using pydelphin’s ACE wrapper to parse the profile. (On a related note, I get non-identical results from pydelphin’s Testsuite process function vs. calling Parser interact on each individual item in the profile, but that is perhaps another question.)
With edm:
Experiment 1
Parsed 11/25 sentences
Coverage: 0.44
2 same, 23 different, 0.08% exact match, 7.843531713485718 sec/sen
EDM: P = 0.2293354943273906, R = 0.06495294927702548, F = 0.10123412627436952
The EDM numbers come from calling edm through the API, where gold_mrs and results are dicts mapping item id to MRS:
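from delphin import dmrs, edm

# Convert each MRS to DMRS, then score the test list against the gold list
gold_dmrs = [dmrs.from_mrs(gm) for gm in gold_mrs.values()]
results_dmrs = [dmrs.from_mrs(r) for r in results.values()]
edm_p, edm_r, edm_f = edm.compute(gold_dmrs, results_dmrs)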
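One thing I have not ruled out: if edm.compute pairs the two sequences by position (an assumption on my part), then lists built from the .values() of two separate dicts only line up when both dicts have the same item ids in the same order. Since only 11 of 25 items parsed, results may well have fewer entries than gold_mrs. Below is a minimal sketch of aligning the two dicts by item id first, with None for missing items. The helper align_by_id is hypothetical and not part of pydelphin, and the toy strings stand in for real MRS objects:

```python
def align_by_id(gold, test):
    """Return (ids, gold_list, test_list), all ordered by item id.

    The ids are the sorted union of keys from both dicts. An item
    missing from either dict is represented by None, so position i
    in both lists always refers to the same item id.
    """
    ids = sorted(set(gold) | set(test))
    gold_list = [gold.get(i) for i in ids]
    test_list = [test.get(i) for i in ids]
    return ids, gold_list, test_list


# Toy stand-ins for the real dicts of item id -> MRS
gold_mrs = {10: "g10", 20: "g20", 30: "g30"}
results = {30: "r30", 10: "r10"}  # item 20 failed to parse

ids, golds, tests = align_by_id(gold_mrs, results)
```

The aligned lists could then be converted with dmrs.from_mrs (skipping the None entries) and passed to edm.compute. Whether None is accepted for unparsed items is something I would still need to confirm in the pydelphin documentation.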
Strangely, when I use edm through the command line (on the same gold and test profiles), I get:
Precision: 0.9384615384615385
Recall: 0.265886671254875
F-score: 0.41437254200929563
I do get warnings, both through the API and through the command line:
EDSWarning: broken handle constraint: <HCons object (h0 qeq h1) at 140114977756880>
warnings.warn(
/home/olga/delphin/parsing_with_supertagging/venv/lib/python3.8/site-packages/delphin/eds/_operations.py:77: EDSWarning: broken handle constraint: <HCons object (h0 qeq h1) at 140114977752944>
warnings.warn(
Do the warnings mean that I cannot trust the numbers until I get rid of them? Or is the command line the only reliable way to obtain the EDM metric?
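To find out how many items actually trigger the warning, one option would be to record warnings per item with Python's standard warnings module. This is only a sketch: count_warned_items and fake_convert are hypothetical names, and fake_convert stands in for whatever conversion emits EDSWarning in my setup:

```python
import warnings


def count_warned_items(items, convert):
    """Apply convert to each value and return the ids of the items
    for which at least one warning was emitted."""
    warned = []
    for item_id, value in items.items():
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")  # do not suppress repeated warnings
            convert(value)
        if caught:
            warned.append(item_id)
    return warned


# Hypothetical stand-in that warns on some inputs
def fake_convert(value):
    if value.startswith("bad"):
        warnings.warn("broken handle constraint", UserWarning)
    return value


warned_ids = count_warned_items({1: "ok", 2: "bad-a", 3: "bad-b"}, fake_convert)
```

If only a few items are affected, scoring with and without them would show how much the warnings matter for the final numbers.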