Full forest vs -1 wrt treebanking

(Not sure what the right category here would be.)

If I have a profile that was parsed using the -1 option (so, to get the best parse according to the parse ranking model), is it possible to treebank it? It could well contain parses which I don’t want. FFTB requires the full forest option which I think is not compatible with the -1 option.

Perhaps I should be able to treebank using LKB->Annotate? (I am not able to but maybe that’s for an unrelated reason?)

And as for comparing a profile treebanked with FFTB with one that was parsed with the -1 option: I know I can compare the MRSs using pydelphin but suppose I am interested in the derivation? Suppose I simply want to know how close the profile yielded by the parse ranking model is to the one where I manually accepted/rejected things?

G’day,

> If I have a profile that was parsed using the -1 option (so, to get the best parse according to the parse ranking model), is it possible to treebank it? It could well contain parses which I don’t want. FFTB requires the full forest option which I think is not compatible with the -1 option.

No. You haven’t stored the other trees so you can’t select between them. You would need to reparse.

> Perhaps I should be able to treebank using LKB->Annotate? (I am not able to but maybe that’s for an unrelated reason?)
>
> And as for comparing a profile treebanked with FFTB with one that was parsed with the -1 option: I know I can compare the MRSs using pydelphin but suppose I am interested in the derivation? Suppose I simply want to know how close the profile yielded by the parse ranking model is to the one where I manually accepted/rejected things?

You could use pydelphin to compare derivation trees.

But is there an API for that? That’s my question… I didn’t find any.

Derivations can be tested for equality:

>>> from delphin import itsdb
>>> ts = itsdb.TestSuite("tmp")
>>> item = next(ts.processed_items())
>>> d0 = item.result(0).derivation()
>>> d1 = item.result(1).derivation()
>>> d0 == d0
True
>>> d0 == d1
False

Unfortunately, special methods like __eq__() don’t get automatically documented, and I did not always explain what equality meant in the high-level documentation. You can check the docstrings or code of the involved classes, though:

UDFToken

UDFTerminal

UDFNode

Since comparison does not include parse-specific things like IDs and scores, you should be able to see if two derivations from separate runs are the same. But this is still just boolean equality and doesn’t tell you what the differences are.
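To turn that per-item equality check into a rough "how close is the ranked profile to the treebanked one" number, something like the following might work. It is only a sketch: `derivation_agreement` is a hypothetical helper, not part of pydelphin, and it assumes you have already collected one derivation per item from each profile into a dict keyed by item identifier (for example via `item.result(0).derivation()` as above, assuming the first result in each profile is the tree you care about). Nested tuples stand in for real derivation objects here, since only `==` is used.

```python
from typing import Any, Dict, List, Tuple


def derivation_agreement(
    gold: Dict[Any, Any], ranked: Dict[Any, Any]
) -> Tuple[int, int, List[Any]]:
    """Count items whose derivations are equal in both mappings.

    Hypothetical helper (not a pydelphin function). Both arguments map
    an item identifier to a derivation-like object; only equality is
    used, so any comparable object works.
    """
    # Only items present in both profiles can be compared.
    shared = sorted(set(gold) & set(ranked))
    mismatched = [i for i in shared if gold[i] != ranked[i]]
    return len(shared) - len(mismatched), len(shared), mismatched


# Stand-in data: nested tuples play the role of derivation objects.
gold = {
    1: ("root", ("subj-head", "n1", "v1")),
    2: ("root", ("head-comp", "v2", "n2")),
    3: ("root", ("head-adj", "v3", "adv3")),
}
ranked = {
    1: ("root", ("subj-head", "n1", "v1")),
    2: ("root", ("head-adj", "v2", "n2")),
    3: ("root", ("head-adj", "v3", "adv3")),
}

matched, total, mismatched = derivation_agreement(gold, ranked)
print(f"{matched}/{total} derivations identical; differing items: {mismatched}")
```

This gives an exact-match rate and the list of items to look at by hand. It does not say where inside a tree the two derivations diverge.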
