Re-treebanking SRG: figuring out how to work with old decisions using fftb

I am looking at the Spanish MRS test suite, which was treebanked a long time ago with an old version of the grammar and has now been re-parsed with ACE using a new version of Freeling. The grammar was changed with respect to Freeling tags, and token mapping and lexical filtering were added. (Please look at other posts in the SRG category for context, if needed.)

FFTB can use some of the old treebanking decisions but not others, so I am going through the test suite item-by-item, trying to figure out the right process.

Suppose I am looking at the sentence Montse apostó con Núria un cigarillo a que llovía (‘Montse bet Nuria a cigarette that it rained.’)

I do have a treebanking decision for this one, which I can extract from the old test suite using LKB+tsdb:

[Screenshot: the old treebanking decision, extracted with LKB+tsdb]

So I can use this as a reference.

Here’s what FFTB shows me upon clicking on the item:

(1) First of all, what does this picture mean in terms of the discriminants being on/off? What confuses me is that they all seem to be “on”, yet I have 186 trees remaining, and if I turn them off, I actually get fewer trees. Am I getting the meaning of on/off backwards somehow?

(2) In any case, what should my steps be here? I tried to add manual decisions to the existing ones; is this what I should be doing? I assume so, but maybe not. After all, if there was a decision but the item appears as unannotated after attempting an “Automatic update”, it means something did not go as expected, so perhaps I should not be adding anything manually at this point?

(3) Assuming it is fine for me to select from the remaining trees, I can arrive at the following point, using the old reference:

(4) From here, I believe I should be able to select e.g. the hd-pt_c, which is certainly correct, and it promises me just one remaining tree. But if I click on that, I get:

What is it telling me? That something is wrong with the rule named hd_advnp-pp_c? That’s a syntactic rule, from srules.tdl, I don’t see how it would be affected by the inflectional rule tag update…

But I also don’t think that rule is in the reference decision… Does this indicate the correct parse is simply missing from this new forest?

Could someone guide me a little bit here in terms of the process: am I doing at least some steps correctly? What else should I be doing, or what should I be doing instead? Many thanks in advance.

Let’s take your questions about treebanking with FFTB one at a time:
(1) The right column shows each saved discriminant with an on/off toggle, where the blue label is the clickable one; hence a discriminant is off until you click “on” to turn it on. So it is right that as you turn each one on, your forest of remaining parses can shrink, as you observed.
(2) It is possible that the existing discriminants would be enough to fully disambiguate, if you turn each of them on, but it is more reliable and often faster to ignore those old ones, and instead to make new choices from the left side of the display.
(3) As you discovered, not every discriminant is guaranteed to lead to a good analysis. The parser packs edges for efficiency when constructing the parse forest, and during unpacking, some of the edges packed in one cell of the parse chart can fail to unify with other constraints packed elsewhere in the chart. In your example, the construction hd_advnp-pp_c is apparently incompatible with the analysis you want, and it appears from the error message that you selected the discriminant for the unary chain that included both hd-pt_c and hd_advnp-pp_c, instead of just the discriminant hd-pt_c as you intended.
If you did select the one you intended, then it may be that the analysis you want is not in the forest for some reason. It sometimes helps to try going either “top down” (choosing the discriminants with longer spans first) or “bottom up” (choosing the shortest spans first), since your first guess about the right analysis might not be what the grammarian intended.
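
In case it helps to see these ideas concretely, here is a toy Python sketch. It is not how FFTB or ACE are implemented; it only illustrates (a) switching discriminants on to shrink the set of remaining trees, (b) ordering discriminants top down or bottom up by span length, and (c) how a locally available packed alternative can yield no tree once everything is unpacked. Only hd-pt_c and hd_advnp-pp_c are names from this thread; the other rule names, the spans, and the feature values are made up for illustration.

```python
from itertools import product

# Toy forest: each tree is a set of (rule, start, end) constituents.
# Only hd-pt_c and hd_advnp-pp_c come from the thread; the rest is hypothetical.
TREES = {
    "t1": {("hd-pt_c", 0, 9), ("rule_a", 0, 8), ("rule_c", 2, 4)},
    "t2": {("hd-pt_c", 0, 9), ("rule_b", 1, 9), ("rule_c", 2, 4)},
    "t3": {("hd_advnp-pp_c", 0, 9), ("rule_a", 0, 8), ("rule_d", 2, 5)},
}

def remaining(trees, switched_on):
    """Trees that contain every discriminant currently switched on."""
    return sorted(name for name, cons in trees.items() if switched_on <= cons)

def ordered(discriminants, top_down=True):
    """Longest spans first (top down) or shortest spans first (bottom up)."""
    return sorted(discriminants, key=lambda d: (d[2] - d[1], d[0]),
                  reverse=top_down)

# Toy packed chart: each local alternative carries a feature value that must
# agree across cells for the full tree to unify (a stand-in for unification).
PACKED = {
    "left": [("L1", "fin"), ("L2", "inf")],
    "right": [("R1", "fin"), ("R2", "fin")],
}

def unpack(packed):
    """Enumerate all combinations and keep those whose features agree."""
    return [tuple(name for name, _ in combo)
            for combo in product(*packed.values())
            if len({feat for _, feat in combo}) == 1]

if __name__ == "__main__":
    print(remaining(TREES, set()))                  # nothing switched on yet
    print(remaining(TREES, {("hd-pt_c", 0, 9)}))    # one discriminant on
    print(ordered({("rule_c", 2, 4), ("hd-pt_c", 0, 9)}, top_down=False))
    print(unpack(PACKED))                           # L2 never survives
```

In the packed example, L2 looks like a valid choice within its own cell, but no full tree containing it survives unpacking, which is roughly the situation where a discriminant promises a remaining tree and then fails.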


Thank you very much, @Dan! Would it make sense (and is it possible) to turn packing off for the treebanking?