Replacing RELS <! with RELS.LIST < : possible issues with TDL parser?

I’ve replaced diff-lists with append-lists (affectionately known as emerson-lists, though I think Guy ended up using “append-list”) everywhere in the customization system, and am now looking at the regressions.

Does anyone know, off the top of their head, whether I should do anything with respect to this in tdl.py?

On the one hand, if tdl.py knows what to do with regular lists, it should have no problems with append-lists, since those have the same syntax as regular lists (LIST < >). However, I wonder whether there is something about, e.g., the RELS geometry that I need to worry about. The change to append-lists mostly causes no problems across the system, but I do get some, for instance with this sort of lexical rule in the adnominal possession library:

Original code in the customization system:

Updated version:

Now, here’s what is supposed to come out of it in case of one of the tests (look at the middle):

But here’s what comes out instead (note the .null thing):

Is there anything about the TDL add() or merge() methods that I should be looking at here? I’ve been debugging, but I am quickly losing track because of their recursive structure, with lots and lots of seemingly empty embedded children (representing the parentheses, probably?). I think that a merge function fails at a certain point in tdl.py with this rule, but I don’t yet know why.

To follow up, perhaps the problem is not in merging but in the tokenization or parsing of the initial TDL element. Here’s what I have in the trunk, before any merging:

There is an element of type “dlist”.

But in my branch, I now get:

A type “feat” instead. I imagine this would break things. It should be “list” or something like that…

Although I doubt that there could be a bug in parsing or tokenization; in that case most of the tests should’ve broken, not just a few. It must be something I am not noticing in the string that I am passing to it?

Is it somehow in the function that handles +POSSESSUM_EXIST_REL+ and +POSS_REL+?

I thought so too but I actually doubt it, because that’s just python string concatenation. So the result is just a string, shouldn’t make any difference. And I did check; it doesn’t make any difference, the string comes into the TDL module in a proper state, having all the information, just like in the trunk.

I suspect right now that this has to do with RELS.LIST perhaps being an unexpected geometry, from the TDL parser point of view… Though that doesn’t make too much sense either, because like I said, I would then expect all the tests to break. Nonetheless, for this test, for this lexical rule, the TDL parser ends up parsing RELS.LIST as a “feature” rather than as a “list”… While in the trunk, it happily parses RELS <! … as a “dlist”. I am debugging but it is rather difficult of course, given the data structure with lots of parentheses.

And I still think that the most likely explanation is that I am not noticing a mistake in the string itself. Like a TDL syntax mistake, such as a missing paren (it has to be something else in this case though).

To follow up, a list can only be properly parsed by the TDL parser via the TDLparse_conj() method. A list will never be properly parsed by the TDLparse_av() method (I don’t know for sure what conj and av stand for). The syntax of RELS.LIST, specifically the dot between RELS and LIST, forces this part of the string to go through the TDLparse_av() method, at which point LIST will be parsed as a feature and not as a list. For now I am at a loss as to how the passing tests manage to pass, because I have RELS.LIST all over the place but most grammars come out fine… Hm. Perhaps I will look at one of the tests which pass, then…

Follow up: no, RELS.LIST on its own can’t be the issue… The parser can still treat LIST < as a list, even if it follows a dot (as I expected). Still, somehow I end up without an element of type list here…

So I do suspect that the problem is in the ordered() method, which is implemented for dlists but which I am not seeing for normal lists. Indeed, there is no type in tdl.py for normal lists (there is one for dlists).

But then the merge function seems to expect it to be implemented for both lists and dlists:

        # if the elements are ordered (list or dlist), merge the list
        # items in order.  That is, <a,b,c> + <A,B,C> = <a&A,b&B,c&C>.

Is it possible that nothing in the customization system has been merging normal lists in such a way as to break this? For diff-lists, there is a special class, TDLelem_dlist, and so that would be created for any RELS. Then, when it comes to merging, the merge function would operate on the elements of the two lists in order.

But with normal lists, there is no special type, and so the ordered() method returns False. The other branch of the merging code is then executed, resulting, I think, in only one relation being merged in, and that is the end of it.
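To make the difference between the two merge branches concrete, here is a toy model in Python. It is not the code in tdl.py: the function name `merge_children` and the representation of list items as sets of constraints are assumptions made purely for illustration. With `ordered=True`, items are combined position by position, as in the comment quoted above; with `ordered=False`, the items are simply pooled.

```python
from itertools import zip_longest


def merge_children(left, right, ordered):
    """Toy merge of two sequences whose items are sets of constraints.

    Hypothetical helper for illustration only; not part of tdl.py.
    """
    if ordered:
        # Pairwise merge: <a,b,c> + <A,B,C> = <a&A, b&B, c&C>.
        # A shorter list is padded with empty sets.
        pairs = zip_longest(left, right, fillvalue=frozenset())
        return [x | y for x, y in pairs]
    # Unordered merge: keep every item once, with no positional pairing.
    merged = list(left)
    for item in right:
        if item not in merged:
            merged.append(item)
    return merged


a = [frozenset({"a"}), frozenset({"b"})]
b = [frozenset({"A"}), frozenset({"B"})]

ordered_result = merge_children(a, b, ordered=True)
unordered_result = merge_children(a, b, ordered=False)
```

In this sketch the ordered result has two items, each carrying the constraints from both inputs, while the unordered result has four separate items with no pairing by position.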

We’ve been using normal lists for valence; anything else? Perhaps we’ve never been merging valence lists in customization much?

Otherwise, the TDL parser actually expects normal lists to be something different from diff-lists; it treats them as feature structures with FIRST and REST in them, and it seems like a special TDLelem_list class is absent on purpose.
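As a reminder of what “feature structures with FIRST and REST” means for a list like < a, b >, here is a minimal Python sketch of the expansion. The nested-dict representation and the name `list_to_first_rest` are assumptions for illustration and do not reflect the actual data structures in tdl.py; the empty list is represented by the string "null".

```python
NULL = "null"  # stand-in for the empty-list type


def list_to_first_rest(items):
    """Expand a flat list into nested FIRST/REST pairs ending in null.

    Hypothetical helper for illustration only; not part of tdl.py.
    """
    node = NULL
    # Build from the tail so that each REST points at the following cell.
    for item in reversed(items):
        node = {"FIRST": item, "REST": node}
    return node


two_items = list_to_first_rest(["a", "b"])
empty = list_to_first_rest([])
```

Here `two_items` is `{"FIRST": "a", "REST": {"FIRST": "b", "REST": "null"}}`, so every position in the list is reached through a chain of REST features, and the terminating null is just another value of REST.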

I suspect you’re right, and it’s entirely possible that RELS (and maybe HCONS and ICONS) are the only places where we were calling tdl.merge() on a list-like object.

OK, I think this is fixed for now. Not sure I’ve fixed it in the best possible way, and I still need to rerun all 400+ tests and make sure the remaining failures don’t have to do with this, but for now all 91 adnominal possession tests are passing (and they were the ones which exposed the issue, due to RELS added in the middle of other RELS).

What I did is add a special case for when a null REST is being merged with a REST valued as a non-null attribute-value pair (i.e. the value of REST is [FIRST something, REST something]).

Previously, such items were considered by the system as not mergeable, and, like in other non-mergeable cases, the system would simply add both items as children of a given node. So, in our case, a null REST would get strung on right before the non-null REST, which would result in the write() function skipping everything that follows null.

Now, each time two items would be declared unmergeable, the system checks for the special case (a null REST being merged with a non-null one); if it applies, it removes the null from the parent node’s children and adds the non-null REST instead.
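The special case can be sketched roughly as follows. This is a simplified model, not the actual change to tdl.py: the names `add_rest_value` and `is_avm`, and the use of dicts for attribute-value pairs and the string "null" for the empty list, are all assumptions for illustration.

```python
NULL = "null"  # stand-in for the empty-list type


def is_avm(value):
    """True for a non-null attribute-value structure (modelled as a dict)."""
    return isinstance(value, dict)


def add_rest_value(children, incoming):
    """Return the children of a REST node after adding `incoming`.

    Hypothetical helper for illustration only; not part of tdl.py.
    """
    result = list(children)
    if is_avm(incoming) and NULL in result:
        # Special case: a non-null REST replaces the null placeholder
        # instead of being strung on after it.
        result = [child for child in result if child != NULL]
        result.append(incoming)
    elif incoming == NULL and any(is_avm(child) for child in result):
        # A null adds nothing to a REST that is already non-empty.
        pass
    else:
        # Default for unmergeable items: keep both as children.
        result.append(incoming)
    return result


cell = {"FIRST": "poss_rel", "REST": NULL}
replaced = add_rest_value([NULL], cell)
unchanged = add_rest_value([cell], NULL)
```

In this model `replaced` contains only the non-null cell, so nothing ends up after a null for a writer to skip. The second branch (ignoring a null that arrives after a non-null REST) is an extra assumption of the sketch, added for symmetry.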

Like I said, I am not 100% sure this is correct, but it seems to work for the 91 tests for adnominal possession. Will report regarding other tests later…

Thanks, Olga!