An unforeseen conflict: Lexical threading, adjuncts, coordination, and append lists

I’d like to come back to @ebender’s question of whether we want lexical threading:

After looking into this further (thanks @sweaglesw for the discussion, @olzama for looking into lexical threading, and @Dan for your summary), I think there’s a choice between:

  • Lexical threading
  • Argument/adjunct distinction
  • Identifying structures containing lists (e.g. in coordination)
  • Append-lists

Pick three out of four.

Conflicting Analyses

The problem comes with the coordinating-monkey sentences posted by @ebender, and expanded on by @sweaglesw and myself. For example:

Monkeys eat and sleep.
Monkeys eat bananas and sleep.
Monkeys eat bananas quickly and sleep.

To summarise why the analyses are incompatible: appends using append-lists are destructive, meaning that the lists being appended are modified (in particular, by using list-copy and its subtypes). For example, suppose lists A and B have the same elements and we want to make them token-identical, but we have also appended other lists to them: D=A+C, and F=B+E, where C=/=E. Doing these appends means that we cannot identify A and B, even though they have the same elements. I will refer to a list that’s been appended to as “dirty”, because such lists are unsafe to use.
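To make this concrete, here is a schematic TDL sketch. Every name in it (`my-append-list`, the `APPEND` feature, the features A–F) is a placeholder standing in for whatever append-list machinery is in use; the point is only the re-entrancies.

```tdl
; Hypothetical sketch.  A and B hold the same elements, and we would
; like them to be token-identical (as coordination requires).  But
; computing the appends writes copying constraints (list-copy and its
; subtypes) into A's and B's own tails, so after D = A + C and
; F = B + E (with C =/= E), identifying A and B brings those
; incompatible "dirty" constraints into conflict.
dirty-sketch := avm &
  [ A #a,
    B #b,
    C #c,
    E #e,
    D my-append-list & [ APPEND < #a, #c > ],   ; D = A + C
    F my-append-list & [ APPEND < #b, #e > ] ]. ; F = B + E
```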

Lexical threading means a word’s SLASH is the append of its arguments’ SLASHes, so SUBJ and COMPS have dirty SLASH lists. Adjuncts introduce further appends for the SLASH lists – this means we don’t even know how dirty the lists are (the number of “generations” of copying, as @sweaglesw discusses here). Coordination is the dangerous part: it identifies the dirty lists of the coordinands, and it thereby also identifies the SLASHes of the coordinands (which sit inside the dirty lists, some number of generations away). So we get a cyclic re-entrancy in at least some sentences.

I think the challenge with lexical threading (which also makes it hard to reason about) is that a SLASH list is not specified locally. We need to look at the whole tree to know what its value is.

Which analysis to drop?

One of the four analyses above needs to be dropped.

This proposal rules out identifying structures containing lists, as currently used in coordination. All appends create a “clean” version of the list alongside the “dirty” one, but this means that care must be taken when manipulating lists. However, I can see that identifying large chunks of structure could be useful. (And this proposal is also moving towards re-creating object-oriented programming inside feature structures, which is a little alarming…). So this would leave the other three options.

I’m now fairly convinced that append-lists would be needed to allow sonata-violin sentences. With diff-lists, it’s impossible to check if a diff-list is nonempty but of an unspecified length. With 0-1-diff-lists, we make the simplifying assumption that if it’s nonempty, it’s of length one – and that’s possible to check for. I think I now understand what @Dan meant in this discussion when he said that it’s “readily understandable to constrain a diff-list of max length one”. If we can’t check that a diff-list is nonempty, taking something off the list is dangerous – what we take off could actually be something appended to the list later, instead. For example, an easy-adjective needs to check that its clausal complement has a nonempty SLASH list, rather than grabbing a gap later appended to the list:

An easy piece to play is this sonata.
This sonata is an easy piece to play.
* Is an easy piece to play this sonata.
* An easy piece to play this sonata is.
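For concreteness, the “max length one” check can be written out as follows (a sketch; the type name is illustrative):

```tdl
; A diff-list containing exactly one element: LIST must be a cons
; whose REST is identified with LAST.  Under the 0-1 simplifying
; assumption, unifying with this type is the non-emptiness check.
1-dlist := diff-list &
  [ LIST cons & [ REST #end ],
    LAST #end ].
```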

So that leaves lexical threading and the argument/adjunct distinction. I suspect there’s not much appetite here for treating adjuncts as arguments. We could have a weaker version of the distinction, where adjuncts are treated the same as arguments for the purposes of the SLASH list. This would make the list an append of an underspecified number of lists… But if lexical threading was already difficult to reason about, this would almost certainly be worse.

So this brings us back to lexical threading. Now, we still need to allow predicates to control how to deal with their arguments’ SLASH lists. So we would need a way to do this while keeping lists clean. This means that we must “delay” appends so that they don’t make lists dirty.

A proposal for lexical threading

This proposal aims to stay close to the lexical threading analysis, while letting us reason about local feature structures only.

Each SLASH list is for the local structure only. There is also another feature, let’s call it EXCESS-SLASH, which holds the values in SLASH that will be passed up to the mother. In most cases, these two features hold the same list. However, sometimes EXCESS-SLASH will be shorter, because something’s been taken off – for example in a filler-phrase, or by an easy-adjective.

The EXCESS-SLASH will be underspecified until the structure is a daughter in a phrase. For head-comp-phrase and head-subj-phrase, the mother’s SLASH will be the append of the head daughter’s SLASH and the comp/subj’s EXCESS-SLASH. However, specifying EXCESS-SLASH is delegated to the head daughter’s lexical entry. For basic-N-arg, the comp’s EXCESS-SLASH is identified with its SLASH. For an easy-adjective, the comp’s first element on SLASH is used, and the rest of the list is the EXCESS-SLASH.
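A rough TDL sketch of the idea (every name here, including EXCESS-SLASH itself and the APPEND notation, is a placeholder; this is a sketch of the proposal, not a worked-out implementation):

```tdl
; Hypothetical sketch.  The phrase appends "late": the mother's SLASH
; is the head daughter's SLASH plus the complement's EXCESS-SLASH.
head-comp-phrase := phrase &
  [ SYNSEM.NONLOC.SLASH append-list & [ APPEND < #hd-slash, #excess > ],
    HEAD-DTR.SYNSEM.NONLOC.SLASH #hd-slash,
    NON-HEAD-DTR.SYNSEM.NONLOC.EXCESS-SLASH #excess ].

; Default case (basic-N-arg): the complement's EXCESS-SLASH just is
; its SLASH, so everything is passed up.
basic-N-arg := word &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS
      < [ NONLOC [ SLASH #sl,
                   EXCESS-SLASH #sl ] ] > ].

; easy-adjective: the first element of the complement's SLASH is
; bound locally; only the rest is passed up as EXCESS-SLASH.
easy-adj := word &
  [ SYNSEM.LOCAL.CAT.VAL.COMPS
      < [ NONLOC [ SLASH.LIST [ FIRST #gap,
                                REST #rest ],
                   EXCESS-SLASH.LIST #rest ] ] > ].
```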

Under this proposal, the appends happen “late” (e.g. in head-comp-phrase, or head-subj-phrase), rather than “early” (e.g. in lexical entries). This means we can view the appends in a strict input-output way, so the coordinating-monkey sentences won’t cause a problem: all of the lists are “clean”.

This proposal is vaguely similar to the TO-BIND feature that @Dan refers to here (but “in reverse”, since EXCESS-SLASH is what is not bound). However, if I’ve understood correctly, the original TO-BIND analysis required set-valued features and relational constraints, so it would need adapting for the DELPH-IN universe, anyway.

A caveat: using deleted-daughters

The above discussion deals with cyclic re-entrancies caused by list-copy. There are also potential unification failures from append-lists.

Identifying structures containing lists will still require either: (1) unorthodox use of deleted-daughters; or (2) an extra RELATIONAL feature to hide the relevant appends far away.

However, the RELATIONAL feature would need to be used sparingly (e.g. not for CONT lists) or it would need to be introduced in many places, so that different supertypes can all introduce appends without conflicting with each other. I’m leaning towards the unorthodox use of deleted-daughters, but I can see why this is a controversial point. If it’s too controversial, using RELATIONAL should be okay as long as it’s used sparingly.


Thank you, @guyemerson, for laying this all out! It’s always intriguing to find cases where at most n-1 out of n things can be maintained. (Similar to how AAVE copula-less sentences with long-distance dependencies show, as far as I can tell, that we can avoid either a phonologically empty copula or phonologically empty traces, but not both…)

If lexical threading is what we are dropping, I wonder if it would be better to drop it altogether. As I understand from the previous discussion, it’s the easy adjectives that require it: that is, we want them to be able to halt the propagation of one SLASH element from their complement. If the head-comp rule just appends the SLASH values of its daughters, we have a problem. But we could abandon the idea of having just one head-comp rule and propose a special head-comp-like construction for combining easy adjectives and their complements. There’s a loss of elegance here… for languages with easy adjectives, at least, but maybe that’s an easier cost to bear (in our constructiony universe) than complicated lexical threading?

Your sonata-violin examples, and your point about popping an item from a SLASH list that is currently empty but will later be appended to, are well taken. To me it seems that the grief people have with diff-lists comes in two categories: (1) they are somewhat error-prone, and (2) it’s hard to say things about their lengths.

You said in the discussion that you linked to that you can’t assert that a diff-list is empty without preventing future appends to it, if I understand the transcript correctly. Why would that be? If you make the assertion by stating [ LIST null ] then it is true that you have blocked future nonempty appends, but if you make the assertion by stating [ LIST #1, LAST #1 ] then the list can still be appended to. The more troubling problem is the other direction: asserting that a diff-list is nonempty. We would like to be able to say [ LIST #1, LAST not #1 ], but there is no such functionality in typed feature structures.
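In TDL, the two ways of asserting emptiness look like this (the type names are illustrative):

```tdl
; Blocks any future nonempty append: the underlying list is closed off.
strictly-empty-dlist := diff-list & [ LIST null ].

; Asserts emptiness *now* without closing the list: LIST and LAST
; coincide, but the shared tail can still be extended by a later append.
currently-empty-dlist := diff-list & [ LIST #end, LAST #end ].
```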

Here is a modest proposal which, to my tired (1:30am) brain, seems like it solves that problem at least for the common case where you want to remove an item from the SLASH list but fail if there isn’t one to be removed. I exploit the fact that unification fails when a cycle is created.

non-empty-checker-cons := cons & [ NOT-EQUAL-TO list ].

pop-non-empty-diff-list := diff-list & [ LIST [ NOT-EQUAL-TO #1 ], LAST #1 ].

If pop-non-empty-diff-list is unified with an empty diff-list, the resulting structure will be subsumed by [ LIST #1 [ NOT-EQUAL-TO #1 ], LAST #1 ], which is cyclic, causing a failure. The actual length of the list could be anything greater than zero. This test is, of course, destructive; the first cons cell of the list to be checked is tainted in the resulting structure. Trouble could hypothetically result if it was left on the diff-list, something was subsequently appended to the end of the diff-list, and then it was tested for non-emptiness again. In the case where that element is to be immediately popped off of the list though, it doesn’t matter, since that cell won’t be part of any further tests.

I have not tested this idea. It might be wrong, and it does nothing to address the other shortcomings of diff-lists. :)


I agree! I think we neglected identifying LIST and LAST in that discussion. It’s not very user-friendly, but it’s possible.

Interesting! This would also block a list of “negative” length (where LAST points before LIST).

I guess the claim should be stated more carefully as: it’s not possible to check if a diff-list is non-empty, unless we introduce a computation type.

Well, I didn’t think it would matter that cons-copy is destructive… As you say, it should be okay in this use case, and it’s a little “safer” than cons-copy in the sense that checking for non-emptiness is rare (unlike appending, which happens a lot), but I wouldn’t rule out a conflict with some other construction. Under the above proposal, the lexical entry for “easy” has a COMPS list containing something with a dirty SLASH list. So we have a small piece of computation that is exposed. Nothing comes to mind that would cause a problem, but it’s hard to say that there couldn’t be a problem somehow.

My (admittedly naïve; I’ve never really gotten the hang of working with diff-lists) understanding of [ LIST #1, LAST #1 ] is that it doesn’t actually enforce the list being empty – because you could unify in something else. Does that always necessarily give a cycle?

@sweaglesw’s proposals (for checking emptiness and non-emptiness) both rely on a division of labour between the diff-list and the list it contains. Some things need to be done at the level of the list and other things need to be done at the level of the diff-list. To look at the actual elements, you have to look inside the list, but to check for emptiness or non-emptiness, you have to manipulate the diff-list.

For both kinds of check, this is assuming that the diff-list is well-formed (it was constructed from diff-lists with LAST pointing at the right place, e.g. using the <! !> syntax). For example, if you have <! *top* !> and you unify it with [ LIST #1, LAST #1 ], you will get a cycle. Similarly, if you have <!!> and unify it with pop-non-empty-diff-list above, you will get a cycle.
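To spell out the first of these cases (the expansion of the <! !> syntax shown here is the standard one; the type name is illustrative):

```tdl
; <! *top* !> abbreviates a well-formed singleton diff-list:
singleton-dlist := diff-list &
  [ LIST [ FIRST *top*,
           REST #end ],
    LAST #end ].

; Unifying singleton-dlist with the emptiness check
;   [ LIST #1, LAST #1 ]
; identifies #1 with both the cons cell and that cell's own REST
; (i.e. LIST = LIST.REST), which is a cycle, so unification fails:
; the emptiness assertion correctly rejects a nonempty well-formed list.
```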

The division of labour means that you have to be careful, e.g. [ LIST #1, LAST #1 ] doesn’t mean that LIST is null. And for pop-non-empty-diff-list, the check is destructive. Coming back to @sweaglesw’s two kinds of problems:

Using [ LIST #1, LAST #1 ] and pop-non-empty-diff-list deals with the second point but not the first.

That’s certainly true. It’s (as we have seen) hard to make predictions about how troublesome a potential problem will actually be.

Unifying in a non-empty, non-broken diff-list will produce a cycle. However, as Guy pointed out, it doesn’t force LIST to be null, so you could unify [ LIST #1, LAST #1 ] with [ LIST < blah > ] without producing a failure, because its LAST is unspecified. Unifying with <! blah !>, however, will fail.

I have recently been thinking about this from the perspective of the (D)MRS algebra. Implicitly, lexical threading of SLASH means that (D)MRS composition can manipulate SLASH. In normal composition, one (D)MRS is the functor and the other is the argument, and we can only manipulate the argument via its HOOK – the three elements of the HOOK (INDEX, TOP, XARG) can be unified with slots in the functor. With lexical threading, we’re manipulating both HOOK and SLASH.

So from the point of view of the algebra, it might be preferable not to let the functor directly manipulate SLASH. We already have a mechanism for relabelling slots (e.g. in a passive construction), and we could do something similar with SLASH, before composition. The idea is that, in a construction like “easy to try to replace”, the composition of “easy” and “to try to replace” will actually happen in two steps: the first step involves relabelling the semantics of “to try to replace” (so that what was at the top of the SLASH list is now in the HOOK), and then the second step is normal composition. The benefit is that SLASH propagation can now be treated uniformly across all composition, so lexical threading isn’t necessary. Instead, “easy” will have a type that licenses this construction. The construction could be treated syntactically as two rules (a special unary rule for “to try to replace” and then the normal head-complement rule), or a single rule (a special head-complement rule that bundles the two semantic operations together).
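As a very tentative TDL sketch of the first (relabelling) step: all names here are placeholders, and the paths (including the choice of XARG as the HOOK slot that exposes the gap) are guesses at where the relevant pieces would live.

```tdl
; Hypothetical unary rule over "to try to replace": it exposes the
; index of the top SLASH element in the mother's HOOK, so that the
; following (ordinary) head-complement step can compose with it
; normally.  The rest of SLASH propagates unchanged.
slash-to-hook-rule := unary-phrase &
  [ SYNSEM [ LOCAL.CONT.HOOK.XARG #gap-index,
             NONLOC.SLASH.LIST #rest ],
    DTR.SYNSEM.NONLOC.SLASH.LIST
      [ FIRST [ CONT.HOOK.INDEX #gap-index ],
        REST #rest ] ].
```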

I think this would follow @ebender’s suggestion to drop lexical threading altogether, and introduce a special head-comp-like construction – but in a way that doesn’t complicate the algebra. In fact, by viewing it as relabelling plus normal composition, the algebra can be simpler.