I’d like to come back to @ebender’s question of whether we want lexical threading:
- Append lists
- Lexical threading
- Argument/adjunct distinction
- Identifying structures containing lists (e.g. in coordination)
Pick three out of four.
Monkeys eat and sleep.
Monkeys eat bananas and sleep.
Monkeys eat bananas quickly and sleep.
To summarise why the analyses are incompatible: appends using append-lists are destructive, meaning that the lists being appended are themselves modified (in particular, by using
list-copy and its subtypes). For example, suppose lists A and B are token-identical, and we then append other lists to them: D=A+C and F=B+E, where C=/=E. After these appends, we can no longer identify A and B, even though they have the same elements. I will refer to a list that’s been appended to as “dirty”, because it is unsafe to use.
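To make the failure concrete, here is a minimal Python sketch of my own (not TDL: the dict encoding, the `append` side effect, and the crude `unify` function are all simplifications for illustration, not the actual append-list machinery):

```python
def unify(a, b):
    """Crude unification: dicts unify feature-wise, atoms must be equal.

    Returns the unified structure, or None on failure.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        out = {}
        for key in set(a) | set(b):
            if key in a and key in b:
                sub = unify(a[key], b[key])
                if sub is None:
                    return None  # a clash somewhere inside
                out[key] = sub
            else:
                out[key] = a[key] if key in a else b[key]
        return out
    return a if a == b else None

def append(xs, ys):
    """Destructive append: the computation is recorded inside xs itself."""
    xs["APPEND"] = ys  # xs is now "dirty"

A = {"ELEMS": ("x",)}
B = {"ELEMS": ("x",)}  # same elements as A
C = {"ELEMS": ("c",)}
E = {"ELEMS": ("e",)}  # C =/= E

assert unify(A, B) is not None  # clean lists: A and B can be identified
append(A, C)                    # D = A + C
append(B, E)                    # F = B + E
assert unify(A, B) is None      # dirty lists: the recorded appends clash
```

The point of the sketch is only that the append computation lives *inside* the appended list, so two lists with the same elements stop being identifiable once they have been appended to differently.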
Lexical threading means that a word’s SLASH is the append of its arguments’ SLASHes. As a result, the items on COMPS have dirty SLASH lists. Adjuncts also introduce appends for the SLASH lists – this means we don’t even know how dirty the lists are (the number of “generations” of copying, as @sweaglesw discusses here). Coordination identifies the dirty lists of the coordinands, which is the dangerous part: coordination also identifies the SLASHes of the coordinands (which are inside the dirty lists, some number of generations away). So we get a cyclic re-entrancy in at least some sentences.
I think the challenge with lexical threading (which also makes it hard to reason about) is that a
SLASH list is not specified locally. We need to look at the whole tree to know what its value is.
Which analysis to drop?
One of the four analyses above needs to be dropped.
This proposal rules out identifying structures containing lists, as currently used in coordination. All appends create a “clean” version of the list alongside the “dirty” one, but this means that care must be taken when manipulating lists. However, I can see that identifying large chunks of structure could be useful. (And this proposal is also moving towards re-creating object-oriented programming inside feature structures, which is a little alarming…). So this would leave the other three options.
I’m now fairly convinced that append-lists would be needed to allow sonata-violin sentences. With diff-lists, it’s impossible to check whether a diff-list of unspecified length is nonempty. With 0-1-diff-lists, we make the simplifying assumption that if the list is nonempty, it’s of length one – and that is possible to check for. I think I now understand what @Dan meant in this discussion that it’s “readily understandable to constrain a diff-list of max length one”. If we can’t check that a diff-list is nonempty, taking something off the list is dangerous – what we take off could actually be something appended to the list later. For example, an easy-adjective needs to check that its clausal complement has a nonempty SLASH list, rather than picking up a gap appended to the list later:
An easy piece to play is this sonata.
This sonata is an easy piece to play.
* Is an easy piece to play this sonata.
* An easy piece to play this sonata is.
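Here is a toy Python simulation of my own of why “popping” a diff-list is dangerous (nodes are dicts, an empty dict stands in for an underspecified tail, and `dict.update` is a crude stand-in for unification – none of this is the real implementation):

```python
def fresh():
    """An open node: the underspecified tail of a diff-list."""
    return {}

def empty_dlist():
    tail = fresh()
    return {"LIST": tail, "LAST": tail}

def dlist(*items):
    """A diff-list containing the given items, with an open tail."""
    tail = fresh()
    node = tail
    for item in reversed(items):
        node = {"FIRST": item, "REST": node}
    return {"LIST": node, "LAST": tail}

def dl_append(d1, d2):
    """Splice d2 onto d1 by identifying d1's open tail with d2's front."""
    d1["LAST"].update(d2["LIST"])  # crude stand-in for unifying the two nodes
    return {"LIST": d1["LIST"], "LAST": d2["LAST"]}

def pop_front(d):
    """'Take something off the list' -- this CONSTRAINS the front node to be
    a cons cell; it cannot CHECK that one is already there."""
    node = d["LIST"]
    node.setdefault("REST", fresh())
    return node

# The easy-adjective pops the front of its complement's SLASH diff-list...
slash = empty_dlist()          # ...but the list is actually still empty
popped = pop_front(slash)      # the "check" succeeds vacuously

# Later, a gap is appended to the same diff-list:
slash = dl_append(slash, dlist("gap"))

# The popped element turns out to be the gap appended afterwards:
assert popped.get("FIRST") == "gap"
```

The final assertion is exactly the failure mode above: the easy-adjective thought it was taking a gap off its complement’s SLASH, but it actually grabbed a slot that a later append filled in.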
So that leaves lexical threading and the argument/adjunct distinction. I suspect there’s not much appetite here for treating adjuncts as arguments. We could have a weaker version of the distinction, where adjuncts are treated the same as arguments for the purposes of the
SLASH list. This would make the list an append of an underspecified number of lists… But if lexical threading was already difficult to reason about, this would almost certainly be worse.
So this brings us back to lexical threading. Now, we still need to allow predicates to control how to deal with their arguments’
SLASH lists. So we would need a way to do this while keeping lists clean. This means that we must “delay” appends so that they don’t make lists dirty.
A proposal for lexical threading
This proposal aims to stay close to the lexical threading analysis, while letting us reason about local feature structures only. The
SLASH list is for the local structure itself. There is also another feature, let’s call it
EXCESS-SLASH, which holds the values in
SLASH that will be passed up to the mother. In most cases, these two features hold the same list. However, sometimes
EXCESS-SLASH will be shorter, because something’s been taken off – for example in a filler-phrase, or by an easy-adjective.
EXCESS-SLASH will be underspecified until the structure is a daughter in a phrase. For head-comp-phrase and head-subj-phrase, the mother’s
SLASH will be the append of the head daughter’s
SLASH and the comp/subj’s
EXCESS-SLASH. However, specifying
EXCESS-SLASH is delegated to the head daughter’s lexical entry. For basic-N-arg, the comp’s
EXCESS-SLASH is identified with its
SLASH. For an easy-adjective, the first element of the comp’s
SLASH is used, and the rest of the list is its EXCESS-SLASH.
Under this proposal, the appends happen “late” (e.g. in head-comp-phrase, or head-subj-phrase), rather than “early” (e.g. in lexical entries). This means we can view the appends in a strict input-output way, so the coordinating-monkey sentences won’t cause a problem: all of the lists are “clean”.
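As a sanity check on this input-output view, here is a schematic Python sketch of my own (function names and the plain-tuple encoding of “clean” SLASH lists are purely illustrative; real TDL types would carry far more structure):

```python
# "Clean" SLASH lists are plain tuples; appends are pure functions of their
# inputs, performed late in the phrase rather than early in the lexical entry.

def basic_n_arg(comp_slash):
    """Ordinary head: the comp's EXCESS-SLASH is identified with its SLASH."""
    return comp_slash

def easy_adjective(comp_slash):
    """Bind the comp's first SLASH element locally; pass up the rest."""
    assert comp_slash, "easy-adjective needs a nonempty SLASH on its comp"
    return comp_slash[1:]

def head_comp_phrase(head_slash, comp_slash, excess_slash_of):
    """Mother's SLASH = head's SLASH + comp's EXCESS-SLASH.

    Which function computes the comp's EXCESS-SLASH is delegated to the
    head daughter's lexical entry (here, the excess_slash_of argument)."""
    return head_slash + excess_slash_of(comp_slash)

# "an easy piece to play": the infinitival comp carries one gap on SLASH
assert head_comp_phrase((), ("gap",), easy_adjective) == ()      # gap bound here
assert head_comp_phrase((), ("gap",), basic_n_arg) == ("gap",)   # gap passed up
```

Because every append takes clean tuples in and produces a clean tuple out, identifying two coordinands’ SLASH lists (as in the coordinating-monkey sentences) never entangles any stored append computations.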
This proposal is vaguely similar to the
TO-BIND feature that @Dan refers to here (but “in reverse”, since
EXCESS-SLASH is what is not bound). However, if I’ve understood correctly, the original
TO-BIND analysis required set-valued features and relational constraints, so it would need adapting for the DELPH-IN universe anyway.
A caveat: using deleted-daughters
The above discussion deals with cyclic re-entrancies caused by
list-copy. There are also potential unification failures from identifying dirty lists.
Identifying structures containing lists will still require either: (1) unorthodox use of deleted-daughters; or (2) an extra
RELATIONAL feature to hide the relevant appends far away.
RELATIONAL feature would need to be used sparingly (e.g. not for
CONT lists) or it would need to be introduced in many places, so that different supertypes can all introduce appends without conflicting with each other. I’m leaning towards the unorthodox use of deleted-daughters, but I can see why this is a controversial point. If it’s too controversial, using
RELATIONAL should be okay as long as it’s used sparingly.