Subject extraction vs Non-Main-Clause rules in info-str library


I am wondering what the broader context is, in the information structure library in the Grammar Matrix, for sometimes using extraction rules in combination with head-filler rules to license various word orders, and sometimes special “NMC” (non-main-clause) rules instead?

For example, the following sentence, Ivan is lying down, is licensed by a subject extraction rule (and then a head-filler rule):

[parse tree image not reproduced]
In this one, in contrast (Ivan is reading a book), the VP is instead licensed by a special “NMC” head-subject rule, as opposed to, say, a combination of an object extraction rule AND a subject extraction rule.

Are there theoretical/practical obstacles to utilizing object extraction rules or doing more than one extraction? Does this have to do with the SLASH list length again?

So to follow up on this, in matrix.tdl we currently have these two very similar rules:

(1) The (presumably older) subject extraction type:

; ERB 2004-08-26 Remove [MC -] on mother; probably specific
; to analysis of subject extraction for English.
; ASF 2011-10-05 Added supertype 'head-compositional' to basic-extracted-subj-phrase
; in order to make sure matrix.tdl provides the right semantics for extracted subjects.

basic-extracted-subj-phrase := basic-extracted-arg-phrase & head-compositional &
  [ SYNSEM.LOCAL.CAT.VAL [ SPR < >,
                           COMPS < > ],
    HEAD-DTR.SYNSEM [ LOCAL.CAT [ VAL [ SUBJ < [ LOCAL #local & local &
                                                 [ CONT.HOOK.INDEX ref-ind ] ] >,
                                        COMPS olist ],
                                  MC na ],
                      NON-LOCAL.SLASH.LIST < #local > ],
    C-CONT [ RELS <! !>,
             HCONS <! !>,
             ICONS <! !> ] ].
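For reference, matrix.tdl also provides complement extraction. The sketch below is schematic only (the type name is invented for illustration and the constraints are abridged, not copied from matrix.tdl); the point is just that extracting a complement is structurally parallel to the subject case above: the first COMPS element's LOCAL ends up on SLASH while SUBJ is passed up untouched.

; SCHEMATIC SKETCH ONLY -- see basic-extracted-comp-phrase in
; matrix.tdl for the actual constraints.
sketch-extracted-comp-phrase := basic-extracted-arg-phrase &
  [ SYNSEM.LOCAL.CAT.VAL [ SUBJ #subj,
                           COMPS #comps ],
    HEAD-DTR.SYNSEM [ LOCAL.CAT.VAL [ SUBJ #subj,
                                      COMPS < [ LOCAL #local ] . #comps > ],
                      NON-LOCAL.SLASH.LIST < #local > ] ].

So, at least at the level of the TDL machinery, nothing seems to rule out object extraction; the question is whether the info-str analysis has other reasons to avoid it.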

(Note especially the [ MC - ] comment.)

(2) The special rule for the information structure library:

; SSH 2013-04-08 non-canonical and non-matrix clausal head-comp-phrase
; This type cannot be a root node by itself ([MC -]).
; This typed phrase is supposed to be combined with a filler-phrase.
basic-head-comp-nmc-phrase := head-valence-phrase & head-compositional &
                              binary-headed-phrase &
  [ SYNSEM phr-synsem &
       [ LOCAL.CAT [ MC -,
                     VAL [ SUBJ < >,
                           SPR #spr ],
                     POSTHEAD #ph,
                     HC-LIGHT #light ],
         LIGHT #light ],
    HEAD-DTR.SYNSEM [ LOCAL.CAT [ VAL [ SUBJ < [ LOCAL #slash ] >,
                           SPR #spr ],
                     HC-LIGHT #light,
                     POSTHEAD #ph ],
         NON-LOCAL.SLASH 1-dlist & [ LIST < #slash > ] ],
    NON-HEAD-DTR.SYNSEM canonical-synsem,
    C-CONT [ RELS <! !>, HCONS <! !>, ICONS <! !> ] ].

Now, despite its name, the rule above is doing subject extraction in practice. Also in practice, it is used as a supertype for head-complement rules such as the following:

head-comp-nmc-phrase := basic-head-comp-nmc-phrase & head-initial &
  [ SYNSEM [ LOCAL.CAT.VAL.COMPS #comps ],
    HEAD-DTR.SYNSEM [ LOCAL.CAT.VAL.COMPS < #synsem . #comps >,
                      NON-LOCAL.SLASH.LIST < [ CONT.HOOK.ICONS-KEY focus-or-topic ] > ],
    NON-HEAD-DTR.SYNSEM #synsem ].

With this, you get parses like the following for a language with basic SVO word order and a clause-final focus position (the sentence is Ivan book reads):

[parse tree image not reproduced]
Note how there is actually implicit subject extraction happening in this tree. There is no non-branching extraction rule (the second V is a lexical rule), and yet the top S is licensed by the Head-Filler rule (the bottom one by the special Head-Complement rule, which inherits from the special subject extraction type).

So again my question is whether this is expected behavior?

Having looked at these rules together with Olga, my question is: What is the motivation for having a single rule that does both subject extraction and complement realization? Is there some reason that using two separate rules would not work for the data this is intended to handle?

…And: does it make sense to talk about subject extraction in the tree for Ivan book reads? Should it not be: (1) complement extraction; (2) head-filler for the complement; (3) normal subject-head?
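Schematically, that alternative derivation (as proposed above, not a parse the current grammar actually produces) would be:

(1) complement extraction:  reads  =>  V with COMPS < >, SLASH < NP >
(2) head-filler:            book + reads/NP  =>  [book reads], SLASH < >
(3) subject-head:           Ivan + [book reads]  =>  S

i.e. [S Ivan [ book [ reads ] ] ], with "book" as the filler for the extracted object, and no extraction of the subject anywhere.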

Agreed – I don’t see anything that looks like subject extraction there. But maybe the info-str library is doing something I’m not tracking.

I cannot quite tell from @sanghoun’s book, but from the choices file it looks like such an order should not even be admitted (and the fact that it is admitted is probably due to the modifications in my branch).

Here’s the choices file for infostr-foc-svo-final:

  sentence1_orth=CN IV
  sentence2_orth=IV CN
  sentence3_orth=PN TV CN
  sentence4_orth=PN CN TV
  sentence5_orth=CN PN TV
  sentence6_orth=CN TV PN
  sentence7_orth=TV PN CN
  sentence8_orth=TV CN PN

So it looks like something like Russian would not in fact have been modeled with wo=SVO and focus=final in the original library.

But, I am not sure how transitive constructions are analyzed at all; perhaps the assumption is that the verb cannot be in focus?

So I think that SVO + foc-final should give only:

SVO (no focus)
VOS (subject focused)
SV (no focus)
VS (subject focus)

I don’t know if Sanghoun’s system would include SOV or OSV as verb focus here. We’ll have to wait to hear from him!

It wouldn’t have been originally, but I am wondering whether it should be? Or rather, how should a language where it is possible be analyzed under this system? Or are there reasons to say that it is not possible?

SVO (no focus) => underspecified.
Given that this is the basic word order in the language, the object in situ may or may not involve a focus meaning. The others are all right.

In the infostr-foc-svo-final language, SOV and OSV are basically not allowed, because there is no specific motivation for the verb to be placed at the rightmost position.

Thanks, @sanghoun. So do I understand correctly that you aren’t modeling focus on V?

Also, do you recall what (if anything) you thought the right analysis of positional focus in a free-word order language should be?