How do I find the supertype of all the surface forms in surface.smi of the ERG?

I am having a little trouble navigating the SEM-I.

I see the files erg.smi, abstract.smi, hierarchy.smi, and surface.smi.

I am a little confused about the purpose of each file, and I can’t find the direct supertype of every item I look up.

For example, I worked out that abstract_q is (I believe) the highest-level quantifier type, which I found in erg.smi. I also found some quantifier information in hierarchy.smi, such as the fact that _the_q < def_explicit_q. However, I can’t find the supertype of def_explicit_q, as it doesn’t seem to have a supertype entry of its own in hierarchy.smi.

As another example, out of curiosity I wanted to see what the supertype of a common noun like _car_n_1 is, but I could only find its ARG structure in surface.smi and no other information.

I don’t know what facilities there might be for browsing these hierarchies … perhaps someone has built something.

I think it is expected that most (all?) of the “surface” predicates (thus the vast majority of predicates) are not part of a hierarchy.

Would this thread be relevant at all?

If you haven’t yet seen the SemiRfc wiki, I suggest going there to get some familiarity with what the SEM-I contains.

Also, PyDelphin’s delphin.semi module might help with navigating the files. Just load from the top file (erg.smi, since it includes the others). You may see some warnings, but you can ignore these.

>>> from delphin import semi
>>> erg_semi = semi.load('erg/etc/erg.smi')
/home/goodmami/delphin/pydelphin/delphin/semi.py:492: SemIWarning: _be_v_id: property 'NUM' not allowed on 'i'
  warnings.warn(
/home/goodmami/delphin/pydelphin/delphin/semi.py:492: SemIWarning: one+less_a: property 'NUM' not allowed on 'i'
  warnings.warn(
/home/goodmami/delphin/pydelphin/delphin/semi.py:492: SemIWarning: one+more_a: property 'NUM' not allowed on 'i'
  warnings.warn(
/home/goodmami/delphin/pydelphin/delphin/semi.py:492: SemIWarning: poss: property 'NUM' not allowed on 'i'
  warnings.warn(

The SemI.predicates member is a MultiHierarchy object, so you can use methods like ancestors() to find supertypes:

>>> erg_semi.predicates.ancestors('udef_q')
{'abstract_q', 'def_udef_a_q', 'explicit_quant_or_udef_noagr_q', 'both_all_udef_q', 'def_poss_or_barepl_or_prop_q', 'existential_q', 'udef_or_proper_q', 'udef_a_q', '*top*', 'implicit_q'}
>>> erg_semi.predicates.ancestors('_car_n_1')
{'*top*'}

As Emily said, if a predicate does not list any supertypes, it’s not involved in a predicate hierarchy, so its supertype is just *top*. This is only a default set by PyDelphin; elsewhere (ACE or LKB, maybe) we might consider string to be the supertype, especially if the predicate is quoted in the grammar’s lexicon.
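
If you want the direct supertypes step by step rather than the flat set that ancestors() returns, MultiHierarchy also has a parents() method. Below is a rough sketch of walking upward one direct supertype at a time. The helper name supertype_paths and the toy hierarchy are made up for illustration (the toy names are not real ERG predicates), and I am assuming that parents() returns an empty collection for the top type. I have not run this against the ERG itself.

```python
from typing import Callable, Iterable, List


def supertype_paths(
    parents_of: Callable[[str], Iterable[str]],
    pred: str,
) -> List[List[str]]:
    """Return every path from pred up to a root, one direct supertype per step."""
    parents = list(parents_of(pred))
    if not parents:
        # No supertypes left, so pred is a root (e.g. '*top*').
        return [[pred]]
    paths = []
    for parent in parents:
        for tail in supertype_paths(parents_of, parent):
            paths.append([pred] + tail)
    return paths


# Toy hierarchy with made-up names (NOT the real ERG structure).
# Each key maps to its direct supertypes.
toy = {
    '*top*': (),
    'root_q': ('*top*',),
    'mid_a_q': ('root_q',),
    'mid_b_q': ('root_q',),
    'leaf_q': ('mid_a_q', 'mid_b_q'),
}

for path in supertype_paths(toy.__getitem__, 'leaf_q'):
    print(' < '.join(path))

# With a loaded SEM-I, the same helper should accept the hierarchy's
# parents() method (assumption: it returns only direct supertypes):
#   supertype_paths(erg_semi.predicates.parents, 'udef_q')
```

For a predicate like _car_n_1 that is not in any hierarchy, this should give a single path straight to *top*, matching the ancestors() result above.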