Summary of what a non-linguist had to swap in to start to understand ERG

FWIW, I put up a shortish blog post outlining what I’ve had to swap in to understand (as little as I do so far…) how to use the ERG to convert English to some form of logic.

I’m sure I got a lot wrong and misunderstood more, but there it is. Thought it might be interesting for those “on the inside” of ERG to see what it takes for someone with no background in linguistics to come at this cold…

Enjoy! (oh and happy to get feedback or suggestions on any of it as well!)


Thanks for sharing! A few things stood out to me:

parsing English is taking the words in some utterance and turning them into a form a computer can understand.

I know you said this is a “sweeping generalization”, but this seems a bit generous, since the computer isn’t really “understanding” the utterance. Some semantic parsers do go this far, in a limited sense, such as text-to-SQL parsers (e.g., “which students passed the exam?” -> SELECT s.name FROM students AS s JOIN examscores AS e ON s.id = e.studentid WHERE e.score >= 60, assuming the computer understands the SQL statement) but their applicability is extremely narrow. Maybe “…a form a computer can reason about” or “…deal with”?

Also, you seem to be conflating, or glossing over, the distinction between syntactic and semantic parsing, which is probably fine for something informal like a blog post. In some frameworks, like how HPSG is used in DELPH-IN, these happen simultaneously. Others may do it as a transformation of the syntax, or by ignoring syntax and going straight to some meaning representation.

We still haven’t fully figured it out.

I don’t think that’s a plausible goal, even for one language. Never mind that there are many varieties and dialects, and that these evolve over time; the very idea that there is something there to “figure out”, such that all linguists would agree on its veracity, is dubious. Rather, we are in the business of modeling language (at least the bits that are important to us), and, as the aphorism goes, all models are wrong, but some are useful.

Regarding (Neo-)Davidsonian semantics, my understanding is that Donald Davidson’s version, dubbed “Davidsonian semantics”, introduced the event variable, but verbs are still like frames in that they have fixed places for their arguments (e.g., stab(e, Brutus, Caesar)), whereas Neo-Davidsonian semantics (due to Terence Parsons) decoupled the arguments from their verbs (e.g., stab(e) ^ AGENT(e, Brutus) ^ PATIENT(e, Caesar)). This made it easier to attribute other information that may not be part of some fixed frame (e.g., INSTRUMENT(e, knife), etc.). Under this view, MRS is more like Davidsonian semantics, although there’s RMRS and possibly the more recent DMRS, which are more like Neo-Davidsonian semantics.
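To make the contrast concrete, here’s a toy Python sketch (illustrative only, not actual MRS/ERG code) of the two styles for “Brutus stabbed Caesar with a knife”, with predicates represented as plain tuples:

```python
# Davidsonian: the verb is a fixed-arity predicate over an event variable
# plus its arguments, so the frame determines what can be expressed.
def davidsonian(e):
    return [("stab", e, "Brutus", "Caesar")]

# Neo-Davidsonian: the verb takes only the event variable, and arguments
# become separate role predicates, so extra information like the instrument
# attaches in exactly the same way, with no change to the verb's frame.
def neo_davidsonian(e):
    return [
        ("stab", e),
        ("AGENT", e, "Brutus"),
        ("PATIENT", e, "Caesar"),
        ("INSTRUMENT", e, "knife"),  # easy to add; no new frame needed
    ]
```

The point is visible in the shape of the data: adding INSTRUMENT to the Davidsonian version would require changing the arity of the stab predicate, while in the Neo-Davidsonian version it’s just one more conjunct.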

BTW, I’ve always thought the closed world of a game would be a good testbed for actually applying our semantic representations in natural language understanding or generation tasks. I’m interested to see what you come up with.


Can I just say: I love this discussion!!!


I agree with everything @goodmami said. I want to add that the analogy of programming language syntax and semantics to natural language syntax and semantics is very weak. Some examples:

  1. Programming language syntax is designed, and in particular is usually designed to be unambiguous and fast. Natural language syntax is none of these things. Natural language syntax is often extremely ambiguous (a common example is “time flies like an arrow”; is “time” a noun, verb, or adjective? Is “flies” a verb or noun? Etc.). Parsing natural language syntax is usually considered an O(N^3) problem, and understanding how humans do it so quickly and learn to do it so quickly has been studied for 50 years in cognitive science and neuroscience.
  2. Programming language semantics is actually quite simple. There are a handful of core operations (logic, math, memory, etc.) that are assembled almost exclusively through a single reference mechanism to interpret and execute code.
  3. Natural language semantics is very complex. Linguistic semantics is typically divided into two categories: compositional semantics and lexical semantics. The ERG is almost exclusively concerned with compositional semantics, which is basically what effect the syntax has on the semantics of the utterance (although the boundary is a little blurry with words like “everyone” and “is”). When we talk about the “meaning” of “red hat” being red(hat), this is compositional semantics. Lexical semantics is concerned with the meaning of the words themselves: what does “hat” mean, what does “red” mean, what does “time” mean; this is more in line with the dictionary definition. The largest lexical semantics project (at least in computational linguistics, and probably in general) is WordNet, which is a concept graph of words, their senses, and how each sense is related to other senses (e.g. “lion_1” is_a “cat_1”). These two categories also only cover what is typically referred to as “sentence meaning” and not “speaker meaning”, which is studied in pragmatics and discourse studies, and is concerned with things like intent, sarcasm, positioning, social status, etc.
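To illustrate the compositional side, here’s a toy Python sketch (all names invented for illustration) where word meanings are functions and the syntax dictates how they combine:

```python
def red(x):
    # adjective meaning: the property of being red
    return ("red", x)

def hat(x):
    # noun meaning: the property of being a hat
    return ("hat", x)

def intersective(adj, noun):
    # adjective-noun combination by predicate intersection:
    # a "red hat" is anything that is both red and a hat
    return lambda x: ("and", adj(x), noun(x))

red_hat = intersective(red, hat)
```

Note that this toy composition rule says nothing about what “red” or “hat” actually mean in the world; filling in those symbols is exactly the lexical-semantics side of the divide.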

Thanks for the feedback @goodmami!

  • Agree that “understanding” is a loaded word…I changed it to “turning them into a form a computer can act on”.
  • Agree I am not being precise with syntactic vs semantic distinction, did a few fixes which should solve that.
  • Re: having “figured out” how to represent the semantics of language: I changed it to: “We still don’t have a complete set of rules that map language to a meaning that agrees with the mapping most humans would do. Furthermore, whether it is even possible for all of language is not obvious. That said, lots of practical progress has been made…”
  • Thanks for the clarification on Davidsonian vs. Neo-Davidsonian. I’ll add that as well.
  • Finally: yes, I’m hopeful the combination of scoping to a closed world plus the fact that it is a game and I have the freedom to fudge things means I can come up with something interesting. I’ll definitely post here when I’ve got something worth looking at.

Great points @trimblet. I by no means meant to imply that this was easy by making the comparison! Just that, conceptually, that is what is going on and I’m targeting other programmers who aren’t linguists so it seemed like a good way to frame the problem. Here’s what I added to clarify: “The analogy is by no means perfect, kind of like comparing a cheetah and a car when trying to understand movement. Human language is organic and fuzzy and resilient, the C++ language is designed and precise and brittle. Regardless, the point is that, in my game, the user is going to type English and the game needs to do something and show that it understood things in a deep way (not based on keywords, etc.), much like what a compiler does with program text.”

Also, appreciate the breakdown of compositional vs. lexical semantics and speaker vs. sentence meaning. I will have to drill into those eventually, for sure. Just finding the right names for concepts seems to be half the battle…

Yes you can. I also like to see this perspective from newcomers. The hardest part is always explaining what we do to others.

What is even more interesting is why humans learn natural languages more easily than programming languages (I believe I read that somewhere, probably in some text from @ebender).

@EricZinda I’ve actually been toying with making a game using a DELPH-IN grammar for a natural language interface myself. I’d be happy to discuss with you more via chat or a call or something. I brought up the speaker meaning vs sentence meaning difference in large part because, from the game point of view, the ERG is only going to get you so far, in that its semantics is not like C++’s semantics. You’ll need something to handle lexical semantics (in NLU or dialogue systems this might be called entity resolution or entity linking) and something to handle speaker meaning (whether this is just sarcasm/irony detection or something more robust).
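For the entity-resolution piece, even the simplest version is worth sketching. Here’s a hypothetical Python example (the world model, object names, and heuristic are all invented) of mapping a noun predicate from a parse to concrete in-game objects:

```python
# A tiny hand-built world model: object id -> properties.
world = {
    "door_1": {"type": "door", "location": "hall"},
    "door_2": {"type": "door", "location": "cellar"},
    "key_1": {"type": "key", "location": "hall"},
}

def resolve(noun_type, player_location):
    # Collect all objects matching the noun's type, then prefer those in
    # the player's current location. A real system would also consult
    # salience, discourse history, quantifiers, etc.
    candidates = [oid for oid, obj in world.items()
                  if obj["type"] == noun_type]
    nearby = [oid for oid in candidates
              if world[oid]["location"] == player_location]
    return nearby or candidates
```

So “open the door” uttered in the hall resolves to door_1, while an ambiguous or out-of-scope reference falls back to all candidates, which is where dialogue moves like clarification questions come in.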

@trimblet Awesome! I’ll message you and set something up.