Examples of HPSG/MRS (or any syntax/semantics formalism) being used in applications?

I am teaching an introductory course on computational linguistics this quarter. For some of the upcoming lectures, I want to give my students examples of syntax and semantics being used in applications, partly as a way to show them why it matters and to counter the hype that ML is the only way to do things anymore.

I would love to hear about any examples of syntax/semantics being used for NLP tasks.

Relatedly, I know I’ve heard about ERG mal-rules being used for CALL; are there any good demos or citations for this?

Note: the syntax applications lecture and the semantics applications lecture are separate, so while I’m sure most applications involve both, if there are examples that are more syntax-focused and some that are more semantics-focused, that would be great! But anything is welcome and will help 🙂 It doesn’t just have to be DELPH-IN stuff either, though I expect that most examples I get from this forum will be in that space 🙂

The most recent CALL and CALL-ish citations related to grammars that I know of are:

  1. Morgado da Costa et al. 2020 https://aclanthology.org/2020.lrec-1.46.pdf
  2. Morgado da Costa et al. 2016 https://aclanthology.org/W16-4914.pdf

I am currently working on something similar but unfortunately I don’t have a paper yet.

More generally, I would say that the value of grammars can be explained by the simple fact that all treebanks come from somewhere. The treebanks that semantic parsers are trained on were annotated somehow. If they were annotated by hand, that is very expensive and inconsistent, and if your opinion about how things should be annotated changes, you have to hire annotators again and spend a lot of time updating the treebank. In contrast, if you have a grammar (which is undoubtedly very expensive to build), you can update the treebank faster and more consistently (assuming the grammar can be revised so that it reflects your new ideas about what the structure should look like).
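To make the “automatic annotation” point concrete, here is a minimal sketch of grammar-based annotation using pydelphin and ACE: parse a batch of sentences with the ERG and record the MRS for the top reading of each. The grammar image path and the sentences are placeholders, and it assumes pydelphin and the ACE binary are installed.

```python
# Sketch of grammar-based annotation: parse sentences with the ERG via ACE
# and dump the MRS of the top reading for each one.
# Assumptions: pydelphin and ACE are installed; "erg-2020.dat" is a
# placeholder path to a compiled ERG grammar image.
from delphin import ace
from delphin.codecs import simplemrs

sentences = [
    "The dog sleeps.",
    "Kim gave Sandy a book.",
]

with ace.ACEParser("erg-2020.dat") as parser:
    for sentence in sentences:
        response = parser.interact(sentence)
        results = response.results()
        if not results:
            print(f"# no parse: {sentence}")
            continue
        # Take the top-ranked reading; in real treebanking an annotator
        # (or a parse-ranking model) would choose among the readings.
        mrs = results[0].mrs()
        print(f"# {sentence}")
        print(simplemrs.encode(mrs, indent=True))
```

If the grammar changes, you rerun this over the corpus and the annotations are updated automatically and consistently.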

Now, because treebanks produced by a grammar are more consistent, they probably make better training data. See, for example:

  1. Lin et al. 2022 https://par.nsf.gov/servlets/purl/10345521

To summarize, if you want annotated training data at all, there is no way around annotating it using some kind of formalism. The only question is whether you apply the formalism by hand or build a grammar. Both ways have advantages and disadvantages: the main advantages of a grammar are consistency and automatic annotation, and the main disadvantages are the cost of building it and the need for an expert to maintain it over the years.

Of course, nowadays some people are asking whether annotated training data, and even parsing algorithms, are useful at all. My opinion is that, while you may be able to train a neural system on unannotated data and ask it to perform, say, parsing as a sequence-to-sequence task, as a human you still have hardly any way to evaluate the output without a concrete (formal) way of interpreting the results. In other words, no matter how a system was trained (with or without a linguistic theory), people will continue reasoning about it in terms of theory. For example, people probe neural systems using data annotated as trees. (I am not aware of anyone using HPSG or something similar for probing; I think people normally use UD because it is simple to work with, but it is the same idea.)
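For the curious, here is a rough sketch of what such a probe looks like: take contextual vectors from a pretrained encoder and check how well dependency-relation labels can be predicted from them with a simple linear classifier. The model name, the toy “treebank,” and the labels are illustrative only; it assumes the transformers and scikit-learn packages are installed, and a real probe would of course use a proper UD treebank and a held-out test set.

```python
# Minimal linear-probe sketch: predict dependency relation labels from
# contextual word vectors. Toy data only, for illustration.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased")

def word_vectors(words):
    """Return one hidden-state vector per word (first subword piece)."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]
    word_ids = enc.word_ids()
    return [hidden[word_ids.index(i)].numpy() for i in range(len(words))]

# Toy "treebank": words paired with UD-style relation labels (made up here).
sentences = [
    (["The", "dog", "sleeps"], ["det", "nsubj", "root"]),
    (["A", "cat", "purrs"],    ["det", "nsubj", "root"]),
]

X, y = [], []
for words, labels in sentences:
    X.extend(word_vectors(words))
    y.extend(labels)

probe = LogisticRegression(max_iter=1000).fit(X, y)
# How linearly recoverable the labels are from the hidden states:
print(probe.score(X, y))
```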

There is also a wiki page listing applications: DelphinApplications on the delph-in/docs wiki (https://github.com/delph-in/docs/wiki/DelphinApplications).