Integrating supertagging models with ACE

I am working on a project that intends to apply neural methods to HPSG supertagging and compare the results (in terms of parsing speed) to classic maxent-based supertagging (as in Dridan 2009).

Most of the information on the wiki is about the PET parser. I know that ACE also supports lexical filtering (the --ubertagging option), but beyond mentions of the option itself, I am not finding much documentation.

As a baseline, I have trained new maxent and SVM supertagging models on the latest ERG treebank release, using lexical types as tags. The question now is how to integrate these new models with ACE.

What should be my way forward at this point? I realize there is ACE source code for me to work with, of course, but I wanted to start a discussion here first.

Many thanks for any advice!

If your models are formatted the same way as the ones Rebecca Dridan produced and can be interpreted the same way, then it should be easy to load them into ACE as well. You would change (and possibly uncomment) the corresponding paths in erg/ace/config.tdl, rebuild the grammar image, and then proceed with the --ubertagging option.

If your models are of a fundamentally different shape – surely your neural methods will be, but I’m not sure about the maxent/SVM ones you have just trained – then this route will not be a useful one. For instance, if you want to use ACE to test the effects of your neural supertagging method on parsing speed (and I would hope accuracy), you will need to dive in a bit deeper.

The function you would most likely want to replace is called ubertag_lattice(). It is called when the --ubertagging option is active, and its job is to perform the forward-backward algorithm for trigram HMMs using the parameters loaded from the model files, which results in a probability score for each token candidate. In Dridan’s model, the candidates consist of a lexeme plus the chain of lexical rules it undergoes. When ubertag_lattice() returns, candidates whose probability is below a certain threshold have been removed from the lattice / chart. I imagine in the long run your methods for computing those scores will be completely different, but the goal is similar.
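For intuition, the kind of computation ubertag_lattice() does can be illustrated with a toy sketch. To be clear, this is not ACE's actual code: ACE works over a token lattice with trigram transitions and parameters loaded from model files, while the sketch below uses a flat sequence, a bigram HMM, and hard-coded toy parameters, just to show how per-candidate posterior scores are computed and then thresholded.

```python
# Toy forward-backward pruning over a flat tag sequence. All names and
# parameters here are made up for illustration; ACE's real
# ubertag_lattice() operates on its internal lattice structures in C.

def forward_backward(obs, tags, start_p, trans_p, emit_p):
    """Posterior P(tag_i = t | obs) at each position, bigram HMM."""
    n = len(obs)
    # Forward pass: alpha[i][t] = P(obs[:i+1], tag_i = t)
    alpha = [{t: start_p[t] * emit_p[t][obs[0]] for t in tags}]
    for i in range(1, n):
        alpha.append({t: emit_p[t][obs[i]] *
                      sum(alpha[i - 1][s] * trans_p[s][t] for s in tags)
                      for t in tags})
    # Backward pass: beta[i][t] = P(obs[i+1:] | tag_i = t)
    beta = [dict() for _ in range(n)]
    beta[n - 1] = {t: 1.0 for t in tags}
    for i in range(n - 2, -1, -1):
        beta[i] = {t: sum(trans_p[t][s] * emit_p[s][obs[i + 1]] * beta[i + 1][s]
                          for s in tags)
                   for t in tags}
    z = sum(alpha[n - 1][t] for t in tags)  # P(obs), the normalizer
    return [{t: alpha[i][t] * beta[i][t] / z for t in tags}
            for i in range(n)]

def prune(posteriors, threshold):
    """Keep, per position, only the tags whose posterior clears the threshold."""
    return [{t for t, p in post.items() if p >= threshold}
            for post in posteriors]

# Toy model: two tags, two words.
tags = ["N", "V"]
start_p = {"N": 0.6, "V": 0.4}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
emit_p = {"N": {"dogs": 0.8, "bark": 0.2}, "V": {"dogs": 0.1, "bark": 0.9}}
post = forward_backward(["dogs", "bark"], tags, start_p, trans_p, emit_p)
kept = prune(post, 0.5)  # [{"N"}, {"V"}] for these toy numbers
```

The pruning threshold plays the same role as ACE's ubertagging cutoff: low-probability candidates are dropped before parsing proper begins.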


Thank you so much for your response, Woodley!

I will start working on setting up a compiler and a debugger. It should be possible to do on both OSX and Linux, right?

OK… I am a bit confused right now, because from Dridan’s thesis I was under the impression that she used off-the-shelf taggers which included the HMM sequence-labeling step. But from what you are saying, the sequence-labeling step is performed directly in ACE?

(The models I currently have don’t actually involve an HMM at the moment. They are the models that Python’s scikit-learn maxent libraries produce, and they consist of just the coefficients trained for features that include the word context and the previous two tags (as in this paper, for example, which I tracked down after looking at several Clark and Curran papers related to the C&C tagger that Bec used). The decoders I’ve looked at so far for evaluating the taggers’ intrinsic accuracy are just the ones included in scikit-learn.)
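For concreteness, here is a minimal sketch of that feature scheme (word context plus the previous two tags) with a greedy left-to-right decoder. The feature names and the score_fn callback are hypothetical stand-ins; in a real setup score_fn would wrap a trained scikit-learn model (e.g. predict_proba over vectorized feature dicts), and the tagset would be the ERG lexical types.

```python
def features(words, i, prev_tag, prev2_tag):
    """Feature dict for position i: surrounding words plus the previous
    two tags. (Hypothetical feature names; a real tagger would likely
    use a wider context window and affix features.)"""
    return {
        "w0": words[i],
        "w-1": words[i - 1] if i > 0 else "<s>",
        "w+1": words[i + 1] if i < len(words) - 1 else "</s>",
        "t-1": prev_tag,
        "t-2|t-1": prev2_tag + "|" + prev_tag,
    }

def greedy_decode(words, tagset, score_fn):
    """Left-to-right greedy tagging: pick the best-scoring tag at each
    position, feeding the last two predicted tags back in as features.
    score_fn(feats, tag) stands in for the model's probability."""
    tags = []
    for i in range(len(words)):
        prev = tags[-1] if tags else "<s>"
        prev2 = tags[-2] if len(tags) > 1 else "<s>"
        feats = features(words, i, prev, prev2)
        tags.append(max(tagset, key=lambda t: score_fn(feats, t)))
    return tags

# Toy score function standing in for a trained model: prefer "V" right
# after an "N", otherwise prefer "N".
score = lambda f, t: 1.0 if (t == "V") == (f["t-1"] == "N") else 0.0
```

Because the previous-tag features depend on earlier decisions, greedy decoding is the simplest option; a beam or Viterbi-style search over the same feature scheme is a common refinement.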

Either Linux or OSX can be made to work. Getting the correct toolchain and Boost regex library in place is significantly easier under Linux, in my experience.

Dridan’s actual work was in the PET universe. What you see in ACE is my replication of her runtime setup. I don’t recall offhand whether she used something premade for decoding in PET or rolled her own.

One option that would perhaps be appealing, at least for your experimental setup (possibly less so for a production environment), would be to decouple things so that your decoder can run as a separate program at your leisure, and have the pruning function in ACE just read the scores from a file. That would make it easy to cross-tie it with whatever slew of ML toolkit decoders you want to use, at the expense of having to do several steps at parse time:

1. Run ACE once to get the candidate lexical token lattice and save it out to a file, via one replacement of the ubertag_lattice() function.
2. Run your decoder on that lattice file and save the results.
3. Run ACE again, this time with an ubertag_lattice() replacement that reads those scores and actually prunes based on them.
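The score-reading side of that last step might look something like the sketch below. The file format (one "edge_id TAB score" line per candidate) and the edge representation are made up for illustration; the real replacement would be written in C against ACE's internal lattice structures, and the format would be whatever your decoder actually emits.

```python
def read_scores(lines):
    """Parse 'edge_id<TAB>score' lines produced by the external decoder.
    (Hypothetical format; blank lines and '#' comments are skipped.)"""
    scores = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        edge_id, score = line.split("\t")
        scores[edge_id] = float(score)
    return scores

def prune_edges(edge_ids, scores, threshold):
    """Keep edges whose score clears the threshold. Edges the decoder
    never scored are kept conservatively rather than silently dropped."""
    return [e for e in edge_ids if scores.get(e, float("inf")) >= threshold]

# Example run against an in-memory "file":
scores = read_scores(["e1\t0.90", "e2\t0.02", "# comment", "e3\t0.55"])
kept = prune_edges(["e1", "e2", "e3", "e4"], scores, 0.5)  # e4 unscored, kept
```

One design point worth settling early is what to do with candidates the decoder did not score (e.g. tokens it could not map to its vocabulary): dropping them risks losing the correct analysis, so keeping them, as above, is the safer default.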