In a thread about a PyDelphin error, @arademaker asked what “test suites” are, how they’re useful, and about grammar profiling more generally. I’m starting this separate thread so the discussion doesn’t get lost and so knowledgeable people can more easily contribute.
For the question about test suites, @olzama gave a brief description of what test suites are and linked to PyDelphin docs that explain the terminology (which is from the [incr tsdb()] user manual). I also added how profiles help manage the correspondence between inputs and outputs and store performance-related info.
@arademaker then said:
My task is mainly the grammar evaluation. Given a set of sentences I want to determine what the reason when a sentence was not parsed and which sentences generate more readings and why. I suspect some ambiguities are caused by compound terms not considered as such.
So I think it would be useful to lay out the various grammar profiling and development tools and their uses.
- [incr tsdb()] – the original grammar profiling tool; it has the most support for inspecting the competence and performance of grammars by looking at their outputs over test items. It has a GUI from which you can click on individual items and view parse trees, semantics, etc., as well as filter results on TSQL queries and compare to gold profiles. [incr tsdb()] can process (e.g., parse) test suites using a “CPU” such as the LKB, PET, or ACE, and it can be used to produce treebanks.
- PyDelphin – implements the database format defined by [incr tsdb()], which facilitates scripting over profiles; also implements a subset of TSQL queries and models derivation trees and MRS semantics. PyDelphin can also process profiles using ACE or via the web API.
- gTest – scripts built on top of PyDelphin that add further support for regression testing, coverage testing, and auditing the well-formedness of semantic representations.
- gDelta – a tool for comparing syntactic differences between the outputs of two versions of a grammar
- Typediff – a tool for exploring the analyses of grammatical phenomena through syntactic derivations
- FFTB – “full forest treebanker”, a standalone treebanking tool built on top of ACE (it can also be integrated with [incr tsdb()])
- … more are listed at the ToolsTop wiki
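To make the evaluation task above a bit more concrete, here is a minimal sketch of the kind of triage a script over a profile might do: list the items that got no parse, and rank the parsed items by number of readings. It deliberately uses plain Python and made-up data rather than any particular tool's API. The `ItemResult` class, the two helper functions, and the example sentences and counts are all hypothetical; only the field names `i-id`, `i-input`, and `readings` come from the [incr tsdb()] schema. In practice the rows would come from a processed profile (for example via a TSQL query over those fields) instead of being typed in by hand.

```python
from typing import List, NamedTuple


class ItemResult(NamedTuple):
    """One test item joined with its parse result (hypothetical container)."""
    i_id: int       # corresponds to the i-id field
    i_input: str    # corresponds to the i-input field
    readings: int   # corresponds to the readings field (0 means no parse)


def unparsed(rows: List[ItemResult]) -> List[ItemResult]:
    """Return the items for which the grammar produced no reading."""
    return [r for r in rows if r.readings == 0]


def most_ambiguous(rows: List[ItemResult], top: int = 3) -> List[ItemResult]:
    """Return up to `top` parsed items, most readings first."""
    parsed = [r for r in rows if r.readings > 0]
    return sorted(parsed, key=lambda r: r.readings, reverse=True)[:top]


# Made-up rows for illustration only; not real grammar output.
rows = [
    ItemResult(10, "The dog barks.", 1),
    ItemResult(20, "Dog the barks.", 0),
    ItemResult(30, "The guard dog house door opened.", 14),
    ItemResult(40, "Kim saw the cat with the telescope.", 4),
]

for r in unparsed(rows):
    print(f"no parse: [{r.i_id}] {r.i_input}")

for r in most_ambiguous(rows, top=2):
    print(f"{r.readings:3d} readings: [{r.i_id}] {r.i_input}")
```

A listing like this only tells you which items to look at. Working out why an item failed or why it has so many readings still means inspecting the derivations and semantics, which is where the GUI and comparison tools listed above come in.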
@arademaker, do you have any more specific questions?