What would be a convenient way to parse a corpus into the DM representation using an ERG-based parser? The ERG API does the job if I pass dm: sdp, but it is not suitable for parsing a large corpus. I read that I can use $LOGONROOT/www, but I’m encountering some errors running that. Before I open another thread to ask about those errors, I want to first make sure that this is indeed the easiest way to go.
In principle, it should be possible to parse a large corpus using ERG+ace and then export the resulting full structures to the DM format. I think that would involve going through a couple of steps, though, and you probably want to store the parser output in [incr tsdb()] profiles. Using art might help with this. Here’s documentation on ace:
http://moin.delph-in.net/AceTop
Perhaps @sweaglesw can give us a pointer to info on art?
I couldn’t quickly turn up documentation on exporting to DM. If no one on here has a good answer, you can try emailing developers@delph-in.net with that query.
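For the parsing step only, here is a minimal sketch in Python. It swaps in PyDelphin's ACE wrapper instead of art and [incr tsdb()] profiles, and it stops at MRS strings: it does not convert anything to DM. The function names parse_corpus and make_ace_parser, and the file names erg.dat and corpus.txt in the usage comment, are placeholders made up for illustration; it assumes PyDelphin is installed and an ace binary is on the PATH.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple


def parse_corpus(
    sentences: Iterable[str],
    parse_one: Callable[[str], Optional[str]],
) -> Iterator[Tuple[int, str, Optional[str]]]:
    """Yield (item id, sentence, parse or None) for each non-empty line."""
    item_id = 0
    for line in sentences:
        sentence = line.strip()
        if not sentence:
            continue  # skip blank lines without consuming an item id
        item_id += 1
        yield item_id, sentence, parse_one(sentence)


def make_ace_parser(grammar_path: str):
    """Return (parse_one, close) backed by a single long-running ACE process.

    Assumption: PyDelphin is installed and `ace` is on the PATH.
    grammar_path is a compiled grammar image (hypothetical example: erg.dat).
    """
    # Imported lazily so parse_corpus can be used without PyDelphin.
    from delphin import ace
    from delphin.codecs import simplemrs

    # "-n 1" asks ACE for only the top-ranked reading.
    parser = ace.ACEParser(grammar_path, cmdargs=["-n", "1"])

    def parse_one(sentence: str) -> Optional[str]:
        response = parser.interact(sentence)
        results = response.results()
        if not results:
            return None  # no parse found for this sentence
        return simplemrs.encode(results[0].mrs())

    return parse_one, parser.close


# Hypothetical usage (file names are placeholders):
#
#   parse_one, close = make_ace_parser("erg.dat")
#   try:
#       with open("corpus.txt", encoding="utf-8") as f:
#           for item_id, sentence, mrs in parse_corpus(f, parse_one):
#               print(item_id, mrs if mrs is not None else "NO PARSE", sep="\t")
#   finally:
#       close()
```

Keeping one ACE process open for the whole corpus avoids reloading the grammar for every sentence. The MRS output would still need to go through a separate converter to get DM.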
In http://moin.delph-in.net/WeScience there is an example of how to invoke the LOGON tool for converting between formats. I believe https://github.com/cfmrp/mtool also provides support for conversion between formats.
To close out this question: I was wrong, mtool does not provide conversion to DM. However, in "Convert to DM · Issue #122 · delph-in/pydelphin · GitHub" we discussed the issue, with pointers provided by Stephan.