Why would I sometimes hit this error but not always:
File "/home/olga/delphin/parsing_with_supertagging/venv/lib/python3.8/site-packages/delphin/itsdb.py", line 891, in process
_add_row(self, tablename, data, buffer_size)
File "/home/olga/delphin/parsing_with_supertagging/venv/lib/python3.8/site-packages/delphin/itsdb.py", line 918, in _add_row
File "/home/olga/delphin/parsing_with_supertagging/venv/lib/python3.8/site-packages/delphin/itsdb.py", line 793, in commit
File "/home/olga/delphin/parsing_with_supertagging/venv/lib/python3.8/site-packages/delphin/tsdb.py", line 845, in write
raise NotImplementedError('cannot append to a gzipped file')
NotImplementedError: cannot append to a gzipped file
with ace.ACEParser(grammar, cmdargs=cmdargs, executable=ace_exec, stderr=errf) as parser:
cmdargs can be “-1” or “-1 --ubertagging=000.1”.
With some profiles the process finishes, but with others I get the error above. Why would that happen on the ERG tsdb profiles, all of which have the same format (namely, item.gz etc.)?
When you parse a profile, the item(.gz) file will be read, but not written to. One possible cause of the error is that some of the profiles you are working with have been compressed to save space and may need to be uncompressed before processing.
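If uncompressing does turn out to be necessary, a minimal sketch of doing it with only the Python standard library might look like the following. The helper name gunzip_profile is made up for illustration and is not part of PyDelphin; it assumes the profile is a flat directory of files such as item.gz, and it is safest to run it on a copy of the profile.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_profile(profile_dir):
    """Decompress every *.gz file directly inside profile_dir.

    Hypothetical helper (not a PyDelphin function). Returns the list of
    uncompressed paths that were written.
    """
    written = []
    for gz_path in sorted(Path(profile_dir).glob("*.gz")):
        plain_path = gz_path.with_suffix("")  # e.g. item.gz -> item
        # Note: an existing uncompressed file with the same name is overwritten.
        with gzip.open(gz_path, "rb") as src, open(plain_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        gz_path.unlink()  # do not leave both item and item.gz behind
        written.append(plain_path)
    return written
```

Calling gunzip_profile on the path of a profile copy would leave only plain-text files in it, so any later append no longer targets a gzipped file.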
What @Dan said is partially true for PyDelphin. The .gz files are compressed, but PyDelphin is happy to read and write them transparently (i.e., usually you don’t need to know or care whether the file was gz-compressed). One exception is when PyDelphin is appending to a file on disk rather than writing the whole file anew, because the gzip compression would work better on the whole file than compressing it piecemeal.
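To illustrate the point about piecemeal compression, here is a small sketch using plain Python gzip, not PyDelphin's own code. The standard gzip module itself does permit appending (each append becomes a separate gzip member), but every member carries its own header and trailer and cannot reuse what was compressed before it. The function name gzip_sizes and the sample rows are made up for the example.

```python
import gzip
import os
import tempfile


def gzip_sizes(rows):
    """Return (whole_size, piecemeal_size) in bytes for the same rows.

    Illustrative only: compares writing all rows in one gzip stream with
    appending each row as its own gzip member.
    """
    with tempfile.TemporaryDirectory() as tmp:
        whole = os.path.join(tmp, "whole.gz")
        piecemeal = os.path.join(tmp, "piecemeal.gz")

        # Compress everything in a single pass.
        with gzip.open(whole, "wb") as f:
            f.write(b"".join(rows))

        # Append one row at a time; each open/close adds a new gzip member.
        for row in rows:
            with gzip.open(piecemeal, "ab") as f:
                f.write(row)

        # Both files decompress to the same bytes.
        with gzip.open(whole, "rb") as a, gzip.open(piecemeal, "rb") as b:
            assert a.read() == b.read()

        return os.path.getsize(whole), os.path.getsize(piecemeal)


rows = [f"{i}@some sentence number {i}\n".encode() for i in range(1000)]
whole_size, piecemeal_size = gzip_sizes(rows)
print(whole_size, piecemeal_size)
```

On short rows like these, the piecemeal file comes out noticeably larger than the one compressed in a single pass, which is the trade-off described above.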
There are two questions:
1. Why is it gz-compressing the files?
2. Why is it appending to the files?
RE (1), if you aren’t passing the -z or --gzip options, it won’t compress the results, but the exception can still be raised if the profile has already-compressed files. In the snippet below, gzip is the flag to determine if the results will be compressed, and use_gz is a flag to indicate if an existing file is gzipped: