Choices data structure and loading a python dict into ChoiceDict


#1

I would like to write a custom constructor for the ChoiceDict class in the Matrix’s Choices.py (I am working with a copy of that for MOM but I could eventually merge it into the Matrix’s copy if it seems useful).

Currently, ChoiceDict, which is the outer structure for any choices file as it gets loaded, can be initialized either empty or from a choices file. If it is initialized from a choices file, it assumes that the LHS of every line is a full dictionary key (such as “verb-pc1_lrt1_lri1_feat1_name”).
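
To make the full-key format concrete, here is a minimal sketch of how such a key breaks down into levels. `split_full_key` is a hypothetical helper, not something in Choices.py, and it assumes that underscores only ever separate levels and that a trailing number on a segment is a list index:

```python
import re

# Hypothetical helper (not part of Choices.py): split a full choices key
# into (name, index) segments. index is None for a plain attribute.
# Assumption: '_' only separates levels; trailing digits are a list index.
SEGMENT = re.compile(r'^(.*?)(\d+)?$')

def split_full_key(full_key):
    segments = []
    for part in full_key.split('_'):
        name, index = SEGMENT.match(part).groups()
        segments.append((name, int(index) if index else None))
    return segments

print(split_full_key('verb-pc1_lrt1_lri1_feat1_name'))
# [('verb-pc', 1), ('lrt', 1), ('lri', 1), ('feat', 1), ('name', None)]
```

Read this way, every indexed segment is an item in a list and the final unindexed segment is an attribute of the innermost dict.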

I need a constructor that accepts a Python dictionary. (As a reminder, a ChoiceDict inherits from dict but is not a dict: it does various clever things with counting, and by the same token with adding and removing items, which drive me mad, but I think this is needed for proper interaction with the website or something. Similarly, a ChoiceList inherits from list but is not a list.)

I wrote an iterative function which assumes that the morphotactics choices structure is the deepest one: a position class (a dict) can contain a list of lexical rule types, each of which can in turn contain lists of things (features and instances). So, three levels deep.

Questions:

(1) Is there a deeper possible structure?

(2) What would be the base case if I wanted to write a recursive function? I first thought that the deepest level won’t have a “name” attribute but for example features do… Perhaps it’s just the fact that “there are no more lists among the members of this dict”? Doesn’t sound like a very satisfying base case…


#2

As for the recursive solution, here’s what I ended up writing for now:

    def dict2ChoiceDict(self, list_of_python_dicts):
        # Build a ChoiceDict from a list of nested plain dicts (e.g. loaded from JSON).
        mc = MatrixChoices.ChoiceDict()
        for d in list_of_python_dicts:
            # The top-level 'name' value serves as the key prefix.
            self.dict2ChoiceDict_helper(d, d['name'], mc)
        return mc

    def dict2ChoiceDict_helper(self, python_dict, key_so_far, mc):
        list_attrs = []
        for k, v in python_dict.items():
            if isinstance(v, str):
                # A string value becomes a full-key entry.
                mc[key_so_far + '_' + k] = v
            else:
                # Anything that is not a string is assumed to be a list of dicts.
                list_attrs.append((k, v))
        # Base case: with no list-valued attributes, the loop below does nothing.
        for k, items in list_attrs:
            for i, item in enumerate(items, start=1):
                self.dict2ChoiceDict_helper(item, key_so_far + '_' + k + str(i), mc)
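
To see what keys this produces, here is a condensed, standalone version of the same recursion that writes into a plain dict instead of a ChoiceDict, so it runs without the Matrix code. The sample position class and its attribute names (`order`, `inflecting`, `orth`, `value`) are made up for illustration and are not taken from a real choices file:

```python
def flatten(python_dict, key_so_far, out):
    """Write full-key entries for python_dict into the plain dict `out`."""
    for k, v in python_dict.items():
        if isinstance(v, str):
            out[key_so_far + '_' + k] = v
        else:
            # Assumption: every non-string value is a list of dicts.
            for i, item in enumerate(v, start=1):
                flatten(item, key_so_far + '_' + k + str(i), out)

# Made-up sample data, three levels deep: position class > lrt > feat/lri.
pc = {
    'name': 'verb-pc1',
    'order': 'suffix',
    'lrt': [
        {'name': 'past',
         'feat': [{'name': 'tense', 'value': 'past'}],
         'lri': [{'inflecting': 'yes', 'orth': 'ed'}]},
    ],
}

flat = {}
flatten(pc, pc['name'], flat)
for key, value in flat.items():
    print(key, '=', value)
```

The recursion stops on its own at any dict with no list-valued members, so nothing in it depends on the structure being exactly three levels deep.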

#3

Hmm, I don’t have an answer to your questions, but I can give a bit of unsolicited history :)

The choices used to be stored in a regular dictionary mapping the full keys to the values, and there were functions to iterate over those keys:

    >>> choices.iter_begin('noun')       # iterate through all nouns
    >>> while choices.iter_valid():
    ...   type = choices.get('type')
    ...   choices.iter_begin('morph')    # sub-iterate through all morphs
    ...   while choices.iter_valid():
    ...     orth = choices.get('orth')
    ...     order = choices.get('order')
    ...     choices.iter_next()          # advance the morph iteration
    ...   choices.iter_end()             # end the morph iteration
    ...   choices.iter_next()            # advance the noun iteration
    >>> choices.iter_end()               # end the noun iteration

When I joined the Matrix project, one of my first changes was to parse the choices into a nested structure (roughly as we have now), but while keeping the original choices file format. In order to handle the dictionary-type access (choices.get() above) and the list enumeration (choices.iter_begin(), above) we made the ChoiceDict and ChoiceList classes and changed their basic access methods to be aware of the existing structure in the choices file format. In retrospect I should have pushed harder to just switch to a more standard format like JSON (see here for my thoughts at the time), but oh well.

Anyway, I guess I’m not sure what you’re trying to do now. You want to initialize something like a partial choices file? Or you want to create a ChoiceDict from the nested structure and not from the full keys?


#4

I am trying to initialize from JSON, as it happens :). From dict, but I get that by loading JSON.

We’ve been handing a JSON serialization to the MOM GUI and, assuming something might have been changed by the user via that GUI, we would like to be able to load the result back. Obviously we don’t want to manipulate strings; having a dict which can be serialized directly as JSON is far preferable. So I end up with a dict loaded from JSON that came to me from the GUI, and I need to turn that into a choices file.
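
One thing to watch for on the way back in: `json.loads` can return numbers, booleans and `None` as values, and the helper above treats anything that is not a string as a list. A possible pre-processing step, sketched here with a hypothetical `stringify_scalars` function and a made-up payload, converts every scalar to a string first. Writing non-string scalars in JSON's own spelling (`true`, `null`, `2`) is just a placeholder choice, not what the choices format requires:

```python
import json

def stringify_scalars(node):
    """Return a copy of a loaded JSON structure with all scalars as strings."""
    if isinstance(node, dict):
        return {k: stringify_scalars(v) for k, v in node.items()}
    if isinstance(node, list):
        return [stringify_scalars(item) for item in node]
    if isinstance(node, str):
        return node
    # Placeholder choice: keep JSON's spelling for numbers, booleans and null.
    return json.dumps(node)

# Made-up payload standing in for what the GUI sends back.
payload = '[{"name": "verb-pc1", "obligatory": true, "lrt": [{"name": "past", "count": 2}]}]'
data = stringify_scalars(json.loads(payload))
print(data)
```

A list of bare strings (rather than a list of dicts) would still trip up the helper, so that case would need separate handling if the GUI can produce it.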

I think I succeeded with the above, though I haven’t thoroughly tested it.