DMRS verbalization

In LC-QuAD (AskNowQA/LC-QuAD on GitHub), a QA dataset of natural language questions paired with SPARQL queries, systems need to learn the translation from question to query. There are many approaches to this translation, and some rely on steps like entity linking and relation linking. For entity/relation linking, ML methods usually need more context to compute better embeddings for the words in the utterance. Take, for instance, this question:

  1. question = Who owns the newspaper which was founded by Nehru?
  2. intermediary question: What is the <is owned by> of the <newspaper> whose <is founded by> is <Jawaharlal Nehru>?

The gold SPARQL query is

SELECT DISTINCT ?uri 
WHERE { 
?x <http://dbpedia.org/property/founder> <http://dbpedia.org/resource/Jawaharlal_Nehru> . 
?x <http://dbpedia.org/ontology/owner> ?uri  . 
?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Newspaper>
}
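To make the linking target concrete, here is a minimal Python sketch that rebuilds the query above from the bracketed signs of the intermediary question. The `LINKED` table is a hypothetical stand-in for the output of an entity/relation linker, and `build_query` is an illustrative helper, not part of LC-QuAD or any library.

```python
# Hypothetical linker output: linguistic sign -> KB URI (copied from the gold query).
LINKED = {
    "is owned by": "http://dbpedia.org/ontology/owner",
    "newspaper": "http://dbpedia.org/ontology/Newspaper",
    "is founded by": "http://dbpedia.org/property/founder",
    "Jawaharlal Nehru": "http://dbpedia.org/resource/Jawaharlal_Nehru",
}
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def build_query(linked):
    """Fill the fixed template for this question shape with the linked URIs."""
    triples = [
        f"?x <{linked['is founded by']}> <{linked['Jawaharlal Nehru']}> .",
        f"?x <{linked['is owned by']}> ?uri .",
        f"?x <{RDF_TYPE}> <{linked['newspaper']}>",
    ]
    return "SELECT DISTINCT ?uri WHERE { " + " ".join(triples) + " }"


if __name__ == "__main__":
    print(build_query(LINKED))
```

Once the four signs are linked, the rest of the query is a fixed template, so the hard part is the linking itself.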

So the challenge is to discover the URIs in the KB associated with the linguistic signs in the utterance. One DMRS produced for this question is

[image: DMRS graph for the question]

It is nice how close it is to the intermediary question above. Still, one idea to help the system learn the association between linguistic signs and URIs would be to expand the context with more words. To do that, I was thinking about a verbalization of this DMRS:

what is X that is a person, such that X owns Y, such that Y is a newspaper, such that Nehru founded Y.

In that case, not too much was added besides the term person. Still, maybe for other examples, the graph explanation may be a text fragment more informative for calculating the embeddings than the sentence itself. Does anyone have thoughts about ways to generate an explanation or verbalization of the DMRS graph…

Maybe obvious, but: ACE does generation as well as parsing. Seems like it could generate the English from the DMRS.

I think he wants to “generate” (in the general sense) a sentence that verbalizes the logical formula, not an English sentence that has the same semantics.

@arademaker It sounds like it would get complicated if you want to properly inflect verbs and things (_own_v_1 → owns), and if you want to foreground any questions. E.g., you have “What is X that is a person”, but if which_q was on _newspaper_n_of, you might want “What is X that is a newspaper such that Nehru founded X and…”. And what if there are multiple questions? I think it can be done rather mechanically if you don’t care about these issues, but it might end up a bit like Prolog instead of English: “X own Y and X is person and which X and…”.
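A minimal sketch of that mechanical approach, in plain Python. The DMRS here is hand-written and simplified (the node ids, the `(source, target, role)` link tuples, and the omission of passive marking are all assumptions of this sketch); a real pipeline would read the graph from a parser instead. The helpers `lemma` and `verbalize` are illustrative names, not an existing API.

```python
# Hand-written, simplified DMRS for
# "Who owns the newspaper which was founded by Nehru?"
NODES = {
    1: "which_q",
    2: "person",
    3: "_own_v_1",
    4: "_the_q",
    5: "_newspaper_n_of",
    6: "_found_v_1",
    7: "proper_q",
    8: 'named("Nehru")',
}
# (source node, target node, role); link post labels (EQ/NEQ/H) are left out.
LINKS = [
    (1, 2, "RSTR"),
    (3, 2, "ARG1"),
    (3, 5, "ARG2"),
    (4, 5, "RSTR"),
    (6, 8, "ARG1"),
    (6, 5, "ARG2"),
    (7, 8, "RSTR"),
]


def lemma(pred):
    """Strip predicate decoration: _own_v_1 -> own, named("Nehru") -> Nehru."""
    if pred.startswith("named("):
        return pred[len('named("'):-2]
    if pred.startswith("_"):
        return pred.split("_")[1]
    return pred


def verbalize(nodes, links):
    # Map each quantified node to its quantifier.
    restricted = {tgt: src for src, tgt, role in links if role == "RSTR"}
    letters = iter("XYZUVW")
    names = {}
    for nid in sorted(restricted):
        pred = nodes[nid]
        # Named entities keep their name; everything else gets a variable.
        names[nid] = lemma(pred) if pred.startswith("named(") else next(letters)

    clauses = []
    for nid in sorted(restricted):
        if nodes[restricted[nid]] == "which_q":
            clauses.append(f"which {names[nid]}")
    for nid in sorted(restricted):
        if not nodes[nid].startswith("named("):
            clauses.append(f"{names[nid]} is {lemma(nodes[nid])}")

    # Collect the arguments of each predicate that has ARG links.
    args = {}
    for src, tgt, role in links:
        if role.startswith("ARG"):
            args.setdefault(src, {})[role] = names.get(tgt, lemma(nodes[tgt]))
    for nid in sorted(args):
        a = args[nid]
        clauses.append(f"{a.get('ARG1', '?')} {lemma(nodes[nid])} {a.get('ARG2', '?')}")
    return " and ".join(clauses)


if __name__ == "__main__":
    print(verbalize(NODES, LINKS))
```

On this toy graph the result is "which X and X is person and Y is newspaper and X own Y and Nehru found Y": uninflected and Prolog-like, as expected, since nothing here knows about morphology or about which question to foreground.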

Actually do we have a way to get at the morphological internals of grammars, such that, for example, inflect(_own_v_1, 3sg) gives owns? That sounds like it would be very useful for other things as well.

It is not about generation, @EricZinda, but about explaining the semantic representation. The SUMO ontology has some similar ideas about producing an NL fragment that explains a formula (see the right-hand columns):

https://sigma.ontologyportal.org:8443/sigma/Browse.jsp?lang=EnglishLanguage&flang=SUO-KIF&kb=SUMO&term=SelfConnectedObject

Yes @goodmami you got it. But I don’t need to generate a complete grammatical and fluent sentence. Only a fragment that could potentially be used by ML methods as additional context.

But your question is relevant. Maybe @Dan can answer. I remember similar work with the GF framework (http://publications.lib.chalmers.se/records/fulltext/116606.pdf) that I can still try to explore.