Implied information, background knowledge, ellipsis, coreference, figurative speech, ambiguity: these are a few of the immense challenges a natural language semantic system faces. And yet, humans process language in real time every day with very little misunderstanding. How can a computer do the same?
By constraining the problem. Sixty-six million and some odd
thousand is, indeed, a large number. Two hundred thirty-five
billion is much larger. These two numbers represent the number of
choices a computational semantic system faces for a medium-sized problem and a
slightly larger one. Come across a truly long sentence and the numbers soar
far higher still. And that is only to determine basic semantic dependencies;
add in ellipsis, discourse and coreference resolution possibilities and they
increase even faster. Such exponential growth in the size of
the problem must be constrained if serious work is to be accomplished.
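
To give a rough feel for this kind of growth, the sketch below counts the ways the words of a sentence can be attached to one another so that they form a single tree under an artificial root. This is an assumption made purely for illustration: the text does not say how its own figures were obtained, and the function name count_dependency_trees is hypothetical. The count comes from Cayley's formula for labelled trees, (n + 1)^(n - 1) for n words, and it ignores arc labels, projectivity, and every other linguistic constraint.

```python
def count_dependency_trees(n_words: int) -> int:
    """Count unlabelled dependency trees over n_words words plus one artificial root.

    Illustrative assumption: any word may attach to any other word or to the
    root, as long as the result is a tree. By Cayley's formula there are
    (n + 1) ** (n - 1) labelled trees on n + 1 nodes, and fixing the root
    node determines the direction of every arc.
    """
    if n_words < 1:
        raise ValueError("a sentence needs at least one word")
    return (n_words + 1) ** (n_words - 1)


if __name__ == "__main__":
    # Print the unconstrained count for a few sentence lengths.
    for n in (5, 10, 20, 40):
        print(f"{n:>2} words: {count_dependency_trees(n):,} candidate trees")
```

Even under these simplified assumptions the count grows faster than any fixed power of the sentence length, which is why unconstrained enumeration is not an option.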