Friday, August 13, 2004
On the separation of syntax and semantics
and the issue of pragmatics
A purpose crystallized in discussion on August 12th.
Our purpose is to test the hypotheses that
1) the Primentia patent is optimal in a specific sense related to the encoding of data, and
2) the underlying technology needs a semantic dimension.
Paul Prueitt, BCNGroup Director
Note from Ken Ewell
Founder of Readware Inc
I reviewed the PriMentia patent on creating reference data and reducing error and disorder in databases. Its number is 6,542,896.
I am comparing it with the MITi/Readware patent: "Method and Apparatus to Identify the Relationship of Meaning between Words in Text Expressions", U.S. Patent # 4,849,898, July 1989.
Paul, you already have from me a rather complete description of the perfectly orthogonal semantic representation our methods form upon the phonetic (or any) alphabet. Reading Primentia's patent made me think there is some possibility of merging the techniques in each of these systems of data representation.
It is important to note that the Readware ConceptBase is not built from the data at all. It was hand-constructed. Its entries came from the study and collection of only the most stable and coherent taxonomy of cultural word-roots we could find.
We were able to field the Readware concept-base (a reference database) in 1992 and its reformed version in 1996. Before that time we could only do pattern matching and correlation of linguistic data, something that many people do today.
Which patterns are the important ones? Which ones make sense? Which ones deserve our attention? These are problems that are not completely solvable with pattern matching and correlation. A number of problems arise if some type of constraint is not put on the interpretation of the meaning of patterns.
For example, without taxonomic conditions, TRAINS and TRENDS became related through the (early) stem-reduction method we used to parse the terms.
These two words have similar patterns of consonants -- indicating some possible semantic similarities between the *behavior* of trains and the *behavior* of trends.
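To make the consonant-pattern point concrete, here is a minimal Python sketch. It is not the stem-reduction method Readware used, which is not described here; it assumes a naive reduction that simply drops vowels, and it uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure. The function names are hypothetical.

```python
from difflib import SequenceMatcher

# Assumption: a naive "stem reduction" that keeps only consonants.
# This is an illustration, not the parsing method described in the note.
VOWELS = set("aeiou")


def consonant_skeleton(word: str) -> str:
    """Lower-case the word and drop vowels and non-letters, keeping order."""
    return "".join(
        ch for ch in word.lower() if ch.isalpha() and ch not in VOWELS
    )


def skeleton_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between the consonant skeletons of two words."""
    return SequenceMatcher(
        None, consonant_skeleton(a), consonant_skeleton(b)
    ).ratio()


if __name__ == "__main__":
    print(consonant_skeleton("TRAINS"))             # trns
    print(consonant_skeleton("TRENDS"))             # trnds
    print(skeleton_similarity("TRAINS", "TRENDS"))  # high: 8/9
```

With nothing but the pattern to go on, the two words look closely related, which is exactly the kind of match that needs an outside constraint to interpret.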
A cognitive system of interpretation might see the stems as unrelated via their concept-taxonomy. The concepts in which “trends” and “trains” appear can give guidance. A trend is not a trend unless its direction can be fixed. A train heads off in a fixed direction.
Is this conceptually or cognitively significant to understanding what a train or a trend is or does? How can we expect an algorithm to reach a conclusion?
The anticipatory approach that you have talked about gives a surprising answer. I look forward to a project that can bind together what you have with what we have.
Because Text Retrieval systems do not reduce the original patterns and use the keywords, they do not have to deal with a literal problem. I don't see Primentia getting away from the literal representations. I agree with you that if that is all one has, one will not have enough to capture even most of the possible meaningful, semantic and significant relations.
Our solution to the literal problem was to figure out how objects fit into a worldview. This sounds like Cycorp’s notion of microtheories, but is not.
We began by studying ancient languages. Ancient languages had a full complement of terminology. By Aristotle's time, the Greek language was rich enough to create the foundations of the scientific study of everything. This was a time when the world was a good deal less sophisticated and the theories (concepts) about the world were given in more concrete (and maybe less emotionally charged) terms. There were fewer synonyms.
By studying the terms of several old languages we were able to devise a taxonomy of concepts for a broad spectrum of human endeavors. The taxonomical relationships became our stable attractors and we use them to inform the mathematical data models that are created by the software. We see something similar in the way that PriMentia uses a set of known facts to bootstrap a complete recovery of data in fractured databases.
We also used our taxonomical elements to solve problems of synonymy and polysemy that plague other systems.
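As a rough illustration of how a hand-built taxonomy can constrain pattern matches, consider the Python sketch below. Everything in it is an assumption made for the example: the stems, the concept labels, the threshold, and the function names are invented and are not taken from the Readware ConceptBase.

```python
# Hypothetical, hand-written taxonomy: stem -> set of concept labels.
# The entries are invented for illustration only.
TAXONOMY = {
    "train": {"vehicle", "transport"},
    "trend": {"tendency", "statistics"},
    "track": {"transport", "path"},
}


def shared_concepts(a: str, b: str, taxonomy=TAXONOMY) -> set:
    """Concept labels that both stems belong to (empty if either is unknown)."""
    return taxonomy.get(a, set()) & taxonomy.get(b, set())


def accept_relation(a: str, b: str, pattern_score: float,
                    threshold: float = 0.8, taxonomy=TAXONOMY) -> bool:
    """Accept a pattern-based match only if the taxonomy also relates the stems.

    pattern_score is whatever surface-similarity measure is in use;
    the 0.8 threshold is an arbitrary value chosen for this sketch.
    """
    strong_pattern = pattern_score >= threshold
    return strong_pattern and bool(shared_concepts(a, b, taxonomy))


if __name__ == "__main__":
    # Strong surface pattern, but no shared concept: rejected.
    print(accept_relation("train", "trend", pattern_score=0.89))
    # Strong surface pattern and a shared concept ("transport"): accepted.
    print(accept_relation("train", "track", pattern_score=0.85))
```

The design choice is that the taxonomy is fixed in advance and never learned from the data, so a surface pattern can only confirm a relation that the taxonomy already allows.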
I am sure you can see, Paul, that there is a philosophic and linguistic problem in this cognitive space. The issues I mentioned above have to do with language change and the ways in which words are used to represent ideas and to reference objects in the world. If we can properly lay down a new computational foundation we might be able to make rapid progress in areas that have seen stagnation in the recent decade.
Somewhere, somehow, we have to inform the mathematical model/measure that not only do certain objects occur in patterns, they also form or represent classes and that it is the potential of symmetry, harmony, coherence and relatedness between these classes that warrants our attention.
Regards and thanks for keeping up the work!