Section 3 (Chapter 5 in [1])
Associative memories between
theme space and semantic space
Computational association
between discrete and continuous mathematical spaces S1 and S2
can be used to define transformations between partitions of
representational spaces. If we denote a
transformation between the two spaces as “T”, then we have the
mathematically defined relationship that for each element a of S1
there exists an element b in the
space S2 such that
T(a) = b
The partitions are needed in
order to create a system of formal containers that assist humans in performing
agile decompositions of informational organizations into areas where rules of
transformation can usefully vary.
The partitions and
transformations are needed to reflect a diversity of human viewpoint and the
relative nature of information to varying contexts. Over time, one expects to map out similarities and differences
between these partitions.
Shifts in viewpoint change
the organization of information. The
existence of transformations, based on continuum mathematics, over how information
is organized is critically important in moving from one human viewpoint to
another. The roles of the abstractions
that allow humans to agree on the nature of these transformations are
essential in ways that our work attempts to explain.
One issue to consider as the
reader tackles this new work is the unique location property (ulp). Computers can depend on the ulp, when it is
provably present, in order to decrease the number of machine cycles required
for fetching data. In the continuum,
one can always insert, delete or modify the localized information between two
other “pieces” of localized information.
This update does not change the ulp over the data! The update takes one simple step, which is
to make the update occur as part of the abstraction. The virtual model can also be written into computer memory in
such a fashion that the ulp is preserved.
This requires a complete rewriting of all data in one
pass.
Let us look at ulp
again. A database record is one
example of localized information. A
block of records that has been written to computer memory cannot be updated in
one machine cycle in a fashion that preserves the unique location property
over the discrete structure in memory.
However, a virtual model of information organization can exist with the
ulp in the continuum mathematics due to the properties of the real
numbers. In the continuum mathematics
the updates are essentially one step and thus can be achieved in one or only a
very few machine cycles. Once an update
to the abstraction has been made the abstraction can be used to write out to
computer memory a contiguous record that does have a physically rendered
ulp. Computer memory with ulp has
special properties that allow parallel and serial processes to act in the most
efficient way possible. The term “structural
holonomy” has been developed, elsewhere, to refer to these contiguously written
records.
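The two steps described above (a one-step update in the abstraction, followed by a one-pass rewrite into contiguous memory) can be sketched as follows. This is only an illustrative sketch: rational numbers from Python's fractions module stand in for points of the continuum, and the record names, the midpoint rule, and the functions insert_between and render are assumptions made for the example, not constructs taken from the text.

```python
from fractions import Fraction

# Virtual model: each record sits at a unique rational position, a stand-in
# for a point on the real line. Record names are hypothetical.
model = {Fraction(0): "rec-A", Fraction(1): "rec-B", Fraction(2): "rec-C"}


def insert_between(model, left, right, record):
    """Place a record at the midpoint of two existing positions.

    No other record moves, so every position in the model stays unique.
    """
    key = (left + right) / 2
    if key in model:
        raise ValueError("position already occupied; pick adjacent neighbours")
    model[key] = record
    return key


def render(model):
    """Rewrite all records in one pass into a contiguous list.

    The list index plays the role of the physical location, and it is
    unique for each record.
    """
    return [model[key] for key in sorted(model)]


new_key = insert_between(model, Fraction(0), Fraction(1), "rec-X")
memory = render(model)
```

The insertion touches a single entry of the virtual model, while render rewrites every record once, which mirrors the distinction the text draws between updating the abstraction and producing a contiguously written record.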
The association between
vectors and semantic primitives
Our purpose in this section
is to provide a strategy that uses several knowledge technologies to develop a
representation about a specific object of investigation. The object of investigation can be the
structure of how computer data is organized.
But the object of investigation may also be a complex natural phenomenon
that cannot be perfectly modeled in either the continuum mathematics or the
discrete formalism. The phenomenon may
be, for example, consequent to a social or psychological system.
Any one of several different
types of associative neural networks can easily be used to define
transformations between types of representations when representations exist in
a mathematical formalism. Simple
bi-associative memories can learn specific mappings between different knowledge
spaces and between knowledge spaces and themespaces. Our strategy is to utilize
multiple types of representations and then to create an associative memory
between the representational spaces.
Consider two major types of
representation.
o Theme / keyword vector representation
o Semantic primitive representation
The two types of
representation are very different in nature and formal notation. Theme vector representation is well
developed in techniques like latent semantic indexing and self-organizing
feature maps. Differential ontology
creates a mapping between the continuum mathematics and discrete encoding
structures. In fact, the first application
of differential ontology was in converting the results of LSI placement of
text, within the categories defined by LSI transforms, into ( type : value ) pairs
(Prueitt 2003).
The semantic primitive
representation depends on a theory of type that has been worked out by a line
of scholarship that includes the works of C. S. Peirce (1839-1914), the Russian
work described in Chapters 1 and 2, and the more recent work by John Sowa and
Richard Ballard. Ballard’s work makes
clear the use of a Zachman-type framework to create language-neutral
informational codes that reflect a specific theory of type. The new work on generalFramework theory,
introduced by Prueitt (2001), can be used to encode knowledge experiences directly
into a framework. The techniques of
differential ontology can be used to convert explicit information developed
from the use of these primitives into a Hilbert space representation.

Figure 1 : The use of associative
memories to link themespace and concept space (Prueitt, 1998)
Before developing some
additional notation, we make the observation that knowledge technology
acknowledges the importance of a direct awareness about the structure of events
occurring in the world. This structure
is specific and not random. So two,
or more, good methods for discovering knowledge about the structure of events
will derive representations that have some type of categorical
linkage. If the methods are good, then
variation related to viewpoint is expected, but some important essence of
the events will be preserved in each viewpoint.
The computer encoding of
knowledge, however, creates a glass ceiling and some illusions. Differential ontology maps computer encoding
with the aid of continuum mathematics and therefore raises the glass ceiling
that one must expect from any type of formalism. But the ceiling remains.
In the knowledge sciences,
the computer program is always regarded as an abstraction, and thus not
something that is in the same category as natural phenomena. Programs and computer states organized by
programs are like counting numbers in that these abstractions do not have a
location or any type of boundary conditions.
Abstractions are not “physically” real.
Instantiating an abstraction as
a computer state does cause something to exist as a physical state. However, that computer’s physical state is
highly constrained to reflect a specific type of discrete formalism. Thus the computer states share something in
common with the discrete formalisms.
Zeno’s paradox, the Russell paradox and other artifacts of classical
logic and mathematics express foundational limitations. These observations set the stage for the
knowledge technologies.
Theme space: Let C be a
collection of text documents separated into text units.
For each text unit d_k in C, a set of phrases
{ p_{k,1}, p_{k,2}, ..., p_{k,h} }
can be identified as a
representation of the semantic content of the text unit. The parameter h
depends on the representational procedures and on the text unit.
Let
A = ∪_k { p_{k,1}, p_{k,2}, ..., p_{k,h} }
be the union of phrases from
each text unit in the collection C.
Given a narrow domain for
the collection C, the size of A is weakly convergent and A is
an open set,
A = { q_1, q_2, ..., q_{n(t)} }.
By “weakly convergent” we
mean that new phrases may occasionally be added or removed. The domain might be thought of as a universe
of discourse and the collection C a sample of text expression of this
universe of discourse. The weakly
convergent property is therefore important to fully and minimally model the
thematic content of the universe of human discourse as expressed over
time.
Let S be the vector space
where each q_i is assigned a distinct dimension.
Using the interval [0,1]
along each dimension, S = [0,1]^{n(t)}.
This assignment imposes
a unique location property on the set
{ q_1, q_2, ..., q_{n(t)} }.
Each q_i is
assigned a unique location in the space S. Again, n(t) is an
integer-valued variable whose value depends on the situation.
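As a minimal sketch of the theme space construction above, the following assumes that phrase extraction has already been done and that each text unit is given as a list of phrases. The function name build_theme_space, the sample phrases, and the use of 1.0 to mark the presence of a phrase are assumptions for illustration; the text does not specify how coordinate values in [0,1] are to be computed.

```python
def build_theme_space(units):
    """Build the phrase set A and one vector in [0,1]^n per text unit.

    units: dict mapping a text unit id (d_k) to its list of phrases.
    Returns (phrases, vectors), where phrases[i] is q_i and therefore
    names dimension i of the space S.
    """
    # A is the union of the phrase sets of all text units.
    phrases = sorted(set().union(*units.values()))
    # Each phrase q_i receives its own dimension: a unique location.
    index = {phrase: i for i, phrase in enumerate(phrases)}

    vectors = {}
    for unit_id, unit_phrases in units.items():
        vector = [0.0] * len(phrases)
        for phrase in unit_phrases:
            # Assumption: presence is coded as 1.0, absence as 0.0.
            vector[index[phrase]] = 1.0
        vectors[unit_id] = vector
    return phrases, vectors


# Hypothetical text units, for illustration only.
sample_units = {
    "d1": ["theme space", "associative memory"],
    "d2": ["semantic primitive", "theme space"],
}
phrases, vectors = build_theme_space(sample_units)
```

Because A may gain or lose phrases over time, the number of dimensions n(t) in such a sketch is recomputed whenever the collection changes.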
Semantic prime space: Now let { D_i } be a set of knowledge domains
and D_i be one of these domains. As before, the domain can be thought of as a sampling from some
active universe of human discourse.
Suppose that D_i
can be represented by a set of syntagmatic units, each in the form of an
ordered triple < c_1, r, c_2 > where c_1
and c_2 are concept symbols and r is a relationship symbol. Suppose further that the relationship symbol
is specified using only the set of semantic primitives discussed by Ballard or
Sowa. Allow the set of concept symbols
to be enumerated by a human, or human community, in the form of a controlled
vocabulary. This controlled vocabulary would then be part of a simple graph
construct.
Suppose, further, that { D_i
} is the minimal set of domains required to describe the semantic
interpretations of the collection of text documents C. We assume that a software interface
exists that allows two activities. The
first is the development of a community-vetted controlled
vocabulary. The second activity is
a mark-up of the “primitive relationships” between elements
of the controlled vocabulary in context.
This means that a human is aware of a context and, consistent with this
awareness, the human makes an annotation about relationships of the types
specified by a specific set of semantic primitives. Community agreement on the set of semantic primitives is like
negotiated agreement on controlled vocabularies, except that the set of
primitives is small and reflects a mature theory of how phenomena are
organized. The set of all details of
Newtonian mechanics is in this sense a community-vetted theory that is derived
from a small set of primitives.

Figure 2: A Knowledge base Framework
Using a general framework construct, such as a Zachman
framework or a Ballard framework, we define a conceptual syntagmatic unit
as an ordered triple where elements from a controlled vocabulary are annotated
pairwise to have semantic relationships expressed in some aggregation of the
semantic primitives. The
generalFramework theory uses a framework such as the one in Figure 2 to define
a specific number of cells that are the types of meaning that can occur, either
alone or in combination with other types of meaning. In Figure 2 we have the 18 cells related
to Ballard’s framework.
In the
theory of categoricalAbstraction (cA) and eventChemistry (eC) we have fillers
as potential atoms of event compounds; slots serve to provide the binding of
atoms into the event compound, and the script (or framework) is in fact the
relationship between atoms.
The
generalFramework (gF) discrete data encoding has the form of an n-tuple:
< event, a(1), a(2), ..., a(n) >
The
n-tuple has n atoms and one relationship.
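One possible way to hold such an encoding in a program is sketched below. The class name GFRecord, its field names, and the sample event and atoms are hypothetical and chosen only for illustration; the text specifies no more than the form < event, a(1), ..., a(n) >.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GFRecord:
    """One gF encoding: a single relationship (the event) binding n atoms."""

    event: str
    atoms: Tuple[str, ...]

    @property
    def n(self) -> int:
        """Number of atoms bound by the event."""
        return len(self.atoms)

    def as_tuple(self) -> tuple:
        """Flatten to the written form < event, a(1), ..., a(n) >."""
        return (self.event, *self.atoms)


# Hypothetical event with three atoms.
record = GFRecord("meeting", ("alice", "bob", "room-12"))
flat = record.as_tuple()
```

Keeping the event separate from the atoms makes the "n atoms and one relationship" reading explicit, while as_tuple recovers the flat form used in the text.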

Figure 3: The process flow
model of human memory formation, storage and use (Prueitt, 1996)
Now
suppose that we have looked at a number of relationships to produce resources
as in Figure 3.
An enumeration of conceptual
syntagmatic units is achieved by human use of a computer interface that
presents instances of situations within a class of events. If { D_i } is a set of knowledge
domains, then the relationship between the domains and the instances of events
is determined, as best one can, by some process.
There are several strategies
available to us. We have some degree of
flexibility over how the formalism is established. Once the formalism is established, we engage in the Actionable
Intelligence Process Model discussed earlier.
So the tuning of the informational organization is left in the hands of
humans.
The most attractive
formalism is to treat each pair in the controlled vocabulary as an event, and
use the Ballard framework to annotate the set of all possible relationship
types between these two vocabulary elements.
Each of these instances is marked up by indicating the presence, or the negative presence
(blocker), of just those semantic primitives that appear to the
humans as being relevant. A blocker
is annotated as a negative 1 and a presence is annotated as a positive 1
placed into the semantic primitive’s cell.
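The annotation scheme just described can be sketched as a sparse record over the framework cells. The cell names cell_1 through cell_18 are placeholders for the 18 cells mentioned for Ballard's framework, since the text does not name them; the function names, the sample vocabulary pair, and the use of 0 for an unfilled cell are likewise assumptions.

```python
# Placeholder names for the 18 framework cells (hypothetical labels).
CELLS = [f"cell_{i}" for i in range(1, 19)]


def annotate(pair, present=(), blocked=()):
    """Annotate one vocabulary pair with +1 (presence) and -1 (blocker).

    Only the cells a human judged relevant are stored; all other cells
    are simply left unfilled.
    """
    unknown = (set(present) | set(blocked)) - set(CELLS)
    if unknown:
        raise ValueError(f"unknown cells: {sorted(unknown)}")
    if set(present) & set(blocked):
        raise ValueError("a cell cannot be both present and blocked")

    cells = {}
    for name in present:
        cells[name] = 1
    for name in blocked:
        cells[name] = -1
    return {"pair": tuple(pair), "cells": cells}


def to_vector(annotation):
    """Expand an annotation to one coordinate per cell (0 = unfilled)."""
    return [annotation["cells"].get(name, 0) for name in CELLS]


# Hypothetical pair from a controlled vocabulary.
annotation = annotate(("wheel", "car"), present=["cell_3"], blocked=["cell_7"])
vector = to_vector(annotation)
```

Storing only the filled cells reflects the point that a human marks just the primitives that seem relevant; the dense vector is produced on demand.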
Not all cells of the framework
have to be filled for each pair of elements from a controlled
vocabulary. A human need not manually
identify all pairs that might usefully be considered. Automated means exist.
For example, linguistic variation can be identified using LSI
correlation.
Let
O = { < c_1, r, c_2 > }
be the union of the sets of
conceptual syntagmatic units from { D_i }. One way to read the triple < c_1,
r, c_2 > is as a rule. The
rule is that c_1 is related to c_2 through the semantic primitive r. O is a derived minimal ontology for {
D_i }.
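A small sketch of the construct O as a union of triples follows. The type Triple, the functions derive_ontology and as_rule, and the sample concepts and relation names are hypothetical; in particular the relation names are not claimed to be primitives from Ballard or Sowa.

```python
from typing import Iterable, NamedTuple, Set


class Triple(NamedTuple):
    """A conceptual syntagmatic unit < c_1, r, c_2 >."""

    c1: str
    r: str
    c2: str


def derive_ontology(domains: Iterable[Set[Triple]]) -> Set[Triple]:
    """Return O, the union of the triple sets of all domains D_i."""
    ontology: Set[Triple] = set()
    for domain in domains:
        ontology |= set(domain)
    return ontology


def as_rule(triple: Triple) -> str:
    """Read a triple as the rule 'c_1 is related to c_2 through r'."""
    return f"{triple.c1} is related to {triple.c2} through {triple.r}"


# Two hypothetical domains that share one triple.
d1 = {Triple("wheel", "part-of", "car"), Triple("car", "kind-of", "vehicle")}
d2 = {Triple("car", "kind-of", "vehicle"), Triple("engine", "part-of", "car")}
ontology = derive_ontology([d1, d2])
```

Because O is a set, a triple that occurs in several domains is kept once, which is one simple sense in which the derived construct stays minimal.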
The construct O is
derived within an investigation of a phenomenon or the structure of
information within a computer-based information system. It is perhaps important to note that a
minimalism can occur where the construct O is left to the interpretation
of a human or human communities. Thus a
great deal of common sense detail does not need to be developed into the
construct. The trade-off is a high
degree of agility in how the elements of the construct are rendered and presented
to the human or human community.
If there is a well-defined
set of semantic rules Q for structures composed from subsets of O,
then there may exist an intermediate language sufficient for the description
and analysis of the conceptual contents derivable from C and from the
related partitions { D_i }. The object of investigation may be a natural phenomenon and the
constructs
( O, Q, C, { D_i } )
consequent to a process of
complex science. The intermediate
language would be derived from these constructs.
If both O and Q are
open, the intermediate language describes a "semiotic" system if
compartmental transformation rules can be specified and if a substructure for
computational inference rules exists.
Let K be the vector space
where each semantic primitive is assigned a distinct dimension.
As in the case of the
vector space S, this assignment imposes a unique location on the
set O. The set O then has
the unique location property. The
unique location property is realized in the Hilbert space as locations of
points. Scatter-gather on a circle can
be performed for each dimension.
The data’s organization can
at any time be projected into computer memory as localized bits of information,
relational database records or CCM constructs, in such a fashion as to preserve
the unique location property.
Now we can specify an associative memory between the two vector spaces,
one being a theme space and the other being a semantic space. We can
define a simple bi-directional associative neural network (using
backpropagation or another method) where the training set is composed of vector pairs
from K and S.
This associative neural network allows theme-based retrieval from the
Concept Space and knowledge association from the Theme Space.
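To make the idea of a bi-directional associative memory concrete, here is a minimal sketch that uses Hebbian outer-product learning in the style of Kosko's bidirectional associative memory rather than backpropagation. That choice, the bipolar (+1 / -1) coding of the vectors, the function names, and the toy training pairs are all assumptions made for illustration; the text leaves the training method open.

```python
def train(pairs):
    """Build the weight matrix W as the sum of outer products x^T y.

    pairs: list of (x, y), with x a bipolar theme vector (from S) and
    y a bipolar semantic vector (from K).
    """
    n, m = len(pairs[0][0]), len(pairs[0][1])
    weights = [[0] * m for _ in range(n)]
    for x, y in pairs:
        for i in range(n):
            for j in range(m):
                weights[i][j] += x[i] * y[j]
    return weights


def sign(values):
    """Threshold to bipolar values; a zero sum is mapped to +1 (assumption)."""
    return [1 if value >= 0 else -1 for value in values]


def recall_forward(weights, x):
    """Theme vector -> semantic vector: y = sign(x W)."""
    m = len(weights[0])
    return sign([sum(x[i] * weights[i][j] for i in range(len(x)))
                 for j in range(m)])


def recall_backward(weights, y):
    """Semantic vector -> theme vector: x = sign(W y)."""
    return sign([sum(weights[i][j] * y[j] for j in range(len(y)))
                 for i in range(len(weights))])


# Toy training set: two (theme, semantic) pairs with orthogonal theme vectors.
training_pairs = [
    ([1, -1, 1, -1], [1, -1, -1]),
    ([1, 1, -1, -1], [-1, 1, -1]),
]
W = train(training_pairs)
semantic = recall_forward(W, training_pairs[0][0])
theme = recall_backward(W, training_pairs[1][1])
```

The same weight matrix serves both directions, which corresponds to theme-based retrieval from the concept space in one direction and knowledge association from the theme space in the other. Exact recall in this toy case relies on the small number of stored pairs and on the theme vectors being orthogonal.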