OntologyStream Inc.
Copyright: 2001
Technical Paper on: Process Model for In-memory Databases
In-memory databases have certain nice
properties. However, the in-memory mapping
process relies on the memory-mapped location as an actual property of
the information.
The problem of adjusting
data positions in memory has been with the standard relational database since
its beginning in the 1970s. The “best solution”
to this problem was to
1) define normal forms for the relational database,
2) define a file system for breaking the data into tables, and
3) create a data manipulation language for record updates.
This “best solution” is not
good enough. Data mapped in-memory cannot
easily be broken up into smaller units and moved around. In fact, in-memory database
systems use the mapped data as a single mathematical object. The object cannot be altered in any way, or
else the system breaks.
Imagine moving numbers
around arbitrarily so that the number “5” is sometimes in front of the number
“4” and sometimes after the number “4”.
The meanings of the tokens “4” and “5” are arbitrary. But once
established, the arithmetic has a specific meaning as in “5 is the integer that
follows 4”. This ordering cannot be
destroyed without affecting the arithmetic.
Data needs to be
updated. The update is required to
reflect changes in the information available at a specific point in time. In fact, the structure of data also needs to
be updated periodically to reflect a deeper alteration in the nature of complex
systems such as business-to-business relationships and even the conduct of
military or political matters.
However, the absence of a
delete function is NOT a flaw for a strictly non-transactional technology. If the database is non-transactional, then the
view of the data can be very fast.
Standard analytic processes (of any kind) can be run at great speed. These processes include scatter-gather
methods (such as Latent Semantic Indexing), neural network cluster algorithms,
evolutionary programming methodologies, emergent computing methodology, and any
of the new data mining algorithms.
Scatter-gather methods produce a
structural model. Derived (or mined)
information can be placed in the structural model. For this mining process to work well, a process model
is needed. The process model
provides a basis for data updates within a specific structural model. The process model also must
allow updates to the structural model itself.
The trivial solution to the in-memory database
transactional problem will be one that minimizes the cost of the process
model, measured in processing cycles and allocated memory space.
Section 2: A Process Model and New Categorical Specification
Data object transformation
requires a process model and a categorical specification of the abstract type
for objects. The standard relational
architecture is one way to specify a process model and abstract type. Taken together, Extensible Markup
Language (XML) and in-memory databases
suggest a different specification.
There exists a framework to
solve in-memory database transactional problems. In general and abstract terms, this solution
creates a transactional database with an analytic engine as an
"object" (the object here is a category of simple mathematical
constructs). A data manipulation
language manages the object. A separate formative process manages
the specification for the category of objects.
The architecture is a
standard three-tier model {presentation, logic, data} for database
applications. This architecture is a
natural consequence of the update problem.
The presentation layer allows users and developers to define behavioral
features for the software. The logic
layer allows developers to place transactional engines that perform essential
tasks that support the required behavioral features. The data layer is where data is placed for use by the logic
layer.
Our framework leads to a process
model whereby the analytic database is dissolved and reconstituted on
a "commit" command. All data
updates are handled in a temporary way, until the "commit" and then a
new object is created.
This new object is again purely analytic in nature, and may be treated
as such. The commit command simply
allows one to control WHEN this "costly" update computation will
occur.
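The commit behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name AnalyticStore and its methods (get, put, delete, commit) are hypothetical and do not come from any existing product, and a plain key-value mapping stands in for the single mathematical object. Updates are held in a temporary buffer, and the read-only object is rebuilt only when commit is called.

```python
from types import MappingProxyType


class AnalyticStore:
    """Hypothetical sketch: a read-only snapshot plus a buffer of pending updates."""

    def __init__(self, records):
        # The snapshot stands in for the "single mathematical object";
        # it cannot be altered once built.
        self._snapshot = MappingProxyType(dict(records))
        self._pending = {}     # key -> new value, held until commit
        self._deleted = set()  # keys marked for removal at commit

    def get(self, key):
        # Analytic reads see only the last committed snapshot.
        return self._snapshot.get(key)

    def put(self, key, value):
        # The update is held temporarily; the snapshot is untouched.
        self._deleted.discard(key)
        self._pending[key] = value

    def delete(self, key):
        self._pending.pop(key, None)
        self._deleted.add(key)

    def commit(self):
        # The costly step: dissolve the old object and reconstitute a new one.
        rebuilt = {k: v for k, v in self._snapshot.items()
                   if k not in self._deleted}
        rebuilt.update(self._pending)
        self._snapshot = MappingProxyType(rebuilt)
        self._pending.clear()
        self._deleted.clear()
        return self._snapshot


store = AnalyticStore({"a": 1, "b": 2})
store.put("c", 3)
store.delete("a")
before = store.get("c")  # pending update is not visible yet
store.commit()
after = store.get("c")   # visible in the newly built object
```

The design choice in this sketch is that reads never consult the pending buffer, so analytic processes always run against a fixed object, and the caller decides when to pay the rebuild cost.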
Our process model
provides for data updates within a structural model and periodic updates
to the structural model. Such a solution is ideal for products that require a
fast interaction between human perception and data structures.
A relationship exists
between a data structure and some object in the world. In the Topic Maps conceptual model,
this relationship is indicated by delineating subjects into machine-addressable
subjects (such as text in an ASCII file) and subjects that are not addressable
by the computer. The addressable
relationships, and only these addressable relationships, may be used to produce
the single mathematical object.
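The distinction between addressable and non-addressable subjects can be sketched as a simple filter. The names Subject and build_object, and the file path used below, are hypothetical and serve only to illustrate the idea; they are not taken from the Topic Maps standard. Only subjects that carry a machine address are admitted into the frozen object.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Subject:
    name: str
    # A machine address such as a file path; None means the subject
    # cannot be addressed by the computer.
    address: Optional[str] = None


def build_object(subjects):
    """Keep only machine-addressable subjects and freeze the result."""
    return frozenset(s for s in subjects if s.address is not None)


subjects = [
    Subject("text of a fable", address="fables/fox.txt"),  # hypothetical path
    Subject("the fox itself"),                             # not addressable
]
obj = build_object(subjects)
```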
Data updates require a
transactional database equipped with a data manipulation language. However, periodic updates to structural
models require something new. What is
required is a formative process involving human perceptual
acuity. The consequence of this update
can be instantiated as a single mathematical object. This object, once instantiated, can be placed in memory
and serve analytic purposes.
In January and February of
2001, Prueitt developed a notational system for “Semantic Situation
Algebra” and provided a description of how to index a large collection of
Intellectual Property text.
The notational system
involves the use of what logicians call a syntagmatic unit having the form
<a, r, b>, where a and b are data invariance and r is a relational
operator. In theory, any in-memory
database can be completely described using this notational system. Prueitt has published work on this notational
system since 1995. In late 2000, he
created a small test collection using fables to simplify an anticipated
development process for a technology client.
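To make the <a, r, b> form concrete, the following sketch flattens a small table into syntagmatic units and queries them. It is an assumption-laden illustration, not Prueitt's notation or implementation: the names Unit, describe, and related are hypothetical, and the two fable records are sample data invented here, not the test collection mentioned above.

```python
from collections import namedtuple

# Hypothetical representation of a syntagmatic unit <a, r, b>.
Unit = namedtuple("Unit", ["a", "r", "b"])


def describe(table_name, rows):
    """Flatten rows (dicts with an 'id' key) into a frozen set of units."""
    units = set()
    for row in rows:
        subject = f"{table_name}:{row['id']}"
        for column, value in row.items():
            if column != "id":
                # The column name plays the role of the relational operator r.
                units.add(Unit(subject, column, value))
    return frozenset(units)


def related(units, a=None, r=None, b=None):
    """Return the units matching the given components; None is a wildcard."""
    return {
        u for u in units
        if (a is None or u.a == a)
        and (r is None or u.r == r)
        and (b is None or u.b == b)
    }


# Illustrative sample data only.
fables = [
    {"id": 1, "title": "The Fox and the Grapes", "character": "fox"},
    {"id": 2, "title": "The Tortoise and the Hare", "character": "hare"},
]
units = describe("fable", fables)
hare_units = related(units, r="character", b="hare")
```

Because the result of describe is a frozenset, the collection of units behaves as a fixed object that transforms can be run against without altering it.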
An in-memory database can
be used as a static construct category on which to run a class of transforms
(defined over the construct category).
Prueitt has called this class of constructs a Semantic Situation
Algebra, or SSA.
In April 2001, he began to use the term structural holonomy
to indicate the general construct.
Conceptual work has been completed on how the construct might serve as
an analytic engine and transactional database.
This work is done in conjunction with scientists at the U. S. Einstein
Institute.
It is clear that any
reduction-to-practice of generalized structural holonomy will
take years. A technical review of the
prior art has produced a rich linkage to fundamental problems in scholarly
literatures, to existing patents and to patent applications. An ability to show prior art is an issue to
consider.
Structural holonomy can be created using any of
several basic mathematical formalisms, such as a Fourier transform or a
wavelet. The concept is simple and
straightforward. In my mind, anyway,
the notion is grounded in the neuropsychological literature of Karl Pribram and
the ecological literature of Robert Shaw and J. J. Gibson. Pribram’s 1991 book is called “Brain and
Perception: Holonomy and Structure in Figural Processing”.
The Topic Map conceptual
model has features similar to those discussed above. Topics within scope must remain within a
whole construct. The "add
topic" function requires the entire Topic Map construct to be
re-instantiated. The Topic Map is a structural
holonomy, but is not generally considered an in-memory data
structure.
A similar problem, and a
similar solution, is present with the in-memory database systems
being developed by NCorp. NCorp uses a structural
holonomy, as do Excalibur and several others. Discussions continue with companies in this
regard.
Dr. Paul Prueitt
Consulting Scientist