Regarding a description of the use of hash tables

 

Modified:  September 14, 2003

 

<business confidential>

 

 

OntologyStream Inc is studying how BerkeleyDB handles hash tables and B-trees. This work will be extended over the coming months.

 

CCM (type:value) pairs are mapped into a hash table. The type and the value together form an element. The element is hashed and then mapped to an object, which contains a data structure that holds a branch or a simple tree.
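The mapping just described can be sketched as follows. This is only an illustration under stated assumptions: a Python dict stands in for the BerkeleyDB hash, a branch is taken to be a list of node labels, and the names Element, BranchHolder, and store_branch are hypothetical and belong to neither CCM nor BerkeleyDB.

```python
from dataclasses import dataclass, field
from typing import NamedTuple


class Element(NamedTuple):
    """A CCM (type:value) pair; hashable because it is a tuple."""
    type: str
    value: str


@dataclass
class BranchHolder:
    """Hypothetical object an element maps to; holds branches as label lists."""
    branches: list = field(default_factory=list)


# A plain dict used as a stand-in for the BerkeleyDB hash table (assumption).
table = {}


def store_branch(element, branch):
    """Look up (or create) the object for this element and attach the branch."""
    holder = table.setdefault(element, BranchHolder())
    holder.branches.append(branch)
    return holder


store_branch(Element("noun", "table"), ["the", "hash", "table"])
store_branch(Element("noun", "table"), ["a", "wooden", "table"])
store_branch(Element("verb", "table"), ["they", "table", "it"])
print(len(table[Element("noun", "table")].branches))
```

Because the key is the whole (type:value) pair, the noun "table" and the verb "table" land in different objects in this sketch.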

 

The elements are the ending nodes of the branches in the Input Bag of Branches, I. In this instance, each node's label is a (type:value) pair, where the value is the specific word.

 

The element is then mapped to a class of objects (eventually in OWL interoperable format) external to the hash. 

 

A set of composed objects is constructed by the cross-scale transform, as noted in:

 

http://www.ontologystream.com/CCM/CCMnotation.htm#_Section_4.2:_

 

The type “word” of the ending node is not needed in the simple inversion, as discussed previously. 

 

In the first implementation of differential inversions we have two categories of types, a class of nouns

 

(enumerated as { n(i) }),

 

and a class of verbs

 

(enumerated as { v(i) }).

 

The elements of these two categories can be descriptively enumerated:

 

http://www.bcngroup.org/area3/pprueitt/private/KM_files/frame.htm

 

In our current disclosure, the "recognition of the type" comes from a text analysis of the local neighborhood of the center word of the 5-gram, and only when the center word is a noun or verb, as determined by fableParse, a program developed by Amnon.
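A rough sketch of the 5-gram step, under several assumptions: fableParse is not available here, so a toy lexicon (TOY_LEXICON, hypothetical) decides whether the center word is a noun or a verb, and the recognized type is simply that part of speech. The functions five_grams and typed_centers are illustrative names only. The four neighboring words are returned alongside each element so that a later analysis of the local neighborhood could refine the type.

```python
# Hypothetical stand-in for the noun/verb decision made by fableParse.
TOY_LEXICON = {"dog": "noun", "ball": "noun", "chases": "verb", "runs": "verb"}


def five_grams(tokens):
    """Yield every window of five consecutive tokens."""
    for i in range(len(tokens) - 4):
        yield tokens[i:i + 5]


def typed_centers(tokens, lexicon=TOY_LEXICON):
    """Return ((type, value), neighborhood) for 5-grams centered on a noun or verb."""
    results = []
    for gram in five_grams(tokens):
        center = gram[2]
        word_type = lexicon.get(center)
        if word_type in ("noun", "verb"):
            neighborhood = gram[:2] + gram[3:]  # the four words around the center
            results.append(((word_type, center), neighborhood))
    return results


tokens = "the big dog chases a red ball".split()
for element, neighborhood in typed_centers(tokens):
    print(element, neighborhood)
```

In this toy run the 5-gram centered on "a" is skipped, since its center is neither a noun nor a verb.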

 

A general linguistic theory of (v,n,v) and (n,v,n) topological covers, using a local linguistic neighborhood (lln) in semantic space, is being developed.  There exists a research literature on this topic.

 

The type is an indicator of context, and thus a first-order logic can be constructed that produces (to be disclosed) ATS ambiguation/disambiguation operators on the categories of subject indicators, enumerated in correspondence with the ordering of the branches in I by their ending nodes.

 

The placement of the center word into a BerkeleyDB hash produces a retrieval mechanism that delivers all occurrences of the same word in a single step. The type is then a "memory of" the local linguistic variation (llv), and can be mapped to a referential system that has encoded a more complex graph structure corresponding to the llns.
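The single-step retrieval can be illustrated with a small sketch. It assumes a Python dict of lists in place of the BerkeleyDB hash, keys the table on the center word alone, and stores the type next to each occurrence; build_center_index and the sample positions are hypothetical.

```python
from collections import defaultdict


def build_center_index(typed_occurrences):
    """Map each center word to every (position, type) at which it occurs."""
    index = defaultdict(list)
    for position, (word_type, word) in typed_occurrences:
        index[word].append((position, word_type))
    return index


# Made-up occurrences: (token position, (type, value)).
occurrences = [
    (2, ("noun", "bank")),
    (9, ("verb", "bank")),
    (15, ("noun", "river")),
    (21, ("noun", "bank")),
]

index = build_center_index(occurrences)
print(index["bank"])  # one lookup returns every occurrence of "bank"
```

Each stored type travels with its occurrence, so a single lookup on the word returns all of its occurrences together with the context indicator recorded for each.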

 

How the stochastic referential system is found is a separate issue, but for now we are able to produce stochastic models of llv using the current Otis algorithm.

 

A possible future system could involve differential ontology and open polylogics. 

 

Any stochastic referential system is domain specific, and the construction of this system depends on parameters that can be exposed to the end user so that the end user can develop different viewpoints over the same domain. Because the stochastic referential system can be modified in real time, we refer to this as a formative schema. 

 

Formative schemas need to be interoperable with OWL ontologies.

 

The measure of llv is captured as a set of llns. The notion of a cover over specific linguistic variation within a specific domain is an issue that can be addressed formally using what is called topological logic (Victor Finn, Robert Burch).

 

A specific set of llns is then used to define the compounded element of logics over formative schema, the atoms of which are CCM constructions in the format

 

{ (type:value) }.

 

The CCM construction is applied only to the center of the n-gram (or generalized n-gram construction), and then multiple branches are fractured away from this n-gram to produce the bag of branches.
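The text does not specify how branches are fractured away from the n-gram, so the following is only one possible reading, offered as a sketch. It assumes an odd-length n-gram, treats each side of the center as one branch, and ends every branch at the typed center node. The function fracture and this splitting rule are assumptions, not the documented CCM procedure.

```python
def fracture(gram, center_type):
    """Split an odd-length n-gram into two branches that end at the typed center."""
    mid = len(gram) // 2
    center = (center_type, gram[mid])  # CCM (type:value) applied to the center only
    left = list(gram[:mid]) + [center]
    right = list(reversed(gram[mid + 1:])) + [center]
    return [left, right]


# The bag of branches collects the branches from every n-gram.
bag_of_branches = []
bag_of_branches.extend(fracture(["the", "big", "dog", "chases", "a"], "noun"))
bag_of_branches.extend(fracture(["big", "dog", "chases", "a", "red"], "verb"))

for branch in bag_of_branches:
    print(branch)
```

Under this reading the ending node of every branch is a (type:value) element, which matches the earlier statement that the elements are the ending nodes of the branches in I.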