Development Note

 

December 18, 2001

 

 

 

 

 

 

Paul S. Prueitt, PhD

Founder (2001), OntologyStream Inc.

 

 

 

 

 

 

 

 


 

Development note

 

 

 

 

OSI is actively searching for investment and clients.  

 

End-of-year resources are very short.  Meanwhile, the development of the software continues through discussions, primarily between Cameron Jones, Paul Prueitt and Don Mitchell.  Several side discussions are occurring, including discussions with Dr. Richard Ballard on ontology encoders.  The eventChemistry e-forum is open and is supporting additional discussions.

 

It is important to continue the basic development of the technology even while we are waiting for positive results from the search for additional economic support.  All that we can do is to develop the historical record and find a closure for those things that are keeping investment from occurring. 

 

The development of the market is a somewhat separate process.  Identifying clients requires that we have real presentations on what the software will do for the client.  At least temporarily, we are not looking for a client in the computer intrusion domain, in deference to our recent client.  It is hoped that, after the first of the year, management will wake up and restart the development and deployment process.

 

However, we must develop demonstrations for the other markets and figure out ways of making the case for our technology in each of the following markets.

 

1)      The analysis and event trending of trouble tickets from the network performance centers in one of the major telecommunications corporations. 

2)      The application of OSI eventChemistry to the examination of financial data.

3)      The examination of the activities of the Patent and Trademark Office in awarding patents. 

4)      The development of the BeadGame Communities software. 

 

More will be discussed on the marketing issue as this Development Note is written. 

 

 


Section 1: On the issue of resolution

 

Up to now, we have not talked a great deal about the pre-processor that develops “datawh.txt” files for thematic analysis.  Some theoretical discussion is made on December 17 in First Report on eventChemistry™.  However, this discussion has not been as detailed as we need it to be.  Thematic analysis will be taken up again soon.  But we address a different issue now.  This is the issue of compound resolutions.

 

Look at the data set in smallFables.zip. 

 

 

Figure 1:  A SLIP Framework on this small data set (panels a and b)

 

A reduced set of tokens is used to develop a small datawh.txt.  The category A1 has only 54 atoms.  The reduced set of tokens also reduces the linkage between fables and shows a separation of prime categories at the A1 level.

 

Figure 1b is the event scatter for the category B2.

 

The six atoms of category B2 can be fully resolved into a situated compound such as the one shown in Figure 2.

Figure 2: Situated Compound for category B2

 

This resolution is done by hand, as is the resolution of Figure 3a into Figure 3b.  In both cases, the resolution is unique except for the angles made between the links. 

 

A diffusion process leading to uniform link lengths and equal distances between atoms will tighten up the picture into a characteristic pattern.  But these processes sometimes conflict, and sometimes there are not sufficient constraints to impose a unique distribution.
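
One half of the diffusion idea above, the drift toward uniform link lengths, can be sketched as a simple relaxation.  This is only an illustration under stated assumptions: the function names, the two-dimensional positions, the target length and the step size are all hypothetical and are not taken from the SLIP or Event Browser code.  The sketch does not attempt the competing requirement of equal distances between unlinked atoms.

```python
import math


def relax_links(positions, links, target=1.0, step=0.1, iterations=200):
    """Nudge linked atoms so every link length drifts toward `target`.

    positions: dict of atom name -> (x, y); links: list of (name, name).
    All parameter names and defaults are illustrative assumptions.
    """
    pos = {name: list(xy) for name, xy in positions.items()}
    for _ in range(iterations):
        for a, b in links:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy) or 1e-9
            # Positive when the link is too long, negative when too short.
            correction = step * (dist - target) / dist
            # Move both endpoints only a little bit each iteration.
            pos[a][0] += correction * dx / 2
            pos[a][1] += correction * dy / 2
            pos[b][0] -= correction * dx / 2
            pos[b][1] -= correction * dy / 2
    return pos


def link_lengths(pos, links):
    """Current length of every link, in the order given."""
    return [math.dist(pos[a], pos[b]) for a, b in links]


start = {"a": (0.0, 0.0), "b": (3.0, 0.0), "c": (3.0, 4.0)}
links = [("a", "b"), ("b", "c")]
relaxed = relax_links(start, links)
lengths = link_lengths(relaxed, links)
```

In this small chain the two links do not compete, so both lengths settle near the target.  When links form cycles, or when a second rule is added, the constraints can conflict in the way the paragraph above describes.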

 

    

Figure 3: Event chemistry from IDS data (panels a and b)

 

These remarks are general in nature, but they reveal the technical issues involved in imposing a set of rules on the evolution of a resolution of a scattering of atoms.

 

What does one do algorithmically to cause something like the transition between Figure 3a and Figure 3b?  Given that such an algorithm exists, will this algorithm reasonably transform Figure 4 into a single resolution?

 

 

Figure 4:  A more complex bag of atoms from the fable collection.

 

Dr. Jones and Dr. Prueitt have engaged in a few discussions regarding general algorithmic processes related to fractals, fractal decomposition as related to image understanding, and the event chemistry needed to produce a single resolution to the dynamics from a bag of atoms.  Mitchell and Prueitt have engaged in similar discussions.

 

Mitchell is currently coding a solution to the problem.  This solution will produce a unique explicit resolution to dynamics that is implicate in a bag of atoms, and is thus an example of event chemistry.  Perhaps Mitchell’s event chemistry is the first example in a long line of examples.

 

Mitchell’s event chemistry is as follows.

 

EC.1:  A bag of atoms is scattered into a virtual white board called objectSpace.  The boundary of the objectSpace is defined by the size of the Event Browser’s display window. 

EC.1.1: The atoms are formally objects with location, valance of various kinds and other properties. 

EC.1.2: The scattering process mimics exactly the scatter of the atoms to the circle – except that objectSpace has a boundary, unlike the surface of a sphere.  The existence of a boundary turns out to be an important formal consideration.

EC.1.3: The scattering is random into this three-dimensional manifold.

EC.1.3.1: The objectSpace is rendered in 2 dimensions, but each atom has a location in three-space. 

EC.1.3.2: The atoms have one or more valances (defined by the types of non-specific links that were defined by the Conjectures). These are stored as properties of atom objects.

EC.1.4:  Each atom is initially considered to be a separate compound.  This is the cross scale phenomenon, where the atom is both a compound and an atom.

EC.1.4.1: Each compound starts out with one atom.  Actually a pointer to the atom is a property of the compound.  The compound also has a location that starts out as the location of the atom. 

EC.1.4.2: We will see that, as compounds are merged, the atoms are combined into one of the compound containers.

EC.1.4.2.1: The location of the atoms may undergo some rearrangement. 

EC.1.4.2.2:  The location of the compound is then the average of the locations of the contained atoms. 

EC.2:  Once the atoms are scattered then we start the aggregation process.  This process is seen as an escapement process, a term that has meaning in stratified complexity theory.

EC.2.1:  Escapement is a local resolution of the internal dynamics of an object.  The object’s internal dynamics is realized as valance.

EC.2.2:  The local resolution is done iteratively acting on compounds (not atoms). 

EC.2.2.1:  A type of valance is randomly selected. 

EC.2.2.1.1: Types of valance come from the b values in the Conjecture.

EC.2.2.1.2: Each atom will have one or more types of valance.

EC.2.2.1.3: Un-resolved atomic valance is subsumed to the compound, but originates from the location of the atom. 

EC.2.2.1.4: The types of valance for each atom are computed by the SLIP Warehouse Browser and stored into the file Links.txt.

EC.2.2.2:  All compounds having this valance type are brought closer together in objectSpace. 

EC.2.2.2.1: The movement caused here will cause other distances to grow or shrink.  So we use the same technique as the gather on the circle, and move each compound only a little bit each iteration.

EC.2.2.2.2: When compounds get close (as measured by a parameter) then the links resolve and the compounds are joined through the link that is being resolved.

EC.2.2.2.3:  A midpoint is used to move compounds together. 

EC.2.2.2.3.1: If a link type resolution occurs then all links of that type will resolve in the same step. 

EC.2.2.2.3.2: Given the resolution of a link type, the resolution may bring un-resolved links into the compound. 

EC.2.2.2.3.2.1:  Internal unresolved links are resolved by creating fibers so that single internal atoms may be multiply linked.

EC.2.2.2.3.2.2:  However, there is the possibility that the un-resolved link type has instances that are external to the atom. 

EC.2.2.2.3.2.2.1: To account for this, a temporary external valance is created for the compound as a whole and attached randomly to one of the involved internal atoms.  A temporary flag is set.

EC.2.2.2.3.2.2.2: The next time this link type is chosen, it is checked to see if there are any links of this type external to the compound.  If so then the compound assumes the external valance and is moved along with all other compounds having that link type.  If not, then the temporary valance is dissolved.

EC.2.2.3:  The process in EC.2.2.2 simplifies the internal re-configuration of the atoms in compounds that are merging.

EC.2.2.3.1:  In the best modeling of the event chemistry one would have to randomly scatter atoms into an empty object space each time there was a merge of compounds.  This would lead to a proper determination of the relationships internal to the new compound.

EC.2.2.3.1:  In our simplification, the internal atomic structure of a compound is static and we move the entire compound as a whole.

EC.2.2.3.2: The movement is relative to the anchor of a link type on internal atoms and the location of the internal atom.
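
The steps EC.1 through EC.2.2.3 can be sketched in a few dozen lines.  This is a minimal sketch under stated assumptions, not Mitchell's code: the class names `Atom` and `Compound`, the functions `scatter` and `resolve`, and the parameters `step`, `near` and `max_iter` are all hypothetical.  The sketch also simplifies further than the note does.  It picks the random link type only among types still shared by two or more compounds, it omits fibers and the temporary external valance of EC.2.2.2.3.2, and it does not enforce the objectSpace boundary after the initial scatter.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class Atom:
    name: str
    valances: set      # link types this atom can bond through (EC.1.3.2)
    location: tuple    # (x, y, z) in objectSpace (EC.1.3.1)


@dataclass
class Compound:
    atoms: list                                  # contained atoms (EC.1.4.1)
    bonds: list = field(default_factory=list)    # resolved (link type, atom names)

    @property
    def location(self):
        # EC.1.4.2.2: the compound sits at the average of its atoms.
        n = len(self.atoms)
        return tuple(sum(a.location[i] for a in self.atoms) / n for i in range(3))

    @property
    def valances(self):
        # EC.2.2.1.3: atomic valance is subsumed to the compound.
        return set().union(*(a.valances for a in self.atoms))

    def move_toward(self, point, fraction):
        # EC.2.2.3 simplification: the compound moves rigidly as a whole.
        shift = [fraction * (p - h) for p, h in zip(point, self.location)]
        for a in self.atoms:
            a.location = tuple(c + s for c, s in zip(a.location, shift))


def scatter(bag, size, rng):
    """EC.1: scatter atoms at random; each atom starts as its own compound."""
    return [
        Compound([Atom(name, set(kinds), tuple(rng.uniform(0, s) for s in size))])
        for name, kinds in bag.items()
    ]


def resolve(compounds, rng, step=0.2, near=5.0, max_iter=10_000):
    """EC.2: gather compounds by link type and merge them when close."""
    compounds = list(compounds)
    for _ in range(max_iter):
        kinds = sorted({k for c in compounds for k in c.valances})
        # Only link types shared by two or more compounds can still resolve.
        open_kinds = [k for k in kinds
                      if sum(k in c.valances for c in compounds) >= 2]
        if not open_kinds:
            break
        kind = rng.choice(open_kinds)                        # EC.2.2.1
        group = [c for c in compounds if kind in c.valances]
        spots = [c.location for c in group]
        midpoint = tuple(sum(axis) / len(spots) for axis in zip(*spots))
        for c in group:                                      # EC.2.2.2.1
            c.move_toward(midpoint, step)
        if all(math.dist(c.location, midpoint) < near for c in group):
            # EC.2.2.2.3.1: every link of this type resolves in one step.
            merged = Compound([a for c in group for a in c.atoms],
                              [b for c in group for b in c.bonds])
            merged.bonds.append(
                (kind, sorted(a.name for a in merged.atoms if kind in a.valances)))
            keep = [c for c in compounds if all(c is not g for g in group)]
            compounds = keep + [merged]
    return compounds


rng = random.Random(2001)
bag = {"a1": {"x"}, "a2": {"x", "y"}, "a3": {"y"}, "a4": {"z"}}
resolved = resolve(scatter(bag, size=(400.0, 300.0, 100.0), rng=rng), rng)
groups = sorted(sorted(a.name for a in c.atoms) for c in resolved)
```

In this toy bag, atoms a1, a2 and a3 end up in one compound through link types x and y, while a4 stays alone because nothing else in the bag shares its link type z.  That leftover valance corresponds to linkage that could only resolve against atoms outside the bag.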

 

Again, we should point out that there are similarities and differences between the Mitchell event chemistry and the scatter-gather to a circle (invented by Prueitt in 1995, evidently), which is implemented by the SLIP Technology Browser using algorithms described in SLIP Data Structures and Programs (November, 2001).

 

Section 2: On the issue of coding structure

 

Mitchell’s code will resolve any bag of (SLIP) atoms into a compound with specific link structure as reflected in the real link analysis developed by the SLIP Analytic Conjecture.  Of course, right now this makes the most sense if we have an event log with columns.

 

The columns need to be reified as tokens.  If the data were numerical, then perhaps a fuzzification technique would make the conversion.  If the columns are values from the nodes of a finite state machine, then the class of natural kind is equal to the types of nodes in a small finite state machine.  The I-Ching provides an example of a finite state machine with 64 nodes.  KOS is designed to control the transitions of small finite state machines that are linked to machine ontology, such as what Dr. Ballard has available in his Mark 3 ontology encoder.

 

If the columns are text elements then we bin the text using a parse program and use the unique values as the nodes of a finite state machine.
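
The reification of columns described above can be sketched as follows.  This is an illustrative assumption about how the conversion might look, not the pre-processor itself: the function names `fuzzify` and `reify`, the three fuzzy labels, the triangular membership functions and the sample rows are all hypothetical.

```python
def fuzzify(value, low, high):
    """Return the fuzzy label whose triangular membership is largest.

    Assumes low < high; values outside the range are clamped to it.
    """
    value = min(max(value, low), high)
    half = (high - low) / 2
    centers = {"low": low, "medium": low + half, "high": high}
    membership = {label: max(0.0, 1 - abs(value - center) / half)
                  for label, center in centers.items()}
    return max(membership, key=membership.get)


def reify(rows, numeric_ranges):
    """Turn each event-log row into a list of 'column:token' strings."""
    reified = []
    for row in rows:
        tokens = []
        for column, value in row.items():
            if column in numeric_ranges:
                label = fuzzify(value, *numeric_ranges[column])
            else:
                label = str(value).strip().lower()   # bin text by unique value
            tokens.append(f"{column}:{label}")
        reified.append(tokens)
    return reified


rows = [{"port": 80, "proto": "TCP "}, {"port": 60000, "proto": "udp"}]
tokens = reify(rows, {"port": (0, 65535)})
# The unique tokens play the role of the nodes of a small finite state machine.
nodes = sorted({t for row in tokens for t in row})
```

Keeping the column name inside each token is a design choice of the sketch: it prevents a value in one column from being confused with the same value in another.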

 

There are some topological issues related to nearness being interpreted as closeness in meaning.  Of course, this is the major problem we have in general with almost any technique in image understanding, text understanding or event log understanding.  Part of the role of human visual acuity and decision-making is to increase the constraints on an under-constrained aggregation (stochastic) process.  This human touch encodes relationships of interest.

 

The use of human visual acuity is suggested first (at least in my work) in my report to the Army Research Lab on Alex Zenkin’s cognitive visualization.  Dr. Prueitt has also shared with Dr. Jones the work on the use of fractal decomposition and fractal error.

 

Any new coding structure should be able to expect a standard object model. 

 

The Event Browser will understand the object model, and the data for a specific bag of atoms will be stored in a text file.  The Object Model is then used conceptually as well as programmatically to transform the bag of atoms into a compound, just as a chemical factory in a plant composes atoms into chemical compounds.

 

The object model has the following elements:

 

1)      The objectSpace, which is now 3-D and infinite in span.  However, the size of the Event Browser window creates a boundary in which the bag of atoms is scattered and within which all transformations are constrained.

2)      Atoms and links are objects or constructs.  All atoms and all links for a bag are computed when the Event Browser is pointed at a node in the SLIP Framework.  This information is stored in an ASCII file and the data is placed into process memory.  The ASCII file is made available immediately, to any other computer process, after the data is placed into memory.

3)      Immediately after scattering the bag of atoms (from one node in the SLIP Framework) into objectSpace, each individual atom will be subsumed as a compound object with one atom.  The location of the compound and the compound valances are acquired from the atoms contained in the compound object.

4)      The dissipative/escapement algorithms are run iteratively to produce a process compartment having structure and stability.

5)      Compounds are merged until the bag of atoms is fully resolved.  As this process develops, there are fewer and fewer compounds.  The aggregation process stops when the atom valances are fully resolved – just like in physical chemistry.

6)      Also just like in physical chemistry, there are sometimes meta-stable states that move the compound dynamics back and forth between two or more “situated resolutions”. 

7)      Because the bag may have atoms whose resolution is possible only with atoms outside the bag, the fully resolved compound is relative to the bag, and may have valance (linkage) to compounds associated with other nodes in the SLIP Framework.

 

At any time the entire state produced by the aggregation process is recorded in the various objects in memory.  This information can be written to a text file.

 

Thus Mathematica or any other process can take up the transforms of the bag of atoms at any point during the formative process.  The interface that allows one to move the bag over to a Mathematica visualization environment is straightforward.  This is the perfect way to develop different types of chemistry.  Once a chemistry is developed, the transformation can be encoded as a Referential Information Base (lines and tensors, as in the scatter-gather to the circle).  This provides very fast processing.
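
The hand-off of state through a text file can be sketched as a round trip.  The tab-delimited layout used here (compound id, atom name, three coordinates, comma-separated valance types) is purely an assumption made for illustration, and so are the function names.  The note does not specify the file format the Event Browser uses.

```python
def dump_state(compounds):
    """Write one tab-delimited line per atom.

    compounds: list of compounds, each a list of
    (name, (x, y, z), set_of_valance_types) tuples.
    """
    lines = []
    for cid, atoms in enumerate(compounds):
        for name, (x, y, z), kinds in atoms:
            fields = [str(cid), name, repr(x), repr(y), repr(z),
                      ",".join(sorted(kinds))]
            lines.append("\t".join(fields) + "\n")
    return "".join(lines)


def load_state(text):
    """Rebuild the list of compounds from the text produced by dump_state."""
    compounds = {}
    for line in text.splitlines():
        if not line:
            continue
        cid, name, x, y, z, kinds = line.split("\t")
        atom = (name, (float(x), float(y), float(z)),
                set(kinds.split(",")) if kinds else set())
        compounds.setdefault(int(cid), []).append(atom)
    return [compounds[cid] for cid in sorted(compounds)]


state = [
    [("a1", (1.5, 2.0, 0.25), {"x"}), ("a2", (1.0, 2.5, 0.5), {"x", "y"})],
    [("a4", (9.0, 3.0, 7.0), set())],
]
text = dump_state(state)
restored = load_state(text)
```

Any external process that can read tab-delimited text could pick up the state from such a file at any point in the aggregation.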

 

 


Section 3: Transient processes

 

Once a bag of event atoms is resolved, it is possible to see how new structures are placed into a common objectSpace.  We should remember that invariance is what is being detected.  We can review how invariance is detected with the aid of Figure 5.

 

A pattern is defined in the link analysis of the Conjecture, and then refined and identified as an abstraction (such as a “port scan”) that is occurring in more than one way and at various times.  The scatter to the circle produces a category associated with these abstractions.  The event chemistry then develops a second layer of abstraction. 

 

Conceivably this will produce a small number of event categories at a third order of a stratified taxonomy, as discussed in a presentation on Sensemaking.

Figure 5: The SenseMaking environment for SLIP

 

Figure 5 can be regarded as an IDEF type diagram with the following inputs or outputs (starting in the upper right corner):

 

1)      Event Log:  The data source has to be in the form of an event log.  The event can be a text mining process or a data mining process that is discovering and placing into the record event profiles.  Examples so far are (a) Intrusion Detection Audit logs, (b) an FTP log file dump, and (c) a crude thematic parser as applied to the BCNGroup Fable collection.  Human work is required to identify proper event logs.

2)      Human Judgment:  The development of the Analytic Conjecture is aided by the SLIP Warehouse Browser (see exercises).

3)      Human Control of Categorization:  Human control can be exercised in either one of the two levels of abstraction (see section 5.2 of the First Report on eventChemistry).

4)      Human Intervention: Actions are taken due to information delivered to the Invariance Detection Cycle.

5)      Human Control over relevance metrics: Reinforcement learning, such as found in artificial neural network classifiers, can be used in conjunction with manual changes in the way the overall system is working.

 

This type of SenseMaking system is being considered for implementation as an incident management system.