December 18, 2001
Paul S. Prueitt, PhD
Founder (2001), OntologyStream Inc.
OSI is actively searching for investment and clients.
End of the year resources are very short. Meanwhile, the development of the software continues through discussions primarily between Cameron Jones, Paul Prueitt and Don Mitchell. Several side discussions are occurring, including discussions with Dr. Richard Ballard on ontology encoders. The eventChemistry e-forum is open and is supporting additional discussions.
It is important to continue the basic development of the technology even while we are waiting for positive results from the search for additional economic support. All that we can do is to develop the historical record and find a closure for those things that are keeping investment from occurring.
The development of the market is a somewhat separate process. Identifying clients requires that we have real presentations on what the software will do for the client. At least temporarily, we are not looking for a client in the computer intrusion domain, in deference to our recent client. It is hoped that, after the first of the year, management will wake up and restart the development and deployment process.
However, we must develop demonstrations for the other markets and figure out ways of making the case for our technology in each of the following markets.
More will be discussed on the marketing issue as this Development Note is written.
Section 1: On the issue of resolution
Up to now, we have not talked a great deal about the pre-processor that develops “datawh.txt” files for thematic analysis. Some theoretical discussion was given on December 17 in the First Report on eventChemistry™. However, that discussion is not as detailed as we need it to be. Thematic analysis will be taken up again soon, but we address a different issue now: the issue of compound resolutions.
Look at the data set in smallFables.zip.
A reduced set of tokens is used to develop a small datawh.txt. The category A1 has only 54 atoms. The reduced set of tokens also reduces the linkage between fables and shows a separation of prime categories at the A1 level.
Figure 1b is the event scatter for the category B2.
The six atoms of category B2 can be fully resolved into a situated compound such as the one shown in Figure 2:
Figure 2: Situated Compound for category B2
This resolution is done by hand, as is the resolution of Figure 3a into Figure 3b. In both cases, the resolution is unique except for the angles made between the links.
A diffusion process leading to uniform link lengths and equal distances between atoms will tighten up the picture into a characteristic pattern. But these two processes sometimes conflict, and sometimes there are not sufficient constraints to impose a unique distribution.
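One way to picture such a diffusion process is a simple spring-and-repulsion relaxation. The sketch below is only an illustration under stated assumptions, not the algorithm used in the Event Browser: it works in 2-D, and the function names (relax, link_lengths) and the parameters (target, steps, rate, repulsion) are all hypothetical. Each link acts as a spring that prefers a common target length, and every pair of atoms repels slightly so that spacing evens out. Because the two terms pull against each other, link lengths settle slightly above the target, which is a small instance of the conflict noted above.

```python
import math

def relax(positions, links, target=1.0, steps=500, rate=0.05, repulsion=0.1):
    """Move atoms so that link lengths even out (hypothetical sketch).

    positions: dict mapping atom name -> (x, y)
    links: list of (atom, atom) pairs
    """
    pos = {atom: list(xy) for atom, xy in positions.items()}
    atoms = sorted(pos)
    for _ in range(steps):
        move = {atom: [0.0, 0.0] for atom in atoms}
        # Spring term: pull or push each linked pair toward the target length.
        for a, b in links:
            dx, dy = pos[b][0] - pos[a][0], pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = (dist - target) / dist
            move[a][0] += pull * dx
            move[a][1] += pull * dy
            move[b][0] -= pull * dx
            move[b][1] -= pull * dy
        # Repulsion term: push every pair of atoms apart to even out spacing.
        for i, a in enumerate(atoms):
            for b in atoms[i + 1:]:
                dx, dy = pos[b][0] - pos[a][0], pos[b][1] - pos[a][1]
                d2 = dx * dx + dy * dy or 1e-9
                move[a][0] -= repulsion * dx / d2
                move[a][1] -= repulsion * dy / d2
                move[b][0] += repulsion * dx / d2
                move[b][1] += repulsion * dy / d2
        # Take a small step in the direction of the combined terms.
        for atom in atoms:
            pos[atom][0] += rate * move[atom][0]
            pos[atom][1] += rate * move[atom][1]
    return {atom: tuple(xy) for atom, xy in pos.items()}

def link_lengths(positions, links):
    """Return the current length of every link."""
    return [math.dist(positions[a], positions[b]) for a, b in links]

# Three atoms in a chain, scattered with very unequal link lengths.
start = {"a": (0.0, 0.0), "b": (0.2, 0.1), "c": (3.0, 0.0)}
links = [("a", "b"), ("b", "c")]
print(link_lengths(start, links))                # unequal before relaxation
print(link_lengths(relax(start, links), links))  # nearly equal afterwards
```

The angles between links are left to whatever the repulsion term produces, which matches the remark that a hand resolution is unique only up to those angles.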
Figure 3: Event chemistry from IDS data
These remarks are general in nature, but they reveal the technical issues involved in imposing a set of rules on the evolution of a resolution of a scattering of atoms.
What does one do algorithmically to cause something like the transition between Figure 3a and Figure 3b? Given that such an algorithm exists, will it reasonably transform Figure 4 into a single resolution?
Figure 4: A more complex bag of atoms from the fable collection.
Mitchell’s code will resolve any bag of (SLIP) atoms into a compound with specific link structure as reflected in the real link analysis developed by the SLIP Analytic Conjecture. Of course, this makes the most sense right now if we have an event log with columns.
The columns need to be reified as tokens, so if the data were numerical then perhaps a fuzzification technique would make the conversion. If the columns are values from the nodes of a finite state machine, then the class of natural kind is equal to the types of nodes in a small finite state machine. The I-Ching provides an example of a finite state machine with 64 nodes. KOS is designed to control the transitions of small finite state machines that are linked to a machine ontology, such as what Dr. Ballard has available in his Mark 3 ontology encoder.
If the columns are text elements then we bin the text using a parse program and use the unique values as the nodes of a finite state machine.
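The reification of columns into tokens can be sketched as follows. This is a minimal illustration under assumptions, not the pre-processor itself: the function names (bin_number, text_tokens), the three labels, and the value range are hypothetical, and crisp equal-width binning stands in here for a true fuzzification technique, which would assign graded memberships instead.

```python
import re

def bin_number(value, low, high, labels=("low", "medium", "high")):
    """Map a number onto a coarse label using equal-width bins.

    Assumption: crisp binning is used as a simple stand-in for fuzzification.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    fraction = (value - low) / (high - low)
    index = int(fraction * len(labels))
    # Clamp so that out-of-range values fall into the first or last bin.
    index = max(0, min(index, len(labels) - 1))
    return labels[index]

def text_tokens(text):
    """Parse text into lower-case words and return the unique values, sorted."""
    return sorted(set(re.findall(r"[a-z]+", text.lower())))

# A numerical column value becomes a token.
print(bin_number(50, 0, 100))
# A text column is binned into unique tokens, which could then serve as
# the nodes of a small finite state machine.
print(text_tokens("The fox and the crow"))
```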
There are some topological issues related to nearness being interpreted as closeness in meaning. Of course, this is the major problem we have in general with almost any technique in image understanding, text understanding, or event log understanding. Part of the role of human visual acuity and decision-making is to increase the constraints on an underconstrained (stochastic) aggregation process. This human touch encodes relationships of interest.
The use of human visual acuity is suggested first (at least in my work) in my report to the Army Research Lab on Alex Zenkin’s cognitive visualization. Dr. Prueitt has also shared with Dr. Jones the work on the use of fractal decomposition and fractal error.
Any new coding structure should be able to expect a standard object model.
The Event Browser will understand the object model, and the data for a specific bag of atoms will be stored in a text file. The Object Model is then used conceptually as well as programmatically to transform the bag of atoms into a compound, just as a chemical plant composes atoms into chemical compounds.
The object model has the following elements:
1) The objectSpace, which is now 3-D and infinite in span. However, the size of the Event Browser window creates a boundary in which the bag of atoms is scattered and within which all transformations are constrained.
2) Atoms and links are objects or constructs. All atoms and all links for a bag are computed when the Event Browser is pointed at a node in the SLIP Framework. This information is stored in an ASCII file and the data is placed into process memory. The ASCII file is made available immediately, to any other computer process, after the data is placed into memory.
3) Immediately after scattering the bag of atoms (from one node in the SLIP Framework) into objectSpace, each individual atom will be subsumed as a compound object with one atom. The location of the compound and the compound valences are acquired from the atoms contained in the compound object.
4) The dissipative/escapement algorithms are run iteratively to produce a process compartment having structure and stability.
5) Compounds are merged until the bag of atoms is fully resolved. As this process develops, there are fewer and fewer compounds. The aggregation process stops when the atom valences are fully resolved, just as in physical chemistry.
6) Also just like in physical chemistry, there are sometimes meta-stable states that move the compound dynamics back and forth between two or more “situated resolutions”.
7) Because the bag may have atoms whose resolution is only possible to atoms outside the bag, the fully resolved compound is relative to the bag, and may have valence (linkage) to compounds associated with other nodes in the SLIP Framework.
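Items 3, 5 and 7 of the object model can be sketched in a few lines. This is a hedged illustration, not Mitchell’s code: the function name resolve is hypothetical, merging is treated simply as joining the compounds at both ends of a link (that is, finding connected components), and the dissipative/escapement step of item 4 and any positions in objectSpace are left out.

```python
def resolve(bag, links):
    """Merge single-atom compounds along links (hypothetical sketch).

    bag: iterable of atom names belonging to one node's bag
    links: iterable of (atom, atom) pairs
    Returns (compounds, open_valences), where open_valences are the links
    that reach atoms outside the bag.
    """
    # Item 3: every atom starts as a compound containing only itself.
    compound_of = {atom: {atom} for atom in bag}
    open_valences = []
    for a, b in links:
        if a not in compound_of or b not in compound_of:
            # Item 7: linkage to an atom outside the bag stays unresolved.
            open_valences.append((a, b))
            continue
        first, second = compound_of[a], compound_of[b]
        if first is not second:
            # Item 5: merge the smaller compound into the larger one.
            if len(first) < len(second):
                first, second = second, first
            first |= second
            for atom in second:
                compound_of[atom] = first
    # Collect each distinct compound once.
    seen, compounds = set(), []
    for compound in compound_of.values():
        if id(compound) not in seen:
            seen.add(id(compound))
            compounds.append(sorted(compound))
    return sorted(compounds), open_valences

compounds, open_valences = resolve(
    ["a", "b", "c", "d"],
    [("a", "b"), ("b", "c"), ("d", "x")],
)
print(compounds)      # [['a', 'b', 'c'], ['d']]
print(open_valences)  # [('d', 'x')]
```

In this example the atom "x" is not in the bag, so the compound containing "d" keeps an open valence that could link to a compound at another node.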
At any time, the entire state produced by the aggregation process is recorded in the various objects in memory. This information can be written out to a text file.
Thus Mathematica or any other process can take up the transforms of the bag of atoms at any point during the formative process. The interface that allows one to move the bag over to a Mathematica visualization environment is straightforward. This is the perfect way to develop different types of chemistry. Once a chemistry is developed, the transformation can be encoded as a Referential Information Base (lines and tensors, as in the scatter gather to the circle). This provides very fast processing.
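The hand-off through a text file might look like the following sketch. The file layout shown here (ATOM and LINK records, one per line) and the function names dump_state and load_state are assumptions made for illustration; the note does not specify the actual ASCII format. Atom names are assumed to contain no spaces.

```python
def dump_state(atoms, links):
    """Write atoms and links as plain text, one record per line.

    atoms: dict mapping atom name -> (x, y, z) position in objectSpace
    links: list of (atom, atom) pairs
    """
    lines = []
    for name in sorted(atoms):
        x, y, z = atoms[name]
        # repr() keeps full float precision so the state round-trips exactly.
        lines.append(f"ATOM {name} {x!r} {y!r} {z!r}")
    for a, b in links:
        lines.append(f"LINK {a} {b}")
    return "\n".join(lines) + "\n"

def load_state(text):
    """Read the records back into the same structures."""
    atoms, links = {}, []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "ATOM":
            atoms[fields[1]] = tuple(float(v) for v in fields[2:5])
        elif fields[0] == "LINK":
            links.append((fields[1], fields[2]))
    return atoms, links

atoms = {"a": (0.0, 1.5, -2.0), "b": (3.25, 0.0, 0.1)}
links = [("a", "b")]
text = dump_state(atoms, links)
print(text)
```

Any other process that can read lines of text could pick up the state from such a file at any point during the formative process.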
Figure 5: The SenseMaking environment for SLIP
Figure 5 can be regarded as an IDEF type diagram with the following inputs or outputs (starting in the upper right corner):
1) Event Log: The data source has to be in the form of an event log. The event source can be a text mining process or a data mining process that discovers event profiles and places them into the record. Examples so far are (a) Intrusion Detection Audit logs, (b) an FTP log file dump, and (c) a crude thematic parser as applied to the BCNGroup Fable collection. Human work is required to identify proper event logs.
2) Human Judgment: The development of the Analytic Conjecture is aided by the SLIP Warehouse Browser (see exercises).
3) Human Control of Categorization: Human control can be exercised at either of the two levels of abstraction (see section 5.2 of the First Report on eventChemistry).
4) Human Intervention: Actions are taken due to information delivered to the Invariance Detection Cycle.
5) Human Control over relevance metrics: Reinforcement learning, such as found in artificial neural network classifiers, can be used in conjunction with manual changes in the way the overall system is working.
This type of SenseMaking system is being considered for implementation as an incident management system.