
 

Design of Vader Systems

 

Paul S. Prueitt

OntologyStream Inc

 

March 23, 2002

 

Our business group is developing marketing materials for a commercial Visual Abstraction Database for Emergency Response (Vader). This tutorial is offered by the research group in support of that effort.

 

The software and data for this tutorial are available as a free download.

 

This paper assumes that the reader has worked through the introductory tutorials.  For more discussion please call Dr. Paul Prueitt at 703-981-2676.

 

Event detection provides factual information within the context of an inventory of the event types. Events indicate that specific programs (such as an e-mail program) are talking between two or more machines.

 

The research group must inventory the known event types as the first step in demonstrating the commercial viability of Vader systems.

 

We see event types as being informative about

 

1)      expert tacit knowledge of the domain

2)      objects of perceptual invariance in the domain expert's mind

 

In this tutorial, we illustrate a "source port behavior" using snort data.  Of course there are many ways of doing this.  Like object perception in the human mind, the visual Abstractions can be seen from different angles; however, the number and kind of objects do not change simply by the perceptual act.  Each object has its own reality.

 

Visual Abstractions are virtual objects.  With an object knowledge base, expert tacit knowledge is directly used to control response processes.  A perception/action cycle, with memory and reinforcement, is established as experts work on real problems.  

 

This has to be demonstrated, but we are very close to this demonstration.

 


Section 1: A short note on the analytic conjecture

 

Up to now, the SLIP conjectures have been simple link analysis.  However, any logical construction that involves a calculus on the names of the columns and on the contents of the column cells is a viable conjecture.

 

For example:

 

The value in the 5th column is x, y or z,

the time is between p and q,

and there is an occurrence of the value a, b, or c in the third column

after the found value in the 5th column

 

will produce syntagmatic units of the form < x, r, a >.  Once one has a set of syntagmatic units

 

{ < x, r, a > },

 

one can define a set of atoms and relationship types.  One could even define several types of atoms during the conjecture.  The x values and the a values can be treated differently, for example, so that the r relationship has a temporal or causal aspect.
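As a rough illustration, the example conjecture above can be sketched in Python.  This is only a sketch under stated assumptions and not the SLIP implementation: the function name apply_conjecture, the position of the time column, the relation label "followed_by" and the reading of "after" as "in a later record, in time order" are all our own choices for the example.

```python
from typing import Any, Collection, List, Sequence, Tuple

Unit = Tuple[Any, str, Any]  # a syntagmatic unit < x, r, a >


def apply_conjecture(
    records: Sequence[Sequence[Any]],
    x_values: Collection[Any],
    a_values: Collection[Any],
    p: Any,
    q: Any,
    time_col: int = 0,              # assumed position of the timestamp
    x_col: int = 4,                 # "the 5th column" (zero-based index 4)
    a_col: int = 2,                 # "the third column" (zero-based index 2)
    relation: str = "followed_by",  # hypothetical name for r
) -> List[Unit]:
    """Return the units < x, r, a > produced by the example conjecture."""
    # Keep only records whose time lies between p and q, in time order.
    window = sorted(
        (row for row in records if p <= row[time_col] <= q),
        key=lambda row: row[time_col],
    )
    units: List[Unit] = []
    for i, row in enumerate(window):
        if row[x_col] not in x_values:
            continue
        # Look for a, b or c in the third column of any later record.
        for later in window[i + 1:]:
            if later[a_col] in a_values:
                units.append((row[x_col], relation, later[a_col]))
    return units


# Made-up records: (time, -, third column, -, fifth column)
sample = [
    (1, "-", "a", "-", "x"),
    (2, "-", "b", "-", "n"),
    (3, "-", "n", "-", "y"),
    (4, "-", "c", "-", "n"),
    (9, "-", "a", "-", "x"),  # outside the time window
]
units = apply_conjecture(sample, {"x", "y", "z"}, {"a", "b", "c"}, p=1, q=5)
```

Wrapping the result in set() gives the set of syntagmatic units; the distinct first and third members of the units are then candidate atoms, and the relation label is the link type.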

 

These logical calculi on column names and contents are beyond the scope of the tutorial, except to indicate that most of the Intrusion Detection System (IDS) rules can be converted directly into a logical construction that visualizes atoms and link-types.  These atoms and links form a substructure for the production of visual clues regarding the nature and variation of the IDS events.  The IDS log file need not be the starting point for visual abstraction.

 

Observationally, we see that the conjectures produce object invariance that is driven by data content.  For each conjecture there are between 3 and 10 major compounds and between 10 and 50 smaller ones.  A conjecture can be applied to a data stream by building a dual buffer, where real-time data is accumulated into one buffer while the other buffer is used to build the event Chemistry.  The dual-buffering architecture is commonly used in real-time compression and encryption streaming.

 

The conjecture acts as a convolution over the data stream to produce a real-time imaging of exactly those events that are occurring in the data stream.
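The dual-buffer arrangement can be sketched as follows.  This is a minimal single-threaded sketch, assuming a fixed buffer capacity; the class name DualBuffer and the process callback are hypothetical, and in a real-time system the processing step would run concurrently with accumulation rather than inline.

```python
from typing import Any, Callable, List


class DualBuffer:
    """Accumulate records in one buffer while the other is processed."""

    def __init__(self, capacity: int, process: Callable[[List[Any]], Any]):
        self.capacity = capacity   # assumed: swap when this many records arrive
        self.process = process     # e.g. a function that applies a conjecture
        self.filling: List[Any] = []
        self.results: List[Any] = []

    def push(self, record: Any) -> None:
        """Add one record from the stream; swap buffers when full."""
        self.filling.append(record)
        if len(self.filling) >= self.capacity:
            self.swap()

    def swap(self) -> None:
        """Hand the filled buffer to processing and start a fresh one."""
        full, self.filling = self.filling, []
        if full:
            self.results.append(self.process(full))


# Here the "processing" is just a record count, standing in for a conjecture.
buffer = DualBuffer(capacity=3, process=len)
for record in range(7):
    buffer.push(record)
```

After the loop, two full buffers have been processed and one record is still accumulating; calling buffer.swap() flushes it.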


Section 2: Snort data

 

We received 1833 snort records, and placed six columns of this data into a datawh.txt file.

 

 

Figure 1: (source port, destination IP) conjecture

 

We developed a (source port, destination IP) conjecture.

 

Dean Rich and others have talked with us about a specific "source port behavior" and we have read about this in the books on intrusion detection.

 

We have an internal mental model of "source port behavior".

 

"Source port behavior" is part of what is “in the data” and can be useful as a means to point precisely at an incident.  The source port will often be incremented each time a packet is sent from the source IP address to the destination IP address.  So this should look like a type of reverse port scan. 

 

Port scans should be viewable using any one of several conjectures such as (destination port, source IP), (destination port, destination IP) or the conjugate conjectures (source IP, destination port), (destination IP, destination port).
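The incrementing source port mentioned above can also be checked directly on the records, independently of any conjecture.  The sketch below assumes that packets are available in arrival order as (source IP, source port, destination IP) tuples; the function name increment_ratio and the source address 10.0.0.5 are made up for the example, and "incremented" is read loosely as "larger than the previous source port".

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Packet = Tuple[str, int, str]  # (source IP, source port, destination IP)


def increment_ratio(packets: Iterable[Packet]) -> Dict[Tuple[str, str], float]:
    """For each (source IP, destination IP) pair, return the fraction of
    consecutive packets whose source port is larger than the previous one."""
    last_port: Dict[Tuple[str, str], int] = {}
    steps: Dict[Tuple[str, str], int] = defaultdict(int)
    rises: Dict[Tuple[str, str], int] = defaultdict(int)
    for src_ip, src_port, dst_ip in packets:
        key = (src_ip, dst_ip)
        if key in last_port:
            steps[key] += 1
            if src_port > last_port[key]:
                rises[key] += 1
        last_port[key] = src_port
    # Pairs seen only once have no consecutive step and are left out.
    return {key: rises[key] / steps[key] for key in steps}


packets = [
    ("10.0.0.5", 1025, "192.168.10.249"),
    ("10.0.0.5", 1026, "192.168.10.249"),
    ("10.0.0.5", 1027, "192.168.10.249"),
    ("10.0.0.5", 1024, "192.168.10.249"),
]
ratios = increment_ratio(packets)
```

A ratio close to 1 for a pair suggests the incrementing pattern; what counts as "close" is a judgment we leave to the analyst.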

 

"Source port behavior" can be used to illustrate the operational properties of visual Abstraction.  In the conjecture (source port, destination IP), the destination IP organizes source port log file events. Compounds suggest how source ports use the various destination IP addresses and how events are reported to the snort log.

 

The reader is encouraged to look in the data set and investigate some of these compounds. 
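For readers who want to look at the compounds outside the tool, one simple reading of the (source port, destination IP) conjecture is to group source ports (the atoms) by the destination IP that organizes them.  This is a simplification and an assumption on our part, not the SLIP clustering itself; the function name build_compounds and the two-column record layout are hypothetical, and the second destination address is made up.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Sequence, Set


def build_compounds(
    records: Iterable[Sequence[Any]], atom_col: int, link_col: int
) -> Dict[Any, Set[Any]]:
    """Group the values in atom_col by the value in link_col."""
    compounds: Dict[Any, Set[Any]] = defaultdict(set)
    for row in records:
        compounds[row[link_col]].add(row[atom_col])
    return dict(compounds)


# Made-up records: (source port, destination IP)
records = [
    (1025, "192.168.10.249"),
    (1026, "192.168.10.249"),
    (1026, "192.168.10.249"),  # a repeated pair adds no new atom
    (3310, "192.168.10.1"),
]
compounds = build_compounds(records, atom_col=0, link_col=1)
```

Swapping atom_col and link_col gives the conjugate conjecture, in which destination IPs are grouped by source port.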

 


Section 3: Perceptual priming using pictorial icons

 

The issue of mental imaging and conceptual priming of the individual experience of knowledge is a question of science.  A complete description of the issue of mental priming is highly relevant to our proposed National cyber defense knowledge base.

 

The White Hats have many mental images of Cyberspace events, and these mental images can be triggered by a visual Abstraction (perceptual priming).

 

The visual Abstractions can also be used to transfer some domain specific tacit knowledge from a White Hat (or CERT domain specialist) to someone who knows nothing about Internet phenomena. The transfer of knowledge can be within a highly trained community.  Using the event types, this community can protect the core infrastructure of the Internet.

 

Human awareness that is primed will sometimes lead to the mental recognition that

 

1)      Something is understood

2)      Or that something is "there" but not understood

 

For example, our snort data set was given to the research group with the following information:

 

" There were some vulnerability scans and other things going on.  "

 

To investigate the vulnerability scans one should copy the snort2 folder (from snort2.zip) and delete all files in the data folder except the datawh.txt file.  Then build a conjecture and look at the compounds. 

 

 

Figure 2: A possible source port behavior event type occurring to Dip = 192.168.10.249

 

We will take a slightly different approach.  We first filter the data to produce a datawh.txt file having only one specific IP that appears to have been scanned.

 

Having identified this potential port scan, we used the SLIPCore to manually bring the single IP address into a category so that report generation could produce a file having only Dip = 192.168.10.249.  Clearly some automation would be helpful here, but the I-RIB data structure is readily available for this type of query and retrieval.
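The manual step described above could be automated along the following lines.  This is a sketch only: it does not use the I-RIB data structure or the SLIPCore, and the function name filter_by_destination, the in-memory record layout and the second destination address are assumptions made for the example.

```python
from typing import Any, Iterable, List, Sequence


def filter_by_destination(
    records: Iterable[Sequence[Any]], dest_ip: str, dip_col: int
) -> List[Sequence[Any]]:
    """Return only the records whose destination IP equals dest_ip."""
    return [row for row in records if row[dip_col] == dest_ip]


# Made-up records: (source port, destination IP)
records = [
    (1025, "192.168.10.249"),
    (3310, "192.168.10.1"),
    (1026, "192.168.10.249"),
]
query_set = filter_by_destination(records, "192.168.10.249", dip_col=1)
```

The resulting query set plays the role of the reduced datawh.txt file, to which a conjecture is then applied.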

 

 

Figure 3: Report generated for all records having Dip = 192.168.10.249

 

Having retrieved the 193 records from the original dataset, we now develop events using the (source port, destination IP) conjecture (see Figure 4).

 

 

Figure 4: The conjecture (source port, destination IP) in the query set

 

By inspection of Figure 5 we verify that there is only one event compound, and that this compound is defined by the relationship to a single destination IP, 192.168.10.249.

 

This single IP address has been passed information from 95 source ports.  It is natural to ask if the complete set of source ports is related to the same source IP, or is related to source IPs that are known to have a common locus of control.
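The first half of this question can be answered by a simple tabulation over the query set, as sketched below.  The function name source_ips_behind_ports, the record layout and the source address 10.0.0.5 are hypothetical; whether several source IPs share a common locus of control cannot be read from the log alone and is not attempted here.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Sequence, Set, Tuple


def source_ips_behind_ports(
    records: Iterable[Sequence[Any]], sport_col: int, sip_col: int
) -> Tuple[Dict[Any, Set[Any]], Set[Any]]:
    """Map each source port to the source IPs that used it, and also
    return the set of all source IPs seen."""
    by_port: Dict[Any, Set[Any]] = defaultdict(set)
    for row in records:
        by_port[row[sport_col]].add(row[sip_col])
    all_ips: Set[Any] = set().union(*by_port.values())
    return dict(by_port), all_ips


# Made-up records: (source IP, source port, destination IP)
records = [
    ("10.0.0.5", 1025, "192.168.10.249"),
    ("10.0.0.5", 1026, "192.168.10.249"),
    ("10.0.0.5", 1027, "192.168.10.249"),
]
by_port, all_ips = source_ips_behind_ports(records, sport_col=1, sip_col=0)
single_source = len(all_ips) == 1
```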

 

 

Figure 5: The view of the single compound having the proper number (95) of atoms

 

The "source port behavior" is likely to be best shown in a (source IP, dest port) conjecture.  But we will see this same "object" using any one of several conjectural rules.

 

The notion of a conjectural rule is part of an OSI patent (currently under development) that protects the vertical market development by OSI Trusted Partners.

 

 

 


Section 4: Enumeration of event-types

 

One may descriptively enumerate all event-types and thus define what one means by an event, taken in the abstract, and what is meant by each event, considered by itself.

 

Event types will become fully enumerated in the knowledge base part of Vader systems.  One then has available a state-gesture model to implement priming of human perception through the services of the root_KOS.  The vertical market will prove the horizontal technology.

 

We have not yet enumerated the Cyber Security event-types, because we still have some open issues.  The descriptive enumeration (DE) of event types is a knowledge acquisition process that is governed by a process model for enumeration. 

 

What is the proper data instrumentation, data sensor parameterization, and SLIP/CLIP analytic conjecture needed to produce good visual Abstraction?  When is visual Abstraction evocative of the mental experience by CERT analysts and other White Hats?  We have only the first few examples.  But clearly the event type knowledge base will develop rapidly, once domain experts begin to use the tool set.

 

When the first commercial Vader is prototyped, we will see new types of expectations from the interface design.

 

For example, when a port scan event appears to be starting we will see an anticipatory object appear in the Vader interface. 

 

 

Figure 6: Mock up of a Vader controller

 

As the scan occurs, we should see the development of the scan profile.  Other real time objects will be viewable, as will be incident histories.  Variations from the typical object representation will be seen. 

 

After the scan has been completed we will register the event at a higher level of organizational detail.  Petri type models anticipate what might happen because of a scan of a certain type.  Incident histories detail the behavior of a program of a particular type. Incident histories detail that a particular type of event has in fact occurred at a specific time. 

 

The four-layer taxonomy

 

{ bit stream, intrusion, incident, and policy }

 

is seen in Figure 7.

 

 

Figure 7: The first of four PowerPoint slides on SenseMaking

 

The four-layer “stratified taxonomy” is to be used in a Vader or CDKB system in order to communicate within professional communities.  Knowledge management methodology and technology are useful in facilitating this type of collaborative communication.

 

Thus we see that the facilitation of terminology development and use in a community of practice is an essential aspect related to the notion of a CERT center.

 

Other uses of visual Abstraction involve:

 

1)      Automatic known event detection

2)      Novelty detection

3)      Automatic and mediated response

a.       Automatic and mediated adjustment of sensors and instrumentation

b.      Automatic and mediated adjustment of vulnerability exposure

c.       Routing of activity into a HoneyNet

d.      Denial of access response

4)      Assembly of historical accounts as incident records

 

Once event types are identified, the Vader knowledge base can be instructed to alert the Vader viewer that specific events are present in the data stream.  We can describe the "event" by name and short description and modify the object formation process to look just for this "object".
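A named event type with a short description and a matching rule might be represented as sketched below.  Nothing here reflects the actual Vader knowledge base, which has not yet been built: the EventType class, the detect function, the "many source ports" rule, its threshold of 3 and the second destination address are all hypothetical, chosen only to show the shape of the idea.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Set, Tuple


@dataclass
class EventType:
    name: str                            # name shown to the viewer
    description: str                     # short description of the event
    matches: Callable[[Set[Any]], bool]  # rule applied to a compound's atoms


def detect(
    event_types: List[EventType], compounds: Dict[Any, Set[Any]]
) -> List[Tuple[str, Any]]:
    """Return (event name, organizing value) for every compound that
    matches a known event type."""
    alerts: List[Tuple[str, Any]] = []
    for link_value, atoms in compounds.items():
        for event_type in event_types:
            if event_type.matches(atoms):
                alerts.append((event_type.name, link_value))
    return alerts


many_ports = EventType(
    name="many source ports",
    description="One destination IP reached from many source ports",
    matches=lambda atoms: len(atoms) >= 3,  # threshold is an assumption
)
compounds = {
    "192.168.10.249": {1025, 1026, 1027, 1028},
    "192.168.10.1": {3310},
}
alerts = detect([many_ports], compounds)
```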