Research Note 17

 

July 24, 2003

 


 

 

Satish has made a commitment to me to have the following by this evening:

 

Documents with:

1. Input and Output trees generated for the fox example

2. Input and Output trees generated for the fables

3. Performance degradation for the 32,000 files compliant with Note 14

 

These documents will explain both "how" and "what" is done using several Perl programs that Satish has been working on, on and off for several weeks. 

 

The programs are each very small, but each uses one or two Perl modules in a clever way to reproduce the process that produces both the current NdCore Input Array and the Output Array.

 

Perl was chosen for several reasons.  However, we have had some unexpected difficulties in making the code understandable.  This may lead us to redevelop (hopefully within a few hours) a C module that matches the input/output functions of the Perl.  Now that we understand the process from developing the Perl code, it should be straightforward to re-express this understanding as C code.

 

ANSI C is extremely clear in how it works, and clarity is the essence of our strategy for putting in two reification cycles.  For some time we have also had a FoxPro program that reproduces the NdCore Input Array and a word-type-only inversion.

 

It is important to note that the NdCore program, which produces the Input Array and Output Array, is less than 40K in size and runs on any Linux system, no matter how small the memory.  We currently have three machines that run the NdCore conceptual rollup engine.

 

The two reification cycles are regarded as "reification of measurement" in the context of the Actionable Intelligence Process Model.

 

1) The first reification cycle involves the use of frames and types (slots and fillers) to produce the Input Array. This Input Array will have exactly the form of the current NdCore Input Array, i.e. the Input Array is in fact an array of "centers", each with two or more branches linked to the center.  The centers of the current NdCore 5-grams are the significant words identified by the process that currently produces the NdCore Input Array.

 

2) The second reification cycle involves the use of a controlled vocabulary where the vocabulary is organized into localized schema containers, similar to the containers in SchemaLogic Inc's software for data federation with reconciliation processes. 
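
The form of the Input Array described in the first cycle (centers with branches, built from 5-grams) can be illustrated with a minimal sketch. This is not the NdCore or Perl code. The function name build_input_array, the sample sentence, the choice of significant words, and the reading of a 5-gram as two words on each side of the center are all assumptions made for illustration only.

```python
from collections import defaultdict


def build_input_array(tokens, significant, half_window=2):
    """Group 5-gram contexts by their center word (hypothetical helper).

    Returns a dict mapping each significant center word to a list of
    (left_branch, right_branch) tuples taken from the token stream.
    """
    centers = defaultdict(list)
    for i, word in enumerate(tokens):
        if word not in significant:
            continue
        # Keep only full windows so that every entry is a true 5-gram.
        if i < half_window or i + half_window >= len(tokens):
            continue
        left = tuple(tokens[i - half_window:i])
        right = tuple(tokens[i + 1:i + 1 + half_window])
        centers[word].append((left, right))
    return dict(centers)


# Assumed sample text and assumed significant words, for illustration only.
tokens = "the quick brown fox jumps over the lazy dog".split()
significant = {"fox", "over"}
input_array = build_input_array(tokens, significant)

for center, branches in sorted(input_array.items()):
    print(center, branches)
```

Each key is a center and each tuple holds the branches linked to it. How NdCore actually selects the significant words is not shown here; the set is simply passed in.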

 

Note 14 is at

 

http://www.ontologystream.com/area2/review/Note14.htm

 

where I am communicating with Dr. David Alberts at OSD about the issues related to measuring knowledge sharing, in line with Dr. Alberts' books:

 

"Network Centric Warfare", several others, and the more recent book "Power to the Edge".

 

Note 16 addresses some of the key generic questions about the use of n-grams as a measurement device in the context of our Actionable Intelligence Process Model.

 

http://www.ontologystream.com/area2/review/Note16.htm

 

Note 16 also suggests that a comparison can be made between the Semio concept maps and the NdCore conceptual rollup.  A comparison of this type is the only way to establish ground truth automatically, without human reification processes.

 

So, for example, if there is something surprising and novel about some use of words by some terrorist group, then this linguistic variation should appear as a unit in both the Semio and the NdCore constructions.  Because the two technologies are, I argue, really quite different, this cross-validation is a strong indicator of ground truth.
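
The cross-validation idea can be sketched as a simple overlap test. The sketch assumes that each system's output can be reduced to a set of text labels for its units. That representation, the function name cross_validated_units, and the placeholder labels are hypothetical and do not come from Semio or NdCore.

```python
def cross_validated_units(semio_units, ndcore_units):
    """Return the units reported by both systems (hypothetical helper).

    Labels are lowercased and stripped so that trivial formatting
    differences do not hide a match.
    """
    def normalize(units):
        return {u.strip().lower() for u in units}

    return normalize(semio_units) & normalize(ndcore_units)


# Placeholder labels standing in for units found by each system.
semio = {"Term A", "term b", "term c"}
ndcore = {"term a", "term c ", "term d"}
shared = cross_validated_units(semio, ndcore)
print(sorted(shared))
```

Units that survive the intersection are the candidates for ground truth. Units found by only one system would still need human review.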