[om-list] System Design

Mark Butler butlerm at middle.net
Fri Sep 29 13:51:09 EDT 2000


Luke Call wrote:

> What
> causes us to require a special format other than objects decomposed onto
> a database? 

Same purpose as export formats like XML and GEDCOM.  Different applications
may use different kinds of databases depending on their own requirements
(speed, capacity, etc.), but text export formats provide a common way for them
to exchange information, as well as a common analysis technique.  The only
hard requirement is that there be a 1:1 mapping between the format and the
database.
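
To make that 1:1 requirement concrete, here is a minimal sketch in Python
(the table layout and line format are invented for illustration, not a
proposal for the actual schema) of the round trip that has to be lossless:

    # Minimal sketch of a lossless round trip between database rows
    # and a text export format.  Layout and format are hypothetical,
    # and escaping of "|" and newlines is ignored.
    def export(rows):
        # each row is (id, name, value); one line per row
        return "\n".join("%d|%s|%s" % row for row in rows)

    def import_(text):
        rows = []
        for line in text.splitlines():
            id_, name, value = line.split("|")
            rows.append((int(id_), name, value))
        return rows

    rows = [(1, "person", "Luke"), (2, "person", "Tom")]
    assert import_(export(rows)) == rows   # 1:1 means this always holds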

Adding certain capabilities is a matter of compromise - I may need them to
satisfy personal requirements for a useful system, and I only ask that the
system be capable of having such a function developed for it.  I certainly do
not expect people who do not care about certain features to volunteer their
time to develop support for them.

The only critical questions are those that force significant constraints on
the lowest level data structures and how they are implemented.  For example, I
doubt that any form of analysis could be effectively performed on a LISPy
structure stored in a relational database without caching large portions of it
into memory for processing.  Unfortunately, tracing thousands of node graph
edges for a good-sized problem just to get them into memory is likely to take
several minutes. If the problem size is limited, loading and parsing a text
export file into memory is likely to be at least a hundred times faster.
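
To see why, assume a hypothetical cons-cell table where each row holds one
node of the graph: every edge traversed becomes a separate database round
trip, while the text file is parsed in a single pass.  A rough sketch:

    import sqlite3

    # Hypothetical cons-cell table: one row per node of the LISPy graph.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE cons (id INTEGER PRIMARY KEY, car, cdr)")
    db.executemany("INSERT INTO cons VALUES (?, ?, ?)",
                   [(1, "a", 2), (2, "b", 3), (3, "c", None)])

    def walk(node_id):
        # one query per edge traversed - thousands of edges means
        # thousands of round trips to the database
        while node_id is not None:
            car, cdr = db.execute(
                "SELECT car, cdr FROM cons WHERE id = ?",
                (node_id,)).fetchone()
            yield car
            node_id = cdr

    print(list(walk(1)))   # ['a', 'b', 'c']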

Now if you have a very large problem set with no convenient boundaries, you
need a database that is optimized for graph traversal, much like most modern
object-oriented databases.

In any case, regardless of how many implementations we end up with, we cannot
work together without a common lowest level logical data model.  I suggest a
LISP-like data model because I know that it is sufficiently flexible to
represent virtually anything, while also recognizing that it has severe
implications for practical database implementations.
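
To be clear about what I mean by LISP-like, here is a sketch (using Python
tuples to stand in for cons pairs; the example facts and names are made up):
the entire lowest level is atoms and pairs, and every higher-level notion -
entity, attribute, relationship - is just a convention layered on top:

    # Sketch: a LISP-like universal structure - atoms and pairs only.
    NIL = None

    def cons(car, cdr):
        return (car, cdr)

    def lst(*items):
        out = NIL
        for x in reversed(items):
            out = cons(x, out)
        return out

    # An ordinary fact (predicate and names are hypothetical):
    fact = lst("likes", "alice", "bob")

    # A statement about that statement needs no change to the
    # lowest level - the fact is just another term:
    belief = lst("believes", "carol", fact)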

Our number one problem is that we have no consensus on the meta-model level.  I
want a first class, singly rooted system capable of analyzing logical formulas
or sentences, performing natural language translation, and so forth.  That
pretty much requires the user to be able to dynamically create meta-model
structures for arbitrary abstractions and refer to them in any context, which
prohibits a relational database implementation at the meta-model level (i.e.
one with hard coded concepts for which classifications - entity, attribute,
name, and so forth - a thing must fall into).
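
The difference shows up plainly in a sketch (all names hypothetical).  In a
fixed relational meta-model the classifications are tables, so a new kind of
abstraction means a schema change; in a first class system a classification
is just another node the user can create at run time:

    # Fixed meta-model: the classifications are hard-coded.  Adding a
    # new kind of thing (say, "logical formula") means changing the
    # schema and the code.
    class Entity: pass
    class Attribute: pass

    # First class meta-model: a classification is itself a node.
    nodes = {}

    def define(name, isa=None):
        nodes[name] = {"name": name, "isa": isa}
        return nodes[name]

    define("classification")                      # the single root
    define("entity", "classification")
    define("logical-formula", "classification")   # created at run time,
                                                  # no schema change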

Tom is working on developing an optimal meta-model that I do not understand
very well, mostly because he hasn't finished it yet.  From what I can tell,
Luke would be happy with a meta-model composed of objects, attributes, names,
and relationships.  The problem with fixed meta-models is that they entail
strict ontological commitments about not only what is real, but also what
abstractions are capable of being represented.

Natural languages force no such ontological commitments - anything can be
treated as a first class object, i.e. "noun-ified".  If we want a system that
can store natural language in its native form for further analysis, we have to
standardize on a meta-meta-model layer that can represent any sentence in any
language.  The same goes for any general form of automated reasoning, i.e. one
that can perform inference chaining on arbitrary statements.
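
For example (hypothetical predicates, plain nested tuples standing in for
the real structure), noun-ification just means a whole sentence can appear
wherever an atom can, so further statements can be made about it:

    # Sketch: a whole statement used as a first class term inside
    # other statements - no new meta-model machinery required.
    sentence = ("gave", "alice", "bob", "book")

    # The sentence itself becomes the subject of further statements:
    tense  = ("tense", sentence, "past")
    source = ("reported-by", sentence, "carol")

    # A fixed object/attribute/name schema has no natural place for
    # a statement about a statement; here it is just another tuple.
    print(source)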

If we are to live up to the name "One Model", we need to have a lowest level
data model capable of representing what other people believe in a form
equivalent to its original representation, which is natural language.  

If we do otherwise, we drastically constrain what our methodology is capable
of, reducing our purpose to building a bigger version of what so many people
are already doing.  We might build the world's largest distributed semantic
network for human research and navigation, yet find it impossible to perform
the kind of common sense reasoning that databases like Cyc are designed to
make possible.

Cyc is based on a LISP meta-meta-model, which makes it capable of being
extended to do all the things I have described above.  If we force a higher
level meta-model design, we automatically concede the whole general purpose AI
field to Cyc and projects like it.

That is not necessarily a bad thing; it just means we are narrowing our focus
to a specific application domain rather than trying to build the Swiss army
knife of knowledge representation.  Again, this article is very good on these
kinds of issues:

R. Davis, H. Shrobe, and P. Szolovits.  What is a Knowledge Representation?
AI Magazine, 14(1):17-33, 1993.

http://www.medg.lcs.mit.edu/ftp/psz/k-rep.html


I would like to hear from both Tom and Luke on whether you agree it is best to
base the system on a first class meta-meta-model, with all its implications
for data storage, or whether you think we should narrow our scope and
standardize on a consensus meta-model with specific classes / tables for
objects, attributes, relationships, names, and so forth.
 
> When we get together around Thanksgiving (assuming we can?) it would be
> cool to have a white- or blackboard, or at least a place to gesture and
> talk for a while. Place, anyone?

We can have it at my house.  662 N. 100 E. Farmington, UT 84025

- Mark

-- 
Mark Butler	       ( butlerm at middle.net )
Software Engineer  
Epic Systems              
(801)-451-4583



