[om-list] Data Model

Sun Apr 22 11:11:35 EDT 2001

OM

    Just so you know, I'm reading this stuff.

    I have taken one database class, and it was enlightening -- but almost
solely in the sense that I can now understand how the majority of you
business-computer people are thinking, a little more than I could before.
In my opinion, what we will eventually end up doing to construct an optimal
model of knowledge will look nothing like RDB, or even OODB.

    So, for now, my main motivation is to work on my own mathematical theory
of modelling and inference, so we have at least one well-thought-out plan to
borrow from as we find the flaws in our first implementation.

    By the way, I'm going to start writing my first attempt at a
machine-learning algorithm in about one month.  It will be similar to
statistical learning, from what I've been told.  Did I tell you guys this
already?  Have any of you heard of cluster analysis?  I told Mark about my
charge-density analogy in learning/classifying new/old objects based on
their point-values in many-dimensional feature space.
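
The charge-density analogy is not spelled out in this message, so the following is only one possible reading of it, sketched under stated assumptions: every known object is a point "charge" in feature space, each class builds up a potential that falls off as 1/distance, and a new point is assigned to the class with the largest potential at its location. The function names, the `softening` parameter, and the sample data are all hypothetical.

```python
import math

def potential(point, charges, softening=1e-3):
    """Sum of 1/distance contributions from every stored point of one class.

    `softening` (an assumption of this sketch) avoids division by zero
    when the query point coincides with a stored point.
    """
    total = 0.0
    for c in charges:
        total += 1.0 / (math.dist(point, c) + softening)
    return total

def classify(point, labelled_points):
    """Return the label whose stored points exert the largest total potential."""
    scores = {label: potential(point, pts) for label, pts in labelled_points.items()}
    return max(scores, key=scores.get)

# Hypothetical two-dimensional feature space with two classes of known objects.
training = {
    "old": [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3)],
    "new": [(5.0, 5.0), (5.2, 4.9), (4.8, 5.1)],
}

print(classify((0.3, 0.2), training))  # old
print(classify((4.9, 5.0), training))  # new
```

The same code works unchanged in many dimensions, since `math.dist` accepts points of any equal length.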

    For me, there are two main objectives in writing good AI: (1) the model
and (2) the method.  Put another way, these are the statics and the dynamics
of the language, or the data structure and the algorithm operating on that
data.  It's almost arbitrary which one you start your work on, since the two
are so interdependent.  I think Lee wants to see more of the algorithm part.

    The OM project is starting on the model first, and while I see nothing
wrong in this, for some reason, my personal interests have led me to start
working on method/algorithms first.  I had thought about the form of the
model a lot, to start with, but now that I am actually ready to program
something I want to try out algorithms.

    I just discovered Support Vector Machine theory, which is very
interesting, and similar to what I wanted to develop.  There are so many
ideas out there, and they all seem to be overly concerned with method, above
model, but I think we can learn something from them.

    Here is my number-one desire right now, concerning our project:  How do
we get data to test our method (our inferences) on, and how do we compare
our results to the previously created systems so we know what value our
system has, personal-use-wise, and market-wise?  I think we need to find
data that other people have tested their programs on, and make similar
tests.  Any thoughts?  Any known resources?

    In the second semester hence, I will work on developing a neural net: an
alternative approach to the explicit inference method of geometry.  Then I
will work on trying to combine the two methods, or even to come up with two
models, and a relation (translation) between the two, so we can take
advantage of what I think I will discover in my senior research: that each
type of inference (the explicit and the implicit, i.e. the geometric and the
anti-geometric, the local and the non-local) will have its strengths and
weaknesses.

tomp

----- Original Message -----
From: "Luke Call" <lacall at onemodel.org>
To: <om-list at onemodel.org>
Sent: Wednesday, April 18, 2001 7:47 AM
Subject: Re: [om-list] Data Model

Sorry for the delay; responses below. -Luke

Mark Butler wrote:

> ....
> I have had second thoughts about this - for query purposes we want all
> relationship attribute values to be in the same table.  So I propose that we
> move ATTRIBUTE_VERSION to be a new table called REL_VALUE (short for
> relationship value).

Thanks for the work. I think we'll continue to learn more about what we
want in the data model over time & w/ experience.

> ...
> This table would be the most critical data table in the whole system.  I
> propose that we use a structure like the following:
>
> -- THIS_ID         Primary Key (inherited - aka ATTRIBUTE_VALUE_ID)
> -- STANDARD_NAME   Standard name (inherited)

When we develop contexts so that anything can have 0-n names, we will
probably want to move this; it's probably a good place for now.

> -- RELATIONSHIP_ID Relationship this value is for
> -- PARENT_ID       Entity / relationship this attribute applies to
> --                 (Same as ATTRIBUTE.PARENT_ID)
> --
> -- SOURCE_ID       Source Entity ID - Origin for relationship
> --                 Null allowed here - means "empty space"
> --
> -- DEST_ID         Destination Entity ID - Destination for relationship
> --
> -- DOMAIN_ID       Domain for this relationship value
> --                 A domain is a combination of unit and basis
> --
> -- SCHEMA_ID       Schema for this relationship value
> --                 A schema is an entity or a work thereof that
> --                 asserts a particular world view
> --
> -- CLAIM_YEAR      Year schema first asserted this value (fractional)
> -- VALUE_YEAR      Value asserted to be valid for this year (fractional)
> --                 Null if claim is for all time

Wouldn't a value be good for a span of time?
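
One way to picture the span-of-time idea is to replace the single VALUE_YEAR with a pair of fractional years. This is only a sketch: the names `valid_from` and `valid_to` are hypothetical, and the half-open interval and the use of None for "unbounded" are assumptions, not part of the proposal quoted above.

```python
from typing import Optional

def valid_in(year: float,
             valid_from: Optional[float],
             valid_to: Optional[float]) -> bool:
    """True if `year` lies in [valid_from, valid_to).

    None on either side means the span is unbounded in that direction,
    so (None, None) plays the role of "claim is for all time".
    """
    if valid_from is not None and year < valid_from:
        return False
    if valid_to is not None and year >= valid_to:
        return False
    return True

print(valid_in(2001.3, 2001.0, 2002.0))  # True: inside the span
print(valid_in(2002.0, 2001.0, 2002.0))  # False: end of span is excluded
print(valid_in(1500.0, None, None))      # True: valid for all time
```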

> -- STRING_VALUE    String value
> -- SCALAR_VALUE    Scalar value
> -- VECTOR_VALUE    Vector value
> -- MATRIX_VALUE    Matrix value

Would it be a more efficient use of space to break these four value fields
into related tables, since three out of four of these will be blank on any
given record?
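
A rough sketch of what that split could look like, using Python's built-in sqlite3 module purely for illustration. The column names come from the quoted proposal; the side-table names REL_VALUE_STRING and REL_VALUE_SCALAR, the column types, and the sample row are hypothetical, and the vector and matrix tables are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Core table: one row per relationship value, no value columns.
CREATE TABLE REL_VALUE (
    THIS_ID         INTEGER PRIMARY KEY,
    STANDARD_NAME   TEXT,
    RELATIONSHIP_ID INTEGER,
    PARENT_ID       INTEGER,
    SOURCE_ID       INTEGER,
    DEST_ID         INTEGER,
    DOMAIN_ID       INTEGER,
    SCHEMA_ID       INTEGER,
    CLAIM_YEAR      REAL,
    VALUE_YEAR      REAL
);
-- Hypothetical side tables: a row exists only when that kind of value does.
CREATE TABLE REL_VALUE_STRING (
    THIS_ID      INTEGER PRIMARY KEY REFERENCES REL_VALUE(THIS_ID),
    STRING_VALUE TEXT NOT NULL
);
CREATE TABLE REL_VALUE_SCALAR (
    THIS_ID      INTEGER PRIMARY KEY REFERENCES REL_VALUE(THIS_ID),
    SCALAR_VALUE REAL NOT NULL
);
""")

# Made-up sample: a scalar "mass" value attached to entity 42.
conn.execute(
    "INSERT INTO REL_VALUE (THIS_ID, STANDARD_NAME, DEST_ID) VALUES (1, 'mass', 42)"
)
conn.execute("INSERT INTO REL_VALUE_SCALAR VALUES (1, 5.5)")

# Reading a value back now needs one join per value type.
rows = conn.execute("""
    SELECT r.STANDARD_NAME, s.SCALAR_VALUE, t.STRING_VALUE
    FROM REL_VALUE r
    LEFT JOIN REL_VALUE_SCALAR s ON s.THIS_ID = r.THIS_ID
    LEFT JOIN REL_VALUE_STRING t ON t.THIS_ID = r.THIS_ID
""").fetchall()
print(rows)  # [('mass', 5.5, None)]
```

The trade-off is that every query touching values needs joins, which works against the stated goal of keeping all relationship attribute values in the same table for query purposes. How much space the split actually saves depends on how a given database engine stores NULL columns.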

Also, how do we distinguish between a relationship and an actual value?
Some values aren't relationships, right? (Unless I've just forgotten
something.)

> Now I completely skipped the relationship and relationship attribute tables
> because I do not think we need to instantiate those objects unless we are
> going to use them as one of the terminal entities in a relationship. Any
> objections?

No objections from me.

> Finally, I propose that we treat NULL as the entity id for "nothing", i.e. the
> entity that does not exist.  This is useful for a class of relationships, like
> mass / energy that do not relate to any convenient real entity.  In such cases
> SOURCE_ID would be null, and dest_id would be the entity id of the entity we
> are describing.

What about the use of NULL as "no information known"? What would we use
to indicate "no information"?
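
One possible way to keep both meanings apart, sketched here with Python's sqlite3 module: reserve an explicit entity row for "nothing" and leave NULL to mean "no information known". This is an alternative to the quoted proposal, not a description of it. The reserved id 0, the ENTITY table layout, and the sample rows are all hypothetical.

```python
import sqlite3

NOTHING_ID = 0  # hypothetical reserved entity id meaning "nothing"

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ENTITY (
    THIS_ID       INTEGER PRIMARY KEY,
    STANDARD_NAME TEXT
);
CREATE TABLE REL_VALUE (
    THIS_ID       INTEGER PRIMARY KEY,
    STANDARD_NAME TEXT,
    SOURCE_ID     INTEGER REFERENCES ENTITY(THIS_ID),  -- NULL = not known
    DEST_ID       INTEGER REFERENCES ENTITY(THIS_ID)
);
""")
conn.execute("INSERT INTO ENTITY VALUES (?, 'nothing')", (NOTHING_ID,))
conn.execute("INSERT INTO ENTITY VALUES (1, 'rock')")

# 'mass' has no convenient real source entity: the source is "nothing".
conn.execute("INSERT INTO REL_VALUE VALUES (1, 'mass', ?, 1)", (NOTHING_ID,))
# 'owner' has a source, but nobody has recorded it: the source is unknown.
conn.execute("INSERT INTO REL_VALUE VALUES (2, 'owner', NULL, 1)")

from_nothing = conn.execute(
    "SELECT STANDARD_NAME FROM REL_VALUE WHERE SOURCE_ID = ?", (NOTHING_ID,)
).fetchall()
unknown = conn.execute(
    "SELECT STANDARD_NAME FROM REL_VALUE WHERE SOURCE_ID IS NULL"
).fetchall()

print(from_nothing)  # [('mass',)]
print(unknown)       # [('owner',)]
```

A side benefit of a real row is that ordinary equality comparisons and joins work on it, whereas SQL requires IS NULL to test for NULL.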

Luke
