[om-list] use cases: future features (queries, worldviews, logical statements)
Luke Call
lacall at onemodel.org
Fri Apr 20 09:18:37 EDT 2001
*****future features:
******queries
*******could include trust levels and automatically deprecate things
that are inconsistent with other info in the system, as per Mark's
Aug 18, 2000 comment in old email archive #12:
********I think we should allow users to specify trust levels for different
world views / schemas, and then have the query engine take those levels into
account. A good query engine would also automatically deprecate the trust
level of each piece of any unresolved inconsistency.
*******Tom says there are four kinds of queries (his om-list email of
sometime in March 2001):
I have modified my original ideas about there being three types of
"queries", in two ways.
(1) The third type of "query", called "distance from
prototype/archetype", can and probably should be generalised to
something more like "distance from centroid/centre-of-mass" in the same
mathematical state/classification space (i.e. a functional space where
the domain is a many-dimensional state or list of features, and where
the range is classificational values for many classes). But if we model
psychology well enough, the centre of mass should turn out to be very
close to the conceptual "prototype" anyway.
(2) There is a fourth type of query. Just as the third query is a fuzzy
version of the first type (the first type being the idealistic-inductive
or extensional-definition-based query), the fourth type is a fuzzy
version of the second type (the second type being the
idealistic-deductive or intensional-definition-based query).
More about all this later.
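Tom's "distance from centroid" query could be sketched roughly as follows. This is only an illustration under my own assumptions (the function names `centroid`, `distance`, and `classify`, the toy feature vectors, and the use of plain Euclidean distance are all mine, not from the list discussion):

```python
import math

def centroid(vectors):
    """Mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(item, classes):
    """Rank candidate classes by the item's distance from each class centroid."""
    scored = [(distance(item, centroid(members)), name)
              for name, members in classes.items()]
    return sorted(scored)

# Two hypothetical classes, each described by a few 2-D feature vectors.
classes = {
    "bird":   [[0.9, 0.1], [0.8, 0.2]],
    "mammal": [[0.1, 0.9], [0.2, 0.8]],
}
ranking = classify([0.85, 0.15], classes)  # nearest centroid ranks first
```

The centroid here is the "centre of mass" of each class's known members; a fuzzy classification falls out naturally, since the ranking gives a graded distance to every class rather than a single yes/no answer.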
My next suggestion is "semi-symbolic" (semi-symbolic/semi-geometric is
how I think of this), or, in other words, graph theory. Mark, I strongly
suggest we/you look into what, in graph theory, are called
"isomorphisms" and "homeomorphisms". They are very similar to a part of
this process of identification. Two reasons: (1) if there are algorithms
for determining homeomorphisms in weighted, directed graphs, then our
job is easier. (2) If not, then our job will involve more work, but it
will be work that will advance the field of graph theory, and should be
published.
The way I see it, the process of homeomorphism-finding should be
specialised to apply to a graph whose edges (internodes) are not just
weighted and directed, but are of differing sub-types, corresponding to
different dimensions in space; and every relation or attribute in the
database (such as your GEDCOM files) would be encoded as nodes and
internodes.
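For small graphs, the plain (unweighted, untyped) isomorphism check Tom mentions can be brute-forced by trying every node mapping. This sketch is mine, not from the discussion, and it handles only directed-edge isomorphism, not the weighted, typed edges or the homeomorphisms Tom actually proposes:

```python
from itertools import permutations

def isomorphic(g1, g2):
    """Brute-force isomorphism test for two small directed graphs.
    Each graph is a dict mapping a node to the set of nodes it points to."""
    if len(g1) != len(g2):
        return False
    nodes1, nodes2 = list(g1), list(g2)
    edges2 = {(u, v) for u, vs in g2.items() for v in vs}
    for perm in permutations(nodes2):
        mapping = dict(zip(nodes1, perm))
        mapped = {(mapping[u], mapping[v])
                  for u, vs in g1.items() for v in vs}
        if mapped == edges2:
            return True
    return False

a = {"x": {"y"}, "y": {"z"}, "z": set()}   # path: x -> y -> z
b = {"1": set(), "2": {"1"}, "3": {"2"}}   # path: 3 -> 2 -> 1
```

The factorial cost makes this usable only as a toy; the database graphs Tom describes would need real subgraph-matching algorithms, which is exactly why he suggests checking what graph theory already offers.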
******adding worldviews with certainty levels:
*******Problem: Specifying differential certainty among multiple
relations, and dependency among certainty sets. I think this is
probably more what you (Luke) were talking about, but I can't think of a
very effective example to use. It would illustrate instances when two
relations had the same certainty assigned to them because the knowledge
which they encoded came from the same source. If one relation was found
to be wrong, then the other would also be wrong. They would change
certainty together. Solution: inclusion of "worldviews" into the
model. A worldview is assigned a certainty, and all relations would be
classified as belonging to this or that worldview (and perhaps more than
one -- redundancy is very good in promoting certainty). (from
www.onemodel.org/maillist/oldlist-volume00-5).
Then Mark said:
"I agree, but an even better enhancement is to derive the drop in
certainty of relation 2 by traversing the co-derivation of relation 1
and relation 2. That way you can better determine whether relation 2 is
a mistake or whether the whole worldview is not to be trusted.
By the way, rather than storing certainty / degree of belief as a number
P between zero and one, storing it in "log likelihood" form as
log(P / (1 - P)) makes it far easier both to read and to do
calculations. Some examples:
Belief Level   Degree of Belief (0 .. 1)
------------   -------------------------
     -4        0.000099 (roughly 10 ^ -4)
     -3        0.000999 (roughly 10 ^ -3)
     -2        0.009901 (roughly 10 ^ -2)
     -1        0.090909 (roughly 10 ^ -1)
      0        0.5000
      1        0.9091 (roughly 1 - (10 ^ -1))
      2        0.9901 (roughly 1 - (10 ^ -2))
      3        0.9990 (roughly 1 - (10 ^ -3))
      4        0.9999 (roughly 1 - (10 ^ -4))
The nice thing about this form is that if you have several independent
pieces of evidence that are individually sufficient to prove a
hypothesis if true, you can just add the logarithmic belief levels and
get a correct net belief level."
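Mark's table uses base-10 log-odds, so the conversion in each direction is short enough to sketch. The function names are mine; the math follows his log(P / (1 - P)) formula directly, and the addition rule assumes the independence he states:

```python
import math

def to_belief_level(p):
    """Convert a probability p in (0, 1) to a base-10 log-odds belief level."""
    return math.log10(p / (1 - p))

def to_probability(level):
    """Convert a base-10 log-odds belief level back to a probability."""
    return 1 / (1 + 10 ** -level)

# Reproducing rows of the table: level 0 is p = 0.5, level 1 is p = 10/11.
# Two independent, individually sufficient pieces of evidence at level 2
# (p ~ 0.99 each) combine by simple addition to level 4 (p ~ 0.9999).
combined_level = 2 + 2
combined_p = to_probability(combined_level)
```

Besides easy combination, the log form makes extreme beliefs readable: "level -4" is much easier to compare at a glance than 0.000099.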
******logical statements embedded in the model (used w/ queries etc too):
*******they are like a "query info cache" to save the system from
repeating the same long queries many times, just to find out that "all
___ are ___" or basic mathematical principles or something
*******they have to be validated
********whether by a background process scanning all statements not
checked since a certain "latest update" date on the individual queries
or the list of them, or by checking them every time new info is entered
into the model.
*******entities referred to in them are regular entities within the model.
*******they are related to contexts within the model: expected to be
valid in some, not others (or valid to a fuzzy-defined extent, as per Tom?)
*******they could be used for constraints on individual entities or
classes of them.
*******used for logical analysis?
*******what language or format are they stored in?
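The "query info cache" and its background revalidation could look something like this sketch. Everything here (`CachedStatement`, `Model`, using a model version counter as the "latest update" date) is my own assumption; the storage format question above remains open:

```python
from dataclasses import dataclass, field

@dataclass
class CachedStatement:
    """A stored logical statement acting as a cache of query results."""
    text: str                 # e.g. 'all parents are older than their children'
    last_validated: int = 0   # model version at which it was last checked

@dataclass
class Model:
    version: int = 0
    statements: list = field(default_factory=list)

    def add_fact(self, fact):
        # Any new info may invalidate cached statements, so bump the version.
        self.version += 1

    def stale_statements(self):
        """Statements not checked since the latest model update."""
        return [s for s in self.statements if s.last_validated < self.version]

    def revalidate(self, statement):
        # A real system would re-run the underlying query here before
        # marking the statement current.
        statement.last_validated = self.version

m = Model()
s = CachedStatement("all parents are older than their children")
m.statements.append(s)
m.add_fact("new birth record")   # s is now stale until revalidated
```

This supports either validation strategy from the list above: a background process can sweep `stale_statements()` periodically, or the model can revalidate eagerly inside `add_fact`.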
[end]