Partial containment of languages
Syntax
A :- B1,... Bn
3 Semantics (first order logic FOL) for datalog programs
Example program
r1 reachable(X,Y) :- link(X,Y).
r2 reachable(X,Y) :- link(X,Z), reachable(Z,Y).
query(X,Y) :- reachable(X,Y).
Logical model theoretic semantics
∀X∀Y (link(X,Y) --> reachable(X,Y))
∀X∀Y∀Z (link(X,Z) ^ reachable(Z,Y) --> reachable(X,Y))
Given source db I and program P, a model of P is instance I’ superset I that satisfies all above logic rules. Of all possible models, there is a minimal model that is smallest.
P(I) = I'
that is polynomial in size of IImmediate Consequence Operator T_P maps I to I’ based on rules in P:
A is an immediate consequence of I, P if:
1. A is a ground atom
2. A :- B1, ..., Bn is
- a ground instance rule
- B1,...Bn are in I
Given atom A, its existance can be derived with a proof tree whose leaves are ground atoms
A = reachable(a,c)
Proof:
link(b,c) given
reachable(b,c) rule
link(a,b) given
reachable(a,c) rule
Set of atoms that have proofs
Negation (!datalog)
The following doesn’t have a unique minimal model
p :- !q
q :- !p
both {p} and {q} are minimal, yet proofs for either are not finite
Semipositive datalog
negation only allowed in rule body on EDB predicates
reachable(X,Y) :- link(X,Y).
reachable(X,Y) :- link(X,Z), reachable(Z,Y).
indirect(X,Y) :- reachable(X,Y), not link(X,Y).
indirect
depends on negated atom
Stratified Negation
Following is not semipositive because not reachable
is over an intensional predicate
R1 reachable(X,Y) :- link(X,Y).
R2 reachable(X,Y) :- link(X,Z), reachable(Z,Y).
R3 node(X) :- link(X,Y).
R4 node(Y) :- link(X,Y).
R5 unreachable(X,Y) :- node(X), node(Y), not reachable(X,Y).
Stratification: P
rewritten as sequence of semipositive programs P1,...Pn
where each all IDBs of each step are “frozen” before starting next step
A :- ...B... if A in Pi, B in Pj -> i ≥ j
A :- ..!B... if A in Pi, B in Pj -> i > j
p :- not q. q :- not p.
is not stratifiableExpressiveness
Stratified !Datalog = polytime queries on ordered databases.
I.e., any query computable in polytime on an ordered database
is expressable by a stratified !Datalog program.
i.e., exactly all tractable queries (theoretical perspective)
Aggregation
reachable(X,Y) :- link(X,Y).
reachable(X,Y) :- reachable(X,Z), link(Z,Y).
summary(X, count<Y>) :- reachable(X,Y).
Challenges
p(X) :- q(X).
p(sum<X>) :- p(X). `p` is an infinite table with no fix point
Stratification is same as for negation with an additional rule:
A :- ...B... if A has agg() and in Pi, B in Pj -> i ≥ j
Extend cycle detection with aggregation to test for non-stratifiability
Aggregate Selection
Don’t completely understand the selection rule:
Selections s1 = path_s1(X,Y,P,C) : groupby(path_s1(X,Y,P,C),[X,Y],min(C))
Functors
usage
path(S,D,P) :- link(S,D), P = [S,D].
path(S,D,Z,P) :- link(S,Z), path(Z,D,Z2,P2), P = [S,P2].
[_,_]
is a list concatonation functor
If we want to split emp(name, salary)
into names(id, name)
and salary(id, salary)
, use a
skolem functor
names(f(name,salary), name) :- emp(name, salary)
salary(f(name, salary), salary) :- emp(name, salary)
Logically, it changes the first order statement
∀x ∀y (emp(x, y) → ∃z names(z, x) ∧ salary(z, y))
And gets rid of the existential ∃
by fixing a value to z = f(x,y)
∀x ∀y (emp(x, y) → names(f(x, y), x) ∧ salary(f(x, y), y))
p
with k
verticies, (p, i)…(p,k) are nodesp :- ...q...
where i’th term in p is same as j’th term in qTopdown vs bottom up
Naive approach straightup recomputes everything all the time
Start with database instance D_0, program P
D_0
until D_i = D_{i-1}
D_i = P(D_{i-1})
SemiNaive leverages fact that P is monotone to only propagate changes
Notice similariy with Naiad
Δ_0 = D_0
until Δ_i is empty
Δ_i = P(D_{i-1}, Δ_{i-1})
Consider the following program. pi is a predicate, qi is a fact (tuple)
P = p1, .. pn, q1,... qm
Let's just say
Q = q1... qm
We know the following, where δi are tuples added to pi after ith iteration
p_{i+1} = p1_i, ..., pn_i, Q
and
p_{i+1} = pi U δi
Thus
p_{i+1} = (p1_{i-1} U δ1_{i-1}), ..., (pn_{i_1} U δn_{i-1}), Q
Given δs for a given table, we can compute the new table as:
p_{i+1} = p_i U Δ_i
Δ_i = δ1_{i-1}, p2, ..., pn_{i_1}, Q
...
Δ_i = p1_{i-1}, ..., δn_{i-1}, Q
where
δ_i = Δ_i - p_i
Since we know that
p_i U Δ_i == δ_i U p_i
New records for S probe T hash table
Q(a, b, c) = S(a, b), T(b, c)
= S_i(a, b) T_i(b, c) U
δS_i(x, y) T_i(y, z) U
S_i(m, n) δT_i(n, o)
There’s a special case here, where p is recursive on p
p = p1, ..., p, ... pn
p_{i+1} = p_i U Δ_i
Δ_i = p1_{i-1}, ..., δp_{i-1}, ... pn_{i-1}
δp_i = δp_{i-1}, Q
Main idea
Example
r1: R(X, Y) = L(X, Y)
r2: R(X, Y) = L(X, Z), R(Z, Y)
Q(X, Y) = R(X, Y)
Q(1, Y)
Terminology
in_R[bf](V)
in_R[bf](X)
denotes head of a rule that binds X
sup^i_j(V)
is the input binding for jth atom of ith rulesup^i_0(V)
is the input binding for left-most atomExample
R(X, Y) = L(X, Z), R(Z, Y)
R(1, Y)
top down info pass:
R(1, Y) = L(1, Z), R(Z, Y)
eval L since it's a base relation
L(1, 2), L(1, 3) matches
R(1, Y) = L(1, 2), R(2, Y)
= L(1, 3), R(3, Y)
R(2, Y) and R(3, Y) are candidate rules now
since they are heads and distinguished
Main idea
Adorned Exampl
Start with adorned program
ar1 R[bf](X, Y) = L[bf](X, Y)
ar2 R[bf](X, Y) = L[bf](X, Z), R[bf](Z, Y)
Create inital binding predicates
in_R[bf](1)
sup^1_0(X) = in_R[bf](X)
sup^1_1(Y) = sup^1_0(X), L[bf](X, Y)
sup^2_0(X) = in_R[bf](X)
sup^2_1(Z) = sup^2_0(X), L[bf](X, Z)
sup^2_2(Y) = sup^2_1(Z), R[bf](Z, Y)
Recursion implies that supplementary relation for R also feeds input bindings
in_R[bf](X) = sup^2_1(X)
Finally, get results
R[bf](X, Y) = sup^1_2(Y)
= sup^2_2(Y)
Semi-naive evaluation naturally filters out irrelevant facts!
Concerns: managing and applying constraints is not free!
Filter-Join for R join S
Project R based on join attrs, filter S using it
R(X, Y), S(Y, Z)
into
R(X, Y), sup(Y), S(Y, Z)
Statelog: Explicit notation of state. State is timestamped logically using natural numbers.
State constant 0, unary function +1, state variable S
The state terms set _S_ is the least set s.t.,
- [0]∈S
- [S]∈S
- [T]∈S --> [T+1]∈S
Atom: [T]p(x1,..,xn)
xi: constant or variable
Literal: Atom or its negation
Rule: [S+k0]H :- [S+k1]B1, ... [S+kn]Bn
Head is Atom, body are literals
Time can be _anything_
Dedalus differs in two ways
k0 = ... = kn
k0 = kj+1, k1 = .. = kn
k0 = kj+n, k1 = .. = kn
where n
is some value greater than 0
p(..)@next :- p(..)