Editing Inductive logic programming (section)

== Approaches to ILP ==
An ''inductive logic programming system'' is a program that takes as an input logic theories <math>B, E^+, E^-</math> and outputs a correct hypothesis {{mvar|H}} with respect to theories <math>B, E^+, E^-</math>. A system is ''complete'' if and only if for any input logic theories <math>B, E^+, E^-</math> any correct hypothesis {{mvar|H}} with respect to these input theories can be found with its hypothesis search procedure.  Inductive logic programming systems can be roughly divided into two classes, search-based and meta-interpretative systems.

Search-based systems exploit that the space of possible clauses forms a [[complete lattice]] under the [[Theta-subsumption|subsumption]] relation, where one clause <math display="inline">C_1</math>  subsumes another clause <math display="inline">C_2</math> if there is a [[Substitution (logic)|substitution]] <math display="inline">\theta</math> such that <math display="inline">C_1\theta</math>, the result of applying <math display="inline">\theta</math> to <math display="inline">C_1</math>, is a subset of <math display="inline">C_2</math>. This lattice can be traversed either bottom-up or top-down.

=== Bottom-up search ===
Bottom-up methods to search the subsumption lattice have been investigated since Plotkin's first work on formalising induction in clausal logic in 1970.<ref name=":02"/><ref>{{cite thesis |first=G.D. |last=Plotkin |title=Automatic Methods of Inductive Inference |date=1970 |type=PhD |publisher=University of Edinburgh |url=https://www.era.lib.ed.ac.uk/bitstream/handle/1842/6656/Plotkin1972.pdf |hdl=1842/6656}}</ref> Techniques used include least general generalisation, based on [[Anti-unification (computer science)|anti-unification]], and inverse resolution, based on inverting the [[Resolution (logic)|resolution]] inference rule.

==== Least general generalisation ====
A ''least general generalisation algorithm'' takes as input two clauses <math display="inline">C_1</math> and <math display="inline">C_2</math> and outputs the least general generalisation of  <math display="inline">C_1</math> and <math display="inline">C_2</math>, that is, a clause  <math display="inline">C</math> that subsumes  <math display="inline">C_1</math> and <math display="inline">C_2</math>, and that is subsumed by every other clause that subsumes  <math display="inline">C_1</math> and <math display="inline">C_2</math>. The least general generalisation can be computed by first computing all ''selections'' from <math display="inline">C</math> and <math display="inline">D</math>, which are pairs of literals  <math>(L,M) \in (C_1, C_2)</math> sharing the same predicate symbol and negated/unnegated status.  Then, the least general generalisation is obtained as the disjunction of the least general generalisations of the individual selections, which can be obtained by [[first-order syntactical anti-unification]].<ref>{{Cite book |last1=Nienhuys-Cheng |first1=Shan-hwei |title=Foundations of inductive logic programming |last2=Wolf |first2=Ronald de |date=1997 |publisher=Springer |isbn=978-3-540-62927-6 |series=Lecture notes in computer science Lecture notes in artificial intelligence |location=Berlin Heidelberg |page=255}}</ref>

To account for background knowledge, inductive logic programming systems employ ''relative least general generalisations'', which are defined in terms of subsumption relative to a background theory. In general, such relative least general generalisations are not guaranteed to exist; however, if the background theory ''{{mvar|B}}'' is a finite set of [[Ground expression|ground]] [[Literal (mathematical logic)|literals]], then the negation of ''{{mvar|B}}'' is itself a clause. In this case, a relative least general generalisation can be computed by disjoining the negation of ''{{mvar|B}}'' with both <math display="inline">C_1</math> and <math display="inline">C_2</math> and then computing their least general generalisation as before.<ref>{{Cite book |last1=Nienhuys-Cheng |first1=Shan-hwei |title=Foundations of inductive logic programming |last2=Wolf |first2=Ronald de |date=1997 |publisher=Springer |isbn=978-3-540-62927-6 |series=Lecture notes in computer science Lecture notes in artificial intelligence |location=Berlin Heidelberg |page=286}}</ref>

Relative least general generalisations are the foundation of the bottom-up system [[Golem (ILP)|Golem]].<ref name=":12"/><ref name="Springer/Ohmsha"/>

==== Inverse resolution ====
Inverse resolution is an [[inductive reasoning]] technique that involves [[wiktionary:invert|inverting]] the [[Resolution (logic)|resolution operator]].

Inverse resolution takes information about the [[Resolvent (logic)|resolvent]] of a resolution step to compute possible resolving clauses. Two types of inverse resolution operator are in use in inductive logic programming: V-operators and W-operators. A V-operator takes clauses <math display="inline">R</math> and <math display="inline">C_1</math>as input and returns a clause <math display="inline">C_2</math> such that <math display="inline">R</math> is the resolvent of <math display="inline">C_1</math> and  <math display="inline">C_2</math>. A W-operator takes two clauses <math display="inline">R_1</math> and <math display="inline">R_2</math> and returns three clauses <math display="inline">C_1</math>, <math display="inline">C_2</math> and <math display="inline">C_3</math> such that <math display="inline">R_1</math> is the resolvent of <math display="inline">C_1</math> and <math display="inline">C_2</math> and <math display="inline">R_2</math> is the resolvent of  <math display="inline">C_2</math> and <math display="inline">C_3</math>.<ref name="invres">{{Cite book |last1=Nienhuys-Cheng |first1=Shan-hwei |title=Foundations of inductive logic programming |last2=Wolf |first2=Ronald de |date=1997 |publisher=Springer |isbn=978-3-540-62927-6 |series=Lecture notes in computer science Lecture notes in artificial intelligence |location=Berlin Heidelberg |page=197}}</ref>

Inverse resolution was first introduced by [[Stephen Muggleton]] and Wray Buntine in 1988 for use in the inductive logic programming system Cigol.<ref name="Proceedings of the 5th Internationa"/> By 1993, this spawned a surge of research into inverse resolution operators and their properties.<ref name="invres" />

=== Top-down search ===
The ILP systems Progol,<ref name=":2" /> Hail <ref>{{cite book |last1=Ray |first1=O. |url= |title=Proceedings of the 13th international conference on inductive logic programming |last2=Broda |first2=K. |last3=Russo |first3=A.M. |date=2003 |publisher=Springer |isbn=978-3-540-39917-9 |series=LNCS |volume=2835 |pages=311–328 |chapter=Hybrid abductive inductive learning |citeseerx=10.1.1.212.6602 |doi=10.1007/978-3-540-39917-9_21 |chapter-url=https://link.springer.com/chapter/10.1007/978-3-540-39917-9_21}}</ref> and Imparo <ref>{{cite book |last1=Kimber |first1=T. |title=Proceedings of the 10th international conference on logic programing and nonmonotonic reasoning |last2=Broda |first2=K. |last3=Russo |first3=A. |date=2009 |publisher=Springer |isbn=978-3-642-04238-6 |series=LNCS |volume=575 |pages=169–181 |chapter=Induction on failure: learning connected Horn theories |doi=10.1007/978-3-642-04238-6_16 |chapter-url=https://link.springer.com/chapter/10.1007/978-3-642-04238-6_16}}</ref> find a hypothesis {{mvar|H}} using the principle of the '''inverse entailment'''<ref name=":2" /> for theories {{mvar|B}}, {{mvar|E}}, {{mvar|H}}: <math>B \land H \models E \iff B \land \neg E \models \neg H</math>. First they construct an intermediate theory {{mvar|F}} called a bridge theory satisfying the conditions <math>B \land \neg E \models F</math> and <math>F \models \neg H</math>. Then as <math>H \models \neg F</math>, they generalize the negation of the bridge theory {{mvar|F}} with anti-entailment.<ref>{{cite journal |last1=Yamamoto |first1=Yoshitaka |last2=Inoue |first2=Katsumi |last3=Iwanuma |first3=Koji |year=2012 |title=Inverse subsumption for complete explanatory induction |url=https://link.springer.com/content/pdf/10.1007/s10994-011-5250-y.pdf |journal=Machine Learning |volume=86 |pages=115–139 |doi=10.1007/s10994-011-5250-y |s2cid=11347607}}</ref> However, the operation of anti-entailment is computationally more expensive since it is highly nondeterministic. Therefore, an alternative hypothesis search can be conducted using the inverse subsumption (anti-subsumption) operation instead, which is less non-deterministic than anti-entailment.

Questions of completeness of a hypothesis search procedure of specific inductive logic programming system arise. For example, the Progol hypothesis search procedure based on the inverse entailment inference rule is not complete by ''Yamamoto's example''.<ref>{{cite book |last=Yamamoto |first=Akihiro |url= |title=International Conference on Inductive Logic Programming |date=1997 |publisher=Springer |isbn=978-3-540-69587-5 |series=Lecture Notes in Computer Science |volume=1297 |location= |pages=296–308 |chapter=Which hypotheses can be found with inverse entailment? |citeseerx=10.1.1.54.2975 |doi=10.1007/3540635149_58 |chapter-url=https://link.springer.com/chapter/10.1007/3540635149_58}}</ref> On the other hand, Imparo is complete by both anti-entailment procedure <ref name="kimber2009induction">{{cite thesis |first=Timothy |last=Kimber |title=Learning definite and normal logic programs by induction on failure |date=2012 |type=PhD |publisher=Imperial College London |url=https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.560694 |id=ethos 560694 |access-date=2022-10-21 |archive-date=2022-10-21 |archive-url=https://web.archive.org/web/20221021035457/https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.560694 |url-status=dead}}</ref> and its extended inverse subsumption <ref>{{cite arXiv |eprint=1407.3836 |class=cs.AI |first=David |last=Toth |title=Imparo is complete by inverse subsumption |date=2014}}</ref> procedure.

=== Metainterpretive learning ===
Rather than explicitly searching the hypothesis graph, metainterpretive or ''meta-level'' systems encode the inductive logic programming program as a meta-level logic program which is then solved to obtain an optimal hypothesis. Formalisms used to express the problem specification include [[Prolog]] and [[answer set programming]], with existing Prolog systems and answer set solvers used for solving the constraints.<ref name=":0">{{Cite journal |last1=Cropper |first1=Andrew |last2=Dumančić |first2=Sebastijan |date=2022-06-15 |title=Inductive Logic Programming At 30: A New Introduction |journal=Journal of Artificial Intelligence Research |volume=74 |page=795 |doi=10.1613/jair.1.13507 |issn=1076-9757|doi-access=free |arxiv=2008.07912 }}</ref>

And example of a Prolog-based system is [[Metagol]], which is based on a [[Meta-interpreters in Prolog|meta-interpreter in Prolog]], while ASPAL and ILASP are based on an encoding of the inductive logic programming problem in answer set programming.<ref name=":0" />

=== Evolutionary learning ===
[[Evolutionary algorithm]]s in ILP use a population-based approach to evolve hypotheses, refining them through selection, crossover, and mutation. Methods like [[EvoLearner]] have been shown to outperform traditional approaches on structured machine learning benchmarks. <ref>{{cite conference |last1=Heindorf |first1=Stefan |last2=Blübaum |first2=Lukas |last3=Düsterhus |first3= Nick |last4=Werner |first4=Till |last5=Golani |first5=Varun Nandkumar |last6=Demir |first6=Caglar |last7=Ngonga Ngomo |first7=Axel-Cyrille |title=EvoLearner: Learning Description Logics with Evolutionary Algorithms |conference=WWW |date=2022|arxiv=2111.04879 }}</ref>