== Definition ==
[[File:Association Rule Mining Venn Diagram.png|thumb|A Venn diagram showing the associations between itemsets X and Y of a dataset. All transactions that contain item X are located in the white, left portion of the circle, while those containing Y are colored red and on the right. Transactions containing both X and Y are located in the middle and are colored pink.

Several measures can be read from this diagram. For example, dividing the number of transactions in the pink section by the total number of transactions in the dataset gives the support. To obtain the confidence, one divides the number of transactions in the middle (pink) by the number of all transactions that contain Y (red and pink).

In this case, Y is the antecedent and X is the consequent.

]]

Following the original definition by Agrawal, Imieliński, and Swami,<ref name="mining" /> the problem of association rule mining is defined as follows:

Let <math>I=\{i_1, i_2,\ldots,i_n\}</math> be a set of {{mvar|n}} binary attributes called ''items''.

Let <math>D = \{t_1, t_2, \ldots, t_m\}</math> be a set of transactions called the ''database''.

Each ''transaction'' in {{mvar|D}} has a unique transaction ID and contains a subset of the items in {{mvar|I}}.
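
As a minimal sketch of this data model, the item set and the database can be represented with Python sets. The item names, the transaction IDs, and the helper functions <code>is_valid_database</code> and <code>to_binary_row</code> are hypothetical and chosen only for illustration; they are not part of the original definition.

```python
# Hypothetical item set I (illustrative names, not taken from the source).
I = frozenset({"milk", "bread", "butter", "beer"})

# Hypothetical database D: each unique transaction ID maps to a subset of I.
D = {
    1: frozenset({"milk", "bread"}),
    2: frozenset({"butter"}),
    3: frozenset({"beer"}),
    4: frozenset({"milk", "bread", "butter"}),
    5: frozenset({"bread"}),
}


def is_valid_database(items, database):
    """Return True if every transaction contains only items from `items`."""
    return all(transaction <= items for transaction in database.values())


def to_binary_row(items, transaction):
    """Encode a transaction as binary attributes, one per item (sorted by name)."""
    return [1 if item in transaction else 0 for item in sorted(items)]
```

Here each transaction is a set, so the subset test <code>transaction <= items</code> checks the requirement that a transaction contains only items from {{mvar|I}}, and <code>to_binary_row</code> makes the "binary attribute" view of the same data explicit.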

A ''rule'' is defined as an implication of the form:

:<math>X \Rightarrow Y</math>, where <math>X, Y \subseteq I</math>.

In the original definition by Agrawal, Imieliński, and Swami,<ref name="mining" /> a ''rule'' is defined only between a set and a single item: <math>X \Rightarrow i_j</math> for <math>i_j \in I</math>.

Every rule is composed of two different sets of items, also known as ''itemsets'', {{mvar|X}} and {{mvar|Y}}, where {{mvar|X}} is called the ''antecedent'' or left-hand side (LHS) and {{mvar|Y}} the ''consequent'' or right-hand side (RHS). The antecedent is the itemset found in the data, while the consequent is the itemset found in combination with the antecedent. The statement <math>X \Rightarrow Y</math> is often read as ''if {{mvar|X}} then {{mvar|Y}}'', where the antecedent ({{mvar|X}}) is the ''if'' and the consequent ({{mvar|Y}}) is the ''then''. This implies that, in theory, whenever {{mvar|X}} occurs in a transaction, {{mvar|Y}} occurs in it as well.
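
The support and confidence described in the figure caption can be sketched for a rule <math>X \Rightarrow Y</math> as follows. This is an illustrative sketch using the standard definitions, not a reference implementation; the transactions and the function names <code>support</code> and <code>confidence</code> are assumptions made for the example.

```python
def support(itemset, transactions):
    """Fraction of all transactions that contain every item in `itemset`."""
    matching = sum(1 for t in transactions if itemset <= t)
    return matching / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of the union of both itemsets divided by support of the antecedent.

    Assumes the antecedent occurs in at least one transaction;
    otherwise the division below raises ZeroDivisionError.
    """
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


# Hypothetical transactions, chosen only for illustration.
transactions = [
    frozenset({"milk", "bread"}),
    frozenset({"butter"}),
    frozenset({"beer"}),
    frozenset({"milk", "bread", "butter"}),
    frozenset({"bread"}),
]

X = frozenset({"milk"})   # antecedent (LHS)
Y = frozenset({"bread"})  # consequent (RHS)

rule_support = support(X | Y, transactions)          # 2 of 5 transactions
rule_confidence = confidence(X, Y, transactions)     # every "milk" transaction has "bread"
reverse_confidence = confidence(Y, X, transactions)  # only 2 of 3 "bread" transactions have "milk"
```

In this hypothetical data the rule ''milk'' ⇒ ''bread'' holds in every transaction containing milk, while the reverse rule does not, which illustrates that swapping antecedent and consequent generally changes the confidence but not the support.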