Indicator function


File:Indicator function illustration.png

A three-dimensional plot of an indicator function, shown over a square two-dimensional domain (set <math>X</math>): the "raised" portion overlays those two-dimensional points which are members of the "indicated" subset (<math>A</math>).

In mathematics, an indicator function or a characteristic function of a subset of a set is a function that maps elements of the subset to one, and all other elements to zero. That is, if <math>A</math> is a subset of some set <math>X</math>, then the indicator function of <math>A</math> is the function <math>\mathbf{1}_A</math> defined by <math>\mathbf{1}_{A}\!(x) = 1</math> if <math>x \in A,</math> and <math>\mathbf{1}_{A}\!(x) = 0</math> otherwise. Other common notations are <math>I_A</math> and <math>\chi_A.</math>

The indicator function of <math>A</math> is the Iverson bracket of the property of belonging to <math>A</math>; that is,

<math display="block">\mathbf{1}_{A}(x) = \left[\ x\in A\ \right].</math>

For example, the Dirichlet function is the indicator function of the rational numbers as a subset of the real numbers.

Definition

Given an arbitrary set <math>X</math>, the indicator function of a subset <math>A</math> of <math>X</math> is the function <math display=block>\mathbf{1}_A \colon X \to \{ 0, 1 \}</math> defined by <math display="block" qid="Q371983">\operatorname\mathbf{1}_A\!( x ) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \,. \end{cases} </math>

The Iverson bracket provides the equivalent notation <math>\left[\ x\in A\ \right]</math> or ⟦ x ∈ A ⟧ that can be used instead of <math>\mathbf{1}_{A}\!(x).</math>

The function <math>\mathbf{1}_A</math> is sometimes denoted <math>I_A</math>, <math>\chi_A</math>, or even just <math>A</math>.
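The definition above can be sketched in Python for a finite set. This is only an illustration: the helper name `indicator` and the particular sets `X` and `A` are assumptions chosen for the example, not part of the definition.

```python
from typing import Callable, Hashable, Iterable


def indicator(subset: Iterable[Hashable]) -> Callable[[Hashable], int]:
    """Return the indicator function 1_A of the finite set A = subset."""
    members = frozenset(subset)
    # int(...) converts the truth value of "x in A" to 1 or 0,
    # mirroring the Iverson bracket [x in A].
    return lambda x: int(x in members)


X = range(10)        # ambient set (illustrative choice)
A = {2, 3, 5, 7}     # subset of X (illustrative choice)
one_A = indicator(A)

values = [one_A(x) for x in X]
print(values)  # → [0, 0, 1, 1, 0, 1, 0, 1, 0, 0]
```

Converting a Boolean membership test to an integer is exactly the Iverson-bracket reading of the indicator function.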

Notation and terminology

The notation <math>\chi_A</math> is also used to denote the characteristic function in convex analysis, which takes the value <math>0</math> on <math>A</math> and <math>+\infty</math> elsewhere, as if the reciprocal of the standard definition of the indicator function were used.

A related concept in statistics is that of a dummy variable. (This must not be confused with "dummy variables" as that term is usually used in mathematics, also called a bound variable.)

The term "characteristic function" has an unrelated meaning in classic probability theory. For this reason, traditional probabilists use the term indicator function for the function defined here almost exclusively, while mathematicians in other fields are more likely to use the term characteristic function to describe the function that indicates membership in a set.

In fuzzy logic and modern many-valued logic, predicates are the characteristic functions of a probability distribution. That is, the strict true/false valuation of the predicate is replaced by a quantity interpreted as the degree of truth.

Basic properties

The indicator or characteristic function of a subset <math>A</math> of some set <math>X</math> maps elements of <math>X</math> to the codomain <math>\{0,\, 1\}.</math>

This mapping is surjective only when <math>A</math> is a non-empty proper subset of <math>X</math>. If <math>A = X,</math> then <math>\mathbf{1}_A \equiv 1.</math> By a similar argument, if <math>A = \emptyset</math> then <math>\mathbf{1}_A \equiv 0.</math>

If <math>A</math> and <math>B</math> are two subsets of <math>X,</math> then <math display=block>\begin{align} \mathbf{1}_{A\cap B}(x) ~&=~ \min\bigl\{\mathbf{1}_A(x),\ \mathbf{1}_B(x)\bigr\} ~~=~ \mathbf{1}_A(x) \cdot\mathbf{1}_B(x), \\ \mathbf{1}_{A\cup B}(x) ~&=~ \max\bigl\{\mathbf{1}_A(x),\ \mathbf{1}_B(x)\bigr\} ~=~ \mathbf{1}_A(x) + \mathbf{1}_B(x) - \mathbf{1}_A(x) \cdot \mathbf{1}_B(x)\,, \end{align}</math>

and the indicator function of the complement of <math>A</math>, i.e. <math>A^\complement</math>, is <math display=block>\mathbf{1}_{A^\complement} = 1 - \mathbf{1}_A.</math>
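The intersection, union and complement identities can be checked exhaustively on a small finite example. The sketch below assumes an illustrative ambient set `X` and subsets `A`, `B`; the helper `ind` is a hypothetical name.

```python
X = set(range(12))                      # ambient set (illustrative)
A = {x for x in X if x % 2 == 0}        # even numbers in X
B = {x for x in X if x % 3 == 0}        # multiples of 3 in X


def ind(S):
    """Indicator function of the set S."""
    return lambda x: int(x in S)


one_A, one_B = ind(A), ind(B)

# 1_{A ∩ B} = min(1_A, 1_B) = 1_A * 1_B
intersection_ok = all(
    ind(A & B)(x) == min(one_A(x), one_B(x)) == one_A(x) * one_B(x)
    for x in X
)

# 1_{A ∪ B} = max(1_A, 1_B) = 1_A + 1_B - 1_A * 1_B
union_ok = all(
    ind(A | B)(x)
    == max(one_A(x), one_B(x))
    == one_A(x) + one_B(x) - one_A(x) * one_B(x)
    for x in X
)

# 1_{complement of A} = 1 - 1_A
complement_ok = all(ind(X - A)(x) == 1 - one_A(x) for x in X)

print(intersection_ok, union_ok, complement_ok)  # → True True True
```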

More generally, suppose <math>A_1, \dotsc, A_n</math> is a collection of subsets of <math>X</math>, indexed by <math>I = \{1, 2, \dotsc, n\}</math>. For any <math>x \in X:</math>

<math display=block> \prod_{k \in I} \left(\ 1 - \mathbf{1}_{A_k}\!\left( x \right)\ \right)</math>

is a product of 0s and 1s. This product has the value 1 at precisely those <math>x \in X</math> that belong to none of the sets <math>A_k</math> and is 0 otherwise. That is

<math display=block> \prod_{k \in I} ( 1 - \mathbf{1}_{A_k}) = \mathbf{1}_{X - \bigcup_{k} A_k} = 1 - \mathbf{1}_{\bigcup_{k} A_k}.</math>

Expanding the product on the left hand side,

<math display=block>\mathbf{1}_{\bigcup_{k} A_k}= 1 - \sum_{F \subseteq \{1, 2, \dotsc, n\}} (-1)^{|F|} \mathbf{1}_{\bigcap_F A_k} = \sum_{\emptyset \neq F \subseteq \{1, 2, \dotsc, n\}} (-1)^{|F|+1} \mathbf{1}_{\bigcap_F A_k}</math>

where <math>|F|</math> is the cardinality of <math>F</math>. This is one form of the principle of inclusion-exclusion.
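The inclusion-exclusion form of the union indicator can be verified pointwise on a finite example. The following is a minimal sketch; the three sets and the function names are illustrative assumptions.

```python
from itertools import combinations
from math import prod

X = set(range(30))                       # ambient set (illustrative)
sets = [
    {x for x in X if x % 2 == 0},        # A_1
    {x for x in X if x % 3 == 0},        # A_2
    {x for x in X if x % 5 == 0},        # A_3
]
union = set.union(*sets)


def ind(S, x):
    """Value of the indicator function of S at x."""
    return int(x in S)


def union_indicator_by_inclusion_exclusion(x):
    """Sum over non-empty F of (-1)^(|F|+1) * 1_{intersection of A_k, k in F}(x)."""
    n = len(sets)
    total = 0
    for size in range(1, n + 1):
        for F in combinations(range(n), size):
            intersection = set.intersection(*(sets[k] for k in F))
            total += (-1) ** (size + 1) * ind(intersection, x)
    return total


# Product form: prod_k (1 - 1_{A_k}) = 1 - 1_{union}
product_form_ok = all(
    prod(1 - ind(A_k, x) for A_k in sets) == 1 - ind(union, x) for x in X
)

# Expanded (inclusion-exclusion) form
expanded_form_ok = all(
    union_indicator_by_inclusion_exclusion(x) == ind(union, x) for x in X
)

print(product_form_ok, expanded_form_ok)  # → True True
```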

As suggested by the previous example, the indicator function is a useful notational device in combinatorics. The notation is used in other places as well, for instance in probability theory: if <math>X</math> is a probability space with probability measure <math>\mathbb{P}</math> and <math>A</math> is a measurable set, then <math>\mathbf{1}_A</math> becomes a random variable whose expected value is equal to the probability of <math>A</math>:

<math display=block>\operatorname\mathbb{E}_X\left\{\ \mathbf{1}_A(x)\ \right\}\ =\ \int_{X} \mathbf{1}_A( x )\ \operatorname{d\ \mathbb{P} }(x) = \int_{A} \operatorname{d\ \mathbb{P} }(x) = \operatorname\mathbb{P}(A).</math>

This identity is used in a simple proof of Markov's inequality.
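As a hedged illustration of how the identity enters the proof of Markov's inequality: for a non-negative random variable <math>Y</math> and <math>a > 0</math>, the pointwise bound <math>a \cdot \mathbf{1}_{\{Y \ge a\}} \le Y</math> gives <math>a\,\mathbb{P}(Y \ge a) \le \mathbb{E}(Y)</math> after taking expectations. The sketch below checks each step exactly on a fair six-sided die; the sample space, the variable `Y` and the threshold `a` are assumptions made for the example.

```python
from fractions import Fraction

omega = range(1, 7)                              # outcomes of a fair die
prob = {w: Fraction(1, 6) for w in omega}        # uniform probability measure


def expectation(f):
    """Expected value of f over the finite probability space."""
    return sum(f(w) * prob[w] for w in omega)


def Y(w):
    """Non-negative random variable: the face value itself."""
    return w


a = 4


def ind_Y_ge_a(w):
    """Indicator of the event {Y >= a}."""
    return int(Y(w) >= a)


p_event = sum(prob[w] for w in omega if Y(w) >= a)

# Step 1: E[1_A] = P(A)
expectation_is_probability = expectation(ind_Y_ge_a) == p_event

# Step 2: a * 1_{Y >= a} <= Y holds pointwise
pointwise_bound = all(a * ind_Y_ge_a(w) <= Y(w) for w in omega)

# Step 3: taking expectations gives Markov's inequality
markov_holds = a * p_event <= expectation(Y)

print(p_event, expectation(Y))  # → 1/2 7/2
```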

In many cases, such as order theory, the inverse of the indicator function may be defined. This is commonly called the generalized Möbius function, as a generalization of the inverse of the indicator function in elementary number theory, the Möbius function. (See paragraph below about the use of the inverse in classical recursion theory.)

Mean, variance and covariance

Given a probability space <math>\textstyle (\Omega, \mathcal F, \operatorname{P})</math> with <math>A \in \mathcal F,</math> the indicator random variable <math>\mathbf{1}_A \colon \Omega \rightarrow \mathbb{R}</math> is defined by <math>\mathbf{1}_A (\omega) = 1 </math> if <math> \omega \in A,</math> otherwise <math>\mathbf{1}_A (\omega) = 0.</math>

Mean: <math>\ \operatorname\mathbb{E}(\mathbf{1}_A (\omega)) = \operatorname\mathbb{P}(A)\ </math> (also called "Fundamental Bridge").

Variance: <math>\ \operatorname{Var}(\mathbf{1}_A (\omega)) = \operatorname\mathbb{P}(A)(1 - \operatorname\mathbb{P}(A)).</math>

Covariance: <math>\ \operatorname{Cov}(\mathbf{1}_A (\omega), \mathbf{1}_B (\omega)) = \operatorname\mathbb{P}(A \cap B) - \operatorname\mathbb{P}(A) \operatorname\mathbb{P}(B).</math>
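The three formulas can be checked against the definitions of mean, variance and covariance on a small finite probability space. This is a minimal sketch using exact rational arithmetic; the die model and the events `A` and `B` are illustrative assumptions.

```python
from fractions import Fraction

omega = range(1, 7)                              # outcomes of a fair die
prob = {w: Fraction(1, 6) for w in omega}
A = {2, 4, 6}                                    # "even" event
B = {4, 5, 6}                                    # "at least four" event


def P(event):
    """Probability of an event (a set of outcomes)."""
    return sum(prob[w] for w in event)


def E(f):
    """Expected value of a random variable f."""
    return sum(f(w) * prob[w] for w in omega)


def one(event):
    """Indicator random variable of an event."""
    return lambda w: int(w in event)


mean_A = E(one(A))
# Variance and covariance computed directly from their definitions.
var_A = E(lambda w: (one(A)(w) - mean_A) ** 2)
cov_AB = E(lambda w: (one(A)(w) - E(one(A))) * (one(B)(w) - E(one(B))))

print(mean_A, var_A, cov_AB)  # → 1/2 1/4 1/12
```

Here <math>\mathbb{P}(A \cap B) = 1/3</math> and <math>\mathbb{P}(A)\,\mathbb{P}(B) = 1/4</math>, so the covariance <math>1/12</math> agrees with the closed form.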

Characteristic function in recursion theory, Gödel's and Kleene's representing function

Kurt Gödel described the representing function in his 1934 paper "On undecidable propositions of formal mathematical systems" (the symbol "¬" indicates logical inversion, i.e. "NOT"):<ref name=Martin-1965>Template:Cite book</ref>

There shall correspond to each class or relation <math>R</math> a representing function <math>\phi(x_1, \ldots x_n) = 0</math> if <math>R(x_1,\ldots x_n)</math> and <math>\phi(x_1,\ldots x_n) = 1</math> if <math>\neg R(x_1,\ldots x_n).</math>

Kleene offers up the same definition in the context of the primitive recursive functions: a function <math>\phi</math> of a predicate <math>P</math> takes on the value 0 if the predicate is true and 1 if the predicate is false.<ref name=Kleene1952>Template:Cite book</ref>

For example, because the product of characteristic functions <math>\phi_1 * \phi_2 * \cdots * \phi_n = 0</math> whenever any one of the functions equals 0, it plays the role of logical OR: IF <math>\phi_1 = 0\ </math> OR <math>\ \phi_2 = 0</math> OR ... OR <math>\phi_n = 0</math> THEN their product is 0. What appears to the modern reader as the representing function's logical inversion, i.e. the representing function is 0 when the function <math>R</math> is "true" or "satisfied", plays a useful role in Kleene's definition of the logical functions OR, AND, and IMPLY,<ref name=Kleene1952 /> the bounded-<ref name=Kleene1952 /> and unbounded-<ref name=Kleene1952 /> mu operators and the CASE function.<ref name=Kleene1952 />
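The claim that the product of representing functions behaves as logical OR can be checked by exhausting all truth assignments. The sketch below uses the inverted convention described above (0 for true, 1 for false); the function name `representing` is a hypothetical helper, not notation from Gödel or Kleene.

```python
from itertools import product
from math import prod


def representing(truth: bool) -> int:
    """Representing function in the inverted convention: 0 if true, 1 if false."""
    return 0 if truth else 1


# For every assignment of three predicates, the product of the representing
# values equals the representing value of their disjunction (OR).
product_is_or = all(
    prod(representing(t) for t in truths) == representing(any(truths))
    for truths in product([False, True], repeat=3)
)

print(product_is_or)  # → True
```

The product is 1 only when every factor is 1, that is, when every predicate is false, which is exactly when the disjunction is false.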

Characteristic function in fuzzy set theory

In classical mathematics, characteristic functions of sets only take values 1 (members) or 0 (non-members). In fuzzy set theory, characteristic functions are generalized to take value in the real unit interval [0, 1], or more generally, in some algebra or structure (usually required to be at least a poset or lattice). Such generalized characteristic functions are more usually called membership functions, and the corresponding "sets" are called fuzzy sets. Fuzzy sets model the gradual change in the membership degree seen in many real-world predicates like "tall", "warm", etc.
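The contrast between a crisp indicator and a membership function can be sketched for the predicate "tall". The thresholds (160 cm, 175 cm and 190 cm) and the piecewise-linear shape are purely illustrative assumptions; fuzzy set theory does not prescribe them.

```python
def tall_crisp(height_cm: float) -> int:
    """Classical indicator: 1 if at least 175 cm (assumed cutoff), else 0."""
    return int(height_cm >= 175)


def tall_fuzzy(height_cm: float) -> float:
    """Membership degree in [0, 1]: ramps linearly from 160 cm to 190 cm
    (assumed thresholds)."""
    low, high = 160.0, 190.0
    if height_cm <= low:
        return 0.0
    if height_cm >= high:
        return 1.0
    return (height_cm - low) / (high - low)


for h in (150, 160, 175, 190, 200):
    print(h, tall_crisp(h), tall_fuzzy(h))
```

The crisp version jumps from 0 to 1 at a single height, while the membership function changes gradually across the interval.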

Smoothness

In general, the indicator function of a set is not smooth; it is continuous if and only if its support is a connected component. In the algebraic geometry of finite fields, however, every affine variety admits a (Zariski) continuous indicator function.<ref>Template:Cite book</ref> Given a finite set of functions <math>f_\alpha \in \mathbb{F}_q\left[\ x_1, \ldots, x_n\right]</math> let <math>V = \bigl\{\ x \in \mathbb{F}_q^n : f_\alpha(x) = 0\ \bigr\}</math> be their vanishing locus. Then, the function <math display="inline">\mathbb{P}(x) = \prod\left(\ 1 - f_\alpha(x)^{q-1}\right)</math> acts as an indicator function for <math>V.</math> If <math>x \in V</math> then <math>\mathbb{P}(x) = 1,</math> otherwise, for some <math>f_\alpha,</math> we have <math>f_\alpha(x) \neq 0</math> which implies that <math>f_\alpha(x)^{q-1} = 1,</math> hence <math>\mathbb{P}(x) = 0.</math>
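The finite-field construction can be tried out in the special case where <math>q</math> is prime, so that arithmetic in <math>\mathbb{F}_q</math> is ordinary arithmetic modulo <math>q</math> (a prime power would need genuine field arithmetic, which this sketch does not implement). The choice <math>q = 5</math>, <math>n = 2</math> and the two polynomials are illustrative assumptions.

```python
from itertools import product

q = 5  # assumed prime, so integers mod q form the field F_q

# Two illustrative polynomials in F_q[x, y].
polys = [
    lambda x, y: (x + y) % q,        # f_1(x, y) = x + y
    lambda x, y: (x * x - y) % q,    # f_2(x, y) = x^2 - y
]


def P(x, y):
    """Product over alpha of (1 - f_alpha(x, y)^(q-1)), computed mod q."""
    result = 1
    for f in polys:
        result = result * (1 - pow(f(x, y), q - 1, q)) % q
    return result


points = list(product(range(q), repeat=2))

# Vanishing locus V: points where every polynomial is zero.
V = {pt for pt in points if all(f(*pt) == 0 for f in polys)}

# P should equal 1 on V and 0 elsewhere.
agrees = all(P(*pt) == int(pt in V) for pt in points)

print(sorted(V), agrees)  # → [(0, 0), (4, 1)] True
```

The check relies on Fermat's little theorem: every non-zero element of <math>\mathbb{F}_q</math> raised to the power <math>q - 1</math> equals 1.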

Although indicator functions are not smooth, they admit weak derivatives. For example, consider the Heaviside step function <math display="block">H(x) \equiv \operatorname\mathbb{I}\!\bigl(x > 0\bigr).</math> The distributional derivative of the Heaviside step function is equal to the Dirac delta function, i.e. <math display=block>\frac{\mathrm{d}H(x)}{\mathrm{d}x}= \delta(x),</math> and similarly the distributional derivative of <math display="block">G(x) := \operatorname\mathbb{I}\!\bigl(x < 0\bigr)</math> is <math display=block>\frac{\mathrm{d}G(x)}{\mathrm{d}x} = -\delta(x).</math>
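The distributional statements can be checked numerically against a test function <math>\varphi</math>: by definition of the weak derivative, <math>-\int H(x)\,\varphi'(x)\,\mathrm{d}x</math> should equal <math>\varphi(0)</math>, and <math>-\int G(x)\,\varphi'(x)\,\mathrm{d}x</math> should equal <math>-\varphi(0)</math>. The sketch below assumes the Gaussian <math>\varphi(x) = e^{-x^2}</math>, which is not compactly supported but decays fast enough for a numerical check, and uses a trapezoidal rule on an assumed interval <math>[-10, 10]</math>.

```python
import math


def H(x):
    """Heaviside step function: indicator of x > 0."""
    return int(x > 0)


def G(x):
    """Indicator of x < 0."""
    return int(x < 0)


def phi(x):
    """Smooth, rapidly decaying test function (illustrative choice)."""
    return math.exp(-x * x)


def dphi(x):
    """Derivative of phi."""
    return -2.0 * x * math.exp(-x * x)


def trapezoid(g, a, b, n):
    """Composite trapezoidal rule for the integral of g over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h


# Action of the weak derivatives on phi: -∫ H φ' dx and -∫ G φ' dx.
dH_on_phi = -trapezoid(lambda x: H(x) * dphi(x), -10.0, 10.0, 20000)
dG_on_phi = -trapezoid(lambda x: G(x) * dphi(x), -10.0, 10.0, 20000)

print(round(dH_on_phi, 3), round(dG_on_phi, 3), phi(0.0))
```

Up to discretization error, the two values are <math>\varphi(0)</math> and <math>-\varphi(0)</math>, matching <math>\delta</math> and <math>-\delta</math> applied to <math>\varphi</math>.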

Thus the derivative of the Heaviside step function can be seen as the inward normal derivative at the boundary of the domain given by the positive half-line. In higher dimensions, the derivative naturally generalises to the inward normal derivative, while the Heaviside step function naturally generalises to the indicator function of some domain <math>D</math>. The surface of <math>D</math> will be denoted by <math>S</math>. Proceeding, it can be derived that the inward normal derivative of the indicator gives rise to a surface delta function, which can be indicated by <math>\delta_S(\mathbf{x})</math>: <math display=block>\delta_S(\mathbf{x}) = -\mathbf{n}_x \cdot \nabla_x \operatorname\mathbb{I}\!\bigl(\ \mathbf{x}\in D\ \bigr)\ </math> where <math>\mathbf{n}</math> is the outward normal of the surface <math>S</math>. This 'surface delta function' has the following property:<ref>Template:Cite journal</ref> <math display=block>-\int_{\R^n}f(\mathbf{x})\,\mathbf{n}_x\cdot\nabla_x \operatorname\mathbb{I}\!\bigl(\ \mathbf{x}\in D\ \bigr) \; \operatorname{d}^{n}\mathbf{x} = \oint_{S}\,f(\mathbf{\beta}) \; \operatorname{d}^{n-1}\mathbf{\beta}.</math>