
{{Short description|Mathematical function characterizing set membership}}
{{About|the 0&ndash;1 indicator function|the 0&ndash;infinity indicator function|characteristic function (convex analysis)}}
{{More footnotes|date=December 2009}}
{{Use American English|date = March 2019}}

[[Image:Indicator function illustration.png|right|thumb|A three-dimensional plot of an indicator function, shown over a square two-dimensional domain (set {{mvar|X}}): the "raised" portion overlays those two-dimensional points which are members of the "indicated" subset ({{mvar|A}}).]]
In [[mathematics]], an '''indicator function''' or a '''characteristic function''' of a [[subset]] of a [[Set (mathematics)|set]] is a [[Function (mathematics)|function]] that maps elements of the subset to one, and all other elements to zero. That is, if {{mvar|A}} is a subset of some set {{mvar|X}}, then the indicator function of {{mvar|A}} is the function <math>\mathbf{1}_A</math> defined by <math>\mathbf{1}_{A}\!(x) = 1</math> if <math>x \in A,</math> and <math>\mathbf{1}_{A}\!(x) = 0</math> otherwise. Other common notations are {{math|𝟙{{sub|''A''}}}} and <math>\chi_A.</math>{{efn|name=χαρακτήρ}}

The indicator function of {{mvar|A}} is the [[Iverson bracket]] of the property of belonging to {{mvar|A}}; that is, 

<math display="block">\mathbf{1}_{A}(x) = \left[\ x\in A\ \right].</math>

For example, the [[Dirichlet function]] is the indicator function of the [[rational number]]s as a subset of the [[real number]]s.

==Definition==
Given an arbitrary set {{mvar|X}}, the indicator function of a subset {{mvar|A}} of {{mvar|X}} is the function
<math display=block>\mathbf{1}_A \colon X \to \{ 0, 1 \}</math>
defined by
<math display="block" qid="Q371983">\operatorname\mathbf{1}_A\!( x ) =
\begin{cases}
1 & \text{if } x \in A \\
0 & \text{if } x \notin A \,.
\end{cases}
</math>

The [[Iverson bracket]] provides the equivalent notation <math>\left[\ x\in A\ \right]</math> or {{nobr|{{math|⟦&thinsp;''x'' ∈ ''A''&thinsp;⟧}},}} that can be used instead of <math>\mathbf{1}_{A}\!(x).</math>
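
As an illustrative sketch only, the definition can be mirrored in Python by a function that builds <math>\mathbf{1}_A</math> from a finite set. The helper name <code>indicator</code> and the example sets are assumptions made here, not part of any library.

```python
def indicator(A):
    """Return the indicator function 1_A of the set A (hypothetical helper)."""
    members = frozenset(A)

    def one_A(x):
        # 1 if x belongs to A, 0 otherwise
        return 1 if x in members else 0

    return one_A


X = range(10)                          # the ambient set X
evens = {x for x in X if x % 2 == 0}   # the subset A
one_evens = indicator(evens)
values = [one_evens(x) for x in X]     # values of 1_A on each element of X
```

Because Python booleans are integers, <code>int(x in A)</code> plays the same role as the Iverson bracket <math>[x \in A]</math>.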

The function <math>\mathbf{1}_A</math> is sometimes denoted {{math|𝟙{{sub|''A''}}}}, {{mvar|I<sub>A</sub>}}, {{mvar|&chi;<sub>A</sub>}}{{efn|name=χαρακτήρ|
The [[Greek alphabet|Greek letter]] {{mvar|&chi;}} appears because it is the initial letter of the Greek word {{lang|grc|{{math|χαρακτήρ}}}}, which is the ultimate origin of the word ''characteristic''.
}} or even just {{mvar|A}}.{{efn|
The set of all indicator functions on {{mvar|X}} can be identified with <math>\mathcal{P}(X),</math> the [[power set]] of {{mvar|X}}. Consequently, both sets are denoted, by a conventional [[abuse of notation]], as <math>2^X,</math> in analogy to the relation between the number of elements of the power set and that of the original set. This is a special case <math>\left(Y = \{0,\, 1\}\right)</math> of the notation <math>Y^X</math> for the set of all functions <math>f \colon X \to Y \,.</math>
}}

==Notation and terminology==
The notation <math>\chi_A</math> is also used to denote the [[Characteristic function (convex analysis)|characteristic function]] in [[convex analysis]], which is defined as if using the [[Multiplicative inverse|reciprocal]] of the standard definition of the indicator function.

A related concept in [[statistics]] is that of a [[dummy variable (statistics)|dummy variable]]. (This must not be confused with a "dummy variable" as that term is usually used in mathematics, where it is also called a [[free variables and bound variables|bound variable]].)

The term "[[characteristic function (probability theory)|characteristic function]]" has an unrelated meaning in [[probability theory|classic probability theory]]. For this reason, [[List of probabilists|traditional probabilists]] use the term '''indicator function''' for the function defined here almost exclusively, while mathematicians in other fields are more likely to use the term ''characteristic function'' to describe the function that indicates membership in a set.

In [[fuzzy logic]] and [[Many-valued logic|modern many-valued logic]], predicates are the [[characteristic function (probability theory)|characteristic functions]] of a [[probability distribution]]. That is, the strict true/false valuation of the predicate is replaced by a quantity interpreted as the degree of truth.

==Basic properties==
The ''indicator'' or ''characteristic'' [[function (mathematics)|function]] of a subset {{mvar|A}} of some set {{mvar|X}} [[Map (mathematics)|maps]] elements of {{mvar|X}} to the [[codomain]] <math>\{0,\, 1\}.</math>

This mapping is [[surjective]] only when {{mvar|A}} is a non-empty [[proper subset]] of {{mvar|X}}. If <math>A = X,</math> then <math>\mathbf{1}_A \equiv 1.</math> By a similar argument, if <math>A = \emptyset</math> then <math>\mathbf{1}_A \equiv 0.</math>

If <math>A</math> and <math>B</math> are two subsets of <math>X,</math> then
<math display=block>\begin{align}
\mathbf{1}_{A\cap B}(x) ~&=~ \min\bigl\{\mathbf{1}_A(x),\ \mathbf{1}_B(x)\bigr\} ~~=~ \mathbf{1}_A(x) \cdot\mathbf{1}_B(x), \\
\mathbf{1}_{A\cup B}(x) ~&=~ \max\bigl\{\mathbf{1}_A(x),\ \mathbf{1}_B(x)\bigr\} ~=~ \mathbf{1}_A(x) + \mathbf{1}_B(x) - \mathbf{1}_A(x) \cdot \mathbf{1}_B(x)\,,
\end{align}</math>

and the indicator function of the [[Complement (set theory)|complement]] of <math>A,</math> i.e. <math>A^\complement,</math> is:
<math display=block>\mathbf{1}_{A^\complement} = 1 - \mathbf{1}_A.</math>
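
The identities above can be checked pointwise on a small finite example. The following Python sketch is only an illustration; the sets <code>X</code>, <code>A</code>, <code>B</code> and the helper <code>one</code> are arbitrary choices made here.

```python
X = set(range(12))
A = {x for x in X if x % 2 == 0}   # multiples of 2 in X
B = {x for x in X if x % 3 == 0}   # multiples of 3 in X


def one(S):
    """Indicator function of S (hypothetical helper)."""
    return lambda x: 1 if x in S else 0


one_A, one_B = one(A), one(B)

# Intersection: minimum, or equivalently product, of the indicators.
intersection_ok = all(
    one(A & B)(x) == min(one_A(x), one_B(x)) == one_A(x) * one_B(x) for x in X
)
# Union: maximum, or sum minus product, of the indicators.
union_ok = all(
    one(A | B)(x)
    == max(one_A(x), one_B(x))
    == one_A(x) + one_B(x) - one_A(x) * one_B(x)
    for x in X
)
# Complement: one minus the indicator.
complement_ok = all(one(X - A)(x) == 1 - one_A(x) for x in X)
```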

More generally, suppose <math>A_1, \dotsc, A_n</math> is a collection of subsets of {{mvar|X}}, indexed by <math>I = \{1, \dotsc, n\}.</math> For any <math>x \in X:</math>

<math display=block> \prod_{k \in I} \left(\ 1 - \mathbf{1}_{A_k}\!\left( x \right)\ \right)</math>

is a product of {{math|0}}s and {{math|1}}s. This product has the value {{math|1}} at precisely those <math>x \in X</math> that belong to none of the sets <math>A_k</math> and is 0 otherwise. That is

<math display=block> \prod_{k \in I} ( 1 - \mathbf{1}_{A_k}) = \mathbf{1}_{X - \bigcup_{k} A_k} = 1 - \mathbf{1}_{\bigcup_{k} A_k}.</math>

Expanding the product on the left hand side,

<math display=block>\mathbf{1}_{\bigcup_{k} A_k}= 1 - \sum_{F \subseteq \{1, 2, \dotsc, n\}} (-1)^{|F|} \mathbf{1}_{\bigcap_F A_k} = \sum_{\emptyset \neq F \subseteq \{1, 2, \dotsc, n\}} (-1)^{|F|+1} \mathbf{1}_{\bigcap_F A_k}</math>

where <math>|F|</math> is the [[cardinality]] of {{mvar|F}}. This is one form of the principle of [[inclusion-exclusion]].
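
A small numerical check of this identity is sketched below in Python. The three sets (multiples of 2, 3 and 5 in a finite range) and the function names are illustrative assumptions. Summing the indicator over {{mvar|X}} recovers the familiar counting form of inclusion–exclusion.

```python
from itertools import combinations

X = set(range(30))
sets = [
    {x for x in X if x % 2 == 0},
    {x for x in X if x % 3 == 0},
    {x for x in X if x % 5 == 0},
]


def one(S, x):
    """Value of the indicator function of S at x."""
    return 1 if x in S else 0


def union_indicator(sets, x):
    """Right-hand side of the identity: signed sum over non-empty index sets F."""
    total = 0
    for size in range(1, len(sets) + 1):
        for F in combinations(range(len(sets)), size):
            intersection = set.intersection(*(sets[k] for k in F))
            total += (-1) ** (size + 1) * one(intersection, x)
    return total


union = set().union(*sets)
agrees = all(union_indicator(sets, x) == one(union, x) for x in X)
count_via_sum = sum(union_indicator(sets, x) for x in X)  # equals |A_1 ∪ A_2 ∪ A_3|
```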

As suggested by the previous example, the indicator function is a useful notational device in [[combinatorics]].  The notation is used in other places as well, for instance in [[probability theory]]: if {{mvar|X}} is a [[probability space]] with probability measure <math>\mathbb{P}</math> and {{mvar|A}} is a [[Measure (mathematics)|measurable set]], then <math>\mathbf{1}_A</math> becomes a [[random variable]] whose [[expected value]] is equal to the probability of {{mvar|A}}:

<math display=block>\operatorname\mathbb{E}_X\left\{\ \mathbf{1}_A(x)\ \right\}\ =\ \int_{X} \mathbf{1}_A( x )\ \operatorname{d\ \mathbb{P} }(x) = \int_{A} \operatorname{d\ \mathbb{P} }(x) = \operatorname\mathbb{P}(A).</math>

This identity is used in a simple proof of [[Markov's inequality]].
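
A sketch of that argument, with a non-negative random variable {{mvar|Y}} and a constant <math>a > 0</math> introduced here for illustration: the pointwise bound <math>a\,\mathbf{1}_{\{Y \ge a\}} \le Y</math> holds because the left side is {{math|0}} when <math>Y < a</math> and equals {{mvar|a}} otherwise. Taking expectations of both sides and applying the identity above gives
<math display=block>a\,\operatorname\mathbb{P}(Y \ge a) = \operatorname\mathbb{E}\left[a\,\mathbf{1}_{\{Y \ge a\}}\right] \le \operatorname\mathbb{E}[Y].</math>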

In many cases, such as [[order theory]], the inverse of the indicator function may be defined. This is commonly called the [[generalized Möbius function]], as a generalization of the inverse of the indicator function in elementary [[number theory]], the [[Möbius function]]. (See paragraph below about the use of the inverse in classical recursion theory.)

==Mean, variance and covariance==
Given a [[probability space]] <math>\textstyle (\Omega, \mathcal F, \operatorname{P})</math> with <math>A \in \mathcal F,</math> the indicator random variable <math>\mathbf{1}_A \colon \Omega \rightarrow \mathbb{R}</math> is defined by <math>\mathbf{1}_A (\omega) = 1 </math> if <math> \omega \in A,</math> otherwise <math>\mathbf{1}_A (\omega) = 0.</math>

;[[Mean]]: <math>\ \operatorname\mathbb{E}(\mathbf{1}_A (\omega)) = \operatorname\mathbb{P}(A)\ </math> (also called "Fundamental Bridge").

;[[Variance]]: <math>\ \operatorname{Var}(\mathbf{1}_A (\omega)) = \operatorname\mathbb{P}(A)(1 - \operatorname\mathbb{P}(A)).</math>

;[[Covariance]]: <math>\ \operatorname{Cov}(\mathbf{1}_A (\omega), \mathbf{1}_B (\omega)) = \operatorname\mathbb{P}(A \cap B) - \operatorname\mathbb{P}(A) \operatorname\mathbb{P}(B).</math>
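
These formulas follow from <math>\mathbf{1}_A^2 = \mathbf{1}_A</math> and <math>\mathbf{1}_A \mathbf{1}_B = \mathbf{1}_{A \cap B}.</math> As a hedged illustration, the Python sketch below verifies them exactly on a finite probability space; the fair six-sided die and the events <code>A</code> and <code>B</code> are assumptions chosen for the example.

```python
from fractions import Fraction

omega = range(1, 7)                          # outcomes of a fair six-sided die
prob = {w: Fraction(1, 6) for w in omega}    # uniform probability measure

A = {2, 4, 6}   # event "the outcome is even"
B = {4, 5, 6}   # event "the outcome is at least four"


def P(event):
    """Probability of an event (a set of outcomes)."""
    return sum(prob[w] for w in event)


def E(f):
    """Expected value of a random variable f on omega."""
    return sum(f(w) * prob[w] for w in omega)


def one(S):
    """Indicator random variable of the event S."""
    return lambda w: 1 if w in S else 0


one_A, one_B = one(A), one(B)
mean_A, mean_B = E(one_A), E(one_B)
var_A = E(lambda w: (one_A(w) - mean_A) ** 2)
cov_AB = E(lambda w: (one_A(w) - mean_A) * (one_B(w) - mean_B))
```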

==Characteristic function in recursion theory, Gödel's and Kleene's representing function==
[[Kurt Gödel]] described the ''representing function'' in his 1934 paper "On undecidable propositions of formal mathematical systems" (the symbol "{{math|¬}}" indicates logical inversion, i.e. "NOT"):<ref name=Martin-1965>{{cite book |pages=41–74 |editor-link=Martin Davis (mathematician) |editor-first=Martin |editor-last=Davis |year=1965 |title=The Undecidable |publisher=Raven Press Books |place=New York, NY}}</ref>{{rp|page=42}} 

{{blockquote|1=There shall correspond to each class or relation {{mvar|R}} a representing function <math>\phi(x_1, \ldots x_n) = 0</math> if <math>R(x_1,\ldots x_n)</math> and <math>\phi(x_1,\ldots x_n) = 1</math> if <math>\neg R(x_1,\ldots x_n).</math>}}

[[Stephen Kleene|Kleene]] offers the same definition in the context of the [[primitive recursive function]]s: the representing function {{mvar|φ}} of a predicate {{mvar|P}} takes the value {{math|0}} if the predicate is true and {{math|1}} if the predicate is false.<ref name=Kleene1952>{{cite book |last=Kleene |first=Stephen |author-link=Stephen Kleene |year=1971 |orig-year=1952 |title=Introduction to Metamathematics |page=227 |publisher=Wolters-Noordhoff Publishing and North Holland Publishing Company |location=Netherlands |edition=Sixth reprint, with corrections}}</ref>

For example, because the product of characteristic functions <math>\phi_1 * \phi_2 * \cdots * \phi_n = 0</math> whenever any one of the functions equals {{math|0}}, it plays the role of logical OR: IF <math>\phi_1 = 0\ </math> OR <math>\ \phi_2 = 0</math> OR ... OR <math>\phi_n = 0</math> THEN their product is {{math|0}}. What appears to the modern reader as the representing function's logical inversion, i.e. the representing function is {{math|0}} when the relation {{mvar|R}} is "true" or "satisfied", plays a useful role in Kleene's definition of the logical functions OR, AND, and IMPLY,<ref name=Kleene1952 />{{rp|228}} the bounded-<ref name=Kleene1952 />{{rp|228}} and unbounded-<ref name=Kleene1952 />{{rp|279 ff}} [[mu operator]]s and the CASE function.<ref name=Kleene1952 />{{rp|229}}
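
The inverted convention and the role of the product as logical OR can be illustrated with a short Python sketch. The helper <code>representing</code> and the two example predicates are hypothetical; this is not a transcription of Kleene's own definitions.

```python
def representing(predicate):
    """Representing function in the convention above: 0 = true, 1 = false."""
    return lambda *args: 0 if predicate(*args) else 1


phi_even = representing(lambda n: n % 2 == 0)   # "n is even"
phi_big = representing(lambda n: n > 10)        # "n is greater than 10"


def phi_even_or_big(n):
    # The product is 0 as soon as one factor is 0, i.e. as soon as
    # at least one of the two predicates holds: logical OR.
    return phi_even(n) * phi_big(n)


results = {n: phi_even_or_big(n) for n in (3, 4, 11, 12)}
```

Here only {{math|3}} fails both predicates, so it is the only argument at which the product equals {{math|1}} ("false").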

==Characteristic function in fuzzy set theory==
In classical mathematics, characteristic functions of sets only take values {{math|1}} (members) or {{math|0}} (non-members). In ''[[fuzzy set theory]]'', characteristic functions are generalized to take value in the real unit interval {{closed-closed|0, 1}}, or more generally, in some [[universal algebra|algebra]] or [[structure (mathematical logic)|structure]] (usually required to be at least a [[partially ordered set|poset]] or [[lattice (order)|lattice]]). Such generalized characteristic functions are more usually called [[membership function (mathematics)|membership function]]s, and the corresponding "sets" are called ''fuzzy'' sets. Fuzzy sets model the gradual change in the membership [[degree of truth|degree]] seen in many real-world [[predicate (mathematics)|predicate]]s like "tall", "warm", etc.
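
The contrast between a crisp indicator and a membership function can be sketched in Python as follows. The predicate "tall", the threshold of 180&nbsp;cm and the linear ramp between 160&nbsp;cm and 190&nbsp;cm are arbitrary assumptions made purely for illustration.

```python
def tall_crisp(height_cm):
    """Classical indicator: a person either is tall (1) or is not (0)."""
    return 1 if height_cm >= 180 else 0


def tall_fuzzy(height_cm):
    """Membership function with values in [0, 1].

    Rises linearly from 0 at 160 cm to 1 at 190 cm (arbitrary choice).
    """
    return min(1.0, max(0.0, (height_cm - 160) / 30))


heights = [150, 175, 200]
crisp = [tall_crisp(h) for h in heights]
fuzzy = [tall_fuzzy(h) for h in heights]
```

The crisp function jumps from {{math|0}} to {{math|1}} at the threshold, whereas the membership function assigns intermediate degrees to intermediate heights.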

==Smoothness==
{{See also|Laplacian of the indicator}}
In general, the indicator function of a set is not smooth; it is continuous if and only if its [[support (math)|support]] is a [[connected component (topology)|connected component]]. In the [[algebraic geometry]] of [[finite fields]], however, every [[affine variety]] admits a ([[Zariski topology|Zariski]]) continuous indicator function.<ref>{{Cite book|title=Course in Arithmetic|last=Serre|pages=5}}</ref> Given a [[finite set]] of functions <math>f_\alpha \in \mathbb{F}_q\left[\ x_1, \ldots, x_n\right]</math> let <math>V = \bigl\{\ x \in \mathbb{F}_q^n : f_\alpha(x) = 0 \text{ for all } \alpha\ \bigr\}</math> be their vanishing locus. Then, the function <math display="inline">P(x) = \prod\left(\ 1 - f_\alpha(x)^{q-1}\right)</math> acts as an indicator function for <math>V.</math> If <math>x \in V</math> then every factor equals {{math|1}}, so <math>P(x) = 1.</math> Otherwise, for some <math>f_\alpha,</math> we have <math>f_\alpha(x) \neq 0,</math> which implies that <math>f_\alpha(x)^{q-1} = 1,</math> hence <math>P(x) = 0.</math>
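
As a hedged numerical illustration of the product formula, the Python sketch below works over the prime field with <math>q = 5</math> elements, represented by integers modulo 5, and two polynomials in two variables. The choice of field, the polynomials and all names are assumptions made for the example.

```python
from itertools import product

q = 5  # a prime, so the integers modulo q form the field F_q

# Two illustrative polynomials in F_q[x, y] (arbitrary choices).
polys = [
    lambda x, y: (x + y) % q,
    lambda x, y: (x * y - 1) % q,
]


def P(point):
    """Product of (1 - f(point)^(q-1)) over all polynomials, computed in F_q."""
    value = 1
    for f in polys:
        value = value * (1 - pow(f(*point), q - 1, q)) % q
    return value


points = list(product(range(q), repeat=2))
V = {pt for pt in points if all(f(*pt) == 0 for f in polys)}  # vanishing locus
matches = all(P(pt) == (1 if pt in V else 0) for pt in points)
```

The check relies on the fact that every non-zero element of the field raised to the power <math>q - 1</math> equals {{math|1}}.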

Although indicator functions are not smooth, they admit [[weak derivative]]s. For example, consider the [[Heaviside step function]] <math display="block">H(x) \equiv \operatorname\mathbb{I}\!\bigl(x > 0\bigr).</math> The [[distributional derivative]] of the Heaviside step function is equal to the [[Dirac delta function]], i.e. <math display=block>\frac{\mathrm{d}H(x)}{\mathrm{d}x}= \delta(x),</math>
and similarly the distributional derivative of <math display="block">G(x) := \operatorname\mathbb{I}\!\bigl(x < 0\bigr)</math> is <math display=block>\frac{\mathrm{d}G(x)}{\mathrm{d}x} = -\delta(x).</math>
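
A brief justification, written with a smooth, compactly supported test function <math>\varphi</math> (a symbol introduced here for illustration): by the definition of the distributional derivative,
<math display=block>\left\langle H', \varphi \right\rangle = -\int_{-\infty}^{\infty} H(x)\,\varphi'(x)\,\mathrm{d}x = -\int_{0}^{\infty} \varphi'(x)\,\mathrm{d}x = \varphi(0) = \left\langle \delta, \varphi \right\rangle,</math>
and likewise
<math display=block>\left\langle G', \varphi \right\rangle = -\int_{-\infty}^{0} \varphi'(x)\,\mathrm{d}x = -\varphi(0) = \left\langle -\delta, \varphi \right\rangle.</math>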

Thus the derivative of the Heaviside step function can be seen as the ''inward normal derivative'' at the ''boundary'' of the domain given by the positive half-line. In higher dimensions, the derivative naturally generalises to the inward normal derivative, while the Heaviside step function naturally generalises to the indicator function of some domain {{mvar|D}}. The surface of {{mvar|D}} will be denoted by {{mvar|S}}. Proceeding, it can be derived that the inward [[normal derivative]] of the indicator gives rise to a ''[[surface delta function]]'', which can be indicated by <math>\delta_S(\mathbf{x})</math>:
<math display=block>\delta_S(\mathbf{x}) = -\mathbf{n}_x \cdot \nabla_x \operatorname\mathbb{I}\!\bigl(\ \mathbf{x}\in D\ \bigr)\ </math>
where {{mvar|n}} is the outward [[Normal (geometry)|normal]] of the surface {{mvar|S}}. This 'surface delta function' has the following property:<ref>{{cite journal |last=Lange |first=Rutger-Jan |year=2012 |title=Potential theory, path integrals and the Laplacian of the indicator |journal=Journal of High Energy Physics |volume=2012 |issue=11 |pages=29–30 |arxiv=1302.0864 |bibcode=2012JHEP...11..032L |doi=10.1007/JHEP11(2012)032|s2cid=56188533 }}</ref>
<math display=block>-\int_{\R^n}f(\mathbf{x})\,\mathbf{n}_x\cdot\nabla_x  \operatorname\mathbb{I}\!\bigl(\ \mathbf{x}\in D\ \bigr) \; \operatorname{d}^{n}\mathbf{x} = \oint_{S}\,f(\mathbf{\beta}) \; \operatorname{d}^{n-1}\mathbf{\beta}.</math>

By setting the function {{mvar|f}} equal to one, it follows that the [[Laplacian of the indicator#Dirac surface delta function|inward normal derivative of the indicator]] integrates to the numerical value of the [[surface area]] {{mvar|S}}.

==See also==
{{Div col|colwidth=15em}}
* [[Dirac measure]]
* [[Laplacian of the indicator]]
* [[Dirac delta]]
* [[Extension (predicate logic)]]
* [[Free variables and bound variables]]
* [[Heaviside step function]]
* [[Identity function]]
* [[Iverson bracket]]
* [[Kronecker delta]], a function that can be viewed as an indicator for the [[Equality (mathematics)|identity relation]]
* [[Macaulay brackets]]
* [[Multiset]]
* [[Membership function (mathematics)|Membership function]]
* [[Simple function]]
* [[Dummy variable (statistics)]]
* [[Statistical classification]]
* [[Zero-one loss function]]
* [[Subobject classifier]], a related concept from [[topos theory]]
{{div col end}}

==Notes==
{{notelist}}

==References==
{{reflist|25em}}

==Sources==
{{refbegin|25em}}
* {{cite book |last=Folland |first=G.B. |title=Real Analysis: Modern Techniques and Their Applications |publisher=John Wiley & Sons, Inc. |year=1999 |isbn=978-0-471-31716-6 |edition=Second}}
* {{cite book |last1=Cormen |first1=Thomas H. |title=Introduction to Algorithms |title-link=Introduction to Algorithms |last2=Leiserson |first2=Charles E. |last3=Rivest |first3=Ronald L. |last4=Stein |first4=Clifford |publisher=MIT Press and McGraw-Hill |year=2001 |isbn=978-0-262-03293-3 |edition=Second |pages=[https://archive.org/details/introductiontoal00corm_691/page/n116 94]–99 |chapter=Section 5.2: Indicator random variables |author-link=Thomas H. Cormen |author-link2=Charles E. Leiserson |author-link3=Ronald L. Rivest |author-link4=Clifford Stein}}
* {{cite book |editor-last=Davis |editor-first=Martin |editor-link=Martin Davis (mathematician) |year=1965 |title=The Undecidable |publisher=Raven Press Books |place=New York, NY}}
* {{cite book |last=Kleene |first=Stephen |author-link=Stephen Kleene |year=1971 |orig-year=1952 |title=Introduction to Metamathematics |publisher=Wolters-Noordhoff Publishing and North Holland Publishing Company |location=Netherlands |edition=Sixth reprint, with corrections}}
* {{Cite book |last1=Boolos |first1=George |title=Computability and Logic |last2=Burgess |first2=John P. |last3=Jeffrey |first3=Richard C. |publisher=Cambridge University Press |year=2002 |isbn=978-0-521-00758-0 |location=Cambridge UK |author-link=George Boolos |author-link2=John P. Burgess |author-link3=Richard C. Jeffrey}}
* {{cite q | Q25938993 |last1=Zadeh |first1=L.A. | author-link1 = Lotfi A. Zadeh | journal = [[Information and Computation|Information and Control]] | doi-access = free }}
* {{cite journal |last=Goguen |first=Joseph |author-link=Joseph Goguen |year=1967 |title=''L''-fuzzy sets |journal=Journal of Mathematical Analysis and Applications |volume=18 |issue=1 |pages=145–174 |doi=10.1016/0022-247X(67)90189-8 |hdl-access=free |hdl=10338.dmlcz/103980}}
{{refend}}

[[Category:Measure theory]]
[[Category:Integral calculus]]
[[Category:Real analysis]]
[[Category:Mathematical logic]]
[[Category:Basic concepts in set theory]]
[[Category:Probability theory]]
[[Category:Types of functions]]