Editing Shattered set

{{Short description|Notion in computational learning}}{{Wiktionary}}
A class of sets is said to shatter another set if it is possible to "pick out" any element of that set using [[intersection]]. The concept of '''shattered sets''' plays an important role in [[Vapnik&ndash;Chervonenkis theory]], also known as VC-theory. Shattering and VC-theory are used in the study of [[empirical process]]es as well as in statistical [[computational learning theory]].

==Definition==

Suppose ''A'' is a [[set (mathematics)|set]] and ''C'' is a [[class (set theory)|class]] of sets. The class ''C'' '''shatters''' the set ''A'' if for each subset ''a'' of ''A'', there is some element ''c'' of ''C'' such that

: <math>a = c \cap A.</math>

Equivalently, ''C'' shatters ''A'' when their [[Growth function#Definitions|intersection]] is equal to ''A'''s [[power set]]: ''P''(''A'') = { ''c'' ∩ ''A'' | ''c'' ∈ ''C'' }. 

We employ the letter  ''C''  to refer to a "class" or "collection" of sets, as in a Vapnik&ndash;Chervonenkis class (VC-class).  The set ''A'' is often assumed to be [[finite set|finite]] because, in empirical processes, we are interested in the shattering of finite sets of data points.

==Example==

We will show that the class of all [[disc (geometry)|discs]] in the [[plane (geometry)|plane]] (two-dimensional space) does not shatter every set of four points on the [[unit circle]], yet the class of all [[convex set]]s in the plane does shatter every finite set of points on the [[unit circle]].

Let ''A'' be a set of four points on the unit circle and let ''C'' be the class of all discs.
[[File:shattering01.png|thumb|right|300px|The set ''A'' of four particular points on the unit circle (the unit circle is shown in purple).]]
To test where ''C'' shatters ''A'', we attempt to draw a disc around every subset of points in ''A''. First, we draw a disc around the subsets of each isolated point. Next, we try to draw a disc around every subset of point pairs. This turns out to be doable for adjacent points, but impossible for points on opposite sides of the circle. Any attempt to include those points on the opposite side will necessarily include other points not in that pair.  Hence, any pair of opposite points cannot be isolated out of ''A'' using intersections with class ''C'' and so ''C'' does not shatter ''A''. 

As visualized below:

<gallery>
Image:shattering02.png|Each individual point can be isolated with a disc (showing all four).
Image:shattering03.png|Each subset of adjacent points can be isolated with a disc (showing one of four).
Image:shattering04.png|A subset of points on opposite sides of the unit circle can ''not'' be isolated with a disc.
</gallery>

Because there is some subset which can ''not'' be isolated by any disc in ''C'', we conclude then that ''A'' is not shattered by ''C''. And, with a bit of thought, we can prove that no set of four points is shattered by this ''C''.

However, if we redefine ''C'' to be the class of all ''elliptical discs'', we find that we can still isolate all the subsets from above, as well as the points that were formerly problematic. Thus, this specific set of 4 points is shattered by the class of elliptical discs. Visualized below:

<gallery>
Image:shattering05.png|Opposite points of ''A'' are now separable by some ellipse (showing one of two)
Image:shattering06.png|Each subset of three points in ''A'' is also separable by some ellipse (showing one of four)
</gallery>

With a bit of thought, we could generalize that any set of finite points on a unit circle could be shattered by the class of all [[convex set]]s (visualize connecting the dots).

==Shatter coefficient==
{{Main|growth function}}
To quantify the richness of a collection ''C'' of sets, we use the concept of ''shattering coefficients'' (also known as the ''growth function''). For a collection ''C'' of sets <math>s \subset \Omega</math>, <math>\Omega</math> being any space,  often a [[sample space]],  we define
the ''n''<sup>th</sup> ''shattering coefficient'' of ''C'' as

:<math> S_C(n) = \max_{\forall x_1,x_2,\dots,x_n \in \Omega } \operatorname{card} \{\,\{\,x_1,x_2,\dots,x_n\}\cap s, s\in  C \}</math>

where <math>\operatorname{card}</math> denotes the [[cardinality]] of the set and <math>x_1,x_2,\dots,x_n \in \Omega </math> is any set of ''n'' points,.  

<math> S_C(n) </math> is the largest number of subsets of any set ''A''  of ''n'' points that can be formed by intersecting  ''A'' with the sets in collection ''C''. 

For example, if set ''A'' contains 3 points, its power set, <math>P(A)</math>, contains <math>2^3=8</math> elements.  If ''C'' shatters ''A'', its shattering coefficient(3) would be 8 and S_C(2) would be <math>2^2=4</math>.  However, if one of those sets in <math>P(A)</math> cannot be obtained through intersections in ''c'', then S_C(3) would only be 7.  If none of those sets can be obtained, S_C(3) would be 0.  Additionally, if S_C(2)=3, for example, then there is an element in the set of all 2-point sets from ''A'' that cannot be obtained from intersections with ''C''.  It follows from this that S_C(3) would also be less than 8 (i.e. ''C'' would not shatter ''A'') because we have already located a "missing" set in the smaller power set of 2-point sets. 

This example illustrates some properties of <math>S_C(n)</math>:
# <math>S_C(n)\leq 2^n</math> for all ''n'' because <math>\{s\cap A|s\in C\}\subseteq P(A)</math> for any <math>A\subseteq \Omega</math>.
# If <math>S_C(n)=2^n</math>, that means there is a set of cardinality ''n'', which can be shattered by ''C''.
# If <math>S_C(N)<2^N</math> for some <math>N>1</math> then <math>S_C(n)<2^n</math> for all <math>n\geq N</math>.
The third property means that if ''C'' cannot shatter any set of cardinality ''N'' then it can not shatter sets of larger cardinalities.

==Vapnik&ndash;Chervonenkis class==
{{Main|VC dimension}}
If ''A'' cannot be shattered by ''C'', there will be a smallest value of ''n'' that makes the shatter coefficient(n) less than <math>2^n</math> because as ''n'' gets larger, there are more sets that could be missed.  Alternatively, there is also a largest value of ''n'' for which the S_C(n) is still <math>2^n</math>, because as ''n'' gets smaller, there are fewer sets that could be omitted. The extreme of this is S_C(0) (the shattering coefficient of the empty set), which must always be <math>2^0=1</math>.  These statements lends themselves to defining the [[VC dimension]] of a class ''C'' as:
:<math>VC(C)=\underset{n}{\min}\{n:S_C(n)<2^n\}\,</math>
or, alternatively, as
:<math>VC_0(C)=\underset{n}{\max}\{n:S_C(n)=2^n\}.\,</math>

Note that <math>VC(C)=VC_0(C)+1.</math>. The VC dimension is usually defined as VC_0, the largest cardinality of points chosen that will still shatter ''A'' (i.e. ''n'' such that <math>S_C(n)=2^n</math>).

Altneratively, if for any ''n'' there is a set of cardinality ''n'' which can be shattered by ''C'', then <math>S_C(n)=2^n</math> for all ''n'' and the VC dimension of this class ''C'' is infinite.

A class with finite VC dimension is called a ''Vapnik&ndash;Chervonenkis class'' or ''VC class''. A class ''C'' is [[Glivenko&ndash;Cantelli class|uniformly Glivenko&ndash;Cantelli]] if and only if it is a VC class.

==See also==
*[[Sauer–Shelah lemma]], relating the cardinality of a [[family of sets]] to the size of its largest shattered set

==References==
*{{citation|last1=Wencour|first1=R. S.|first2=R. M.|last2=Dudley|title=Some special Vapnik–Chervonenkis classes|journal=[[Discrete Mathematics (journal)|Discrete Mathematics]]|volume=33|year=1981|pages=313–318|issue=3|doi=10.1016/0012-365X(81)90274-0|doi-access=free}}.
*{{citation|authorlink=J. Michael Steele|last=Steele|first=J. M.|year=1975|title=Combinatorial Entropy and Uniform Limit Laws|series=Ph.D. thesis|publisher=Stanford University, Mathematics Department}}
*{{citation|authorlink=J. Michael Steele|last=Steele|first=J. M.|year=1978|title=Empirical discrepancies and subadditive processes|journal=[[Annals of Probability]]|volume=6|pages=118–227|issue=1|jstor=2242865|doi=10.1214/aop/1176995615|doi-access=free|url=https://repository.upenn.edu/bitstreams/b57f8ebe-7b2d-4b10-b507-f2e9215f8511/download}}.

==External links==
*[http://www-stat.wharton.upenn.edu/~steele/Rants/ShatteredSets.html Origin of "Shattered sets" terminology], by J. Steele


[[Category:Empirical process]]
[[Category:Computational learning theory]]