Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Formal concept analysis
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
=== Biclustering and multidimensional clustering === Given an object-attribute numerical data-table, the goal of [[biclustering]] is to group together some objects having similar values of some attributes. For example, in gene expression data, it is known that genes (objects) may share a common behavior for a subset of biological situations (attributes) only: one should accordingly produce local patterns to characterize biological processes, the latter should possibly overlap, since a gene may be involved in several processes. The same remark applies for recommender systems where one is interested in local patterns characterizing groups of users that strongly share almost the same tastes for a subset of items.<ref name="AdomTuzh">{{cite journal |last1=Adomavicius |first1=C. |last2=Tuzhilin |first2=A.|author2-link= Alexander Tuzhilin |title=Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions |journal=IEEE Transactions on Knowledge and Data Engineering |volume=17 |issue=6 |pages=734β749 |date=2005 |doi=10.1109/TKDE.2005.99 |s2cid=206742345 |url=http://homepages.dcc.ufmg.br/~nivio/cursos/ri13/sources/recommender-systems-survey-2005.pdf}}</ref> A bicluster in a binary object-attribute data-table is a pair ''(A,B)'' consisting of an inclusion-maximal set of objects ''A'' and an inclusion-maximal set of attributes ''B'' such that almost all objects from ''A'' have almost all attributes from ''B'' and vice versa. Of course, formal concepts can be considered as "rigid" biclusters where all objects have all attributes and vice versa. Hence, it is not surprising that some bicluster definitions coming from practice<ref>{{cite journal |last1=Prelic |first1=S. |last2=Bleuler |first2=P. |last3=Zimmermann |first3=A. |last4=Wille |first4=P. |last5=Buhlmann |first5=W. |last6=Gruissem |first6=L. |last7=Hennig |first7=L. |last8=Thiele |first8=E. |last9=Zitzler |title=A Systematic Comparison and Evaluation of Biclustering Methods for Gene Expression Data |journal=Bioinformatics |volume=22 |issue=9 |pages=1122β9 |date=2006 |doi=10.1093/bioinformatics/btl060 |pmid=16500941 |url=https://academic.oup.com/bioinformatics/article/22/9/1122/200492|hdl=20.500.11850/23740 |hdl-access=free }}</ref> are just definitions of a formal concept.<ref name="KayMinBicl">{{cite journal |last1=Kaytoue |first1=M. |last2=Kuznetsov |first2=S. |last3=Macko |first3=J. |last4=Wagner Meira Jr. |first4=Napoli A. |title=Mining Biclusters of Similar Values with Triadic Concept Analysis |journal=CLA |volume= |issue= |pages=175β190 |date=2011 |arxiv=1111.3270 }}</ref> Relaxed FCA-based versions of biclustering and triclustering include OA-biclustering<ref name="OABicl">{{cite book |last1=Ignatov |first1=D. |last3=Kuznetsov |first3=S. |last2=Poelmans |first2=J. |title=2012 IEEE 12th International Conference on Data Mining Workshops |chapter=Concept-Based Biclustering for Internet Advertisement | pages=123β130 |chapter-url=https://ieeexplore.ieee.org/document/6406432|doi=10.1109/ICDMW.2012.100|date=2012|isbn=978-1-4673-5164-5 |s2cid=32701053 }}</ref> and OAC-triclustering<ref name="OACTric">{{cite journal |last1=Ignatov |first1=D. |last3=Kuznetsov |first3=S. |last4=Mirkin |first4=B. |last2=Gnatyshak |first2=D. |title=Triadic Formal Concept Analysis and triclustering: searching for optimal patterns |journal=Mach. Learn. |volume=101 |issue=1β3 |pages=271β302 |date=2015|doi=10.1007/s10994-015-5487-y|s2cid=254738363 |doi-access=free }}</ref> (here O stands for object, A for attribute, C for condition); to generate patterns these methods use prime operators only once being applied to a single entity (e.g. object) or a pair of entities (e.g. attribute-condition), respectively. A bicluster of similar values in a numerical object-attribute data-table is usually defined<ref>{{cite book |chapter-url=https://www.cs.rpi.edu/~zaki/Workshops/BIOKDD04/proceedings/4-pensa.pdf |chapter=Assessment of discretization techniques for relevant pattern discovery from gene expression data |first1=R.G. |last1=Pensa |first2=C. |last2=Leschi |first3=J. |last3=Besson |first4=J.-F. |last4=Boulicaut |year=2004 |editor1-first=M.J. |editor1-last=Zaki |editor2-first=S. |editor2-last=Morishita |editor3-first=I. |editor3-last=Rigoutsos |pages=24β30 | title=Proceedings of the 4th ACM SIGKDD Workshop on Data Mining in Bioinformatics (BIOKDD 2004) | access-date=2022-07-20 |url=https://dl.acm.org/doi/proceedings/10.5555/3000580}}</ref><ref>{{cite book |last1=Besson |first1=J. |last2=Robardet |first2=C. |last3=Raedt |first3=L.D. |last4=Boulicaut |first4=J.-F. |chapter-url=https://lirias.kuleuven.be/bitstream/123456789/134532/1/07-kdid-bessonetal.pdf |chapter=Mining bi-sets in numerical data |doi=10.1007/978-3-540-75549-4_2 |isbn=978-3-540-75549-4 |editor1-first=S. |editor1-last=Dzeroski |editor2-first=J. |editor2-last=Struyf |title=International Workshop on Knowledge Discovery in Inductive Databases |series=LNCS |volume=4747 |pages=11β23 |publisher=Springer |date=2007}}</ref><ref name="CerfCloPat">{{cite journal |last1=Cerf |first1=L. |last2=Besson |first2=J. |last3=Robardet |first3=C. |last4=Boulicaut |first4=J.-F. |title=Closed patterns meet n-ary relations |journal=ACM Transactions on Knowledge Discovery from Data |volume=3 |issue=1 |pages=1β36 |date=2009 |doi=10.1145/1497577.1497580 |s2cid=11148363 |url=http://homepages.dcc.ufmg.br/~lcerf/publications/articles/Closed%20Patterns%20Meet%20n-ary%20Relations.pdf}}</ref> as a pair consisting of an inclusion-maximal set of objects and an inclusion-maximal set of attributes having similar values for the objects. Such a pair can be represented as an inclusion-maximal rectangle in the numerical table, modulo rows and columns permutations. In<ref name="KayMinBicl" /> it was shown that biclusters of similar values correspond to triconcepts of a triadic context where the third dimension is given by a scale that represents numerical attribute values by binary attributes. This fact can be generalized to ''n''-dimensional case, where ''n''-dimensional clusters of similar values in ''n''-dimensional data are represented by ''n+1''-dimensional concepts. This reduction allows one to use standard definitions and algorithms from multidimensional concept analysis<ref name="CerfCloPat"/><ref name="Voutsadakis" /> for computing multidimensional clusters.
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)