Formal Concept Analysis (FCA) is a well-founded mathematical framework based on lattice theory and aimed at data analysis, classification, and knowledge discovery. We are mainly interested in algorithms for mining complex data that are based on FCA and on pattern structures, an extension of FCA for dealing with non-binary data. An important feature of FCA is that it allows implications and association rules to be accessed and reused within the concept lattice. The concept lattice, or important parts of it (AOC poset, stable concepts), can be visualized, navigated, and interpreted in various ways for data and pattern visualization, information retrieval, and the visualization of implications and association rules. In this thesis, we will design the SmartFCA-platform for supporting knowledge processing and the construction of workflows based on FCA algorithms. The SmartFCA-platform will include the necessary code for dealing with the most standard data types and services, i.e., data preparation, data mining, pattern visualization, navigation, filtering, constraint posting, annotation, and support for interpretation and representation. An inventory of FCA software is available, but none of these tools is a consensual and centralized platform offering a library of interoperable algorithms. Some of these tools were designed by members of the SmartFCA consortium (e.g., Coron, RCAExplore, Camelis, LatViz, Galactic, Graph-FCA). However, the existing FCA tools are not interoperable, and they do not provide the set of tools necessary for analyzing real-life data in complex domains such as biology, chemistry, medicine, and linked data. Accordingly, the objective of this thesis subject is to study the design of a generic and operational platform that makes generic algorithms based on FCA and pattern structures interoperable and reusable, and to provide modules for building knowledge processing workflows.
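For readers new to FCA, the notions of formal concept and implication mentioned above can be illustrated with a minimal Python sketch. The toy context and the helper names (`context`, `extent`, `intent`, `concepts`) are assumptions made for this illustration only; they are not part of the SmartFCA-platform or of any of the tools listed, and the exhaustive enumeration is a naive stand-in for the dedicated algorithms used in practice.

```python
from itertools import combinations

# Hypothetical toy formal context: each object maps to the attributes it has.
context = {
    "frog": {"aquatic", "terrestrial"},
    "fish": {"aquatic"},
    "dog": {"terrestrial", "mammal"},
}
attributes = set().union(*context.values())

def extent(attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, has in context.items() if attrs <= has)

def intent(objs):
    """Attributes shared by every object in objs."""
    shared = set(attributes)
    for o in objs:
        shared &= context[o]
    return frozenset(shared)

def concepts():
    """Naive enumeration: close every attribute subset (exponential cost)."""
    found = set()
    for r in range(len(attributes) + 1):
        for combo in combinations(sorted(attributes), r):
            e = extent(set(combo))
            found.add((e, intent(e)))  # (extent, intent) is a formal concept
    return found

if __name__ == "__main__":
    for ext, itt in sorted(concepts(), key=lambda c: len(c[0])):
        print(sorted(ext), sorted(itt))
```

In this toy context, closing `{"mammal"}` with `intent(extent(...))` also yields `"terrestrial"`, which is how the implication "mammal implies terrestrial" can be read from the lattice. The enumeration is exponential in the number of attributes, so it is only suitable for very small contexts.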
Making FCA and Pattern Structures Operational, Generic, and Interoperable in the SmartFCA-Platform (Thesis Subject, ANR Project SmartFCA)
Context and Positioning of the Thesis
This thesis subject is carried out in the framework of the ANR SmartFCA project (ANR-21-CE23-0023-01). The objectives of SmartFCA are to study and design a collection of algorithms based on Formal Concept Analysis (FCA) for Knowledge Discovery (KD). The algorithms should then be implemented in the SmartFCA-platform, which is intended to be an operational and open platform for KD. Defining efficient and useful KD techniques remains a major challenge for analyzing complex data, discovering actionable patterns, developing reasoning techniques about data and patterns, and finally making decisions. In this respect, KD methods based on lattice theory are good candidates for mining symbolic and complex data. Moreover, SmartFCA is intended to provide the necessary services for KD, such as data preparation, data mining, pattern interpretation and representation. The SmartFCA platform will propose a "Knowledge as a Service" component for making domain knowledge actionable and reusable on demand. Firstly, small but complex data are highly important in application domains such as life sciences, e.g., the analysis of cohorts in medicine involves a few hundred individuals and a few thousand features, but the data are complex and the expected patterns should be easy to interpret. Secondly, FCA and pattern mining are well adapted to mining tree-based and graph-based data, e.g., linked data based on RDF triples, especially for knowledge and ontology engineering purposes, and text mining. Thirdly, the machinery of FCA is traceable, as the way classes of individuals are built and the related decision making can be straightforwardly understood. Finally, interactivity and incrementality can be combined within the FCA framework for allowing "local" as well as "global" data analysis.
Additional comments
Supervision and Practical Aspects of the Thesis.
Keywords: Formal Concept Analysis, Pattern Structures, pattern mining, mining of complex data, platform design, interoperability, algorithms.
Skills and profile of the candidate: The candidate should have a Master's degree in Computer Science and/or Data Science. Some background in Formal Concept Analysis, knowledge discovery, and data mining algorithms (numerical and/or symbolic approaches) will be highly appreciated.
The candidate should apply on the CNRS portal, "Portail Emploi CNRS" (https://emploi.cnrs.fr/), and should provide a recent curriculum vitae, a motivation letter, two recommendation letters or the names of two referees, as well as the transcripts of the last three academic years (Bachelor's and Master's degrees, or Engineering School).
Web site for additional job details
https://emploi.cnrs.fr/Offres/Doctorant/UMR7503-AMENAP-004/Default.aspx
Required Research Experiences
Mathematics › Algorithms
Computer science: Master Degree or equivalent
Mathematics: Master Degree or equivalent
FRENCH: Basic
Contact Information