2023-05854 - PhD Position F/M Knowledge-based reinforcement learning and knowledge evolution Campagne DOC MI-NF-GRE-2023
Contract type: Fixed-term contract
Level of qualifications required: Graduate degree or equivalent
Function: PhD Position
Level of experience: Recently graduated
About the research centre or Inria department
The Inria Grenoble - Rhône-Alpes research center groups together almost 600 people in 22 research teams and 7 research support departments.
Staff is present on three campuses in Grenoble, in close collaboration with other research and higher education institutions (Université Grenoble Alpes, CNRS, CEA, INRAE, …), but also with key economic players in the area.
Inria Grenoble - Rhône-Alpes is active in the fields of high-performance computing, verification and embedded systems, modeling of the environment at multiple levels, and data science and artificial intelligence. The center is a top-level scientific institute with an extensive network of international collaborations in Europe and the rest of the world.
Context
Doctoral school: MSTII, Université Grenoble Alpes.
Advisors: Jérôme Euzenat and Jérôme David.
Group: The work will be carried out in the mOeX team, common to Inria and LIG, and is related to the MIAI Knowledge communication and evolution chair. mOeX is dedicated to studying knowledge evolution through adaptation. It gathers researchers who have taken an active part, over the past 15 years, in the development of the semantic web and more specifically of ontology matching and data interlinking.
Place of work: The position is located at Inria Grenoble - Rhône-Alpes in Montbonnot, a major computer science research laboratory, in a stimulating research environment.
Assignment
Cultural knowledge evolution and multi-agent reinforcement learning share some of their prominent features. Putting explicit knowledge at the heart of the reinforcement process may contribute to better explanation and transfer.
Cultural knowledge evolution deals with the evolution of knowledge representation in a group of agents. For that purpose, cooperating agents adapt their knowledge to the situations they are exposed to and to the feedback they receive from others. This framework has been considered in the context of evolving natural languages [Steels, 2012]. We have applied it to ontology alignment repair, i.e. the improvement of incorrect alignments [Euzenat, 2017], and to ontology evolution [Bourahla et al., 2021]. We have shown that it converges towards successful communication through improving the intrinsic quality of the knowledge.
Reinforcement learning is a learning mechanism that adapts the decision-making process of agents so as to maximise the reward the environment provides for the actions they perform [Sutton and Barto, 1998]. Many multi-agent versions of reinforcement learning have also been proposed, depending on the agents' attitude (cooperative, competitive) and on the task structure (homogeneous, heterogeneous) [Buşoniu et al., 2010].
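To make the reward-driven mechanism concrete for non-specialist readers, here is a minimal tabular Q-learning sketch in Python on a hypothetical four-state chain task; the environment, reward scheme and parameter values are illustrative assumptions and not part of the thesis proposal.

import random
from collections import defaultdict

# Illustrative tabular Q-learning on a toy four-state chain (a generic textbook example,
# not taken from the call): the agent learns to move right to reach the rewarding state 3.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1      # learning rate, discount factor, exploration rate
ACTIONS = (-1, +1)                         # move left or right along the chain

def step(state, action):
    """Toy environment: states 0..3; reaching state 3 yields reward 1 and ends the episode."""
    nxt = min(3, max(0, state + action))
    return nxt, (1.0 if nxt == 3 else 0.0), nxt == 3

q = defaultdict(float)                     # Q[(state, action)], initialised to 0
for _ in range(300):                       # episodes
    state, done = 0, False
    while not done:
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)                     # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit current estimates
        nxt, reward, done = step(state, action)
        # temporal-difference update towards reward plus discounted best future value
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = nxt

print(max(ACTIONS, key=lambda a: q[(0, a)]))  # greedy action learned in state 0 (here: +1)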
From an external perspective, the two approaches operate in a similar manner: agents perceive their environment, perform an action, receive a reward or a punishment, and adapt their behaviour accordingly. However, a look into the inner mechanisms reveals important differences: the emphasis on knowledge quality instead of reward maximisation, the lack of a probabilistic or even gradual interpretation, and even the absence of explicit choice in action or adaptation. Hence, these two knowledge acquisition techniques are close enough to suggest replacing one by the other, and different enough to cross-fertilise.
This PhD thesis aims at further exploring the commonalities and differences between experimental cultural knowledge evolution and reinforcement learning. In particular, its purpose is to study which features of one technique may be fruitful in the context of the other, and which may not.
For that purpose, one research direction is the introduction of knowledge-based reinforcement learning. In knowledge-based reinforcement learning, the decision-making process (the choice of the action to be performed) relies on accumulated explicit knowledge. Thus, the adaptation performed after a reward or a punishment will have to directly affect this knowledge. This has the advantage of making it possible to explain the decisions made by agents. It will also allow explicit knowledge to be exchanged among them [Leno da Silva et al., 2018].
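As a purely hypothetical illustration of this idea, and not the mechanism to be designed during the thesis, the following Python sketch has an agent read its decision off explicit observation-to-action rules, return the rule used as an explanation, and repair the offending rule when punished; the rule format and the repair operator are arbitrary choices made for brevity.

# Hypothetical sketch: decisions are read off explicit rules, so they can be explained,
# and adaptation after punishment rewrites the faulty rule instead of updating a value table.
knowledge = {"red": "go", "green": "go"}                  # initially (partly) incorrect rules

def decide(observation):
    action = knowledge.get(observation, "wait")           # unknown observations: safe default
    return action, f"rule {observation!r} -> {action!r}"  # the rule used doubles as explanation

def adapt(observation, action, reward):
    if reward < 0:                                        # punishment: repair the rule that fired
        knowledge[observation] = "wait" if action == "go" else "go"

action, explanation = decide("red")                       # -> 'go', explained by the faulty rule
adapt("red", action, reward=-1)                           # the environment punishes going on red
print(knowledge)                                          # {'red': 'wait', 'green': 'go'}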
This promotes a less utilitarian view of knowledge, in which the evaluation of the performance of the system is disconnected from reward maximisation and depends instead on the quality of the acquired knowledge. Of course, these two aspects need to remain related (the acquired knowledge must be relevant to the environment). This separation between knowledge and reward is useful when agents have to change environment or to use their knowledge to perform various tasks.
Another use of reinforcement mechanisms relevant to cultural knowledge evolution is related to the motivation for agents to explore unknown knowledge territories [Colas et al., 2019]. By associating an intrinsic reward with newly acquired knowledge, agents are able to improve the coverage of their knowledge in a way not guided by the environment. Complementing cultural knowledge evolution with such an exploration motivation should make agents more active in their understanding of the environment and in their knowledge acquisition.
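One simple way to operationalise such a motivation, given below as an assumption-laden sketch rather than a prescribed design, is a count-based bonus: states (or pieces of knowledge) that the agent has rarely encountered earn an intrinsic reward which is added to the extrinsic one and decays with familiarity.

from collections import Counter
from math import sqrt

# Count-based intrinsic reward: rarely seen states earn an exploration bonus on top of the
# environment's reward. The 1/sqrt(n) form and the beta weight are illustrative choices.
visits = Counter()

def total_reward(state, extrinsic_reward, beta=0.5):
    visits[state] += 1
    intrinsic = beta / sqrt(visits[state])  # decays as the state becomes familiar
    return extrinsic_reward + intrinsic

print(total_reward("s0", 0.0))              # 0.5 on the first visit
print(total_reward("s0", 0.0))              # ~0.35 on the second visit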
These problems may be treated both theoretically and experimentally.
References:
[Bourahla et al., 2021] Yasser Bourahla, Manuel Atencia, Jérôme Euzenat, Knowledge improvement and diversity under interaction-driven adaptation of learned ontologies, Proc. 20th AAMAS, London (UK), pp. 242-250, 2021. https://moex.inria.fr/files/papers/bourahla2021a.pdf
[Buşoniu et al., 2010] Lucian Buşoniu, Robert Babuška, Bart De Schutter, Multi-agent reinforcement learning: an overview, chapter 7 of D. Srinivasan, L. C. Jain (eds.), Innovations in Multi-Agent Systems and Applications – 1, Springer, Berlin (DE), pp. 183-221, 2010. https://www.dcsc.tudelft.nl/~bdeschutter/pub/rep/10003.pdf
[Colas et al., 2019] Cédric Colas, Pierre-Yves Oudeyer, Olivier Sigaud, Pierre Fournier, Mohamed Chetouani, Curious: intrinsically motivated modular multi-goal reinforcement learning, Proc. 36th ICML, Long Beach (CA US), pp. 1331-1340, 2019. https://proceedings.mlr.press/v97/colas19a/colas19a.pdf
[Euzenat, 2017] Jérôme Euzenat, Communication-driven ontology alignment repair and expansion, Proc. 26th IJCAI, Melbourne (AU), pp. 185-191, 2017. https://moex.inria.fr/files/papers/euzenat2017a.pdf
[Leno da Silva et al., 2018] Felipe Leno da Silva, Matthew Taylor, Anna Helena Reali Costa, Autonomously reusing knowledge in multiagent reinforcement learning, Proc. 27th IJCAI, pp. 5487-5493, 2018. https://www.ijcai.org/proceedings/2018/0774.pdf
[Steels, 2012] Luc Steels (ed.), Experiments in cultural language evolution, John Benjamins, Amsterdam (NL), 2012.
[Sutton and Barto, 1998] Richard Sutton, Andrew Barto, Reinforcement learning: an introduction, The MIT Press, Cambridge (MA US), 1998 (2nd ed. 2018). https://incompleteideas.net/book/RLbook2020.pdf
Skills required:
Languages: English mandatory
Benefits package
Qualification: Master's degree or equivalent in computer science.
Please contact us in advance if possible.
About Inria
Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 200 agile project teams, generally run jointly with academic partners, include more than 3,500 scientists and engineers working to meet the challenges of digital technology, often at the interface with other disciplines. The Institute also employs numerous talents in over forty different professions. 900 research support staff contribute to the preparation and development of scientific and entrepreneurial projects that have a worldwide impact.
Instructions to apply
Defence security: This position is likely to be situated in a restricted area (ZRR), as defined in Decree No. 2011-1425 relating to the protection of national scientific and technical potential (PPST). Authorisation to enter an area is granted by the director of the unit, following a favourable Ministerial decision, as defined in the decree of 3 July 2012 relating to the PPST. An unfavourable Ministerial decision in respect of a position situated in a ZRR would result in the cancellation of the appointment.
Recruitment Policy: As part of its diversity policy, all Inria positions are accessible to people with disabilities.
Warning: you must enter your e-mail address in order to save your application to Inria. Applications must be submitted online on the Inria website. The processing of applications sent through other channels is not guaranteed.