
Enabling Multilingual Conversational AI: MultiConvAI
Details
Locations: UK
Start Date: Jan 1, 2021
End Date: Jun 30, 2022
Contract value: EUR 150,000
Sectors: Information & Communication Technology
Description
Programme(s): H2020-EU.1.1. - EXCELLENT SCIENCE - European Research Council (ERC)
Topic(s): ERC-2020-POC - Call for proposals for ERC Proof of Concept Grant
Call for proposal: ERC-2020-PoC
Funding Scheme: ERC-POC-LS - ERC Proof of Concept Lump Sum Pilot
Grant agreement ID: 957356
Project description
Training speech recognition algorithms to speak more languages
Say hello to Apple’s Siri, Amazon’s Echo and Google’s Assistant. But in which language? These task-based statistical dialogue systems (SDSs) are not available in all languages, which limits the global reach of conversational artificial intelligence (AI). The EU-funded MultiConvAI project will develop the first prototype system for scaling conversational AI to multiple languages. Built on a new methodology that learns multilingual word representations, the system will use a process called semantic specialisation. The project will develop Natural Language Understanding (NLU) modules for SDSs via more effective semantic specialisation based on joint multi-source, multi-target training. It will also focus on typologically diverse languages.
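As a rough illustration of what semantic specialisation means in this context, the sketch below fine-tunes a handful of toy word vectors so that synonym pairs taken from an external lexicon are pulled together and antonym pairs pushed apart, while a regulariser keeps every vector close to its original position. The word pairs, vector dimensionality and hyper-parameters are invented for the example and are not the project's actual method or data.

import numpy as np

rng = np.random.default_rng(0)
vocab = ["cheap", "inexpensive", "pricey", "expensive"]
dim = 8
vectors = {w: rng.normal(size=dim) for w in vocab}        # toy "distributional" embeddings
original = {w: v.copy() for w, v in vectors.items()}

synonyms = [("cheap", "inexpensive"), ("pricey", "expensive")]   # ATTRACT constraints
antonyms = [("cheap", "expensive"), ("inexpensive", "pricey")]   # REPEL constraints

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

lr, reg, margin, epochs = 0.05, 0.1, 0.6, 200
for _ in range(epochs):
    for a, b in synonyms:
        # pull synonym vectors towards each other while they are not yet similar enough
        if cosine(vectors[a], vectors[b]) < margin:
            va, vb = vectors[a].copy(), vectors[b].copy()
            vectors[a] += lr * (vb - va)
            vectors[b] += lr * (va - vb)
    for a, b in antonyms:
        # push antonym vectors apart while they are still too similar
        if cosine(vectors[a], vectors[b]) > -margin:
            va, vb = vectors[a].copy(), vectors[b].copy()
            vectors[a] -= lr * (vb - va)
            vectors[b] -= lr * (va - vb)
    for w in vocab:
        # regularise towards the original vectors so distributional knowledge is preserved
        vectors[w] += reg * (original[w] - vectors[w])

print("cheap ~ inexpensive:", round(cosine(vectors["cheap"], vectors["inexpensive"]), 2))
print("cheap ~ expensive:  ", round(cosine(vectors["cheap"], vectors["expensive"]), 2))

After specialisation, near-synonyms such as "cheap" and "inexpensive" end up with high similarity while antonyms drift apart, which is the property the NLU modules rely on when interpreting user utterances.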
Objective
In the recent past, Conversational Artificial Intelligence (AI) has made major advances, thanks to the availability of big data and increasingly powerful deep learning. Task-based statistical dialogue systems (SDSs) are now viable, embedded in popular commercial applications (e.g. Apple’s Siri, Amazon’s Echo, Google’s Assistant) and cost-effective in many scenarios (e.g. customer support, call-centre service, searching, booking). Yet current SDSs are only available for a handful of resource-rich languages, leaving the majority of the world’s languages and their speakers behind. Our project will develop the first prototype system for scaling conversational AI to multiple languages. It will be based on a new methodology that learns multilingual word representations (i.e. embeddings, WEs) without the need for expensive training data, using a process called semantic specialisation that complements WEs with common-sense and linguistic knowledge from external knowledge graphs. Building on our promising pilot studies, we will develop Natural Language Understanding (NLU) modules for SDSs via 1) more effective semantic specialisation based on joint multi-source, multi-target training; and 2) a focus on typologically diverse languages. We foresee a pioneering use of selective sharing and structural adaptation for obtaining WEs, with optimisation for the target languages guided by typological knowledge. The best resulting technology will be integrated into a demo prototype system which users and industries can deploy to generate multilingual NLU input for more widely portable SDSs. Since we also plan to explore the possibility of forming a start-up company, we will use the system to demonstrate its potential to our network of industry contacts and potential customers. On a larger scale, extending the multilingual scope of SDSs can have major socio-economic benefits: it can broaden the global reach of conversational AI and enhance its commercial viability.
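To make the portability argument concrete, the following minimal sketch shows how an NLU component trained only on English utterances can label input in another language once both languages share one multilingual embedding space. The tiny hand-built "shared space", the English/German word pairs and the intent data are illustrative stand-ins, not project resources, and the nearest-centroid classifier is a deliberately simple proxy for a real NLU module.

import numpy as np

rng = np.random.default_rng(1)

# Shared cross-lingual space: translation pairs map to (roughly) the same vector.
concepts = {"book": 0, "table": 1, "cheap": 2, "restaurant": 3, "taxi": 4, "airport": 5}
concept_vecs = rng.normal(size=(len(concepts), 16))
shared_space = {}
for en, de in [("book", "buchen"), ("table", "tisch"), ("cheap", "billiges"),
               ("restaurant", "restaurant"), ("taxi", "taxi"), ("airport", "flughafen")]:
    base = concept_vecs[concepts[en]]
    shared_space[en] = base + 0.01 * rng.normal(size=16)
    shared_space[de] = base + 0.01 * rng.normal(size=16)

def encode(utterance):
    # Bag-of-embeddings sentence representation: average the known word vectors.
    vecs = [shared_space[w] for w in utterance.lower().split() if w in shared_space]
    return np.mean(vecs, axis=0)

# Fit intent centroids on English training utterances only.
train = {
    "find_restaurant": ["cheap restaurant", "book table restaurant"],
    "get_taxi": ["taxi airport", "book taxi"],
}
centroids = {intent: np.mean([encode(u) for u in utts], axis=0)
             for intent, utts in train.items()}

def classify(utterance):
    enc = encode(utterance)
    return max(centroids, key=lambda i: enc @ centroids[i] /
               (np.linalg.norm(enc) * np.linalg.norm(centroids[i])))

# German test utterances are handled by the English-trained module.
print(classify("billiges restaurant"))   # -> find_restaurant
print(classify("taxi flughafen"))        # -> get_taxi

The design point is that all language-specific knowledge is confined to the shared embedding space; the downstream NLU logic never changes, which is what makes the dialogue system portable to new, including typologically diverse, languages.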