CORDIAL-AI is a research pilot funded under the ESRC Future Data Services programme that investigates how generative artificial intelligence can support the discovery and retrieval of complex statistical datasets.
The project focuses on UK census origin-destination (flow) data, which describe movements between locations, such as commuting, migration, and second-residence patterns. These datasets are among the most complex census resources, involving large volumes of numerical data, extensive code lists, multiple geographic hierarchies, and intricate relational structures.
Although these datasets provide valuable insights for research and policymaking, their complexity can make them difficult to locate and use effectively. CORDIAL-AI explores whether natural-language interaction with AI systems can help reduce these barriers.
The project investigates how large language models can interpret user queries and translate them into structured requests executed through a dedicated census flow data API. In this architecture, the AI system interprets user intent, while all data retrieval and processing occur within the underlying data platform. This separation enables natural-language interaction with complex datasets while keeping results deterministic, verifiable, and derived from authoritative sources.
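The separation described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the `FlowRequest` fields, the allow-lists, the place codes, and the toy counts are all hypothetical, and the language model step is represented only by a JSON string that such a model might produce. The point is that the model's output is validated as a structured request, and the numbers come solely from a deterministic lookup.

```python
import json
from dataclasses import dataclass

# Hypothetical vocabularies; the real API's parameters are not given in the text.
FLOW_TYPES = {"commuting", "migration", "second_residence"}
GEOGRAPHIES = {"region", "local_authority"}


@dataclass(frozen=True)
class FlowRequest:
    """Structured request that the model is asked to produce (assumed shape)."""
    flow_type: str
    geography: str
    origin: str
    destination: str


def parse_request(llm_output: str) -> FlowRequest:
    """Validate model output (assumed to be JSON) before any data is touched."""
    fields = json.loads(llm_output)
    request = FlowRequest(**fields)  # raises TypeError on missing or extra keys
    if request.flow_type not in FLOW_TYPES:
        raise ValueError(f"unknown flow type: {request.flow_type!r}")
    if request.geography not in GEOGRAPHIES:
        raise ValueError(f"unknown geography: {request.geography!r}")
    return request


# Invented toy counts standing in for the data platform; not real census values.
TOY_FLOWS = {
    ("commuting", "region", "A", "B"): 120,
    ("commuting", "region", "B", "A"): 45,
}


def execute(request: FlowRequest) -> int:
    """Deterministic lookup: the same request always returns the same count."""
    key = (request.flow_type, request.geography,
           request.origin, request.destination)
    return TOY_FLOWS[key]


# Example: a string a model might emit for "commuters from A to B".
llm_output = json.dumps({
    "flow_type": "commuting",
    "geography": "region",
    "origin": "A",
    "destination": "B",
})
count = execute(parse_request(llm_output))
```

In this sketch the model never generates a figure itself. A malformed or out-of-vocabulary request is rejected by `parse_request` before it reaches `execute`, which is one way the interpretation step could be kept separate from retrieval.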
CORDIAL-AI also examines broader methodological questions about the use of generative AI in research data services, including transparency, provenance, explainability, and responsible use.