The CORDIAL-AI project has produced technical and research outputs that explore the use of generative AI in research data services.
CORDIAL-AI prototype
A working web-based prototype was developed to test how natural-language queries can retrieve tailored subsets of census flow data. Users submit queries through a conversational interface, and the system generates structured API requests that return deterministic results.
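To make the idea of a structured API request concrete, the following is a minimal sketch rather than the prototype's actual implementation. The endpoint URL, the `FlowRequest` class, and the parameter names (`origin`, `destination`, `year`, `measure`) are all hypothetical, chosen only to illustrate how an interpreted query might be reduced to a fixed set of fields and serialised the same way every time.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only (not the project's real API).
BASE_URL = "https://example.org/api/flows"


@dataclass(frozen=True)
class FlowRequest:
    """A structured request for one origin-destination flow (assumed fields)."""

    origin: str
    destination: str
    year: int
    measure: str = "commuting"

    def to_url(self) -> str:
        params = {
            "origin": self.origin,
            "destination": self.destination,
            "year": self.year,
            "measure": self.measure,
        }
        # Sort the parameters so identical requests always produce
        # an identical URL, whatever order the fields were filled in.
        return f"{BASE_URL}?{urlencode(sorted(params.items()))}"


# Example: the structured form a query such as
# "commuting flows from Leeds to Manchester in 2011" might be mapped to.
request = FlowRequest(origin="Leeds", destination="Manchester", year=2011)
print(request.to_url())
```

In a design of this kind the language model only fills in the request fields, and the data itself comes from the API call, which is one way to keep results repeatable for the same request.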
Training datasets for AI systems
The project created supervised training datasets consisting of natural-language queries paired with corresponding API requests. These datasets were used to train and evaluate language models for structured data retrieval tasks. Synthetic datasets were also generated computationally to expand the range of possible query variations.
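As an illustration of how query and request pairs could be generated computationally, here is a minimal sketch based on filling templates. It does not reproduce the project's pipeline: the templates, place names, years, request fields, and the `generate_pairs` and `to_jsonl` helpers are assumptions made for the example.

```python
import itertools
import json

# Hypothetical templates and values, for illustration only.
TEMPLATES = [
    "How many people travelled from {origin} to {destination} in {year}?",
    "Show {year} flows between {origin} and {destination}.",
]
PLACES = ["Leeds", "Manchester", "Bristol"]
YEARS = [2001, 2011]


def generate_pairs():
    """Return a list of {"query": ..., "request": ...} training examples."""
    pairs = []
    combos = itertools.product(TEMPLATES, PLACES, PLACES, YEARS)
    for template, origin, destination, year in combos:
        if origin == destination:
            continue  # skip flows from a place to itself
        query = template.format(origin=origin, destination=destination, year=year)
        request = {"origin": origin, "destination": destination, "year": year}
        pairs.append({"query": query, "request": request})
    return pairs


def to_jsonl(pairs):
    """Serialise the pairs as JSON Lines, one example per line."""
    return "\n".join(json.dumps(pair, sort_keys=True) for pair in pairs)


pairs = generate_pairs()
print(len(pairs))
print(to_jsonl(pairs[:2]))
```

Adding a template or a place multiplies the number of examples, which is how a small set of hand-written patterns can be expanded to cover many phrasings of the same request.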
Methodological research
CORDIAL-AI contributes methodological insights into the use of generative AI in structured data environments. These include work on query interpretation, agent-based AI architectures, training pipelines for domain-specific language models, and approaches to transparency and provenance.
Publications and dissemination
The project’s findings have been shared through conference presentations, workshops, practitioner engagement events, and a forthcoming book chapter. These activities contribute to wider discussions about responsible uses of AI in research data infrastructures.
Engagement with data services and researchers
CORDIAL-AI has engaged with researchers, data service providers, and policy practitioners through workshops, advisory panels, training sessions, and conference presentations. These interactions have helped shape the development of the prototype and inform discussions about the future role of generative AI in data services.