How did music evolve its own distinct language, capable of articulating complex structures independent of text?
The ChordIA-CM project investigates the “big bang” of Western music: the emergence and consolidation of the tonal system.
To achieve this, our synergistic project bridges two worlds: the musicological expertise of the Instituto Complutense de Ciencias Musicales (ICCMU-UCM) and the cutting-edge AI research at the Natural Language Processing and Information Retrieval Research Center (LENAR-UNED).
Focusing on the critical period between 1580 and 1750, ChordIA-CM tackles a fundamental question: How and when did harmonic planning—the syntax of chords—break free from the constraints of poetic structure? This revolutionary autonomy marks the true origin of “absolute” music as we know it today.
Unlike previous studies limited by simple formats like MIDI, ChordIA is building a vast corpus of annotated digital scores using MusicXML, offering a far richer and structurally precise representation of the music.
This corpus forms the foundation for applying advanced Natural Language Processing (NLP) techniques and training Large Language Models (LLMs). We treat music not merely as a sequence of notes, but as a complex language with its own distinct grammar and hierarchies.
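To make the contrast with MIDI concrete, here is a minimal sketch of one kind of information MusicXML preserves and MIDI discards: the spelling of a pitch. The fragment below is a hypothetical example written for illustration, not an excerpt from the ChordIA corpus, and the helper name read_pitches is an assumption. It uses only the Python standard library and reads the MusicXML step, alter, and octave elements. C sharp and D flat come out as the same MIDI note number but stay distinct in the score encoding, a distinction that matters for harmonic analysis.

```python
import xml.etree.ElementTree as ET

# Hypothetical MusicXML fragment (illustration only, not corpus data):
# two notes that sound identical on a keyboard but are spelled differently.
MUSICXML = """<score-partwise version="4.0">
  <part-list><score-part id="P1"><part-name>Voice</part-name></score-part></part-list>
  <part id="P1">
    <measure number="1">
      <note>
        <pitch><step>C</step><alter>1</alter><octave>4</octave></pitch>
        <duration>1</duration>
      </note>
      <note>
        <pitch><step>D</step><alter>-1</alter><octave>4</octave></pitch>
        <duration>1</duration>
      </note>
    </measure>
  </part>
</score-partwise>"""

# Semitone offset of each natural note name above C.
STEP_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
# Text suffix for each whole-semitone alteration.
ACCIDENTALS = {-2: "bb", -1: "b", 0: "", 1: "#", 2: "##"}


def read_pitches(xml_text):
    """Return (spelled name, MIDI note number) for every pitched note."""
    root = ET.fromstring(xml_text)
    result = []
    for pitch in root.iter("pitch"):
        step = pitch.findtext("step")
        # <alter> is optional in MusicXML; this sketch assumes whole semitones.
        alter = int(float(pitch.findtext("alter", default="0")))
        octave = int(pitch.findtext("octave"))
        # MIDI convention used here: C4 (middle C) is note number 60.
        midi = 12 * (octave + 1) + STEP_TO_SEMITONE[step] + alter
        result.append((f"{step}{ACCIDENTALS[alter]}{octave}", midi))
    return result


pitches = read_pitches(MUSICXML)
print(pitches)  # [('C#4', 61), ('Db4', 61)]
```

Both notes map to MIDI note 61, so a MIDI-only corpus cannot tell them apart, while the MusicXML source keeps the two spellings separate.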
Our research is structured around five goals:
Build a massive dataset of thousands of digitized works from 1580–1750, enriched with standardized harmonic annotations.
Develop optimized sequencing formats and novel tokenization methods for effectively training LLMs on symbolic music data.
Conduct a diachronic study of the evolving relationship, and progressive separation, between poetic structure and harmonic planning.
Design an architecture of AI agents and symbolic models capable of deep harmonic analysis and recognizing complex structural patterns.
Establish a standardized evaluation framework and a public leaderboard to benchmark the performance of musical AI models.
ChordIA will generate tangible impact beyond academia:
Open resources: the corpus and algorithmic tools will be published open-access, benefiting educators, researchers, performers, and creators alike.
New tools: our AI models will drive applications in assisted music education, heritage recovery, and contemporary artistic creation.
An innovation hub: the project fosters collaboration between universities and cultural institutions, strengthening the fields of Digital Humanities and Computational Musicology.
ChordIA doesn’t just apply AI to music; it develops a pioneering methodology to fill a critical gap in musicology by focusing analysis on the period before 1775, when tonality was still being forged.
The ultimate goal is to lay the scientific and strategic foundations for a future ERC Synergy Grant, consolidating an interdisciplinary team capable of leading the development of the next generation of AI for the analysis and generation of symbolic music.
Álvaro is a Professor of Musicology at the Universidad Complutense de Madrid (UCM) and has served as the Director of the Instituto Complutense de Ciencias Musicales (ICCMU) since 2014. Holding a Ph.D. from the University of Cambridge, he is a leading authority on the cultural history of music, with a specific focus on 17th- and 18th-century Italian opera and religious genres. His innovative approach to research is highlighted by his leadership of the ERC Advanced Grant project DIDONE, which successfully combined historical musicology with data science to map emotions in opera. With a distinguished international profile, he served as Director-at-Large of the International Musicological Society (2007–2017) and has been a visiting scholar at prestigious institutions including Yale, NYU, and Harvard.
Salvador is an Associate Professor at the School of Computer Engineering at the Universidad Nacional de Educación a Distancia (UNED) and a prominent figure in the field of Artificial Intelligence applied to Digital Humanities. His research expertise lies in Natural Language Processing (NLP) and the development of semantic web technologies for cultural heritage. He has led significant European research initiatives, serving as Principal Investigator for the ERC Proof of Concept project LyrAIcs and the H2020 project CLS INFRA, while also playing a key role in the ERC Starting Grant POSTDATA. Additionally, he formerly directed the Innovation Lab in Digital Humanities (LINHD) and is the founder of CLARIAH-ES, the Spanish node for the European research infrastructure for the Arts and Humanities.