Computer Lyric Generation

Ben Bay


Lyrist is a corpus-based lyric-generation system that may be used in conjunction with a music-generating system. It draws from and expands upon ideas from previous researchers, greatly increasing the scale of the data and the number of program features. Independent of any music, it produces lyrics based on a given template from a database of over one million pop songs. Its data model comes from a variety of English corpora containing billions of words and millions of texts. Its Word2Vec operations, combined with a wide array of customizable filters, are designed to keep generated lyrics high in quality and specific to the user’s constraints. Program operations may be probabilistic through automatic random decisions, or deterministic through user input. Additionally, Rhyme-complete is a built-in system that allows fine control over phoneme matching and comparison. The template-replacement method is powerful, but its dependence upon a large database ultimately renders it simplistic. I recognize this is an initial foray into natural language generation for songs, and I intend to build upon Lyrist in future versions.


Computer technology is massively widespread and influential. But due to a commonly perceived separation of “technical tasks” and “creative tasks”, most hold the belief that computers cannot be creative. Music is a highly valued form of human creativity. It is ubiquitous in modern life. It benefits its individual listeners by regulating moods, creating feelings of transcendence, helping productivity, and accelerating mental development. I submit that creative tasks can be imitated by technical processes. This is a vital idea to propagate; when computers confirm this often unpopular notion it will shift the public eye, popularize computational creativity, and facilitate funding for the advancement of artificial intelligence.

Songwriting has always been viewed as a “humans-only” realm of artistry. The human-run music, film, and video game industries are already very profitable. Yet computer processes are much faster than human processes, and demand for new kinds of music is essentially limitless. Though there is already more than enough recorded music to last a human lifetime, industries and people will never stop seeking new music. In many cases these entities commission their music rather than writing it themselves.

That is why I am building Lyrist, a lyric-generating system. It will help to satisfy the great interest in new music, and help to alter the incredulous public mentality on computational creativity.

Lyric generation is an important problem to solve, because it is integral to the greater problem of generating complete songs. I am building Lyrist with this in mind; it has the capacity to work alongside a music generator. Lyrist-generated lyrics are independent of music, but they may be injected into music-generating systems, which would then build around them, determining rhythm based on the stress patterns of the injected lyrics. Computers can explore the profound complexities and mathematical patterns found in musical verse more deeply than ever before. Innovative writing styles may come about through Lyrist.

Pop is by definition popular, the music of the majority. It embodies the zeitgeist and reflects the movements of a time period. Pop is ubiquitously enjoyed and related to, and yet research on popular song lyric generation at the scale I am proposing is, to my knowledge, unprecedented (see Methods).

Lyrist generates song lyrics using lyrical templates. It uses these to generate new corpus-based songs with the same structure as the original template. This method was chosen in order to maintain morphological and global semantic cohesion. It generates lyrics for any musical genre, or even words for poems.

Rhyme-complete is a subsystem of Lyrist. In addition to being capable of integration into larger projects, it will work as a standalone tool. Rhyme-complete provides the most complex of Lyrist’s filters by managing all phonetic constraints for replacement words.

Lyric Generation

Research in song lyric generation is sparse, especially surrounding the genre of popular music, perhaps because it is viewed as “low-brow”. However, the following research projects informed the elements I chose to include in Lyrist.

To generate song lyrics, Toivanen et al. use lyrical templates derived from existing songs (Brownstein, Yangarber, and Astagneau 2013). They pick a template and replace a subset of its words, constraining lyric replacement to 50%. This method is simple and effective, and carries little risk of losing morphological cohesion or global semantic cohesion. Its downside is its dependence upon existing songs.

Nguyen and Sa researched rap lyric generation (Hieu Nguyen 2009). They used lyrics from 40,000 rap songs to generate raps with predefined song structures (e.g., verse, chorus, verse, chorus). They generated multiple candidate sentences using n-gram models from their rap database, scored each candidate sentence according to six rules of rap, then chose the highest-scoring candidate sentences for inclusion in the rap. Their rules considered the probability of the sentence occurring under their language model; the probability of the sentence’s length occurring; the term frequency and inverse corpus frequency of each word in the sentence; whether the last word of the sentence rhymed with the last word of the previous sentence; whether the last word of the sentence rhymed with another word in the sentence; and whether the last word of the sentence had the same number of syllables as the last word of the previous sentence. They saw moderate success with these rules. They also experimented with moving a song’s theme forward or backward from some “pivot word”. However, this technique generated mostly low-quality fragments, because their desired fragment length was far smaller than the average sentence length in their corpus.

Oliveira’s system Tra-la-Lyrics (Oliveira 2015) generates text based on a given rhythm, rather than drawing a rhythm out of given lyrics.

Poem Generation

Though not strictly focused on song lyrics, research on natural language generation through poetry is instructive. Gervás et al. explored the challenges of automatic poem generation (Gervás, Hervás, and Robinson 2007). Among their concerns was the difficulty of computers aesthetically evaluating texts. They list these possible solutions: understanding phonetics, using phonetic knowledge to drive poem generation, managing vocabulary, comparison, analogy and metaphor, and dealing with emotions.

Colton et al. designed a corpus-based poetry generator (Colton, Goodwin, and Veale 2012). Their system used poem templates, and constructed a mood for the day by analyzing current newspaper articles. They used a corpus of 21,984 similes, which were described as tuples of object, aspect, description. They used four measures to describe the aesthetic of generated poems: appropriateness, flamboyance, lyricism, and relevancy.

Methods



For my training data, I combine my pop song database of over a million songs, the Wikipedia corpus with 1.9 billion words, the Google News corpus with about 100 billion words, and full texts of the Davies corpora. In Table 1 I have listed corpus training weights for my Master English Model.

Corpus | Weight
GloWbE | 5
COCA | 5
Pop songs | 3
COHA | 2
Wikipedia | 1
Google News | 1

Table 1: Proportional Corpora Weights. These are the proportional weights assigned to my corpora after correcting for number of words.

Davies corpora. High-quality bodies of English text, including the Corpus of Global Web-based English (GloWbE), the Corpus of Contemporary American English (COCA), and the Corpus of Historical American English (COHA). They are some of the largest, most widely-used, and best English text corpora available (Davies 2009).

GloWbE. Contains about 1.9 billion English words collected from twenty different countries. This allows insight into variation in English.

COCA. Contains more than 520 million words of text and is the only large and balanced corpus of American English. It is probably the most widely-used English corpus. It is equally divided among text types (spoken, fiction, popular magazines, newspapers, and academic).

COHA. Contains over 400 million words of text from the 1810s to the 2000s, and is 50 times larger than the next-largest structured historical corpus of English. This allows insight into changes in English over time.

Word2Vec. Models language by assigning each word in a corpus to a point in a many-dimensional vector space, based on the word’s proximity to other words in the corpus (Mikolov et al. 2013). I use Word2Vec to build and manipulate a master model of the English language.

Reason for master model. Diverse customizable filters let me use a single master model rather than many. Instead of training a new vector model for every possible genre of writing, dialect of speech, and time period, I train one all-encompassing model and constrain its word suggestions with any quantity and combination of filters. This allows for specific preferences with regard to a generated piece’s thematic, temporal, geographical, and authorial influences while maintaining only one language model.


Replacement by template. An effective method of generating song lyrics. It involves retrieving a template song from a large database, analyzing its lyrical structure (Manning et al. 2014), and using that structure to produce an entirely new piece with different words.

High-level replacement procedure. Every word in the original song is replaced. Each replacement requires the following process: Lyrist uses arithmetic operations on the Word2Vec model to generate a large list of words based on the original word. Each list entry is a suggested replacement for that word. A system of filters and constraints then removes each unsuitable suggestion. The top remaining suggestion is then chosen as the replacement word.

Word2Vec word replacement. I have designed several scripts for different Word2Vec operations. These operations are similar, theme, and analogy. Each of these Word2Vec operations returns 10,000 word suggestions ordered by cosine similarity, which are then filtered. The remaining word with the highest cosine similarity is chosen. Cosine similarity is found by taking the dot product of two vectors and dividing it by the product of their lengths (equivalently, the dot product of the two vectors after each is normalized to unit length). It reveals the proximity of the Word2Vec operation’s result to the actual suggested word in the model’s vector space. Thus cosine similarity offers a good representation of how well a given result matches the requested operation.
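The replacement procedure described above (suggest, filter, choose the top remaining word) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Lyrist’s actual code: the three-dimensional vectors in `TOY_MODEL` are invented, the names `cosine_similarity`, `suggest`, and `replace_word` are hypothetical, the convention that a filter returns True for words it keeps is my own, and falling back to the original word when every suggestion is filtered out is likewise an assumption. Cosine similarity here is the dot product of the two vectors divided by the product of their lengths.

```python
import math

# Hypothetical toy "model": word -> vector. Values are invented for illustration;
# a real Word2Vec model would have hundreds of dimensions.
TOY_MODEL = {
    "love":  [0.9, 0.1, 0.0],
    "adore": [0.8, 0.2, 0.1],
    "like":  [0.7, 0.3, 0.0],
    "hate":  [-0.9, 0.1, 0.0],
    "car":   [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def suggest(point, model, exclude=(), limit=10000):
    """Return up to `limit` (similarity, word) pairs, most similar first."""
    scored = [(cosine_similarity(point, vec), word)
              for word, vec in model.items() if word not in exclude]
    scored.sort(reverse=True)
    return scored[:limit]

def replace_word(original, model, filters):
    """Pick the most similar suggestion that every filter keeps."""
    for _score, word in suggest(model[original], model, exclude={original}):
        if all(keep(word) for keep in filters):
            return word
    return original  # assumption: keep the original if nothing survives

# Usage: a made-up filter that rejects words starting with "a".
no_a_words = lambda word: not word.startswith("a")
print(replace_word("love", TOY_MODEL, [no_a_words]))  # prints: like
```

Without any filters the same call returns "adore", the nearest neighbor of "love" in this toy space; the filter removes it and the next-best suggestion is chosen instead.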

similar(). Returns the 10,000 words nearest to a given word or point. Finds words that have similar definitions or usages.

theme(). Finds the average of the vectors of all the words in a given word list, then calls similar() on that result. Effectively summarizes a line, stanza, or song.

analogy(). Takes in an old theme from theme(), a newly generated theme, and a word. It performs logical analogy arithmetic on the input in the form of “old theme is to new theme as original word is to new word”, then calls similar() on the resulting point. This is a very powerful operation, as it transforms the mood of a whole song with one simple analogy.
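The theme and analogy operations can be illustrated with toy vectors. The sketch below is not Lyrist’s implementation. It assumes the usual Word2Vec offset form of analogy arithmetic (new word ≈ original word − old theme + new theme), which the text does not spell out; the two-dimensional `VECTORS` are invented; and for brevity `theme` returns the averaged point and `analogy` returns only the single nearest word, whereas the operations described above call similar() to obtain a full ranked list.

```python
import math

# Hypothetical 2-D vectors: axis 0 loosely "mood", axis 1 loosely "weather".
VECTORS = {
    "happy":    [1.0, 0.0],
    "sad":      [-1.0, 0.0],
    "smile":    [0.9, 0.1],
    "cry":      [-0.9, 0.1],
    "sunshine": [1.0, 1.0],
    "rain":     [-1.0, 1.0],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def nearest(point, model, exclude=()):
    """Word whose vector has the highest cosine similarity to `point`."""
    candidates = [w for w in model if w not in exclude]
    return max(candidates, key=lambda w: cosine_similarity(point, model[w]))

def theme(words, model):
    """Component-wise average of the vectors of `words`."""
    vectors = [model[w] for w in words]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def analogy(old_theme, new_theme, word, model):
    """Old theme is to new theme as `word` is to the returned word."""
    target = [w - o + n for w, o, n in zip(model[word], old_theme, new_theme)]
    return nearest(target, model, exclude={word})

# Usage: shift a word from a cheerful theme toward a gloomy one.
old = theme(["happy", "smile"], VECTORS)
new = theme(["sad", "cry"], VECTORS)
print(analogy(old, new, "sunshine", VECTORS))  # prints: rain
```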

New theme generation. Lyrist finds a new theme for a piece by finding the theme of a randomly selected song from the template database.

Word Filtration. Lyrist includes a variety of word filters. By filtering out unwanted Word2Vec suggestions, these filters allow the use of only one master model for language. The filters’ constraints are highly customizable, allowing for maximum control over text generation. They may be used individually or together in any combination. Logical conjunction, disjunction, and negation may be used to define a desired net filtration. Table 2 lists my current single-responsibility filters.
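One way to picture combining filters with conjunction, disjunction, and negation is to treat each filter as a predicate and compose predicates. The sketch below is an assumption about how this could look, not Lyrist’s code: the combinator names `all_of`, `any_of`, and `negate`, the convention that a filter returns True for words it keeps, and the tiny frequency table and dictionary are all hypothetical. It builds the Uncommon Slang combination from Table 3 (Frequency and not Prescriptive Dictionary).

```python
# Invented data standing in for real frequency counts and a real dictionary.
TOY_FREQUENCY = {"the": 1000, "love": 500, "yeet": 300, "zephyr": 2}
TOY_DICTIONARY = {"the", "love", "zephyr"}

def all_of(*filters):
    """Logical conjunction: keep a word only if every filter keeps it."""
    return lambda word: all(f(word) for f in filters)

def any_of(*filters):
    """Logical disjunction: keep a word if at least one filter keeps it."""
    return lambda word: any(f(word) for f in filters)

def negate(f):
    """Logical negation: keep exactly the words that `f` would remove."""
    return lambda word: not f(word)

def frequency_filter(minimum):
    """Keep words whose frequency is at least `minimum`."""
    return lambda word: TOY_FREQUENCY.get(word, 0) >= minimum

def prescriptive_dictionary_filter():
    """Keep words that are listed in the dictionary."""
    return lambda word: word in TOY_DICTIONARY

# "Uncommon Slang": Frequency and not Prescriptive Dictionary.
uncommon_slang = all_of(frequency_filter(100),
                        negate(prescriptive_dictionary_filter()))

kept = [word for word in TOY_FREQUENCY if uncommon_slang(word)]
print(kept)  # prints: ['yeet']
```

Because each combinator returns another predicate, combined filters can themselves be nested inside further combinations.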

Rhyme-complete. A comprehensive rhyme system. It includes the Rhyme Filter, which manages all phonemic filtering in Lyrist. It identifies rhyme schemes, identifies rhymes by their literary classifications, identifies rhymes by their phoneme sequences (much like comparing nucleotide chains in genetics), and suggests new rhymes. It draws from data on phoneme similarities with confusion matrices such as the Hirjee matrix (Hirjee and Brown 2010), and employs rules established by experts in rhyme, such as Pat Pattison’s rules (Pattison 1991). It allows for complete user customization; users have absolute control over all parameters for each of its various functions.
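To make the idea of comparing phoneme sequences concrete, here is a deliberately simplified sketch. It is not Rhyme-complete: it assumes ARPAbet-style phonemes in which a trailing "1" marks primary stress, uses a small hand-written pronunciation table, treats two words as a perfect rhyme when their phonemes match from the last stressed vowel onward, and scores near-rhymes by exact phoneme matches rather than by a weighted confusion matrix such as the one the text mentions. All function names are hypothetical.

```python
# Toy pronunciation table (ARPAbet-style; "1" marks primary stress).
TOY_PRONUNCIATIONS = {
    "love":  ["L", "AH1", "V"],
    "dove":  ["D", "AH1", "V"],
    "above": ["AH0", "B", "AH1", "V"],
    "move":  ["M", "UW1", "V"],
}

def rhyme_tail(phonemes):
    """Phonemes from the last primary-stressed vowel to the end of the word."""
    for index in range(len(phonemes) - 1, -1, -1):
        if phonemes[index].endswith("1"):
            return phonemes[index:]
    return phonemes  # no primary stress marked: compare the whole word

def is_perfect_rhyme(a, b, pronunciations=TOY_PRONUNCIATIONS):
    """True when two different words share the same rhyme tail."""
    return a != b and rhyme_tail(pronunciations[a]) == rhyme_tail(pronunciations[b])

def tail_overlap(a, b, pronunciations=TOY_PRONUNCIATIONS):
    """Fraction of matching phonemes when the rhyme tails are aligned from the end."""
    tail_a = rhyme_tail(pronunciations[a])
    tail_b = rhyme_tail(pronunciations[b])
    matches = sum(x == y for x, y in zip(reversed(tail_a), reversed(tail_b)))
    return matches / max(len(tail_a), len(tail_b))

print(is_perfect_rhyme("love", "above"))  # prints: True
print(is_perfect_rhyme("love", "move"))   # prints: False
print(tail_overlap("love", "move"))       # prints: 0.5
```

A fuller system would replace the exact-match test in `tail_overlap` with graded phoneme similarities, so that near-identical vowels score higher than unrelated ones.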

Future Work

Integration. When Lyrist is complete, I will integrate it with Paul Bodily’s music generator, Pop* (Bodily 2016). Pop* will draw all its rhythmic data from Lyrist’s lyrical output.

Study. I will design, conduct, and write a paper on the results of a double-blind study measuring observers’ ability to distinguish Lyrist-generated lyrics from human-written lyrics. The desired outcome is that observers will be unable to make this distinction.

Web tools. I will also publish Lyrist and Rhyme-complete on the web for anyone to try and use in their own projects. This is to further the cause of popularizing computational creativity.

Improvement. Lyrist represents an initial effort in Natural Language Generation for songs. Though the template replacement method is powerful, its dependence upon a large database ultimately renders it simplistic when compared to techniques using more advanced artificial intelligence. I intend to explore this area and build upon Lyrist feature by feature.

Filter | Filtration effect
Part of Speech | Removes words that are not the same part of speech as a given word.
Hypersphere | Makes a hypersphere of a given radius and removes words whose vectors do not fall within it.
Stress | Removes any words with different stress patterns than a given word.
Prescriptive Dictionary | Removes any words not listed in a dictionary.
Descriptive Dictionary | Removes any words not listed in a common-speech dictionary.
Thesaurus | Removes any words not listed in a thesaurus’s list of synonyms for a given word.
Frequency | Removes any words whose frequencies are under a given margin.
Time Period | Removes words that do not appear in a given time span.
Writing Type | Removes words that do not appear in a genre of texts (newspaper, poetry, pop song, fictional novel, technical, typed online, spoken, etc.).
Rhyme | Uses Rhyme-complete to remove any words that do not qualify for the requested rhyme.
Regular Relationship | Removes words that are not found in the top n-grams or collocates of a given word.
Cosine Distance | Removes words that fall outside a given range of cosine distance values.
Obscenity | Removes sexually explicit words.
Profanity | Removes religiously sensitive words.
Vulgarity | Removes other crude or otherwise insensitive words.

Table 2: Single-Responsibility Filters

Filter | Filters used | Filtration effect
Ballpark | Hypersphere and Writing Type and Frequency | Removes words that are unrelated or extremely distantly related to a given word.
Distastefulness | Obscenity and Profanity and Vulgarity | Removes all possibly distasteful words.
Common Word at Time | Frequency and Time Period and Regular Relationship | Removes any words that were not popular or used in a given sequence during a given time period.
Uncommon Slang | Frequency and not Prescriptive Dictionary | Removes all words commonly used and found in standard dictionaries.
Ensure New Meaning | Part of Speech and not Thesaurus | Removes all words that share a lexeme with or have the incorrect part of speech of a given word.
Poetic Replacement | Stress and Rhyme | Removes all words that have different stress patterns and rhymes than a given word.

Table 3: Examples of Combined Filters

References

Bodily, P. 2016. Computational creativity in popular music composition. BYU PhD Dissertation Proposal.

Brownstein, J.; Yangarber, R.; and Astagneau, P. 2013. Algodan publications 2008-2013. Journal of Intelligent Information Systems 1–19.

Colton, S.; Goodwin, J.; and Veale, T. 2012. Full-FACE poetry generation. In Proceedings of the Third International Conference on Computational Creativity, 95–102.

Davies, M. 2009. The 385+ million word Corpus of Contemporary American English (1990–2008+): Design, architecture, and linguistic insights. International Journal of Corpus Linguistics 14(2):159–190.

Gervás, P.; Hervás, R.; and Robinson, J. R. 2007. Difficulties and challenges in automatic poem generation: Five years of research at UCM. e-poetry.

Hieu Nguyen, B. 2009. Rap lyric generator.

Hirjee, H., and Brown, D. 2010. Using automated rhyme detection to characterize rhyming style in rap music.

Manning, C. D.; Surdeanu, M.; Bauer, J.; Finkel, J. R.; Bethard, S.; and McClosky, D. 2014. The Stanford CoreNLP natural language processing toolkit. In ACL (System Demonstrations), 55–60.

Mikolov, T.; Chen, K.; Corrado, G.; and Dean, J. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.

Oliveira, H. G. 2015. Tra-la-Lyrics 2.0: Automatic generation of song lyrics on a semantic domain. Journal of Artificial General Intelligence 6(1):87–110.

Pattison, P. 1991. Songwriting: Essential guide to rhyming: A step-by-step guide to better rhyming and lyrics. Hal Leonard Corporation.

mind/midterm.txt · Last modified: 2016/11/05 12:24 by bayb2