Chronological Research Notes

December 15

End of Semester Paper


Lyrist is a corpus-based lyric-generating system (in this paper, “lyric” refers to a word from song or poetry). It may generate pieces independently or provide lyrics for a “sibling system” that solely generates music. Intention objects are input to determine a generative process, and new lyrics are generated; intentions may be automatically or manually determined. Lyrist’s language model is vector-based and was built from a variety of English corpora. Vector operations combined with customizable filters ensure that generated lyrics fulfill the piece’s intentions. Each generated piece is based on a template piece from a database of song lyrics and poetry. Such a template-based approach simplifies lyric generation, since grammatical structure is already accounted for by the template. This method is powerful, but its dependence upon a large database renders it ultimately simplistic. I recognize this is an initial foray into natural language generation for songs, and I intend to build upon Lyrist in future versions. Additionally, Rhyme-complete is a built-in subsystem that allows for fine control over phoneme matching and comparison.



Computers are generally acknowledged as the ultimate doers of technical tasks, but not as creative entities. The phenomenon of computational creativity is newer and thus less acknowledged than the ubiquitous phenomenon of general computation. This perhaps stems from the popular dichotomy of “technical tasks” and “creative tasks”. I submit this dichotomy is false: not only can creative tasks be imitated by technical processes, but all creative tasks are, at their root, technical. That human cognition does not make these technicalities apparent is irrelevant; creativity is technical. This idea is often unpopular, yet it is vital to propagate. As computers confirm it and the public eye shifts, computational creativity will be popularized and the advancement of artificial intelligence will accelerate.

Music is a highly valued form of human creativity. It is ubiquitous in modern life. It benefits its individual listeners by regulating moods, creating feelings of transcendence, helping productivity, and accelerating mental development. Songwriting is generally viewed as a “humans-only” realm of artistry. The human-run music, film, and video game industries are already very profitable. Yet computer processes are much faster than human processes, and demand for new kinds of music is essentially limitless. Though there is already more than enough recorded music to last a human lifetime, industries and people will never stop seeking new music. In many cases these entities commission their music rather than writing it themselves. Pop is by definition popular, the music of the majority. It embodies the zeitgeist and reflects the movements of a time period. Pop is ubiquitously enjoyed and related to, and yet research on popular song lyric generation at the scale I am proposing is unprecedented (see Methods for detail).


Lyrist as an imitator of human creativity. I am building a lyric-generating system called Lyrist. It represents an intersection between poetic theory, musical form, and computational creativity. Its primary process is generation by template, which entails using template pieces to generate new pieces that have the same structure as the template. This method was chosen in order to maintain morphological and global semantic cohesion. In this way, it imitates preexisting lyrical structures and produces pieces that feel familiar. A set of pre-made intentions allows it to imitate structures like pre-established rhyme schemes and meters, and to specify cultural influence given by cultural indicators such as language, time period, movement, genre, group, or person. Rhyme-complete is a subsystem of Lyrist. In addition to being capable of integration into larger projects, it will work as a standalone tool. Rhyme-complete provides the most complex of Lyrist’s filters by managing all phonetic constraints for replacement lyrics.

Lyrist as a creative innovator. The profound complexities and mathematical patterns found in poetry and musical verse can be penetrated deeper than ever by computers. Lyrist primarily relies on templates for piece generation, but is built to accommodate non-template methods as well. Intentions can be automatically crafted and determined using randomization, which may produce innovative writing styles.

Computer Lyric Generation, Ben Bay

Lyrist as a postmodern artist. The art of our day increasingly reflects the Postmodernist Movement. Poetry and music are no exception. Among its primary attributes, postmodern music:

  • is aleatoric, meaning some element of the composition is left to chance
  • is polystylistic, meaning it simultaneously uses multiple lyrical styles
  • presents multiple meanings and “locates meaning in individual listeners more than in scores, performances, or composers” (Kramer 2002)
  • considers technology not only as a medium to preserve and transmit music but also as “deeply implicated in the production and essence of music” (Kramer 2002)

Lyrist exemplifies all of the above attributes: it uses randomness, it uses data from diverse stylistic contexts to write its lyrics, it presents pieces that are void of meaning until consumed by individuals (Jeongwon and Song 2002), and it is itself a statement on the creative power of technology. Its dependence upon imprecise third-party parsers means its compositions are inherently flawed; fortunately, “postmodern” pieces are perfectly suited to abstraction of meaning by random processes.

Lyrist as a resource for full song generators. Lyric generation is an important problem to solve, both individually and because it is integral to the greater problem of generating complete songs. Lyrist is built with this in mind; it has the capacity to work alongside a music generator. Generated lyrics are independent of music, but may be injected into music-generating systems, which would then build around them and determine rhythm based on stress patterns of the injected lyrics.

Grammar parsing. I use Stanford University’s Stanford CoreNLP (Manning et al. 2014) for data on grammatical structure, parts of speech, and named entities. It uses the Penn Treebank part-of-speech tagset (Marcus, Marcinkiewicz, and Santorini 1993). While these tags are more general than I would like, they are assigned fairly accurately and facilitate the project a great deal.

Phoneme parsing. I use the Carnegie Mellon University Pronouncing Dictionary (Kominek and Black 2004) for data on phonemes and syllables. It uses ARPAbet, a phonetic transcription code that was designed specifically for English words.

Systems reliant on lyrical input. Monteith et al. present a successful process for creating melodic accompaniments given pre-existing lyrics (Monteith, Martinez, and Ventura 2012). Bodily presents a system that creates melody, harmony, and rhythm specifically for pop songs (Bodily 2016). These approaches to music generation assume the presence of pre-existing lyrics. Though they do not solve the problem of lyric generation, when united with lyric generators such as Lyrist, the pair solves the larger problem of creating completely generative songs.

Chatbot techniques. Chatbot creation is a growing area of interest in Natural Language Generation (NLG) today. ALICE and Elizabeth (Shawar and Atwell 2002) are examples of older, more primitive chatbots. The ALICE chatbot system (Artificial Linguistic Internet Computer Entity) is made from units called topics and categories which rely on constant user input. Elizabeth represents a distinct approach which uses grammatical rules. I believe grammatical rules could be useful, but the ones used by Elizabeth are too narrow in scope.
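The ARPAbet pronunciations described under phoneme parsing are what make rhyme detection tractable: a perfect rhyme shares all phonemes from the last stressed vowel onward. Here is a minimal sketch of that rule; the tiny pronunciation table is hand-copied from CMUdict entries for illustration, and a real implementation (such as Rhyme-complete) would load the full dictionary and handle many more rhyme types.

```python
# Sketch of perfect-rhyme detection over ARPAbet phonemes.
# Stress digits (0/1/2) are attached to vowel phonemes, e.g. "AH1".

PRON = {
    "young":  ["Y", "AH1", "NG"],
    "won":    ["W", "AH1", "N"],
    "unsung": ["AH0", "N", "S", "AH1", "NG"],
    "built":  ["B", "IH1", "L", "T"],
    "milk":   ["M", "IH1", "L", "K"],
}

def rhyme_part(phones):
    """Phonemes from the last stressed vowel (stress 1 or 2) to the end."""
    for i in range(len(phones) - 1, -1, -1):
        if phones[i][-1] in "12":        # stressed vowel like AH1
            return phones[i:]
    return phones                        # no stressed vowel found

def perfect_rhyme(a, b):
    return rhyme_part(PRON[a]) == rhyme_part(PRON[b])
```

Under this strict rule, “young”/“unsung” is a perfect rhyme while “built”/“milk” is only a slant rhyme, which matches the template example later in these notes.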

Lyric Generation

Generative poetry and song lyrics are only two problems out of the many within NLG. Hence, scholarship devoted specifically to these topics is sparse. Since popular music is often viewed as “low-brow”, even less scholarship exists on generating lyrics for it. However, the few following research efforts helped me in my design of Lyrist.

To generate song lyrics, Toivanen et al. use lyrical templates derived from existing songs (Brownstein, Yangarber, and Astagneau 2013). They pick a template and replace a subset of its lyrics, constraining lyric replacement to 50%. This method is simple, effective, and presents only a low risk of losing morphological or global semantic cohesion. Its downside is its dependence upon existing songs. I decided to use and build upon their template replacement method because of its simplicity.

Nguyen and Sa had moderate success using n-gram models and rules to score candidate sentences for generative rap lyrics (Hieu Nguyen 2009). In the presence of many intentions, a lyrical segment that follows them all perfectly may be impossible; hence I like the idea of scoring candidate lyric segments by their faithfulness to some pre-established rules or intentions.

Oliveira’s system Tra-la-Lyrics (Oliveira 2015) generates text based on a given rhythm, rather than drawing a rhythm out of given lyrics. This is an interesting approach, but not one that focuses solely on text as mine does. Generating lyrics based on musical input is an interesting concept (since songwriters do it that way as well), but it is too different from Lyrist’s current approach to warrant much attention at present.

Poem Generation

NLG for poetry gives a second perspective on the same overall problem. Its disconnect from music tends to give it a purer focus on meaning and linguistics. Gervás et al. explored the challenges of automatic poem generation (Gervás, Hervás, and Robinson 2007). Among their concerns was the difficulty of computers aesthetically evaluating texts. They give these possible solutions: understanding phonetics, using phonetic knowledge to drive poem generation, managing vocabulary, comparison, analogy, and metaphor, and dealing with emotions. All these suggestions are good pieces of advice; Lyrist uses phonemic data and poetic theory with those phonemes, and uses intentions to manage vocabulary and emotion. I am intrigued by and intend to eventually implement the idea of specifically managing comparison, analogy, and metaphor.

Colton et al. designed a corpus-based poetry generator (Colton, Goodwin, and Veale 2012). Their system used poem templates, and constructed a mood for the day by analyzing current newspaper articles. They used a corpus of 21,984 similes, which were described as tuples of (object, aspect, description). They used four measures to describe the aesthetic of generated poems: appropriateness, flamboyance, lyricism, and relevancy. Obtaining thematic intentions from the web, using a corpus of similes, and using quantitative categories of measurement for generated pieces are all interesting NLG approaches that I will consider integrating into my system.
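The candidate-scoring idea above (from Nguyen and Sa) can be sketched very simply: score each candidate line against soft rules and keep the best scorer. Everything below is invented for illustration, including the rule weights and the crude vowel-group syllable heuristic; a real scorer would use CMUdict syllable counts and real intentions.

```python
import re

def syllable_count(line):
    # Crude vowel-group heuristic; a real system would count CMUdict syllables.
    return sum(len(re.findall(r"[aeiouy]+", w)) for w in line.lower().split())

def score(line, target_syllables=8, required_words=()):
    s = 0.0
    s -= abs(syllable_count(line) - target_syllables)            # metrical drift
    s += sum(2.0 for w in required_words if w in line.lower().split())
    return s

candidates = ["honor waited honor prevailed", "sorrow waited sorrow won"]
best = max(candidates, key=lambda c: score(c, 8, ("honor",)))
```

A perfect candidate may not exist when many intentions compete, but the highest-scoring candidate is still a principled choice.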


Language Model

Word2Vec. Word2Vec is an open-source tool that models language by assigning each word in a corpus to a point in a many-dimensional vector space, based on the word’s relationship to other words in the corpus (Mikolov et al. 2013). I use Word2Vec to train a master model of English language. This model allows the approximation of meaning for every lyric.

Reason for master model. I am able to use one model rather than many through the use of diverse, customizable filters. Instead of training a new vector model for every possible genre, dialect, and time period, I train one all-encompassing model and constrain its lyric suggestions with any quantity and combination of filters. For the model’s training data I combine full texts of the Davies Corpora, my lyrical database of over 2 million songs and poems, the Google News Corpus, and the Wikipedia Corpus. Table 1 shows my ranking of how closely certain language types approximate lyrical language. Table 2 lists corpus training weights based on those Table 1 rankings. These are the proportional weights assigned to my corpora after correcting for size.
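One way to apply such proportional weights is to oversample lines from the higher-weighted corpora when assembling the training stream. The sketch below uses invented corpus contents and weights (Table 2’s actual values are not reproduced here), and the function name is mine.

```python
import random

# Illustrative stand-ins: one line per corpus and made-up weights.
corpora = {
    "lyrics":    (["sorrow found me when i was young"], 5.0),
    "newspaper": (["shares fell sharply in early trading"], 1.0),
}

def weighted_training_lines(corpora, n, seed=0):
    """Draw n training lines, sampling each corpus in proportion to its weight."""
    rng = random.Random(seed)
    names = list(corpora)
    weights = [corpora[name][1] for name in names]
    lines = []
    for _ in range(n):
        name = rng.choices(names, weights=weights)[0]
        lines.append(rng.choice(corpora[name][0]))
    return lines
```

With weights 5.0 and 1.0, roughly five-sixths of the sampled lines come from the lyrical corpus, which is the intended bias toward lyrical language.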


Poetry & song database. I created a relational database of over two million annotated poems and songs ranging from the year 1600 to the present. These pieces are used for Word2Vec and n-gram model training. Each piece is annotated with relevant cultural data, which enables cultural intentions such as language, time period, movement, genre, group, and person to be used. “Person” may refer to a musician, songwriter, or poet.

Table 1. Language types, ranked by how closely they approximate lyrical language:

  • Lyrical / poetic
  • Spoken
  • Literary
  • Magazine
  • Newspaper
  • Encyclopedic

Davies Corpora. Large bodies of English text, including the News on the Web Corpus (NOW Corpus), the Corpus of Global Web-based English (GloWbE), the Corpus of Contemporary American English (COCA), and the Corpus of Historical American English (COHA). They are some of the largest and most widely-used English text corpora available (Davies 2009).

  • NOW Corpus. Contains 3.7 billion English words from web-based newspapers and magazines from 2010 to the present. It grows as new articles are published, increasing in size by about 130 million words each month.

  • GloWbE. Contains about 1.9 billion English words collected from twenty different countries. This allows insight into variation in English.
  • COCA. Contains more than 520 million words of text and is the only large and balanced corpus of American English. It is probably the most widely-used English corpus. It is equally divided among text types (spoken, fiction, popular magazines, newspapers, and academic).
  • COHA. Contains over 400 million words of text from the 1810s-2000s, and is 50 times larger than the next-largest structured historical corpus of English. This allows insight into changes in English over time.
  • Google News Dataset. Contains about 100 billion words of newspaper language. Pre-trained word and phrase vectors for this dataset were released (Mikolov et al. 2013) along with Word2vec.
  • Wikipedia Corpus. Contains 1.9 billion words of encyclopedic language.

Intentions

Rather than outputting unplanned pieces, Lyrist generates pieces based on intentions. An intention is a combination of instructions given to Lyrist in order to influence its output. Users can either select and/or customize their own intentions, or leave them to be automatically determined.

Selective property. An intention may apply globally to a piece, or locally to some subsegment of the piece.

Conjunctive property. An intention may be fused with other intentions of the same type. For example, a thematic intention could combine “paranoid” and “urban”, a cultural intention could combine “rock” and “jazz”, and a structural intention could combine “anapestic trimeter” and “iambic pentameter”.

Recursive property. An intention may be recursive, meaning it is defined by other injected intentions. Thematic, cultural, and structural intentions are all predefined recursive intentions. For example, a song’s rhyme scheme could be based on hip-hop rhyme schemes while its diction is based on words from rock and roll. This would be done by injecting the song’s rhyme scheme (a structural intention) with “hip-hop” (a cultural intention), and injecting the song’s diction source (a cultural intention) with “rock and roll”. Allowing intentions to define one another recursively greatly widens the scope of possibilities and control offered by Lyrist.
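In a vector-based model, the conjunctive property has a natural reading: fuse same-type intentions by averaging their vectors, so the blend sits between its parents. The 3-d theme vectors below are toy values chosen only so the arithmetic is easy to follow.

```python
import math

# Toy "theme vectors"; a real model would come from the trained Word2Vec space.
VEC = {
    "paranoid": (0.9, 0.1, 0.0),
    "urban":    (0.1, 0.9, 0.0),
    "pastoral": (0.0, 0.0, 1.0),
}

def norm(u):
    return math.sqrt(sum(a * a for a in u))

def cosine(u, v):
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))

def fuse(*themes):
    """Fuse same-type intentions by averaging their vectors (conjunctive property)."""
    dims = len(VEC[themes[0]])
    return tuple(sum(VEC[t][i] for t in themes) / len(themes) for i in range(dims))

blend = fuse("paranoid", "urban")
```

The blended intention is equally close to both of its components and far from unrelated themes, which is the behavior one would want from a “paranoid urban” piece.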

Generation by Template

An effective method of generating lyrics is generation by template, which entails retrieving a template piece from a large database, analyzing its grammatical structure, then using that structure to produce a piece with entirely different lyrics. This method was chosen in order to maintain morphological and global semantic cohesion. The replacement procedure is as follows: most lyrics in the original song are marked for replacement. Each marked word is inputted into a Word2Vec analogy operation. This operation returns a new suggestion lyric based on an analogy: “template theme is to new theme as template lyric is to new lyric.” The new lyric then replaces the original one in the generative piece.

Word2Vec lyric replacement. I designed several scripts for different Word2Vec operations. These operations are similar(), theme(), analogy(), add(), and subtract(). These are described in detail below and in Table 4. Each Word2Vec operation returns lyric suggestions ordered by cosine similarity, which are then filtered. From the remaining words, the one with the highest cosine similarity is chosen. Cosine similarity is found by taking the dot product of the two normalized word vectors. It reveals the proximity of the Word2Vec operation’s result to the actual suggested lyric in the model’s vector space. Thus cosine similarity offers a good representation of how well a requested operation matches any given result from that operation.

New theme generation. Lyrist finds a new theme for a piece by picking the theme of a randomly selected piece from the template database.

  • similar(). Returns the nearest-neighbor words of a given point. These neighbors are words that have similar definitions or usages as the inputted word.
  • theme(). Finds the average of all the words in a given word list, then calls similar() on that result. Effectively summarizes a line, stanza, or song.
  • analogy(). Takes in an old theme from theme(), a newly-generated theme, and a word. It performs logical analogy arithmetic on the input in the form of “old theme is to new theme as original lyric is to new lyric”, then calls similar() on the resulting point. This is a very powerful operation, as it transforms the mood of a whole song with one simple analogy.
  • add(). Adds two words together and calls similar() on the resulting point.
  • subtract(). Subtracts one word from another and calls similar() on the resulting point.
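The operations above can be sketched end to end over a hand-made 2-d “model” standing in for the trained Word2Vec space. All vectors and the resulting neighbors below are invented so the arithmetic is easy to check by hand; real results depend entirely on the trained model.

```python
import math

# Toy 2-d "model"; Lyrist would query a trained Word2Vec space instead.
MODEL = {
    "sorrow": (1.0, 0.0),  "grief": (0.9, 0.1),
    "honor":  (0.0, 1.0),  "valor": (0.1, 0.9),
    "milk":   (0.5, 0.5),
}

def cosine(u, v):
    norm = lambda w: math.sqrt(sum(a * a for a in w))
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))

def similar(point, exclude=()):
    """Nearest-neighbor words of a point, best match first."""
    words = [w for w in MODEL if w not in exclude]
    return sorted(words, key=lambda w: cosine(point, MODEL[w]), reverse=True)

def theme(words):
    """Average of a word list: a crude summary of a line, stanza, or song."""
    return tuple(sum(MODEL[w][i] for w in words) / len(words) for i in range(2))

def analogy(old_theme, new_theme, word):
    """'old theme is to new theme as word is to result'."""
    point = tuple(MODEL[word][i] - old_theme[i] + new_theme[i] for i in range(2))
    return similar(point, exclude=(word,))[0]

def add(w1, w2):
    point = tuple(MODEL[w1][i] + MODEL[w2][i] for i in range(2))
    return similar(point, exclude=(w1, w2))[0]

def subtract(w1, w2):
    point = tuple(MODEL[w1][i] - MODEL[w2][i] for i in range(2))
    return similar(point, exclude=(w1, w2))[0]
```

For example, with a “sorrow” template theme and an “honor” new theme, the analogy operation carries a sorrow-flavored lyric to an honor-flavored one; add() and subtract() compose points the same way before falling back to similar().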


Here is an example of how Lyrist generates pieces. A randomly-selected template song has a theme of “sorrow”. Its rhyme scheme is AABB. It is given below.

Template Piece: First stanza of Sorrow, by The National

 Sorrow found me when I was young
 Sorrow waited, sorrow won
 I live in a house that sorrow built
 It’s in my honey, it’s in my milk

The user wants to generate a song about medieval knighthood. She decides to set the new piece’s global emotional intention as “honor”. She also sets the structural intention of rhyme scheme to ABAB. She leaves the global operational intention at its default, which means a Part of Speech Filter will be used on incoming word suggestions. Her generative piece is given below.

Generative Piece

 Honor sought me when I was unsung
 Honor tarried, honor prevailed
 I bide in a land that honor swung
 It’s in my nectar, it’s in my kale
As you can see, important lyrics and syllabic structure changed, while semantic and morphological cohesion were maintained. More importantly, both intentions were achieved; the new piece evokes images of heroism and honor rather than anything sad, and its rhyme scheme is changed from AABB to ABAB. The lyric “kale” is unexpected and reads humorously; since a diction intention was not in place, this is not considered a mistake. If the user wanted to enhance her idea of a medieval song, she might add the cultural intention of a diction that limits word choice exclusively to words found in Shakespeare plays.

Quantitative Evaluation

To assess Lyrist’s success, I will conduct a quantitative evaluation. The three metrics of success I will measure are 1) degree of indistinguishability from human-written pieces, 2) emotional accuracy, and 3) cultural accuracy.

Imitation of human creativity. This evaluation will be double-blind; each participant is assigned a random ratio of human pieces to computer pieces, and a random ordering of pieces. They will not know which is which until after completion. The participant will encounter this question pertaining to authorial discrimination:

  • A participant is presented with a piece. It is either obscure and human-written or Lyrist-generated.

Prompt: “Do you recognize this piece?” (yes, it looks familiar, no). If yes or it looks familiar: the question ends and its data is thrown out. If no: Prompt: “Select the source of this piece.” Choices are human, computer, or unsure.
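The double-blind assignment could be implemented roughly as below; the function name and parameters are mine, and the key point is only that the ratio and ordering are randomized per participant while source labels stay hidden.

```python
import random

def make_trial(human_pieces, computer_pieces, n_items, seed):
    """Assign one participant a random human/computer ratio and a shuffled order."""
    rng = random.Random(seed)
    n_human = rng.randint(0, n_items)
    items = (rng.sample(human_pieces, n_human)
             + rng.sample(computer_pieces, n_items - n_human))
    rng.shuffle(items)           # participant never sees the source labels
    return items
```

Recording the hidden source of each item alongside the participant's answers then gives the indistinguishability measurement directly.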

Emotional accuracy. The participant will encounter one of the two following questions pertaining to emotion:

  • A participant is presented with a generated piece. It has a private intended emotion associated with it.

Prompt: “Choose the emotion this computer-written song most exhibits.” Choices include multiple unintended emotions and the intended emotion.

  • A participant is presented with multiple generated pieces. One is associated with a target emotion.

Prompt: “Choose the computer-written song that most exhibits the target emotion.” Choices include multiple songs of non-target emotion and a song of the target emotion.

Cultural accuracy. The participant will encounter one of the two following questions pertaining to culture:

  • A participant is presented with a generated piece. It has a private intended cultural indicator associated with it. Prompt: “Choose the cultural indicator this computer-written song most exhibits.” Choices include multiple unintended cultural indicators and the intended cultural indicator.
  • A participant is presented with multiple generated pieces. One is associated with a target cultural indicator. Prompt: “Choose the computer-written song that most exhibits the target cultural indicator.” Choices include multiple songs of non-target cultural indicators and one song of the target cultural indicator.

Results from the above three measures of success will be published in my comprehensive paper on Lyrist.

Future Work

Integration. Lyrist will be integrated with Paul Bodily’s music generator, Pop* (Bodily 2016). Pop* will draw all its lyrical and rhythmic data from Lyrist’s outputted pieces.

Web publication. I will publish Lyrist and Rhyme-complete on the web for anyone to try and use in their own projects. Hopefully this will garner attention and further the cause of popularizing computational creativity.

Improvement. Lyric generation by template represents my initial effort in NLG for songs. Though powerful, its dependence upon a large database ultimately renders it simplistic when compared to techniques using more advanced artificial intelligence. I intend to explore this area and give Lyrist functionality without templates, using n-grams and a grammar for part of speech in English. I would like to add the ability for Lyrist intentions to come from scanning online material. I would like to create a few quantitative categories of measurement for generated pieces, so that human feedback is not the only source of product analysis. I intend to eventually add support for other major languages besides English. I would like to develop some intention or other subsystem that deals specifically with comparisons, analogies, and metaphors within generated lyrics. Additionally, to better imitate human creativity, Lyrist must be able to write lyrics based on musical input. So I would like to look into the possibility of using rhythmic, harmonic, or melodic intentions to guide lyric generation.


References

Bodily, P. 2016. Computational creativity in popular music composition. BYU PhD Dissertation Proposal.
Brownstein, J.; Yangarber, R.; and Astagneau, P. 2013. Algodan publications 2008-2013. Journal of Intelligent Information Systems 1–19.
Colton, S.; Goodwin, J.; and Veale, T. 2012. Full face poetry generation. In Proceedings of the Third International Conference on Computational Creativity, 95–102.
Davies, M. 2009. The 385+ million word corpus of contemporary American English (1990–2008+): Design, architecture, and linguistic insights. International Journal of Corpus Linguistics 14(2):159–190.
Gervás, P.; Hervás, R.; and Robinson, J. R. 2007. Difficulties and challenges in automatic poem generation: Five years of research at UCM. e-poetry.
Hieu Nguyen, B. 2009. Rap lyric generator.
Hirjee, H., and Brown, D. 2010. Using automated rhyme detection to characterize rhyming style in rap music.
Jeongwon, J., and Song, H. S. 2002. Roland Barthes’ “Text” and aleatoric music: Is “the birth of the reader” the birth of the listener? Muzikologija (2):263–281.
Kominek, J., and Black, A. W. 2004. The CMU Arctic speech databases. In Fifth ISCA Workshop on Speech Synthesis.
Kramer, J. D. 2002. The nature and origins of musical postmodernism. Postmodern Music/Postmodern Thought 66:7–20.
Manning, C. D.; Surdeanu, M.; Bauer, J.; Finkel, J. R.; Bethard, S.; and McClosky, D. 2014. The Stanford CoreNLP natural language processing toolkit. In ACL (System Demonstrations), 55–60.
Marcus, M. P.; Marcinkiewicz, M. A.; and Santorini, B. 1993. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics 19(2):313–330.
Mikolov, T.; Chen, K.; Corrado, G.; and Dean, J. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
Monteith, K.; Martinez, T.; and Ventura, D. 2012. Automatic generation of melodic accompaniments for lyrics. In Proceedings of the International Conference on Computational Creativity, 87–94.
Oliveira, H. G. 2015. Tra-la-Lyrics 2.0: Automatic generation of song lyrics on a semantic domain. Journal of Artificial General Intelligence 6(1):87–110.
Pattison, P. 1991. Songwriting: Essential Guide to Rhyming: A Step-by-Step Guide to Better Rhyming and Lyrics. Hal Leonard Corporation.
Shawar, B. A., and Atwell, E. 2002. A comparison between ALICE and Elizabeth chatbot systems.

December 12

ARPAbet phoneme codes

The Carnegie Mellon Pronouncing Dictionary uses this set of codes to represent English phonemes. I represent each as an enum in Lyrist.

   // Vowels
   AA,  // odd      AA D
   AE,  // at       AE T
   AH,  // hut      HH AH T
   AO,  // ought    AO T
   AW,  // cow      K AW
   AY,  // hide     HH AY D
   EH,  // Ed       EH D
   ER,  // hurt     HH ER T
   EY,  // ate      EY T
   IH,  // it       IH T
   IY,  // eat      IY T
   OW,  // oat      OW T
   OY,  // toy      T OY
   UH,  // hood     HH UH D
   UW,  // two      T UW
   // Consonants
   B,   // be       B IY
   CH,  // cheese   CH IY Z
   D,   // dee      D IY
   DH,  // thee     DH IY
   F,   // fee      F IY
   G,   // green    G R IY N
   HH,  // he       HH IY
   JH,  // gee      JH IY
   K,   // key      K IY
   L,   // lee      L IY
   M,   // me       M IY
   N,   // knee     N IY
   NG,  // ping     P IH NG
   P,   // pee      P IY
   R,   // read     R IY D
   S,   // sea      S IY
   SH,  // she      SH IY
   T,   // tea      T IY
   TH,  // theta    TH EY T AH
   V,   // vee      V IY
   W,   // we       W IY
   Y,   // yield    Y IY L D
   Z,   // zee      Z IY
   ZH;  // seizure  S IY ZH ER

December 5

Some time ago I put Lyrist into a git repository, located here: I'm trying to maintain a development branch with all my changes, a last-working branch, and a master branch. The code will still only work on my laptop, since I haven't changed the paths to apply generally to anyone's machine yet.

November 20

After discussing the benefit of intentions in computational creativity with Paul, I came up with an intention structure for Lyrist. It's higher-level than everything that exists in Lyrist so far, so I won't be implementing it until I have a completed low-level pipeline.

An intention represents a goal Lyrist has for a piece that will be generated. A user may input intentions for a piece, or Lyrist may probabilistically decide its intentions for a piece. A combination of the two is also acceptable. Some intentions are 100% strict and must be fulfilled, while others are optional. After a piece is generated an Evaluation will determine whether or not Lyrist fulfilled its quantitative intentions. Human feedback will have to be the ultimate source of evaluation for certain types of qualitative intentions.

An intention may be combined with an intention of the same type. For my emotional intention I could mix “paranoid” with “amorous”. For my cultural intention I could mix “rock and roll” and “jazz”. For my structural intention I could mix “limerick rhyme scheme” with “iambic pentameter”. For my operative intention I could mix “replace by template” with “generate by n-grams”.

Furthermore, a recursive intention may be defined by any other type of recursive intention. Emotional, cultural, and structural intentions are all recursive intentions. This means that my rhyme scheme (structural intention) could be based on common hip-hop (cultural intention) rhyme schemes, while my diction (cultural intention) comes from country music. This piece would use two cultural intentions instead of one, each for a different purpose. This is a fascinating way to build an intention framework, because the possibilities are infinite. It's simply up to me to create an entry and functional definition for famous singers, poets, art movements, and genres of music. For Lyrist's integration with Pop*, the cultural intention will be perpetually set to the genre “pop music”, because that is the genre it exclusively works within.
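The recursion described above is naturally a tree of intentions, where any intention's fields may themselves be intentions. A minimal sketch (class and field names are mine, not Lyrist's actual API):

```python
from dataclasses import dataclass, field

@dataclass
class Intention:
    kind: str                                     # e.g. "structural", "cultural", "emotional"
    value: str
    injected: list = field(default_factory=list)  # nested Intentions

# The hip-hop / country-diction example from the note above:
rhyme = Intention("structural", "rhyme scheme",
                  injected=[Intention("cultural", "hip-hop")])
diction = Intention("cultural", "diction source",
                    injected=[Intention("cultural", "country")])

def flatten(intention):
    """All intentions in a tree, outermost first."""
    out = [intention]
    for child in intention.injected:
        out.extend(flatten(child))
    return out
```

Walking the tree with something like flatten() is what lets two cultural intentions coexist in one piece, each serving a different parent intention.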

Emotional Intention

An emotional intention defines a hypersphere in a vector space for a sentiment to work within. If no emotional intention is selected, no guided direction of emotional sentiment will occur. Below two types of emotional intentions are listed.

  • Unifying emotion (a global sentiment)
  • Emotional flow (a sequence of sentiments)
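The hypersphere idea reduces to a centroid vector plus a radius: a candidate lyric's sentiment vector is acceptable only if it lands inside the sphere. The vectors and radius below are toy values for illustration.

```python
import math

def in_hypersphere(candidate_vec, center, radius):
    """True when a candidate's sentiment vector falls inside the intention."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(candidate_vec, center)))
    return dist <= radius

honor_center = (0.0, 1.0)   # toy 2-d centroid for an "honor" sentiment
```

An emotional flow would then be a sequence of such spheres, one per line or stanza, that candidates must pass through in order.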

Cultural Intention

A cultural intention defines a corpus of elements to work within. If no cultural intention is selected, a predefined corpus of broad, diverse elements is returned. Below some types of cultural intentions are listed.

  • Piece type (poem or song)
  • Language
  • Date
  • Time period
  • Movement
  • Group
  • Individual: a famous songwriter, musician, artist, or poet

Structural Intention

A structural intention defines an elemental structure to work within. If no structural intention is selected, one is chosen randomly (or free verse is chosen). Below some types of structural intentions are listed.

Populated structures

  • Parts of speech
  • Phonemes
  • Syllables
  • Stresses

Pattern structures

  • Grammatical structure (more general than POS)
  • Rhyme scheme (more general than phonemes)
  • Meter (more general than syllables)

Structure-based statistics

  • Number of [POS] words
  • [POS] density
  • Number of [vowel / consonant] phonemes
  • [vowel / consonant] proportion
  • Rhyme density
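Two of the structure-based statistics above are easy to make concrete: POS density from a (word, tag) sequence, and vowel-phoneme proportion from an ARPAbet transcription. The tags below are hand-supplied for illustration; Lyrist would take them from Stanford CoreNLP and CMUdict.

```python
ARPABET_VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER", "EY",
                  "IH", "IY", "OW", "OY", "UH", "UW"}

def pos_density(tagged, pos):
    """Fraction of words carrying a given Penn Treebank tag."""
    return sum(1 for _, t in tagged if t == pos) / len(tagged)

def vowel_proportion(phones):
    """Fraction of ARPAbet phonemes that are vowels (stress digits stripped)."""
    stripped = [p.rstrip("012") for p in phones]
    return sum(1 for p in stripped if p in ARPABET_VOWELS) / len(stripped)

# Hand-tagged sample line for illustration.
line = [("sorrow", "NN"), ("found", "VBD"), ("me", "PRP"),
        ("when", "WRB"), ("i", "PRP"), ("was", "VBD"), ("young", "JJ")]
```

Statistics like these could serve as the quantitative categories of measurement mentioned under Future Work, since they need no human judge.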

Operational Intention

An operative intention simply defines the computational methods Lyrist will use to generate its piece. Below some types of operational intentions are listed.

Generate by template

  • Use analogies
  • Use similarities
  • Use n-grams
  • Use grammar

Generate without template

  • Use n-grams
  • Use grammar
  • Use scoring rules

October 31

Today I finished and turned in my CS 497 midterm paper. It shares a lot with my ORCA grant proposal, except that since I had more time to ponder ideas for my research, it is more developed and ambitious in scope. I decided to disentangle my work from Paul's by separating my project from Pop* and naming it Lyrist. This will simplify the relationship between our work, especially when trying to submit scholarly papers. I will more than double the length of this paper for my CS 497 final, and in January I hope to submit a polished version of it to ICCC 2017. It can be found here: midterm

October 26

Today I submitted my ORCA grant proposal. It can be found here: orcaproposal

October 19

My first theorized application of Word2Vec in writing pop song lyrics is a success! First, I create a theme from the average meaning behind the whole stanza or song. Next I choose a new theme for the new song. In both cases, I'm still using custom input themes. Then in each lyric replacement, I use the analogy “old theme is to new theme as old lyric is to new lyric”. It changes the new song's feel substantially.

Original Lyrics, theme: depression

Sorrow found me when I was young 
Sorrow waited, sorrow won 
Sorrow they put me on the pill 
It's in my honey, it's in my milk 

New Lyrics, theme: excitement

Joy found me when I was excited 
Joy waited, joy won 
Awe they put me on the buzz 
It's in my delight, it's in my popcorn 

When I make a replacement, I ensure that the new word falls into the same part of speech as the old word. That is the single most important constraint when sorting through word2vec's results. My part-of-speech tagger still isn't as advanced as I'd like, or more lyrics would be replaced.
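The theme-analogy replacement described above can be sketched with toy in-memory vectors: find the word nearest to vec(old lyric) - vec(old theme) + vec(new theme), restricted to the old lyric's part of speech. The 2-D vectors and POS tags here are hand-picked stand-ins for a real word2vec model, not actual training output.

```java
import java.util.*;

public class AnalogyReplacer {
    // Toy 2-D "word vectors" standing in for a trained word2vec model.
    // Dimension 0 is a sad-to-happy axis; the values are illustrative only.
    static final Map<String, double[]> VEC = new HashMap<>();
    static final Map<String, String> POS = new HashMap<>();
    static {
        VEC.put("depression", new double[]{0, 0}); POS.put("depression", "NN");
        VEC.put("excitement", new double[]{1, 0}); POS.put("excitement", "NN");
        VEC.put("sorrow",     new double[]{0, 1}); POS.put("sorrow", "NN");
        VEC.put("joy",        new double[]{1, 1}); POS.put("joy", "NN");
        VEC.put("excited",    new double[]{1, 2}); POS.put("excited", "JJ");
    }

    // oldTheme : newTheme :: oldLyric : ? -- nearest same-POS word to
    // vec(oldLyric) - vec(oldTheme) + vec(newTheme), excluding oldLyric itself.
    public static String analogy(String oldTheme, String newTheme, String oldLyric) {
        double[] a = VEC.get(oldTheme), b = VEC.get(newTheme), c = VEC.get(oldLyric);
        double[] target = {c[0] - a[0] + b[0], c[1] - a[1] + b[1]};
        String best = null;
        double bestDist = Double.MAX_VALUE;
        for (Map.Entry<String, double[]> e : VEC.entrySet()) {
            String w = e.getKey();
            if (w.equals(oldLyric) || !POS.get(w).equals(POS.get(oldLyric))) continue;
            double[] v = e.getValue();
            double d = Math.hypot(v[0] - target[0], v[1] - target[1]);
            if (d < bestDist) { bestDist = d; best = w; }
        }
        return best;
    }

    public static String demo() {
        // depression is to excitement as sorrow is to ...
        return analogy("depression", "excitement", "sorrow");
    }
}
```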

October 5

Pop* Application

Word2Vec can be used in a few ways. In every case, there must be a framework that still accommodates other constraints, such as rhyme and syllable count.

  • After an initial lyric replacement, word2vec can replace all remaining lyrics in the song (appropriately, within the same POS) with words in that same initial relationship. “first original” is to “first replace” as “second original” is to “second replace”.
  • A simpler method is to merely replace a lyric with a word that is near it in the vector space. With POS restrictions, this could be a very simple, useful way to change up template lyrics while keeping a similar feel.
  • More complex applications involve doing more vector arithmetic on the original lyrics. Instead of analogy (add one distance, then replace) or nearness (find the word at the closest distance), we could add or subtract multiple word distances to target more specific meanings. For example: find a nearby word, subtract it from the original, find a word near the result, then add the two together; that may produce a lyric that is more general. Similar operations could intensify a word, find synonyms, or find opposites.
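The "nearby in the vector space" method above reduces to a nearest-neighbor search under cosine similarity. Below is a minimal, self-contained sketch; the vocabulary and vectors in the demo are made-up stand-ins for real word2vec output.

```java
import java.util.*;

public class NearestWord {
    // Cosine similarity between two dense vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Return the vocabulary word most similar to `query`, excluding `query`.
    public static String nearest(String query, Map<String, double[]> vocab) {
        String best = null;
        double bestSim = -2; // cosine is always >= -1
        for (Map.Entry<String, double[]> e : vocab.entrySet()) {
            if (e.getKey().equals(query)) continue;
            double s = cosine(vocab.get(query), e.getValue());
            if (s > bestSim) { bestSim = s; best = e.getKey(); }
        }
        return best;
    }

    public static String demo() {
        Map<String, double[]> v = new HashMap<>();
        v.put("honey", new double[]{1.0, 0.1});
        v.put("milk",  new double[]{0.9, 0.2});
        v.put("pill",  new double[]{-1.0, 0.5});
        return nearest("honey", v);
    }
}
```

A POS filter would simply skip vocabulary entries whose tag differs from the query word's tag, as in the analogy case.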

October 3

Using the Deep Learning For Java library with Maven, I have assembled a working example of word2vec. I combined the three Lord of the Rings books with The Hobbit into one .txt file and used that as my text corpus for word2vec to learn on. My initial results are less than ideal, perhaps because

  • Training hyperparameters didn't match the corpus
  • Corpus was too small
  • Corpus was too sparse / the wrong kind of source for good word2vec results

Some results were slightly accurate:

09:22:03.789 [main] INFO  Tester - legolas is to gimli as frodo is to:
[frodo, sam, pippin, merry, gollum, softly, up, got, squealed, himself]

And while some results were incorrect, they evoked objects and imagery that were no doubt related:

09:22:03.732 [main] INFO  Tester - mordor is to sauron as rohan is to:
[mordor, rohan, helms, morgul, edoras, gate, cair, isengard, river, anduin]
Paul and I met and agreed that my major problem was not training methods, but data. I was thinking too small: four novels is far too small a corpus for desirable results. Now I'll try to find a pre-trained model or train a new model on a much larger corpus, such as all of Wikipedia. The only difference in training will be raising the minimum word occurrence to something like 100 to cut out rarely used, highly technical words.
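The minimum-word-occurrence cutoff mentioned above amounts to counting tokens and discarding any word seen fewer than N times before training. A minimal sketch of that filtering step (independent of any word2vec library):

```java
import java.util.*;

public class FrequencyFilter {
    // Count token occurrences and keep only tokens seen at least minCount
    // times, mimicking word2vec's minimum-word-frequency cutoff.
    public static Set<String> vocabulary(List<String> tokens, int minCount) {
        Map<String, Integer> counts = new HashMap<>();
        for (String t : tokens) counts.merge(t, 1, Integer::sum);
        Set<String> kept = new TreeSet<>(); // sorted for a stable demo output
        for (Map.Entry<String, Integer> e : counts.entrySet())
            if (e.getValue() >= minCount) kept.add(e.getKey());
        return kept;
    }

    public static String demo() {
        List<String> tokens = Arrays.asList(
                "ring", "ring", "ring", "mithril", "ring", "shire", "shire");
        // The rare token "mithril" falls below the cutoff and is dropped.
        return String.join(",", vocabulary(tokens, 2));
    }
}
```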

September 19

I met with Paul today; we decided to move forward by utilizing the Stanford Parser more fully and by integrating Word2Vec into our software. He approved yesterday's update, so I pushed everything up to his remote repository to help us share information. He also outlined the paper he recently submitted to AAAI, which found that a 4-gram model is the most accurate method of finding a song's key and key signature.

September 18

I updated the word replacer to recognize equality between words. Now an original song with the same word used multiple times will be modified such that the replacement word replaces every instance of the original. Also added: if two words were unequal in the original version, they will remain unequal in the new version, so that no unintended duplicates come up.

Original stanza of The National's Sorrow

Sorrow found me when I was young 
Sorrow waited, sorrow won 
Sorrow they put me on the pill 
It's in my honey, it's in my milk 

New stanza

Love found me when I was old 
Love waited, love won 
Love they put me on the team 
It's in my name, it's in my word 
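The consistency rules above (every instance of a word gets the same replacement; distinct originals never collapse into the same replacement) can be sketched as a small bookkeeping pass. The `suggestions` map below is a hypothetical stand-in for whatever source proposes candidate words.

```java
import java.util.*;

public class ConsistentReplacer {
    // Apply replacements so that (1) every instance of the same original word
    // gets the same replacement and (2) two distinct originals never map to
    // the same replacement. For each new word we take its first suggestion
    // that is not already in use, falling back to the word itself.
    public static List<String> replaceAll(List<String> words,
                                          Map<String, List<String>> suggestions) {
        Map<String, String> chosen = new HashMap<>();
        Set<String> used = new HashSet<>();
        List<String> out = new ArrayList<>();
        for (String w : words) {
            String r = chosen.get(w);
            if (r == null) {
                r = w; // default: keep the word if no unused suggestion exists
                for (String cand : suggestions.getOrDefault(w, Collections.emptyList())) {
                    if (!used.contains(cand)) { r = cand; break; }
                }
                chosen.put(w, r);
                used.add(r);
            }
            out.add(r);
        }
        return out;
    }

    public static String demo() {
        Map<String, List<String>> sugg = new HashMap<>();
        sugg.put("sorrow", Arrays.asList("love"));
        sugg.put("honey", Arrays.asList("love", "name")); // "love" taken -> "name"
        List<String> line = Arrays.asList("sorrow", "sorrow", "honey");
        return String.join(" ", replaceAll(line, sugg));
    }
}
```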

September 12

I met with Paul today and we agreed on new priorities. Rhyme and syllable structure are both upcoming constraints, but we decided to pursue the more creative constraints first, namely:

  • Finer POS tags
  • Replace all instances of a word together
  • Replacing meaningfully (W2V, Parse-tree)

September 10

Today I finished the first rudimentary lyric replacer. It reads in lyrics, uses the Stanford Parser to tag all parts of speech, then replaces a certain percentage of each POS with a new word from the same POS. The Penn Treebank POS tags work well for English text, but in many cases the classifications are too broad to replace one word with another of the same tag and still make sense. For example, there is a personal pronoun tag (PRP), but no distinction for gender (male/female), number (singular/plural), person (first, second, third), or case (possessive, objective, subjective, reflexive). For this reason I am currently experimenting mostly with nouns.
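The percentage-based replacement step can be sketched as follows. For simplicity the tags are hand-assigned and the first k matches are replaced deterministically; the real replacer tags with the Stanford Parser and would choose targets randomly.

```java
import java.util.*;

public class PosReplacer {
    // Replace `fraction` of the words carrying `targetTag` with new words of
    // the same tag, drawn from `replacements`.
    public static List<String> replace(List<String> words, List<String> tags,
                                       String targetTag, double fraction,
                                       Iterator<String> replacements) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < words.size(); i++)
            if (tags.get(i).equals(targetTag)) hits.add(i);
        int toReplace = (int) Math.round(hits.size() * fraction);
        List<String> out = new ArrayList<>(words);
        for (int k = 0; k < toReplace && replacements.hasNext(); k++)
            out.set(hits.get(k), replacements.next());
        return out;
    }

    public static String demo() {
        List<String> words = Arrays.asList("flowers", "cover", "the", "earth");
        List<String> tags  = Arrays.asList("NNS", "VBP", "DT", "NN");
        Iterator<String> repl = Arrays.asList("things").iterator();
        // Replace 100% of plural nouns (NNS).
        return String.join(" ", replace(words, tags, "NNS", 1.0, repl));
    }
}
```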

Original lyrics of Perfume Genius's All Waters

When all waters still 
And flowers cover the earth 
When no tree's shivering 
And the dust settles in the desert 
When I can take your hand 
On any crowded street 
And hold you close to me 
With no hesitating 

Modified lyrics

When all communities still 
And things cover the game 
When no tree's shivering 
And the child settles in the line 
When I can take your idea 
On any crowded result 
And hold you close to me 
With no business 

This is a low-level lyric replacer that has only the constraint of Parts of Speech, with replacement frequency as the only user input. Other important constraints will be

  • syllabic
  • rhyming
  • associative (Word2Vec?)

September 4

Penn Treebank PoS tags

Number Tag Description
1 CC Coordinating conjunction
2 CD Cardinal number
3 DT Determiner
4 EX Existential there
5 FW Foreign word
6 IN Preposition or subordinating conjunction
7 JJ Adjective
8 JJR Adjective, comparative
9 JJS Adjective, superlative
10 LS List item marker
11 MD Modal
12 NN Noun, singular or mass
13 NNS Noun, plural
14 NNP Proper noun, singular
15 NNPS Proper noun, plural
16 PDT Predeterminer
17 POS Possessive ending
18 PRP Personal pronoun
19 PRP$ Possessive pronoun
20 RB Adverb
21 RBR Adverb, comparative
22 RBS Adverb, superlative
23 RP Particle
24 SYM Symbol
25 TO to
26 UH Interjection
27 VB Verb, base form
28 VBD Verb, past tense
29 VBG Verb, gerund or present participle
30 VBN Verb, past participle
31 VBP Verb, non-3rd person singular present
32 VBZ Verb, 3rd person singular present
33 WDT Wh-determiner
34 WP Wh-pronoun
35 WP$ Possessive wh-pronoun
36 WRB Wh-adverb

August 31

After all my research on rhyme, syllables, and phonemes, my impression is that this topic is much more complex than it appears on the surface. Rhyme and variations of rhyme are the quintessential textual device of poetry and song lyrics. My idea is to make a software tool that finds/generates rhymes with respect to every possible parameter, such as edit distance between phoneme chains, similar phoneme tolerance, cross-word rhyming tolerance, multisyllabic rhymes, perfect type vs imperfect (lots of variation), and rhyme scheme structures. I still need to research this topic more to organize these diverse elements into workable aspects of my software. I also need to delve more into past scholarship on the subject (Pat Pattison). This idea probably goes beyond the scope of the Pop* project, but many of these ideas still fit into its purpose well.
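One of the parameters above, edit distance between phoneme chains, can be computed with a standard Levenshtein distance over phoneme symbols. This sketch uses space-separated ARPAbet-style strings as an assumed input format; a fuller rhyme tool would weight substitutions by phonetic similarity rather than counting them all equally.

```java
public class RhymeDistance {
    // Levenshtein edit distance between two phoneme chains, given as
    // space-separated symbol strings (e.g. "M IH L K" for "milk").
    // A low distance between word tails suggests a near rhyme.
    public static int editDistance(String a, String b) {
        String[] p = a.split(" "), q = b.split(" ");
        int[][] d = new int[p.length + 1][q.length + 1];
        for (int i = 0; i <= p.length; i++) d[i][0] = i;
        for (int j = 0; j <= q.length; j++) d[0][j] = j;
        for (int i = 1; i <= p.length; i++) {
            for (int j = 1; j <= q.length; j++) {
                int sub = p[i - 1].equals(q[j - 1]) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1,   // deletion
                                            d[i][j - 1] + 1),  // insertion
                                   d[i - 1][j - 1] + sub);     // substitution
            }
        }
        return d[p.length][q.length];
    }
}
```

A "similar phoneme tolerance" would then be a threshold on this distance, and cross-word rhyming would concatenate the phoneme chains of adjacent words before comparing tails.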

I created these pages to compile technical information on these subjects:

August 30

I am logging my research hours on the Time Log page.

mind/research-notes.txt · Last modified: 2016/12/19 08:22 by bayb2