**This is an old revision of the document!**

This is Jacob O'Bryant's project to create a music player that learns what songs the user wants to listen to based primarily on when they skip or listen to the currently playing song.

# plans

1. Insert a dummy routine into the music app that chooses the next song to play.
2. make the app record skip ratios and use that to pick the next song
3. add a “mood change” feature so the user can manually specify that they're in a different mood
4. add feature to display the probability of certain songs/albums/artists being played.
5. implement the evaluation code so we can compare the moodless algorithm to normal shuffling and make a nice graph or something.
6. Set up a web app and make the app upload the data to it.
7. Recruit people to use the app so we can get their data.
8. Verify that the app works when the user specifies their mood manually ← even if this is as far as we get this semester, that could be ok
9. figure out an algorithm for detecting what mood the user is in automatically and organizing the data we collect accordingly
10. implement the moodful shuffling algorithm.
11. compare the algorithm's performance to normal shuffling and moodless smart shuffling.
12. write a paper about it

# activity notes

## 7 Mar 2017

I discovered a bug in the code that records skip events. I was recording what the current position in the song was and counting the song as skipped if over 50% of the song has been listened to. However, I discovered that I was receiving incorrect values for the current position in the song. I recently discovered how to find out if the song song is being skipped because the user pressed skip or simply because we've reached the end of the song. I changed the code to record skip events based on that so we don't even worry about the current song position. I've been using the app for a few days since fixing that, and I can now verify that the data is being collected corrected.

I wrote some python code to do statistical analysis on the data also. The output looks like this:

   control (n=20, confidence=0.950)
skip ratio: 0.500
margin of error: 0.184
   mood 0 (n=30, confidence=0.950)
skip ratio: 0.333
margin of error: 0.142

The control group is the songs that were suggested purely by random. All other groups have songs suggested by our algorithm. The data in mood 0 shows that our algorithm performs better than the control, but the margin of error is pretty high. I only have a couple days' data to work with, so the numbers will be more interesting after I've used the app for longer. The script also generates this graph:

[there's supposed to be a graph here…]

The graph shows the skip ratio when considering the first n skip events. Since the algorithm improves over time but the graph always includes data from the beginning, the graph will tend to overestimate the skip ratio for our algorithm.

## 2 Mar 2017

I've changed the app so whenever it suggests a new song, there is a 10% chance the song will be picked totally by random. When the app records the user's response to the song (skip or listen), it also records whether the song was suggested with our fancy algorithm or by chance. This data will be our “control group.” I've also added some very basic code for allowing the user to manually switch between different moods. Right now they get three different moods. Soon I'll make it so you can create however many moods you want, etc.

My goal right now is to get to the recruiting stage as fast as possible. As soon as I have the code in place to make the app upload its data to me, I'll start looking for some early research participants. While they start to use the app and give me their data, I'll work on the app's usability so I can start recruiting more heavily.

## 23 Feb 2017

I was thinking about how to remedy the following problem. With the current moodless player, it falls back to album and artist skip ratios if there isn't sufficient data about the actual song itself being played. If you have an album with nine songs you hate and one song you love, it's likely the one song will get buried. We record the timestamp of whenever you listen or skip a song, but currently it isn't being used. We could give higher weight to songs that haven't been played. The timestamps could be used so we don't….

OK different idea that just popped into my head: Before we make the app automatically detect the user's mood, we can give them an option to specify what mood they're in. This would make the app work for everyday use, so we could deploy it and start collecting data right away while we figure out the auto mood stuff. The data would probably help with figuring out the mood-detecting algorithm too. We could also start detecting how good the moodless algorithm is. To get the performance of our moodless algorithm, we calculate the number of skips per hour or number of skips per number of listens. To compare this to random shuffling, every ten songs we could give the user a totally random song and use their reaction to see what the performance is. That way we can get both kinds of data, but we don't have to make the user sit through a lot of random shuffling. (maybe throwing in a totally random song here and there would help with adding variety anyway).

The app can keep track of what version of the algorithm is used to pick songs. So if we want to improve the algorithm, all we do is make the change, update the version, push out an update for the app and wait for the new data to be uploaded to our server. That would be a really nice setup to have working. So I think getting that up and running should be the first priority.

After that's done, we could start making tweaks like the one I started to mention in the first paragraph. But more importantly, we'd be able to see if the current algorithm is good enough. If we can verify that the app is pretty good at playing the right songs when the user tells us what mood they're in, then we should be in good shape to do the auto mood stuff. And then we'd have some data we could use to write a paper.

Note on the user specifying their mood: To the app, a new mood would be like a new user account on a computer or something. The app wouldn't infer anything from what the user names the mood (like “rock” or “lighthearted” or whatever), it would basically be the same as “mood 1”, “mood 2” etc. When the user creates and switches to a new mood, the skip ratio data from other moods would be completely ignored. So that's why it's like a new user account: the new user's data is completely separate.

To help in improving the algorithm, I'll add a feature in the app so you can browse through your songs/artists/albums and the app will show you how likely it is to play them. This will help us to find ways to improve the algorithm. I've also read that users have been shown to have more trust/think more favorably towards recommendation systems when the systems are transparent, i.e. the user has some idea of what's going on under the hood instead of it being a black box. So the users would probably appreciate that. It could make them more willing to keep using the app. They might also be more enthusiastic about using it when we push out updates to the algorithm since they might be more cognizant of how our updates are changing the app's behaviour.

## 22 Feb 2017

As of now, the music player app keeps track of skip ratios correctly. I actually finished the skip-behaviour-recording code last Thursday, but there were some bugs I had to iron out. The music player app had some code to do gapless playback that messed things up. While the current song was playing, the next song would be preloaded so it could start right as the current one finishes. The problem is that the new code I write doesn't decide what song to play next until the current song ends, so sometimes the next song would be the preloaded one instead of the one my code came up with. I spent a few hours figuring that out, and the problem was fixed after commenting out a few lines of code. So everything is fine and dandy. Now I can start I can use the app to collect data from my own personal use. After a little while (a few days? a week?) there will hopefully be enough data to start working with.

Now that this part of the coding is over, I think I'm back into a thinking stage. Here's an updated todo list:

1. figure out how to evaluate the performance of the smart shuffle thing (e.g. we could record number of skips/hour and compare this to when we use normal shuffling)
2. figure out an algorithm for detecting what mood the user is in and organizing the data we collect accordingly
3. implement the evaluation code so we can compare the moodless algorithm to normal shuffling and make a nice graph or something.
4. implement the moodful shuffling algorithm.
5. compare the algorithm's performance to normal shuffling and moodless smart shuffling.
6. Set up a web app and make the app upload the data to it.
7. Recruit people to use the app so we can get there data.
8. Measure the performance of the algorithm on all the other people and write a paper about it.

I also just remembered that I'm technically supposed to write a progress report halfway through the semester. So maybe I'll do that soon too.

Side note: Here's an interesting paper about music information retrieval (MIR): http://www.cp.jku.at/research/papers/schedl_jiis_2013.pdf (The Neglected User in Music Information Retrieval Research by Schedl et. al.). It basically says that most MIR studies do evaluation by comparing their results to existing databases, etc. instead of having actual humans directly involved in evaluating. The problem is that these “system-based evaluations” don't necessarily match up with how humans would evaluate things (since humans are really complicated). He says other related fields (like information retrieval in general) are better about this than MIR is right now. So a cool thing about this project is that we're making an app that actual humans use, and we can use their usage data to directly evaluate our algorithm's performance. So that'll be a good thing to keep in mind.

## 09 Feb 2017

The raw data we collect will be a collection of records with these fields:

- song title
- album
- artist
- timestamp
- skipped?

A record will be created whenever the music player moves to the next song, whether that be from the current song finishing or the user skipping it.

The current state includes the following information that we care about:

- listen/skip ratios for songs, artists and albums (observable)
- the user's mood (hidden)

(Paul mentioned that I should look into hidden Markov models. From what I can tell, these would be the observable and hidden components of that model).

Using the timestamp data, we can find out when the listen/skip behaviour of the user changes. This means the user must be in a different mood.

Given a library of songs, we need to come up with a probability for each song that it is the one the user would most enjoy listening to next (so the sum of the probabilities will be 1). To simplify things, let's assume the mood is constant and the current skip ratios correspond to the current mood. We can calculate the probability that if we play a certain song, the user will listen to it (instead of skipping).

The ratio of number of times listened to number of times played (i.e. number of times listened + number of times skipped) for the individual song is this probability. If we have played song A for the user 1,000 times and they skipped it 500 times, then the probability that the user will listen to the song if we play it again is clearly 0.5. (Remember that we're holding the mood constant, and we also do not take into account that the user will get tired of the song if we play it over and over again). In this case, we don't care about the artist and album skip ratios. The song skip ratio is categorically more important than the other skip ratios.

However, things are different if we don't have as much data on the individual song. If we played song A for the user only two times and they skipped it once, then the ratio is the same, but we probably don't want to rely solely on it. If we have a lot of data on the album or artist skip ratios, those would probably improve our calculations. This suggests a confidence level for each ratio. Let R_s, R_al, R_ar be the ratio of number of times listened to number of times played for the song, album and artist, respectively. Let C_s, C_al, C_ar be the confidence levels of those ratios, where each confidence level is between 0 and 1. Then we could calculate P(the user will listen to the song | we play the song) like so:

   P = R_s*C_s + (1-C_s)(R_al*C_al + (1-C_al)(R_ar*C_ar))

Given this probability for each song, we could normalize them to get the probability of each song being the best song to play next. Then we would have the data we need to choose the next song to play.

We do need to calculate the confidence level though. If we've played the song 0 times, the confidence should be zero. As the number of times we've played the song increases, the confidence should approach 1. So some sort of simple, mathy function should do fine there.

   confidence(n) = n/(n + a)

where n is the number of times played and a is some constant. The smaller a is, the faster the confidence will increase. Perhaps machine learning could be used to find the optimal value of a?

But I think machine learning will primarily be needed only when we consider the mood. As a first step, I could implement all this stuff and make a music player that assumes you're always in the same mood. Then the next step would be figuring out how to model mood changes (how hard could it be?).

Here's something we could do: use the timestamps to cluster the data records into listening sessions. We could assume that the mood stays the same during each listening session (I don't think that assumption will be true all the time, so it'd be best to do something fancier down the road so we don't need that assumption). Then we could calculate the skip ratios for each listening session. Then we'd need a way to figure out if the skip ratios for one session are similar or different from another session. If they're similar enough, we can assume the user was in the same mood and combine the ratios. After doing that with all the listening sessions, we'll have a number of “moods”, each with a corresponding set of skip ratio data.

When the user starts listening and we don't know what mood they're in, we can sample songs from each mood set that are likely to be played when the user is in that mood. We can calculate the current skip ratio data and continue sampling until the skip ratio data becomes similar to one of the existing moods. If the data doesn't become similar, the user must be in a different mood. Then we assign them to a “new mood”, and make suggestions solely off of data from the current listening session.

short term goals: Figure out if this whole thing is the correct approach to take. If it is, start working on the moodless music player. Also figure out more concretely how to implement the moodful music player. Figure out a model and stuff like that.

## 02 Feb 2017

I finished step one, adding a dummy method to control the next song when skipping. In the code, there's not a huge distinction between when the next song is played because the user skipped or because the song finished. So when the song skips, I get the current position in the song which can be used to see if the song played to completion. I modified the music player so that if you skip a song and it's been playing for less than five seconds, the next song will be the theme song from “Bill Nye the Science Guy.” Once the machine learning stuff is in there, I just have to change the code to play whatever song the algorithm suggests.

Short-term plans: Finish the machine learning tutorial and start messing around with TensorFlow. Design the data model and start implementing it.

## 31 Jan 2017

Up until today, most of the work I've done has been researching and deciding on what platform/tools to use for the music player app. I looked into a few projects that allow you to write a mobile app and then compile it automatically for Android, iOS and other platforms. The two main ones I looked into were NativeScript and Kivy. With NativeScript, you write the app in HTML + CSS + JavaScript, and the project translates that somehow into actual native UI elements on the respective platforms. It looked nice, but Paul brought up the question of if machine learning libraries were available. I started looking at Kivy because it lets you write the code in Python. Kivy doesn't use native UI elements, so the app would look funny, but that doesn't really matter for our purposes. I made an Android hello world app with Kivy. It was pleasant, but after some more thought, I decided to scrap the whole cross-platform idea and just write a standard Android app in Java. I wanted to try one of the non-java solutions because

1. I've done a fair amount of normal Android development already and I don't particularly like Java.
2. It would make it easier to target iOS instead of just Android.

I ended up sticking with just Java + Android because

1. I can use an existing open-source music player instead of writing one from scratch (specifically, Vanilla Music Player*). I don't think it would have taken too long to write a basic music player from scratch (we only need the most basic features), but the time needed would still have been nontrivial. Starting with the open-source player will let me jump into machine learning faster.
2. Targeting iOS would be nice, but I think it's more important to get a prototype working as fast as possible. Even if the project stays on just Android, that should be sufficient for the near future. I can always switch to a cross-platform solution or create a separate iOS app later if it really would add enough value.

(*Last year, as part of a different research project, I tested out a bunch of open-source music players for Android. Vanilla Music Player is (relatively) simple and I can actually clone and build the project without getting a million errors (that's a lot more than I can say for the other players. Encore Music Player in particular was a nightmare, even though the precompiled app on the Play store looks really nice).)

So I've been delving into the code for Vanilla, making minor changes to help me understand how it's working. The disadvantage of using an existing player is that it's a relatively large project and I have to get used to the code base. I've also been reading through a machine learning tutorial on the TensorFlow website.

Short-term plans: Identify where the code for shuffling songs is. Figure out how to replace it with my own song-choosing code. Finish the machine learning tutorial and start messing around with TensorFlow.