Spotify’s Music Recommendations Lambda Architecture
  Esh Kumar @eshvk
  Emily Samuels @emilymsa
Overview
  Why Lambda?
  Use Case: Discover Recommendations
  Batch Architecture
  Real-time Architecture
  Challenges
  Future Work
Why Lambda?
  1 new user every 3 seconds.
  Contextual, time-based recs are increasingly important.
The Discover Page
  Algorithmically generated, fresh recs for users.
The Discover Batch Pipeline
Machine Learning Deep Dive
Word2Vec
  Words that appear in similar contexts have similar meanings.
Word2Vec
  King - Man + Woman = Queen
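The analogy on this slide can be illustrated with a minimal sketch. The three-dimensional vectors below are invented for illustration (real Word2Vec embeddings are learned from data and have far more dimensions), and the helper names `cosine` and `analogy` are hypothetical, not part of any library.

```python
import math

# Toy vectors, invented for illustration only.
# Rough reading of the dimensions: [royalty, male, female].
VECTORS = {
    "king":   [0.9, 0.8, 0.1],
    "queen":  [0.9, 0.1, 0.8],
    "man":    [0.1, 0.9, 0.1],
    "woman":  [0.1, 0.1, 0.9],
    "prince": [0.7, 0.9, 0.1],
}

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c, vectors=VECTORS):
    """Return the word whose vector is closest to a - b + c, excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

result = analogy("king", "man", "woman")
```

With these toy vectors, `result` is `"queen"`. The input words are excluded from the candidates, a common convention when evaluating analogies, because the nearest vector to the target is often one of the inputs themselves.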
Annoy
  Approximate Nearest Neighbors Oh Yeah!
  https://github.com/spotify/annoy
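To give a feel for how approximate nearest neighbor search can avoid comparing a query against every item, here is a simplified pure-Python sketch of random-projection trees. It is written in the spirit of the idea only and is not Annoy's implementation or API. The function names, the leaf size, the number of trees, and the random data are all assumptions made for this example.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def build_tree(ids, vectors, rng, leaf_size=4):
    """Recursively split items with a hyperplane equidistant from two random items.

    Returns ("leaf", ids) or ("node", normal, offset, left_subtree, right_subtree).
    """
    if len(ids) <= leaf_size:
        return ("leaf", ids)
    a, b = rng.sample(ids, 2)
    normal = [x - y for x, y in zip(vectors[a], vectors[b])]
    midpoint = [(x + y) / 2 for x, y in zip(vectors[a], vectors[b])]
    offset = dot(normal, midpoint)
    left = [i for i in ids if dot(normal, vectors[i]) < offset]
    right = [i for i in ids if dot(normal, vectors[i]) >= offset]
    if not left or not right:  # degenerate split, e.g. duplicate points
        return ("leaf", ids)
    return ("node", normal, offset,
            build_tree(left, vectors, rng, leaf_size),
            build_tree(right, vectors, rng, leaf_size))

def leaf_candidates(tree, query):
    """Walk down one tree to the leaf on the query's side of each split."""
    while tree[0] == "node":
        _, normal, offset, left, right = tree
        tree = left if dot(normal, query) < offset else right
    return tree[1]

def nearest(forest, vectors, query, k):
    """Pool the candidates from every tree, then rank only those by exact distance."""
    pool = set()
    for tree in forest:
        pool.update(leaf_candidates(tree, query))
    return sorted(pool, key=lambda i: sq_dist(vectors[i], query))[:k]

# Hypothetical data: 200 random 8-dimensional vectors and a forest of 5 trees.
rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(200)]
forest = [build_tree(list(range(len(vectors))), vectors, rng) for _ in range(5)]
neighbors = nearest(forest, vectors, vectors[7], k=3)
```

Each query ranks only the handful of items that share a leaf with it in some tree, so results are approximate: a true neighbor on the other side of a split can be missed. Building more trees makes misses less likely at the cost of memory and query time.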
Batch Architecture
  Strengths
    Recs based on complete user history.
  Weaknesses
    User vector generation time increases with the number of users.
    Not reflective of the user's current mood.
Intro to Storm
Storm
  Distributed real-time computation system
Storm @ Spotify
Challenges
  Workers die -> cascading JVM process death
  Memcache flakiness
  Cassandra JVM problems due to write/overwrite pattern
Future/Ongoing Work
  Simplify the topology
  Keep listens for 24 hours
  Ongoing work on other real-time personalization features
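As an illustration of the "keep listens for 24 hours" item, here is a minimal in-memory sketch of a retention window. The slides do not say how or where listens are stored, so the class name `RecentListens`, its methods, and the use of a deque are all hypothetical. The sketch also assumes timestamps are in seconds and arrive in non-decreasing order.

```python
from collections import deque

RETENTION_SECONDS = 24 * 60 * 60  # 24 hours

class RecentListens:
    """Keeps one user's listens and drops those older than the retention window."""

    def __init__(self, retention=RETENTION_SECONDS):
        self.retention = retention
        self._events = deque()  # (timestamp, track_id) pairs, oldest first

    def add(self, timestamp, track_id):
        """Record a listen; assumes timestamps are non-decreasing."""
        self._events.append((timestamp, track_id))
        self._evict(timestamp)

    def _evict(self, now):
        # Pop from the left until the oldest remaining listen is inside the window.
        while self._events and now - self._events[0][0] > self.retention:
            self._events.popleft()

    def tracks(self, now):
        """Return the track ids listened to within the retention window ending at `now`."""
        self._evict(now)
        return [track_id for _, track_id in self._events]

listens = RecentListens()
listens.add(0, "track_a")
listens.add(3600, "track_b")
```

Because events are stored oldest first, eviction only ever inspects the front of the deque, so each listen is appended once and removed at most once.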