Spotify’s Music Recommendations Lambda Architecture
Esh Kumar @eshvk
Emily Samuels @emilymsa

Overview
Why Lambda?
Use Case: Discover Recommendations
Batch Architecture
Real-time Architecture
Challenges
Future Work

Why Lambda?
1 new user every 3 seconds.
Contextual, time-based recs are becoming more and more important.

Discover Recs

The Discover Page
Algorithmically generated fresh recs for users.

The Discover Batch Pipeline

Machine Learning Deep Dive

Word2Vec
Words with similar contexts have similar meanings.

Word2Vec
King – Man + Woman = Queen
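The analogy on this slide is plain vector arithmetic followed by a nearest-vector lookup. The sketch below shows that arithmetic on a toy vocabulary. The three-dimensional vectors and the helper names `cosine` and `analogy` are hypothetical and hand-picked for illustration; real word2vec embeddings are learned from data and have many more dimensions.

```python
import math

# Hypothetical, hand-picked 3-d embeddings (illustration only, not learned).
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors):
    """Return the word whose vector is closest to a - b + c.

    The three input words are excluded from the candidates, as is
    common when evaluating analogies.
    """
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


print(analogy("king", "man", "woman", VECTORS))  # prints "queen" for these toy vectors
```

With learned embeddings the result is only approximately equal to the expected word, which is why the lookup picks the nearest vector rather than testing for equality.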

Annoy: Approximate Nearest Neighbors Oh Yeah!
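Annoy answers nearest-neighbor queries approximately by partitioning the vector space with random hyperplanes arranged in trees. The sketch below illustrates that idea with a single tree in plain Python. It is not Annoy's API: `build_tree`, `query`, and the dictionary node layout are hypothetical names chosen for this example, and Annoy itself builds a forest of such trees to improve recall.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_tree(ids, vectors, rng, leaf_size=4):
    """Recursively split item ids with a hyperplane between two random items."""
    if len(ids) <= leaf_size:
        return {"leaf": ids}
    p, q = rng.sample(ids, 2)
    # Hyperplane equidistant from the two sampled points.
    normal = [a - b for a, b in zip(vectors[p], vectors[q])]
    midpoint = [(a + b) / 2 for a, b in zip(vectors[p], vectors[q])]
    offset = dot(normal, midpoint)
    left = [i for i in ids if dot(normal, vectors[i]) < offset]
    right = [i for i in ids if dot(normal, vectors[i]) >= offset]
    if not left or not right:  # degenerate split, e.g. duplicate vectors
        return {"leaf": ids}
    return {
        "normal": normal,
        "offset": offset,
        "left": build_tree(left, vectors, rng, leaf_size),
        "right": build_tree(right, vectors, rng, leaf_size),
    }


def query(tree, vector, vectors, k=1):
    """Walk down to one leaf and rank only the items stored there."""
    node = tree
    while "leaf" not in node:
        side = "left" if dot(node["normal"], vector) < node["offset"] else "right"
        node = node[side]
    return sorted(node["leaf"], key=lambda i: sq_dist(vectors[i], vector))[:k]


rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(200)]
tree = build_tree(list(range(len(vectors))), vectors, rng)
nearest = query(tree, vectors[42], vectors, k=3)
```

Only one leaf is scanned per query, so a true neighbor that falls on the other side of a split can be missed. That is the accuracy-for-speed trade implied by "approximate"; searching several independently built trees and merging their candidates reduces the miss rate.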

Batch Architecture
Strengths: Recs are based on the complete user history.
Weaknesses: User vector generation time increases with the number of users. Recs do not reflect the user's current mood.

Intro to Storm

Storm
Distributed real-time computation system

Storm @ Spotify

Real-time Architecture

Challenges
Workers die -> cascading JVM process death
Memcache flakiness
Cassandra JVM problems due to write/overwrite pattern

Future/Ongoing Work
Simplify the topology
Keep listens for 24 hours
Ongoing work on other real-time personalization features
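As a rough illustration of the "keep listens for 24 hours" item, the sketch below keeps a per-user sliding window of listens and drops anything older than the window. The class `RecentListens`, the helper `mean_vector`, and the idea of averaging track vectors into a short-term user vector are assumptions made for this example; the slides do not say how the retained listens are stored or combined.

```python
from collections import deque

WINDOW_SECONDS = 24 * 60 * 60  # retention window named on the slide


class RecentListens:
    """Keeps one user's listens from the last `window` seconds, oldest first."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        # (timestamp, track_id) pairs, assumed to arrive in time order.
        self.events = deque()

    def add(self, timestamp, track_id):
        self.events.append((timestamp, track_id))
        self.evict(timestamp)

    def evict(self, now):
        """Drop listens that are at least `window` seconds older than `now`."""
        while self.events and self.events[0][0] <= now - self.window:
            self.events.popleft()

    def tracks(self, now):
        self.evict(now)
        return [track_id for _, track_id in self.events]


def mean_vector(track_ids, track_vectors):
    """Average the vectors of the given tracks (an assumed aggregation).

    Returns None when none of the tracks has a known vector.
    """
    rows = [track_vectors[t] for t in track_ids if t in track_vectors]
    if not rows:
        return None
    return [sum(column) / len(rows) for column in zip(*rows)]


# Hypothetical usage with made-up timestamps (seconds) and 2-d track vectors.
listens = RecentListens()
listens.add(0, "a")
listens.add(7200, "b")
listens.add(90000, "c")  # more than 24 hours after "a", so "a" is evicted
recent = listens.tracks(now=90000)
user_vector = mean_vector(recent, {"a": [1.0, 1.0], "b": [0.0, 2.0], "c": [2.0, 0.0]})
```

A bounded window like this keeps the real-time state small and biased toward current listening, while the batch pipeline continues to cover the complete history.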

Esh Kumar
Emily Samuels