Yahoo! Labs: Modeling networks of social behavior

I had the opportunity to attend this talk at Yahoo! Labs in Santa Clara, California, which, as its website says, "is responsible for the algorithms behind the quality of the web experience for hundreds of millions of users. We impact one out of every two people online..." Not many can make that claim.

Today's talk was by Danny Wyatt, a PhD candidate in Computer Science at the University of Washington. Danny does research in computational social science, focusing on new methods for understanding and predicting social networks derived from automatically recorded behavioral data. He went over his data- and calculus-dense paper, "Discovering Long Range Properties of Social Networks with Multi-Valued Time-Inhomogeneous Models."

His paper covers three new models developed for analyzing this kind of social behavior data, since existing social network analysis techniques cannot be easily applied to it.

Spoken Networks projects

Danny described his work in the University of Washington's Spoken Networks projects, a data-collection effort that gathered, while preserving participants' privacy, a year's worth of real-world, face-to-face conversations. This is new, as most studies have relied on after-the-fact reports.

The 24 graduate students who participated wore recorders for up to 8 hours daily, ultimately recording 4,400 hours of data.

This unprecedented access provides opportunities to address many questions including, in the study's words:

  • How does local behavior relate to the global structure of the social network?
  • How does a social network change over time?
  • How can meaningful information be extracted from raw data?
  • And how can all of this be done while protecting privacy?

What the data can show

There can be many real-world applications of this research. It was fascinating, for me, to look at quantifying the thresholds for "socially significant" time in conversations and at finding the point where the conversation's "social utility" starts to decline.

In his summation, Danny proposed some immediate applications of this work: for example, the ability to automatically rank email messages (i.e., predict which messages he would really want to see at a specific time of day or during an existing activity), or to rank people's social influence (influential people are better ad targets).

His "influence mixture model" was particularly interesting. It quantifies how much influence each person exerts on others' speaking styles (pitch and rate), and the relationship between those influences and the global network structure. That is, people change the way they speak: over time, they gradually converge in pitch and rate. The more central a person is in a social network, the more other people change their pitch and rate to match that person. (Central, in this study, means the person who has the most short conversations, or social interactions, between people.)
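To make the notion of "central" a little more concrete for myself, here is a toy Python sketch. It is not Danny's model and it is not taken from the paper. It only counts, for each person, how many short conversations they took part in, and treats the highest count as a rough stand-in for centrality. The names, the sample data, and the 60-second cutoff for "short" are all my own assumptions for illustration.

```python
from collections import Counter


def short_conversation_counts(conversations, max_seconds=60.0):
    """Count, per person, the conversations lasting at most max_seconds.

    conversations: iterable of (person_a, person_b, duration_seconds).
    The 60-second default is an illustrative assumption, not a value
    taken from the study.
    """
    counts = Counter()
    for person_a, person_b, seconds in conversations:
        if seconds <= max_seconds:
            counts[person_a] += 1
            counts[person_b] += 1
    return counts


def most_central(conversations, max_seconds=60.0):
    """Return the person with the most short conversations, or None.

    Ties are broken alphabetically so the result is deterministic.
    """
    counts = short_conversation_counts(conversations, max_seconds)
    if not counts:
        return None
    return max(sorted(counts), key=counts.__getitem__)


# Hypothetical sample data: (speaker, speaker, duration in seconds).
sample = [
    ("ann", "bob", 30),
    ("ann", "cat", 45),
    ("bob", "cat", 600),  # too long to count as "short"
    ("ann", "dan", 20),
    ("cat", "dan", 50),
]

print(most_central(sample))
```

In this made-up sample, ann takes part in three of the four short conversations, so she comes out as the most central person. The real model goes much further, tying centrality to how strongly others shift their pitch and rate.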

If you have the chance, read through some of the material on the Spoken Networks website.

Incidentally, Yahoo! Labs is hiring

Yahoo! Labs has offices in Santa Clara, California; New York, New York; Barcelona, Spain; Haifa, Israel; Santiago, Chile; and Bengaluru, India. If you want to be a part of this scientific engine, Yahoo! Labs is looking to hire the following staff:
- outstanding Research Scientists at all levels in computer science, economics, social sciences, and large-scale computing
- Research Engineers to work alongside our scientists, for a variety of project types, including rapid prototype and proof-of-concept development, research engineering libraries, tools, and platforms for distributed computing and algorithm evaluation
- Search Sciences Director in Bengaluru (Bangalore)

Note: Thanks to Daniel Raffel for his assistance.