The Human Network - Matthew O. Jackson - E-Book

Description

It's not what you know, it's who you know. Or so the adage goes. Professor Matthew Jackson, world-leading researcher into social and economic networks, shows us why this is far truer than we'd like to believe. Based on his ground-breaking research, The Human Network reveals how our relationships in school, university, work and society have extraordinary implications throughout our lives and demonstrates that by understanding and taking advantage of these networks, we can boost our happiness, success and influence. But there are also wider lessons to be learnt. Drawing on concepts from economics, mathematics, sociology, and anthropology, Jackson reveals how the science of networks gives us a bold new framework to understand human interaction writ large - from banking crashes and viral marketing to racism and the spread of disease. Filled with counter-intuitive ideas that will enliven any dinner party - e.g. how can our popularity in school affect us for the rest of our lives? - The Human Network is a "big ideas" book that no one can afford to miss.

Page count: 603

Year of publication: 2019

ABOUT THE AUTHOR

Matthew O. Jackson is the William D. Eberle Professor of Economics at Stanford University (where he received his PhD in 1988), an external faculty member of the Santa Fe Institute, and a fellow of the Canadian Institute for Advanced Research. He is a member of the National Academy of Sciences, a fellow of the American Academy of Arts and Sciences, a fellow of the Econometric Society, a Game Theory Society Fellow, and an Economic Theory Fellow. He has received the Social Choice and Welfare Prize, the John von Neumann Award from Rajk Laszlo College, the Berkeley Electronic Press Arrow Prize for Senior Economists, and a Guggenheim Fellowship. He teaches an online course on social and economic networks, and co-teaches (with Kevin Leyton-Brown and Yoav Shoham) two online game-theory courses, which together have reached more than a million students. He is the author of Social and Economic Networks and Handbook of Social Economics.

THE HUMAN NETWORK

How We’re Connected and Why It Matters

MATTHEW O. JACKSON

First published in the United States in 2019 by Pantheon Books, a division of Penguin Random House LLC, New York.

Published in trade paperback in Great Britain in 2019 by Atlantic Books, an imprint of Atlantic Books Ltd.

Copyright © Matthew O. Jackson, 2019

The moral right of Matthew O. Jackson to be identified as the author of this work has been asserted by him in accordance with the Copyright, Designs and Patents Act of 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of both the copyright owner and the above publisher of this book.

Grateful acknowledgment is made for permission to reprint images on the following pages: Page 62: Barbulat/Shutterstock.com; Page 83: Reprinted from Physica A: Statistical Mechanics and Its Applications, volume 379, “The Topology of Interbank Payment Flows,” by Kimmo Soramäki, Morten L. Bech, Jeffrey Arnold, Robert J. Glass, and Walter E. Beyeler, pages 317–33. Copyright © 2007 by Elsevier B.V. Reprinted by permission of Elsevier B.V.

All other images created by the author.

Every effort has been made to trace or contact all copyright holders. The publishers will be pleased to make good any omissions or rectify any mistakes brought to their attention at the earliest opportunity.

A CIP catalogue record for this book is available from the British Library.

Trade paperback ISBN: 978 1 78649 020 9

E-book ISBN: 978 1 78649 021 6

Printed in Great Britain

Atlantic Books

An imprint of Atlantic Books Ltd

Ormond House

26–27 Boswell Street

London

WC1N 3JZ

www.atlantic-books.co.uk

For Sally and Hal

CONTENTS

1. Introduction: Networks and Human Behavior

2. Power and Influence: Central Positions in Networks

3. Diffusion and Contagion

4. Too Connected to Fail: Financial Networks

5. Homophily: Houses Divided

6. Immobility and Inequality: Network Feedback and Poverty Traps

7. The Wisdom and Folly of the Crowd

8. The Influence of Our Friends and Our Local Network Structures

9. Globalization: Our Changing Networks

Acknowledgments

Notes

Bibliography

Index

1 · INTRODUCTION: NETWORKS AND HUMAN BEHAVIOR

The More Things Change

“In Globalization 1.0, which began around 1492, the world went from size large to size medium. In Globalization 2.0, the era that introduced us to multinational companies, it went from size medium to size small. And then around 2000 came Globalization 3.0, in which the world went from being small to tiny.”

— THOMAS FRIEDMAN, INTERVIEW IN WIRED (AUTHOR OF THE WORLD IS FLAT)

On December 17, 2010, Mohamed Bouazizi, a twenty-six-year-old street vendor in the dusty small city of Sidi Bouzid in central Tunisia, lit himself on fire. He did so as a desperate statement of outrage at the tyrannical government that had ruled Tunisia for more than two decades and repeatedly crushed any opposition. His family had long been outspoken against the government and he found himself regularly harassed by the local police. That morning, the police publicly humiliated him and confiscated his day’s produce. Mohamed had borrowed the money to buy his produce, and its loss was the last of many straws. Mohamed drenched himself in gasoline and burned himself alive in protest.

Decades ago, the several-thousand-person protest that quickly followed would have been the end of the story. Few outside of Sidi Bouzid would have even been aware that anything happened. However, videos of the aftermath of Mohamed Bouazizi’s self-immolation were impossible to contain and were quickly shared via social media and reported widely. News of the Tunisian and other governments’ oppression had already been spreading after confidential documents appeared weeks earlier on WikiLeaks. The Arab Spring that would follow was enabled by and coordinated via social media such as Facebook and Twitter as well as cell phones.1

Although the methods of communication were modern, ultimately it was a network of humans spreading news and outrage. What was new was how widely and quickly news could spread, and how people were able to coordinate their responses. But understanding what happened still boils down to understanding how news spreads between people and how their behaviors influence each other.

The size and ferocity of the resulting Tunisian protests toppled the government by mid-January. The insurgency had also spread to neighboring Algeria, and over the next two months erupted in Oman, Egypt, Yemen, Bahrain, Kuwait, Libya, Morocco, and Syria, and even Saudi Arabia, Qatar, and the United Arab Emirates. The successes and failures of the Arab Spring are open to debate. But the swift proliferation of protests throughout that part of the world was not only unprecedented but highlighted the importance of human networks in our lives.

As dramatic as recent changes in human communication have been, as Thomas Friedman’s quote above indicates, the world has shrunk many times before, in the wake of the printing press, the posting of letters, overseas travel, trains, the telegraph, the telephone, the radio, airplanes, television, and the fax machine. Internet technology and social media are only the latest chapter in the long history of changes in how people interact, at what distance, how quickly, and with whom.

Yet even as networks of interactions between humans change, much about them is enduring and predictable. Understanding human networks, as well as how they are changing, can help us to answer many questions about our world, such as: How does a person’s position in a network determine their influence and power? What systematic errors do we make when forming opinions based on what we learn from our friends? How do financial contagions work and why are they different from the spread of a flu? How do splits in our social networks feed inequality, immobility, and polarization? How is globalization changing international conflict and wars?

Despite their prominent role in the answers to these questions, human networks are often overlooked when people analyze important political and economic behaviors and trends. This is not to say that we have not been studying networks, but instead that there is a chasm between our scientific knowledge of networks as drivers of human behavior and what the general public and policymakers know. This book is meant to help close that gap.

Each chapter shows how accounting for networks of human relationships changes our thinking about an issue. Thus, the theme of this book is how networks enhance our understanding of many of our social and economic behaviors.

There are a few key patterns of networks that matter, and so the story here involves more than just one idea hammered home. By the end of this book, you should be more keenly aware of the importance of several aspects of the networks in which you live. Our discussion will also involve two different perspectives: one is how networks form and why they exhibit certain key patterns, and the other is how those patterns determine our power, opinions, opportunities, behaviors, and accomplishments.

Billions Upon Billions of Networks

“Life is really simple, but we insist on making it complicated.”

— UNKNOWN2

Carl Sagan, in his famous book on the cosmos, talked of the “billions upon billions” of stars that exist in our universe. The number of stars in the observable universe has been estimated to be on the order of three hundred sextillion: 300,000,000,000,000,000,000,000—a number that sounds fictitious, like a zillion or a gazillion. If you are anything like me, it makes you feel small and insignificant, and in awe of nature.

The amazing thing is that this is a tiny number compared to the number of different networks of friendships that could potentially exist among a small community—say a classroom, a club, a team, or the workers at a small company. Impossible, you say? How can this be so?

To see how, take a community of just 30 people. Each of the 30 could be friends with any of the other 29, and counting each pair only once gives 30 × 29 / 2 = 435 possible friendships. If our community were completely dysfunctional, nobody would be friends with anyone else; we would have an “empty” network, devoid of relationships. So, all 435 possible friendships would be absent. If our community were completely harmonious, we would see the opposite extreme: a “complete” network in which every person would be friends with every other. There are many networks between these extremes. Maybe the first pair of people are friends with each other, but the second pair are not; then maybe the third and fourth pairs are friends, but not the fifth and sixth, and so on. To find the total number of networks of friendships, we note that each possible friendship could either be switched “on” or “off,” and so there are 2 possibilities for each friendship. Thus, the number of possible networks is 2 × 2 × · · · × 2, with 435 entries. Doubling a number 435 times results in roughly a 1 followed by 131 zeros; the sextillions previously mentioned have just 23 zeros.3 So: sextillions of sextillions of sextillions of . . . networks, many times the number of stars in the universe and, in fact, many orders of magnitude larger than the estimated number of atoms in the universe!4
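The arithmetic behind this count can be checked directly. What follows is a minimal Python sketch, assuming the 30-person community discussed here; the variable names are illustrative only.

```python
from math import comb

people = 30

# Each unordered pair of people is one possible friendship.
possible_friendships = comb(people, 2)      # 30 * 29 / 2

# Every possible friendship is either "on" or "off".
possible_networks = 2 ** possible_friendships

# Rough star count quoted in the text: three hundred sextillion.
stars = 3 * 10 ** 23

print(possible_friendships)                 # 435
print(len(str(possible_networks)))          # 131 digits
print(len(str(stars)))                      # 24 digits
```

Python integers have arbitrary precision, so the full 131-digit count is computed exactly rather than approximated.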

Even with just 30 people, there are far too many networks to label in any systematic way. In classifying animals, when someone says “zebra” or “panda” or “crocodile” or “mosquito” we know what they are talking about. Except for a few special classes, we really cannot do that with networks. This does not mean that we should throw up our hands and say that social structure is too complicated to understand.

There are also characteristics that allow us to classify and distinguish animals: Do they have a spine? How many legs do they have? Are they herbivores, carnivores, or omnivores? Do they have live births? How large are the adults? What type of skin do they have? Can they fly? Do they live underwater? . . . When classifying networks we can identify critical characteristics too. For example, we can distinguish networks by the fraction of relationships that are present, whether those relationships are evenly distributed among the people involved, and whether we see certain segregation patterns. Moreover, these patterns will enable us to understand such issues as economic inequality, social immobility, political polarization, and even financial contagions.

Describing networks for our purpose of understanding human behavior is manageable for several reasons. First, a few primary features of networks yield enormous insight into why humans behave the way they do. Second, these features are simple, intuitive, and quantifiable. Third, human activity exhibits regularities that lead to networks with special features: it is easy to distinguish a network formed by humans from one in which the links are just formed randomly without any dependence on the other links around them or which nodes they connect.

As an example, consider the two networks in Figure 1.1. The network in panel (a) is a network of close friendships between high school students (details about this network appear in Chapter 5). The network in panel (b) has the same number of nodes and connections, but with the connections placed completely randomly by a computer.
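One simple way a computer might place connections "completely randomly" is to draw the required number of links uniformly from all possible pairs. The sketch below assumes that procedure; the sizes (20 nodes, 25 links), the seed, and the function names are hypothetical and are not taken from the high school data.

```python
import random
from itertools import combinations

def random_network(nodes, num_links, seed=None):
    """Pick num_links distinct pairs uniformly at random."""
    rng = random.Random(seed)
    all_pairs = list(combinations(nodes, 2))
    return rng.sample(all_pairs, num_links)

def isolated(nodes, links):
    """Nodes that appear in no link at all."""
    linked = {person for link in links for person in link}
    return [n for n in nodes if n not in linked]

nodes = list(range(20))                      # hypothetical size
links = random_network(nodes, 25, seed=1)    # hypothetical link count
print(len(links), "links;", len(isolated(nodes, links)), "isolated nodes")
```

Because every pair is equally likely, a network built this way has no built-in tendency to split into groups, which is the contrast the figure draws with the human network.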

So what is so different about the two networks? You can see a couple of things just by looking carefully. One is a sad fact of high school: there are more than a dozen students who have no close friends, while the random network has all nodes connected. The second, more striking and general feature of the human network is that it is highly segregated. The students in the top part of the network are very rarely friends with the students in the bottom part of the network. The random network has links going in all directions.

The split in the network gets much easier to see, and more telling, when I add the races of the students in the high school, as in Figure 1.2.

Such divisions are one key feature of human networks, among several, that figure prominently in what follows. Why we form networks that have such features has some obvious explanations as well as some subtle ones, as we shall see. Ultimately, we care about our networks and their features because of their impact, and so by the end of this book you should know, for instance, why having divisions such as that in the high school network above profoundly impacts decisions to go to college, but yet has almost no impact on contagion of a flu.

Figure 1.1: A human network and a random network.

Figure 1.2: The High School Network Coded by Race. The nodes with bold stripes are self-identified as being “Black,” the nodes with gray fill are “white,” and the few remaining nodes are either “Hispanic” (center dot fill) or “Other/Unknown” (blank).5

Part of what makes the science of networks such fun, beyond the fact that it is so immediately important in all of our lives, is that it cuts across fields: making sense of human networks draws on core concepts and studies from sociology, economics, math, physics, computer science, and anthropology.6 For instance, our discussion will make heavy use of the concept of externalities from economics—the fact that people’s behavior impacts those around them—coupled with various forms of feedback that amplify that impact. These are features of many complex systems: settings that are simple to describe and understand and yet rich in their features and behaviors.

Our discussion will also take us well beyond networks of personal friendships and acquaintances, to include relationships such as treaties between countries as well as contracts between banks. The full set of social and economic networks that we consider are all “human networks,” as they all involve human interaction at some level.7

Our starting point is how your position in a network determines your power and influence, as this matters in almost all of what follows. We will make sense of the many different ways in which you can be influential and see how each depends on your network.

2 · POWER AND INFLUENCE: CENTRAL POSITIONS IN NETWORKS

“Sometimes, idealistic people are put off by the whole business of networking as something tainted by flattery and the pursuit of selfish advantage. But virtue in obscurity is rewarded only in Heaven. To succeed in this world you have to be known to people.”

— SONIA SOTOMAYOR, MY BELOVED WORLD

Mahatma Gandhi mobilized tens of thousands of people to participate in the Salt March in 1930 to protest British rule. It was a walk of more than two hundred miles from Gandhi’s base to the town of Dandi near the sea, where salt was produced from seawater. The narrow purpose of the march was to protest a salt tax. In such a hot climate, salt is essential and is consumed in large quantities, and high salt taxes were particularly symbolic of the hardships imposed on India by the British colonialists. More generally, the Salt March put in motion the acts of civil disobedience that would eventually end British rule.

If you see a parallel to earlier protests of British taxes on its colonies, you are not alone. The Boston Tea Party that protested British taxes more than a century before was not lost on Gandhi. In fact, he stated, “Even as America won its independence through suffering, valour and sacrifice, so shall India, in God’s good time achieve her freedom by suffering, sacrifice and non-violence.” It is said that after the Salt March, at Gandhi’s meeting with Lord Irwin (the Viceroy of India), when asked if he wanted sugar or cream for his tea, Gandhi replied that no, he preferred salt “to remind us of the famous Boston tea party.”1

The Salt March offered just a glimpse of what Gandhi would later accomplish, and his act of illegally producing salt in April of 1930 encouraged millions to follow in civil disobedience. Martin Luther King Jr. mentions being moved when he first read of Gandhi’s march to the sea, and it is easy to see how it inspired King’s approach to the civil rights movement and organized marches.

These are examples in which an individual had the ability to, directly and indirectly, encourage millions of people to act. That reach was essential in Gandhi’s and King’s eventual success in changing the world. Judging power and influence by how many people a person can mobilize or impact is a natural starting point as it captures a person’s reach.

Networks help us to identify and measure this sort of reach. A first measure of reach is simply counting how many people one knows or can count as a friend or colleague. In today’s world we might also ask how many followers one has on social media. As we shall see, how many friends and followers a person has matters in subtle ways in driving a population’s perceptions and social norms.

However, having many direct friends or connections is just one way in which a person can be influential, and much of this chapter will be devoted to understanding other network sources of power. Neither Gandhi nor King directly knew more than a small fraction of, nor could they personally contact, everyone they mobilized. They had key allies and friends, and also reached many through the publicity that their acts created. The Salt March began with a contingent of dedicated followers and swelled as it progressed and its publicity grew.

A person can have few friends or contacts and still be very influential if those few friends and contacts are themselves highly influential. This sort of indirect reach is often where power resides, and we can see this sort of influence very clearly via network concepts. Gaining influence via influential friends becomes an iterative and somewhat circular notion, but one that turns out to be quite understandable in a network context, with many implications. Iterative, network-based measures of power and influence will help us understand how to best seed a diffusion, as well as what it was that made Google an innovative search engine.
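The circular idea that influence comes from influential friends can be made concrete by repeatedly rescoring everyone from their friends' current scores. The sketch below is one standard way to do this (a power iteration, in the spirit of eigenvector-style centrality); whether it matches the exact measure developed later in this chapter is an assumption, and the names and the toy network are hypothetical.

```python
def influence_scores(links, rounds=100):
    """Iteratively score each person by the sum of their friends' scores."""
    friends = {}
    for a, b in links:
        friends.setdefault(a, set()).add(b)
        friends.setdefault(b, set()).add(a)
    score = {p: 1.0 for p in friends}            # everyone starts equal
    for _ in range(rounds):
        new = {p: sum(score[f] for f in friends[p]) for p in friends}
        total = sum(new.values())
        score = {p: s / total for p, s in new.items()}   # keep scores summing to 1
    return score

# Hypothetical network: C1..C4 are a tightly knit core.
links = [
    ("C1", "C2"), ("C1", "C3"), ("C1", "C4"),
    ("C2", "C3"), ("C2", "C4"), ("C3", "C4"),
    ("Insider", "C1"), ("Insider", "C2"),     # two well-connected friends
    ("Outsider", "P1"), ("Outsider", "P2"),   # two peripheral friends
    ("P1", "C3"),
]
scores = influence_scores(links)
print(scores["Insider"] > scores["Outsider"])   # True
```

Insider and Outsider each have exactly two friends, so a simple count of friends cannot tell them apart; the iterative score does, because Insider's friends are themselves well connected.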

When it comes to measuring power, this will not be the end of our story. Another way in which people can be important, and one that is particularly evident when considering networks, is being a key connector or coordinator. A person can be a bridge or intermediary between people who don’t know each other directly—enabling that person to broker favors and consolidate power by being uniquely positioned to coordinate the actions of others. This sort of power is seen in stories like The Godfather, and is evident in networks that explain the rise of the Medici in medieval Florence.

Understanding how networks embody power and influence will be useful when we later discuss things like financial contagions, inequality, and polarization. We will start with a look at direct influence.

Popularity: Degree Centrality

Although he did not mobilize people to march like Gandhi, Michael Jordan did mobilize people to buy shoes. His ability to influence huge numbers of people was unparalleled. It is not by accident that, just during his sports career, Michael Jordan was paid more than half a billion dollars by companies wanting to advertise their products.2 He earned just over $90 million in salary from actually playing basketball. By that metric, his value in marketing was (and remains) much larger than his direct value as an athlete and entertainer. Michael Jordan’s incredible visibility enabled him to directly influence the decisions of millions of people around the globe.3

In network parlance, the number of connections or links (relationships) that a person has in a network is called that person’s “degree.” The associated measure of how central a person is within the network is known as “degree centrality.” If someone has 200 friends and someone else has 100 friends, then the first person is twice as central according to degree centrality. Such a count is instinctual and an obvious first method of measuring influence.4
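Counting degree from a list of links takes only a few lines. This is a minimal sketch on a made-up network; the names and the friendships are hypothetical.

```python
from collections import Counter

def degree(links):
    """Number of links touching each person."""
    deg = Counter()
    for a, b in links:
        deg[a] += 1
        deg[b] += 1
    return deg

# Hypothetical friendships, each listed once.
links = [("Ana", "Ben"), ("Ana", "Cal"), ("Ana", "Dee"),
         ("Ana", "Eve"), ("Ben", "Cal")]

deg = degree(links)
print(deg["Ana"], deg["Ben"])        # 4 2
print(deg["Ana"] / deg["Ben"])       # 2.0
```

Here Ana is twice as central as Ben by degree centrality, in the same sense as the 200-versus-100 comparison above.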

And it is not just at the scale of a Gandhi, King, or Jordan that the number of people whom someone can reach matters. You are constantly being influenced by your friends and acquaintances. The people with the highest degree in any community, however small, have a disproportionate presence and influence.

What I mean by disproportionate presence refers to an important phenomenon known as the “friendship paradox,” which was pointed out by the sociologist Scott Feld in 1991.5

Have you ever had the impression that other people have many more friends than you do? If you have, you are not alone. Our friends have more friends on average than a typical person in the population. This is the friendship paradox.

In Figure 2.1, we see the friendship paradox in a network of friendships in a high school from a classic study by James Coleman.6 There are fourteen girls pictured. For nine of them, their friends have on average more friends than they do. Two have the same number of friends as their friends do on average, while only three of the girls are more popular than their friends on average.7

Figure 2.1: The friendship paradox. Data from James Coleman’s 1961 study of high school friendships. Each node (circle) is a girl and a link indicates a mutual friendship between two girls. The paradox is that most of the girls are less popular than their friends. The first number listed for each girl is how many friends the girl has and the second number is the average number of friends that the girl’s friends have. For instance, the girl in the lower left-hand corner has 2 friends, and those friends have 2 and 5 friends, for an average of 3.5. So the 2 / 3.5 represents that she is less popular than her friends on average. This is true for 9 out of 14 of the girls, while only 3 are more popular than their friends, and 2 are equal in popularity to their friends.

The friendship paradox is easy to understand. The most popular people appear on many other people’s friendship lists, while the people with very few friends appear on relatively few people’s lists. The people with many friends are overrepresented on people’s lists of friends relative to their share in the population, while the people with very few friends are underrepresented. Someone with ten friends is counted as a friend by twice as many people as another person who has just five friends.
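The overrepresentation can be verified on any small network by comparing each person's own number of friends with the average among their friends. The sketch below uses a hypothetical five-person network (not Coleman's data).

```python
from statistics import mean

# Hypothetical friendships, each listed once.
friendships = [("Ana", "Ben"), ("Ana", "Cal"), ("Ana", "Dee"),
               ("Ben", "Cal"), ("Dee", "Eve")]

friends = {}
for a, b in friendships:
    friends.setdefault(a, []).append(b)
    friends.setdefault(b, []).append(a)

own = {p: len(fl) for p, fl in friends.items()}
friends_avg = {p: mean(own[f] for f in fl) for p, fl in friends.items()}

for p in sorted(friends):
    print(p, own[p], "/", friends_avg[p])

less_popular = sorted(p for p in friends if own[p] < friends_avg[p])
print(less_popular)                  # ['Ben', 'Cal', 'Eve']

# Average over people versus average over entries on friendship lists.
avg_degree = mean(own.values())                                       # 2
avg_on_lists = mean(own[f] for fl in friends.values() for f in fl)    # 2.2
```

Ana has three friends and therefore appears on three lists, while Eve appears on only one. That is why the average taken over friendship lists (2.2) exceeds the average taken over people (2), and why three of the five people are less popular than their friends.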

In a mathematical sense, the paradox is not very deep—but paradoxes rarely are. Nonetheless it has implications for almost all of our interactions. Anyone who has been a parent, or a child for that matter, is familiar with statements like “everybody else at school has a . . .” or “everybody else at school is allowed to . . .” Although these sorts of statements are usually false, they often reflect what we perceive. The most popular students can be greatly overrepresented among the children’s friends, and so if the most popular students are all following some fad, then children end up thinking that everyone else is. Popular people disproportionately set perceptions and determine norms of behavior.

To see the implications of the friendship paradox most starkly, let us consider a simple example, and then look at some data that corroborate the example.

Consider a class of students who are influenced by their friends.8 These students, deep down, are conformists. They are faced with a simple choice: do they wear solid or plaid clothes? They each have a preference for solid or plaid and on the first day of school they follow that preference, as pictured in Figure 2.2.

Figure 2.2: The first day of school: The four most popular students have a preference for solids; the eight others prefer plaids.

Being true conformists, the students would like to do what the majority of others are doing, and only follow their own preference if there were equal numbers of others in each style. As pictured in Figure 2.2, four students prefer solids and eight prefer plaids. Thus, two thirds of the students prefer plaids, and if they could all see the whole group’s preferences, then they would all wear plaids the next day. However, note that it is the four most popular students, perhaps the boldest students, who prefer solids.

Figure 2.3: Students look around and try to match the majority of their friends. The most popular are all friends with each other (a clique) and all stay with solid. Popular students are overrepresented in students’ perceptions, and begin a cascade of people switching to solid.

The students don’t see everyone—they interact mainly with their friends as indicated by the links.

Figures 2.3 (a) to (d) show what happens each following day. The popular students all see each other and some others, and they all see a majority wearing solids and so they continue to wear solids. Some other students see mostly popular students, and so they switch to wearing solids. As we see in Figure 2.3 panel (a) the popular students all stay with solids and four more students switch to solids, and by the second day we have eight of the students wearing solids. Things quickly unravel from there, as we see in panels (b) to (d). Each day more of the students who are still wearing plaids see a majority of their friends wearing solids and they switch to solids. By the fifth day every student in the class ends up wearing solids, despite the fact that a majority of them started with a preference for plaids.

We can see the friendship paradox’s role in this cascade of fashion by examining Figure 2.4, which shows how the students incorrectly perceive the population preferences, based on what they see among their friends on the first day. The most popular students are overrepresented in people’s friendships and so three quarters of the students perceive that solids are in the majority even though two thirds of the students prefer plaid.
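The conformist rule described above (match the majority of your friends, and fall back on your own preference only in a tie) is simple enough to simulate. The sketch below runs it on a hypothetical 12-student network; the links are an assumption made for illustration and are not the network drawn in the figures.

```python
# Hypothetical friendships: A-D form a clique and prefer solids.
LINKS = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
    ("A", "e"), ("B", "e"), ("B", "f"), ("C", "f"),
    ("C", "g"), ("D", "g"), ("D", "h"), ("A", "h"),
    ("e", "i"), ("f", "i"), ("i", "j"), ("g", "j"),
    ("D", "j"), ("j", "k"), ("h", "k"), ("j", "l"),
]
STUDENTS = "ABCDefghijkl"
PREFERS = {s: "solid" if s in "ABCD" else "plaid" for s in STUDENTS}

FRIENDS = {s: [] for s in STUDENTS}
for a, b in LINKS:
    FRIENDS[a].append(b)
    FRIENDS[b].append(a)

def solid_friends(wearing, s):
    """How many of s's friends are wearing solids."""
    return sum(wearing[f] == "solid" for f in FRIENDS[s])

def next_day(wearing):
    """Everyone copies the majority of their friends; ties go to own taste."""
    new = {}
    for s in STUDENTS:
        solids = solid_friends(wearing, s)
        plaids = len(FRIENDS[s]) - solids
        if solids > plaids:
            new[s] = "solid"
        elif plaids > solids:
            new[s] = "plaid"
        else:
            new[s] = PREFERS[s]
    return new

# Day 1: everyone wears what they prefer.
wearing = dict(PREFERS)
solid_counts = []
for day in range(1, 6):
    solid_counts.append(sum(c == "solid" for c in wearing.values()))
    wearing = next_day(wearing)
print(solid_counts)                  # [4, 8, 9, 10, 12]

# Who sees a strict majority of solids among their friends on day 1?
see_solid_majority = [s for s in STUDENTS
                      if 2 * solid_friends(PREFERS, s) > len(FRIENDS[s])]
print(len(see_solid_majority))       # 7
```

In this made-up network only a third of the students prefer solids, yet seven of the twelve see solids in the majority among their friends on the first day, and everyone is wearing solids by the fifth. The exact fractions differ from those in the figures because the network is different, but the mechanism is the same.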

There are two aspects that you might notice about the structure of this example. One is that the most popular students all have the same preferences: all like solids. This helps speed coordination on their preferred fashion. This matters, and there are reasons why the most popular students will be similar to each other, as we shall soon explore. The second is that the popular students form a clique: they are all friends with each other. This reinforces their behaviors and maintains their norm of solids, which then eventually takes over the rest of the population. It makes the example work cleanly, but the idea that the most popular people are disproportionately influential still holds without this. Indeed, fashion designers have long understood the importance of having celebrities wear their new and different designs on the Oscars red carpet.

The impact of popularity and the friendship paradox is perhaps at its purest in settings of peer influence, such as students’ perceptions of others at school. A long series of studies has found that students tend to overestimate the fraction of their peers who smoke, consume alcohol, and use drugs, as well as the frequency with which they do so, and often by substantial margins. For instance, a large study covering one hundred U.S. college campuses found that students systematically overestimate consumption of eleven different substances, including cigarettes, alcohol, and marijuana.9 In particular, a further study focusing on alcohol consumption compared students’ self-reported drinking behavior (how many drinks they had the last time they partied or socialized) with their perceptions of how many the typical student at their school had the last time she or he partied or socialized. Out of the more than 72,000 students at the 130 colleges in the study, the median student answered 4 drinks, a fact that seems alarming, especially given that a quarter of the students answered 5 drinks or more. But what is surprising, given how high these numbers are, is that more than 70 percent of the students still managed to overestimate the alcohol consumption of the typical student at their own school by a drink or more.10

Figure 2.4: The friendship paradox at work. The fractions next to the students are their perceptions of the preferences for solids over plaids, based on what they see among their friends. Most of them mistakenly perceive a majority preference for solids, with only the few students in the lower right initially perceiving a majority for plaids. Even those students will quickly see a majority wearing solids.

To explain these misperceptions, we don’t have to dig deeply into the psychology of the students. The friendship paradox provides an easy insight. When students are attending parties or social events, they are interacting disproportionately with the people who attend the most parties, so students’ perceptions of alcohol consumption end up overrepresenting people who attend many parties. This is a version of the friendship paradox: the people whom students see at the parties are more likely to attend more parties than the average student. Students’ perceptions are not only influenced by their experiences at parties and other social events, but also by what they know about their closer friends. Here again, the friendship paradox is at work. If more popular students are more likely to smoke or consume alcohol, that will bias students’ estimates. Indeed, one study estimated that each additional friendship that a middle school student had accounted for a 5 percent increase in the probability that the student smoked.11 Similar estimates have been found for alcohol: being named as a friend by five additional others accounted for a 30 percent increase in the likelihood that a middle school student had tried alcohol.12

There are several effects that push students who socialize the most to be higher consumers of alcohol and cigarettes. One is that such consumption by teenagers is a social activity. People who spend more time socializing with others thus have more reason to consume alcohol. There is also a reverse effect: students with a higher propensity to consume alcohol would tend to seek out opportunities to consume it and others with whom to share it.13 On top of these effects is that students who have less parental supervision have more time to hang out with other students, and more opportunities to try alcohol, cigarettes, and drugs. Finally, social activities by their very nature experience feedback. Viewing one’s peers drinking encourages drinking. That increased level of drinking further increases the drinking of peers, and so this continues to cycle in a feedback loop.14

So, given that students’ estimates of peer behavior are based at least in part, if not largely, on their personal observations, the friendship paradox and the fact that the most socially active students often engage in more extreme behaviors lead us to expect students to systematically overestimate peers’ behaviors. More generally, given that many behaviors are influenced by perceived norms, we end up with behaviors driven disproportionately by those who socialize the most, and the resulting norms are more extreme than if our perceptions were not network-based.
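The bias described above can be made concrete with a small sketch. The five-student network and the drinks-per-party numbers below are hypothetical, invented only for illustration; the one assumption that matters is that the most popular student also drinks the most. Each student's "perceived norm" is taken to be the simple average over his or her friends.

```python
# Hypothetical friendship network: "A" is the most popular student.
friends = {
    "A": ["B", "C", "D", "E"],
    "B": ["A", "C"],
    "C": ["A", "B"],
    "D": ["A"],
    "E": ["A"],
}
# Hypothetical drinks per party; the most popular student drinks the most.
drinks = {"A": 6, "B": 3, "C": 3, "D": 1, "E": 1}


def mean(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def perceived_norm(student):
    """Average consumption among the student's own friends."""
    return mean(drinks[f] for f in friends[student])


# What the typical student actually drinks.
true_average = mean(drinks.values())

# What each student would guess from looking only at friends.
perceptions = {s: perceived_norm(s) for s in friends}

# Students whose friend-based guess exceeds the true average.
overestimators = [s for s in friends if perceptions[s] > true_average]

print(true_average)     # 2.8
print(perceptions)      # A sees 2.0; B and C see 4.5; D and E see 6.0
print(overestimators)   # ['B', 'C', 'D', 'E']
```

In this toy example four of the five students overestimate the norm, because the heavy drinker appears in every one of their samples while the light drinkers appear in only one.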

The friendship paradox is enhanced by social media, where the magnitude of the effect can be staggering. For example, a study of Twitter behavior15 found that more than 98 percent of users had fewer followers than the people whom they followed: typically a user’s “friends” had more than ten times as many followers as the user. Those more popular users are more active, and despite their small numbers they play an important role in viral content. Given the increased use of social media, especially by adolescents, the potential for biased perceptions in favor of a tiny proportion of the most popular users becomes overwhelming, especially when one factors in that the most popular social media users may have very different behaviors, as we see from the relationship between students’ popularity and earlier and heavier use of alcohol and cigarettes. Partying is also by its very nature a social activity, which can further amplify the effects of social media as pictures and stories of alcohol and drug consumption are shared. In contrast, behaviors such as studying tend to be more solitary events and information about them is less likely to be shared. It is thus natural for a teen to overestimate the amount of drugs and alcohol consumed by his or her peers and to underestimate the time spent studying by those same peers.

The bias that accompanies the friendship paradox, whether or not we realize it, applies well beyond “friendships.” The friendship bias is an example of “selection bias”: our observations often come from biased samples, depending on how those samples were picked. We disproportionately fly on the most heavily booked flights, eat at the most popular restaurants, drive on the busiest roads and at the busiest times, go to parks and attractions at the most crowded times, and attend the most crowded concerts and movies. These experiences bias our perceptions as well as our perceived social norms, usually without our understanding those effects. As Shane Frederick (2012) states in a study about our tendency to overestimate other people’s willingness to pay for things: “Customers in the queue at Starbucks are more visible than those hidden away in their offices unwilling to spend $4 on coffee.”16

Comparisons, Comparisons

“If you torture the data enough, nature will always confess.”

— RONALD COASE, HOW SHOULD ECONOMISTS CHOOSE?17

“I want to be perceived as a guy who played his best in all facets, not just scoring.”

— MICHAEL JORDAN, 2003 NBA ALL-STAR GAME

Who was the best basketball player of all time, Wilt Chamberlain or Michael Jordan? Maybe you would like to make a case for LeBron James. Having grown up in Chicagoland, I have my own answer to such questions, but the comparison is really between great athletes who had very different styles and roles in the game.

There are many different statistics that can be used to summarize their careers. For instance, Jordan and Chamberlain are amazingly similar on several dimensions: they each averaged 30.1 points per regular season game during their careers, both had just over 30,000 points in regular season games (32,292 for Jordan and 31,419 for Chamberlain), and each amassed several Most Valuable Player awards (5 for Jordan and 4 for Chamberlain). However, there are other dimensions on which they differed: Michael Jordan led his team to more NBA championship wins (6 to Wilt’s 2), but Wilt Chamberlain amassed dizzying numbers of rebounds per game (22.9 to Michael’s 6.2).

There are other dimensions on which other players stand out. Steph Curry’s record three-point totals are far beyond anything seen before. Kareem Abdul-Jabbar’s longevity at a high level is unparalleled. Kareem played for 20 years, amassing more than 40,000 points in total, and played in 19 All-Star Games, after having dominated basketball at the college level as nobody had before. LeBron James’s all-around dominance has been evident since he appeared on Sports Illustrated’s cover as a junior in high school. But if we really want to measure all-around contributions then we should consider the triple-double—having at least 10 points, 10 rebounds, and 10 assists—all three statistics in double figures. Then one has to remember Oscar Robertson, who averaged a triple-double for a whole season (a feat only recently matched by Russell Westbrook), and had so many triple-double games that nobody else even comes close, not even Magic Johnson.

The point here is not really to have a “da Bears, da Bulls”18 argument about basketball prowess, but to emphasize several things: statistics capture useful information in a succinct manner, different statistics encapsulate different things, and even a long list of statistics can fail to capture all of the nuances of the things that they describe.

Our lives would be simpler if measuring something could always be boiled down to a single statistic. But part of what makes our lives so interesting is that such unidimensional rankings are generally impossible for many of the things that are most important to understand: lists of rankings end up being both controversial and intriguing. How does one compare the musical innovations of Haydn, Strauss, and Stravinsky; or the contributions to human rights of Eleanor Roosevelt, Harriet Beecher Stowe, and Harriet Tubman? Is Lionel Messi or Diego Maradona the more impressive soccer player? Can one possibly compare the art of Pablo Picasso to that of Leonardo da Vinci? Or, is it easier to compare the paintings of Pablo Picasso to those of Henri Matisse, not only because Picasso and Matisse were contemporaries, but because they were rivals? Many might argue that such comparisons are hopeless and meaningless. However, they force us to think carefully about the various dimensions on which these people made contributions and why those contributions were game-changing.19 When one looks at different basketball statistics one sees different players stand out, with each amazing in his or her own way. Similarly, when looking at different statistics that characterize people’s network positions, different people stand out as being most “central.” Some people end up being very central according to some but not other measures, and which network statistic(s) are most appropriate depends on the context, just as whether you would rather add a top scorer or a top defender to your basketball team would depend on the circumstances.

We have already seen that one measure of centrality—degree centrality—helps us understand why the highest-degree people in a network end up having disproportionate influence. This is a first “network effect.” As the most basic and obvious measure of network centrality, degree centrality is akin to average points per game in the basketball example. However, to complete the analogy, different people can have different strengths in terms of their positions in networks—so that who is most “central” will vary with the way in which we ask the question, just as Wilt was a dominant rebounder while Michael drove his team to championships and Steph Curry stretched defenses in new ways. Comparing nodes (e.g., people) in a network based on their degree centrality can completely miss some of the most essential aspects of power and influence. So let’s see some other concepts.

It’s Who You Know—Locating the Needles in the Haystack

“Networking is rubbish; have friends instead.”

— STEVE WINWOOD

Google might not even exist except for the serendipitous assignment of Sergey Brin to show Larry Page around the Stanford University campus in 1995, when Larry was considering Stanford for his doctoral studies. Sergey’s family had emigrated to the U.S. from Russia in the late 1970s. Long fascinated by mathematics and computer programming, Sergey had come to Stanford for its computer science program. Larry Page shared a similar fascination with computers, and recalls a childhood of “poring over books and magazines, or taking things apart at home to figure out how they worked.” Although their strong personalities clashed at times, their common interests and intellects led them to a fast friendship. Most important for us, they shared a growing curiosity about the structure of the World Wide Web.

By 1996 Sergey and Larry were working together on the design of a search engine for the Web. They began using Larry’s dorm room to house a set of computers cobbled together out of the parts that they could find, and Sergey’s room for an office where they developed their ideas and programs. In a paper that they wrote together as students, Sergey and Larry describe how rapidly the Web was expanding in the late 1990s and how search engines were not really up to the task. One of the first search engines, the World Wide Web Worm of 1994, indexed just over 100,000 pages. By 1997, another search engine, AltaVista, was claiming to have tens of millions of queries per day, and the Web already had hundreds of millions of pages to search and index. The sheer volume of pages to index was making it impossible to find what the user wanted. To quote Brin and Page, “as of November 1997, only one of the top four commercial search engines finds itself (returns its own search page in the top ten results in response to a query of its name).”

So how does one locate the right needles in such a giant haystack? There are some obvious ideas as to how to identify the Web pages that a user might want to see when they type in some keyword. But huge numbers of pages contain the same keywords. Having the keyword appear frequently on some page does not come close to guaranteeing that it is what most users are looking for. Perhaps tracking past traffic and looking deeper into the content of various pages might help. Many variations on this theme were being tried but nothing seemed to work adequately. It was easy to begin to think that the Web was just becoming too large, and indexing and navigating it in any sensible way was destined to be an overwhelming task.

Brin and Page’s breakthrough was born from their interest in the network structure of the Web: it holds a lot of useful information, as the structure is not an accident. Web pages link to other Web pages that they see as being important. So how did Brin and Page understand and use that information? Brin and Page’s key insight was that a useful way to identify a page that a searcher might be most interested in was to look at which other Web pages have links pointing to that Web page. If other important Web pages point to a page, then that suggests that it is an important page. One does not judge a page simply by how many pages link to it, but by whether it is linked to by well-connected pages. In many settings it is more important to have “well-connected” friends than just to have many friends.

This sort of definition is circular: a page is “important” because it is linked to by other “important” pages, which are in turn “important” because they are linked to by other “important” pages. Despite the circularity, it turns out to have a beautiful solution, and one that is extremely helpful in network settings.

Suppose we want to spread a rumor or some information that we think will be relayed via word of mouth. To see why a straight measure of popularity falls short, consider the network in Figure 2.5. It is clear from just looking that the positions of Nanci and Warren are quite different from each other, even though they each have two friends. They differ in terms of how well-connected their friends are, and relatedly, how well-positioned they are in the network. Warren’s friends have only two friends each, while Nanci’s friends have seven and six friends. So, while Warren and Nanci score equally well in terms of their “degree” (number of friends), Nanci’s friends have higher degrees than Warren’s.

We could stop here: instead of just counting friends, we could also count how many additional friends each of those friends brings, tracking friends of friends, whom we can call “second-degree friends.” Looking beyond direct friends to count friends of friends would be a good start, and Nanci already becomes clearly better situated to spread information than Warren. But why stop iterating here? Why not consider “third-degree friends”? Now Nanci’s friendship with Ella is not so fruitful in terms of third-degree friends, but her friendship with Miles leads to even more connections. By the time we have gone out three steps from Nanci we have reached everyone except Warren. Going out three steps from Warren we only reach five other people, while from Nanci we have reached sixteen. This makes Nanci a much better candidate for spreading information than Warren, even though they both have the same degree.

Figure 2.5: Two people, Nanci and Warren, both have degree 2. However, they differ in how connected their friends are and in their overall positions in the network.

How does one capture this in a large network, where we could go on forever? There are various ways of doing so, but let me describe the crux of the idea. Let us start by just adding up first-degree (direct) friends. So, as we see in Figure 2.5, Nanci and Warren each get a value of 2 since they each have two friends. Next, let’s add in second-degree friends. But should we count these as highly as first-degree friends? For example, if we think about spreading information starting with Nanci, it is more likely that information gets from Nanci to Miles than to a friend of Miles, as it first has to pass from Nanci to Miles and then also has to be spread further from Miles. It might be much less likely to make it two steps than just one step, for instance half as likely. So, for now, let’s weight a friend of a friend half as much as we value a friend. Nanci has eleven second-degree friends, so she gets a score of 11/2 for her friends-of-friends. Warren has only one second-degree friend and so he gets 1/2. Thus, Nanci has a score of 7.5 so far, counting first- and second-degree friends, while Warren’s score is now only 2.5. As we move out to third-degree friends, Nanci has three, while Warren has two. Again let’s weight those by another factor of a half, so each third-degree friend counts for 1/4. So Nanci adds 3/4 to her score and Warren adds 2/4 to his, bringing Nanci up to 8.25 and Warren up to 3. Iterating in this manner, we can quantify how much more reach Nanci has in the network than Warren.
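The arithmetic above can be reproduced with a few lines of code. This is only a sketch: the function names `distance_counts` and `decay_score` are made up for illustration, the halving weight is the one chosen in the text, and the friend counts for Nanci and Warren are the ones quoted above rather than read off the figure. The small path network at the end is hypothetical and is there only to show how the counts could be obtained from a network by breadth-first search.

```python
from collections import deque


def distance_counts(adj, start, max_steps):
    """Count how many people sit exactly 1, 2, ..., max_steps steps from start."""
    dist = {start: 0}
    queue = deque([start])
    counts = [0] * max_steps
    while queue:
        node = queue.popleft()
        if dist[node] == max_steps:
            continue  # do not look beyond the requested horizon
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                counts[dist[neighbor] - 1] += 1
                queue.append(neighbor)
    return counts


def decay_score(counts, decay=0.5):
    """Weight first-degree friends by 1, second-degree by decay, third by decay**2, ..."""
    return sum(n * decay ** step for step, n in enumerate(counts))


# Friend counts at one, two, and three steps, as quoted in the text.
nanci = decay_score([2, 11, 3])   # 2 + 11/2 + 3/4
warren = decay_score([2, 1, 2])   # 2 + 1/2 + 2/4
print(nanci, warren)              # 8.25 3.0

# Hypothetical four-person chain a - b - c - d, to show the counting step.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(distance_counts(chain, "a", 3))   # [1, 1, 1]
```

Choosing a different `decay` changes how quickly distant friends stop mattering, but it does not change the basic idea of summing reach over ever-longer paths.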

The relative comparison between Nanci and Warren also turns out to be the solution to another question. Let’s define each person’s centrality as being proportional to the sum of their friends’ centralities. This is similar to the calculation we just did. By doing this, Nanci gets some fraction of Ella’s and Miles’s scores, which come from adding up some fraction of their friends’ scores, and so forth. This iteration is similar, because Ella’s and Miles’s scores are coming from their friends, which are Nanci’s second-degree friends, and those scores come from their friends, which are Nanci’s third-degree friends, and so on.20

Luckily, this type of system of equations, in which each person’s centrality is proportional to the sum of her friends’ centralities, is a quite natural and manageable math problem. It developed through a series of contributions from a who’s who list of mathematicians from the eighteenth through twentieth centuries: Euler, Lagrange, Cauchy, Fourier, Laplace, Weierstrass, Schwarz, Poincaré, von Mises, and Hilbert. Hilbert named the solutions to such problems “eigenvectors” (pronounced “eye”), the common modern name. Not surprisingly, eigenvectors pop up in all sorts of applications from quantum mechanics (Schrödinger’s equation) to the definitions of the “eigenfaces” that comprise the basic building blocks used in facial recognition patterns. When solving for the eigenvector in our example, we find that Nanci’s score is about 3 times that of Warren, as we see in Figure 2.6.21
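To give a feel for how such a system can be solved, here is a minimal sketch that repeatedly replaces each person's score with the sum of his or her friends' scores and then rescales. The seven-person network is hypothetical (it is not the network of Figure 2.5), and the function name is invented for illustration. One implementation choice is an assumption of this sketch rather than anything in the text: each person's own score is added in at every round, which leaves the final proportions unchanged but keeps the iteration from oscillating on some networks.

```python
def eigenvector_centrality(adj, iterations=200):
    """Approximate eigenvector centrality by repeated averaging (power iteration)."""
    score = {node: 1.0 for node in adj}
    for _ in range(iterations):
        # New score = own score + sum of friends' scores.
        # Adding the own score shifts every eigenvalue by 1 without changing
        # the eigenvectors, which avoids oscillation on bipartite networks.
        new = {
            node: score[node] + sum(score[friend] for friend in adj[node])
            for node in adj
        }
        total = sum(new.values())
        score = {node: value / total for node, value in new.items()}
    return score


# Hypothetical network: "h" is a hub; "x" and "y" both have exactly two friends,
# but "x" is friends with the hub while "y" is not.
network = {
    "h": ["a", "b", "c", "x"],
    "a": ["h"],
    "b": ["h"],
    "c": ["h"],
    "x": ["h", "y"],
    "y": ["x", "z"],
    "z": ["y"],
}

centrality = eigenvector_centrality(network)
for person in sorted(centrality, key=centrality.get, reverse=True):
    print(person, round(centrality[person], 3))
```

In this toy network "x" ends up with a higher score than "y" even though both have degree 2, for the same reason Nanci outscores Warren: her friends are themselves better connected.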

The Brin and Page innovation was to rank Web pages by what they termed PageRank, which relates to our discussion above and to an eigenvector calculation. Brin and Page’s problem was not spreading a rumor through a network; instead it was based on a closely related iterative problem called the “random surfer problem.” A user starts at some page and then randomly follows a link from that page to another page, with each link getting equal probability. The user then repeats this, randomly surfing the Web in this fashion.22 Over time, calculating the relative fraction of times that the user lands on each page amounts to an eigenvector calculation. In this case, the weight placed on each link at each step is inversely proportional to the number of links embedded in the page that contains it.
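The random surfer can be sketched in a few lines. The three-page "Web" below is hypothetical, and the function name is invented for illustration. The sketch assumes every page has at least one outgoing link and tracks only the plain surfer described above; it is not meant to represent the algorithm Google actually deployed.

```python
def random_surfer(links, steps=200):
    """Long-run share of visits to each page for a surfer who follows links at random."""
    pages = list(links)
    prob = {page: 1.0 / len(pages) for page in pages}  # start anywhere with equal odds
    for _ in range(steps):
        nxt = {page: 0.0 for page in pages}
        for page, outgoing in links.items():
            # Each link on a page is followed with equal probability.
            share = prob[page] / len(outgoing)
            for target in outgoing:
                nxt[target] += share
        prob = nxt
    return prob


# Hypothetical three-page Web: each entry lists the pages that a page links to.
web = {
    "home": ["news", "blog"],
    "news": ["home"],
    "blog": ["home", "news"],
}

rank = random_surfer(web)
for page in sorted(rank, key=rank.get, reverse=True):
    print(page, round(rank[page], 3))
# home 0.444
# news 0.333
# blog 0.222
```

Here "home" comes out on top because both other pages link to it, and "news" beats "blog" because it receives a link from the highly visited "home" page plus one from "blog."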

Figure 2.6: Eigenvector centralities of each node (person). Nanci outscores Warren by almost a factor of 3, even though they both have the same number of connections. Miles ranks highest, even though Ella has the highest degree centrality.

There were two challenges that Brin and Page faced. The conceptual challenge of finding the most relevant pages was addressed by not just ranking pages by popularity, but by calculating how “well-connected” the pages were in this iterative, eigenvector sense. The more practical challenge was implementing this on the huge scale of the Web, which involves crawling the Web and indexing pages, storing data about the content and links of each page, and then making such iterative calculations about network position. It is one thing to calculate such things for Nanci and Warren in our small network above, but it is another to approximate this for billions of pages, especially when they are constantly evolving in their content and links.

Brin and Page developed an algorithm based on these sorts of calculations, and well-suited for huge networks, and called it BackRub, which they started running on Stanford servers. The name BackRub comes from looking at backlinks—the links that lead one to a page. BackRub quickly outgrew the student accounts that Brin and Page had on the Stanford servers, and by 1997 they had moved the search engine and renamed it Google—a variation on googol, which is the number corresponding to a 1 followed by 100 zeros, referring to the vast size of the Web that their algorithm managed to conquer. For anyone who struggled through the early days of searching the Internet, the ability of Google to find useful pages was incredible. There were many competing search engines, and typically one would try several search engines in an often futile attempt to find a Web page that one really needed. By 1998, PC Magazine reported that Google “has an uncanny knack for returning extremely relevant results” and ranked it in its top hundred Web sites.23 The rest is history.24

The Diffusion of Microfinance

Although Google’s history suggests that an eigenvector-centrality-based algorithm outperformed the alternatives, search engine algorithms are complicated, and other differences in Google’s algorithm might also have explained its success. It would be nice to see more definitive evidence that one’s friends’ positions matter. Also, BackRub was identifying pages by how easy they are to reach, while in many situations we are interested in how influential someone is in terms of reaching others.

This was on my mind when I was visiting MIT in 2006 and talking with Abhijit Banerjee, a friend and professor there, and discussing how it would be wonderful to really test such differences in action. As luck would have it, Abhijit was precisely the right person for me to be talking to (as he often is). It turns out that Esther Duflo, another MIT professor, was in touch via her sister Annie with a bank in southern India, called BSS (Baratha Swamukti Samsthe), that was planning to roll out a new microfinance program via word of mouth. (You can see the network at play even in how this research project got off the ground.) The word-of-mouth program ended up offering us a perfect opportunity to see how network structure mattered in the spreading of information, and ended up allowing us to test which centrality measure would best predict a person’s ability to spread information. Abhijit, Esther, and I, together with Arun Chandrasekhar, who was then a graduate student at MIT (and coincidentally whose family was from Karnataka, the region in question), began what turned out to be a long-term study.

The pioneer of the microfinance revolution was Muhammad Yunus. He founded the Grameen Bank in Bangladesh in the 1970s and began making widespread loans of very small amounts in the 1980s. Yunus and the Grameen Bank were recognized for their innovation with a Nobel Peace Prize in 2006. The innovation was simple but clever. Many loans throughout the world involve a house or a car as collateral, or are advances on paychecks to people with an employment record, or are a loan via a credit card for people with proven credit history, backed by aggressive collection agencies that go after defaulters. Microfinance loans are aimed at extremely poor people with variable employment, and little to no collateral, and in settings where trying to collect would be prohibitively expensive. So, what was the innovation?

The innovation was that the loans were based on joint liability: holding several people responsible if someone failed to repay their loan. If someone defaults on their loan, their friends also feel the consequences. There are now many variations of such microfinance loans, but the system followed by the bank in our story, BSS, is typical and illustrates the idea. BSS’s loans were offered exclusively to women between the ages of eighteen and fifty-seven, with a limit of one loan per household. Women were formed into groups of five who were held jointly liable for the loans: if one of the women defaulted on her own loan, then the entire group was called into default on their loans. Default then denies a borrower access to future loans, or at least makes it more difficult for the defaulter to borrow again. In some cases, this operates at an even wider level in which joint liability extends across groups, so that too many defaults would cut a village off from a lender entirely. Holding people jointly liable for repayments leads to reputational and social pressures on people to not let their fellow villagers down by defaulting, and also means that group members have incentives to help each other out and step in if someone is unable to pay.

Also, repayment of one loan typically enables the borrower to take out subsequent loans, which then increase in size. The promise of larger future loans based on current repayments, which essentially allows these people to build a credit history one step at a time, was another big incentive to repay. In addition, participants often receive some basic financial training that encourages some savings and teaches them to track income, plan, and keep a simple book tracking payments. While this training may seem rudimentary, it can be empowering to the villagers.25 On a visit to one of the villages, a woman being interviewed about her finances gave me an illuminating lecture on how she had been increasing the size of her loans, was maintaining an accounting system tracking money in and out of the household, had built better-diversified groups involving both Muslims and Hindus, and had put together several loans to buy a used truck and start a business.

Although there were some late payments, defaults on loans issued by BSS in these villages in the years of our study were almost nonexistent.26

Another important aspect of microfinance is that the restriction of the loans to women impacts the dynamics of a household. Even though some, if not much, of the money ends up in the control of males within the households in such villages, the fact that the loans can enter only via a woman in a household can give the women some say in how the money ends up being invested or spent.27

BSS’s spread of microfinance illustrates the importance of network centrality, and the difference between degree and eigenvector centrality.

The bank BSS in our study was faced with the question of how to disseminate the news about the availability of microfinance to potential borrowers in the seventy-five villages in Karnataka that it was planning to enter. The volatile and caste-based politics of the villages, coupled with corruption, meant that the bank did not wish to rely on local village governments to spread information. Although some villagers can be reached via cell phones, they are so bombarded with spam texts that advertising via phones was also not viable. Posting flyers and even driving around with a loudspeaker are other techniques for advertising; but again these are overused and primarily associated with political campaigning. So, for better or worse, the method that the bank settled on was to find a few “central” individuals and ask them to spread the word about the bank and availability of microfinance.

Without knowing the networks of friendships, how could the bank identify the most central villagers? Would it even matter? The bank guessed that the best-positioned villagers to spread information would be teachers, shopkeepers, and self-help group leaders.28 Let us call these people “the initial seeds.” Essentially the bank expected these initial seeds to be central—and the bank was thinking of degree centrality and had no concept of eigenvector centrality.

What was useful for our study was that in some villages the initial seeds did have high degree, while in other villages they happened to have low degree. For instance, in some villages a teacher had many contacts, but in another village the teacher did not. More important, there were also villages in which the initial seeds had high eigenvector centrality, but low degree centrality, and other villages with the reverse. Also, in some villages this technique of seeding information worked well, while in other very similar villages it failed miserably: the participation rate of eligible households in some villages was nearly half and in others it was less than one in ten. Thus, we could see which centrality measure best predicted the spread of information from the initial seeds. So, which measure of network centrality of the initial seeds explains the more than six-fold difference in eventual diffusion across villages?

In 2007—before BSS entered the villages—we surveyed the adult villagers and mapped out their networks. These small villages are especially well-suited for network analysis because most interactions are within the village and in person.29

Given our discussion of the importance of popular people in determining the perceptions of others and in setting trends, at first blush it makes sense that high-degree people should be good seeds for diffusing information about microfinance. This turned out not to be the case at all—there was no relationship between the initial seeds’ degrees and the spread of microfinance in the villages.30

Was our discussion of the importance of popularity nonsense? Clearly not. As with basketball players, popularity can be important, but it is just one facet of a rich picture. Popular individuals play roles in creating perceptions of social norms and fads, and in directly reaching people. However, as we found in our study, the main issue in the microfinance villages was getting information out widely to a whole village rather than simply influencing perceptions. Even for people living in a remote village, by 2008 it was hard to be unaware of microfinance, just as most people in the developed world are aware of credit cards and know that it is useful to have one. This was not about creating a trend, or influencing villagers’ perceptions of how many other villagers are taking out microfinance loans; it was a matter of making as many villagers as possible aware that loans were available.31

Indeed, spreading the news about microfinance was not simply about how many friends the initial seeds could reach, but also about how many friends-of-friends (second-degree friends) and third-degree friends, etc., that the initial seeds could reach.32