Bots – automated software applications programmed to perform tasks online – have become a feature of our everyday lives, from helping us navigate online systems to assisting us with online shopping. Yet, for all the ways they enable internet users, bots are increasingly associated with disinformation and troubling political interference.
In this ground-breaking book, Monaco and Woolley offer the first comprehensive overview of the history of bots, tracing their varied applications throughout the past sixty years and bringing to light the astounding influence these computer programs have had on how humans understand reality, communicate with each other, and wield power. Drawing upon the authors' decade of experience in the field, this book examines the role bots play in politics, social life, business, and artificial intelligence. Bots have been a fundamental part of the web since the early 1990s, and the authors reveal how socially oriented bots continue to play an integral role in online communication globally, especially as our daily lives become increasingly automated.
This timely book is essential reading for students and scholars in Media and Communication Studies, Sociology, Politics, and Computer Science, as well as general readers with an interest in technology and public affairs.
Page count: 311
Year of publication: 2022
Cover
Series Title
Title Page
Copyright Page
Acknowledgments
Abbreviations
1 What is a Bot?
Where Does the Word “Bot” Come From?
History of the Bot
Early bots – Daemons and ELIZA
Bots and the early internet: Infrastructural roles on Usenet
Bots proliferate on internet relay chat
Bots and online gaming on MUD environments
Bots and the World Wide Web
Crawlers, web-indexing bots
Spambots and the development of the Robot Exclusion Standard
Social media and the dawn of social bots
Different Types of Bots
APIs – How bots connect to websites and social media
Social bots
Chatbots
Service bots and bureaucrat bots
Crawlers/spiders
Spambots
Cyborgs
Zombies, or compromised-device bots
Lots of bots – botnets
Misnomers and Misuse
Important bot characteristics
Conclusion
Notes
2 Bots and Social Life
Bots and Global Society
Social Bots, Social Media
Bots, Journalism, and the News
Bots, Dating, Videogames, and More
Conclusion
Notes
3 Bots and Political Life
Astroturfing, Inauthenticity, and Manual Messaging
Identifying Bots: Actors, Behavior, Content
The Tactics Used by Political Bots
Dampening
Hashtag poisoning
DDoS attacks
Amplification
Harassment
Political Bots and Their Uses
Influencing voter turnout
Surveillance
Passive surveillance – information gathering and analytics
Active surveillance – transparency
Social activism – The dawn of the bots populi
The Bot Arms Race
Conclusion
Notes
4 Bots and Commerce
What Is a Business Bot?
Business Chatbots and Customer Service
Transactional Bots and Finance
Conclusion
Notes
5 Bots and Artificial Intelligence
What Is AI?
History of AI and Bots
What Limits the Progress of AI?
Agent-Based AI, the Semantic Web and Machine Learning
How Bots Use AI
Bot detection
Chatbots and Natural Language Processing (NLP)
Non-AI chatbots
Corpus-based chatbots and fuzzy logic
AI-based chatbots
Conversational interfaces and AI assistant chatbots
Open-domain chatbots
Conclusion
6 Theorizing the Bot
Theorizing Human–Computer Interaction and Human–Machine Communication
Overview of the Literature
The Human–Bot Relationship
The Infrastructural Role of Bots
Conclusion
Notes
7 Conclusion: The Future of Bots
The Future of Bot Development and Evolution
NLP
Synthetic media
Semantic Web
Future Questions for Bot Policy
Future Ethical Questions
The Future Study of Bots
Notes
References
Index
End User License Agreement
Nancy Baym, Personal Connections in the Digital Age, 2nd edition
Taina Bucher, Facebook
Mercedes Bunz and Graham Meikle, The Internet of Things
Jean Burgess and Joshua Green, YouTube, 2nd edition
Mark Deuze, Media Work
Andrew Dubber, Radio in the Digital Age
Quinn DuPont, Cryptocurrencies and Blockchains
Charles Ess, Digital Media Ethics, 3rd edition
Terry Flew, Regulating Platforms
Jordan Frith, Smartphones as Locative Media
Gerard Goggin, Apps: From Mobile Phones to Digital Lives
Alexander Halavais, Search Engine Society, 2nd edition
Martin Hand, Ubiquitous Photography
Robert Hassan, The Information Society
Tim Jordan, Hacking
Graeme Kirkpatrick, Computer Games and the Social Imaginary
Tama Leaver, Tim Highfield and Crystal Abidin, Instagram: Visual Social Media Cultures
Leah A. Lievrouw, Alternative and Activist New Media
Rich Ling and Jonathan Donner, Mobile Communication
Donald Matheson and Stuart Allan, Digital War Reporting
Nick Monaco and Samuel Woolley, Bots
Dhiraj Murthy, Twitter, 2nd edition
Zizi A. Papacharissi, A Private Sphere: Democracy in a Digital Age
Julian Thomas, Rowan Wilken and Ellie Rennie, Wi-Fi
Katrin Tiidenberg, Natalie Ann Hendry and Crystal Abidin, tumblr
Jill Walker Rettberg, Blogging, 2nd edition
Patrik Wikström, The Music Industry, 3rd edition
nick monaco and samuel woolley
polity
Copyright © Nick Monaco and Samuel Woolley 2022
The right of Nick Monaco and Samuel Woolley to be identified as Authors of this Work has been asserted in accordance with the UK Copyright, Designs and Patents Act 1988.
First published in 2022 by Polity Press
Polity Press
65 Bridge Street
Cambridge CB2 1UR, UK
Polity Press
101 Station Landing
Suite 300
Medford, MA 02155, USA
All rights reserved. Except for the quotation of short passages for the purpose of criticism and review, no part of this publication may be reproduced, stored in a retrieval system or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the publisher.
ISBN-13: 978-1-5095-4358-8 (hardback)
ISBN-13: 978-1-5095-4359-5 (paperback)
A catalogue record for this book is available from the British Library.
Library of Congress Control Number: 2021949069
Typeset by Fakenham Prepress Solutions, Fakenham, Norfolk NR21 8NL
The publisher has used its best endeavours to ensure that the URLs for external websites referred to in this book are correct and active at the time of going to press. However, the publisher has no responsibility for the websites and can make no guarantee that a site will remain live or that the content is or will remain appropriate.
Every effort has been made to trace all copyright holders, but if any have been overlooked the publisher will be pleased to include any necessary credits in any subsequent reprint or edition.
For further information on Polity, visit our website: politybooks.com
At this point, it’s hard to believe there was ever a time in my life when I hadn’t heard the word “bot.” The last decade of research and learning has been a thrilling journey, and I’m grateful for all the wonderful people I’ve met along the way. In no particular order, I want to express my deepest thanks to Tim Hwang, Marina Gorbis, John Kelly, Vladimir Barash, Camille François, Phil Howard, Clint Watts, Mark Louden, Frieda Ekotto, Helmut Puff, Roman Graf, Ralph Hailey, Rosemarie Hartner, Chou Changjen, Yao Yuwen, and Yauling and Joel for their support, guidance, and encouragement. I’m continually inspired by all the courageous journalists, activists, and researchers I’ve worked with over the years, especially my colleagues and friends in Taiwan – your work changes lives. A huge thanks to my friends Sam C., Nate, Amanda, Ike, Anj, Sylvia, Quin, Renata, Samantha, Jake, Jackie, Skyler, Trevor, Jane, and Doug for making life so full and always being up for an interminable conversation. My co-author, Sam Woolley, you’ve been an incredible friend and colleague, and I’m already looking forward to our next project. Lastly and above all, I’m most grateful to my wonderful family – Mom, Dad, Grammy, Mark, Ben, Britnea, Franki, Rocco, Benni, Murphy, Andi, and all the Monacos and Carmacks. Your love has made me who I am. There’s no better family on Earth.
Nick
First and foremost, I would like to thank my family for their constant, enthusiastic support of my work. Without their encouragement, advice, and love I would never be able to do what I do. To Samantha, Pip, Mum, Dad, Oliver, Justin, Daniela, Manuela, Basket, Mathilda, Banjo, Charlie, and the Woolley, Donaldson, Loor, Shorey, Westlund, and Joens families – a sincere thank you for everything. To all of my friends – particularly Nick Monaco – thank you so very much for all of the learning and laughs. You make each day fun and inspiring. I’d also like to thank the members of my research team, the Propaganda Research Lab, collaborators at the Center for Media Engagement, and colleagues at the School of Journalism and Media and the Moody College of Communication – all at the University of Texas at Austin. Finally, I’d like to thank the organizations that support my ongoing research, particularly Omidyar Network, the Open Society Foundations, and the Knight Foundation.
Sam
AI – Artificial Intelligence
ANT – Actor Network Theory
ASA – Automated Social Actors
CUI – Conversational User Interface
GPU – Graphics Processing Unit
GUI – Graphical User Interface
HCI – Human–Computer Interaction
HMC – Human–Machine Communication
IO – Information Operations
IRC – Internet Relay Chat
ML – Machine Learning
MT – Machine Translation
MUD – Multi-user domain, multi-user dungeon
NLP – Natural Language Processing
RES – Robot Exclusion Standard
STS – Science, Technology, and Society studies
The 2020 United States presidential election was one of the most impassioned in the country’s history. President Donald Trump and his Democratic opponent Joe Biden both contended they were fighting for nothing less than the future of American democracy itself. The election brought with it several events rarely seen in the history of American democracy – an election held in the middle of a global pandemic, citizens’ storming of the US Capitol, and attempts by a sitting president to overturn the results of a free and fair election. Unprecedented events weren’t only taking place offline, however – social bots, or computer programs posing as humans on social media sites such as Twitter and Facebook, were beginning to use artificial intelligence (AI) techniques to fly under the radar of security teams at social media platforms and target voters with political messages. One of the leading bot detection experts in the US bluntly admitted,
Back in 2016, bots used simple strategies that were easy to detect. But today, there are artificial intelligence tools that produce human-like language. We are not able to detect bots that use AI, because we can’t distinguish them from human accounts. (Guglielmi, 2020)
But bots were not only carrying out covert, deceptive activity online in 2020. Working with amplify.ai, the Biden campaign deployed a chatbot to interact with users on Facebook Messenger and encourage them to vote. This bot’s intent was not to deceive – it would reveal that it was not human if asked – rather, it was a means of using AI techniques to boost get-out-the-vote efforts. Amplify.ai’s bots helped Biden reach over 240,000 voters in fourteen states in the three weeks leading up to election day (Dhapola, 2021; Disawar & Chang, 2021). Bots’ activities in the 2020 election illustrated the dual nature of the technology – whether bots are “bad” or “good” for society depends on how they are designed and used.
Until recently, the word “bot” was fairly obscure, used mostly in arcane scholarly discussions in the academy and in Silicon Valley meeting rooms full of computer programmers. The year 2020 was, of course, not the first time bots had been deployed to participate hyperactively in online political discussion in the US. It was the November 2016 presidential election that made “bot” a household word, both in the US and around the world. Journalists and researchers documented the underhanded automated tactics being used during that contest to promote both candidates. For many, this was the first time they realized that a political discussion online might not have an actual person on the other end – it might be a piece of software feeding us canned lines from a spreadsheet on the other side of the globe. Now, we can’t seem to get that idea out of our heads. These days, social media users are quick to label any antagonistic arguer a “bot,” whether it is a troll, a disinformation agent, or a true bot (an automated account).
But before bots became a notorious byword for social media manipulation in 2016, they were already a central infrastructural part of computer architecture and the internet. Many bots are benign, designed to do the monotonous work that humans do not enjoy and do not do quickly. They carry out routine maintenance tasks. They are the backbone of search engines like Google, Bing, and Yandex. They help maintain services, gather and organize vast amounts of online information, perform analytics, send reminders. They regulate chatrooms and keep them running when users are fast asleep. They power the voice-based interfaces emerging in AI assistants such as Apple’s Siri, Amazon’s Alexa, or Microsoft’s Cortana. They carry out basic customer service as stand-ins for humans online or on the phone. On the stock market, they make split-second decisions about buying and trading financial securities; they now manage over 60 percent of all investment funds (Kolakowski, 2019). In video games, they run the interactive agents known as non-player characters (NPCs) that converse with human players and advance storylines.
Other bots are malicious. They amplify disinformation and sow discord on social media, lure the lonely onto dating sites, scam unsuspecting victims, and facilitate denial-of-service cyberattacks, crashing websites by overloading them with automated traffic. They generate “deep fakes” – realistic-seeming faces of humans who have never existed, which can serve as a first step to larger fraudulent activity on the web (such as creating fake accounts to use for scams on dating apps). They artificially inflate the popularity of celebrities and politicians, as companies sell thousands of fake online followers for only a few dollars (Confessore et al., 2018).
Bots are obedient agents that follow their developers’ programming, and their uses and “interests” are as diverse as those of the humans who build them. They can be written in nearly any programming language. They can sleuth from website to website, looking for relevant information on a desired topic or individual. They are active on nearly all modern social media platforms – Facebook, Instagram, Twitter, Reddit, Telegram, YouTube – and keep the wheels turning at other popular sites like Wikipedia. They can interact with other users as official customer service representatives, chat under the guise of a human user, or work silently in the background as digital wallflowers, watching users and websites, gathering information, or gaming algorithms for their own purposes.
This book is about bots in all their diversity: what they do, why they’re made, who makes them, how they’ve evolved over time, and where they are heading. Throughout these chapters, we’ll draw on research from diverse fields – including communications, computer science, linguistics, political science, and sociology – to explain the origins and workings of bots. We examine the history and development of bots in the technological and social worlds, drawing on the authors’ expertise from a decade of interviews in the field and hands-on research at the highest levels of government, academia, and the private sector.
It’s easy to think bots only emerged on the internet in the last few years, or that their activities are limited to spamming Twitter with political hashtags, but nothing could be further from the truth. Bots’ history is as long as that of modern computers themselves. They facilitate interpersonal communication, enhance political communication through getting out the vote or supercharging low-resourced activists, degrade political communication through spam and computational propaganda, streamline formulaic legal processes, and form the backbone of modern commerce and financial transactions. They also interact with one another – allowing computers to communicate with each other to keep the modern web running smoothly. Few technologies have influenced our lives as profoundly and as silently as bots. This is their story, and the story of how bots have transformed not only technology, but also society. The ways we think, speak, and interact with each other have all been transformed by bots.
Our hope is that through this book, the reader will gain a thorough understanding of how technology and human communication intertwine, shaping politics, social life, and commerce. Throughout these seven chapters, we’ll cover all these areas in detail. In this chapter, we give the history of bots and define the different types of bots. In Chapter 2, Bots and Social Life, we explore the role that these computational agents play across global digital society. Chapter 3 explores the various ways that bots have been used for political communications, for both good and bad purposes, focusing especially on the advent of widespread digital campaigning and social media political bots in the last decade. In Chapter 4, we turn to the role of bots in the private sector, detailing commercial uses of automated agents over time in finance, customer service, and marketing. Chapter 5 explores the intersection of bots and artificial intelligence (AI). In Chapter 6, we trace the history of bot theory in academia – drawing on social science, philosophy, art, and computer science – to understand how the conception of bots has evolved over time and to consider bots’ future, particularly as it relates to questions in policy, ethics, and research. Finally, we close with thoughts on the future of bots, and key recommendations for researchers, policymakers, and technologists working on bots in the future.
“Bot” is a shortened version of the word “robot.” While the concept of a self-managing machine that performs tasks has arguably been around for hundreds of years (for example, da Vinci’s mechanical knight, designed around 1495), the word “robot” was not coined until 1920. It originated with the Czech playwright and activist Karel Čapek, in a play called R.U.R. (Rossum’s Universal Robots). In the play, the titular robots are humanlike machine workers without souls, produced and sold by the R.U.R. company in order to increase the speed and profitability of manufacturing. Čapek called these machines roboti at the suggestion of his brother Josef, who adapted the term from the Czech words robotník (“forced worker”) and robota (“forced labor, compulsory service”) (Flatow, 2011; Online Etymology Dictionary, n.d.). Robota has cognates in other modern European languages, such as the German Arbeit (“work”). Inherent in these roots is the idea of forced servitude, even slavery – a robot is an object that carries out tasks specified by humans. This idea is key to understanding bots in the online sphere today, where bots are computer programs that carry out a set of instructions defined, ultimately, by a human. There is always a human designer behind a bot.
While “bot” began as a shortened form of “robot,” in the era of the modern internet, the connotations of the two terms have diverged. Bot is now used mostly to designate software programs, most of which run online and have only a digital presence, while robots are commonly conceived of as possessing a physical presence in the form of hardware – of having some form of physical embodiment. Wired journalist Andrew Leonard writes that bots are “a software version of a mechanical robot” whose “physical manifestation is no more than the flicker of electric current through a silicon computer chip” (Leonard, 1997, pp. 7–24). Today, social bots’ implementation may involve a visual presence, such as a profile on Twitter or Facebook, but the core of their functioning lies in the human-designed code that dictates their behavior.
Many people think that bots emerged only recently, in the wake of the incredibly rapid uptake of smartphones and social media. In fact, although they entered mainstream consciousness only relatively recently, bots are nearly as old as computers themselves, with their roots going back to the 1960s. However, it is difficult to trace the history of the bot, because there is no standard, universally accepted definition for what exactly a bot is. Indeed, bot designers themselves often don’t agree on this question. We’ll begin this history by discussing some of the first autonomous programs, called daemons, and the birth of the world’s most famous chatbot in the 1960s.
Daemons, or background processes that keep computers running and perform vital tasks, were one of the first forms of autonomous computer programs to emerge. In 1963, MIT Professor Fernando Corbato conceived of daemons as a way to save himself and his students time and effort using their shared computer, the IBM 7094. While it is debatable whether these programs count as bots (it depends on how you define bot), their autonomy makes them noteworthy as a precursor to more advanced bots (McKelvey, 2018).
A more recognizable bot emerged only three years later. In 1966, another MIT professor, Joseph Weizenbaum, programmed ELIZA – the world’s first (and most famous) chatbot,1 arguably “the most important chatbot dialog system in the history of the field” (Jurafsky & Martin, 2018, p. 425). ELIZA was a conversational computer program with several “scripts.” The most famous of these was the DOCTOR script, under which ELIZA imitated a therapist, conversing with users about their feelings and asking them to talk more about themselves. Using a combination of basic keyword detection, pattern matching,2 and canned responses, the chatbot would respond to users by asking for further information or by strategically changing the subject (Weizenbaum, 1966). The program was relatively simple – a mere 240 lines of code – but the response it elicited from users was profound. Many first-timers believed they were talking to a human on the other end of the terminal (Leonard, 1997, p. 52). Even after users were told that they were talking to a computer program, many simply refused to believe they weren’t talking to a human (Deryugina, 2010). At the first public demonstration of the early internet (the ARPANET) in 1972, people lined up at computer terminals for a chance to talk to ELIZA.
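To make that mechanism concrete, the following is a minimal sketch, in Python, of ELIZA-style keyword detection, pattern matching, and canned responses. The handful of rules shown here are illustrative inventions, not Weizenbaum’s original DOCTOR script.

```python
import random
import re

# A few illustrative DOCTOR-style rules: a pattern paired with canned
# responses. "{0}" is filled in with whatever text the pattern captured.
RULES = [
    (re.compile(r"i feel (.*)", re.I), ["Why do you feel {0}?", "How long have you felt {0}?"]),
    (re.compile(r"my (mother|father|family)", re.I), ["Tell me more about your {0}."]),
    (re.compile(r"\byes\b", re.I), ["You seem quite sure."]),
]
# If no keyword matches, strategically change the subject.
FALLBACKS = ["Please go on.", "Can you tell me more about that?"]

def reply(user_input: str) -> str:
    """Return a canned response chosen by simple keyword/pattern matching."""
    for pattern, responses in RULES:
        match = pattern.search(user_input)
        if match:
            captured = match.group(1) if match.groups() else ""
            return random.choice(responses).format(captured)
    return random.choice(FALLBACKS)

if __name__ == "__main__":
    print(reply("I feel anxious"))  # e.g. "Why do you feel anxious?"
```

Even rules this crude, recycled turn after turn, can keep a conversation going, which helps explain the reactions the program provoked.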
ELIZA captured people’s minds and imaginations. When Weizenbaum first tested out ELIZA on his secretary, she famously asked him to leave the room so they could have a more private conversation (Hall, 2019). Weizenbaum, who had originally designed the bot to show how superficial human–computer interactions were, was dismayed by the paradoxical effect.
I was startled to see how quickly and how very deeply people conversing with DOCTOR became emotionally involved with the computer and how unequivocally they anthropomorphized it, [Weizenbaum wrote years later]. What I had not realized is that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people. (Weizenbaum, 1976, pp. 6–7)
This response was noteworthy enough to be dubbed the “ELIZA effect,” the tendency of humans to ascribe emotions or humanity to mechanical or electronic agents with which they interact (Hofstadter, 1995, p. 157).
Other early bots did not have the glamor of ELIZA. For most of the 1970s and 1980s, bots largely played mundane but critical infrastructural roles in the first online environments. Bots are often cast in this “infrastructural” role,3 serving as the connective tissue of human–computer interaction (HCI). In these roles, bots act as invisible intermediaries between humans and computers, making everyday tasks easier. They do the boring stuff – keeping background processes running or chatrooms open – so we don’t have to. They are also used to make sense of unordered, unmappable, or decentralized networks. As bots move through unmapped networks, taking notes along the way, they build a map (and therefore an understanding) of ever-evolving networks like the internet.
The limited, nascent online environment from the late 1970s onward was home to a number of important embryonic bots, which would form the foundation for modern ones. The early internet was mainly accessible to a small number of academic institutions and government agencies (Ceruzzi, 2012; Isaacson, 2014, pp. 217–261), and it looked very different: it consisted of a limited number of networked computers, which could only send small amounts of data to one another. There were no graphical user interfaces (GUIs) or flashy images. For the most part, data was text-based, sent across the network using protocols – the standards and languages that computers use to exchange information with one another. Protocols lie at the heart of inter-computer communication, both then and now. For example, a file is sent from one computer to another using a set of pre-defined instructions called the File Transfer Protocol (FTP), which requires that both the sending and the receiving computer understand FTP (all computers do, nowadays). Another of the most widespread and well-known protocols on the modern internet is the hypertext transfer protocol (HTTP), first developed in 1989 by Tim Berners-Lee, who used it as the basis for the World Wide Web. Before HTTP and the World Wide Web became nearly universal in the 1990s, computers used different protocols to communicate with each other online,4 including Usenet and Internet Relay Chat (IRC). Both of these early online forums still exist today, and both served as critical incubators and breeding grounds for bot developers and their creations.
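To make the idea of a protocol concrete, the sketch below “speaks” HTTP/1.1 by hand over a TCP socket in Python, the same kind of structured exchange that any bot or browser ultimately performs; the host example.com is a placeholder used purely for illustration.

```python
import socket

# A protocol is simply an agreed-upon format for messages.
HOST = "example.com"
REQUEST = (
    "GET / HTTP/1.1\r\n"        # request line: method, path, protocol version
    f"Host: {HOST}\r\n"          # a header required by HTTP/1.1
    "Connection: close\r\n"      # ask the server to close the connection when done
    "\r\n"                       # a blank line marks the end of the headers
)

with socket.create_connection((HOST, 80)) as sock:
    sock.sendall(REQUEST.encode("ascii"))
    response = b""
    while chunk := sock.recv(4096):  # read until the server closes the connection
        response += chunk

# The first line of the reply (e.g. "HTTP/1.1 200 OK") is itself defined by the protocol.
print(response.split(b"\r\n", 1)[0].decode())
```

Because both sides agree on this format in advance, neither needs to know anything else about the other, which is what allows bots to interoperate with servers built by complete strangers.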
Usenet was the first widely available electronic bulletin-board service (often written simply as “BBS”). Developed in 1979 by computer-science graduate students at Duke and the University of North Carolina, Usenet was originally invented as a way for computer hobbyists to discuss Unix, a computer operating system popular among programmers. Users could connect their computers to each other via telephone lines and exchange information in dedicated forums called “news groups.” Users could also use their own computers to host the service, an activity known as running a “news server.” Many users both actively participated in and hosted the decentralized service, incentivizing many of them to think about how the platform worked and how it could be improved.
This environment led to the creation of some of the first online bots: automated programs that helped maintain and moderate Usenet. As Andrew Leonard describes, “Usenet’s first proto-bots were maintenance tools necessary to keep Usenet running smoothly. They were cyborg extensions for human administrators” (Leonard, 1997, p. 157). Especially in the early days, bots primarily played two roles: one was posting content, the other was removing it (or “canceling,” as it was often called on Usenet) (Leonard, 1996). Indeed, Usenet’s “cancelbots” were arguably the first political bots. Cancelbots were a Usenet feature that enabled users to delete their own posts. If a user decided they wanted to retract something they had posted, they could flag the post with a cancelbot, a simple program that would send a message to all Usenet servers instructing them to remove the content. Richard Depew wrote the first Usenet cancelbot, known as ARMM (“Automated Retroactive Minimal Moderation”) (Leonard, 1997, p. 161).
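To give a sense of how simple the mechanism was, a Usenet cancel is itself just a specially formatted article whose “Control: cancel” header names the post to be deleted; the headers below are a hypothetical illustration rather than a real message.

```
From: alice@example.edu
Newsgroups: news.admin.misc
Subject: cmsg cancel <original-post-id@example.edu>
Control: cancel <original-post-id@example.edu>
Message-ID: <cancel.original-post-id@example.edu>
```

Servers would generally honor any cancel whose sender appeared to match the original poster, a design assumption that mattered greatly in what followed.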
Though the cancelbot feature was originally meant to enable posters to delete their own content, with just a little technical savvy it was possible to spoof identities and remove others’ posts. This meant that, in effect, a bot could be used to censor other users by deleting their content from the web. Once the secret was out, users and organizations began cancelling other users’ posts. For example, a bot called CancelBunny began deleting mentions of the Church of Scientology on Usenet, claiming they violated copyright. A representative from the Church itself said that it had contacted technologists to “remove the infringing materials from the Net,” and a team of digital investigators traced the cancelbot back to a Scientologist’s Usenet account (Grossman, 1995). The incident drew ire from Usenet enthusiasts and inspired hacktivists like the Cult of the Dead Cow (cDc), who felt the attempt at automated censorship violated the free speech ethos of Usenet, to declare an online “war” on the Church (Swamp Ratte, 1995). Another malicious cancelbot “attack,” launched by a user in Oklahoma, deleted 25,536 messages on Usenet (Woodford, 2005, p. 135). Some modern governments use automation in similar ways, and for similar purposes, as these cancelbots and annoybots: using automation to affect the visibility of certain messages and indirectly censor speech online (M. Roberts, 2020; Stukal et al., 2020).
Another prolific account on Usenet, Serdar Argic, posted political screeds to dozens of different news groups with astonishing frequency and volume. These posts cast doubt on Turkey’s role in the Armenian Genocide in the early twentieth century and criticized Armenian users. Usenet enthusiasts still debate today whether Argic’s posts were actually automated, but the account’s high-volume posting and apparently canned responses to keywords such as “Turkey” in any context (even on posts referring to the food) seem to point toward automation.
Over time, more advanced social Usenet bots began to emerge. One of these was Mark V. Shaney, a bot designed by two Bell Laboratories researchers that made its own posts and conversed with human users. Shaney used Markov chains, a probabilistic language-generation technique that strings sentences together based on which words are most likely to follow the ones before them. The name Mark V. Shaney was actually a pun on the term Markov chain (Leonard, 1997, p. 49). The Markov chain technique is still widely used today in modern natural language processing (NLP) applications (Jurafsky & Martin, 2018, pp. 157–160; Markov, 1913).
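As a rough sketch of the technique (not Mark V. Shaney’s actual implementation), a word-level Markov chain can be built and sampled in a few lines of Python:

```python
import random
from collections import defaultdict

def build_chain(text: str) -> dict:
    """Map each word to the list of words observed to follow it."""
    words = text.split()
    chain = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        chain[current].append(nxt)
    return chain

def generate(chain: dict, start: str, length: int = 10) -> str:
    """Walk the chain, picking each next word in proportion to how often it followed the previous one."""
    word, output = start, [start]
    for _ in range(length):
        followers = chain.get(word)
        if not followers:  # dead end: no observed successor
            break
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)

# A toy corpus; Shaney's posts drew on far larger bodies of Usenet text.
corpus = "the bot posted the message and the bot replied to the thread"
print(generate(build_chain(corpus), "the"))
```

The output is locally plausible but globally meandering, which is much of what made Mark V. Shaney’s posts so uncanny.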
Like Usenet, Internet Relay Chat (IRC) was one of the most important early environments for bot development. IRC was a proto-chatroom – a place where users could interact, chat, and share files online. IRC emerged in 1988, nine years after Usenet first appeared, coded by Finnish computer researcher Jarkko Oikarinen. Oikarinen made the code open-source, enabling anyone with the technical know-how and desire to host an IRC server. Along with the code, Oikarinen also included guidelines for building an “automaton,” or an autonomous agent that could help provide services in IRC channels (Leonard, 1997, pp. 62–63).
The arc of bot usage and evolution in IRC is similar to that of Usenet. At first, bots played an infrastructural role; then, tech-savvy users began to entertain themselves by building their own bots for fun and nefarious users began using bots as a disruptive tool; in response, annoyed server runners and white-hat bot-builders in the community built new bots to solve the bot problems (Leonard, 1997; Ohno, 2018).
Just as with Usenet, early bots in IRC channels played an infrastructural role, helping with basic routine maintenance tasks. For instance, the initial design of IRC required at least one user to be present in a channel for it to remain open and available to join; if every user logged out, the channel would close and cease to exist. Eventually, “Eggdrop” bots were created to solve this problem. Users deployed these bots to stay logged into IRC channels at all times, keeping them open even when all human users were logged out (such as at night, when they were sleeping). Bots were easy to build in the IRC framework, and users thus quickly began designing other new bots with different purposes: bots that would say hello to newcomers in the chat, spellcheck typing, or provide an interface for users to play games like Jeopardy! or Hunt the Wumpus in IRC.
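The flavor of these helper bots can be captured in a short sketch. The Python below connects to an IRC server, registers a nickname, stays joined to a channel, answers the server’s keepalive pings, and greets newcomers; the server, channel, and nickname are placeholders, and real bots such as Eggdrop are far more robust.

```python
import socket

# Placeholder connection details: substitute a real IRC network and channel.
SERVER, PORT = "irc.example.net", 6667
CHANNEL, NICK = "#lobby", "greetbot"

sock = socket.create_connection((SERVER, PORT))

def send(line: str) -> None:
    """IRC is a line-based text protocol: every command ends with CRLF."""
    sock.sendall((line + "\r\n").encode("utf-8"))

send(f"NICK {NICK}")                               # register a nickname...
send(f"USER {NICK} 0 * :A friendly greeter bot")   # ...and user details
send(f"JOIN {CHANNEL}")                            # then join (and keep open) the channel

buffer = ""
while True:
    data = sock.recv(4096)
    if not data:                                   # server closed the connection
        break
    buffer += data.decode("utf-8", errors="ignore")
    while "\r\n" in buffer:
        line, buffer = buffer.split("\r\n", 1)
        parts = line.split(" ")
        if parts[0] == "PING":                     # answer keepalives or the server drops us
            send("PONG " + parts[1])
        elif len(parts) >= 2 and parts[1] == "JOIN" and not line.startswith(f":{NICK}!"):
            newcomer = parts[0].split("!", 1)[0].lstrip(":")  # ":nick!user@host JOIN #lobby"
            send(f"PRIVMSG {CHANNEL} :Welcome, {newcomer}!")
```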
Given the ease of developing bots in IRC and the technical skill of many early users, this environment was the perfect incubator for bot evolution. Good and bad IRC bots proliferated in the years to come. For example, Eggdrop bots became more useful, not only keeping IRC channels open when no human users were logged in but also managing permissions on IRC channels. On the malicious side, hackers and troublemakers, often working in groups, would use collidebots and clonebots to hijack IRC channels by knocking human users off of them, and annoybots began flooding channels with text, making normal conversation impossible (Abu Rajab et al., 2006; Leonard, 1997). In response, other users designed channel-protection bots to defend IRC channels from annoybots. In IRC, bots were both heroic helpers and hacker villains – digital Lokis that played both roles. This dual nature of bots persists to this day on platforms like Reddit, where bots play both helpful and contested roles (Massanari, 2016).
In addition to Usenet and IRC, computer games were also a hotbed of early bot development. From 1979 on, chatbots were relatively popular in online gaming environments known as MUDs (“multi-user domains” or “multi-user dungeons”). MUDs gained their name from the fact that multiple users could log into the same server at the same time and play the same game. Unlike console games, MUDs were text-based and entirely without graphics,5 due to early computers’ limited memory and processing power, making them an ideal environment for typed bot interaction. These games often had automated non-player characters (NPCs) that helped move gameplay along, providing players with necessary information and services. MUDs remained popular into the 1990s, and users increasingly programmed and forked their own bots as the genre matured (Abokhodair et al., 2015; Leonard, 1997).
ELIZA, the original chatbot from the 1960s, served as a prototype and inspiration for most MUD chatbots. One of the big 1990s breakthroughs for MUD bots was a chatbot named Julia. Julia was part of an entire family of bots called the Maas-Neotek family, written by Carnegie Mellon University graduate student Michael “Fuzzy” Mauldin for TinyMUD environments. Julia, a chatbot based on ELIZA’s code, inspired MUD enthusiasts to build on the publicly available Maas-Neotek code and hack together their own bot variants (Foner, 1993; Julia’s Home Page, 1994; Leonard, 1997, pp. 40–42). Bots became legion in TinyMUDs – at one point, PointMOOt, a popular TinyMUD that simulated a virtual city, had a population that was over 50 percent bots (Leonard, 1996) – and they were an essential part of the appeal for both players and developers.
As we have seen, early internet environments such as Usenet, IRC, and MUDs were the first wave of bot development, driving bot evolution from the 1970s through the 1990s. The next stage of bot advancement came with the advent of the World Wide Web in 1991.
The World Wide Web became widely available in the early 1990s, growing exponentially more complex and difficult to navigate as it gained more and more users. Gradually, people began to realize that there was simply too much information on the web for humans to navigate easily. It was clear to companies and researchers at the forefront of computer research that they needed to develop a tool to help humans make sense of the vast web. Bots came to fill this void, playing a new infrastructural role as an intermediary between humans and the internet itself. Computer programs were developed to move from webpage to webpage and analyze and organize the content (“indexing”) so that it was easily searchable. These bots were often called “crawlers” or “spiders,”6 since they “crawled” across the web to gather information. Without bots visiting sites on the internet and taking notes on their content, humans simply couldn’t know what websites were online. This fact is as true today as it was back then.
The basic logic that drives crawlers is very simple. At their base, websites are text files. These text files are written using hypertext markup language (HTML), a standardized format that is the primary base language of all websites.7
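A crawler’s core loop is simple: fetch a page, record its links, then visit those links in turn. The sketch below implements it in a few dozen lines of standard-library Python; the seed URL is a placeholder, and a real crawler would also respect the Robot Exclusion Standard discussed later in this chapter.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in an HTML page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed_url: str, max_pages: int = 10) -> dict:
    """Breadth-first crawl: fetch a page, note its links, then visit them in turn."""
    index, queue, seen = {}, [seed_url], {seed_url}
    while queue and len(index) < max_pages:
        url = queue.pop(0)
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", errors="ignore")
        except OSError:
            continue                       # unreachable page: skip it
        parser = LinkExtractor()
        parser.feed(html)
        index[url] = parser.links          # the "notes" the crawler takes about each page
        for link in parser.links:
            absolute = urljoin(url, link)  # resolve relative links against the current page
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return index

# Example (placeholder seed): site_map = crawl("https://example.com")
```

Search engines build their ranking and indexing on top of this elementary fetch-and-follow routine.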
