Can you moderate an unreadable message? 'Blind' content moderation via human computation

Seth Frey; Maarten W Bos; Robert W Sumner

doi:10.15346/hc.v4i1.5

Authors

Seth Frey, Neukom Institute for Computational Science, Dartmouth College, http://orcid.org/0000-0002-5298-5089
Maarten W Bos, Disney Research
Robert W Sumner, Disney Research

Keywords:

user-generated content, content moderation, privacy, safety, human computation, institution design, crowdsourcing

Abstract

User-generated content (UGC) is fundamental to online social engagement, but eliciting and managing it come with many challenges. The special features of UGC moderation highlight many of the general challenges of human computation. They also emphasize how moderation and privacy interact: people have rights to both privacy and safety online, but it is difficult to provide one without violating the other. For example, scanning a user's inbox for potentially malicious messages seems to imply access to all of the safe ones as well. Are privacy and safety opposed, or is it possible in some circumstances to guarantee the safety of anonymous content without access to that content? We demonstrate that such "blind content moderation" is possible in certain domains. Additionally, the methods we introduce offer safety guarantees, support an expressive content space, and require no human moderation load: they are safe, expressive, and scalable. Though it may seem preposterous to try moderating UGC without human- or machine-level access to it, human computation makes blind moderation possible. We establish this existence claim by defining two very different human computational methods, behavioral thresholding and reverse correlation. Each leverages the statistical and behavioral properties of so-called "inappropriate content" in different decision settings to moderate UGC without access to a message's meaning or intention. The first, behavioral thresholding, is shown to generalize the well-known ESP game.
