Can you moderate an unreadable message? 'Blind' content moderation via human computation




user-generated content, content moderation, privacy, safety, human computation, institution design, crowdsourcing


User-generated content (UGC) is fundamental to online social engagement, but eliciting and managing it come with many challenges. The special features of UGC moderation highlight many of the general challenges of human computation. They also emphasize how moderation and privacy interact: people have rights to both privacy and safety online, but it is difficult to provide one without violating the other. Scanning a user's inbox for potentially malicious messages seems to imply access to all safe messages as well. Are privacy and safety opposed, or is it possible in some circumstances to guarantee the safety of anonymous content without access to that content? We demonstrate that such "blind content moderation" is possible in certain domains. Additionally, the methods we introduce offer safety guarantees, an expressive content space, and no human moderation load: they are safe, expressive, and scalable. Though it may seem preposterous to try moderating UGC without human- or machine-level access to it, human computation makes blind moderation possible. We establish this existence claim by defining two very different human computational methods, behavioral thresholding and reverse correlation. Each leverages the statistical and behavioral properties of so-called "inappropriate content" in different decision settings to moderate UGC without access to a message's meaning or intention. The first, behavioral thresholding, is shown to generalize the well-known ESP game.






How to Cite

Frey, S., Bos, M. W., & Sumner, R. W. (2017). Can you moderate an unreadable message? 'Blind' content moderation via human computation. Human Computation, 4(1), 78-106.