Research Question: What types of identifying practices result in the highest rate of anonymous censorship?

Background

craigslist is a community-oriented site that resembles newspaper classified advertisements in form and function. The site was launched by Craig Newmark in 1995 and features advertisements (or “posts”) that can be submitted free of charge. This study focuses on the site's Missed Connections section. The prototypical Missed Connection describes a sighting or exchange that did not result in the interpersonal connection the author desired. An author submits a post to craigslist that details the encounter in a title and message, hoping that the intended target will read the post and recognize the description. Authors often include physical descriptions (“very attractive”), logistical details (“you were running on the treadmill”), and any interaction that might have occurred between author and target (“we made eye contact a few times”).

Content moderation on craigslist is placed squarely in the hands of its users. craigslist provides a “flagging” system that allows readers to mark content they feel is inappropriate. Anyone may flag a post, and no account or identifying information is required. Users are instructed to “flag with care” and to select one of four categories. With the exception of posts flagged as “best of craigslist”, posts flagged an undisclosed number of times are removed from the system. In this study we consider what types of identities are most likely to be censored.

Example Posts

FLAGGED: Hotel Action in Fairfax – m4m (Fairfax) 20yr
Hey dudes, please be 26 or younger. no blacks, no asians, no Albinos, no latinos, no mixed black and asian, black and latino, asian or latino. Please. dont be fat and hairy, i dont mind hair. im a little hairy myself. very goodlooking young guy. probably the hottest guy you’ll see walking the streets. 8in cut. send me your face pix. no dick, ass, stomach, and especially no feet pix. Face pic has to be a full clear view of your face. no bullshit angles. Other than that goodluck. To the man that sucks my dick, you should feel privileged. 20/m in Hotel near fairoaks mall. meet me at the room. suck my dick and bounce the FUCK OUT!

NOT FLAGGED: dark hair, light eyes, sitting in the sun at lunchtime – m4w – 30yr
You were sitting at a table by yourself with a black laptop. Wearing all black. I sat near you and made a few comments. You’re beautiful, I’m married, but I am intrigued by you. If interested in knowing more, write back.

Dataset

The dataset (N=2518) was obtained via a custom software application written by the author to collect posts from craigslist Missed Connections and to monitor if and when posts were censored. This dataset represents a random selection of censored (N=582) and uncensored (N=1936) posts submitted to craigslist Missed Connections in Washington D.C., New York City, and the San Francisco Bay Area. All posts were submitted between November 1 and November 30, 2008; posts censored after this timeframe were excluded from the study.

The mean age of authors, when reported, was 31.45 (SD=10.95).  Posts were collected from the four gender/sexual orientation subsections of Missed Connections:

  • m4m (N=497, 19.7%)
  • m4w (N=920, 36.5%)
  • w4m (N=560, 22.2%)
  • w4w (N=95, 3.8%)
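
The reported shares can be checked directly from the counts. The short sketch below recomputes each subsection's percentage of the full dataset and the overall censorship rate. The variable and function names are illustrative and are not taken from the author's collection software. It also shows that the four listed subsections account for 2,072 of the 2,518 posts; the source does not break down the remainder.

```python
# Counts as reported in the Dataset section; all names here are illustrative.
TOTAL_POSTS = 2518
CENSORED_POSTS = 582
SUBSECTION_COUNTS = {"m4m": 497, "m4w": 920, "w4m": 560, "w4w": 95}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of the full dataset contributed by each listed subsection.
subsection_shares = {
    name: percent(count, TOTAL_POSTS) for name, count in SUBSECTION_COUNTS.items()
}

# Overall proportion of posts in the dataset that were censored.
censorship_rate = percent(CENSORED_POSTS, TOTAL_POSTS)

# Posts covered by the four listed subsections, and posts not covered by them.
listed_total = sum(SUBSECTION_COUNTS.values())
unlisted = TOTAL_POSTS - listed_total

print(subsection_shares)
print(censorship_rate, listed_total, unlisted)
```

The recomputed shares match the percentages listed above, which confirms that they are expressed relative to the full dataset (N=2518) rather than to the sum of the four subsections.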

Method

This study takes a mixed-method approach. Quantitative analyses (correlational and regression) were used to assess system-level identity variables:

  • age
  • gender
  • sexual orientation
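
As one illustration of a correlational check, the association between a binary system-level variable (for example, whether a post appeared in a given subsection) and a binary outcome (censored or not) can be summarized with the phi coefficient of a 2x2 table. The sketch below is an assumption about how such a check might look, not the author's actual analysis. The function name is hypothetical, and the counts in the usage line are invented for illustration and are not data from the study.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for a 2x2 contingency table.

                    censored   not censored
    in group            a           b
    not in group        c           d
    """
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("a row or column total is zero; phi is undefined")
    return (a * d - b * c) / denom


# Hypothetical counts for illustration only (NOT the study's data).
example = phi_coefficient(30, 70, 20, 80)
print(round(example, 3))
```

Phi ranges from -1 to 1, and values near zero indicate a weak association, which is the kind of small effect a study like this might report for a single identity variable.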

Qualitative findings (via digital ethnography) were used to assess identity information included in the post’s main content. Ethnographic work was performed on a randomly selected subset of the total dataset (N=300), resulting in the following themes for censored content:

  • SPAM-like
  • Miscategorized
  • Confessionary
  • Complaints
  • Sexual Content

Results
[Figures: picture-2, picture-4]

Conclusion
This study found small effects for a number of system-level identity variables, as well as post-level variables. Identity attributes relative to group norms, as predicted by SIDE theory, appear to play a role in predicting censorship in this anonymous space. Interestingly, offensive attributes were both static (relating to the person) and dynamic (relating to their behavior). Surveying the quantitative and qualitative findings together reveals the following recurring themes:

  • Gender
  • Sexual Orientation
  • Explicit Content
  • Anonymity

While I had originally expected post-level identity attributes to play a dominant role in this study, system-level attributes appeared to have a greater influence. This may be the result of the clear identities (particularly gender/sexual orientation) by which the craigslist system is organized. It may also be the result of the complicated array of content that actually exists inside censored posts. However, I suspect that the implicit identities outlined by the system play a substantial role in organizing readers and authors into sub-populations with their own censoring behavior. Future research should compare post-level identity content across system-level identity groups. This would serve as an excellent follow-up study to test the idea of a “familiarity and appropriateness” model of censorship.