The Social Dynamics of Reddit
Part 1: Who participates?


Communities, audiences, and crowds

What, exactly, are online communities?

During the early years of the internet, communication scholars referred to groups of people interacting online as "virtual communities." The use of the word "communities" was, in part, an effort to legitimize online group interaction. Each year, it becomes harder to remember just how skeptical people were about the idea of forging social bonds online. At the time, it was easy to see how communication technologies like the telephone and written correspondence could extend, sustain, or augment communities formed in the "real world," but the idea of communities existing in the total absence of face-to-face interaction struck many as implausible. 

Gradually, anecdotes of computer-mediated social bonding and intimacy began to pile up. As more of us went online, as communication technologies added pictures and video to text, the idea of online groups as true communities seemed less far-fetched. In 2017, millions of individuals interact with people they will never meet face-to-face. 

But are these massive online groups really communities? Early online group communication took place among relatively small groups of dedicated individuals. In terms of their sizes and the frequency with which group members interacted with one another, they were not unlike the relatively small groups of people with whom we repeatedly interacted in offline communities. As online groups grew to the size of large cities, they began to seem more like two other types of groups: audiences (large groups of people passively consuming what a small number of creators produce) or crowds (larger groups of people that assemble, maybe interact for a short period of time, and then disperse).

Reddit is an ideal space in which to explore the diversity of online group dynamics. As of the writing of this entry, it is the fourth most popular website in the United States, behind only three others, including YouTube and Facebook. Reddit hosts over one million "subreddits," topic-oriented webpages that feature links to noteworthy content and, most relevantly to the topic of online communities, discussion threads.

Many of Reddit's discussion threads contain posts from thousands of different users, which seems to illustrate that online communities are possible in an era of mass participation. But are these groups really analogous to communities, or are they more like the aforementioned definitions of audiences or crowds?

The massive groups inhabiting Reddit did not become massive overnight. By examining the way in which they changed over time, we may gain insight into the processes by which online communities are formed and sustained. 

Distributions of participation

The difference between a community and an audience is, in part, a question of how many people actively participate in discourse and how many passively consume that discourse. Do we have a group in which a handful of individuals are producing witty, insightful comments for many others to read and enjoy, or do we have a group in which contribution to discussions is more evenly distributed among its members? 

To assess the distribution of participation in conversations on a given subreddit at a given time, our research team used a statistic called the Gini coefficient. Commonly used in economics, the Gini coefficient takes a value ranging from zero (reflecting a highly dispersed distribution) to one (reflecting a highly concentrated distribution). A discussion in which one person did all of the talking and everyone else listened (I suppose this would be more of a monologue than a discussion, wouldn't it?) would have a value of 1, while a discussion in which all members contributed equally would have a value of 0.
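For readers who want to see the arithmetic, here is a minimal sketch of one common way to compute a Gini coefficient from per-user comment counts. It is an illustration, not necessarily the exact procedure our team ran: the function name `gini` and the sample counts are made up for the example. Note that with a finite group of n people, the "monologue" case works out to (n - 1) / n, which only approaches 1 as the group gets large.

```python
def gini(counts):
    """Gini coefficient of non-negative contribution counts (one per person).

    Hypothetical helper for illustration. Uses the standard formula for
    sorted data: G = sum_i (2i - n - 1) * x_i / (n * sum(x)), i = 1..n.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one person and at least one comment")
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(values, start=1))
    return weighted / (n * total)


# Made-up examples: four people in a thread.
even_thread = [5, 5, 5, 5]    # everyone comments equally
monologue = [0, 0, 0, 10]     # one person does all the talking
mixed = [1, 2, 3, 4]          # somewhere in between

print(gini(even_thread))  # 0.0
print(gini(monologue))    # 0.75, i.e. (n - 1) / n with n = 4
print(gini(mixed))        # 0.25
```

One caveat worth keeping in mind: comment data only records people who commented at least once, so silent readers never show up as zeros unless they are added from some other source.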

A visualization of an aggregated sample of 20 subreddits (shown above) reveals a pattern: as subreddits grow, participation within the subreddits tends to become more highly concentrated among a relatively small sub-group of users within each subreddit (as is evidenced by an increasingly high aggregated Gini coefficient). In other words, at least within this sample, subreddits tend to start off resembling communities and become more like audiences. 

The positive relationship between group size and Gini coefficient can be observed in relatively small subreddits, like r/arduino (a subreddit dedicated to the discussion of Arduino technology)...

...but it can also be observed in larger subreddits, like r/news (a subreddit dedicated to the discussion of current events):

The concentration of participation does not seem to be an effect of time passing (i.e., how long the subreddit has been around), at least not in all cases. An examination of participation patterns in r/CFB, a subreddit dedicated to discussions of college football, makes it easier to separate the effect of time from the effect of size on concentration of participation. In this subreddit, neither the size nor the Gini coefficient is on a steady, upward trajectory. Instead, both fluctuate with the passage of the college football season. But the pattern persists: when the subreddit grows, the Gini coefficient rises, and when it shrinks, the Gini coefficient declines.
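To make the size-versus-concentration comparison concrete, here is a rough sketch of how one might track both quantities month by month. It assumes, hypothetically, that each comment is available as a (month, author) pair and that "size" means the number of distinct commenters in a month; the function names and the sample records are invented for the example and are not real r/CFB data.

```python
from collections import Counter, defaultdict


def gini(counts):
    """Gini coefficient of non-negative counts (same sorted-data formula)."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one person and at least one comment")
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(values, start=1))
    return weighted / (n * total)


def monthly_size_and_gini(comments):
    """Map each month to (distinct commenters, Gini of comments per commenter).

    `comments` is an iterable of (month, author) pairs, one per comment.
    This record layout is an assumption made for the sketch.
    """
    per_month = defaultdict(Counter)
    for month, author in comments:
        per_month[month][author] += 1
    return {
        month: (len(authors), gini(authors.values()))
        for month, authors in sorted(per_month.items())
    }


# Invented records: a quiet month followed by a busier one.
sample = [
    ("2016-08", "a"), ("2016-08", "b"),
    ("2016-09", "a"), ("2016-09", "a"), ("2016-09", "a"),
    ("2016-09", "b"), ("2016-09", "c"),
]

for month, (size, g) in monthly_size_and_gini(sample).items():
    print(month, size, round(g, 3))
```

With a table like this for a seasonal subreddit, one can check whether the Gini coefficient rises and falls along with size rather than simply climbing over time.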

There are, however, subreddits that do not conform to this pattern. Some subreddits, like r/TwoXChromosomes, start out small with highly concentrated participation and then, as they grow, exhibit greater dispersion in participation. It is unclear whether the dispersion is the result of the passage of time or whether growth actually makes participation more equal among members of these subreddits.

In our initial analysis of the relationship between community size and distribution of participation within 20 subreddits, enough subreddits diverge from the expected positive relationship between the two variables to persuade us that the divergence is not a fluke. It seems that a certain type of online group, composed of a certain type of people discussing a certain type of topic, actually becomes more community-like as it grows. What characteristics define these types? That's what we intend to find out!

Feel free to explore the relationship between subreddit size and Gini coefficient in various subreddits in the graphic below (use the drop-down menu on the right to select a subreddit). 

Next steps

The Gini coefficient provides a useful way of thinking about the differences between online communities and online crowds, but it doesn't tell us much about whether the participants are sticking around (as you would expect community members to do) or engaging in fleeting group interaction (a behavior more indicative of a crowd). In Part 2, we will attempt to speak to this issue by comparing online groups based on their "stickiness" or the extent to which they retain participants over time. 

Thanks to Felipe Hoffa (u/fhoffa) for making Reddit comment data available for analysis using Google BigQuery, and thanks to Jamie Witter for Tableau advice. Data analyzed by Conor Hollenbach, Tyler Rhodes, and Jinjie Yang at the University of Alabama.

