Twitter is experiencing troubles. Twitter Inc.'s statistics show that growth in Twitter's monthly active users (MAU) has slowed since the first quarter of 2015. Active engagement is an important factor in gauging Twitter’s performance because it is directly related to ad revenue, and Twitter cannot become profitable without increasing the engagement level of its users.
A 2009 Nielsen Online report stated that "more than 60 percent of U.S. Twitter users fail to return the following month." The data is old, but it should still reflect Twitter's problem of a high dropout rate. A more recent study by Liu et al. showed a dramatic increase in inactive user accounts. Both studies showed that inactive users constituted a large fraction of the Twitter population, so solving this problem would be of great significance for both Twitter and its users.
Problem we were trying to solve
A considerable number of new users stopped actively engaging on Twitter soon after they registered.
"Active Users" Definition
Twitter users who had contributed content on Twitter (a tweet or retweet) during the month before the data was collected. This definition excluded users who lurked on the platform and only passively browsed content.
Our study aimed to advance the prior art through data mining, a survey, and analysis focused on the different behaviors of new users who continued actively engaging and those who stopped. Based on our observations and findings, we suggested design solutions that could help Twitter increase active engagement among new users. We then carried out a usability study to evaluate the effectiveness of our design solutions.
The overarching objective of this project was to identify some of the differences in the way active and inactive users used Twitter soon after they joined, and to provide design solutions that could help Twitter reduce the number of new users who stopped actively engaging. By analyzing data from right after a user joined, we could begin to understand how the two groups used Twitter differently.
Explore statistically significant differences between active users and inactive users by analyzing their Twitter usage.
Use these differences to brainstorm potential reasons why new users stop using Twitter.
Propose design solutions that could alleviate some of the potential issues.
Test and evaluate the effectiveness of the proposed design solutions.
While there was not much research on why new Twitter users stopped actively engaging on the platform, we found some research on what motivated users to engage on Twitter. A study by Johnson and Yang identified six motivations for engaging on Twitter: entertainment, passing the time, information providing, information seeking, professional interaction, and social interaction. One assumption is that failing to fulfill one or a combination of these motivations leads users to drop out of Twitter. While we could not infer from the research whether the six motivations applied to inactive users on Twitter, it inspired us to generate hypotheses.
Another survey, conducted by Deutsche Bank in 2014, listed the top reasons why people who tried Twitter quit (as shown below). The top three reasons boiled down to users' inability to find and filter content that mattered to them.
We also found four situations that might bias our data analysis results: (1) users who had recently deleted their tweets, so that their last public post was more than a month old, (2) users who had switched their accounts to “protected,” leaving us with no access to their latest tweets, (3) users whose accounts had been suspended by Twitter administrators, and (4) users who had switched to other accounts (Liu, 2014). These are limitations of mining data on Twitter. We will need to look into these issues to see whether we can get a more comprehensive set of data attributes and generate an accurate sample that aligns with our research objective.
The secondary research above indicated that information seeking and information sharing were two of the major motivations for using Twitter, so we planned to test whether failing to meet either of these two needs was associated with a low engagement level. We used the two sets of hypotheses below:
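Set 1 (attention), null hypothesis: Active and inactive new users receive the same amount of attention from the community
Set 1 (attention), alternative hypothesis: Active new users receive more attention from the community than inactive new users
Set 2 (interest), null hypothesis: Active and inactive users have the same level of interest in the content on Twitter
Set 2 (interest), alternative hypothesis: Active users are more interested than inactive users in the content on Twitter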
The first set of hypotheses related to the major user need for information sharing, and the second set reflected the intention of information seeking. In both cases, the activity level was the dependent variable, while the degrees of fulfillment of the two needs were the independent variables. To quantify these variables for hypothesis testing, we used the number of retweets per tweet, the number of likes per tweet, and the number of followers for the first set of hypotheses, which concerned attention level. For the second set, we used the number of followees, the number of likes given to others’ tweets, and the number of retweets to represent users' interest levels.
We also discussed what each metric actually indicated. Having a large number of followees did not necessarily mean a user was interested in the content if no active engagement was involved. A user could follow a wide range of accounts at registration and then stop actively engaging on the platform if he/she found nothing of interest in the feed. Therefore, the number of likes was a more accurate reflection of a user’s interest level. To understand how much attention a user acquired from the community, we used both the number of followers and the average likes per tweet. The number of followers was directly related to a user’s influence on the platform. The average likes per tweet reflected the feedback a user received for the content he/she created. The former was about the person, while the latter was about the content. We used both metrics to study how much attention a user gained from the platform.
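The metrics described above can be illustrated with a minimal Python sketch. The `Tweet` and `UserStats` records and the function names are hypothetical and chosen only for illustration; they are not the data structures used in the study.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Tweet:
    likes: int     # likes this tweet received from other users
    retweets: int  # times this tweet was retweeted by other users


@dataclass
class UserStats:
    # Hypothetical per-user record; field names are assumptions.
    followers: int       # accounts following this user
    followees: int       # accounts this user follows
    tweets_liked: int    # likes this user gave to others' tweets
    retweets_made: int   # retweets this user made of others' tweets
    tweets: List[Tweet]  # the user's own tweets


def per_tweet_average(tweets: List[Tweet], field: str) -> float:
    """Mean of `field` over a user's tweets; 0.0 if the user has no tweets."""
    if not tweets:
        return 0.0
    return sum(getattr(t, field) for t in tweets) / len(tweets)


def attention_metrics(user: UserStats) -> Dict[str, float]:
    """Metrics for the first hypothesis set (attention received)."""
    return {
        "followers": float(user.followers),
        "likes_per_tweet": per_tweet_average(user.tweets, "likes"),
        "retweets_per_tweet": per_tweet_average(user.tweets, "retweets"),
    }


def interest_metrics(user: UserStats) -> Dict[str, float]:
    """Metrics for the second hypothesis set (interest in content)."""
    return {
        "followees": float(user.followees),
        "tweets_liked": float(user.tweets_liked),
        "retweets_made": float(user.retweets_made),
    }
```

Keeping the two metric groups in separate functions mirrors the split between the person-level measure (followers) and the content-level measure (likes per tweet) discussed above.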
To collect data from Twitter, we used a combination of open source Python libraries and the official Twitter API. We first searched for all tweets posted in the month of May that contained the hashtag #myFirstTweet. We chose this hashtag because it is one of the tweets Twitter suggests right after an account is created. After retrieving all of these tweets, we iterated through the users who posted them and filtered out those who had not created their account in May. This resulted in 723 users. We then categorized users as active or inactive based on whether they had at least one tweet or retweet in the month of October. Next, we retrieved data on each user's tweets, retweets, number of followers, and number of likes, and stored it locally for our data analysis. While collecting this data, we had to make sure we did not exceed Twitter’s rate limits, and we optimized the way we retrieved data to reduce the number of calls as much as possible. The process had to be spread over multiple days to retrieve all the data we needed.
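The filtering and labeling steps described above could be sketched as follows. This is a simplified illustration that works on already-downloaded records and makes no API calls; the dictionary keys (`id`, `created_at`, `post_times`) and the `year` parameter are assumptions, not the study's actual code.

```python
from datetime import datetime
from typing import Dict, Iterable, List


def in_month(moment: datetime, year: int, month: int) -> bool:
    """True if `moment` falls within the given calendar month."""
    return moment.year == year and moment.month == month


def label_users(users: Iterable[dict], year: int,
                signup_month: int = 5, activity_month: int = 10) -> Dict[str, str]:
    """Keep users who signed up in `signup_month` and label each one
    'active' if they have a tweet or retweet in `activity_month`."""
    labels: Dict[str, str] = {}
    for user in users:
        # Drop users whose account was not created in the signup month.
        if not in_month(user["created_at"], year, signup_month):
            continue
        # post_times holds the timestamps of the user's tweets and retweets.
        post_times: List[datetime] = user["post_times"]
        posted = any(in_month(t, year, activity_month) for t in post_times)
        labels[user["id"]] = "active" if posted else "inactive"
    return labels
```

The defaults of May and October follow the text; the year is passed in by the caller.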
In line with our proposed hypotheses, we ran a preliminary analysis of the retrieved Twitter data. We used a two-tailed t-test to look for a statistically significant difference in the means of various metrics between the two groups of users.
[Table: p-value (needs to be < 0.05) for each metric: Num of Tweets Liked, Num of Likes per Tweet, Num of Followers, Num of Users Followed]
The test results showed a statistically significant difference between active and inactive users in the number of tweets liked, the number of likes per tweet, the number of followers, and the number of users followed. While most of the t-values indicated that the mean of each metric was higher for active users than for inactive users, this was not the case for the number of likes per tweet. Upon further analysis, we found that the number of likes per tweet was in fact much higher on average for inactive users than for active users. This is counterintuitive. We found that a large number of tweets by active users did not consistently get likes, while the few tweets by inactive users often received several likes. This might suggest that active users do not rely on how often their tweets are liked to stay engaged, and perhaps consider the number of followers a more important metric. However, further data analysis would be required to assess this.
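For readers who want to reproduce this kind of comparison, the sketch below computes a two-sample t statistic using only the Python standard library. It assumes Welch's unequal-variance form, which the study does not specify, and it approximates the two-tailed p-value with the standard normal distribution, which is only reasonable for large samples. A statistics package with an exact t distribution would be preferable in practice.

```python
from math import sqrt
from statistics import NormalDist, mean, variance
from typing import Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's t statistic for the difference in means of a and b.
    Each sample needs at least two values and some variance overall."""
    se_a = variance(a) / len(a)  # squared standard error of mean(a)
    se_b = variance(b) / len(b)  # squared standard error of mean(b)
    return (mean(a) - mean(b)) / sqrt(se_a + se_b)


def two_tailed_p_normal(t: float) -> float:
    """Two-tailed p-value using a normal approximation to the
    t distribution (an assumption suited to large samples only)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def significant(a: Sequence[float], b: Sequence[float], alpha: float = 0.05) -> bool:
    """True if the difference in means is significant at level alpha."""
    return two_tailed_p_normal(welch_t(a, b)) < alpha
```

The sign of the t statistic shows which group has the higher mean: a positive value means the first sample's mean is larger, which is how a metric such as likes per tweet can be significant while pointing in the opposite direction from the others.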