Is scraping Twitter an ethical approach?

Earlier today, I used software from the website to scrape over 3,000 tweets, posted in the last 7 days, which contained #privacy.  The information I was able to access included the full tweet, the date and time it was created, the user ID (and picture, if available) of the poster, and any retweets or replies.

This brings up an interesting question: although Twitter is ‘public’ and, in theory, is used on the understanding that anything posted can be viewed by anybody, anywhere, for any purpose, is that really what we think when we join in a conversation or tweet about a favourite topic?  And, if we do not have that knowledge at the forefront of our minds, is it then ethical for a researcher to use, analyse and, potentially, republish those tweets?

Michael Zimmer wrote about this issue in his 2010 blog post “Is it Ethical to Harvest Public Twitter Accounts without Consent?”.  He highlights that, although Twitter users know their tweets are public, they also know that millions of tweets are created every day, and that this will almost certainly mean their individual tweets are obscured (and are therefore, to all intents and purposes, at least semi-private).  In addition, even if users do accept the public nature of tweets, this does not mean they accept those tweets being used for in-depth analysis and research.

This is a complex area, with no simple answers.  I would suggest that a considerable proportion of the ethical question rests on the subject matter of the tweets: if it can reasonably be expected that a subject is innocuous, there are fewer questions of consent when using the contents for analysis.  If, however, the subject matter is sensitive, explicit consent may need to be considered, particularly for detailed research.  Rebecca Hogue has written of feeling ‘violated’ by a researcher’s analysis of the hashtag #bcsm (breast cancer social media).  She identifies several factors leading to her reaction:

  • #bcsm was used by a small group of women (approximately 80 individual posters), making it an intimate group
  • The group was formed around an emotive and difficult subject, and the members provided support for each other through their posts
  • The researcher did not attempt to contact the group before carrying out analysis of the tweets
  • The analysis carried out was extremely detailed, subjecting the tweets to psychological content analysis

The example given here highlights that, if a small group of Twitter users is analysed, it is perhaps more likely that those individuals would feel uncomfortable with their tweets being subjected to such analysis.

Due to the sheer number of tweets produced each day, however, most sample sizes are likely to stretch into the tens or even hundreds of thousands, posing a different, but “dramatic challenge in numerous ways, technically, politically, (and) also ethically” (Neuhaus and Webmoor, 2011, p.47).  It is clear that this ‘massified research’ cannot be approached with the aim of ‘explicit consent from all participants’.  Instead, consent should arguably be sought only if the subject is emotive or there is a possibility of participant identification.


Hogue, R. 2015. Ethics of researching twitter communities (#smsociety15).  30th July. Rebecca J. Hogue. [Online]. [Accessed 9th October 2016]. Available from:

Neuhaus, F. and Webmoor, T. 2011. Agile ethics for massified research and visualization.  Information, Communication and Society. 15(1), pp.43-65.

Zimmer, M. 2010. Is it Ethical to Harvest Public Twitter Accounts without Consent? 12th February. Internet Research Ethics. [Online]. [Accessed 9th October 2016]. Available from:


