New chapter out on text-mining and digital research

A chapter I co-authored on text-mining has (finally!) been published in the Sage book Innovations in Digital Research Methods.

Co-authored with Lawrence Ampofo, Ben O’Loughlin and Andrew Chadwick, our chapter covers researching social media, discussing key techniques for textual analysis and exploring some of the opportunities, limitations and ethical considerations.

Here is the full blurb from the book’s introductory chapter:

Text Mining and Social Media: When Quantitative Meets Qualitative and Software Meets People
Text mining has developed dramatically in recent years in its power to analyse and extract information from very large bodies of unstructured text. Its applications are motivated by a growing awareness that researchers need more powerful tools in order to benefit from rapidly increasing amounts of textual data being generated through the proliferation and unprecedented levels of take up of Web 2.0 technologies. Chief among these are blogs and social media (‘micro-blogs’), the latter exemplified by the rise of platforms such as Facebook and Twitter.

In this chapter, Ampofo, Collister, O’Loughlin and Chadwick explore how text mining using natural language processing (NLP) techniques can provide qualitative social researchers with powerful analytical tools for extracting information from this unstructured data, including harvesting data and analysing it in real time. They survey the range of research tools for text mining, broadly defined, available both in the academic and commercial spheres. People’s use of social media is seen by many researchers as providing an ideal source of data through which to monitor rapidly changing situations, hence, it has come to particular prominence during civil unrest (e.g., the so-called ‘Arab Spring’) and natural disasters (e.g., Hurricane Sandy). Beyond these inherently unpredictable phenomena, one of the most popular emerging applications of social media analysis lies in the tracking of public opinion through the application of NLP-based techniques such as sentiment analysis. These techniques have the capacity to generate results in real time, which offers intriguing possibilities for both commercial and academic research.

To illustrate the potential and challenges of using text mining techniques in social research, Ampofo, Collister, O’Loughlin and Chadwick present overviews of two projects. The first is a study of social media during the televised debates between political party leaders in the 2010 UK general election campaign. The second is also drawn from this election campaign and focuses on the reporting of accusations of bullying against then-Prime Minister Gordon Brown in the British media. The application of NLP-based text analysis tools to social data is still, in many respects, in its infancy. With this thought in mind, the authors conclude by outlining the ontological challenges (echoing the reservations that Elliot and Purdam set out in Chapter 3) and the technical challenges of mining text in social research settings. They note, in the case of social media, increasingly restrictive access policies, and they also consider the ethical implications of text mining used as a social research tool.

You can download the introduction from the Sage website, and a full pre-publication version of the chapter from Royal Holloway’s New Political Communications Unit.

And of course, you can buy a copy of the whole book from Amazon.