This book showcases the unique possibilities of corpus linguistic methodologies in engaging with and analysing language data from social media, surveying current approaches, and offering guidelines and best practices for doing language analysis.
The book provides an overview of how language in social media has been approached by linguists and non-linguists, before identifying the dataset requirements for investigating social media and the technical aspects of particular platforms that may influence the analysis, such as emoticons, retweets, and metadata. Sample Python code, along with general guidelines for using it, is provided to empower researchers to apply these techniques in their own work, supported by examples from three real-life case studies. Di Cristofaro highlights the full potential of these methodologies for analysing social media language data and the ways in which they might pave the way for future applications of data analysis and processing in corpus linguistics.
The book will be key reading for researchers in corpus linguistics, as well as for linguists and social scientists interested in data-driven analysis of social media.
Chapter 1 - Introduction
Chapter 2 - Social Media as Digital Research Data
Chapter 3 - Fundamentals of Corpus Linguistics
Chapter 4 - Imagining the Data: corpus design
Chapter 5 - Creating the Data: corpus collection
Chapter 6 - Case studies
Chapter 7 - Conclusion
Matteo Di Cristofaro is Lecturer in Digital Humanities and Corpus Linguistics at Università degli Studi di Modena e Reggio Emilia and Research Fellow in Corpus Linguistics at Università di Pisa.