IC2S2 2024 Tutorial

Exploring Emerging Social Media: Acquiring, Processing, and Visualizing Data with Python and OSoMe Web Tools

Time: Wed, July 17, 2024, 9:00 AM to 12:30 PM
Location: Irvine: Amado Recital Hall (D), University of Pennsylvania, Philadelphia, PA

Stay ahead in the evolving field of social media research with our tutorial!

With new platforms like Bluesky and Mastodon emerging and established ones restricting data access, we will explore tools and techniques developed by the Observatory on Social Media (OSoMe) to facilitate the analysis and understanding of these platforms. In the first part of this tutorial, you will learn how to use the OSoMe infrastructure to collect and analyze social media data. This includes tools such as Botometer-X, Coordiscope, Top FIBers, and more. Additionally, we will explore how to generate simulated data using SimSoM.

In the second part of the tutorial, you will gain hands-on experience in preprocessing and analyzing social media data using Python and Jupyter notebooks. We will provide a dataset obtained from Mastodon and Bluesky. You will learn various techniques for building networks from social media data, such as co-repost, co-hashtag, and co-URL networks. Several network science techniques will be covered too, including centrality measures and community detection. Finally, we will introduce text embedding techniques for analyzing posts using Sentence-BERT.
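As a rough illustration of the co-hashtag idea mentioned above, the sketch below links two users once for every distinct hashtag they both used and then computes a simple weighted degree for each user. It is a minimal, standard-library-only sketch and not the tutorial's own code: the toy `posts` data and the helper names `co_hashtag_edges` and `weighted_degree` are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy posts: (author, hashtags used in that post).
posts = [
    ("alice", ["python", "networks"]),
    ("bob", ["python", "dataviz"]),
    ("carol", ["networks", "dataviz"]),
    ("alice", ["dataviz"]),
    ("dave", ["cats"]),
]


def co_hashtag_edges(posts):
    """Return a Counter mapping (user_a, user_b) to the number of shared hashtags."""
    # Collect the set of users that used each hashtag (case-insensitive).
    users_by_tag = {}
    for author, tags in posts:
        for tag in set(tags):
            users_by_tag.setdefault(tag.lower(), set()).add(author)

    # Every pair of users sharing a hashtag gets one unit of edge weight.
    edges = Counter()
    for users in users_by_tag.values():
        for u, v in combinations(sorted(users), 2):
            edges[(u, v)] += 1
    return edges


def weighted_degree(edges):
    """Sum the weights of all edges touching each user (a basic centrality)."""
    strength = Counter()
    for (u, v), weight in edges.items():
        strength[u] += weight
        strength[v] += weight
    return strength


edges = co_hashtag_edges(posts)
strength = weighted_degree(edges)
for user, value in strength.most_common():
    print(user, value)
```

The same pattern should carry over to co-repost or co-URL networks by swapping the hashtag list for the reposted post IDs or shared URLs. Users who share nothing with anyone (here, `dave`) do not appear in the edge list.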

The third part of our tutorial focuses on visualization and narrative extraction. We will demonstrate how Helios-Web can be used to interactively visualize social media networks and embeddings. The tutorial will also cover techniques to extract narratives using natural language processing.

This tutorial is suited for anyone with an interest in social media analysis, encompassing a wide range of disciplines and expertise. While prior knowledge of Python is desirable, the tutorial is designed to be inclusive and accessible to those with varying levels of technical proficiency. The tutorial will be conducted in Python and use Jupyter notebooks preloaded with datasets and scripts. The OSoMe tools and materials will be open-source, available via the observatory’s website or GitHub, providing participants with a toolkit to kickstart or advance their social media research endeavors.

You can find the tutorial materials and instructions on how to set up your environment on our GitHub page: https://github.com/osome-iu/IC2S2_OSoMe_tutorial_2024.

Program

Preparation session (9:00 AM - 9:10 AM)

  • Quick overview of the tutorial.
  • Introduce the presenters.
  • Demonstration of how to set up the environment and API keys.
  • The organizers will help the attendees with the setup process.

Demonstration of the OSoMe tools and data acquisition (9:10 AM - 10:00 AM)

  • Using OSoMe tools for acquiring and analyzing data, including Botometer-X, the network tool, CoordiScope, Top FIBers, and OSoMe Mastodon Search.
  • Data acquisition from an emerging platform using OSoMe infrastructure.
  • Generating synthetic data with SimSoM, a minimal model that simulates information sharing on a social media platform.

Break and setup (10:00 AM - 10:10 AM)

Building Networks and Embeddings, Simple Analysis (10:10 AM - 11:00 AM)

  • Preprocessing, filtering, cleaning.
  • Constructing interaction networks (re-post, reply, mention).
  • Building co-hashtag and co-post networks.
  • Generating embeddings of posts using Sentence-BERT.
  • Employing similarity measures to find posts with similar content.
  • Illustrating a classification task using embeddings.
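The similarity step in this session can be pictured with cosine similarity over post embeddings. The sketch below is only an illustration under stated assumptions: the three-dimensional vectors in `embeddings` are hand-made stand-ins for what a sentence-embedding model would produce, and the helper names `cosine_similarity` and `most_similar` are hypothetical.

```python
import math

# Hypothetical toy embeddings; real ones would come from a sentence-embedding
# model and have hundreds of dimensions.
embeddings = {
    "post_a": [0.9, 0.1, 0.0],
    "post_b": [0.8, 0.2, 0.1],
    "post_c": [0.0, 0.1, 0.9],
    "post_d": [0.1, 0.9, 0.2],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query_id, embeddings, k=2):
    """Return the k posts most similar to query_id as (post_id, score) pairs."""
    query = embeddings[query_id]
    scores = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != query_id
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


ranking = most_similar("post_a", embeddings)
for post_id, score in ranking:
    print(post_id, round(score, 3))
```

For large collections, a vectorized or approximate nearest-neighbor search would likely replace the explicit loop, but the scoring rule stays the same.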

Break (11:00 AM - 11:10 AM)

Visualization of Networks and LLM Integration to Explore Narratives (11:10 AM - 12:00 PM)

  • Using Helios-Web for visualizing user networks and post embeddings.
  • Demonstrating the use of semantic axes to illustrate polarity.
  • Extracting and analyzing communities.
  • Identifying and discussing the narratives prevalent in each community, based on their content.
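One common way to build a semantic axis, which may differ from the approach used in the session, is to subtract the mean embedding of one pole from the mean embedding of the opposite pole and then score each post by its cosine similarity with that difference vector. The sketch below assumes this formulation. The pole vectors, the post vectors, and the helper names `mean_vector`, `semantic_axis`, and `cosine_similarity` are all hypothetical.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_axis(positive_pole, negative_pole):
    """Axis pointing from the negative pole's centroid to the positive pole's centroid."""
    pos = mean_vector(positive_pole)
    neg = mean_vector(negative_pole)
    return [p - n for p, n in zip(pos, neg)]


# Hypothetical embeddings of seed texts that define the two poles.
pole_a = [[1.0, 0.0, 0.2], [0.8, 0.1, 0.0]]
pole_b = [[-1.0, 0.1, 0.1], [-0.8, 0.0, 0.3]]
axis = semantic_axis(pole_a, pole_b)

# Hypothetical post embeddings to place along the axis.
post_vectors = {
    "p1": [0.7, 0.3, 0.1],
    "p2": [-0.6, 0.4, 0.2],
    "p3": [0.0, 1.0, 0.0],
}
scores = {pid: cosine_similarity(vec, axis) for pid, vec in post_vectors.items()}
for pid, score in sorted(scores.items(), key=lambda item: item[1]):
    print(pid, round(score, 3))
```

Positive scores lean toward `pole_a`, negative scores toward `pole_b`, and scores near zero indicate a post that is roughly orthogonal to the axis. Averaging these scores per community gives a rough polarity summary for each group.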

Q&A and discussion (12:00 PM - 12:30 PM)

Organizers

  • Filipi N. Silva, Research Scientist, Observatory on Social Media

  • Bao Tran Truong, Ph.D. candidate in Informatics, Observatory on Social Media

  • Wanying Zhao, Ph.D. candidate in Complex Networks and Systems, Indiana University

  • Kai-Cheng Yang, Postdoctoral research associate at the Network Science Institute, Northeastern University