Estimating Social Influence with Probabilistic Machine Learning

September 8, 2020 in Closed Projects Fall 2020

We are developing machine learning (ML) methods to understand how people influence each other’s behavior in social networks. For example, on Twitter, do users influence the content shared or posted by their followers? Methods that can identify such patterns of influence will play a role in studying, e.g., the spread of misinformation on social media sites.

Using social network analysis to improve access to mental health treatment in Brazil

September 8, 2020 in Open Projects Fall 2020

The proposed project would focus on analyzing quantitative data from a four-year NIMH-funded study entitled “Integrating evidence-based depression treatment in primary care: Tuberculosis (TB) in Brazil as a model” (PI: Sweetland, K01MH104514). The aim of the study was to assess whether social network analysis could be used to leverage the receptivity and connectivity of TB providers in a Brazilian public health system to accelerate the adoption (implementation) and diffusion (dissemination) of an evidence-based treatment for depression in a primary care network. Baseline receptivity was operationalized via six brief quantitative scales measuring mental health literacy, work self-efficacy, organizational climate, attitudes towards evidence-based practices, organizational readiness to change, and individual innovation thresholds. Connectivity was assessed by asking TB providers with whom they discuss difficult TB cases, to whom they give advice, and from whom they receive advice about such cases. Baseline receptivity and connectivity data were used to identify three pilot sites in which to train primary care providers to deliver evidence-based depression treatment for one year.
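As a rough illustration of how advice-seeking responses can be turned into a connectivity score, the sketch below counts how many distinct colleagues name each provider as a source of advice (in-degree). The description above does not say which network measure the study used, so in-degree is an assumption here, and the function name and provider labels are hypothetical.

```python
from collections import Counter


def advice_in_degree(nominations):
    """Count how many distinct colleagues seek advice from each provider.

    nominations: iterable of (asker, advisor) pairs, meaning that
    `asker` reports receiving advice from `advisor`.
    In-degree is an assumed measure, not necessarily the study's own.
    """
    counts = Counter()
    providers = set()
    # set() drops repeated pairs so each relationship counts once
    for asker, advisor in set(nominations):
        providers.update((asker, advisor))
        if asker != advisor:  # ignore self-nominations
            counts[advisor] += 1
    # Include providers nobody named, with a score of zero
    return {provider: counts[provider] for provider in providers}


# Hypothetical survey responses
responses = [("A", "B"), ("C", "B"), ("B", "D"), ("A", "B")]
scores = advice_in_degree(responses)
```

Providers with high scores are the ones many colleagues turn to, which is one plausible way to shortlist well-connected sites.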

Data For Good: The Consequences of Language Policing

January 15, 2020 in Project Spring 2020

Contestation over language use is an unavoidable feature of American politics. Yet, despite the rise of language policing on both sides of the aisle, we know surprisingly little about how ordinary citizens respond when in-group or out-group members invoke norms governing language use. Following Munger (2017), I would like to leverage social media platforms such as Reddit and Twitter to evaluate whether injunctions to use particular words (e.g., undocumented immigrant, Latinx) are effective. I plan to use an experimental approach in which, conditional on mentioning “illegal alien” or “Hispanic/Latino,” users are randomly assigned to receive a “language correction.” Outcome measures would include subsequent use of corrected terms, valence of user responses, and upvoting/liking/retweeting behavior.
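The assignment step described above can be sketched as follows: filter to users whose posts mention a trigger term, then randomly split them into a correction group and a control group. This is a minimal sketch, not the project's actual procedure; the substring matching, the even split, and all names (`TRIGGER_TERMS`, `assign_conditions`) are assumptions made for illustration.

```python
import random

# Trigger terms taken from the project description; simple
# case-insensitive substring matching is an assumption.
TRIGGER_TERMS = ("illegal alien", "hispanic", "latino")


def mentions_trigger(text):
    """Return True if the post contains any trigger term."""
    lowered = text.lower()
    return any(term in lowered for term in TRIGGER_TERMS)


def assign_conditions(posts, seed=0):
    """Randomly assign eligible users to 'correction' or 'control'.

    posts: list of (user_id, text) pairs.
    Only users with at least one triggering post are assigned.
    """
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible
    eligible = sorted({user for user, text in posts if mentions_trigger(text)})
    rng.shuffle(eligible)
    half = len(eligible) // 2
    return {
        user: ("correction" if index < half else "control")
        for index, user in enumerate(eligible)
    }
```

Assigning at the user level rather than the post level keeps a given user from ending up in both groups.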

Measuring Tax Evasion using Twitter Feeds

January 15, 2020 in Project Spring 2020, Project Summer 2020

Tax evasion is one of the main sources of informal economic activity and has drastic effects on a range of macroeconomic variables. However, the extent of tax evasion is difficult to measure directly. This project aims to develop a novel way of measuring aggregate tax evasion in national economies using Twitter feeds. To this end, using carefully selected keywords in different national languages, we will collect country- and regional-level data from Twitter feeds at different frequencies for a large cross section of economies and then construct a measure of tax evasion from the collected data. In addition to fully describing the collected dataset, the project will also examine the evolution of the constructed series.

Columbia Data Science Institute (DSI) Scholars Program