
Scripts for classifying political ads by ad tone (e.g., contrast, promote, attack)


Wesleyan-Media-Project/ad_tone


CREATIVE --- Ad Tone

Welcome! This repo contains scripts for classifying political ads by ad tone (e.g., contrast, promote, attack).

This repo is part of the Cross-platform Election Advertising Transparency Initiative (CREATIVE). CREATIVE is an academic research project that has the goal of providing the public with analysis tools for more transparency of political ads across online platforms. In particular, CREATIVE provides cross-platform integration and standardization of political ads collected from Google and Facebook. CREATIVE is a joint project of the Wesleyan Media Project (WMP) and the privacy-tech-lab at Wesleyan University.

To analyze the different dimensions of political ad transparency we have developed an analysis pipeline. The scripts in this repo are part of the Data Classification step in our pipeline.

A picture of the repo pipeline with this repo highlighted

Table of Contents

1. Introduction
2. Data
3. Setup
4. Thank you!

1. Introduction

This repository contains code that generates two variables. The first, ad tone mention-based, codes each ad as 'contrast', 'promote', or 'attack'. It takes the outputs of the entity linking 2022 repo as input. The coding depends on who is mentioned in the ad:

  • If an ad from a candidate mentions the candidate in the ad and not their opponent: Promote
  • If an ad from a candidate mentions their opponent in the ad and not themselves: Attack
  • If an ad from a candidate mentions both the candidate and their opponent in the ad: Contrast

The second, ad tone constructed, uses the results of the mention-based classification together with results from the ABSA and race of focus repos to code ads as 'contrast', 'promote', or 'attack'. To visualize our decision-making process for ad tone constructed, consult this diagram. Also see below for more details.

Data output by the scripts is in the format ad_id,ad_tone. An example row looks like x1949505221867086,Promote.
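As a quick illustration of that format, the following base R sketch writes and re-reads a small file with the two columns. The second ad ID, the temporary file path, and the assumption that the file has a header row are hypothetical; the actual scripts choose their own file names and locations.

```r
# Hypothetical rows in the ad_id,ad_tone format described above.
# Only the first ad_id comes from the README; the second is made up.
ad_tone <- data.frame(
  ad_id = c("x1949505221867086", "x0000000000000001"),
  ad_tone = c("Promote", "Attack"),
  stringsAsFactors = FALSE
)

# Write to a temporary file (the real scripts use their own paths).
out_path <- tempfile(fileext = ".csv")
write.csv(ad_tone, out_path, row.names = FALSE, quote = FALSE)

# Read it back, keeping ad_id as a character column.
check <- read.csv(out_path, colClasses = "character")

# Count ads per tone category.
table(check$ad_tone)
```

Reading ad_id as character avoids any accidental numeric conversion of IDs.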

Diagram showing the process by which ad tone constructed is gotten

This repo contains eight R scripts: three that deal with ad tone constructed and five that deal with ad tone mention-based. Of the five mention-based scripts, those related to non-2022 data are in a folder called "ad_tone_mentionbased", and those related to Facebook and Google 2022 data are in another folder called "ad_tone_mentionbased_2022". Thus, if you only want to work on the 2022 data, you do not have to run anything in the "ad_tone_mentionbased" folder. All of the ad tone constructed scripts are in "ad_tone_constructed", regardless of whether they relate to 2022 or non-2022 data.

2. Data

The code in this repository creates two variables, ad tone mention-based and ad tone constructed. Results are saved in csv format in the data folder.

2.1 Ad tone mention-based

Mention-based (or reference-based) ad tone codes an ad as 'Contrast' if both the candidate and their opponent are mentioned in the ad (either in the text or by appearing in an image), 'Promote' if only the candidate is mentioned, and 'Attack' if only the opponent is mentioned. If no candidate is mentioned, the ad is coded as 'Support' (given that the basic purpose of an ad is to further the preferred candidate's electoral prospects). This variable is available for the candidate ads in the 1.4m dataset.
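The rules above can be sketched as a small vectorized function in base R. This is an illustration only, not the code used in the repo's scripts; the function name and its two logical arguments are hypothetical, and the sketch assumes the mention flags have already been derived from the entity linking results.

```r
# Hypothetical helper: both arguments are logical vectors with one entry per ad.
classify_mention_tone <- function(mentions_candidate, mentions_opponent) {
  ifelse(mentions_candidate & mentions_opponent, "Contrast",
    ifelse(mentions_candidate, "Promote",
      ifelse(mentions_opponent, "Attack", "Support")))
}

# One ad for each of the four possible combinations.
classify_mention_tone(
  mentions_candidate = c(TRUE, TRUE, FALSE, FALSE),
  mentions_opponent  = c(TRUE, FALSE, TRUE, FALSE)
)
# Returns "Contrast" "Promote" "Attack" "Support"
```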

2.2 Ad tone constructed

The construction of ad tone is based on this flowchart.

Diagram showing the process by which ad tone constructed is gotten

When traditional mention-based ad tone is available, we use that; otherwise we sum over ABSA results (also using race of focus). As a result, the constructed variable is available for a larger number of ads than the mention-based one; the remaining ads have no ad tone.
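The fallback order described here can be sketched in base R as follows. This is a simplified illustration under stated assumptions: all column names and ad IDs are hypothetical, and the ABSA-derived tone is assumed to be precomputed, since the summing over ABSA results and the use of race of focus follow the flowchart and are not reproduced here.

```r
# Hypothetical input: one row per ad. NA means no result is available.
ads <- data.frame(
  ad_id = c("a1", "a2", "a3"),
  ad_tone_mentionbased = c("Promote", NA, NA),
  ad_tone_absa = c("Attack", "Contrast", NA),  # assumed precomputed from ABSA
  stringsAsFactors = FALSE
)

# Prefer the mention-based tone; fall back to the ABSA-derived tone otherwise.
ads$ad_tone_constructed <- ifelse(
  is.na(ads$ad_tone_mentionbased),
  ads$ad_tone_absa,
  ads$ad_tone_mentionbased
)

ads$ad_tone_constructed
# Returns "Promote" "Contrast" NA
```

In this sketch the third ad has neither source available, so it ends up with no ad tone.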

3. Setup

3.1 Install R and Packages

First, make sure you have R installed. While R can be run from the terminal, many people find it much easier to use RStudio along with R. A link to this program can be found here.

The scripts are tested on R 4.2, 4.3, and 4.4.

Next, make sure you have the following packages installed in R. The exact version we used of each package is listed in the requirements_r.txt file; these are the versions we tested our scripts on, though more recent versions may also work:

  • data.table
  • stringr
  • purrr
  • dplyr
  • tidyr
  • R.utils

3.2 Input Files

To use the scripts in this repo, you will need outputs from a number of other repos. Which repositories are needed depends on which script you are executing.

3.2.1 Mention-based Scripts

All the scripts for ad tone mention-based, including those in the ad_tone_mentionbased_2022 folder, require datasets. In addition, depending on the specific script, various other repos must also be downloaded. Note that for the files hosted on Figshare, you will need to fill out a form before getting immediate access to the data. Specifically:

Scripts in the ad_tone_mentionbased folder, which were not used for 2022 election ads data production, are described below. They are legacy scripts that served a similar purpose for our 2020 TV and online ads data, and they are preserved here for internal use.

Some input files for mention-based scripts require the metadata (e.g., var1 files) for Facebook or Google. These are too large to be uploaded to GitHub. You can download them through our Figshare page:

Pre-2022 data production:

3.2.2 Constructed Scripts

Looking at the scripts within the ad_tone_constructed folder, they require:

Legacy scripts for pre-2022 data production, preserved here for internal use:

In addition, all scripts within the ad_tone_constructed folder require the ad tone mention-based results (see above).

3.3 Run Files

Now, depending on which variable and which data you are interested in analyzing, you can run the corresponding script. For example, to do the mention-based classification for Facebook 2022 data, run ad_tone_mentionbased_fb2022.R.

Running the scripts through the terminal would look like this:

cd ad_tone_mentionbased_2022
Rscript ad_tone_mentionbased_fb2022.R

Alternatively, the scripts can be run through the RStudio interface.

4. Thank You

We would like to thank our supporters!


This material is based upon work supported by the National Science Foundation under Grant Numbers 2235006, 2235007, and 2235008. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

National Science Foundation Logo

The Cross-Platform Election Advertising Transparency Initiative (CREATIVE) is a joint infrastructure project of the Wesleyan Media Project and privacy-tech-lab at Wesleyan University in Connecticut.

CREATIVE Logo

Wesleyan Media Project logo

privacy-tech-lab logo
