Background:
Foreign actors, both independent and state-sponsored, are increasingly using the Internet as a tool of subversion to foment distrust in American institutions. The recent Director of National Intelligence report asserts the U.S. intelligence community's confidence that Russian President Vladimir Putin ordered an influence campaign in 2016 intended to undermine public faith in the U.S. democratic process.[i] Much of this influence campaign took place on social media platforms, leading to congressional inquiries and requests for increased transparency. In response to those requests, Twitter has released a substantial dataset related to this alleged foreign interference in political conversations on the platform, in an effort to improve, through research, our ability to detect, understand, and neutralize disinformation campaigns as quickly and robustly as technically possible.
As researcher Jonathan Albright notes, "The state of the modern information ecosystem — including our dilemma addressing misinformation, propaganda, security, and hate speech — are manifestations of intrinsic misalignments within an increasingly hybridized global information system."[ii] As we work to find solutions to these new challenges, we must better understand the movements of the content that comprises influence operations and the behaviors of those who propagate it.
Of particular interest are those cases in which Russian trolls succeeded in infiltrating U.S. news and establishing direct engagement with American Twitter users. One such account, with the username @cassishere, posted a photo of a Putin banner on the Manhattan Bridge that earned a photo credit in the New York Daily News.[iii] There are also allegations that Michael Flynn followed Russian troll accounts and pushed their messages in the days leading up to the 2016 election.[iv] Twitter itself even informed 1.4 million people that they had interacted with Russian trolls.[v] By looking at the behaviors of these Twitter accounts in closer detail, we can determine whether patterns exist that might help us defend against foreign influence operations in the future.
Two researchers at Clemson University have already begun to analyze the data. To date, they have divided the Russian Twitter trolls into five distinct categories: Right Troll, Left Troll, News Feed, Hashtag Gamer, and Fearmonger. My SNA will build on their research.
Research Question: How do identified Russian trolls engage with mainstream U.S. media accounts on Twitter?
- Sub-Question: Based on an identified number of cases (TBD[vi]) in which trolls successfully infiltrated mainstream news coverage (i.e., were quoted in a news article), do "successful" trolls exhibit unique characteristics within the troll network?
- Sub-Question: [Based on existing datasets or previous SNA studies] Does the Russian Twitter troll network mirror mainstream news media Twitter networks, "normal" Twitter user networks, and/or known terrorist networks on Twitter?
Why SNA?
Social network analysis is uniquely positioned to begin exploring new questions about influence operations via social media. While this is an area of increasing academic and political inquiry, there is very little established literature, due to the contemporary nature of the topic and the difficulty of studying it. While any political science research about trust, influence, and behavior faces many limitations (see below), we can use SNA to identify patterns in the behavior of these troll accounts. Such patterns can help social media companies decide how to handle such accounts on their platforms, and can help the wider public learn to defend against political manipulation online.
Hypothesis:
I predict that Russian trolls engaged directly with mainstream news media accounts on a regular basis in order to push content to those entities. However, I believe the most successful trolls worked through third-party actors ("real" Twitter users) in order to erase their trace and make the information appear more believable. I believe the Russian trolls' ultimate goal is to sway the beliefs of large portions of the American population by having their message picked up by trusted mainstream news channels. For the "successful" cases, I hypothesize that the strategies used varied widely, a deliberate attempt by Russian trolls to obfuscate their activity and avoid pattern recognition. I predict that the "successful" cases pushed content that was more closely aligned with public opinion or other sources, suggesting that trolls are more likely to succeed if they stick closer to reality. Finally, I hypothesize that Russian Twitter troll networks are structurally and visually distinct from known networks (of mainstream media, "normal" Twitter users, or even terrorists) when looking at patterns of interactions (both mentions and retweets). I believe this comparison would give us the most useful insight into potential ways to identify troll accounts in the future.
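To make that last hypothesis testable, the visual comparison can be backed by simple whole-network statistics. Below is a minimal sketch using Python's networkx, assuming two pre-built directed interaction graphs; troll_net and baseline_net are hypothetical placeholders for the troll network and whichever comparison network is in hand.

```python
import networkx as nx

def structural_profile(g: nx.DiGraph) -> dict:
    """Summarize a directed interaction (mention/retweet) network."""
    return {
        "nodes": g.number_of_nodes(),
        "edges": g.number_of_edges(),
        "density": nx.density(g),
        "reciprocity": nx.reciprocity(g),  # share of mutual ties
        "avg_clustering": nx.average_clustering(g.to_undirected()),
    }

# Compare the troll network against each baseline:
# print(structural_profile(troll_net))
# print(structural_profile(baseline_net))
```

If troll networks really are distinct, these profiles should separate them from the baselines even before any visualization.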
Methodology:
For my SNA, I plan to identify a list of mainstream news media Twitter accounts, based on the news organizations Americans most commonly turn to for their news.[vii] I plan to use my larger dataset to identify "levels" of interaction: general tweets (perhaps containing certain keywords, yet to be determined); "mentions" that include either another identified Russian troll or one of the established mainstream news accounts I have identified; and "retweets" of any of the aforementioned users' tweets (a sketch of this extraction step follows below). I will also examine cohesion measures of this interaction network as a whole, which can serve as a baseline for the comparisons described above.
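As a concrete sketch of the extraction step, the snippet below pulls mention and retweet edges out of the tweet text in Python. The column names (author, content) follow the Clemson CSV layout as I understand it and should be verified against the data; the regular expressions encode Twitter's handle conventions.

```python
import re
import pandas as pd

MENTION = re.compile(r"@(\w{1,15})")      # Twitter handles are at most 15 characters
RETWEET = re.compile(r"^RT @(\w{1,15})")  # the classic "RT @user" convention

def interaction_edges(df: pd.DataFrame) -> pd.DataFrame:
    """Return one (source, target, kind) row per mention or retweet.

    Note: the retweeted handle also matches the mention pattern, so a
    retweet yields both a "retweet" and a "mention" row; deduplicate
    downstream if the levels should be mutually exclusive.
    """
    rows = []
    for author, text in zip(df["author"], df["content"].fillna("")):
        rt = RETWEET.match(text)
        if rt:
            rows.append((str(author).lower(), rt.group(1).lower(), "retweet"))
        for handle in MENTION.findall(text):
            rows.append((str(author).lower(), handle.lower(), "mention"))
    return pd.DataFrame(rows, columns=["source", "target", "kind"])
```

The resulting edge list can be loaded directly into a graph (e.g., with networkx's from_pandas_edgelist) for the cohesion measures above.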
I also plan to look at the ego networks of a few specific Twitter handles that have been identified in prior research and investigation as having "successfully" infiltrated U.S. news or interacted with public figures (and are therefore assumed to have wider influence).
Data Collection:
The
initial dataset is publicly available on GitHub
thanks to Clemson University researchers Darren Linvill and Patrick Warren. Linvill
and Warren gathered this data using custom searches on Social Studio, a tool
owned by Salesforce and contracted by Clemson’s Social Media
Listening Center. The directory contains nearly 3 million tweets from
Twitter handles that were found to be connected to the Internet Research
Agency, a Russian “troll factory” that was implicated in special counsel Robert
Mueller’s February 2018 indictment.
Twitter provided Congress with 2,752 handles that were connected to the IRA in
November 2017, and added an additional 946 handles in June 2018 (at which point
they also removed 19 handles from the original list).[viii] The majority of the
tweets in this data set were posted between 2015 and 2017, though I may limit
this timeframe as necessary upon further review of the data. The full data file
includes 2,973,371 tweets from 2,848 Twitter handles.
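A minimal loading sketch in Python follows; the file pattern (IRAhandle_tweets_*.csv) and column names (publish_date, author) reflect the GitHub repository's layout as I understand it and should be confirmed against the actual files.

```python
import glob
import pandas as pd

# Load every CSV in the cloned repository into one DataFrame.
frames = [pd.read_csv(path, low_memory=False)
          for path in glob.glob("russian-troll-tweets/IRAhandle_tweets_*.csv")]
tweets = pd.concat(frames, ignore_index=True)

# Optionally restrict to the 2015-2017 window discussed above.
tweets["publish_date"] = pd.to_datetime(tweets["publish_date"], errors="coerce")
tweets = tweets[tweets["publish_date"].between("2015-01-01", "2017-12-31")]

print(len(tweets), "tweets from", tweets["author"].nunique(), "handles")
```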
In an exciting update, on Wednesday, October 17, 2018, Twitter released an even more substantial archive of tweets and media that "resulted from potentially state-backed information operations" on the platform. The dataset includes information from 3,841 accounts believed to be connected to the Russian Internet Research Agency and 770 accounts believed to originate in Iran. These datasets include all public, non-deleted tweets and media (e.g., images and videos) from accounts believed to be connected to state-backed information operations: more than 10 million tweets and more than 2 million images, GIFs, videos, and Periscope broadcasts.
Limitations:
In Twitter's recent release, some account-specific information is hashed for accounts with fewer than 5,000 followers in order to protect user privacy. I do not expect this to affect my SNA, since I do not plan to analyze normal user accounts beyond the Russian trolls themselves. Also missing from this dataset is the reciprocal Twitter data of mainstream media accounts, which would provide crucial evidence as to how exactly news organizations engaged, if at all, with Russian trolls on Twitter. Without this piece, influence can only be inferred from the limited data available.
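If it does become necessary to separate hashed from unhashed accounts in that release, a hedged sketch follows; the file name (ira_tweets_csv_hashed.csv) and column names (user_screen_name, follower_count) are my assumptions based on Twitter's announcement and should be checked against the accompanying README.

```python
import pandas as pd

ira = pd.read_csv("ira_tweets_csv_hashed.csv", low_memory=False)

# Per Twitter, accounts with at least 5,000 followers keep their real
# screen names; smaller accounts appear only as hashed identifiers.
unhashed = ira[ira["follower_count"] >= 5000]
print(unhashed["user_screen_name"].nunique(), "accounts with unhashed names")
```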
Most importantly, the impact of this information is still largely unknown, and incredibly difficult to measure. Despite knowing that these Russian troll accounts existed (and likely still exist, in different forms, today), we do not and cannot know whether, and to what extent, their presence influenced American beliefs, much less American voter behavior in the 2016 election. For that reason, and to limit the scope of this project, I will not look at user interaction with these Russian trolls. Rather, I will focus on the behaviors of Russian trolls and their interactions with mainstream news media. This analysis will be admittedly one-way in nature, but I can use external sources to corroborate the findings of my network analysis.
One area of future research would be to track, in real time (using NodeXL), the behavior patterns of other networks of Twitter users (e.g., politicians, news media accounts, or average users engaging with a specific trending topic) surrounding specific events, namely the upcoming 2018 midterm elections. Since Twitter data is constantly changing and users are able to delete tweets, the best way to analyze such data is to use Twitter's API in real time (a sketch follows below). Even this strategy is limited, though, because it only captures public accounts, not those whose owners have made them private. However, public, "verified" accounts are a good representation of potential influence due to their high follower counts. Another area of future research would be to assess the traction of the trolls identified as influential by searching Media Cloud, an open-source tool that aggregates, indexes, and analyzes online news sources, for references to those usernames.
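For the real-time piece, a hedged sketch outside of NodeXL could use the tweepy 3.x streaming interface in Python; the credentials and the track term are placeholders, and this only illustrates the general shape of the approach.

```python
import tweepy

# Placeholder credentials: register an application at developer.twitter.com.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")

class EdgeListener(tweepy.StreamListener):
    """Print a who-retweets-whom edge for every retweet that arrives."""
    def on_status(self, status):
        if hasattr(status, "retweeted_status"):
            print(status.user.screen_name, "->",
                  status.retweeted_status.user.screen_name)

stream = tweepy.Stream(auth=auth, listener=EdgeListener())
stream.filter(track=["#Midterms2018"], languages=["en"])  # blocks and streams live tweets
```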
[i] Office of the Director of National Intelligence, Intelligence Community Assessment: Assessing Russian Activities and Intentions in Recent US Elections, 6 January 2017.
[ii] Albright, Jonathan. "Web no.point.0: rise of the Splintr.net," Medium, 17 October 2018. <https://medium.com/tow-center/web-no-point-0-rise-of-the-splintr-net-d45869aa1b8>
[iii] Shane, Scott and Mark Mazzetti. "The Plot to Subvert an Election: Unraveling the Russia Story So Far," New York Times, 20 September 2018. <https://www.nytimes.com/interactive/2018/09/20/us/politics/russia-interference-election-trump-clinton.html>
[iv] Collins, Ben and Kevin Poulsen. "Michael Flynn Followed Russian Troll Accounts, Pushed Their Messages in Days Before Election," Daily Beast, 1 November 2017. <https://www.thedailybeast.com/michael-flynn-followed-russian-troll-accounts-pushed-their-messages-in-days-before-election>
[v] Twitter, "Update on Twitter's Review of the 2016 U.S. Election," Twitter Blog, January 2018. <https://blog.twitter.com/official/en_us/topics/company/2018/2016-election-update.html>
[vi] Shane, Scott and Mark Mazzetti. "The Plot to Subvert an Election: Unraveling the Russia Story So Far," New York Times, 20 September 2018. <https://www.nytimes.com/interactive/2018/09/20/us/politics/russia-interference-election-trump-clinton.html>
[vii] I plan to use the list of sources used by Pew Research Center in its 2014 study of Political Polarization and Media Habits: <http://www.pewresearch.org/wp-content/uploads/sites/8/2014/10/Political-Polarization-and-Media-Habits-FINAL-REPORT-7-27-15.pdf>
[viii] U.S. House Permanent Select Committee on Intelligence, Democratic Office. <https://democrats-intelligence.house.gov/news/documentsingle.aspx?DocumentID=396>
1 comment:
There's a lot to unpack from what you've written, but perhaps the best place to start is with your main Q. It's a How Q, which is fine, as you will have accepted (and have us, the readers, accept) that engagement is a given. The problem is that you then discuss a bunch of ways that you can go about visualizing and analyzing the problem, and it becomes difficult to understand how they combine to inductively address your How Q.
If you haven't already, I suggest you get together with Arik B. as soon as you can to see what's feasible, given the large amount of data from the Clemson dump and the October 17 release.