Background
The Internet has amplified the ability of individuals to connect across
the globe. As Internet technology
proliferated, criminal organizations, terrorist networks, and other threat
actors recognized its usefulness and began leveraging its benefits (i.e.,
criminal and terrorist organizations became “globalized”). One of the most useful of these benefits is the
anonymity the Internet provides, and users of TOR (The
Onion Router) software can increase their anonymity still further.
TOR anonymizes the browsing habits of its users by encrypting their
network traffic, using specific TOR nodes to transmit the encrypted data, and
following random paths to the desired servers.
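The three steps above (encrypt, relay through nodes, follow a random path) can be illustrated with a toy sketch. This is not TOR's actual protocol or cryptography: the XOR "cipher", the relay names, and the functions `build_circuit`, `wrap`, and `route` are all assumptions made up for illustration. The point is only that each relay peels one layer and learns nothing except the next hop.

```python
import random

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy reversible stand-in for encryption (XOR with a repeating key; NOT secure)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def build_circuit(keys: dict, hops: int = 3) -> list:
    """Pick a random path of distinct relays (hypothetical relay names)."""
    return random.sample(sorted(keys), hops)

def wrap(message: bytes, path: list, keys: dict) -> bytes:
    """Add one layer per relay, starting with the layer for the last relay.

    Each layer hides the name of the next hop plus the remaining onion.
    """
    onion, next_hop = message, "destination"
    for name in reversed(path):
        onion = xor_bytes(next_hop.encode() + b"|" + onion, keys[name])
        next_hop = name
    return onion

def route(onion: bytes, first_hop: str, keys: dict):
    """Simulate each relay peeling its own layer and forwarding the rest."""
    current, visited = first_hop, []
    while current != "destination":
        visited.append(current)
        plain = xor_bytes(onion, keys[current])
        next_hop, _, onion = plain.partition(b"|")
        current = next_hop.decode()
    return onion, visited

# Hypothetical relays, each with its own random key.
relay_keys = {f"relay{i}": random.randbytes(16) for i in range(5)}
path = build_circuit(relay_keys)
delivered, visited = route(wrap(b"hello", path, relay_keys), path[0], relay_keys)
print(path, delivered)
```

In this sketch only the first relay sees who sent the onion and only the last relay sees the payload, which is the intuition behind the anonymity described above.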
Users also benefit from a type of “herd immunity” (i.e., the more TOR
users there are in a given area, the more difficult it is to “de-anonymize”
any one of them). Perhaps more interesting, TOR
grants access to hidden parts of the Deep Web, the set of websites that search engines
cannot find. Deeper still is the Dark
Web, a subsection of the Deep Web that is intentionally hidden and hosts
illegal or illicit activity.
Malicious actors (e.g., criminal organizations and terrorist/extremist
organizations) frequent the Dark Web to purchase illicit goods as well as
to communicate. This communication
occurs in forums and Internet Relay Chat (IRC) channels. IRC channels frequently require specific knowledge to find
and passwords to enter. Forums, by contrast, are often open to anyone who
can find them, because such forums are necessary to spread the various
“messages” of the groups present.
Utilizing the process that Elizabeth Philips, Jason Nurse, Michael
Goldsmith, and Sadie Creese laid out in their paper, “Applying Social Network Analysis to Security,” this analysis will explore how social network analysis
techniques can provide insights into Dark Web networks. While previous studies have qualitatively
analyzed Dark Web forums, or used relatively small datasets, this study will utilize
a dataset of over 2.5 million posts collected over 13 years across
multiple forums.
TOR offers strong anonymity, but not complete anonymity. Numerous techniques exist to de-anonymize TOR
users (e.g., monitoring exit nodes), but law enforcement and intelligence
agencies do not have the resources to de-anonymize every potential actor. Social network analysis provides a tool to
focus the efforts of such agencies on disrupting extremist networks.
Research Question
I will conduct a social network analysis of Dark Web forum message and
posting metadata utilizing a dataset compiled from various English-language
Dark Web extremist forums. Each dataset
spans a number of years and contains various numbers of members and postings,
but I have compiled them into a single dataset that spans 13 years and contains
over 2.5 million unique posts/messages.
I want to analyze the following aspects of the network:
1. Can groups and leaders (i.e., hierarchy) be predicted or discovered based solely on metadata?
   a. If leaders can be discovered, how connected are those leaders? Are they cross-forum, or are they leaders of only one forum?
   b. Are posters united, or do their beliefs and posts diverge?
2. How much interaction is there between individuals and groups across the different forums?
   a. Do individuals or groups remain on a handful of forums, or do they spread across a wider network?
   b. Do Dark Web criminal organizations interact, or do they establish “turf”? How connected are the various groups?
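As a rough illustration of how the cross-forum questions could be measured, the sketch below computes which forums each member posts on and how much membership any two forums share (Jaccard similarity). It assumes the metadata can be reduced to (forum, member) pairs and that the same member name on two forums is the same person; both are assumptions, and the forum and member names are placeholders.

```python
from collections import defaultdict
from itertools import combinations

def forums_by_member(posts):
    """Map each member to the set of forums they have posted on."""
    seen = defaultdict(set)
    for forum, member in posts:
        seen[member].add(forum)
    return dict(seen)

def forum_overlap(posts):
    """Jaccard similarity of membership for every pair of forums.

    0.0 means no shared members ("turf"); 1.0 means identical membership.
    """
    members = defaultdict(set)
    for forum, member in posts:
        members[forum].add(member)
    overlap = {}
    for a, b in combinations(sorted(members), 2):
        shared = members[a] & members[b]
        union = members[a] | members[b]
        overlap[(a, b)] = len(shared) / len(union)
    return overlap

# Placeholder records: (forum, member)
sample_posts = [
    ("forum_A", "u1"), ("forum_A", "u2"),
    ("forum_B", "u2"), ("forum_B", "u3"),
    ("forum_C", "u4"),
]
print(forums_by_member(sample_posts))
print(forum_overlap(sample_posts))
```

Low overlap between most forum pairs would be consistent with groups establishing “turf,” while a small set of members appearing on many forums would point to a wider network.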
Theory & Hypotheses
I expect to be able to infer a significant amount of information about
individuals from their friend groups and communication
habits. Furthermore, I expect that I
will be able to determine the hierarchy of Dark Web posting groups using
social network analysis techniques (e.g., distinguishing “broadcasters” from
“sinks”). I predict that leaders
will remain on specific forums, but that lower-level individuals will act as
“bridges” connecting forums and groups together. Moreover, I predict that groups will
distinguish themselves using specific language (e.g., slang) that denotes
group membership.
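One simple way to separate “broadcasters” from “sinks” is to compare each member's out-degree and in-degree in a directed network. The sketch below is a minimal version of that idea, not the method of the cited paper: it assumes an edge (a, b) means member a directed a message at member b, and the `ratio` threshold of 2.0 is an arbitrary illustrative choice.

```python
from collections import Counter

def degree_roles(edges, ratio=2.0):
    """Label each node by comparing out-degree with in-degree.

    edges: iterable of (source, target) pairs in a directed network.
    ratio: assumed threshold; a node is a "broadcaster" if it sends at least
           `ratio` times as many ties as it receives, and a "sink" if the
           reverse holds. Everything else is labeled "mixed".
    """
    edges = list(edges)
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    roles = {}
    for node in set(out_deg) | set(in_deg):
        sent, received = out_deg[node], in_deg[node]
        if sent >= ratio * max(received, 1):
            roles[node] = "broadcaster"
        elif received >= ratio * max(sent, 1):
            roles[node] = "sink"
        else:
            roles[node] = "mixed"
    return roles

# Placeholder edge list
sample_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "a"), ("c", "d"), ("e", "d")]
print(degree_roles(sample_edges))
```

Degree counts alone are a coarse signal; measures such as betweenness centrality would likely be needed to test the prediction about “bridges.”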
Data Collection
I will utilize a variety of datasets created by Arizona State University’s Artificial Intelligence Laboratory. These datasets are compiled from various
English-language Dark Web extremist forums (e.g., Islamic Awakening, Islamic
Network, Turn to Islam). Each
dataset covers a different period: the shortest covers 2 years, while the
longest covers 8 years. Taken together,
the dataset comprises seven different Dark Web forums, spans 13 years,
and contains over 48,000 members and over 2.5 million posts. Using UCINET, a software package for
social network analysis, I will map and analyze the connections between
posters, groups, and the forums themselves.
The social network data gathered will be directed and one-mode. Nodes will correspond to forum members. I intend to analyze content from 2000-2013 to
track how the Dark Web community groups transformed over time. Finally, I will identify sub-groups
and the leaders of those sub-groups utilizing the attribute data (i.e.,
metadata) contained within the dataset (e.g., post date, member name).
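To make the directed, one-mode structure concrete, here is a minimal sketch of how post metadata might be turned into member-to-member ties, grouped by year so that change over time can be tracked. The record layout (thread ID, member name, post date) and the rule that each reply creates a tie from the replier to the thread starter are assumptions for illustration; the actual dataset fields and tie definition may differ.

```python
from collections import defaultdict
from datetime import date

def reply_edges_by_year(posts):
    """Build weighted directed edges (replier -> thread starter) per year.

    posts: iterable of (thread_id, member, post_date) tuples (assumed layout).
    Returns {year: {(source, target): number_of_replies}}.
    """
    threads = defaultdict(list)
    for thread_id, member, posted in posts:
        threads[thread_id].append((posted, member))

    edges = defaultdict(lambda: defaultdict(int))
    for thread_posts in threads.values():
        thread_posts.sort()                # earliest post first
        starter = thread_posts[0][1]       # assumed: first poster started the thread
        for posted, member in thread_posts[1:]:
            if member != starter:          # ignore self-replies
                edges[posted.year][(member, starter)] += 1
    return {year: dict(pairs) for year, pairs in edges.items()}

# Placeholder records: (thread_id, member, post_date)
sample_posts = [
    (1, "u1", date(2005, 1, 1)),
    (1, "u2", date(2005, 1, 2)),
    (1, "u3", date(2006, 3, 1)),
    (2, "u2", date(2006, 5, 1)),
    (2, "u1", date(2006, 5, 2)),
]
print(reply_edges_by_year(sample_posts))
```

Each yearly edge list could then be exported as a matrix or edge list for the network analysis software, with one network per year.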
Conclusion
The Dark Web has been a powerful tool that has connected criminal
organizations, terrorist networks, and illicit actors worldwide. Furthermore, given that the technology used
to access the Dark Web (i.e., TOR—The Onion Router) provides substantial,
though not complete, anonymity, these actors can use the Dark Web to avoid
surveillance while simultaneously conducting their business in “the open.” If the metadata scraped from Dark Web forums
can provide insight into the organizational structure and leadership behind
these shadow groups, then the effectiveness of law enforcement and intelligence
organizations will be multiplied. Analyzing
the social networks surrounding these forums should allow us to infer
how communication habits reflect hierarchies
and social organization.
1 comment:
Very nice job, Mr. Dark Web. As discussed in class, it needs a Key Question to focus data collection and analysis, and you could also expand a bit more on which net measures you will use, and what results they might yield. I hope that you will actually do this work as all or part of a capstone, as there's some real value to be gained from it. Also, if your job interests lie in this direction, it would look very good in the Experience section of your CV.
BTW, if you are willing to share the data set that you compiled, I'm sure others would be interested.