Discovering the network behind the founder’s myth
Bin Feng Zheng (Currently not taking the second module)
Background
Every great business has a great story and a founder who speaks to their values. We have come to learn a great deal about these founders, specifically the personal traits that led them on their journey. I am interested in the social network behind the founder’s myth. I propose that success is a product of the social network around you, and that the right social network can predict success.
Taking a step back, it is important to analyze the genesis point at which business ideas become products and products become wildly successful. My specific focus will be on the commercial drone sector based on the West Coast of the United States. The commercial drone industry is at a critical moment in its young history. Hardware, software, regulation and capital are all aligning to potentially make drones the enabling technology of the future. It is also a wide-open industry with no dominant players. Thus, the networks within the industry may well still be forming. I want to focus on the West Coast to provide a geographical limit, but also to recognize the considerable advantage of access to Silicon Valley and venture capital.
Social Network Question/Research Question
This is really a test case about social networks around ideas. I want to see if social networks have an impact on ideas becoming products in the business world. One approach would be to look at historical data to tease out connections. However, I think it would be much more interesting to examine networks that are still forming, to project and forecast where the industry is heading in terms of its social relations. In this case, the context is the commercial and business drone sector on the West Coast.
The social network questions I will be asking are the following:
1. Does a network exist? Can we tease one out?
2. What are the attributes of those in the industry? How are they connecting?
3. Who are the established and emerging leaders?
4. Which companies are best positioned to leverage the social networks of established and emerging leaders?
5. Do extra-company networks exist?
Additionally, this is an
approach to researching an industry. It
could be important in terms of corporate intelligence. For the novice, it is a
fun way to learn what is going on.
Hypothesis
An analogy for what we’re looking for here is the rolodex of contacts. One collects cards through previous ties or by having been at the same companies. We’ll be looking at many rolodexes and looking for the network within that ecosystem.
My hypothesis is that such networks exist within the industry, within companies and across companies. Additionally, the networks will congeal around certain brokers and emerging leaders who, whether or not they know it themselves right now, will go on to dominate the industry. My goal is to identify them.
I expect the networks to be dominated by weak ties and to be relatively insular in terms of education and work experience. In fact, it is entirely likely that a common work experience is the key past link among many key actors. Overall, I expect wide-ranging, distributed networks with cliques. In the end, leaders are those who can overcome weak ties and build more lasting relations. Additionally, it would be interesting to look for well-connected actors who by other measures are considered outliers.
Data
Data for this network analysis
will be a challenge. It would be
impractical and almost impossible to conduct a network survey with a defined
group of individuals in any sector.
However, I do think there is open-sourced information that can provide a
creative solution.
We’re going to build an attribute dataset of individuals in the industry, using publicly available information from LinkedIn profiles and company websites. These are the steps:
1. Identify target companies, limiting to those that are West Coast-based and industry-specific.
2. Comb their staff biography pages.
3. Search LinkedIn by sector, region (West Coast), and keywords.
4. Build an attribute dataset of all relevant individuals, with the understanding that data will be incomplete in some columns.
Attributes:
Name
Age
Gender
Education/School(s)
Degree(s)
Concentration
Past 3 Companies (each company will be coded differently, so this list may grow very large)
Connection to Tech Lab (each tech lab will be coded differently)
Any additional LinkedIn profile information that may be interesting
(This list will be modified after a more comprehensive preliminary search.)
Target population: From 100 individuals to upwards of 500 or 1000, depending on resources and free time. If the total number is smaller, there will be a preference for leadership roles at a company over rank and file.
Making a One-Mode Dataset (with limitation)
With the information we pulled from the Internet, we can be creative in building a useful One-Mode dataset. Here’s how we will do it:
Through the attributes dataset, we should have the following information relevant to the target population: Education/School, Age Range, Concentration or Field of Study, and Past Companies. Each selection within an attribute category would have a distinct value code. We’ll translate these attributes into assumptions about ties. If two individuals share one of these data points, they are considered to have a tie valued at 1. If they share two, they are considered to have a tie valued at 2; three shared data points give a tie valued at 3, and four give a tie valued at 4. Here is a breakdown of what it might look like:
1. If an individual went to the same educational institution as another individual, they share an undirected weak tie (valued at 1).
2. If an individual went to the same educational institution as another individual and is within the same age range, they share an undirected medium tie (valued at 2).
3. If an individual went to the same educational institution as another individual, is within the same age range, and shares a similar field of study, they share an undirected strong tie (valued at 3).
4. If an individual went to the same educational institution as another individual, is within the same age range, shares a similar field of study, and currently works or has worked at the same company, they share an undirected very strong tie (valued at 4).
A note on preparing and cleaning this dataset: obviously, it will be challenging. But one way to do so, after the attribute dataset has been collected, is to use Excel’s transpose, copy and paste functions to get relevant columns next to each other and collate their data points.
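As an alternative to the spreadsheet route, the tie-valuation rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the simple "count the shared data points" rule, treats missing values as non-matches, and every name, field and value in it (`people`, `tie_value`, the sample records) is hypothetical rather than real data.

```python
from itertools import combinations

# Hypothetical attribute records; None marks a missing (incomplete) column.
people = {
    "A": {"school": "S1", "age_range": "30-39", "field": "EE", "companies": {"C1", "C2"}},
    "B": {"school": "S1", "age_range": "30-39", "field": "EE", "companies": {"C2"}},
    "C": {"school": "S1", "age_range": "40-49", "field": None, "companies": {"C3"}},
    "D": {"school": "S2", "age_range": "20-29", "field": None, "companies": {"C4"}},
}

def tie_value(p, q):
    """Count shared data points: school, age range, field, any common company."""
    shared = 0
    for key in ("school", "age_range", "field"):
        # A missing value never counts as a match.
        if p[key] is not None and p[key] == q[key]:
            shared += 1
    if p["companies"] & q["companies"]:
        shared += 1
    return shared

names = sorted(people)
matrix = {a: {b: 0 for b in names} for a in names}
for a, b in combinations(names, 2):
    value = tie_value(people[a], people[b])
    matrix[a][b] = matrix[b][a] = value  # undirected, so the matrix is symmetric

for a in names:
    print(a, [matrix[a][b] for b in names])
```

Note that a pure count treats any shared attribute as a tie, whereas the numbered breakdown always starts from a shared school; which rule to use is a design choice to settle before coding the data.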
Two-Mode Dataset:
It would also be interesting to analyze individuals who have been employed at multiple companies. This type of experience would represent invaluable institutional knowledge.
It would be valuable to look at companies that have accumulated individuals with multi-company experience, and at the network of individuals who have multi-company experience. Specific SNA techniques will be elaborated on later.
Creating a Two-Mode dataset:
This too will require some creativity. From the attribute dataset, we have a list of individuals as well as their last three companies of employment. We can extract the list of individuals and a total list of the most popular companies among the individuals’ work experience. Thus, our Two-Mode dataset will be a matrix of individuals and companies (as defined by work experience). The values of ties would be binary: 0 for no work experience at a company, and 1 for work experience at that company.
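The construction just described can be sketched as follows. The snippet assumes the "Past 3 Companies" column has already been collected into a dictionary; the variable names and the sample entries are hypothetical.

```python
# Hypothetical "Past 3 Companies" column from the attribute dataset.
past_companies = {
    "A": ["C1", "C2", "C3"],
    "B": ["C2"],
    "C": ["C2", "C3"],
}

actors = sorted(past_companies)
companies = sorted({c for listed in past_companies.values() for c in listed})

# Actor-by-company incidence matrix: 1 = worked there, 0 = did not.
incidence = [
    [1 if company in past_companies[actor] else 0 for company in companies]
    for actor in actors
]

# Column sums show how many individuals list each company.
company_counts = {
    company: sum(row[j] for row in incidence)
    for j, company in enumerate(companies)
}

print(companies)
for actor, row in zip(actors, incidence):
    print(actor, row)
print(company_counts)
```

The column sums give a quick way to pick the most frequently listed companies if the matrix needs to be trimmed.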
Some limitations to keep in mind before analysis:
1. Each individual would only be tied to a maximum of three companies, given that we’re only accounting for the three companies recorded in the Attribute dataset. We could certainly try to get a cumulative list of every company individuals have claimed to work for; however, that could get prohibitively difficult.
2. If, for example, we take the twenty companies most frequently listed by individuals in their work history, each company could have anywhere from one to many ties. This is the component that will allow for further analysis. We could certainly attempt a more complete list of companies, but for leadership network analysis purposes the top ones will probably provide a sufficient network.
3. The Two-Mode dataset will allow us to look at companies with many individuals but not necessarily individuals with many companies. Further social network analysis would need to be employed for those insights.
4. The same caveat for the One-Mode dataset applies here. Given that this network data is not based on a survey or network questionnaire, it would be a close approximation of a network, not the actual network.
Social Network Analysis/Methodology
There are three datasets for
analysis: a master attribute dataset, a One-Mode actor-by-actor network dataset
and a Two-Mode actor-by-company network dataset.
Network Cohesion Measures of One-Mode Network dataset: This analysis will provide density and centralization measures of the network in the One-Mode dataset. It is possible that no network exists at all and we get a set of islands built around companies. The visualization through NetDraw will provide a good sense of where we’re headed. In addition, visualizing the data at different tie strength values will help decide on a good dichotomization threshold.
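To make the cohesion measures concrete, here is a minimal Python sketch that dichotomizes a valued matrix at a chosen threshold and then computes density and Freeman degree centralization for an undirected network. It is a stand-in for what the network software would report, not a replacement for it; the function names and the toy matrix are hypothetical.

```python
def dichotomize(valued, threshold):
    """Return a binary matrix: 1 where the tie value is >= threshold."""
    return [
        [1 if i != j and v >= threshold else 0 for j, v in enumerate(row)]
        for i, row in enumerate(valued)
    ]

def density(adj):
    """Share of possible undirected ties that are present."""
    n = len(adj)
    ties = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return ties / (n * (n - 1) / 2)

def degree_centralization(adj):
    """Freeman degree centralization for an undirected network (n >= 3)."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))

# Toy valued matrix: actor 0 has strong ties to everyone, the rest are weak.
valued = [
    [0, 3, 3, 4],
    [3, 0, 1, 1],
    [3, 1, 0, 1],
    [4, 1, 1, 0],
]

for threshold in (1, 3):
    binary = dichotomize(valued, threshold)
    print(threshold, density(binary), degree_centralization(binary))
```

In this toy case the network is fully connected at threshold 1 but collapses into a star around one actor at threshold 3, which is the kind of shift worth watching when choosing where to dichotomize.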
One-Mode Network and Attribute datasets:
1. Dichotomize the One-Mode dataset at tie strength greater than or equal to 3. The One-Mode dataset is now binary: 1 for having a tie, and 0 for no tie.
2. Add the Attribute dataset.
3. Using E-I Index analysis, compute a homophily score for each of the attributes listed.
This analysis will give a sense of the network’s diversity and whether individuals have ties based on shared attributes. A negative E-I index indicates that ties fall mostly within attribute groups (homophily). Alternatively, if the E-I index score is positive, it indicates that individuals (who may have been a homogeneous group in the first place) are not connecting based on similar attributes, which is in itself a valuable insight for evaluating hidden networks. It would suggest the possibility of a network based on merit rather than, say, the old boys’ club.
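For reference, the E-I index is (E - I) / (E + I), where E counts ties between actors in different attribute groups and I counts ties between actors in the same group. A small Python sketch, using a hypothetical binary matrix and hypothetical attribute codings, and skipping ties where an attribute is missing:

```python
def ei_index(adj, groups):
    """E-I index for one attribute on an undirected binary network."""
    external = internal = 0
    n = len(adj)
    for i in range(n):
        for j in range(i + 1, n):
            if not adj[i][j]:
                continue
            if groups[i] is None or groups[j] is None:
                continue  # assumption: ignore ties with missing attribute data
            if groups[i] == groups[j]:
                internal += 1
            else:
                external += 1
    total = external + internal
    return (external - internal) / total if total else None

# Toy network with ties 0-1, 0-2 and 2-3.
adj = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
school = ["S1", "S1", "S2", "S2"]  # two of three ties stay within a school
field = ["EE", "CS", "EE", "CS"]   # two of three ties cross fields

print("school:", ei_index(adj, school))
print("field:", ei_index(adj, field))
```

The score runs from -1 (all ties within groups) to +1 (all ties between groups), so the same network can look homophilous on one attribute and mixed on another.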
Looking for subgroups using Components, Faction, Girvan-Newman and clique analyses: Once a network visual has been established, dichotomized and separated from isolates, we can use Components, Faction, Girvan-Newman and clique analyses to identify subgroups based on the ties among their members. These analyses can lead to subgroups that are not obvious from the homophily analysis. Alternatively, we can also compare the tie-based subgroups with the subgroups based on attributes to see if there are overlaps. Clique overlap analysis will identify the most involved individuals. The two layers of analysis will add nuance to our interpretation of the data. Overall, the factions in the network should be interpreted as the groups of most inter-connected individuals in the drone sector.
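Of the subgroup techniques listed, components analysis is the simplest to illustrate. The sketch below finds connected components in a binary undirected matrix and separates out isolates; Faction, Girvan-Newman and clique analyses are more involved and are not shown. The edge list and function name are hypothetical.

```python
def components(adj):
    """Return connected components as sorted lists of node indices."""
    n = len(adj)
    seen = set()
    found = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        members = []
        while stack:
            node = stack.pop()
            members.append(node)
            for neighbor in range(n):
                if adj[node][neighbor] and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        found.append(sorted(members))
    return found

# Toy network: a chain 0-1-2, a pair 3-4, and actor 5 with no ties.
n = 6
edges = [(0, 1), (1, 2), (3, 4)]
adj = [[0] * n for _ in range(n)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1

groups = components(adj)
isolates = [g[0] for g in groups if len(g) == 1]
subgroups = [g for g in groups if len(g) > 1]

print(subgroups)
print(isolates)
```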
Centrality Measures and Egonet of brokers and
emerging leaders: The One-Mode
Network dataset at this point would be binary and undirected. Using centrality measures can help us
positively identify the most well-placed network leaders, emerging leaders and
brokers (individuals who straddle various networks). Egonet will allow us to examine the network
of these specific leaders. Our goal at
this point in the analysis is to identify the promising leads for further
investigation and comparison.
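As a rough illustration of these measures, the following Python sketch computes degree centrality, betweenness centrality (using Brandes' algorithm for unweighted graphs) and a simple ego network on a binary undirected matrix. Treating high betweenness as a sign of brokerage is an assumption of this sketch; the toy network and function names are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness for an undirected, unweighted network."""
    n = len(adj)
    score = [0.0] * n
    for s in range(n):
        order = []
        preds = [[] for _ in range(n)]
        paths = [0] * n   # number of shortest paths from s
        dist = [-1] * n
        paths[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in range(n):
                if not adj[v][w]:
                    continue
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = [0.0] * n
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return [b / 2 for b in score]  # each pair was counted from both ends

def ego_network(adj, ego):
    """Return the ego plus its direct contacts, and the ties among them."""
    members = sorted([ego] + [j for j in range(len(adj)) if adj[ego][j]])
    ties = [(i, j) for i in members for j in members if i < j and adj[i][j]]
    return members, ties

# Toy network: 0-1, 1-2, 2-3, 2-4 (actor 2 bridges the others).
n = 5
adj = [[0] * n for _ in range(n)]
for i, j in [(0, 1), (1, 2), (2, 3), (2, 4)]:
    adj[i][j] = adj[j][i] = 1

degree = [sum(row) for row in adj]
between = betweenness(adj)
broker = max(range(n), key=lambda i: between[i])
members, ties = ego_network(adj, broker)

print(degree, between, broker, members, ties)
```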
Source of Comparison in the Two-Mode Dataset:
1. The Two-Mode Dataset is currently a binary dataset, with 1 for a tie with a company and 0 for no tie. By converting the dataset from a Two-Mode matrix to a One-Mode matrix of actors by actors, we’ll be able to analyze a valued dataset of actors who are connected through overlapping company experience. You can choose the tie strength at which it makes sense to dichotomize the data. One suggestion would be at greater than or equal to 1, since the network could be severely limited at this point. Once the dataset is binary, we can run Faction, Girvan-Newman and clique analyses, centrality measures, and then Egonet. The analyses can be interpreted in the following way:
a. In the valued dataset, actors with high values can be considered individuals with a high reserve of industry and cross-institutional knowledge. It would be useful to collect these individuals into a list.
b. In the binary dataset, one can identify the individuals who are most connected with others through overlapping company experience. We are looking for the leaders, emerging leaders and brokers with this network data too.
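The Two-Mode to One-Mode conversion amounts to counting, for each pair of actors, the companies they have in common (the incidence matrix multiplied by its transpose, with the diagonal ignored). A minimal sketch with a hypothetical incidence matrix:

```python
def project_actors(incidence):
    """Actor-by-actor matrix: cell (i, j) = number of shared companies."""
    n = len(incidence)
    return [
        [
            0 if i == j else sum(a * b for a, b in zip(incidence[i], incidence[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]

# Hypothetical actor-by-company matrix (rows: actors A, B, C; columns: C1, C2, C3).
incidence = [
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 1],
]

valued = project_actors(incidence)
# Dichotomize at >= 1: any shared company counts as a tie.
binary = [[1 if v >= 1 else 0 for v in row] for row in valued]

print(valued)
print(binary)
```

Here actors A and C share two companies, so their valued tie is 2, while every pair shares at least one company and is therefore tied in the binary version.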
At this point, we have two lists of emerging leaders and brokers: one from the first One-Mode Network dataset, which approximates possible ties within the industry, and a second from the separate One-Mode Network of actors based on shared company experience. To reiterate, the former indicates approximated network ties; the latter indicates ties based on having worked at the same company. It would be insightful to compare the two sets of emerging leaders and brokers.
Adding the two One-Mode datasets to see if there are
overlaps:
1. We can add the two binary One-Mode datasets to create a valued dataset with the following distinction:
a. Coding 1 to stand for ties in the first One-Mode Network.
b. Recoding ties in the second One-Mode Network from 1 to 2 in Excel (adding 1 to each existing tie), so that after the datasets are merged, ties originally in this network will be valued at 2 in the new valued dataset.
c. Adding the two network matrices, so that pairs who share both types of ties are valued at 3.
2. At this point, we can dichotomize the new valued One-Mode dataset at tie strength greater than or equal to 3 to produce a binary dataset, with which we can perform the previous leadership analysis, including Egonet.
a. We will have clearly identified leaders, emerging leaders and brokers. You can trace backward through the previous networks, using their Egonets, for more insight.
b. We will also need to refer to context and industry information to evaluate whether these names make sense.
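The coding scheme above (1 for the first network, 2 for the second, 3 for both) can be sketched as a cell-by-cell sum. The two small matrices are hypothetical, and both are assumed to list the same actors in the same order.

```python
def combine(first, second):
    """Sum two binary matrices, weighting the second by 2.

    Result codes: 0 = no tie, 1 = first network only,
    2 = second network only, 3 = tie in both networks.
    """
    return [
        [f + 2 * s for f, s in zip(row_first, row_second)]
        for row_first, row_second in zip(first, second)
    ]

# Hypothetical binary networks over the same three actors.
attribute_ties = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
company_ties = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

combined = combine(attribute_ties, company_ties)
# Dichotomize at >= 3: keep only pairs tied in both networks.
both = [[1 if v >= 3 else 0 for v in row] for row in combined]

print(combined)
print(both)
```

Keeping the intermediate codes 1 and 2 makes it possible to see which kind of tie a pair has before collapsing everything to the pairs valued at 3.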
Revisiting the Two-Mode Dataset for company analysis:
1. It would also be interesting to look at which companies have the most connections via shared employees. This would represent a flow of information, institutional knowledge, and expertise from one company to another, without specifying the direction. We can identify the companies with the highest number of well-connected employees, to suggest that these companies are best positioned to succeed moving forward, if they are not succeeding already, because they can channel their employees’ networks.
2. Cross-referencing this list of companies with the list of leaders, emerging leaders and brokers will provide a single indicator of companies that are well placed in the industry.
3. Our hypothesis is that companies that employ leaders, emerging leaders, and brokers AND their Egonets may be the best positioned to succeed.
4. We also want to look for outliers. These are companies and individuals who perhaps have their own network and need one bridge to tap into a larger network. It would be worth paying attention to them when monitoring the industry moving forward.
5. Lastly, it is important to return to the context analysis to evaluate whether these findings make sense. Some of the results will be obvious, others not so much.
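For the first step in the list above, the company-by-company view is the mirror image of the actor projection: each cell counts the employees two companies have in common. A minimal sketch, again with a hypothetical incidence matrix and hypothetical company labels:

```python
def project_companies(incidence):
    """Company-by-company matrix: cell (i, j) = number of shared employees."""
    columns = list(zip(*incidence))  # one tuple of 0/1 values per company
    m = len(columns)
    return [
        [
            0 if i == j else sum(a * b for a, b in zip(columns[i], columns[j]))
            for j in range(m)
        ]
        for i in range(m)
    ]

companies = ["C1", "C2", "C3"]
# Rows are individuals, columns follow the order of `companies`.
incidence = [
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 1],
]

shared = project_companies(incidence)
# Total shared-employee links per company, as a rough connectedness score.
totals = {name: sum(row) for name, row in zip(companies, shared)}

print(shared)
print(totals)
```

The totals are only a first pass; they would still need to be cross-referenced with the lists of leaders and brokers as described.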
Conclusion
Using SNA, we will have identified emerging leaders in the commercial drone sector by selecting for the networks around them, identified companies that employ these well-connected leaders, and identified companies that may be in a position to generate a founder’s myth of their own.
1 comment:
What an intriguing idea: using SNA to test the validity (if that's the right word) of the "founder's myth." You've thought this through well, except for one thing: the network question. Yes, you have some good ideas about creating nets out of two-mode data, or attributes. But with such a large sample, I would think you could find some affective connections, like actual or aspirational collaborations ("who would you like to work with?") on the personal level. It would seem to me that these personal connections would be very helpful, if not necessary, to support your hypothesis about the myth, no? My point is that myth-spreading is done from personal relationships, not necessarily between those who have common attributes.
But this is all in the nice-to-have category. You could come up with interesting conclusions from the data and approaches you've so nicely described.