Identification of high-risk depression patients via the SNA method
Xin Li-MBA-2016
Problem/Challenge
Depression is a very common mental disorder. According to the WHO, approximately 350
million people across all age groups are affected by depression. Depression is also one of the major
causes of disability, and it can even lead patients to suicide.
Even though there are some effective therapies for depression,
fewer than 50% of depression patients worldwide receive effective treatment. In many countries, fewer than 10% of
patients do. One of the main reasons is that identifying people at high risk of depression is very
difficult. Therefore, a method for identifying high-risk depression patients would be significant for public
health and human wellness.
Social network analysis (SNA), together with the growth of social
media and big data, offers a way to address this issue.
What data do I need? Easy/hard to get?
| Data needed | Easy/hard to get? | Reason why I need this data | How will I deal with the data? (Coding) |
| Name | Easy | A name is important identifying information. | Recorded as an identifier. |
| Gender | Easy | Women are more likely to develop depression. | Weigh more for female. |
| Job title | Easy | Some high-pressure industries may carry a higher risk of depression. | High risk: 3; Mid risk: 2; Low risk: 1; No risk: 0 |
| Living country | Easy | People in some cold-climate countries, such as Russia and those in Northern Europe, are at higher risk. Political and nationality-related factors can also raise the risk of depression. | Add weight for high-risk countries. |
| Historical posts | Easy | High-risk people tend to post their angry or over-excited feelings on social media. | Filter keywords and rate them as weights. |
| Online interval | Easy | People who stay online longer are more likely to develop depression than people who stay online for a shorter time. Longer online time also means less sleep, which is a potential symptom of depression. | > 12h: 5; 8-12h: 4; 6-8h: 3; 4-6h: 2; 2-4h: 1; 0-2h: 0 |
| Response time and frequency of replies and likes/unlikes | Easy | The faster and more frequently a subject replies, likes, and unlikes, the higher the possibility of depression. | Frequency = FR; response time = RT. Higher FR/RT means higher risk. FR/RT percentile: 99%: 3; 97%: 2; 68%: 1 |
| Family connections | Easy and hard | Genetic factors: if a subject has family connections who are identified as depression patients, the subject is at higher risk. The connection information is easy to get, but the depression patients' information is hard to get. | SNA method |
| Employment status | Easy | Unemployed subjects are more likely to have depression. | Employed (Yes): 0; Unemployed (No): 1 |
| Heart disease? | Hard | Heart disease patients are more likely to have depression. | Yes: 1; No: 0 |
| History of losing family or a horrible accident | Hard | Losing a family member or experiencing a horrible accident can cause depression. | Yes: 1; No: 0 |
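The coding rules in the table can be combined into one weight per subject. The following is a minimal Python sketch of that idea, not a validated screening tool. Several parts are assumptions that the table does not specify: the `Subject` fields, the value 1 for `FEMALE_WEIGHT` and `HIGH_RISK_COUNTRY_WEIGHT`, the precomputed `post_keyword_score`, the handling of shared bin endpoints for online hours, reading the FR/RT percentiles as "at or above" cut-offs, and combining everything by simple addition.

```python
from dataclasses import dataclass

# Codes taken from the "Job title" row of the table.
JOB_RISK = {"high": 3, "mid": 2, "low": 1, "none": 0}

# Hypothetical values: the table says to weigh these factors but gives no numbers.
FEMALE_WEIGHT = 1
HIGH_RISK_COUNTRY_WEIGHT = 1


def online_hours_score(hours: float) -> int:
    """Map daily online hours to the 0-5 code from the table.

    The table's bins share endpoints (for example 2-4h and 4-6h). This sketch
    puts a shared endpoint in the higher bin, except that only values strictly
    above 12 score 5, matching "> 12h".
    """
    if hours > 12:
        return 5
    if hours >= 8:
        return 4
    if hours >= 6:
        return 3
    if hours >= 4:
        return 2
    if hours >= 2:
        return 1
    return 0


def fr_rt_score(percentile: float) -> int:
    """Map an FR/RT percentile to the 0-3 code (assumed "at or above" cut-offs)."""
    if percentile >= 99:
        return 3
    if percentile >= 97:
        return 2
    if percentile >= 68:
        return 1
    return 0


@dataclass
class Subject:
    name: str                    # identifier only, not scored
    female: bool
    job_risk: str                # one of the JOB_RISK keys
    high_risk_country: bool
    online_hours: float
    fr_rt_percentile: float
    employed: bool
    heart_disease: bool
    loss_or_accident: bool
    post_keyword_score: int = 0  # hypothetical output of the keyword filter


def total_weight(s: Subject) -> int:
    """Add up all coded attributes (simple addition is an assumption)."""
    return (
        (FEMALE_WEIGHT if s.female else 0)
        + JOB_RISK[s.job_risk]
        + (HIGH_RISK_COUNTRY_WEIGHT if s.high_risk_country else 0)
        + s.post_keyword_score
        + online_hours_score(s.online_hours)
        + fr_rt_score(s.fr_rt_percentile)
        + (0 if s.employed else 1)
        + (1 if s.heart_disease else 0)
        + (1 if s.loss_or_accident else 0)
    )


example = Subject("A", True, "high", False, 9.5, 98, False, False, True)
print(total_weight(example))  # 1 + 3 + 0 + 0 + 4 + 2 + 1 + 0 + 1 = 12
```

Family connections are left out of this total on purpose, since the table assigns them to the SNA step rather than to a numeric code.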
What will be the most important network measures? What will the SNA help me do?
Once I have calculated all the weights, I will combine them
for each subject into a total for a first round of analysis. I can mark highly weighted subjects, for
example with a bigger node size, and analyze the social network via the SNA method.
If I find highly weighted subjects who have
family connections with confirmed depression patients, those subjects might
be at high risk.
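The check described above, a high weight plus a family connection to a confirmed patient, can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `flag_high_risk`, the subject names, the threshold of 8, and the representation of the family network as a list of undirected pairs are all hypothetical, and each subject's combined weight is passed in as a plain dictionary.

```python
def flag_high_risk(weights, family_edges, confirmed, threshold):
    """Return {subject: [confirmed relatives]} for subjects at or above threshold.

    weights      -- dict mapping subject name to combined weight
    family_edges -- iterable of (a, b) pairs, treated as undirected family ties
    confirmed    -- set of names of confirmed depression patients
    threshold    -- hypothetical cut-off for a "bigger weight"
    """
    # Build an undirected adjacency map of the family network.
    neighbors = {}
    for a, b in family_edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    flagged = {}
    for subject, weight in weights.items():
        # Skip already confirmed patients and low-weight subjects.
        if subject in confirmed or weight < threshold:
            continue
        links = neighbors.get(subject, set()) & confirmed
        if links:
            flagged[subject] = sorted(links)
    return flagged


# Made-up example data.
weights = {"Ann": 12, "Bob": 4, "Cid": 9, "Dee": 11}
family_edges = [("Ann", "Eve"), ("Bob", "Eve"), ("Cid", "Dee")]
confirmed = {"Eve"}

print(flag_high_risk(weights, family_edges, confirmed, threshold=8))
# {'Ann': ['Eve']}
```

Bob is connected to a confirmed patient but falls below the threshold, and Cid and Dee have high weights but no confirmed relative, so only Ann is flagged for follow-up.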
The next step is to further investigate each risk factor of
the selected subjects. Based on a
subject's attributes, check his/her historical posts to evaluate further.
With healthcare professionals' help, we can identify
high-risk subjects more scientifically, and then provide intervention and treatment
proactively.
Let’s make the world happier!
1 comment:
You do a great job of laying out the data you need and evaluating its availability and how you'll deal with it. For an ordinary statistical analysis of a dependent variable, this would have been very good.
The problem is, you don't really have a network here; what you're doing is figuring out how to identify individual nodes but without any consideration (that I can see) for the network effect. You talk about calculating "all the weight," but it's unclear what that refers to, or means, and there are no network measurements mentioned other than "the SNA method," whatever that means.
You might have given some thought to what kind of networks could give researchers some insight into the incidence of depression. I'm not talking about individual node attributes, like the ones you mention, but some kind of co-occurrence (e.g. served in the military.)
You have done a really good job of laying out the data, as I said, but more consideration of the network aspects would have made an OK post much better.