Wednesday, December 1, 2010

Social Network Analysis for Open Source Software (OSS) Communities

What is Open Source Software (OSS)? Over the last few years, the way software is made available to users has changed dramatically. OSS essentially means that the source code of a program is made directly and freely available to everybody. Thousands of programmers come together in what are called OSS communities and spend tremendous amounts of time writing and debugging the software with no direct monetary benefit. The effects of the OSS movement have been far-reaching. It is now being incorporated into the strategies of large corporations, which cannot afford to compete directly against software that is usually free to use. Hugely popular products such as Mozilla and Linux are direct results of OSS collaboration, which suggests that the communities that come together to create this software cannot be ignored.

Open source software development teams are generally composed of volunteers who work not for monetary return but for the enjoyment and pride of being part of a successful virtual software development project. Team members often come from around the world and rarely meet one another face-to-face. Open source projects are self-organized and rely on extremely rapid code evolution, massive peer code review, and rapid releases of prototype code. The participants are generally unaffiliated individuals and organizations who often work in a seemingly chaotic fashion. One of the most commonly used terms with respect to OSS communities is “Knowledge Collaboration”. Knowledge collaboration is key to the success of OSS communities because not every member has all the knowledge and skills necessary for professional software development. OSS members usually communicate through virtual communication channels so that geographically distributed members can collaborate and coordinate their work.

One of the newer methods applied to analyze communication in OSS communities is Social Network Analysis (SNA). SNA allows analysts to treat actors as nodes and to visualize the links between them in order to understand how they communicate. Analysts also study measures such as Degree Centrality, Betweenness, Degree Coordination and Eigenvector Centrality. The main aim of these measures is to identify the key communicators within the network, the knowledge brokers, and the members who link peripheral members to the central actors. The goal of the analysis is to visualize communication within the network, to understand how and why some OSS communities operate more efficiently than others, and to streamline the operations of less efficient communities. Within successful OSS communities, the visual networks also make it easier for team members to introduce new members to the community's culture. The application of SNA to OSS communities is therefore becoming very important. Over the last few years, many such studies have been conducted and published, analyzing, among other things, knowledge collaboration within a community and direct communication between members. Some of the published papers are listed in the sources at the end of this post.
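To make two of these measures concrete, here is a minimal sketch in plain Python. The member names and the links between them are invented for illustration and do not come from any real community. Degree centrality is taken as the fraction of other members a person is directly linked to, and betweenness is computed with Brandes' algorithm on an undirected, unweighted network. Eigenvector centrality is left out to keep the example short.

```python
from collections import deque

def degree_centrality(graph):
    """Fraction of the other members each member is directly linked to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}

def betweenness_centrality(graph):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes)."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        order = []
        preds = {node: [] for node in graph}
        paths = {node: 0 for node in graph}
        dist = {node: -1 for node in graph}
        paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:  # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted once from each endpoint.
    return {node: value / 2 for node, value in score.items()}

# Hypothetical community: a tightly linked core (ana, ben, chen),
# with dev connecting the peripheral member eli to the core.
graph = {
    "ana": {"ben", "chen"},
    "ben": {"ana", "chen"},
    "chen": {"ana", "ben", "dev"},
    "dev": {"chen", "eli"},
    "eli": {"dev"},
}

degree = degree_centrality(graph)
between = betweenness_centrality(graph)

for member in sorted(graph):
    print(member, round(degree[member], 2), between[member])
```

In this toy network, chen and dev score highest on betweenness even though dev has only two links, which is the kind of "broker" position the measures are meant to surface.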

Most often, researchers select certain OSS communities (frequently from www.sourceforge.net) and collect data from their mailing lists and through surveys. The data is then analyzed using SNA programs such as Pajek or UCINET. Interestingly, some open source SNA software packages also exist! Finally, conclusions are drawn from the visual networks that the software produces. The results may be used by the communities themselves to streamline their operations, or by other communities to change the way they operate.
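As a rough illustration of the mailing-list step, the sketch below turns a handful of messages into a weighted "who replied to whom" network and writes it out in the plain-text Pajek .net layout (a vertex list followed by an edge list). The message records, their field names (`id`, `sender`, `reply_to`) and the member names are all hypothetical; a real study would first have to parse these out of the list archives.

```python
from collections import Counter

def build_reply_network(messages):
    """Count replies exchanged between each pair of senders (undirected)."""
    sender_of = {m["id"]: m["sender"] for m in messages}
    edges = Counter()
    for m in messages:
        parent = sender_of.get(m["reply_to"])
        if parent is None or parent == m["sender"]:
            continue  # thread starter, unknown parent, or self-reply
        pair = tuple(sorted((m["sender"], parent)))
        edges[pair] += 1
    return edges

def to_pajek(edges):
    """Render the network as Pajek .net text: vertices, then weighted edges."""
    nodes = sorted({name for pair in edges for name in pair})
    index = {name: i for i, name in enumerate(nodes, start=1)}
    lines = [f"*Vertices {len(nodes)}"]
    lines += [f'{index[name]} "{name}"' for name in nodes]
    lines.append("*Edges")
    lines += [f"{index[a]} {index[b]} {w}" for (a, b), w in sorted(edges.items())]
    return "\n".join(lines)

# Hypothetical messages from one mailing-list thread.
messages = [
    {"id": 1, "sender": "ana", "reply_to": None},
    {"id": 2, "sender": "ben", "reply_to": 1},
    {"id": 3, "sender": "ana", "reply_to": 2},
    {"id": 4, "sender": "chen", "reply_to": 1},
    {"id": 5, "sender": "chen", "reply_to": 4},
]

edges = build_reply_network(messages)
pajek_text = to_pajek(edges)
print(pajek_text)
```

Note that members who never reply and are never replied to do not appear in the network at all, so the picture is only as complete as the archive it is built from.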

Though much can be gained from using SNA to analyze OSS communities, there are certain problems and limitations. These are not often discussed publicly, but they do exist. The primary problem is privacy. The information used for SNA is, as mentioned before, gathered through surveys and mailing lists, which require users to provide basic information about themselves. From personal experience, I have found that, in some cases, contributors to OSS communities are employed in software development by companies that compete directly with those same communities. As a result, these members are hesitant to divulge information directly, for fear that their employers will link them to the community. This problem has been discussed by Nelson Ko, who leads Citadel Rock Online Communities Inc. and often speaks about SNA and OSS communities. Another limitation of such studies is that not all members may respond to the surveys, so the information may be incomplete. Incomplete data gives incomplete results.

Considering the above issues, there are limitations to the application of SNA to OSS communities, but I personally believe that these problems are outweighed by the inherent advantages. The possibility of streamlining OSS communities to provide better quality and FREE software to users is a big motivating factor. Making high quality software available to lower income communities and economically developing regions, which are often unable to access licensed software, is another factor that must be considered when making a decision on the subject. As a result, I believe that the advantages of using SNA to make OSS communities operate more efficiently compensate for the innate negatives that come with it.


Sources:

1) http://www.aswec2008.debii.curtin.edu.au/ResearchSlide/Open%20Source%20Communities%20as%20Social%20Networks.pdf

2) http://se.naist.jp/achieve/pdf/171.pdf

3) http://nlp.uned.es/~juaner/papers/juaner08-oss.pdf

4) http://digitalcommons.fiu.edu/cgi/viewcontent.cgi?article=1062&context=etd

5) http://www.slideshare.net/nkoth/social-network-analysis-in-open-source

6) http://perso.univ-rennes1.fr/eric.darmon/floss/papers/VICENTE.pdf

7) http://page.mi.fu-berlin.de/oezbek/pub/OezThiPre10-SNA.pdf

8) Greg Madey, Vincent Freeh, Renee Tynan, “The Open Source Software Development Phenomenon: An analysis based on social network theory”, Eighth Americas Conference on Information Systems, 2002.


:Yatin Mulky

3 comments:

Christopher Tunnard said...

Excellent. I haven't thought about using SNA to look at OSS communities (or Wikis, for that matter). Just today, I saw that someone was using SNA to try to analyze the pattern of leaks from the WikiLeaks documents. Your Note 5 links to a whole collection of slide shows on the subject. Thanks!

matdrawment said...

Thanks. The slide sets were very useful when I was researching the post. Some of the research done on Wikis is very interesting. Nelson Ko did an analysis of TikiWiki; his presentation is available here:

http://www.youtube.com/user/koth55

: Yatin Mulky

Social Network said...

These are really the most interesting posts. Through these posts, people will become aware of OSS communities. Thank you.