Sunday, December 26, 2010

Social Network Analysis

Background


Traditional data mining techniques focus on interpreting numeric or ordinal data, such as dollar amounts or relative levels of affluence. More recent efforts delve into social data by extracting the social information implicit in that numeric and ordinal data. This takes more finesse than simply including an employee’s pay grade; it looks instead at indicators such as:
  • How many co-workers contact this employee with questions or for advice?
  • Does this person call people at inconvenient hours, “receive quick callbacks,” and “tend to get more calls at times when social events are most often organised”? “Influential customers also reveal their clout by making long calls, while the calls they receive are generally short.”
  • Has “an applicant associated with known criminals?”
  • Are budget items, payment details, or orders being discussed with unauthorized personnel (as determined by email scanning)?
  • What parties or events (as listed on Facebook, mySpace, or Twitter) are to be attended by this person? 
  • Does this individual connect several unrelated social networks, or are they firmly entrenched in their clique?
Analyzing this social information creates an “index of power.” While all of these indicators are intriguing from a data mining perspective, the scale of existing investment in such software is astonishing: “IBM… says its annual sales of [SNA] software, now growing at double-digit rates, will exceed $15 billion by 2015. In the past five years IBM has spent more than $11 billion buying makers of [social] network-analysis software.”
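The question of whether someone connects several unrelated social networks can be made concrete with a small graph computation. The sketch below is only an illustration, not the method used by any vendor mentioned in the article: it assumes contacts are available as undirected pairs, and the helper names (build_graph, groups_bridged) and sample people are hypothetical. It counts how many separate groups a person's contacts would fall into if that person were removed; a count above one suggests a bridge between otherwise unconnected cliques.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected contact graph from (person, person) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def groups_bridged(graph, person):
    """Count the disconnected groups that person's contacts split into
    once person is removed from the graph."""
    unvisited = set(graph[person])
    groups = 0
    while unvisited:
        groups += 1
        # Depth-first walk from one contact, never passing through person.
        stack = [unvisited.pop()]
        seen = set(stack)
        while stack:
            node = stack.pop()
            for nxt in graph[node]:
                if nxt != person and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        unvisited -= seen
    return groups

# Hypothetical sample: two cliques joined only through "cat".
edges = [
    ("ann", "bob"), ("bob", "cat"), ("ann", "cat"),
    ("cat", "dan"),
    ("dan", "eve"), ("dan", "fay"), ("eve", "fay"),
]
graph = build_graph(edges)
print(groups_bridged(graph, "cat"))  # contacts split into two groups
print(groups_bridged(graph, "ann"))  # contacts stay in one group
```

In this toy data "cat" is the only link between the two cliques, while "ann" sits firmly inside one of them. Production SNA tools typically use richer measures, such as betweenness centrality, but the intuition is the same.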

Richmond Police Using Facebook to Predict Crime


The Richmond police department now plans staffing to address additional crime “on paydays and when there is a full moon.” Interestingly, they pay special attention to party plans and the social networks of suspects. “Richmond’s police have started monitoring Facebook, MySpace and Twitter messages to determine where the rowdiest festivities will be.” The article mentions that the system has replaced “officers’ reliance on ‘gut feel’” and “saves about $15,000 on overtime pay, because officers are deployed to areas that the software deems ripe for criminal activity,” and that crime has declined significantly as a result.

Telecom Companies Targeting ‘Influencers’

Telecom firms were early adopters of data mining, given their large volumes of data, customers’ tendency to churn, and competitors’ willingness to offer incentives to switch. Beyond their traditional churn/retention models, telecom firms are now using their call records to identify ‘influencers’… those “subscribers [that] frequently persuade their friends, family and colleagues to follow them when they switch to a rival operator” so that the firm can offer them incentives to stay and thereby retain their entire social network. Bharti Airtel, the largest mobile network in India, reports that customer churn has been significantly reduced by applying such SNA software.
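As a rough illustration of how call records might surface such influencers, the sketch below scores each subscriber by the ratio of the average length of calls they make to the average length of calls they receive, following the article's remark that influential customers make long calls and receive short ones. This is an assumed, simplified heuristic, not the algorithm used by Bharti Airtel or any SNA vendor; the record layout (caller, callee, minutes), the function name influence_scores, and the sample subscribers are all hypothetical.

```python
from collections import defaultdict

def influence_scores(calls):
    """Score subscribers from (caller, callee, minutes) records.

    Score = average minutes of calls made / average minutes of calls
    received. Subscribers who only made or only received calls are skipped.
    """
    made = defaultdict(list)
    received = defaultdict(list)
    for caller, callee, minutes in calls:
        made[caller].append(minutes)
        received[callee].append(minutes)

    scores = {}
    for person in made.keys() & received.keys():
        avg_made = sum(made[person]) / len(made[person])
        avg_received = sum(received[person]) / len(received[person])
        if avg_received > 0:  # guard against zero-length records
            scores[person] = avg_made / avg_received
    return scores

# Hypothetical call records: (caller, callee, minutes).
calls = [
    ("raj", "sita", 30), ("raj", "amit", 20),
    ("sita", "raj", 2), ("amit", "raj", 3),
    ("sita", "amit", 5), ("amit", "sita", 5),
]
scores = influence_scores(calls)
top = max(scores, key=scores.get)
print(top, scores[top])
```

A real model would combine several signals (callback speed, call timing, whether contacts followed earlier switchers) rather than rely on a single ratio.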

Counterterrorism Network Analysis

The cellular, decentralised structure of terror groups leads them to dedicate specific low-level operatives to memorizing key addresses and phone numbers, in effect acting as Rolodexes. It is therefore more useful for intelligence and law enforcement agencies to target those “Rolodexes” for capture than more senior terrorists. The article also credits such analysis of the phone records of Saddam Hussein’s chauffeurs with his ultimate capture.

Personal Thoughts


The article implies that network analysis, link analysis and predictive analysis are similar, although it would be more accurate to say link analysis is a subset of network analysis, and both are subsets of predictive analysis. A substantial portion of the article is also devoted to analytical efforts to predict societal change, such as rioting, terrorist action by Hezbollah, and selecting optimal partners for encouraging social change in failed states such as Sudan. I chose to disregard those items because those efforts were all human-labor intensive, non-algorithmic, and almost entirely unproven in their accuracy, potency, or pertinence.

The strategic value of SNA, though, is that it extracts greater knowledge from the same data, a capability that can quickly evolve into a competitive advantage. For example, the first telecom company to identify the importance of influencers would benefit from lower churn than its competitors, and the first company to deliberately hire employees with connections to disparate social networks would likely benefit.

The Economist. Quarterly Technology Update. "Untangling the Social Web." 02SEP2010