Big data and agent based simulation for policy analysis

“We live in a network world. Everything we do is an outcome of multiple elements. The pervasion of social media in our lives means hundreds and thousands of tweets and retweets by the minute. Gone are the times when information asymmetry was exploited,” remarked Dr Alok Chaturvedi, professor of Management and Computer Science at Purdue University, while opening a talk at ORF Delhi on Big Data and Agent Based Simulation for Policy Analysis on 8 May 2018. The discussion was moderated by Rakesh Sood, Distinguished Fellow, ORF and a former ambassador.

With the advent of social media and its tremendous impact on our lives, information today circulates at the speed of light. Facebook’s system processes 2.5 billion pieces of content and 500+ terabytes of data each day.

Hundreds of companies like Cambridge Analytica are trying to influence and change our thinking and our behaviours today. As consumers, we are rarely mindful of what we are consuming on social media. Our thinking is evolving rapidly. In this age, given the amount of information available, it is reasonable to be unsure about what to trust or what data to use.

The presentation opened with an observation drawn from the recent film Avengers: Infinity War, in which Dr. Strange uses the ‘Time Stone’ to evaluate about 14 million alternate futures, ultimately identifying the one best suited to the heroes. Chaturvedi then discussed how, in the real world, big data and agent based modeling can be the tools used to evaluate alternate futures.
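The idea of evaluating alternate futures with agents can be illustrated with a toy model. The sketch below is not the model presented in the talk; it is a minimal illustration under stated assumptions. The function names (`simulate_future`, `evaluate_futures`), the opinion-copying rule and every parameter value are hypothetical. Each "future" is one seeded run in which simple agents may copy a randomly chosen peer's opinion, and the runs are then compared on a single outcome.

```python
import random


def simulate_future(seed, n_agents=100, steps=20, influence=0.3):
    """Run one hypothetical future and return the final share of agents holding opinion 1.

    Assumed toy rule: at each step, every agent picks a random peer and,
    with probability `influence`, copies that peer's current opinion.
    """
    rng = random.Random(seed)  # a separate seeded generator makes each future reproducible
    # Roughly 20% of agents start with opinion 1 (an arbitrary starting point).
    opinions = [1 if rng.random() < 0.2 else 0 for _ in range(n_agents)]
    for _ in range(steps):
        for i in range(n_agents):
            peer = rng.randrange(n_agents)
            if rng.random() < influence:
                opinions[i] = opinions[peer]
    return sum(opinions) / n_agents


def evaluate_futures(n_futures=50, **params):
    """Simulate several futures and return (seed of the best one, all outcomes by seed)."""
    outcomes = {seed: simulate_future(seed, **params) for seed in range(n_futures)}
    best_seed = max(outcomes, key=outcomes.get)  # "best" here means the highest share of opinion 1
    return best_seed, outcomes


best_seed, outcomes = evaluate_futures(n_futures=50)
print(f"Best future: seed {best_seed}, share holding opinion 1 = {outcomes[best_seed]:.2f}")
```

A real policy model would replace the copying rule with behaviour calibrated on data and would compare futures under different policy settings rather than different random seeds.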

Big data has uses in business, policy, and behavior and strategy intelligence. More data has been generated in the past two years than in all of human history before that. Facebook likes, comments, subscriptions, phone applications, satellites, toll roads and commercial airlines are all sources of data. All this data can be used to generate synthetic environments.

Turning to the infamous Cambridge Analytica-Facebook scandal, the talk described how CA was able to create multiple versions of the same advertisement for different segments of the population. In some cases it could create as many as 8,000 versions of the same ad to influence targeted groups. For instance, if a set of people were against gun ownership, the content on their Facebook accounts would center on how guns could be an advantage in situations of danger.

The discussion then turned to dense and sparse data. Dense data is essentially many different pieces of information about a particular subject. For example, if people in a survey were asked a number of open-ended questions, their answers could be described as dense. Most techniques available today are suited to evaluating dense data. Sparse data, on the other hand, is information that is less comprehensive. More and more techniques are being developed to make the best use of sparse data.

The talk also traced the evolution of data mining since the late 1990s.

Another interesting example concerned how UBS used satellite imagery of the parking lots at Wal-Mart stores to estimate Wal-Mart’s quarterly earnings. Neil Currie, the UBS analyst who studied the satellite data, concluded that there was sufficient correlation between the parking lot images and Wal-Mart’s earnings for that quarter.
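The kind of correlation check described in this example can be sketched in a few lines. The report does not say how the UBS analysis was actually carried out, so the following is only an illustration: the `pearson` helper is written here for the purpose, and the car counts and revenue figures are made-up numbers, not real Wal-Mart or UBS data.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of the same length with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (spread_x * spread_y)


# HYPOTHETICAL illustrative values only: average cars counted per store image
# in each quarter, and the revenue reported for the same quarter.
cars_per_quarter = [120, 150, 170, 200, 230]
revenue_per_quarter = [10.1, 11.0, 11.8, 13.2, 14.1]

r = pearson(cars_per_quarter, revenue_per_quarter)
print(f"Correlation between car counts and revenue: {r:.3f}")
```

A coefficient close to 1 would suggest that car counts move with revenue, though a handful of quarters is far too little data to support a forecast on its own.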

A set of case studies was presented in which big data analysis was used to address real problems: Saudisation of the workforce in the Kingdom of Saudi Arabia, stabilising Afghanistan and understanding the South China Sea conflict.

During the Q&A round, Prof. Chaturvedi was asked for his thoughts on UID. According to him, UID is a terrific concept that helps remove illegal elements from the system. However, the process is flawed. Questions remain unanswered about the ownership of UID as well as the process of obtaining one. Another pertinent question is how to deal with the 400 million people living below the poverty line. Despite these concerns and the current issues surrounding UID, the professor remained positive about its success.

The talk concluded with a discussion of the legal implications of using big data. ‘How much information do you really want to share?’ It is ultimately a question of how much information you are willing to give, weighed against your desire to remain isolated.

This report was prepared by Priyanka Mehta, Research Intern, Observer Research Foundation, Delhi.

The views expressed above belong to the author(s).