How Public Sector Can Leverage Big Data

In this post, I want to talk about three distinct areas: Big Data, Crowdsourcing, and Public Sector. Each of these areas is vast on its own, but I want to argue that it is the intersection of the three that offers unique and immense possibilities that can truly make the world a better place.

Before we get into discussing what is so special about the overlap of the three, let us go through some of the key characteristics of each:

Big Data:
  • Typically characterized by 3 Vs: Volume, Velocity, and Variety.
  • Major corporations are working to leverage opportunities created by Big Data and have had varying degrees of success with their Big Data initiatives.
  • Small and Medium Enterprises have been successful only in pockets, primarily owing to their inability to sustain the technological footprint or skills required. Because of their limited scale, SMEs also don’t always have a solid business case for making Big Data investments.
  • Governments, despite being data rich, have often been careful in how they leverage Big Data since they don’t want to be seen as acting in a “Big Brother” fashion. This probably has been a major impediment to adoption of Big Data for Public Sector, along with the paucity of funds and know-how.

Crowdsourcing:
  • Has been around for some time and is essentially a by-product of “disintermediation” caused by the internet.
  • Used by organizations of all sizes, big and small, for a variety of tasks.
  • Crowdsourcing has also been employed by some governments in very innovative ways. For example, the screenshot below is from a site maintained by the Ministry of Communications and IT, Government of India.

Example of Crowdsourcing by Indian Central Government

Public Sector:
  • Public sector enterprises find themselves under pressure to reduce costs and deliver greater bang for buck.
  • Across many countries, governments seem to be suffering from a trust deficit, particularly after having had to use public money to compensate for excessive private sector risk-taking.
  • Public sector enterprises need to show greater transparency in execution and agility in adoption of technology, and they need to prove their raison d’être through the results that they generate.

So what is interesting about the intersection of Big Data, Crowdsourcing and Public Sector?

The possibility of raising the efficiency and effectiveness of public sector enterprises by leveraging Big Data, with manpower and skills [primarily] and infrastructure [possibly] obtained through crowdsourcing, is a concept that can truly change the world for the better.

I would like to argue my case by giving an example but before I get to that, let us consider at a theoretical level why it seems like a good idea:

  • Using Big Data for Public Sector to drive decision making in a very transparent and accessible manner will allow governments to win trust. Crowdsourcing also allays the concerns around inappropriate use of data, since no malevolent objectives can be served through crowdsourcing [by definition].
  • The applications are limitless and can be public-sector-led or public-sector-leading (as in, getting governments to act). Therein lies the ability for public sector enterprises to improve their execution regardless of whether they are acting proactively or reactively.
  • At a time when there is a sense of “us versus them” between the populations and the respective governments, even in mature democracies, it will create a sense of belonging for people and make them feel a part of the governance process.
  • There is a direct cost saving associated with crowdsourcing as compared to more traditional methods of sourcing. This will directly lead to greater value creation by the public sector enterprises.
  • There are significant indirect benefits as well, such as skills building, skills identification, generating new business ideas, and innovation, all of which are extremely valuable for the governments.

Now, I would like to expand the argument further by giving an example application.

The case study that follows is probably easier to relate to for readers from developing countries, where issues due to corruption are relatively more common. Having said that, it is easy to see how something similar could be employed in a different setup to the same effect. Also, while this is an example of “public-sector-leading” analytics, the same could very well be “public-sector-led”, depending upon the political will of a particular administration.

Intersection of Big Data, Public Sector and Crowdsourcing – A Possible Application

In India it is fairly common for roads to have potholes. Potholes can be seen throughout the year but the problem is particularly pronounced each year during and after the monsoon (tropical rains) season. It is not uncommon for one to come across pictures such as the ones below:



In extreme cases, accidents such as the one shown below happen. Up to a dozen people die every year in accidents directly related to the pothole menace.

Indian commuters gather around a truck w

Why does no one repair these roads?

Well, the answer is that they do. In fact, they repair them more often than most other countries do! But each time, they use substandard material (often with a disproportionate amount of sand in the mix) that can only withstand the pressure of the traffic until the next time it rains and the sand gets washed away.

Cheap Material

How can they get away with it?

There is a nexus between the contractors who are supposed to fix the roads and the politicians and bureaucrats who influence the process of contractor selection. Political parties in power blame the parties who came to power before them, bureaucrats blame the politicians, and the result is an endless blame game. In the end, the common man can never successfully hold anyone accountable for the condition of the roads.

So what can be done?

Imagine that anytime someone sees a pothole, he or she takes a picture with “geo-tagging” on and uploads it to a particular website. The picture properties will then contain latitude and longitude information, as shown in the image below:


A simple program can be written to access the geo-tagging information for each picture, while the human intelligence required to “tag” the condition of the road can be “crowdsourced”. Once done, this analysis will lead to a table of information such as the one below:

Tab 1
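The “simple program” mentioned above could look roughly like the sketch below. It assumes the photo’s GPS fields have already been read by some EXIF library into a dictionary; the helper names (`dms_to_decimal`, `pothole_record`) and the sample coordinates are made up for illustration. The only part taken as given is that geo-tags are usually stored as degrees, minutes and seconds plus a hemisphere letter, which need converting to decimal latitude and longitude.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere letter
    ('N', 'S', 'E' or 'W') into signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(x) for x in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative by convention.
    if ref in ("S", "W"):
        decimal = -decimal
    return float(decimal)


def pothole_record(gps, condition_tag):
    """Build one table row from parsed GPS fields and a crowdsourced tag."""
    return {
        "latitude": round(
            dms_to_decimal(gps["GPSLatitude"], gps["GPSLatitudeRef"]), 6
        ),
        "longitude": round(
            dms_to_decimal(gps["GPSLongitude"], gps["GPSLongitudeRef"]), 6
        ),
        "condition": condition_tag,
    }


# Hypothetical GPS fields for one uploaded picture (made-up values).
sample_gps = {
    "GPSLatitude": (19, 4, 30),
    "GPSLatitudeRef": "N",
    "GPSLongitude": (72, 52, 30),
    "GPSLongitudeRef": "E",
}

# The condition tag ("pothole") is what a crowd volunteer would supply.
record = pothole_record(sample_gps, "pothole")
print(record)
```

Each uploaded picture would yield one such row, and the rows together form the first table.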

It is also possible to get data from public records about the political party in power and the contractor responsible for road building or repair at a particular latitude and longitude. Again, the task can be crowdsourced. Once this data is available, it will lead to a table such as the one below:

Tab 2

Imagine the table above with hundreds of thousands of records, covering pretty much all the roads in the city. It is easy to see the power of making such a data store publicly available, isn’t it?

One can draw all kinds of insights from it. For example: Is there a correlation between a certain political party and a certain contractor winning the bid? Is there a correlation between the condition of the roads and the political party in power? Is there a correlation between the condition of the roads and a certain contractor in charge of building or repairing them? The analysis can then be used by journalists and citizens to ask all kinds of questions.
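As a rough illustration of the kind of analysis described above, the sketch below joins tagged road conditions with public-record data and computes the share of pothole-affected locations per contractor and per party. All rows, party names and contractor names are invented for the example, and matching on exact coordinates is a simplifying assumption; real data would more likely be matched on road segments or rounded coordinates. A share per group is also only a first look, not a proper correlation test.

```python
from collections import defaultdict

# Made-up rows: crowdsourced road-condition tags per location.
sightings = [
    {"latitude": 19.075, "longitude": 72.875, "condition": "pothole"},
    {"latitude": 19.080, "longitude": 72.880, "condition": "good"},
    {"latitude": 19.090, "longitude": 72.890, "condition": "pothole"},
    {"latitude": 19.100, "longitude": 72.900, "condition": "good"},
]

# Made-up public-record data keyed by the same coordinates.
public_records = {
    (19.075, 72.875): {"party": "Party X", "contractor": "Contractor A"},
    (19.080, 72.880): {"party": "Party X", "contractor": "Contractor B"},
    (19.090, 72.890): {"party": "Party Y", "contractor": "Contractor A"},
    (19.100, 72.900): {"party": "Party Y", "contractor": "Contractor B"},
}


def join_records(sightings, public_records):
    """Attach party and contractor to each sighting with a matching location."""
    joined = []
    for sighting in sightings:
        key = (sighting["latitude"], sighting["longitude"])
        if key in public_records:
            joined.append({**sighting, **public_records[key]})
    return joined


def pothole_share(rows, field):
    """Fraction of tagged locations with potholes, grouped by `field`."""
    totals = defaultdict(int)
    potholes = defaultdict(int)
    for row in rows:
        totals[row[field]] += 1
        if row["condition"] == "pothole":
            potholes[row[field]] += 1
    return {group: potholes[group] / totals[group] for group in totals}


joined = join_records(sightings, public_records)
by_contractor = pothole_share(joined, "contractor")
by_party = pothole_share(joined, "party")
print(by_contractor)
print(by_party)
```

In this toy data the potholes track the contractor rather than the party, which is exactly the sort of pattern a journalist or citizen could then question.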

It is easy to add another dimension, a time-stamp (based on when the picture was taken), which will allow analyses such as how often a pothole appears in the same place. Citizens or the media can then ask, “Why do the potholes reappear, and why isn’t the contractor able to fix them for good?” or “Why has a particular contractor been re-selected despite a poor record?”
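One way to sketch the time-stamp analysis: group reports by rounded coordinates and count how many separate times a pothole shows up at each spot. The function name, the rounding precision and the 30-day gap used to separate one appearance from the next are all assumptions chosen for illustration, and the report dates are made up.

```python
from collections import defaultdict
from datetime import date


def count_appearances(reports, precision=3, min_gap_days=30):
    """Count separate pothole appearances per spot.

    Reports at the same rounded coordinates are treated as one spot.
    Reports more than `min_gap_days` apart count as a new appearance.
    """
    dates_by_spot = defaultdict(list)
    for report in reports:
        spot = (
            round(report["latitude"], precision),
            round(report["longitude"], precision),
        )
        dates_by_spot[spot].append(report["taken_on"])

    counts = {}
    for spot, dates in dates_by_spot.items():
        dates.sort()
        appearances = 1
        for earlier, later in zip(dates, dates[1:]):
            if (later - earlier).days > min_gap_days:
                appearances += 1
        counts[spot] = appearances
    return counts


# Made-up pothole reports; the first three are at (nearly) the same spot.
reports = [
    {"latitude": 19.0751, "longitude": 72.8751, "taken_on": date(2013, 7, 1)},
    {"latitude": 19.0749, "longitude": 72.8752, "taken_on": date(2013, 7, 5)},
    {"latitude": 19.0751, "longitude": 72.8751, "taken_on": date(2014, 7, 10)},
    {"latitude": 19.1000, "longitude": 72.9000, "taken_on": date(2013, 8, 1)},
]

counts = count_appearances(reports)
print(counts)
```

A spot with a count above one is a place where the pothole came back after a gap, which is the starting point for asking why the repair did not last.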

While corruption remains a deep-rooted problem that will not go away with some magic Big-Data-silver-bullet, every such effort that goes towards ensuring greater transparency is a step in the right direction.

To the extent that corruption feeds on information asymmetry, anything that reduces that asymmetry indirectly helps reduce both the avenues for corruption and its upside.

In Conclusion

The ability of Big Data to provide the public sector with hitherto unavailable and truly meaningful insights is real. The public sector has a lot of work to do in order to improve its effectiveness, particularly in developing countries that are often crippled by mass corruption. And so, the public sector has much to gain from Big Data. Crowdsourcing is a viable, and arguably the preferred, option for the public sector when it comes to sourcing the skills and manpower required to leverage Big Data.

What do you think about this article? Are there any other interesting applications that you can think of for leveraging Crowdsourcing and Big Data to improve public services? Please do share your thoughts, comments and feedback on
