Author: Jing He, Xiaohui Liu, Guangyan Huang, Michael Blumenstein, and Clement Leung
“Big Data” refers to the very high volume and complex information that requires new tools and capacities for storage and analysis. Although the concept of Big Data arose late in the digital age as computer power, storage, and data use expanded to a previously unimaginable extent, Big Data have always existed in the natural and predigital world. For example, archaeologists and geologists extract Big Data accumulated in DNA sequences and ice cores to study the evolution of the natural world over millions or billions of years. Cameras, personal computers, and CDs all capture Big Data. And if every Internet user is considered a sensor on the Internet, social media (e.g., Twitter, Facebook) can be considered a Big Data collection network.
In the 21st century researchers are constantly developing new methods for Big Data to help professionals in many fields better understand the world. Increasing literacy and use of information and communication technology worldwide are spurring exponential growth in the production and transmission of information, all of which has the potential for social and economic benefits through the application of Big Data technology.
In this article we present an assessment of current trends and future plans for Big Data development in five Commonwealth countries on five continents: Australia, Canada, India, South Africa, and the United Kingdom. The Commonwealth of Nations is an intergovernmental organization of 53 states that were mostly territories of the former British Empire. Member states have no legal obligation to each other but are united by language, history, culture, and shared values of democracy, human rights, and the rule of law. The Commonwealth covers almost a quarter of the world’s land mass and includes almost a third of the world’s population, and in 2012 its nominal gross domestic product (GDP) was $9.8 trillion,1 15 percent of world GDP.
We begin by characterizing Big Data and their value, and then consider the current and future development of Big Data in the selected countries, looking particularly at the role of government, industry, and education. We then turn our attention to overarching considerations, including concerns to be addressed to ensure the realization of the benefits of Big Data.
Characterizing Big Data
A “4Vs” management model, which adds veracity to the three dimensions originally described by Laney (2001), has been proposed for Big Data to handle increasing volume (amount of data), velocity (speed of data), variety (of data types), and veracity (accuracy of data).
One way to comprehend the potential magnitude of Big Data is the output of the Square Kilometer Array (SKA; www.skatelescope.com), a distributed radio telescope involving hundreds of dishes and hundreds of thousands of antennae connected by optical fiber and colocated in the United Kingdom, South Africa, and Australia. It is designed to produce data equivalent to 100 times global Internet traffic, and to process that volume the SKA supercomputer will need to perform 10^18 operations per second, equivalent to the number of stars in three million Milky Way galaxies (BIS 2013, p. 16)!
The challenge of Big Data arises not only from their volume but also their variety—Big Data such as social media interactions, rich media files, and geospatial information are both structured and unstructured, which exacerbates demands on infrastructure.
The vast majority of new data being generated are unstructured, so the ability to handle large unstructured volumes is vital. For example, the development of the Internet of Things could mean that by 2020 sensor data will be created continuously by up to 50 billion connected devices across the globe, vastly increasing the amount, variety, and complexity of data available for analysis (Shilovitsky 2013).
Data are the oil of the 21st century, but unlike oil there is no danger of a shortage (Shilovitsky 2013). Each day 2.5 billion gigabytes of data are created—enough to fill more than 27,000 iPads per minute (BIS 2013). In addition to experimental modelling and simulation, satellite imagery, road traffic cameras, video surveillance, online banking and sales transactions, and healthcare monitoring systems all contribute to Big Data continuously with relentless velocity.
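The iPad comparison can be checked with simple arithmetic. The sketch below assumes a 64 GB device, a capacity that is not stated in the source and is chosen here only because it makes the two figures agree.

```python
# Rough check of the "27,000 iPads per minute" comparison.
GB_PER_DAY = 2.5e9           # 2.5 billion gigabytes per day (BIS 2013)
MINUTES_PER_DAY = 24 * 60
IPAD_CAPACITY_GB = 64        # assumption: capacity is not given in the source


def ipads_filled_per_minute(gb_per_day=GB_PER_DAY, ipad_gb=IPAD_CAPACITY_GB):
    """Return how many devices of the given capacity the daily volume fills each minute."""
    gb_per_minute = gb_per_day / MINUTES_PER_DAY
    return gb_per_minute / ipad_gb


if __name__ == "__main__":
    print(f"{ipads_filled_per_minute():,.0f} iPads per minute")
```

Under that assumption the result is a little over 27,000 devices per minute, in line with the figure quoted above.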
Data accuracy, or veracity, represents a major challenge. Data cleaning to address contamination and inaccuracy is always essential, and these problems are compounded and amplified with Big Data. In India’s 2014 general election, for example, some voters were listed as 19,545 years old and others zero years old. Identical voter names (327,000 women named Sita live in Bihar alone) further complicated the process. Yet, paradoxically, it was prime ministerial candidate Narendra Modi’s combination of technologies (Big Data analytics and social media) that separated him from the other candidates (Pansare 2014). His electoral success has been regarded as the success of Big Data technology.
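As an illustration of the kind of cleaning such records call for, the following minimal sketch flags implausible ages and names shared by several records. The record layout, the sample rows, and the upper age bound are hypothetical and are not taken from the electoral roll itself.

```python
from collections import Counter

MIN_VOTING_AGE = 18       # assumed lower bound for a valid voter record
MAX_PLAUSIBLE_AGE = 120   # assumed upper bound, chosen for illustration


def audit_voter_roll(records):
    """Return (records with implausible ages, names shared by more than one record)."""
    bad_age = [
        r for r in records
        if not MIN_VOTING_AGE <= r["age"] <= MAX_PLAUSIBLE_AGE
    ]
    # Count records per (name, state); a name alone cannot identify a voter.
    counts = Counter((r["name"], r["state"]) for r in records)
    shared_names = {key: n for key, n in counts.items() if n > 1}
    return bad_age, shared_names


# Hypothetical sample rows echoing the problems described in the text.
sample = [
    {"id": 1, "name": "Sita", "state": "Bihar", "age": 34},
    {"id": 2, "name": "Sita", "state": "Bihar", "age": 19545},
    {"id": 3, "name": "Sita", "state": "Bihar", "age": 0},
    {"id": 4, "name": "Ravi", "state": "Bihar", "age": 52},
]

bad_age, shared_names = audit_voter_roll(sample)
print([r["id"] for r in bad_age])
print(shared_names)
```

Flagged records would still need review against other identifiers, since a shared name by itself does not indicate an error.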
The Value of Big Data
Big Data have the potential to increase the wealth of both individuals and businesses as well as government efficiency, industry competitiveness and profitability, and the exploratory power and effectiveness of research. These opportunities require sustained development and investment in new tools and techniques for data analytics, but once achieved will help to reduce costs and improve services and marketing.
Efficient Big Data analysis reduces the cost of computing. When the human genome was first fully sequenced in 2000, the project cost over $850 million, whereas with developments in health and information technology (IT) it will soon be possible to sequence a human genome for less than $2,000, revolutionizing the data available to researchers and the care of patients (BIS 2013).
Improved Service Targeting
Australia’s personal eHealth record system involves a personally controlled, secure online summary of individual health data. The system allows doctors, hospitals, and other healthcare providers to view and share a patient’s health information to provide the best possible care.
Marketing to Consumer Preferences
With Big Data analytics it is possible to streamline and simplify consumers’ purchases based on information about their habits and previous use of goods and services. For example, a leading Indian provider of Big Data and analytics solutions, Mu Sigma, has used video cameras in shopping centers to detect consumers’ purchase intentions, actions, and satisfaction and, based on analysis of the resulting data, provide customers with targeted offers and the best deals (Viswanathan and Nitin 2014).
The Present and Future of Big Data in the Commonwealth
Many Commonwealth countries are positioning themselves to take advantage of Big Data opportunities. We review Big Data practices in Australia, Canada, India, and the United Kingdom—the Commonwealth’s four largest economies—as well as South Africa to extend our review to countries on five different continents and give a more complete picture.
The United Kingdom is investing heavily in Big Data research and development (R&D), India’s new Fourth Paradigm Institute focuses on large-scale modelling and computation, and Canada has adopted an industrial strategy in which Big Data are a key element. In the fast-developing countries of India and South Africa service industries contribute strongly to the economy, and Big Data offer huge economic benefits to the service sector.
In the following sections we focus on three broad areas for Big Data applications: government, industry, and education and training. The status of each country in these areas varies considerably; we present only highlights and major factors as applicable.
Government
Governments are both key drivers and beneficiaries of Big Data capability, hence government strategies are critical for national Big Data development. Governments can also broadcast the importance of Big Data, an important role as they need business leaders to be aware of the potential benefits of Big Data analytics as well as citizens who understand how data can be interpreted and used.
As technology continues to develop apace, governments must invest heavily in R&D, infrastructure, and skilled people. By investing in and implementing Big Data development, training, and technologies, government service efficiency can be transformed to deliver better outcomes for citizens and better value for money for taxpayers.
The United Kingdom led Commonwealth countries in releasing a new Big Data strategy in 2013 that aims to build the country’s capability to exploit data for the benefit of citizens, business, and academia, and the UK Department for Business, Innovation, and Skills (BIS) announced investment of $324 million to develop Big Data technology (BIS 2013). The government’s Information Economy Council and E-infrastructure Leadership Council will oversee implementation of the strategy and develop plans to support this vision.
In 2009 the British Ministry of Finance registered an official Twitter account and began publishing all government spending data online (BIS 2013). Through the Economic and Social Research Council the UK government has provided significant funding for research on deidentified administrative data, routinely collected by government departments and other agencies, as well as other public or business datasets (ESRC 2013). And in the 2014 budget the government announced a further investment of $72 million to set up the Alan Turing Institute to help ensure that the United Kingdom leads the way in Big Data research.
Canada has many attributes that make it a plausible leader in this area: good infrastructure, cheap and clean energy, widespread broadband facilities, a stable political environment, commitment to privacy preservation, preferential policies for Big Data industries, and a mostly cool climate that allows cheap natural cooling for energy-hungry data centers. In addition, the government began implementing a Big Data strategy in 2007, and in 2012 unveiled the joint IBM Canada Leadership Data Centre, one of the nation’s most advanced computing facilities. It boasts the latest advances in energy-efficient data center management and aims to help organizations efficiently manage growth while reducing costs and mitigating risk.
Australia’s economy is heavily reliant on mining, agriculture, and education, fields that require sophisticated IT infrastructure and can benefit substantially from Big Data analytics. The country is an open data “trendsetter,” together with Canada, France, the United Kingdom, and the United States.
In 2012 the Australian Government Information Management Office released a strategy describing the productivity benefits of Big Data analytics, such as more accurate prediction of policy outcomes and improved targeting of key services for citizens (AGIMO 2012). Government agencies rolled out 93,000 terabytes of storage between 2008 and 2012 to cope with increasing data production. Australian decision makers will soon analyze Big Data to “model different policy options and more accurately predict the outcomes of policies before they are implemented and use this information to inform and improve the policy development process” (AGDFD 2013, p. 11). For example, a Big Data Working Group will work in conjunction with the Australian Tax Office’s Data Analytics Centre of Excellence to help government agencies take advantage of Big Data (AGIMO 2012).
South Africa currently lacks the infrastructure and other prerequisites for the development of Big Data. Massive investment in communication equipment is needed, and the country is deficient in skilled IT staff and licensed software.
Notwithstanding South Africa’s disadvantages, its companies and other enterprises, from health insurance to financial services, are beginning to recognize Big Data opportunities. Some industry executives predict that Big Data will bring revolutionary changes to the country.
Access to storage power, computing power, and connectivity is an important prerequisite for Big Data analytics, as is the capability to store data safely but accessibly. India has the lowest average Internet speed in the Asia-Pacific region (Press Trust of India 2014), but the newly elected government has pledged to focus on electronics manufacturing and on extending broadband connectivity across India (The Hindu 2014). India’s Open Government Data Platform (data.gov.in) is intended to be used by government ministries and departments to publish datasets, documents, services, tools, and applications for public use.
Industry
The UK Centre for Economics and Business Research estimates that the Big Data marketplace could create 58,000 new jobs in the United Kingdom between 2012 and 2017 (Cebr 2012), and a recent report estimated that the direct value of public sector information to the UK economy is around $3.1 billion per annum, with wider social and economic benefits bringing this to around $11.7 billion (BIS 2013). The government actively promotes the UK data storage market overseas and encourages the exploitation of Big Data in key industrial sectors.
A recent survey of major Canadian financial services, retail, telecommunication, and utility companies showed that fewer than half were engaged in data analysis, a quarter were just beginning to think about Big Data, and 15 percent had no Big Data plans (Wallis 2012). This technological lag has major economic implications: systemic underinvestment in IT is a major reason why Canada’s GDP per capita is nearly $10,000 less than that of the United States (Wallis 2012).
At present, the companies taking advantage of Big Data opportunities in Australia tend to be foreign-owned rather than domestic. Few Australian Big Data companies are providing data discovery tools to customers and businesses around the world. Australian businesses are largely focused on improving data center economics. To achieve this, many are considering virtualization-led consolidation to migrate 3–5 physical sites into 1–2 sites, a transition that may involve new data center construction (AGDFD 2013). The Commonwealth Bank and retail giants Coles and Woolworths are pioneering the Big Data movement in Australian industry (Macaskill 2013).
India has a booming service industry that includes major IT outsourcing companies. The country’s Big Data industry is predicted to grow in value from $200 million in 2012 to $1 billion in 2015 (Economic Times 2012).
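The predicted growth can be expressed as a compound annual growth rate. The short sketch below uses the standard formula; the function name is illustrative and the three-year horizon is simply 2015 minus 2012.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Standard CAGR formula: (end / start) ** (1 / years) - 1."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the text: $200 million in 2012 to $1 billion in 2015.
india_cagr = compound_annual_growth_rate(200e6, 1e9, 2015 - 2012)
print(f"Implied growth rate: {india_cagr:.1%} per year")
```

A fivefold increase over three years implies growth of roughly 71 percent per year.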
South Africa’s challenges, such as inadequate technological infrastructure and economic and human resource scarcity, exacerbate concerns associated with Big Data such as privacy, imperfect methodology, and interoperability. Significant development of Big Data in South Africa will take considerable time. Nevertheless, positive signs exist. In January 2014 IBM announced that it would cooperate with a leading South African bank to set up a forecasting system that integrates real-time data collected from social media to create more convenient customer information (IBM 2014). The company expects that seamless docking of Big Data between banks and social media can be realized in the near future.
Education and Training
In the development and use of Big Data, the overall quality of a country’s skilled workforce is crucial. Moreover, citizens should have an understanding of data, including how they are created and stored, and confidence in their use and in individuals’ privacy rights.
Many students do not understand the career paths and benefits of a career in data analytics, partly because career routes are less well established than for other professions, such as law or medicine, for which information is more readily available and skillsets do not change as rapidly. Therefore, Big Data education should be enhanced at all education levels. Schoolchildren must be equipped with basic mathematics and analytical skills, the wider workforce must be informed of developments in data use, and doctoral students working at the cutting edge of data analytics must be funded to continue to expand Big Data’s horizons.
India’s strong education culture will benefit its Big Data development. The country is already producing skilled professional data analysts who are highly valued on the world market (Agarwal and Nisa 2009). And there is evidence that such academic adaptations are taking place in other countries. In the United Kingdom, the government is reforming school curricula on computing and mathematics (BIS 2013), and the introduction of an A-level course on data analytics is being recommended by the Royal Statistical Society (Porkess 2013) to lay solid foundations for the data analysts of the future. In Canada Simon Fraser University recently created the country’s first master’s degree program to provide students with skills in advanced Big Data analysis. And 20 major universities in Australia now offer postgraduate database/data mining courses; RMIT University, for example, offers a specific Big Data course.
The United Kingdom faces a shortage of human capital in Big Data analytics (BIS 2013), as does Canada, where the IT industry will be short at least 100,000 skilled workers by 2016 (Pilieci 2012). Australia, on the other hand, has become one of the three largest immigration targets for Big Data professionals. And in India, both the relative low cost and the quality of talent are key factors in the success of the country’s data analysis professionals.
Big Data as a Multidisciplinary Field
The greatest potential for innovative use of Big Data lies at the boundaries of disciplines. Big Data algorithms can often be applied to multiple areas. For example, a UK company, McLaren Electronic Systems, is using real-time data system expertise, developed from Formula One, to help Birmingham Children’s Hospital improve the monitoring of children in intensive care (BIS 2013).
Use of Big Data in Research
Despite the many advantages of Big Data, it is important to note that the sample bias problem remains: a large amount of data cannot guarantee accurate results. In addition, choosing different analytics algorithms in Big Data will affect results; the algorithm is objective but determining its parameters is subjective. Different applications drive various algorithms for analyzing the data from different viewpoints.
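The sample bias point can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is synthetic (normally distributed values), and the bias mechanism, which records only values above a threshold, is invented for illustration.

```python
import random


def compare_samples(seed=0, population_size=200_000, small_n=1_000):
    """Compare a very large biased sample with a small random one."""
    rng = random.Random(seed)
    # Synthetic population: mean 50, standard deviation 10 (assumed values).
    population = [rng.gauss(50, 10) for _ in range(population_size)]
    true_mean = sum(population) / len(population)

    # Large but biased sample: only values above 45 are ever recorded.
    biased_large = [x for x in population if x > 45]
    # Small but unbiased sample: simple random sampling.
    random_small = rng.sample(population, small_n)

    return {
        "true_mean": true_mean,
        "biased_n": len(biased_large),
        "biased_error": abs(sum(biased_large) / len(biased_large) - true_mean),
        "random_n": len(random_small),
        "random_error": abs(sum(random_small) / len(random_small) - true_mean),
    }


result = compare_samples()
for key, value in result.items():
    print(f"{key}: {value:,.2f}")
```

Although the biased sample is far larger than the random one, its estimate of the mean is further from the true value, because collecting more data does not correct a flaw in how the data were selected.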
Unlike many other forms of scientific enquiry, Big Data analytics may not begin with any hypothesis. Different analysts may come up with different hypotheses and approaches. Data analysis is thus a creative process that is not always exactly repeatable, and this is one of the biggest challenges in Big Data analytics. Furthermore, data do not always speak for themselves, and common sense and domain-specific knowledge are often required when analyzing Big Data.
Energy-Efficient Computing for Big Data
With rising energy prices and global action to reduce carbon emissions, energy-efficient computing—the use of smart algorithms to get the same result with less effort—is a key technology. Making good sense of data is challenging at the best of times, and analysis of Big Data is often considered more art than science. “Trial and error” is the norm, thus requiring extensive processing to achieve useful results. Enhanced and more efficient computing capacity is also needed because traditional low-throughput analysis cannot cope with Big Data—it is actually preventing or dramatically slowing the discovery of valuable business intelligence, cures to serious diseases, and great scientific insights.
In addition to country-specific challenges described above, the following concerns about Big Data need to be addressed:
The growth in use of Big Data and associated discovery tools is expected to continue as industries seek better, more cost-effective ways to derive meaning from their Big Data. Many new data mining tools provide immense opportunities not only to manage the growing volume, variety, and velocity of data but also to turn the data into value.
Whatever a country’s current Big Data capacity and performance, it must develop and plan for the future to maintain international competitiveness and build capability to realize benefits for its citizens, business, academia, and government. While Big Data generate new benefits, they need to be considered in their social, economic, and political context. Countries must learn how to handle the volume of Big Data and bring different datasets together, combine them synergistically, make more data open but safe, and gain new knowledge and insight through continuing data analytics R&D. They must also balance the drive for innovation and growth with the rights and privacy of the individual. Cooperation between Commonwealth and other countries could facilitate advances in Big Data collection, research, analysis, management, and other activities.
This paper has provided an overview of the Big Data landscape, with particular reference to five Commonwealth countries, and of the challenges compelling government and industry to look for new Big Data solutions. Open access to Big Data and analytics can promote progress and civilization, making learning and information more widely available than ever before. With appropriate measures and privacy protection in place, Big Data capability can protect civil liberties while allowing for economic growth and innovation. Consumers benefit when they can access and use information about them held by government, utility companies, banks, and other major service providers.
The world is experiencing a Big Data revolution that will unlock undreamt-of potential for humankind.
Agarwal R, Nisa S. 2009. Knowledge process outsourcing: India’s emergence as a global leader. CCSE Asian Social Science 5(1):82–92.
AGDFD [Australian Government Department of Finance and Deregulation]. 2013. Big Data Strategy – Issues Paper. Parkes ACT: Commonwealth of Australia. Available at www.finance.gov.au/files/2013/03/Big-Data-Strategy-Issues-Paper1.pdf.
AGIMO [Australian Government Information Management Office]. 2013. Australian Government ICT Expenditure, 2008-09 to 2011-12 Report. Parkes ACT: Commonwealth of Australia. Available at www.finance.gov.au/files/2012/04/Australian-Government-ICT-Expenditure-Report-2008-09-to-2011-12.pdf.
BIS [Department for Business, Innovation, and Skills]. 2013. Seizing the data opportunity: A strategy for UK data capability. London: HM Government. Available at https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/254136/bis-13-1250-strategy-for-uk-data-capability-v4.pdf.
Cebr [Centre for Economics and Business Research]. 2012. Data Equity: Unlocking the Value of Big Data. Report for SAS, April. London.
Economic Times. 2012. Big data market in India to touch $1 billion: Report, September 5. Available at http://articles.economictimes.indiatimes.com/2012-09-05/news/33616176_1_big-data-wipro-bpo-indian-bpo.
ESRC [Economic and Social Research Council]. 2013. Big data investment: Capital funding. Announcement, April 9. Available at www.esrc.ac.uk/news-and-events/announcements/25683/Big_Data_Investment_Capital_funding_.aspx.
Hindu. 2014. Centre keen on bringing broadband connectivity to entire rural India. July 2. Available at www.thehindu.com/todays-paper/centre-keen-on-bringing-broadband-connectivity-to-entire-rural-india/article6168139.ece.
IBM. 2014. IBM supports economic growth opportunities in South Africa. IBM news release, March 31. Available at www.ibm.com/news/za/en/.
Laney D. 2001. 3D Data Management: Controlling Data Volume, Velocity, and Variety. Stamford, CT: MetaGroup.
Macaskill A. 2013. Big data: Big hype or big hope? ZDNet.com, October 1. Available at www.zdnet.com/big-data-big-hype-or-big-hope-7000021246/.
Pansare P. 2014. How social media played game changer for election 2014 in the world’s largest democracy. Dnaindia, June 11. Available at www.dnaindia.com/blogs/post-how-social-media-played-game-changer-for-election-2014-in-the-world-s-largest-democracy-1994841.
Pilieci V. 2012. IT shortage to hit 100,000 by 2016: IBM report finds Canada lacks skills in fastest-growing areas of industry. Canada.com, December 5. Available at http://o.canada.com/technology/techbiz/it-shortage-to-hit-100000-by-2016-ibm-report-finds-canada-lacks-skills-in-fastest-growing-areas-of-industry.
Porkess R. 2013. A World Full of Data: Statistics Opportunities across A Level Subjects. London: Royal Statistical Society and the Institute and Faculty of Actuaries. Available at www.rss.org.uk/uploadedfiles/userfiles/files/A-world-full-of-data.pdf.
Press Trust of India. 2014. India has lowest average Internet speed in Asia-Pacific: Akamai. Available at http://gadgets.ndtv.com/internet/news/india-has-lowest-average-internet-speed-in-asia-pacific-akamai-513785.
Shilovitsky O. 2013. How PLM can discover “data opportunity.” Beyond PLM, November 19. Available at http://beyondplm.com/2013/11/19/how-plm-can-discover-data-opportunity/.
Vaughan JM. 2006. Attrition through enforcement: A cost-effective strategy to shrink the illegal population. CIS Backgrounder. Washington: Center for Immigration Studies. Available at www.cis.org/Enforcement-IllegalPopulation.
Viswanathan AG, Nitin A. 2014. MuView: The Enterprise of Things (EoT) and decision flow operationalization: Next frontier for retailers. Bangalore: Mu Sigma. Available at www.mu-sigma.com/analytics/thought_leadership/decision-sciences-the-enterprise-of-things-eot-and-decision-flow-operationalization-next-frontier-for-retailers.html.
Wallis N. 2012. Big Data in Canada: Challenging Complacency for Competitive Advantage. White Paper, sponsored by SAS. Toronto: International Data Corporation (IDC). Available at www.sas.com/offices/NA/canada/lp/Big-Data-Survey2012/IDC-Big-Data-Survey-White-PaperEN.pdf.
1 Here and throughout, dollar amounts are in US dollars.