This short video provides an overview of the book and a precis of each chapter, with written descriptions below along with links to full PDF versions of the preface and Chapters 1 and 4.
Preface [to read a PDF of the preface click here]
Chapter 1: Conceptualising data [to read a PDF of the chapter click here]
To supply an initial conceptual platform for the book as a whole, this chapter examines the forms, nature and philosophical bases of data. It provides an overview of different kinds of data and charts how data fit into the hierarchy of data, information, knowledge and wisdom. This is followed by a discussion of how data are framed technically, ethically, politically and economically, spatially and temporally, and philosophically, and how to make theoretical sense of databases and data infrastructures. The discussion highlights how data, far from being simple building blocks, do not exist independently of the ideas, instruments, practices, contexts and knowledges used to generate, process and analyze them. In the final section, a case is made for conceptualising data as the products of complex socio-technical assemblages, and for a more nuanced analysis of the unfolding data revolution than much of the open and big data literature presently demonstrates.
Key words: data, information, knowledge, wisdom, ontology, philosophy, critical data studies, socio-technical assemblage, ethics, politics
Chapter 2: Small data, data infrastructures and data brokers
This chapter examines the differences between small data and big data, and the fate of small data in an era of big data. It argues that small data will continue to be an important component of the research landscape due to their utility in answering targeted queries. However, small data will increasingly be made more big data-like through the development of new data infrastructures that pool, scale and link small data in order to create larger datasets, encourage sharing and re-use, and open them up to combination with big data and analysis using big data analytics. The chapter sets out the forms and nature of such data infrastructures, the rationale behind their creation, and the challenges in building them. In the final section, the data infrastructures of commercial data brokers and the formation of new multi-billion dollar data markets are discussed.
Key words: small data, data infrastructure, data brokers, data market, rationale, challenges, sharing, re-use
Chapter 3: Open and linked data
Given the expense and resources required to produce data and their value in revealing information about the world, access to them has generally been restricted to approved users or those willing to pay. This position has recently come under sustained critique through the open data movement that seeks to position data as a common public good that is freely accessible, and linked data advocates who desire data to be accessible across the internet in non-proprietary, machine-readable formats. This chapter examines the characteristics of open and linked data, the various ways in which the case is being made for opening data, and the economics associated with making data open. In the final section, the case for opening data is critiqued, not with the intention of advocating data be kept locked in institutions, but rather to expose the politics and problems of opening data so that they can be countered.
Key words: open data, linked data, economics, rationale, critique, neoliberalism, sustainability, empowerment
Chapter 4: Big data [to read a PDF of the chapter click here]
Prior to 2008 few people had heard of ‘big data’. Five years later it had become a somewhat ubiquitous buzzword. Given its rapid development and deployment there are on-going debates as to what constitutes big data and its associated characteristics. This chapter explores the ontology of big data. It is argued that big data has seven essential characteristics (volume, velocity, variety, exhaustivity, resolution/indexicality, relationality, and flexibility/scalability) that make it qualitatively different from previous forms of data. Each of these characteristics is discussed in detail.
Key words: big data, volume, exhaustivity, resolution, indexicality, relationality, velocity, variety, flexibility
Chapter 5: Enablers and sources of big data
The production of big data is the result of a convergence of technological developments. In this chapter, these big data enablers — including fixed and mobile internet, the embedding of software into all kinds of objects, machines and systems, ubiquitous computing, advances in database design and systems of information management, distributed and forever storage of data, and new forms of data analytics designed to cope with data abundance as opposed to data scarcity — are discussed. This is followed by detailing the three main sources of big data: directed surveillance and data capture; automated forms of data production, including digital devices, sensors, scanners, and interactions; and volunteered data generation including transactions, social media, sousveillance, crowdsourcing and citizen science.
Key words: internet, ubiquitous computing, NoSQL databases, data analytics, surveillance, automation, prosumption, sensors, scanners, social media, crowdsourcing, citizen science
Chapter 6: Data analytics
Data are not useful in and of themselves. They only have utility if meaning and value can be extracted from them. Making sense of scaled small data and big data poses new challenges, such as linking together varied datasets to gain new insights and coping with data abundance, dynamism, messiness and uncertainty, and the fact that many of them are generated with no specific question in mind or are a by-product of another activity. The solution has been the development of new data analytics, rooted in research on artificial intelligence and expert systems, that employ machine learning techniques to computationally and automatically mine and detect patterns and build predictive models. This chapter discusses machine learning and four broad classes of analytics: data mining and pattern recognition; data visualization and visual analytics; statistical analysis; and prediction, simulation, and optimization.
Key words: pre-analytics, machine learning, data mining, pattern recognition, visualisation, visual analytics, statistical analysis, prediction, simulation, optimization
Chapter 7: The governmental and business rationale for big data
The data revolution is not unfolding in a non-ideological, passive manner. Like all revolutions, it is being driven by a powerful set of discourses, forwarded by passionate believers in the benefits of new ways of knowing and acting in the world and an alliance of vested interests who gain from its unfolding. This chapter discusses the arguments made by the proponents of big data, especially those of the business community and government, with respect to four major tasks: governing people, managing organisations, leveraging value and producing capital, and creating better places. It is contended that such rhetoric needs to be subjected to critical scrutiny to tease apart the various suggested benefits and to reveal the diverse ways in which big data impact society and economy.
Key words: governance, management, value, capital, smart cities
Chapter 8: The reframing of science, social science and humanities research
This chapter explores the potential effects on academic research of big data, data infrastructures and open data. In particular, it examines how the availability of big data and data infrastructures, coupled with new analytic tools, challenges established epistemologies in different disciplines — how questions are asked and how they are answered — and is leading to the creation of new fields and disciplines and paradigm shifts across the principal domains of the academy. The chapter first discusses the sciences, a potential new form of empiricism which rejoices in the notion that big data analytics is ushering in ‘the end of theory’, and the creation of data-driven rather than knowledge-driven science. It then turns to the development of digital humanities and computational social sciences, both of which propose radically different ways to make sense of culture, history, economy and society than established approaches. It concludes that there is a need for wider critical reflection on the epistemological implications of the data revolution.
Key words: epistemology, philosophy, empiricism, fourth paradigm, end of theory, data-driven science, computational social sciences, digital humanities
Chapter 9: Technical and organisational issues
Whilst scaled small data and big data offer opportunities for measuring and understanding the world in new, productive ways, they also pose a range of technical and organisational issues. This chapter examines these issues in detail, including the scope of datasets, access to data, the quality of data and associated metadata, data integration and interoperability, the misapplication of analytics and ecological fallacies, and skills and organisational capabilities and capacities. Whilst some of these issues can be tackled through technical and management solutions, others are more intractable and difficult to address and pose significant challenges to be overcome.
Key words: data deluge, access, data quality, veracity, lineage, integration, interoperability, ecological fallacies, human resourcing, skills
Chapter 10: Ethical, political, social and legal concerns
Whilst data can be produced with the aim of making societies more secure, safer, competitive, productive, efficient, transparent and accountable, they often achieve this through processes that somewhat paradoxically monitor and discipline people and which can affect their life chances. Consequently, the generation of data and the work these data do are inherently infused with ethical, social, political and legal concerns. This chapter examines such concerns, including dataveillance and data footprints and shadows, privacy, data security, profiling, social sorting and redlining, control creep, anticipatory governance, technocratic and corporate governance and technological lock-ins, and ownership and intellectual property. How each of these issues is thought about is contested, with views varying within and between science, companies, government and civil society, who have differing agendas, vested interests, and political sensibilities. The discussion thus highlights how there are no easy answers to resolving these issues, and resolutions always consist of compromises.
Key words: ethics, law, data shadows, dataveillance, privacy, data security, profiling, social sorting, control creep, anticipatory governance, technological lock-ins
Chapter 11: Making sense of the data revolution
The final chapter sets out an indicative road map to making sense of data and the data revolution given present gaps in conceptual thought and knowledge. It argues that such sense making needs to occur in two related ways: first, through philosophical reflection and synoptic, conceptual and critical analysis; second, through detailed empirical research concerning the genesis, constitution, functioning and evolution of data assemblages. The chapter concludes with some final thoughts on the data revolution.
Key words: data assemblages, data revolution, ontology, epistemology, philosophy, methodology, normative thinking, genealogy, ethnography