Cline Center Event Data White Papers
Coups d’état are important events in the life of a country. They constitute a significant subset of irregular transfers of political power that can have enduring consequences for a country’s well-being. Even so, a comprehensive and well-documented inventory of coups has yet to be compiled. The Coup D’état Project (CDP) is an effort to fill that void. It is a two-stage project, and the second stage is still in progress. The first stage involved compiling and documenting existing information on coups from various sources, as well as generating quantitative data on each coup. This white paper provides the succinct definition of a coup d’état used by the CDP, an overview of the typology that was developed to differentiate among coup types, and an outline of the quantitative data collected for each coup. Reliability checks for the coup data collected are reported in Appendix I.
Download Report. (Length: 12 pages / Size: 102K)
(Updated: 3:46 PM 9/17/2013)
White Papers Regarding the Social, Political, and Economic Event Database (SPEED) Project
This document provides an introduction to, and an overview of, the SPEED Project's Societal Stability Protocol (SSP). The SSP's aim is to generate event data that will advance our understanding of civil unrest in the post-WWII era. The SSP's focus is on human-initiated destabilizing events, which are defined as happenings that unsettle the routines and expectations of citizens, cause them to be fearful, and raise their anxiety about the future. The SSP's destabilizing event ontology contains four Tier 1 categories (political expression events, politically motivated attacks, disruptive state acts, and political power reconfigurations). Because of the enormous variations that exist across and within these broad categories, advancing our understanding of civil unrest requires a good deal of event-specific information (who, what, where, when, how, why, etc.). The SSP was created to collect this information and the purpose of this paper is to provide an overview of its design and structure.
Download Report. (Length: 52 pages / Size: 569K)
(Updated: 12:01 PM 6/4/2012)
SPEED is a technology-intensive effort to collect a comprehensive body of global event data for the post-WWII era. It is a protocol-driven system that was designed to provide insights into key behavioral patterns and relationships that are valid across countries and over time. SPEED's Societal Stability Protocol has been the focus of most developmental work to date. A number of highly regarded event data projects throughout the world have also been designed to shed light on societal stability; this document compares SPEED with several of the more prominent ones. It is organized into two sections. The first describes SPEED's distinctive features: its global news archive, the comprehensiveness of its event ontology, its search technologies, the richness of the information collected on individual events, and its training and quality control capacities. The second section compares SPEED with other event projects.
Download Report. (Length: 31 pages / Size: 374K)
(Updated: 12:34 PM 6/4/2012)
Creating a valid and reliable body of event data requires meeting a number of challenges (clearly defining the events to be studied, developing reliable sources of information on those events, identifying source documents with relevant information, etc.). The fact that most event data projects, including SPEED, use news reports as the source of information on events generates an additional set of challenges. Some of the most important of these are cognitive challenges involved in transforming textual information contained in news reports into event-centered information. This document outlines these challenges and how they are addressed within the SPEED project.
Download Report. (Length: 11 pages / Size: 231K)
(Updated: 12:16 PM 2/21/2012)
This white paper offers a brief introduction to the BIN system of the Social, Political and Economic Event Database (SPEED) project. BIN provides automatic document categorization of highly nuanced topics across massive-scale document archives. The BIN system allows a group of trained human editors to present the computer with a relatively small collection of hand-categorized documents representing a given topic. It uses the semantic characteristics of these documents to develop a statistical model that is capable of identifying other documents on that same topic from the Cline Center global news archive, which contains tens of millions of news reports. Tests have shown that BIN has a false negative rate (the share of relevant documents incorrectly discarded) of 1-4%. This paper outlines the basic premise and motivation behind BIN, its development, and its application to the SPEED project.
Download Report. (Length: 12 pages / Size: 306K)
(Updated: 11:56 AM 3/9/2012)
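The false negative rate quoted for BIN can be illustrated with a small calculation. The sketch below is not taken from the white paper. It assumes the conventional definition (relevant documents discarded, divided by all relevant documents), and the function name and sample counts are hypothetical.

```python
def false_negative_rate(relevant_kept: int, relevant_discarded: int) -> float:
    """Return the share of truly relevant documents that were discarded."""
    total_relevant = relevant_kept + relevant_discarded
    if total_relevant == 0:
        raise ValueError("no relevant documents in the evaluation sample")
    return relevant_discarded / total_relevant


# Hypothetical evaluation sample: 1,000 relevant documents, 30 of them discarded.
rate = false_negative_rate(relevant_kept=970, relevant_discarded=30)
print(f"{rate:.1%}")  # prints "3.0%"
```

A rate in the reported 1-4% range means that, out of every 100 relevant news reports in a test sample, between 1 and 4 were filtered out.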
This document is designed to provide operators of the EXTRACT suite of programs with an accessible guide to the definition and meaning of events intended to be captured in the Societal Stability Protocol (SSP) within the Social, Political and Economic Event Database (SPEED) project. It is a companion document to "The SPEED Project's Societal Stability Protocol: An Introduction for Operators of the EXTRACT Suite of Programs." Creating an archive of reliable event data using a large number of operators over an extended period of time requires that operators employ shared meanings of the events. This document is intended to provide the basis for that shared meaning.
Download Report. (Length: 48 pages / Size: 325K)
(Updated: 1:24 PM 6/4/2012)
Destabilizing events - whether they are political expression events, politically motivated attacks, disruptive state acts, or some other manifestation of discontent - can vary enormously in their intensity. It is important to capture differences in intensity because they can affect the impact of seemingly similar events or the reactions of others to those events. The SPEED project's Societal Stability Protocol captures a great deal of information on what can be considered "intensity indicators." These indicators include such things as the type of weapons employed, the number of protesters, the number of people killed/injured, and the number of people arrested. Developing composite measures of intensity is complicated because different sets of intensity indicators are relevant for different types of events. This document reports the procedures that were used to derive intensity measures for the different categories of destabilizing events recognized in SPEED's Societal Stability Protocol.
Download Report. (Length: 17 pages / Size: 163K)
(Updated: 1:21 PM 6/4/2012)
Destabilizing events such as those captured by SPEED's Societal Stability Protocol (SSP) - protests, politically motivated attacks, disruptive state acts, mass movements of people, irregular transfers of political power - do not happen in a vacuum. Rather, most are rooted in something. Developing the capacity to identify the origins of destabilizing events can potentially lead to important advances in our understanding of civil unrest. It can also broaden the utility of event data and greatly enhance their explanatory potential. This document outlines the definitions and rationale for the Event Origins fields in the SPEED Societal Stability Protocol.
Download Report. (Length: 36 pages / Size: 183K)
(Updated: 1:19 PM 6/4/2012)
Extracting information from global news reports in the post-WWII era makes it possible to capitalize on the billions of dollars that have been invested in reporting on newsworthy events during that timeframe. It also offers unprecedented opportunities to improve our understanding of important societal developments and processes. Developing empirically well-grounded insights into these matters, however, requires that the data extracted from news reports are robust and dependable. This paper analyzes the data generation process employed by SPEED's Societal Stability Protocol. We first outline our approach to providing for quality control in data generation and then we discuss our approach to gauging data reliability. We also report the results of several reliability tests that have been conducted since 2009. The results indicate that our coders meet basic social science standards. Coders reliably identify 72-85% of all relevant events, and accurately code the information on those events 75-89% of the time.
Download Report. (Length: 14 pages / Size: 187K)
(Updated: 5:15 PM 6/1/2012)
The SPEED project’s capacity to contribute to advances in social research is bounded by the breadth and depth of its information base and the tools employed to mine data from that information base. If events are not captured in an information base – or identified using data mining tools – then they cannot be used for research purposes. Consequently, we employed a range of technologies to enhance both the scope of the information base and our ability to mine relevant data embedded in it. The first section of this document provides a brief overview of SPEED’s global news archive. While the techniques used to identify news reports relevant to a particular SPEED protocol are reported elsewhere, the remaining two sections assess the adequacy of SPEED’s information base and the power of its data mining tools. These assessments focus on destabilizing events that are relevant to SPEED’s Societal Stability Protocol (SSP). The adequacy of SPEED’s information base is gauged by comparing it to the Armed Conflict Location and Events Dataset project, or ACLED (Raleigh et al. 2010). ACLED focuses largely on African countries beginning in the mid-1990s, but it draws from an encompassing set of news sources, including local ones. The power of SPEED’s data mining tools is assessed by comparing SSP event counts with the World Handbook of Social and Political Indicators (WHSPI, Taylor and Jodice 1983), which employs wholly human-centric procedures to identify relevant events. We find SPEED’s information base to be comparable to ACLED’s and SPEED’s data mining tools to be far superior to WHSPI’s.
Download Report. (Length: 38 pages / Size: 253K)
(Updated: 12:43 PM 4/2/2013)
This document outlines the procedures and criteria used to delineate episodes of civil strife for 164 countries in the world for the period from January 1, 1946 to December 31, 2005. Prior research has been handicapped by a lack of data on civil strife events and defensible criteria for differentiating major episodes of civil strife from others. We use an inductive, iterative approach that builds on the work of the Political Instability Task Force and the Social, Political and Economic Event Database project (SPEED). We integrate event data from SPEED’s Societal Stability Protocol (SSP) with PITF episodes, which are organized on a country-month basis. An inductive approach is used in conjunction with the integrated data because concrete standards are not available for identifying major episodes of strife and no well-developed body of theory exists from which these standards can be deduced. An iterative approach is needed because the integration of SSP data with the PITF episodes revealed that PITF’s use of subjective, holistic judgments in demarcating the temporal boundaries of their episodes introduced a great deal of measurement error. To enhance the utility of PITF’s episodes, the integrated SSP data was used to de-construct PITF episodes and then re-specify them. The re-specified episodes provide the basis for inductively generating more refined criteria for identifying major episodes of civil strife as well as less disruptive, yet still noteworthy, episodes.
The first main section introduces the approach used to derive the criteria for identifying major episodes of strife. The second main section applies the inductively derived criteria to the countries being studied.
Download Report. (Length: 24 pages / Size: 156K)
(Updated: 3:19 PM 3/13/2014)
News media provide a unique source of information on important societal developments, both contemporary and historical. Consequently, over the past forty years, social scientists have attempted to utilize media data to study important questions in a number of fields. But these efforts have been subjected to sobering critiques in an on-going debate over the utility of media data in social science research. The advent of the Information Age has both raised the stakes of this sustained debate and restructured it. Over the past several decades we have seen the emergence of the Internet, the rise of news websites, the widespread availability of digitized news reports, and the creation of 24x7 news stations. These developments have led to unprecedented increases in the volume, scope and accessibility of news reports. Advances in data science and computational capacity have greatly enhanced the ability of researchers to process information embedded in those news reports.
The confluence of these developments has laid the groundwork for third-generation media data projects that have the potential to generate major advances in several fields of research. But the implications of these Information Age developments for the sustained debate over the utility of media data have not been explored and, without a better understanding of those implications, the potential of third-generation projects may never be fully realized. Thus, this paper re-examines the on-going debate over media data in light of these recent developments. We begin by summarizing the key issues raised by critics and asserting that they identify three sets of problems with media data: a comprehensiveness problem, an identification problem, and a distortion problem. In the second main section of the paper we decompose each of these issues and assess their implications for contemporary research employing media data. In this assessment we focus on civil strife research, but most of the main points pertain beyond this field. We also discuss the potential for the remediation of the problems that posed serious threats to the utility of media data, with an emphasis on third-generation research efforts like the Social, Political and Economic Event Database (SPEED) project.
Download Report. (Length: 30 pages / Size: 156K)
(Updated: 1:17 PM 2/27/2014)