Authors: Ng Annalyn
History books are written by people, who may be subject to author biases. But what if we could assemble a more candid account of history from news archives using computational techniques? This study sets out to test this notion through a case study of the history of Singapore. A total of 20,160 news articles containing mentions of Singapore, published between 1955 and 2014, were harvested from The New York Times. The content of these articles was distilled into topics using Latent Dirichlet Allocation, and topic proportions were aggregated across time. Results revealed gradual shifts in the types of news topics associated with Singapore, such as decreased mentions of agriculture and increased mentions of air travel, signaling her progress from a third-world to a first-world nation. Unexpected insights also emerged: although only articles mentioning Singapore were analyzed, these articles frequently contained co-mentions of other countries, subtly revealing global connectivity as one key factor in Singapore’s success. Besides enabling us to better understand the past, topic trends appeared to have predictive properties, and hence might also be used to forecast future events. In sum, this cross-disciplinary study demonstrates how unstructured data in qualitative fields, such as history, could be systematically studied using quantitative techniques.
Keywords: History; Political Science; Sociology; News; Topic Modeling; Computational Methodology