By Tim Jones

The ability to search through colossal amounts of data with a few keystrokes is one of the most powerful gifts of the digital age. As well as vastly improving the standard of common knowledge the world over (with no foreseeable limit to this trend), it has opened up areas of research that would have been too arduous for humans, or that were simply never imagined, before the rise of digital data analysis. An awesome example of this is Google’s Ngram Viewer, a tool for searching a corpus of digitised texts containing around 6% of all books ever printed. Linguists use it to track changes in language through time, e.g. the usage of “burnt” vs “burned” or the emergence of phrases such as “it takes two to tango”. I’ve used it to track the occurrence of four words between 1800 and 2000: physics, chemistry, biology, and geology. There are some interesting correlations that can be drawn between trends in word usage and the timing of developments and discoveries in these fields of science. For example, geology begins its greatest period of growth in 1829, one year before Charles Lyell began publishing his seminal work, Principles of Geology.


Thomas Kuhn, physicist and one of the great philosophers of science, claimed that scientific revolutions involve a paradigm shift, whereby a new discovery is found to be incompatible with the existing understanding of nature, and the basic assumptions of science change to make way for a new paradigm. To highlight the difference between an emerging theory and a dying one, I’ve also made a plot of ‘plate tectonics’ and ‘expanding Earth’. What trends and turning points can you see, and do they relate to real developments in science?


To play with the viewer yourself, follow the link: Google Ngram Viewer

*normalised to the number of words in the corpus for each year
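
For anyone curious what that normalisation means in practice, here is a minimal sketch in Python. It assumes you already have raw yearly counts for a word and the total number of words in the corpus for each year. The function name and all the numbers are made up for illustration; they are not real Ngram data, and this is not necessarily how Google computes its curves.

```python
def relative_frequency(word_counts, total_counts):
    """Return {year: occurrences of the word / total words in the corpus that year}."""
    return {
        year: count / total_counts[year]
        for year, count in word_counts.items()
        if total_counts.get(year, 0) > 0  # skip years with no corpus total
    }


# Hypothetical counts, invented purely for illustration.
geology_counts = {1820: 50, 1840: 400}
corpus_totals = {1820: 1_000_000, 1840: 4_000_000}

freq = relative_frequency(geology_counts, corpus_totals)
for year in sorted(freq):
    print(year, freq[year])
```

In this toy example the raw count grows eightfold, but the corpus itself grows fourfold, so the normalised frequency only doubles. That is why the plots use relative frequency: otherwise every word would appear to rise simply because more books were printed each year.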