Page 67 - i1052-5173-28-5
Figure 1. Screenshot of Ngram Viewer chart showing the frequency in the Google Books corpus of the N-grams “geosyncline” and “plate tectonics,”
from 1900 to 2000. Y-axis is frequency of the N-gram in the corpus.
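As a rough illustration of the quantity plotted on the y-axis, the sketch below computes a per-year relative frequency for a phrase in a tiny corpus. It assumes that "frequency" means the count of the N-gram divided by the total number of N-grams of the same length in that year's texts. The function names, the whitespace tokenization, and the toy corpus are hypothetical and are not taken from Ngram Viewer itself.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams (as tuples) in a sequence of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def yearly_frequency(corpus_by_year, phrase):
    """Relative frequency of `phrase` per year.

    Assumption: frequency = occurrences of the phrase divided by the
    total number of n-grams of the same length in that year's texts.
    """
    target = tuple(phrase.lower().split())
    n = len(target)
    result = {}
    for year, texts in sorted(corpus_by_year.items()):
        counts = Counter()
        for text in texts:
            # Naive tokenization: lowercase and split on whitespace.
            counts.update(ngrams(text.lower().split(), n))
        total = sum(counts.values())
        result[year] = counts[target] / total if total else 0.0
    return result


# Hypothetical toy corpus, invented for illustration only.
toy_corpus = {
    1950: ["the geosyncline subsided slowly",
           "a geosyncline filled with sediment"],
    1980: ["plate tectonics explains the orogeny",
           "plate tectonics replaced the geosyncline"],
}

print(yearly_frequency(toy_corpus, "plate tectonics"))
print(yearly_frequency(toy_corpus, "geosyncline"))
```

Dividing by the yearly total, rather than reporting raw counts, keeps years with many published books from dominating the chart simply because they contain more text.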
rise of plate tectonics and the fall of geosynclines can be examined more closely by accessing the corpus on which the search is based. In addition to the chart (Fig. 1), Ngram Viewer searches return links to the corpus on which the search is based, binned by year of publication. Clicking on these bins opens a Google search page with links to each publication included in the corpus. The diligent researcher can then sort through the titles and assess the quality of the data on which the Ngram Viewer chart is based.

OTHER USES FOR N-GRAMS IN THE GEOSCIENCES

Charting word frequency trends can contribute to identifying directions for research or investment of resources. In the U.S., a number of Departments of “Geology” became Departments of “Geological Sciences” in the late 1970s and early 1980s (including the department at Michigan State University), mirroring the increase in frequency of the bigram “geological sciences.” In 2016, MSU’s department changed its name, again, to “Earth and Environmental Sciences,” reflecting the increase in frequency of the “Environmental Sciences” bigram, which started in 1990. The N-gram frequency of other geologic disciplines also charts what might be interpreted as evolving priorities, especially in the textbook-rich academic environment: References to “evolutionary biology” now approach those of “paleontology.” As frequency of the bigram “evolutionary biology” increased, through the mid-1970s, the Paleontological Society debuted its new journal, Paleobiology. The decisions to change department names, revise course descriptions, and initiate new journals described here were made before there was a Google Books corpus, but these decisions were undoubtedly affected by trends in metrics, like student enrollment and funding priorities, which are now indirectly reflected in that database.

SUMMARY

The output of Google’s “shiny new toy for nerds” (Zhang, 2015), Ngram Viewer, is not sufficient to support hypotheses of causality suggested by the correlations it generates, but its accessibility and ease of use can serve an important function in introducing scholars to the possibilities of digital research (Cohen, 2010). The frequency of N-grams through time maps where we have been, and, mindful of the adage, “those who cannot remember the past are condemned to repeat it,” history ought not be ignored in identifying trends in support of education, policy, planning, and funding objectives of our discipline.

ACKNOWLEDGMENTS

A.M. Velbel introduced me to Ngram Viewer and was instrumental in the evolution of this manuscript. Three reviewers contributed to a more focused and improved final version.

REFERENCES CITED

Cohen, D., 2010, Initial thoughts on the Google Books N-gram Viewer and datasets, http://www.dancohen.org/2010/12/19/initial-thoughts-on-the-google-books-N-gram-viewer-and-datasets/ (last accessed 10 May 2017).

Haq, B.U., and Boersma, A., eds., 1998, Introduction to marine micropaleontology (2nd edition): Amsterdam, Elsevier, 376 p.

Jurafsky, D., and Martin, J.H., 2014, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (2nd edition): New York, Prentice Hall, 1024 p.

Lyell, C., 1830, Principles of geology, being an attempt to explain the former changes of the Earth’s surface, by reference to causes now in operation: London, John Murray, volume 1.

Lyell, C., 1832, Principles of geology, being an attempt to explain the former changes of the Earth’s surface, by reference to causes now in operation: London, John Murray, volume 2.

Lyell, C., 1833, Principles of geology, being an attempt to explain the former changes of the Earth’s surface, by reference to causes now in operation: London, John Murray, volume 3.

Michel, J.B., Shen, Y.K., Presser Aiden, A., Veres, A., Gray, M.K., Brockman, W., The Google Books Team, Pickett, J.P., Hoiberg, D., Clancy, D., Norvig, P., Orwant, J., Pinker, S., Nowak, M.A., and Lieberman Aiden, E., 2011, Quantitative analysis of culture using millions of digitized books: Science, v. 331, p. 176–182, https://doi.org/10.1126/science.1199644.

Nunberg, G., 2009, Google’s book search: A disaster for scholars: The Chronicle of Higher Education, http://www.chronicle.com/article/Googles-Book-Search-A/48245/ (last accessed 10 May 2017).

Pechenick, E.A., Danforth, C.M., and Dodds, P.S., 2015, Characterizing the Google Books corpus: Strong limits to inferences of socio-cultural and linguistic evolution: PLoS One, v. 10, no. 10, https://doi.org/10.1371/journal.pone.0137041.

Rudwick, M.J.S., 2010, Worlds before Adam: The reconstruction of geohistory in the age of reform: Chicago, University of Chicago Press, 648 p.

Zhang, S., 2015, The pitfalls of using Google N-gram to study language, https://www.wired.com/2015/10/pitfalls-of-studying-language-with-google-N-gram/ (last accessed 10 May 2017).

Manuscript received 11 May 2017
Revised manuscript received 3 January 2018
Manuscript accepted 7 February 2018
www.geosociety.org/gsatoday 67