Big Data and Artificial Intelligence Analytics in Geosciences:
Promises and Potential
Roberto Spina, Geologist and DCompSci, CNG (National Council of Geologists), Rome, Italy, robertospina@geologi.it
ABSTRACT

Big data and machine learning are IT methodologies that are bringing substantial changes in the analysis and interpretation of scientific data. By adding GPU processing resources to the typical equipment of a server host, it is possible to speed up queries performed on large databases and reduce training time for deep learning architectures.

A recent pairing of the big data technologies, applied to old and new data, and artificial intelligence techniques has enabled a team of scientists to create an interactive virtual globe that shows a color mosaic of the seabed geology. This interactive model allows us to obtain robust reconstructions and predictions of climate changes and their impacts on the ocean environment.

We suggest a possible evolution of such a model by means of the expansion of functionalities and performance improvements. We refer respectively to the implementation of isochronic layers of seabed lithologies and the addition of GPU resources to speed up the learning phase of the support vector machine (SVM) model. These additional features would allow us to establish broader correlations and extract additional information on large-scale geological phenomena.

INTRODUCTION

The Earth system generates continuous data, and our acquisition capacity has significantly increased over time. The growing availability of acquired geological data and the methods developed in the field of information technology make it possible to identify associations and understand patterns and trends within data (Big Data), solve difficult decision problems (artificial intelligence), and provide acceleration to data processing (GPU computing).

Big Data is a term that indicates very large databases (often by order of zettabytes, i.e., billions of terabytes) that can contain huge amounts of heterogeneous, structured and unstructured data (text, numerical values, images, e-mail, GPS data, and data acquired from social networks), which can be extrapolated, analyzed, and correlated with each other.

Artificial Intelligence (AI) is a branch of computer science that studies the way in which the combination of hardware and software systems can simulate typical behaviors of the human brain. One of the most important applications consists of a complex algorithm, called machine learning, which is able to learn and make decisions.

GPU Parallel Computing (GPGPU) involves the processing of data by the processors present in the graphics card (GPU) and has allowed the computation, in relatively short times, of huge amounts of data with an efficiency of at least two orders of magnitude greater compared to the past.

There are several cases in which these technologies have been applied both in the field of potential earthquakes (Rouet-Leduc et al., 2017), volcanic eruptions (Ham et al., 2012), and to solve the problems of spatial modeling in the field of the assessment of landslide susceptibility (Korup and Stolle, 2014).

The following describes a mixed approach (AI and Big Data) in the field of geosciences, analyzing potentials and possible future developments.

CASE STUDY: BIG DATA AND AI MAP WORLD’S OCEAN FLOOR

An example of an application combining Big Data and machine learning technologies was implemented by a team of Australian scientists who created the first digital map of seabed lithologies (Dutkiewicz et al., 2015) through the analysis and cataloging of ~15,000 samples of sediments found in marine basins. Before such a map, the most recent map of oceanic lithologies was hand drawn ~40 years ago, at the beginning of ocean exploration. Since then, the map has undergone few changes, with at most six types of sediment dominant in the ocean basins.

The digital map was created using an AI method consisting of the support vector machine (SVM) model. Through a cross-validation approach, the classifier was trained by adding new data gradually so as to allow its learning. Learning the parameter values, which optimize the classifier’s performance on withheld data, is an important step in the workflow. In this way, the vast set of point data has been transformed into a continuous digital map with very high accuracy (up to 80%).

The new lithological map of the seabed is very important for the interpretation of global phenomena related to the evolution of ocean basins. An example of this is diatoms, siliceous phytoplankton that live in the oceans and that through chlorophyll photosynthesis produce about one-quarter of the oxygen present in the atmosphere, contributing to reduce global terrestrial warming. At their death, these organisms precipitate through the water column, accumulating on the underlying sea floor. Satellite surveys over the years have identified places where diatomaceous activity is more productive; that is, the marine areas in which there are the maximum concentrations of chlorophyll, considering that they should also correspond to the areas of maximum accumulation of these organisms in the sea floor. Surprisingly, the digital map of the seabed has revealed that there is a decoupling between the productivity of diatoms and the corresponding accumulation areas in the sea floor. The possibility of diatom ooze formation is however favored by the low surface temperature (0.9–5.7 °C), by salinity (33.8–34 PSS), and by the high concentration of nutrients, and therefore can represent an important indicator of the oceanographic variables of the surface of the sea (Cunningham and
GSA Today, v. 29, https://www.doi.org/10.1130/GSATG372GW.1. Copyright 2018, The Geological Society of America. CC-BY-NC.
42 GSA Today | January 2019
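
As an illustration of the cross-validation step described in the case study (choosing parameter values that optimize the classifier's performance on withheld data), the following is a minimal sketch and not the workflow used by Dutkiewicz et al. (2015). It makes several simplifying assumptions: the data are synthetic two-feature points rather than real sediment samples, there are two classes instead of many lithologies, and the classifier is a small linear SVM trained with a Pegasos-style sub-gradient method written in plain Python. All function names (`make_toy_samples`, `train_linear_svm`, `cross_val_accuracy`) and the candidate values of the regularization parameter `lam` are hypothetical.

```python
import random


def make_toy_samples(n=120, seed=42):
    """Synthetic stand-in for labelled samples (assumption: NOT real sediment data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        # Two features centred on (+2, +2) or (-2, -2); the trailing 1.0 acts as a bias input.
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0), 1.0])
        y.append(label)
    return X, y


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_linear_svm(X, y, lam, epochs=30, seed=0):
    """Linear SVM fitted by stochastic sub-gradient descent on the regularised hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            violated = y[i] * dot(w, X[i]) < 1.0  # sample inside the margin?
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularisation)
            if violated:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, x):
    return 1 if dot(w, x) >= 0.0 else -1


def cross_val_accuracy(X, y, lam, k=5):
    """Mean accuracy over k folds; each fold is withheld once and used only for scoring."""
    n = len(X)
    fold_scores = []
    for fold in range(k):
        held_out = set(range(fold, n, k))
        X_train = [X[i] for i in range(n) if i not in held_out]
        y_train = [y[i] for i in range(n) if i not in held_out]
        w = train_linear_svm(X_train, y_train, lam)
        hits = sum(1 for i in held_out if predict(w, X[i]) == y[i])
        fold_scores.append(hits / len(held_out))
    return sum(fold_scores) / k


X, y = make_toy_samples()
candidate_lams = [0.01, 0.1, 1.0]  # hypothetical search grid
scores = {lam: cross_val_accuracy(X, y, lam) for lam in candidate_lams}
best_lam = max(scores, key=scores.get)  # value with the best withheld-data accuracy
print("mean held-out accuracy per lambda:", scores)
print("selected lambda:", best_lam)
```

In practice a multi-class SVM from an established library, possibly with a non-linear kernel, would likely be used on the real sample catalogue; the sketch only shows how scoring on withheld folds drives the choice of a parameter value.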