Describe current or emerging science or engineering research
challenge(s), providing context in terms of recent research activities
and standing questions in the field.
1.1 Paleogeosciences: Major Scientific Questions and Research Challenges
\label{paleogeosciences-major-scientific-questions-and-research-challenges}
The grand challenge in the paleogeosciences is to enable a fully
resolved understanding of the past dynamics of the Earth-Life System and
its interacting subsystems, across the entire history of Earth, at
temporal scales of \(10^9\) to \(10^1\) years,
by organizing and mobilizing the many millions of individual
geoscientific observations that make up the long tail of paleogeoscience
data (Transitions Report, 2012; EarthCube Paleogeosciences Domain
Workshop 2012; NRC 2013, 2011a,b). The paleogeosciences branch of Earth
System Science encompasses paleoclimatology, paleobiology, paleoecology,
geochronology, sedimentary geology, geochemistry, glaciology, and other
disciplines. In our era of global change, with projected rates of change
and states of the climate system with no analog in recorded human
history, the paleogeosciences are vital to studying how the Earth-Life
system responds to and recovers from large perturbations to the global
carbon cycle, global biodiversity, regional and global climates,
cryosphere, and hydrosphere.
Four overarching scientific challenges in Earth System Science were
identified in the National Research Council’s Transitions Report (2012):
1. What is the full range of potential climate system states and
rates of transitions experienced on Earth?
2. What are the thresholds, feedbacks, and tipping points in the
climate system, and how do they vary among different climate states?
3. What are the ranges and rates of ecosystem response, modes of
vulnerability, and resilience to change in different Earth system
states?
4. How have climate, the oceans, the Earth’s sedimentary crust,
carbon sinks and soils, and life itself evolved together, and what
does this tell us about the future trajectory of the integrated
Earth-Life system?
See also National Research Council reports (2013, 2011a,b) and EarthCube
Domain Working Group Reports (Noren et al. 2013, Aufdenkampe et al.
2013, Chan and Budd 2013, and Singer et al. 2013, all refs:
http://bit.ly/2nUOUQc).
We can answer these questions through the study of Earth’s history and
its rich record of past abrupt change, evolutionary innovations, and
complex dynamics driven by interactions among multiple components of the
Earth system, across multiple temporal and spatial scales. Earth’s
history provides multiple model systems for
21st-century changes (Williams et al. 2013).
Areas of active research include:
The effect of early life on atmospheric evolution and global
geochemical cycles (e.g., Peters et al. 2017).
The five major mass extinctions of Earth’s biosphere and the processes
that govern rates of speciation and evolutionary innovation.
Disruptions to the Earth’s global carbon cycle (e.g., the
rapid release of organic carbon into the atmosphere-ocean system
during the Paleocene-Eocene Thermal Maximum, and ensuing effects on
species extinction and evolution).
The glacial-interglacial cycles of the Quaternary, paced by variations
in the Earth’s orbit and amplified by feedbacks among ice sheet
dynamics, ocean circulation and chemistry, and climate.
The persistence of biodiversity during past glacial periods and the
community turnover and species range shifts during past
glacial-interglacial cycles.
Reconstructing global and regional temperature trends over the current
interglacial and last millennium, while disentangling the effects of
external forcings, internal feedbacks, unforced variations, and the
growing anthropogenic footprint.
1.2 Paleogeoscientific Data: Key Features and
Challenges
\label{paleogeoscientific-data-key-features-and-challenges}
Here we summarize key features of paleogeoscientific data, practice, and
practitioners. These characteristics have been the starting point for
current cyberinfrastructure-building efforts (Sect. 2.1) and
inform our recommendations for the next generation of
cyberinfrastructure advances (Sects. 2.2-3.5).
1. Paleogeoscientific observations are long-tail data collected
by scientists from many disciplines and institutions, with many data
types and forms of measurement. Individual records are temporally long
but spatially point-level data, collected at one or more outcrops, drill
sites, or other discrete sites. Hence, site-level paleogeoscientific
data must be assembled into global-scale data networks in order to
understand the Earth System, its external forcings, and internal
feedbacks (e.g. PAGES 2k, 2013). Assembling such data is
labor-intensive. Few widely accepted data standards and identifiers
exist (McKay and Emile-Geay, 2016), although several are emerging through
EarthCube-supported Research Coordination Networks (Cyber4Paleo) and
Integrative Activities (ePANNDA, Earth-Life Consortium, Open Core Data).
2. Paleogeoscientific data share common underlying structure.
Despite the above heterogeneity, paleogeoscientific data share a common
underlying form: they typically involve measurements of a
proxy in a geological archive, often structured by
depth, from which we must estimate time. This structural
homogeneity facilitates the development of common data models in the
paleogeosciences.
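As an illustration of this shared structure, the following minimal Python sketch represents a depth-indexed proxy record. The class and field names (`ProxyRecord`, `ProxyMeasurement`, `depth_cm`, `age_yr_bp`) are hypothetical and are not drawn from any existing community standard; the site name and values are invented placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProxyMeasurement:
    """One proxy value at one depth in an archive (hypothetical schema)."""
    depth_cm: float                    # position within the archive
    value: float                       # measured proxy value
    age_yr_bp: Optional[float] = None  # unknown until an age model is applied


@dataclass
class ProxyRecord:
    """A site-level record: one proxy measured down one archive."""
    site_name: str
    archive_type: str  # e.g. "lake sediment core"
    proxy_name: str    # e.g. "d18O"
    units: str
    measurements: List[ProxyMeasurement] = field(default_factory=list)

    def add(self, depth_cm: float, value: float) -> None:
        """Append a measurement; age is left unset."""
        self.measurements.append(ProxyMeasurement(depth_cm, value))

    def sorted_by_depth(self) -> List[ProxyMeasurement]:
        """Return measurements ordered from shallowest to deepest."""
        return sorted(self.measurements, key=lambda m: m.depth_cm)


# Placeholder data for illustration only.
record = ProxyRecord("Example Lake", "lake sediment core", "d18O", "per mil")
record.add(120.0, -7.9)
record.add(10.0, -8.4)
depths = [m.depth_cm for m in record.sorted_by_depth()]
```

Keeping age as an optional field separate from depth reflects the fact that depth is measured while age is inferred and may later be revised.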
3. Paleogeoscientific data have a long shelf life.
Paleogeoscientific data derive primarily from physical samples of
geological materials collected in the field and the laboratory
measurements of these samples. As new techniques are developed, we often
seek to reanalyze previously collected samples, e.g., the recent
wave of ancient DNA analyses from museum fossils. We must curate
physical samples and maintain an unbroken chain of provenance from
each sample to all data generated from it (Sect. 2.3).
4. Time is an unknown variable that must be estimated in the
paleogeosciences (Singer et al. 2013). We must infer age through
discrete age estimates (called age controls) and age models that provide
age estimates between dated samples. Age models must be regularly
updated as more precise and accurate dates become available and as more
sophisticated age-depth modeling approaches and software are developed.
Published geochronological frameworks become obsolete with every new
date and refinement to dating methods, decay constants, and other
parameters. Data repositories exist for some geochronological data
(GeoChron/IEDA), but they are not
systematically linked to one another or to other affiliated databases.
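The simplest age model is linear interpolation between age controls, and the Python sketch below shows how a single new date changes the inferred ages of undated samples. It assumes strictly increasing control depths and ignores dating uncertainty, which dedicated age-depth software treats explicitly; the function name and the example depths and ages are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def interpolate_age(depth: float,
                    control_depths: Sequence[float],
                    control_ages: Sequence[float]) -> float:
    """Estimate age at `depth` by linear interpolation between age controls.

    Assumes control_depths is strictly increasing. Raises ValueError for
    malformed controls or for depths outside the dated interval, rather
    than extrapolating.
    """
    if len(control_depths) != len(control_ages) or len(control_depths) < 2:
        raise ValueError("need at least two matched depth/age controls")
    if any(b <= a for a, b in zip(control_depths, control_depths[1:])):
        raise ValueError("control depths must be strictly increasing")
    if not control_depths[0] <= depth <= control_depths[-1]:
        raise ValueError("depth lies outside the dated interval")

    # Index of the control at or just above `depth`, clamped so that the
    # deepest control is handled by the last segment.
    i = min(bisect_right(control_depths, depth) - 1, len(control_depths) - 2)
    d0, d1 = control_depths[i], control_depths[i + 1]
    a0, a1 = control_ages[i], control_ages[i + 1]
    return a0 + (a1 - a0) * (depth - d0) / (d1 - d0)


# Hypothetical age controls: depth in cm, age in years before present.
original_age = interpolate_age(125.0, [0.0, 100.0, 200.0],
                               [0.0, 1000.0, 3000.0])

# Adding one new date at 150 cm revises the age inferred at 125 cm.
revised_age = interpolate_age(125.0, [0.0, 100.0, 150.0, 200.0],
                              [0.0, 1000.0, 1800.0, 3000.0])
```

With these placeholder controls the inferred age at 125 cm shifts from 1500 to 1400 years, which is why stored ages should remain linked to the age controls and model that produced them.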
5. Dark data. Data are often not fully published. For example,
papers presenting microfossil data often show only summary diagrams for
selected taxa and may fail to include supplementary data. Published
metadata are incomplete, e.g. geochronological labs usually do not
publish all instrumental parameter settings. Some disciplines have
adopted minimal metadata standards and established a common data
repository; others have not. A great deal of data is still digitally
“dark”, even if publications themselves are available electronically.
Data mobilization efforts are essential (Sect. 3.3).
6. Paleodata are increasingly assimilated with Earth System
Models. Our field uses Earth system models to simulate the processes
governing the past and present evolution of the Earth-Life system. These
same models are also the basis for climate scenarios over the coming
decades, and paleodata offer an important constraint on modeled
estimates (e.g. sensitivity of global temperatures to atmospheric CO\(_2\),
Hargreaves et al. 2012). Increasingly, data assimilation methods are
being employed to make joint inferences from paleodata and Earth system
models (Crucifix, 2012). For example, atmospheric general circulation
models now include stable isotopic tracers (e.g.,
\(\delta^{18}\)O), enabling direct assimilation of paleodata into Earth
system models. Data assimilation creates new needs for
well-structured datasets with rigorous estimates of temporal and proxy
uncertainty and for high-capacity computing.
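To illustrate why assimilation requires explicit uncertainty estimates, the Python sketch below performs the scalar Gaussian (Kalman-type) update that underlies many assimilation schemes. It assumes a single state variable, Gaussian errors, and a proxy that observes the state directly; it is not the method of any study cited here, and the function name and numbers are hypothetical.

```python
from typing import Tuple


def assimilate(prior_mean: float, prior_var: float,
               obs: float, obs_var: float) -> Tuple[float, float]:
    """Combine a model prior with one proxy observation.

    Returns the posterior mean and variance under Gaussian assumptions.
    The gain weights the observation by the relative size of the model
    and observation variances.
    """
    if prior_var <= 0 or obs_var <= 0:
        raise ValueError("variances must be positive")
    gain = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + gain * (obs - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


# Placeholder numbers: a model estimate and a proxy-based estimate of the
# same quantity, each with variance 1.0.
post_mean, post_var = assimilate(prior_mean=2.0, prior_var=1.0,
                                 obs=4.0, obs_var=1.0)

# A highly uncertain proxy observation barely moves the model estimate.
weak_mean, weak_var = assimilate(prior_mean=2.0, prior_var=1.0,
                                 obs=4.0, obs_var=1e6)
```

Because the gain depends only on the two variances, a dataset lacking a defensible proxy or temporal uncertainty cannot be weighted properly in the update.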
7. Paleogeoscientific expertise is widely distributed, with
individual paleogeoscientists specializing in particular proxy types,
archives, time periods, regions, and questions. Dispersion of expertise
places a premium on developing decentralized but interlinked governance
and data management systems for our data (Sects. 2.1, 3.1-3.2).
8. Uneven workforce training and interest in informatics. The
paleogeosciences emphasize high-quality field and laboratory
measurements. Informatics has not traditionally been part of the core
geoscientific curriculum, except for courses in statistics and calculus.
Most geoscientists have not sought to keep pace with recent rapid
advances in informatics. Disciplinary and cultural norms vary with
respect to data sharing. Training programs at all levels are needed
(Sect. 3.4).