Abstract
\label{abstract}
In an era of global change, we use paleogeoscientific data to study how
the Earth-Life system responds to and recovers from large perturbations
to the global carbon cycle, biodiversity, climates, cryosphere, and
hydrosphere. The grand informatics challenge is to organize and mobilize
billions of observations distributed across space, time, disciplines,
and institutions, so that we can bring all relevant data to bear on any
time, place, or process. The emerging cyberinfrastructure model consists
of a distributed, federated network of resources, including community-curated
data repositories (CCDRs), physical sample repositories, individual
geoscientists, the scientific literature, and networking/coordination
efforts. In our field, the most productive scientific return from NSF
cyberinfrastructure investments will come from distributed, meso-scale
investments: 1) long-term investments in the human capital necessary to
develop and sustain CCDRs; 2) data
mobilization campaigns targeted to high-priority research questions; 3)
scientific workforce training at all career stages; 4) reduced data
friction via integrated data handling systems from field collection to
measurement, paper publication, and data publication; 5) automated
data-mining systems for extracting information from unstructured
sources; and 6) a National Center for Paleodata Synthesis to accelerate and
coordinate the above global-scale science and informatics activities.