Data scientists seem prepared for innovation in their software stack: JupyterLab was introduced to NERSC users as the default notebook option while it was still in beta, and we learned how to support it while users learned how to use it. Users even share solutions with one another or give us advice about issues with Jupyter. We have found that proactive engagement with users about Jupyter plans, status, and feedback cultivates user interest in experimenting with new features as the need arises.
As supercomputing shifts to accelerator hardware to maintain performance growth as Moore's Law withers away, GPUs take center stage for Jupyter in HPC. Jupyter has become a principal platform for GPU-powered AI and analytics workflows built on TensorFlow, PyTorch, and RAPIDS. NERSC's first production system with a GPU partition, Perlmutter, is arriving in 2021 with "built-in" Jupyter support from the vendor. This is happening because system vendors see Jupyter as a key component of the data/AI ecosystem and are motivated to engage with HPC centers to meet user needs.

Use Cases

We have noticed some common patterns in our engagements with scientific users who use Jupyter for their computational workflows on NERSC systems. At the highest level there is a need to combine exploration of very large datasets with computational and analytical capabilities. Crucially, the scale of data or compute (or both) required by these workflows typically exceeds the capacity of users' own machines, and users need a friendly way to drive these large-scale workflows interactively.
We often see a two-phase approach, where a user develops a notebook locally and then runs it at NERSC against production data and compute pipelines. It is important to be able to move seamlessly between these modes, so our approach is grounded in making sure that a user can take a notebook and its associated software environment over to our systems with minimal effort and a consistent user experience.
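One common way to carry a notebook's environment from a laptop to NERSC is to rebuild it from an environment specification and register it as a named Jupyter kernel. The sketch below illustrates this pattern; the environment path and kernel names are hypothetical, not taken from the use case described here.

```python
# Minimal sketch: register a project-specific conda environment as a named
# Jupyter kernel so a notebook developed locally runs with the same
# dependencies at NERSC. The environment path and names are assumptions
# used only for illustration.
import subprocess

def register_kernel(env_python: str, name: str, display_name: str) -> None:
    """Expose a Python environment to Jupyter as a selectable kernel."""
    subprocess.run(
        [env_python, "-m", "ipykernel", "install",
         "--user",                        # install under the user's Jupyter data dir
         "--name", name,                  # machine-readable kernel name
         "--display-name", display_name], # name shown in the Jupyter launcher
        check=True,
    )

if __name__ == "__main__":
    register_kernel(
        env_python="/global/homes/u/user/.conda/envs/simpeg-env/bin/python",
        name="simpeg-env",
        display_name="SimPEG (conda)",
    )
```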
As an example, we describe a use case [ref: Heagy et al.] applying geophysical simulations and inversions to image the subsurface. This was done by running 1000 1D inversions, each of which produces a layered model of the subsurface conductivity; the layered models are then stitched together to create a 3D model. The goal of this particular survey was to understand why the Murray River in Australia was becoming more saline. The work involved running simulations, data analysis, and machine learning (ML) on HPC systems, and the outputs of these runs need to be visualized and queried interactively. The initial workflow was developed in the user's local laptop environment and needed to be scaled up at NERSC.
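The sketch below illustrates the "many independent 1D inversions, then stitch" pattern described above; it is not the authors' code, and invert_1d, the sounding data, and the grid shape are placeholders for the real SimPEG-based inversion.

```python
# Illustrative sketch of the map-then-stitch workflow: invert each sounding
# independently, then assemble the 1D models into a pseudo-3D volume.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

N_LAYERS = 30

def invert_1d(sounding):
    """Placeholder: recover a layered conductivity model from one sounding."""
    return np.random.rand(N_LAYERS)   # stand-in for the real inversion result

def stitch(models, grid_shape):
    """Arrange per-sounding 1D models into a (ny, nx, n_layers) volume."""
    ny, nx = grid_shape
    return np.stack(models).reshape(ny, nx, -1)

if __name__ == "__main__":
    soundings = [None] * 1000                      # ~1000 independent soundings
    with ProcessPoolExecutor() as pool:
        models = list(pool.map(invert_1d, soundings))
    volume = stitch(models, grid_shape=(25, 40))   # 25 x 40 x N_LAYERS model
    print(volume.shape)
```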
In practice this involves running Jupyter at NERSC in a Docker container with a pre-defined, reproducible software environment. Parallel Dask workers are launched on Cori from Jupyter using the Dask-jobqueue library, and can be scaled up or down on demand. The SimPEG inversion notebook farms out parallel tasks to Dask; the results of these parallel runs are pulled back into the notebook and visualized. A large batch of simulations is then run to generate data for a machine learning application.
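A minimal sketch of this Dask pattern, assuming Dask-jobqueue is available on the system: a SLURMCluster submits worker jobs to the batch queue, the cluster is scaled on demand, and inversion tasks are farmed out from the notebook. The queue name, per-worker resources, and the invert_1d/soundings objects (from the earlier sketch) are assumptions, not site-specific settings from the use case.

```python
# Launch Dask workers through the batch scheduler and drive them from a notebook.
from dask_jobqueue import SLURMCluster
from dask.distributed import Client

cluster = SLURMCluster(
    queue="regular",        # batch queue name (site-specific assumption)
    cores=32,               # cores per worker job
    memory="64GB",          # memory per worker job
    walltime="00:30:00",
)
cluster.scale(jobs=10)      # request 10 worker jobs; scale up or down as needed

client = Client(cluster)    # connect the notebook to the Dask scheduler

# Farm out the independent 1D inversions and gather results back into the notebook.
futures = client.map(invert_1d, soundings)   # invert_1d/soundings as sketched above
models = client.gather(futures)

client.close()
cluster.close()
```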