
October 22, 2024

Monterey Data Conference Explores Role of AI Foundation Models in Future of Science

Oct. 22, 2024 — In science, data is essential. Scientific methods rely on accurate, well-documented data to draw conclusions, test hypotheses, build models, and explain the natural world. Data isn’t just a byproduct of research; it’s a critical component that ensures results are reproducible and reliable – and the integrity of the scientific process depends on how data is collected, managed, and preserved throughout its life cycle.

In late August, 180 experts and leaders from across the computing landscape gathered at the Monterey Data Conference (MDC) on the beautiful California coast to examine that life cycle, highlighting the latest advances and continuing challenges in scientific data analysis and computing.

Founded by staff from Lawrence Berkeley National Laboratory (Berkeley Lab) in 2019, MDC is an opportunity for experts from industry, academia, and U.S. Department of Energy (DOE) laboratories and user facilities to gather for learning and collaboration.

The theme of this year’s meeting was “Foundations,” exploring the ways in which recent computational advances such as AI foundation models, emerging hardware, and integrated research infrastructure (IRI) will be foundational to the future of data-driven discovery. In particular, the meeting addressed ways in which the HPC community can collaborate to develop and build the next generation of computational tools and infrastructure.

“MDC is an exceptional venue for fostering collaboration across disciplines and organizations, and brings together a diverse group of experts and early-career researchers to enable meaningful discussions that push the boundaries of scientific discovery,” said Ana Kupresanin, Scientific Data Division Director at Berkeley Lab, who presented at the conference. “This year’s focus on foundation models and infrastructure truly highlights the pivotal role data plays in shaping the future of science.”

Sessions included presentations on a range of related topics, among them AI and foundation models for science; next-generation data infrastructure, including the DOE’s new High-Performance Data Facility; and disruptive technology, including quantum computing and other potential future technologies.

In addition to information sharing through talks and presentations, this year’s conference featured a variety of interactive events, from panels in which experts took audience questions, to a poster session for early-career researchers, to ample time for informal networking. These events encourage collaboration across the board, said Kupresanin, but they also offer support and inclusion to the next generation:

“This year’s poster session for postdocs and early-career researchers shows that MDC is nurturing the next generation of data scientists with opportunities to present their work and engage with more senior colleagues,” she said. “This conference offers opportunities for technical exchange as well as mentorship.”

Ultimately, MDC presents a unique opportunity for data scientists at all levels to gather and learn from one another in service of a larger mission, said NERSC Division Director Sudip Dosanjh:

“We founded MDC so that scientists, computer scientists, and technology providers could gather to discuss advances in data analysis, AI, and complex workflows for open DOE science; we know that forming interdisciplinary teams is critical for attacking these problems at large scale. These topics are critical to DOE mission science and there are several DOE initiatives being launched in this area,” said Dosanjh. “I’ve been gratified to see the tremendous interest in MDC by the community – this year we were at maximum capacity. We have had a large team working on MDC and I want to thank everyone who helped.”

About NERSC and Berkeley Lab

The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, NERSC serves almost 10,000 scientists at national laboratories and universities researching a wide range of problems in climate, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. Department of Energy.


Source: Elizabeth Ball, NERSC

BigDATAwire