Currently Funded Research Projects
U.S. Department of Energy
10/2025 - 09/2027
OPAL - Orchestrated Platform for Autonomous Laboratories
$12,500,000 (SP) - US Department of Energy, Offices of Biological & Environmental Research and Advanced Scientific Computing Research
The Orchestrated Platform for Autonomous Laboratories (OPAL) is a multi-laboratory initiative led by the U.S. Department of Energy (DOE) to turn biological discovery into a self-driving process. By combining artificial intelligence (AI), robotics, and automated experimentation, OPAL seeks to create a network of autonomous laboratories that can learn, adapt, and accelerate breakthroughs across biology, biotechnology, and energy science. At Oak Ridge National Laboratory, OPAL leverages the unique capabilities of one of the world’s most advanced automated plant research facilities: the Advanced Plant Phenotyping Laboratory (APPL). APPL integrates high-resolution imaging, robotics, and AI-powered analytics to link genetic variation to phenotypic traits, enabling rapid, data-driven plant characterization. Scientists are using this unique facility to develop plants that can “mine” rare earth elements critical for energy technologies.
10/2023 - 09/2028
SWARM - Scientific Workflow Applications on Resilient Metasystem
$8,750,000 (Co-PI) - US Department of Energy, Office of Advanced Scientific Computing Research (Award #DE-SC0024387)
Existing adaptive management and resource partitioning strategies developed for resilient infrastructures are often static, based on rules developed by experts with years of experience, and dependent on centralized control. While significant attention has been paid to online and dynamic resource management using mainstream artificial intelligence (AI) methods, their effectiveness has not been demonstrated at scale because they cannot handle the unique challenges posed by the complexity and scale of resilient infrastructures. The SWARM project explores how distributed intelligence, specifically swarm intelligence (SI), can provide robust, performant, resilient, and fault-tolerant execution of DOE scientific workflows that span a continuum of resources, from edge devices near sensors and instruments through wide area networks to leadership-class systems. The goal is to design a resilient, SI-based Integrated Research Infrastructure (IRI) that can quickly recover from failures, adapt to changes in the environment, maximize overall resource utilization, and optimize the execution time of workflows submitted by DOE scientists.
10/2023 - 09/2028
REDWOOD - Resilient Federated Workflows in a Heterogeneous Computing Environment
$10,000,000 (SP) - US Department of Energy, Office of Advanced Scientific Computing Research
We can optimize the resilience of the workflows by intelligently placing the data and processing across the distributed resources. This will involve designing and developing an intelligent, introspective, and dynamic workflow by drawing on a system model based on years of data captured from a large-scale production system. It will offer the control and management capabilities required at all timescales, from near-real-time (NRT) to delivery of scientific insight. The overarching goal of this proposed work is to conduct research in three areas: 1) dynamic modeling, with methods that quantitatively capture dynamic and adaptive workflow and system behavior at runtime; 2) the resilience of complex high-throughput (HT) and NRT workflows; and 3) optimal data placement and resource utilization.
Oak Ridge National Laboratory
10/2023 - 09/2025
Multi-workflow Orchestration and Lightweight Integrated Data Analysis Across Facilities
$743,000 (Co-PI) - Laboratory Directed Research and Development Program - INTERSECT (Award #11521)
Department of Energy (DOE) science requires more experimental complexity and computational scale than ever before, with multidisciplinary teams spanning science groups and computing facilities. As an example, chemists use an electron microscope to generate terabytes of data per hour that are preprocessed at the edge and shipped to an HPC cluster for distributed deep learning, which helps the chemist re-tune the microscope (i.e., a feedback loop). However, no existing solution allows scientists to run a science campaign with a series of connected workflows, known as a multi-workflow, as a single, systematic, and automated process; instead, they run the individual pieces at their own facility and manually transfer data between steps. Coordinating federated and heterogeneous scientific tools, workflows, storage systems, and facilities is challenging, and addressing these challenges can accelerate scientific discoveries and the DOE's mission. Our proposal combines Multi-workflow Execution Orchestration and Multi-workflow Integrated Data Analysis in a symbiotic manner across facilities. To implement this, we use the Zambeze and FlowCept frameworks, which together could compose the System-of-Systems architecture of INTERSECT. Zambeze acts as the controller for workflows across facilities in the control plane, while FlowCept interfaces simultaneously with the data and control planes to autonomously capture provenance, metadata, and pointers to large scientific data files, providing a lightweight integrated data view at runtime.
07/2022 - 07/2024
SWAT - Science to Workflow Acceleration Tool
$682,431 (PI) - Laboratory Directed Research and Development Program - Strategic Hire (Award #11184)
Extreme-scale science requires scientists to combine multiple computational tasks in complex workflows, efficiently manage large amounts of data, and fully exploit the performance of the entire DOE computing ecosystem, from the edge to supercomputers. To this end, ORNL scientists have to express their research ideas as workflows that can be implemented by computer scientists. Going from science to workflows requires domain scientists to express their needs from a different perspective so that the most suitable and efficient tools can be selected. However, this is an error-prone and tedious process, as there is no turnkey solution in such a diverse ecosystem. Thus, we will develop a comprehensive simulation-based framework that helps scientists easily prototype their scientific workflows while expressing all the important information computer scientists need to provide the most efficient implementation. We will build simulators that give scientists major insights into the expected performance and scalability of their workflow as it changes over time. These simulators will grow in accuracy and fidelity as we further develop this long-term vision and engage with all the needed actors, from facilities to scientific applications. We will eventually create a world-class scientific instrument for the performance evaluation of scientific workflows in the early stages of their design. Our framework will ultimately lead to enhanced productivity and the ability for DOE scientists to run their workflows at larger scales.