Hybrid modeling: The complexity of learning to learn

In 2022, I was fortunate to win the WIS Professional Development Scholarship. With the funds, I attended the 5th Annual Learning for Dynamics & Control Conference at the University of Pennsylvania in June 2023. It was a great opportunity to learn more about hybrid modeling and how to apply it to my current research. In this blog post, I would like to explain, simply and briefly, what hybrid modeling is and how I'm using it in my project.


Imagine two artists who claim to know how to paint landscapes. We know that the techniques they use to choose colors are different: Artist 1 has seen thousands of daylight landscapes. She doesn’t know why she chooses the colors she does; she has simply learned patterns from the thousands of landscapes she has seen. Artist 2 hasn't seen many landscapes; instead, she has read a lot about the colors in a landscape, with only a few examples accompanying the explanations.

We want to test the skills of both artists, so we give them three tasks.

Task 1: Paint a landscape with grass, sky, and some flowers.


Artist 1's answer: She knows that the sky is blue, grass is green, and usually, flowers have warm and vivid colors.


Artist 2's answer: She knows that the sky is blue due to the scattering of shorter-wavelength blue light by molecules and small particles in the Earth's atmosphere. The green color of grass results from the reflection of green light by the chlorophyll pigment in plant cells, as it is the part of the spectrum of visible light that is not absorbed by chlorophyll during photosynthesis. She has also read that the color of a flower is determined by the presence of pigments such as chlorophyll (which makes plants green), carotenoids (which create yellow and orange colors), and various types of anthocyanins (which produce red, blue, and purple hues).

Both artists paint the task correctly.

Task 2: We add a new element to the landscape: a cat on the grass.


Artist 1's answer: Among the thousands of examples the first artist has learned from, she has seen a brown cat. So, she paints the cat brown.


Artist 2's answer: She has never read about cats, so she doesn’t know what to do. "Maybe there is an error in the specification," she thinks; and since the cat is on the grass, she paints it green too.


Artist 2 fails.


Task 3: The landscape has a sunset.


Artist 1's answer: She knows that a sunset should be colorful, so she paints it with several colors without considering where each color belongs.


Artist 2's answer: The second artist knows that she must add new colors to the sky, transitioning from red at the bottom to blue at the top.


Artist 1 fails.


From the results, which of the two artists performed best? Whom would you trust more? Or, better yet, how could you combine the knowledge of both artists?

In the example, the first artist represents purely data-driven machine learning techniques, and the second artist represents traditional modeling approaches, such as mechanistic models described by ordinary differential equations (ODEs). Sometimes the first artist’s knowledge is enough: the task gets solved correctly, with a few pitfalls, and that’s all. But what if lives are involved? We need to understand how the model makes decisions, and we need to make sure it adheres to physical rules (such as never prescribing a negative amount of medicine, which wouldn’t make sense). The good news is that we can mix the two ways of learning, creating interpretable machine learning models through hybrid modeling.


This hybrid modeling approach can accurately capture complex patterns from the data while adhering to physics principles. It has been applied in various fields such as geophysics [1], epidemiology [2], and fluid dynamics [3].

In particular, I’d like to explain how we are using hybrid modeling in the AIMS Lab at OHSU as a tool for personalized simulation of the glucoregulatory system of individuals living with type 1 diabetes (T1D). On one hand, we have a glucoregulatory model that uses ODEs (Artist 2), and on the other, we have a large amount of glucose data collected from people living with T1D, along with detailed information about their routines, such as the timing of exercise or continuous heart rate data (Artist 1). The glucoregulatory model is great because it is interpretable. For instance, from the model, we can learn about the impact of different doses of insulin on an average person with T1D. However, these models are not perfect because they don’t account for all the little aspects of a person's life that affect glucose levels, such as exercise, hormone cycles, stress, and other external disturbances.
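To give a feel for what an ODE-based glucoregulatory model looks like, here is a toy sketch based on the classic Bergman minimal model of glucose–insulin dynamics. This is not our lab's actual model: the parameter values, the constant insulin input, and the simple forward-Euler integration are all made-up placeholders for illustration.

```python
# Bergman-style minimal model of glucose-insulin dynamics (illustrative only).
# G: plasma glucose (mg/dL), X: remote insulin effect, insulin(t): insulin input.
# All parameter values are placeholders, not fitted to any real person.

def simulate_glucose(G0=180.0, Gb=100.0, Ib=10.0,
                     p1=0.02, p2=0.03, p3=1e-5,
                     insulin=lambda t: 25.0,  # toy constant insulin infusion
                     dt=1.0, minutes=240):
    """Forward-Euler simulation; returns the glucose trajectory, one value per minute."""
    G, X = G0, 0.0
    trajectory = [G]
    for step in range(int(minutes / dt)):
        t = step * dt
        dG = -p1 * (G - Gb) - X * G             # relaxation toward basal, plus insulin action
        dX = -p2 * X + p3 * (insulin(t) - Ib)   # remote insulin-effect compartment
        G, X = G + dt * dG, X + dt * dX
        trajectory.append(G)
    return trajectory

traj = simulate_glucose()
print(round(traj[0]), round(traj[-1]))  # glucose decreases over the simulation
```

A model like this is interpretable: each parameter has a physiological meaning (e.g., p1 is the rate at which glucose returns to its basal level), which is exactly the property that makes the ODE side of a hybrid model attractive.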

To address this problem, we can use data from people living with T1D along with machine learning models to learn complex, unmodeled glucose patterns driven by factors such as physical activity, sleep efficiency, or the macronutrients in food. Without going into the technical details of how we implement hybrid modeling: we use the ODE output as a baseline glucose simulation, and then use machine learning to adjust that baseline for the other information in a person's life.


In conclusion, in modeling and in life, there is not a single correct way to do things. All methods, akin to the painting styles of the artists in the initial example, are merely approaches. Perhaps the best course of action is to learn about all of them and use the methods that best suit our needs. Alternatively, we can combine the strengths of different methods to make our approach more robust and adaptable. 


Sources

[1] J. Yang, H. Wang, Y. Sheng, Y. Lin, and L. Yang, “A Physics-guided Generative AI Toolkit for Geophysical Monitoring.” arXiv, Jan. 06, 2024. doi: 10.48550/arXiv.2401.03131.

[2] D. Wu et al., “DeepGLEAM: A hybrid mechanistic and deep learning model for COVID-19 forecasting.” arXiv, Mar. 23, 2021. doi: 10.48550/arXiv.2102.06684.

[3] R. Wang, K. Kashinath, M. Mustafa, A. Albert, and R. Yu, “Towards Physics-informed Deep Learning for Turbulent Flow Prediction.” arXiv, Jun. 13, 2020. doi: 10.48550/arXiv.1911.08655.

Author Bio

Valentina Roquemen-Echeverri is a third-year graduate student at the Artificial Intelligence for Medical Systems (AIMS) Lab within the Department of Biomedical Engineering at OHSU. Her concentration on machine learning and data science has cultivated a strong interest in data, encompassing visualization, ETL (extract, transform, load), and modeling.

As a Latin American woman, she considers herself fortunate to occupy her current position. Nevertheless, she also recognizes her responsibility to turn science into a multicultural space, reducing the gap between those who have access to it and those who do not.


