Quickstart

In the following, we will use some of the techniques provided by DataDrivenDiffEq to infer some models.

Linear Damped Oscillator - Dynamic Mode Decomposition

To begin, let's create our own data for the linear oscillator with damping.

using OrdinaryDiffEq
using Plots
gr()

using DataDrivenDiffEq
using LinearAlgebra

function linear!(du, u, p, t)
  du[1] = u[2]
  du[2] = -u[1] - 0.1*u[2]
end

u0 = Float64[0.99π; -0.3]
tspan = (0.0, 40.0)

problem = ODEProblem(linear!, u0, tspan)
solution = solve(problem, Tsit5(), saveat = 1.0)

plot(solution)

Let's assume we have just the trajectory data, which we will call X. Since we gathered the data at a fixed interval of one time unit, we will try to fit a linear model. And, of course, we use a subset of the data for training and the rest for testing.

X = Array(solution)

approximation = DMD(X[:, 1:20])

approx_prob = DiscreteProblem(approximation, u0, tspan)
approx_sol = solve(approx_prob, FunctionMap())
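
To see how well the estimate matches the data, we might overlay both trajectories (a small plotting sketch, not part of the original listing):

plot(solution, label = ["Data u₁" "Data u₂"])
plot!(approx_sol, label = ["Estimate u₁" "Estimate u₂"], linestyle = :dash)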

Yeah! The model fits! But what exactly did we do?

DMD is short for Dynamic Mode Decomposition, a technique which generates a linear model from data. So, given the data matrix X, we simply split it into two time-shifted data sets and performed a linear fit between them.
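
As a rough sketch of the idea (not necessarily what DMD does internally, which typically involves a singular value decomposition), the operator can be estimated by a least-squares fit between the time-shifted snapshot matrices:

# Sketch only: fit X₊ ≈ K * X₋ over the training window
X₋ = X[:, 1:19]
X₊ = X[:, 2:20]
K_ls = X₊ * pinv(X₋)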

Note that we fitted a discrete model, yet it reproduces our continuous data. This is possible because:

  • The measurements were taken at a fixed interval of 1.0
  • The original, unknown model is linear, so its exact solution sampled at that interval is itself a discrete, linear map

To check this, we can compare the operator of our linear fit with the matrix exponential of the original model.

dt = 1.0
K = operator(approximation)
norm(K - exp(dt*[0.0 1.0; -1.0 -0.1]), 2)

5.066942174957557e-5

The reason we use operator to retrieve the corresponding matrix of the approximation is the connection between Dynamic Mode Decomposition and the Koopman Operator. You might have noticed that the return value of DMD is a LinearKoopman.

The LinearKoopman overloads some useful functions from LinearAlgebra to perform analysis. Let's have a look at the eigenvalues of the operator:

scatter(eigvals(approximation))

# Add the stability margin
ϕ = 0:0.01π:2π
plot!(cos.(ϕ), sin.(ϕ),
  color = :red, linestyle = :dot,
  label = "Stability Margin",
  xlim = (-1,1), ylim = (-1,1), legend = :bottomleft)

For more information on the LinearKoopman, have a look at the corresponding documentation.

But wait! We want a continuous model. There is also a corresponding algorithm for this: gDMD! As opposed to DMD, which provides a discrete model based on the direct measurements X, gDMD estimates the generator of the dynamical system given X and the differential states DX. Since we did not measure any differential states, we can just provide a vector of time measurements; gDMD will then automatically interpolate using DataInterpolations.jl and perform numerical differentiation using FiniteDifferences.jl.
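
As a rough illustration of the differentiation step (a sketch of the concept, not gDMD's actual internals), differential states could be estimated from equally spaced samples like this:

# Sketch: approximate DX from snapshots X taken at a fixed interval dt
function central_differences(X, dt)
  DX = similar(X)
  DX[:, 2:end-1] = (X[:, 3:end] - X[:, 1:end-2]) / (2dt)  # central differences inside
  DX[:, 1] = (X[:, 2] - X[:, 1]) / dt                     # forward difference at the start
  DX[:, end] = (X[:, end] - X[:, end-1]) / dt              # backward difference at the end
  return DX
end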

Here, we will provide gDMD with the measurement data and use a new sample time of 0.1.

t = solution.t
X = Array(solution)

generator_approximation = gDMD(t[1:20], X[:, 1:20], dt = 0.1)

generator_prob = ODEProblem(generator_approximation, u0, tspan)
generator_sol = solve(generator_prob, Tsit5())

Since we have a continuous estimate, let's look at its generator:

G = generator(generator_approximation)
norm(G-[0.0 1.0; -1.0 -0.1], 2)

0.017369773974287458

Nonlinear Systems - Extended Dynamic Mode Decomposition

But what about nonlinear systems? Even though Dynamic Mode Decomposition will help us find the best linear fit, we are interested in recovering the nonlinear parts of the equations as well. Luckily, Koopman theory covers this! To put it very (very, very) simply: if you spread your information out over many observable functions, you end up with a system that is linear in those observables. So you might face a trade-off between a huge system which is linear in the observables and a small, nonlinear system.

But how can we leverage this? We use Extended Dynamic Mode Decomposition, or EDMD for short. EDMD does more or less the same thing as DMD, but in a new basis of nonlinear observables. We will now investigate a fairly standard system with a slow and a fast manifold, for which an analytical solution exists.

using OrdinaryDiffEq
using Plots
gr()

using DataDrivenDiffEq
using LinearAlgebra
using ModelingToolkit

function slow_manifold(du, u, p, t)
  du[1] = p[1]*u[1]
  du[2] = p[2]*(u[2]-u[1]^2)
end

u0 = [3.0; -2.0]
tspan = (0.0, 10.0)
p = [-0.05, -1.0]

problem = ODEProblem(slow_manifold, u0, tspan, p)
solution = solve(problem, Tsit5(), saveat = 0.2)

X = Array(solution)
DX = solution(solution.t, Val{1})

Since we want to estimate the continuous system, we also capture the trajectory of the differential states. Now, we will create our nonlinear observables, which are represented as a Basis in DataDrivenDiffEq.jl.

@variables u[1:2]

observables = [u; u[1]^2]

basis = Basis(observables, u)
3 dimensional basis in ["u₁", "u₂"]
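
Why is u[1]^2 a good observable here? Since d(u₁²)/dt = 2u₁·du₁ = 2p₁·u₁², the observables [u₁, u₂, u₁²] obey a purely linear system. As a small sketch (written in the same ordering as the basis), its generator is:

# Lifted, linear dynamics in the observables y = [u₁, u₂, u₁²]:
#   dy₁ = p₁*y₁,  dy₂ = p₂*y₂ - p₂*y₃,  dy₃ = 2p₁*y₃
A_lift = [p[1] 0.0 0.0;
          0.0 p[2] -p[2];
          0.0 0.0 2p[1]]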

A Basis captures a bunch of functions defined over some variables provided via ModelingToolkit.jl. Here, we included the state and u[1]^2. Now, we simply call gEDMD, which will compute the generator of the Koopman Operator associated with the model.

approximation = gEDMD(X, DX, basis)

approximation_problem = ODEProblem(approximation, u0, tspan)
generator_sol = solve(approximation_problem, Tsit5(), saveat = solution.t)

Looking at the eigenvalues of the approximation, we see that the estimates for the lifted linear system are close to the true values.
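
For example (a small sketch, assuming generator can be applied to the result of gEDMD just as it was to the result of gDMD above), the eigenvalues of the lifted system are analytically p₁, p₂ and 2p₁, so we can compare:

# Estimated vs. analytical eigenvalues of the lifted linear system
sort(real.(eigvals(generator(approximation))))
sort([p[1], p[2], 2p[1]])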

Nonlinear Systems - Sparse Identification of Nonlinear Dynamics

Okay, so far we can fit linear models via DMD and nonlinear models via EDMD. But what if we want to find a model of a nonlinear system without moving to Koopman space? Simple, we use Sparse Identification of Nonlinear Dynamics or SINDy.

As the name suggests, SINDy finds the sparsest set of basis functions that reproduces the observed trajectory. Again, we will start with a nonlinear system

using DataDrivenDiffEq
using ModelingToolkit
using OrdinaryDiffEq
using LinearAlgebra
using Plots
gr()

function pendulum(u, p, t)
    x = u[2]
    y = -9.81sin(u[1]) - 0.1u[2]
    return [x;y]
end

u0 = [0.4π; 1.0]
tspan = (0.0, 20.0)
problem = ODEProblem(pendulum, u0, tspan)
solution = solve(problem, Tsit5(), abstol = 1e-8, reltol = 1e-8, saveat = 0.001)

X = Array(solution)
DX = solution(solution.t, Val{1})

which is the simple nonlinear pendulum with damping.

Suppose we know nothing about the system; we have just the data in front of us. To apply SINDy, we need three ingredients:

  • A Basis containing all possible candidate functions which might be in the model
  • An optimizer which is able to produce a sparse output
  • A threshold for the optimizer

It might seem that the third point is really a parameter of the optimizer (which it is), but, nevertheless, deciding where to cut off small coefficients is a crucial choice.
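
To make the role of the optimizer and the threshold concrete, here is a minimal sketch of sequentially thresholded least squares, one classic way to obtain a sparse solution (for illustration only; this is not the implementation behind the optimizers shipped with DataDrivenDiffEq):

# Sketch: given a feature matrix Θ (samples × candidates) and derivatives
# DX (samples × states), repeatedly zero out small coefficients and refit
function stlsq(Θ, DX; λ = 0.1, iters = 10)
  Ξ = Θ \ DX                      # initial least-squares solution
  for _ in 1:iters
    small = abs.(Ξ) .< λ          # coefficients below the threshold
    Ξ[small] .= 0.0
    for k in axes(DX, 2)          # refit the surviving terms per state
      big = .!small[:, k]
      Ξ[big, k] = Θ[:, big] \ DX[:, k]
    end
  end
  return Ξ
end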

So, let's create a bunch of basis functions for our problem first

@variables u[1:2]

h = Operation[u; u.^2; u.^3; sin.(u); cos.(u); 1]

basis = Basis(h, u)

DataDrivenDiffEq comes with some optimizers for tackling sparse regression problems. Here, we will use SR3, a sparse relaxed regularized regression optimizer. We choose a threshold of 3e-1 and start the optimizer.

opt = SR3(3e-1, 1.0)
Ψ = SINDy(X[:, 1:1000], DX[:, 1:1000], basis, opt, maxiter = 10000, normalize = true)
2 dimensional basis in ["u₁", "u₂"]
du₁ = p₁ * u₂
du₂ = sin(u₁) * p₃ + p₂ * u₂

We recovered the equations! Let's transform the SINDyResult into a performant piece of Julia code using ODESystem.

sys = ODESystem(Ψ)
p = parameters(Ψ)

dudt = ODEFunction(sys)

estimator = ODEProblem(dudt, u0, tspan, p)
estimation = solve(estimator, Tsit5(), saveat = solution.t)
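
To judge the estimate, we might overlay it with the original trajectory (again just a small plotting sketch):

plot(solution, label = ["Data u₁" "Data u₂"], color = :black)
plot!(estimation, label = ["Estimate u₁" "Estimate u₂"], linestyle = :dash)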