# Problems

`DataDrivenDiffEq.DataDrivenProblem` - Type

`struct DataDrivenProblem{dType, cType, probType} <: DataDrivenDiffEq.AbstractDataDrivenProblem{dType, cType, probType}`

The `DataDrivenProblem` defines a general estimation problem given measurements, inputs and (in the near future) observations. Three construction methods are available:

- `DirectDataDrivenProblem` for direct mappings
- `DiscreteDataDrivenProblem` for time discrete systems
- `ContinuousDataDrivenProblem` for systems continuous in time

All three are aliases for constructing a problem.

**Fields**

- `X`: State measurements
- `t`: Time measurements (optional)
- `DX`: Differential state measurements (optional); used for time continuous problems
- `Y`: Output measurements (optional); used for direct problems
- `U`: Input measurements (optional); used for non-autonomous problems
- `p`: Parameters associated with the problem (optional)
- `name`: Name of the problem

**Example**

```
# Placeholder: X, DX and t hold your measured states, derivatives and time points
X, DX, t = data
# Define a discrete time problem
prob = DiscreteDataDrivenProblem(X)
# Define a continuous time problem without explicit time points
prob = ContinuousDataDrivenProblem(X, DX)
# Define a continuous time problem without explicit derivatives
prob = ContinuousDataDrivenProblem(X, t)
# Define a discrete time problem with the input signal given as a function
input_signal(x, p, t) = t^2
prob = DiscreteDataDrivenProblem(X, t, input_signal)
```

## Defining a Problem

Problems of identification, estimation, or inference are defined by data. These data contain at least measurements of the states `X`, which would be sufficient to describe a `DiscreteDataDrivenProblem` with unit time steps, similar to the first example on dynamic mode decomposition. Of course, we can extend this to include time points `t`, control signals `U`, or a function describing those, `u(x,p,t)`. Additionally, any parameters `p` known a priori can be included in the problem. In practice, this looks like:

```
problem = DiscreteDataDrivenProblem(X)
problem = DiscreteDataDrivenProblem(X, t)
problem = DiscreteDataDrivenProblem(X, t, U)
problem = DiscreteDataDrivenProblem(X, t, U, p = p)
problem = DiscreteDataDrivenProblem(X, t, (x,p,t)->u(x,p,t))
```
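As a rough illustration of the data such a problem expects, the sketch below generates snapshots of a made-up linear map in plain Julia. The matrix `A`, the initial state `x0`, and the layout (one row per state, one column per measurement) are assumptions chosen for this example, not something prescribed by the text above.

```julia
# Hypothetical linear map x[i+1] = A * x[i]; A and x0 are made up for illustration.
A = [0.9 -0.2; 0.0 0.2]
x0 = [10.0, -10.0]
n_steps = 10

# Assumed layout: one row per state, one column per measurement.
X = zeros(2, n_steps + 1)
X[:, 1] = x0
for i in 1:n_steps
    X[:, i + 1] = A * X[:, i]
end
t = Float64.(0:n_steps)   # unit time steps

# Each neighbouring column pair (X[:, i], X[:, i + 1]) is one sample of the map.
@assert X[:, 2] ≈ A * X[:, 1]

# With DataDrivenDiffEq loaded, the data would be passed on as in the listing above:
# problem = DiscreteDataDrivenProblem(X, t)
```

Passing only `X` corresponds to the unit-time-step case; `t` matters once the sampling is not uniform or inputs depend on time.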

Similarly, a `ContinuousDataDrivenProblem` would need at least measurements and time derivatives (`X` and `DX`), or measurements, time information, and a way to derive the time derivatives (`X`, `t`, and a collocation method). Again, this can be extended by including a control input, given as measurements or as a function, and possible parameters:

```
# Using available data
problem = ContinuousDataDrivenProblem(X, DX)
problem = ContinuousDataDrivenProblem(X, t, DX)
problem = ContinuousDataDrivenProblem(X, t, DX, U, p = p)
problem = ContinuousDataDrivenProblem(X, t, DX, (x,p,t)->u(x,p,t))
# Using collocation
problem = ContinuousDataDrivenProblem(X, t, InterpolationMethod())
problem = ContinuousDataDrivenProblem(X, t, GaussianKernel())
problem = ContinuousDataDrivenProblem(X, t, U, InterpolationMethod())
problem = ContinuousDataDrivenProblem(X, t, U, GaussianKernel(), p = p)
```
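To make the role of `DX` concrete, the following sketch estimates time derivatives from `X` and `t` with plain finite differences. This is a simple stand-in written for illustration, not one of the package's collocation methods; the helper `finite_difference` and the test system `dx/dt = -0.5x` are assumptions of this example.

```julia
# Hypothetical system dx/dt = -0.5 x, sampled from its analytic solution.
t = collect(0.0:0.1:2.0)
X = reshape(exp.(-0.5 .* t), 1, :)      # 1 state × 21 measurements

# Illustrative helper: central differences in the interior,
# one-sided differences at both ends.
function finite_difference(X, t)
    DX = similar(X)
    n = size(X, 2)
    DX[:, 1] = (X[:, 2] - X[:, 1]) / (t[2] - t[1])
    DX[:, n] = (X[:, n] - X[:, n - 1]) / (t[n] - t[n - 1])
    for i in 2:(n - 1)
        DX[:, i] = (X[:, i + 1] - X[:, i - 1]) / (t[i + 1] - t[i - 1])
    end
    return DX
end

DX = finite_difference(X, t)

# DX has the same shape as X and approximates -0.5 .* X in the interior.
@assert size(DX) == size(X)

# With DataDrivenDiffEq loaded, the result would be passed on as in the listing above:
# problem = ContinuousDataDrivenProblem(X, t, DX)
```

Finite differences amplify measurement noise, which is one reason to prefer a collocation or interpolation method when the data are noisy.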

You can also directly use a `DESolution` as an input to your `DataDrivenProblem`:

`problem = DataDrivenProblem(sol; kwargs...)`

This evaluates the function at the specific time points `t` using the parameters `p` of the original problem instead of using the interpolation. If you want to use the interpolated data, add the keyword `use_interpolation = true`.

An additional type of problem is the `DirectDataDrivenProblem`, which does not assume any kind of causal relationship. It is defined by `X` and an observed output `Y` in addition to the usual arguments:

```
problem = DirectDataDrivenProblem(X, Y)
problem = DirectDataDrivenProblem(X, t, Y)
problem = DirectDataDrivenProblem(X, t, Y, U)
problem = DirectDataDrivenProblem(X, t, Y, p = p)
problem = DirectDataDrivenProblem(X, t, Y, (x,p,t)->u(x,p,t), p = p)
```

## Concrete Types

`DataDrivenDiffEq.DiscreteDataDrivenProblem` - Function

A time discrete `DataDrivenProblem` usable for problems of the form `f(x[i],p,t,u) ↦ x[i+1]`.

```
DiscreteDataDrivenProblem(X; kwargs...)
```

`DataDrivenDiffEq.ContinuousDataDrivenProblem` - Function

A time continuous `DataDrivenProblem` usable for problems of the form `f(x,p,t,u) ↦ dx/dt`.

```
ContinuousDataDrivenProblem(X, DX; kwargs...)
```

Automatically constructs derivatives via an additional collocation method, which can be either a collocation or an interpolation from `DataInterpolations.jl` wrapped by an `InterpolationMethod`.

`DataDrivenDiffEq.DirectDataDrivenProblem` - Function

A direct `DataDrivenProblem` usable for problems of the form `f(x,p,t,u) ↦ y`.

```
DirectDataDrivenProblem(X, Y; kwargs...)
```

# Datasets

`DataDrivenDiffEq.DataDrivenDataset` - Type

`struct DataDrivenDataset{N, U, C} <: DataDrivenDiffEq.AbstractDataDrivenProblem{N, U, C}`

A collection of `DataDrivenProblem`s used to concatenate different trajectories or experiments.

Can be called with either an `NTuple` of problems or a `NamedTuple` of `NamedTuple`s. Similar to the `DataDrivenProblem`, it has three constructors available:

- `DirectDataset` for direct problems
- `DiscreteDataset` for discrete problems
- `ContinuousDataset` for continuous problems

**Fields**

- `name`: Name of the dataset
- `probs`: The problems
- `sizes`: The length of each problem (for internal use)

A `DataDrivenDataset` collects several `DataDrivenProblem`s of the same type but treats them as a union for the purpose of system identification.
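The `NamedTuple` of `NamedTuple`s form could be prepared as in the sketch below. The experiment names, the random placeholder data, and the assumption that the inner keys mirror the problem fields (`X`, `t`, and so on) are choices made for this illustration and should be checked against the package before use.

```julia
# Two hypothetical experiments of different length; the values are placeholders.
X1 = rand(2, 5);  t1 = Float64.(0:4)
X2 = rand(2, 8);  t2 = Float64.(0:7)

# Outer keys name the experiments; inner keys are assumed to mirror
# the problem fields (`X`, `t`, ...).
data = (
    experiment_a = (X = X1, t = t1),
    experiment_b = (X = X2, t = t2),
)

# Every experiment should carry one time point per measurement column.
for (name, d) in pairs(data)
    @assert size(d.X, 2) == length(d.t) "size mismatch in $name"
end

# With DataDrivenDiffEq loaded, the collection would then be handed to one of
# the dataset constructors, for example:
# dataset = DiscreteDataset(data)
```

Trajectories of different lengths are fine here, since each experiment keeps its own time vector.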

## Concrete Types

`DataDrivenDiffEq.DiscreteDataset` - Function

A time discrete `DataDrivenDataset` usable for problems of the form `f(x,p,t,u) ↦ x(t+1)`.

```
DiscreteDataset(s; name, kwargs...)
```

`DataDrivenDiffEq.ContinuousDataset` - Function

A time continuous `DataDrivenDataset` usable for problems of the form `f(x,p,t,u) ↦ dx/dt`.

```
ContinuousDataset(s; name, collocation, kwargs...)
```

Automatically constructs derivatives via an additional collocation method, which can be either a collocation or an interpolation from `DataInterpolations.jl` wrapped by an `InterpolationMethod`, provided by the `collocation` keyword argument.

`DataDrivenDiffEq.DirectDataset` - Function

A direct `DataDrivenDataset` usable for problems of the form `f(x,p,t,u) ↦ y`.

```
DirectDataset(s; name, kwargs...)
```

## DataSampler

`DataDrivenDiffEq.DataSampler` - Type

`struct DataSampler{T} <: DataDrivenDiffEq.AbstractSampler`

A simple sampler container. Takes in `AbstractSampler`s to apply onto a `DataDrivenProblem` in the order they are given. If a `Split` sampler is provided, then it will be moved to the first index by definition.

`DataDrivenDiffEq.Split` - Type

`struct Split <: DataDrivenDiffEq.AbstractSampler`

Performs a train test split of the `DataDrivenProblem`, where `ratio` defines the (rough) percentage of training data.

If the optional keyword `shuffle` is set, the split samples from random shuffles of the data, allowing for repetition.

Returns ranges for training and testing data.
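The idea of returning index ranges rather than copies of the data can be sketched in plain Julia. The helper `split_ranges` below is hypothetical and written only for illustration; it is not the package's implementation and ignores the `shuffle` option.

```julia
# Hypothetical helper: split the indices 1:n into a training and a testing range.
# `ratio` is the (rough) fraction of measurements used for training.
function split_ranges(n::Int, ratio::Float64)
    # Keep at least one measurement on each side of the split.
    n_train = clamp(round(Int, ratio * n), 1, n - 1)
    return 1:n_train, (n_train + 1):n
end

train, test = split_ranges(100, 0.8)

# The two ranges cover all measurements without overlap.
@assert length(train) + length(test) == 100
@assert last(train) + 1 == first(test)
```

Ranges like these can index the columns of `X` directly, for example `X[:, train]`, assuming one column per measurement.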

`DataDrivenDiffEq.Batcher` - Type

`struct Batcher <: DataDrivenDiffEq.AbstractSampler`

Partitions the `DataDrivenProblem` into `n` equal partitions. If used after performing a train test `Split`, it works only on the training data.

If the optional keyword `shuffle` is set, the batcher samples from random shuffles of the data, allowing for repetition.

The optional keyword `repeated` allows for repeated sampling of data points.

`batchsize_min` is the minimum batch size to be used within each partition of the dataset.

Returns ranges for each partition of the provided data.
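Partitioning into roughly equal ranges can be sketched as follows. The helper `partition_ranges` is hypothetical and only illustrates the idea; it is not the package's implementation and does not cover `shuffle`, `repeated`, or `batchsize_min`.

```julia
# Hypothetical helper: cut an index range into `n` contiguous partitions
# whose lengths differ by at most one.
function partition_ranges(r::UnitRange{Int}, n::Int)
    base, extra = divrem(length(r), n)
    ranges = UnitRange{Int}[]
    start = first(r)
    for i in 1:n
        # The first `extra` partitions receive one additional index.
        stop = start + base - 1 + (i <= extra ? 1 : 0)
        push!(ranges, start:stop)
        start = stop + 1
    end
    return ranges
end

# For example, batching a training range of 80 measurements into 3 partitions.
batches = partition_ranges(1:80, 3)

# Together the partitions cover the full range exactly once.
@assert sum(length, batches) == 80
```

Here the partitions have lengths 27, 27 and 26, since 80 is not divisible by 3.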