-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathpresentation.tex
200 lines (157 loc) · 5.86 KB
/
presentation.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
\documentclass{beamer}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{default}
\usepackage{nicefrac}
\usepackage{microtype}
\begin{document}
%\title{Thousands of plots} %
\title{Project 6: Modeling unknown data}
\date{ECMI MW 2016}
\author{J. \'Sl\k{e}zak, C. Amorino, P. Andersson, M. Cornely, M. Eskelinen, D. Fidanov, D. Salgado}
\begin{frame}
\maketitle
\end{frame}
\begin{frame}
We found many directories containing files with two columns of some experimental data.
We tried plotting the numbers in one of them as a time series:
\centering
\includegraphics[width=0.7\textwidth, angle=-90, origin=c]{img/time_series_xy.eps}
\end{frame}
\begin{frame}
Let's look at the distributions of values for the two variables. Seems pretty Gaussian\dots
\centering
\includegraphics[width=0.6\textwidth, angle=-90, origin=c]{img/histograms_xy.eps}
\end{frame}
%\begin{frame}
%It looks a bit Gaussian. What do Kolmogorov and Smirnov say?
%
%For variable 1, normality is not excluded by the test:
%\[
%\text{p-value} = 0.08741
%\]
%
%For variable 2, it seems quite abnormal:
%\[
%\text{p-value} = 1.972e-05
%\]
%Let's look at the density anyway\dots
%\end{frame}
\begin{frame}
Plotting the kernel density and the theoretical Gaussian, it seems pretty close, we'll roll with it.
\centering
\includegraphics[height=\textwidth, angle=-90, origin=c]{img/densities_xy.eps}
\end{frame}
\begin{frame}
Wait, what about correlation between the two variables? Let's do a line plot:
\centering
\includegraphics[height=0.8\textwidth, angle=-90, origin=c, trim={2cm 1cm 5cm 2cm}, clip=false]{img/trajectory.eps}\\
Not much relation between the variables, but the overall behaviour is not completely random either. Let's compare\dots
\end{frame}
\begin{frame}
\begin{figure}
\centering
\includegraphics[height=0.8\textwidth, angle=-90, origin=c, trim={6cm 4cm 18cm 2cm}, clip=false]{img/comparison_2000pts.eps}
\caption{Left: First 2000 pairs from the data. Right: 2000 random pairs of independent white Gaussian noise}
\end{figure}
The jumps between the consecutive points are small compared to actual random variables. Hints of dynamics\dots
\end{frame}
\begin{frame}
If there are dynamics behind the data, we should see the later values depending on the earlier ones. Can we measure it somehow?
\begin{block}{Yes!}<2->
For a variable $X$ with mean $\mu$ and variance $\sigma^2$, we can define the autocorrelation as a function of time as
\[
\mathrm{acor}(\tau) = \frac{E[(X_t - \mu)(X_{t+\tau} - \mu)]}{\sigma^2}.
\]
This tells us the dependence between values of $X$ that are time $\tau$ apart.
\end{block}
\end{frame}
\begin{frame}
\centering
\includegraphics[height=\textwidth, angle=-90, origin=c, trim={2cm 1cm 8cm 1cm}, clip=false]{img/acf_xy.eps}\\
Noise would only have a peak at 0, so we have something! Let's check these in log scale\dots
\end{frame}
\begin{frame}
\centering
\includegraphics[height=\textwidth, angle=-90, origin=c, trim={2cm 1cm 8cm 1cm}, clip=false]{img/log_acf_xy.eps}\\
The dependence between values seems to decay exponentially\\ (for a while\dots)
\end{frame}
\begin{frame}
The only stochastic process which produces this decay for data that seems Gaussian is the \emph{Ornstein-Uhlenbeck process},
\[
\frac{\mathrm dx(t)}{\mathrm dt} = - k x(t) + \eta(t).
\]
Here
\begin{itemize}
\item $k$ is the stiffness parameter for a harmonic force
\item $\eta$ is white Gaussian noise
\end{itemize}
\end{frame}
\begin{frame}
The solution of the equation is
\[
x(t) = \int_{-\infty}^t e^{-k (t-s)}\mathrm dB(s),
\]
with $B$ denoting Brownian motion ($\mathrm dB(s) = \eta(s)\mathrm ds$).
We find the autocorrelation for this process to be
%\[
\begin{align*}
\mathrm{acor}(x(\tau)) &= \frac{E[x(0)x(\tau)]}{\sigma^2} \\
&=\dots \\
&= e^{-kt}.
\end{align*}
\uncover<2->{\emph{We can fit this to the autocorrelation of the data to determine $k$!}}
\end{frame}
\begin{frame}
\centering
\includegraphics[height=0.5\textwidth, angle=-90, origin=c, trim={1cm 1cm 1cm 1cm}, clip=false]{img/fit_actual.eps}
The fit gives us the values of $k$ for variables V1 and V2
\begin{align*}
k_{V1} &= 106.00,\\
k_{V2} &= 119.25.
\end{align*}
\end{frame}
\begin{frame}
How do we know if the $k$ we find is right? \emph{Modelling!}
We can integrate the Ornstein-Uhlenbeck process between two consecutive points $\Delta t$ apart to derive a formula for simulation:
\begin{align*}
x_{n+1} &= \alpha x_n + z_n,
\end{align*}
where
\begin{align*}
\alpha = e^{-k \Delta t}
\end{align*}
and $z_n$ is a white Gaussian noise with variance $(1-\alpha^2) \sigma^2$.
The filename mentions a frequency of 5kHz, so the time difference $\Delta t$ between the points is probably 0.2 ms.
\end{frame}
\begin{frame}
Let's calculate everything from the model using $\Delta t$ and fitted $k$ values:
\centering
\includegraphics[height=\textwidth, angle=-90, origin=c, trim={1cm 1cm 1cm 1cm}, clip=false]{img/everything_modeled.eps}\\
\end{frame}
\begin{frame}
For the simulated data, the linear fit results in the $k$ values
\begin{align*}
k_{V1} &= 109.37,\\
k_{V2} &= 138.61,
\end{align*}
which mean that for the first and second variables we are within 4\% and 17\% of the actual values in the simulation, respectively.
\end{frame}
\begin{frame}
Reversing the exact discrete model, we can also solve for the noise $z_n$:
\begin{align*}
x_{n+1} &= \alpha x_n + z_n \\
z_n &= x_{n+1} - \alpha x_n.
\end{align*}
If we know $\alpha$, we can now calculate $z_n$ from the data!
\end{frame}
\begin{frame}
\centering
\includegraphics[height=0.5\textwidth, angle=-90, origin=c, trim={1cm 1cm 1cm 1cm}, clip=false]{img/noise_acf.eps}\\
Calculating the autocovariance shows only a value of 1 at 0, indicating that the values $z_n$ are in fact white noise.
\begin{block}{Conclusion:}<2->
The data can be modeled quite well as an Ornstein-Uhlenbeck process
(and we didn't even need to know it was describing optical tweezers)!
\end{block}
\end{frame}
\end{document}