- $\xi_1,\dots,\xi_n$ be the PCA basis functions of the input space $\mathcal X$. The operator $K_\mathcal X$ for a given $x\in \mathcal X$ would then be $K_\mathcal X(x) :=\mathrm Lx = \{\langle\xi_j,x\rangle\}_j$.
- $\psi_1,\dots,\psi_m$ be the PCA basis functions of the output space $\mathcal Y$.
The final approximation $\mathcal G^\dagger_{\text{PCA}}:\mathcal X \times \Theta \rightarrow \mathcal Y$ is then given by:

$$\mathcal G^\dagger_{\text{PCA}}(x,\theta) = \sum_{j=1}^m \varphi_j\big(K_\mathcal X(x),\theta\big)\,\psi_j$$

where $\varphi:\mathbb R^n\times\Theta\rightarrow\mathbb R^m$ is a neural network acting on the PCA coefficients of the input.

One of the big problems of these approaches is the fact that $L_\mathcal Y$ is a linear map: whatever the network does, the output is always a linear combination of the fixed basis functions $\psi_1,\dots,\psi_m$.
Let:
- $\mathcal X$ and $\mathcal Y$ be function spaces over $\Omega \subset \mathbb R^d$.
- $\mathcal G^\dagger$ be the composition of non-linear operators: $\mathcal G^\dagger=S_1\circ \dots \circ S_L$. In the linear case, as described before, $S_1 = K_\mathcal X$, $S_L = K_\mathcal Y$, and they are connected through multiple maps $\varphi_j$.
The above definition *looks a lot* like the typical definition of NNs, where each of the $S_l$ is a layer of your network. And, as we're going to see, it is! At least, it is a generalization of the definition of an NN to function spaces.
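As a rough illustration of that layered view (illustrative Python only, with trivial stand-ins for the $S_l$):

```python
from functools import reduce

def chain(*ops):
    """Apply operators in the listed order, so chain(S1, ..., SL) realizes
    G = S_1 o ... o S_L with S_1 = K_X acting on the input first."""
    return reduce(lambda f, g: lambda v: g(f(v)), ops)

# Trivial stand-ins: in the linear PCA case these would be K_X, the phi_j, K_Y.
S1 = lambda v: [2.0 * x for x in v]
S2 = lambda v: [x + 1.0 for x in v]

G = chain(S1, S2)
print(G([1.0, 2.0]))   # [3.0, 5.0]
```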
[9] proposed to construct each of these $S_l$ as follows (a discretized sketch of such a layer is given after the list):

$$S_l(v)(x) = \sigma_l\left(W_l\,v(x) + b_l + \int_\Omega \kappa_l(x,y)\,v(y)\,\mathrm dy\right)$$

where:
- $\sigma_l:\mathbb R^k\rightarrow\mathbb R^k$ is the non-linear activation function.
- $W_l\in\mathbb R^{k\times k}$ is a term related to a "residual network". This term is not necessary for convergence, but it is credited with helping convergence speed.
- $b_l\in\mathbb R^k$ is the bias term.
- $\kappa_l:\Omega\times\Omega\rightarrow\mathbb R^{k\times k}$ is the kernel function.
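Here is a minimal NumPy sketch of one such layer under a simple grid discretization (the random kernel and weights are illustrative stand-ins for learned parameters, not the implementation from [9]; the integral is approximated by a mean over grid points):

```python
import numpy as np

rng = np.random.default_rng(0)
N, k = 64, 4                          # grid points in Omega, channel width k

# Illustrative stand-ins for the learned parameters of one layer l:
W = rng.standard_normal((k, k)) / np.sqrt(k)   # residual-style term W_l
b = np.zeros(k)                                # bias term b_l
kappa = rng.standard_normal((N, N, k, k))      # kappa_l(x_i, y_j) as k x k blocks

def S(v, sigma=np.tanh):
    """One layer: sigma(W_l v(x) + b_l + integral of kappa_l(x,y) v(y) dy),
    with the integral replaced by a mean over the grid points y_j."""
    integral = np.einsum('xyij,yj->xi', kappa, v) / N
    return sigma(v @ W.T + b + integral)

v = rng.standard_normal((N, k))       # v(x) sampled on the grid
print(S(v).shape)                     # (64, 4), same discretization as the input
```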