Initial project setup and outline (9df9c359) · Commits · Amil Khan / Next Generation Reservoir Computing

NGRC-model.md

0 → 100644

+22 −0

Original line number	Diff line number	Diff line
		# Lorenz System as an Example Dynamical System
		$$
		\begin{aligned}
		\frac{dx}{dt} &= \sigma (y - x), \\
		\frac{dy}{dt} &= x(\rho - z) - y, \\
		\frac{dz}{dt} &= xy - \beta z,
		\end{aligned}
		$$

		where $\sigma = 10$, $\rho = 28$, and $\beta = \frac{8}{3}$.

		These equations define the Lorenz system, a nonlinear dynamical system that generates complex and chaotic behavior. Numerically integrating the system yields time-series data. In this project, only a single observed variable, $x(t)$, is used and treated as a one-dimensional time series.
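As a rough illustration of how such a time series could be generated, the sketch below integrates the Lorenz equations with a classical fourth-order Runge-Kutta scheme and keeps only the $x$ component. The integrator, the step size `dt = 0.01`, the initial condition `(1, 1, 1)`, and the helper names are assumptions made for this example and are not taken from the project.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, state, dt):
    """Advance `state` by one classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_x(n_steps, dt=0.01, initial=(1.0, 1.0, 1.0)):
    """Return the samples x(dt), x(2 dt), ..., x(n_steps * dt)."""
    state = initial
    xs = []
    for _ in range(n_steps):
        state = rk4_step(lorenz, state, dt)
        xs.append(state[0])  # keep only the observed variable x
    return xs


# Assumed settings: 2000 samples at dt = 0.01.
x_series = simulate_x(2000)
```

The resulting list `x_series` plays the role of the one-dimensional observed series. In practice a library integrator with error control may be preferable to a fixed-step scheme.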

		The objective is to construct a data-driven model that uses past values of $x(t)$ to predict the next value, which serves as the basis for building the NGRC model.


		$$
		x_i = x(i\,\Delta t), \quad i = 1,2,\dots,N
		$$


		The objective is to use past values of $x_i$ to predict the next value $x_{i+1}$.

NGRC.md

0 → 100644

+113 −0

		# Next Generation Reservoir Computing

		## Feature Vector
		- NGRC (also called NVAR) constructs the feature vector directly from discretely sampled input data, without using a neural network or reservoir.

		- The feature vector is formed as
		$$
		O_{\text{total}} = c \;\oplus\; O_{\text{lin}} \;\oplus\; O_{\text{nonlin}}
		$$
		where $c$ is a constant term, $O_{\text{lin}}$ is the linear part, and $O_{\text{nonlin}}$ contains nonlinear features.

		- As in traditional reservoir computing, the output at time step $i$ is obtained by a linear combination of the feature vector using
		$$
		Y_i = W_{\text{out}}\, O_{\text{total},i}
		$$

		- The output weights are learned using Tikhonov (ridge) regularization, exactly as in classical reservoir computing.

		- The key difference is that all features are constructed explicitly from the input data, rather than being generated by a recurrent neural network.

		## Linear Features
		- The linear feature vector $O_{\text{lin},i}$ is constructed from the input data by using the current input and past inputs.

		- At time step $i$, the features include the input vector $X$ at time $i$ and at $k-1$ previous time steps.

		- These past observations are spaced by a fixed step size $s$, meaning that $(s-1)$ samples are skipped between consecutive observations.

		- If the input $X_i = [x_{1,i}, x_{2,i}, \dots, x_{d,i}]^T$ is a $d$-dimensional vector, then the linear feature vector has $d \times k$ components.

		- The linear feature vector is given by
		$$
		O_{\text{lin},i}
		=
		X_i \;\oplus\; X_{i-s} \;\oplus\; X_{i-2s} \;\oplus\; \dots \;\oplus\; X_{i-(k-1)s}
		$$
		where $\oplus$ denotes vector concatenation.
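The construction of the linear feature vector can be sketched in a few lines. The function name `linear_features` and the toy two-dimensional series are hypothetical and serve only to show how $k$ and $s$ select the delayed samples.

```python
def linear_features(series, i, k, s):
    """Concatenate X_i, X_{i-s}, ..., X_{i-(k-1)s} into one flat list.

    `series` is a list of d-dimensional samples (each a list of floats).
    """
    oldest = i - (k - 1) * s
    if oldest < 0:
        raise ValueError("not enough history: need i >= (k - 1) * s")
    features = []
    for j in range(k):
        features.extend(series[i - j * s])  # newest sample first
    return features


# Toy 2-dimensional series (illustrative only): X_i = [i, 10 * i]
series = [[float(i), 10.0 * i] for i in range(10)]
o_lin = linear_features(series, i=6, k=3, s=2)
print(o_lin)  # [6.0, 60.0, 4.0, 40.0, 2.0, 20.0]
```

With $d = 2$ and $k = 3$ the result has $d \times k = 6$ components, and with $s = 2$ one sample is skipped between consecutive observations.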


		Although universal approximation theory suggests that the number of delays $k$ should be very large, in practice the corresponding Volterra series converges rapidly, so small values of $k$ are sufficient and do not introduce significant error. This can be understood by analogy with multi-step numerical integration methods, where only a few past steps are needed for accurate predictions.

		A key advantage of NGRC is its short warm-up period. Only $s \times k$ time steps are required to construct the first feature vector, which is much shorter than in traditional reservoir computing, where long warm-up times are needed to remove dependence on initial conditions. This is especially important when data are limited.

		For driven dynamical systems or systems with accessible parameters, the feature vector $\mathcal{O}_t$ is extended to include the driving signal and/or system parameters.


		## Nonlinear features

		The nonlinear feature vector $\mathcal{O}_{\text{nonlin}}$ is constructed by augmenting the linear features $\mathcal{O}_{\text{lin}}$ with polynomial terms. In practice, low-order polynomials are sufficient and provide strong predictive performance.

		A quadratic nonlinear feature vector is obtained from the outer product of the linear features:
		$$
		\mathcal{O}_{\text{nonlin},i} = \mathcal{O}_{\text{lin},i} \otimes \mathcal{O}_{\text{lin},i}
		$$


		More generally, a polynomial feature vector of order $p$ is formed from all unique monomials of order $p$:
		$$
		\mathcal{O}^{(p)}_{\text{nonlin}}
		=
		\mathcal{O}_{\text{lin}}
		\lceil \otimes \rceil \mathcal{O}_{\text{lin}}
		\lceil \otimes \rceil
		\cdots
		\lceil \otimes \rceil \mathcal{O}_{\text{lin}}
		$$
		where $\mathcal{O}_{\text{lin}}$ appears $p$ times and $\lceil \otimes \rceil$ denotes the outer product with only the unique monomials retained.
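One possible way to enumerate the unique monomials is with multiset combinations of the linear features: a $p$-fold product of the linear feature vector with itself contains each degree-$p$ monomial several times, and keeping one copy of each leaves $\binom{n+p-1}{p}$ terms for $n$ linear features. The helper name `unique_monomials` is an assumption for this sketch.

```python
from itertools import combinations_with_replacement
from math import comb, prod


def unique_monomials(o_lin, p):
    """Return all unique degree-p monomials of the entries of `o_lin`."""
    return [prod(combo) for combo in combinations_with_replacement(o_lin, p)]


o_lin = [2.0, 3.0, 5.0]
quadratic = unique_monomials(o_lin, 2)
print(quadratic)  # [4.0, 6.0, 10.0, 9.0, 15.0, 25.0]

# The count equals the number of multisets of size p drawn from n entries.
n, p = len(o_lin), 2
assert len(quadratic) == comb(n + p - 1, p)
```

For the quadratic case this keeps 6 of the 9 entries of the full outer product, since $ab$ and $ba$ are the same monomial.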

		This construction provides the nonlinearity needed to model complex dynamical systems while keeping training linear and efficient.

		---
		### Next-Generation Reservoir Computing (NGRC)

		NGRC replaces the reservoir with an explicit feature vector built directly from observed data.

		#### Feature Vector Construction

		Given a time series:

		$$
		X(t), X(t-1), X(t-2), \dots
		$$

		Time-delay embedding (memory):

		$$
		\mathbf{d}(t) = [X(t), X(t-1), \dots, X(t-k+1)]
		$$

		Nonlinear feature expansion examples:

		$$
		X(t)^2,\quad X(t)X(t-1),\quad X(t-1)^2
		$$

		Final feature vector:

		$$
		\phi(t) = [\mathbf{d}(t),\ \text{nonlinear functions of } \mathbf{d}(t)]
		$$

		This feature vector replaces the reservoir.

		## Learning in NGRC

		Prediction is obtained using linear regression:

		$$
		\hat{X}(t+1) = W \, \phi(t)
		$$
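The whole pipeline (delay embedding, quadratic expansion, linear readout) can be sketched on a toy problem. The data below come from the Hénon map written as a scalar two-step recurrence, chosen here only because its update rule is exactly quadratic in $X(t)$ and $X(t-1)$. The toy system, the value of `alpha`, and the function name `ngrc_features` are assumptions for illustration, not part of the project.

```python
import numpy as np


def ngrc_features(x_now, x_prev):
    """phi(t) = [1, d(t), unique quadratic monomials of d(t)] for k = 2."""
    return np.array([
        1.0,                                            # constant term
        x_now, x_prev,                                  # d(t)
        x_now * x_now, x_now * x_prev, x_prev * x_prev  # quadratic terms
    ])


# Toy data (assumed): X(t+1) = 1 - 1.4 X(t)^2 + 0.3 X(t-1)
x = [0.0, 0.0]
for _ in range(400):
    x.append(1.0 - 1.4 * x[-1] ** 2 + 0.3 * x[-2])
x = np.array(x[100:])  # discard the initial transient

# Rows of Phi are phi(t); the matching targets are X(t+1).
Phi = np.array([ngrc_features(x[t], x[t - 1]) for t in range(1, len(x) - 1)])
targets = x[2:]

# Ridge-regularized least squares for the readout weights W.
alpha = 1e-10
W = np.linalg.solve(Phi.T @ Phi + alpha * np.eye(Phi.shape[1]), Phi.T @ targets)

pred = Phi @ W
max_err = np.max(np.abs(pred - targets))
print(np.round(W, 3))
```

Because the toy dynamics lie exactly in the span of the chosen features, the learned weights should come out close to $[1, 0, 0.3, -1.4, 0, 0]$. For a system such as Lorenz the fit is only approximate, and the choice of $k$, $s$, and polynomial order matters.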

RC.md

0 → 100644

+193 −0

		# Reservoir Computing

		### Dynamical Systems and Time Series

		A dynamical system is a system whose state evolves over time. Examples include natural systems (e.g., weather) and engineered systems (e.g., UAVs). In practice, such systems are observed as time-series data, where measurements are collected sequentially over time.

		The goal is to predict future system behavior using only the observed data, without requiring explicit knowledge of the underlying governing equations.



		### Reservoir Computing (RC)

		Reservoir Computing (RC) is a machine learning paradigm designed for learning and predicting the behavior of dynamical systems from time-series data.

		The central idea of RC is to use a fixed recurrent neural network, called the reservoir, to transform input signals into a rich, high-dimensional representation, while keeping the training process simple.

		Key properties:

		* Works well with small datasets
		* Uses linear optimization
		* Enables fast training
		* Effective for nonlinear and chaotic dynamics

		#### Structure of Reservoir Computing

		An RC model consists of three main components:

		* Input layer, which feeds the time-series data into the system
		* Recurrent neural network (the reservoir), which provides memory and nonlinear dynamics
		* Output layer, which maps the reservoir state to the desired prediction

		In Reservoir Computing, the internal reservoir connections remain fixed and untrained, and only the output layer is trained. This design makes RC computationally efficient and stable for modeling dynamical systems.

		The reservoir state evolves as:

		$$
		r(t) = f\left(W_r r(t-1) + W_{in}X(t)\right)
		$$

		where:

		* $X(t)$ is the input time series
		* $r(t)$ is the reservoir state at time $t$
		* $W_{in}$ and $W_r$ are fixed, randomly initialized weight matrices
		* $f(\cdot)$ is a nonlinear activation function applied element-wise
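A minimal sketch of this state update is given below, assuming $f = \tanh$, a reservoir of 50 nodes, a scalar sine-wave input, and arbitrary scaling factors for the random matrices. None of these values come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n_reservoir, n_inputs = 50, 1

# Fixed random matrices; the scaling factors are illustrative assumptions.
W_r = 0.9 * rng.normal(size=(n_reservoir, n_reservoir)) / np.sqrt(n_reservoir)
W_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_inputs))


def reservoir_step(r_prev, x_t):
    """r(t) = f(W_r r(t-1) + W_in X(t)) with f = tanh applied element-wise."""
    return np.tanh(W_r @ r_prev + W_in @ x_t)


# Drive the reservoir with a toy input and record the state trajectory.
inputs = np.sin(0.1 * np.arange(200)).reshape(-1, 1)
r = np.zeros(n_reservoir)
states = []
for x_t in inputs:
    r = reservoir_step(r, x_t)
    states.append(r)
states = np.array(states)  # shape (200, 50): one reservoir state per time step
```

Neither `W_r` nor `W_in` is modified after creation. Each row of `states` is one reservoir state $r(t)$, the quantity a linear readout would be trained on.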

		Only the output weights are trained:

		$$
		\hat{X}(t+1) = W_{out} r(t)
		$$

		Training is performed using regularized linear regression, which learns $W_{out}$ while keeping the reservoir dynamics unchanged.


		#### Role of Random Matrices

		The matrices $W_{in}$ and $W_r$ are:

		* Randomly initialized
		* Not trained
		* Fixed throughout learning

		Their role is to:

		* Mix input signals
		* Induce nonlinear transformations
		* Provide memory through recurrent connections

		Although random, these matrices create a rich, high-dimensional representation of the input time-series data, which enables effective learning using simple linear readouts.

		## Synchronization Between Reservoir and Data

		Let:

		* $X(t)$ denote the observed data
		* $r(t)$ denote the reservoir state

		Generalized synchronization refers to the situation in which the reservoir state becomes a stable function of the input history:

		$$
		r(t) = F\big(X(t), X(t-1), \dots\big)
		$$

		When synchronization occurs:

		* Reservoir dynamics are driven by the input data
		* The reservoir captures the underlying system behavior
		* Learning becomes stable and reliable

		For stable operation, the reservoir must satisfy the Echo State Property, which ensures that the reservoir state is uniquely determined by the input history and that the influence of initial conditions vanishes over time.
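The fading influence of initial conditions can be checked numerically. In the sketch below two copies of the same reservoir start from different random states and receive the same input. The recurrent matrix is scaled so that its largest singular value is 0.9, which is a sufficient (not necessary) condition for contraction with a $\tanh$ activation. The sizes, scalings, and input signal are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
W_r = rng.normal(size=(n, n))
W_r *= 0.9 / np.linalg.norm(W_r, 2)  # largest singular value becomes 0.9
W_in = rng.uniform(-1.0, 1.0, size=n)


def run(r0, inputs):
    """Drive the reservoir from initial state r0 and return the final state."""
    r = r0
    for x_t in inputs:
        r = np.tanh(W_r @ r + W_in * x_t)
    return r


inputs = np.sin(0.2 * np.arange(300))
r_a = run(rng.uniform(-1, 1, size=n), inputs)
r_b = run(rng.uniform(-1, 1, size=n), inputs)

# Distance between the two final states after 300 identical inputs.
gap = np.linalg.norm(r_a - r_b)
print(gap)
```

Since $\tanh$ is 1-Lipschitz, each step shrinks the distance between the two trajectories by at least a factor of 0.9, so `gap` ends up negligibly small and the final state depends only on the input history.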


		#### Limitations of Classical Reservoir Computing

		Despite its effectiveness, classical Reservoir Computing has several limitations:

		* Strong dependence on randomly initialized matrices
		* A large number of hyperparameters
		* Limited interpretability of the reservoir dynamics
		* No guarantee of optimal reservoir behavior




		# Background: Reservoir Computing (Research Paper)

		* The goal of reservoir computing is to take the input time-series data $X_i$ and expand it into a higher-dimensional space using a network called the reservoir.

		* The reservoir consists of $N$ interconnected nodes, and the states of all nodes at time step $i$ form an $N$-dimensional vector $r_i$.

		* The connections between reservoir nodes are defined by a matrix $A$, whose entries are chosen randomly and kept fixed.

		* The input data $X_i$ is injected into the reservoir through a fixed random input matrix $W$.

		* The reservoir is a dynamical system: its current state depends on both its previous state and the current input.

		* The reservoir state is updated according to
		$$
		r_{i+1} = (1-\gamma)r_i + \gamma f\left(A r_i + W X_i + b\right)
		$$

		* Here:

		* $r_i = [r_{1,i}, r_{2,i}, \dots, r_{N,i}]^T$ is the reservoir state at time step $i$
		- an $N$-dimensional vector
		- $r_{j,i}$ is the state of the $j$-th node at time step $i$
		* $\gamma$ is the decay rate controlling how much past information is retained
		* $f(\cdot)$ is a nonlinear activation function applied to each node
		* $b$ is a bias vector (taken to be the same for all nodes)

		* Time is discretized, so the reservoir state evolves step by step as new input samples arrive.

		* This update rule mixes past reservoir states with new input, creating memory and nonlinear transformations of the data.

		* The resulting reservoir state $r_i$ is later combined linearly to produce the output.

		* The output layer produces the reservoir computing output $Y_i$ by taking a linear combination of a feature vector constructed from the reservoir state $r_i$.

		* This relationship is written as
		$$
		Y_{i+1} = W_{\text{out}} O_{\text{total},i+1}
		$$
		where $W_{\text{out}}$ is the output weight matrix and $O_{\text{total},i+1}$ is the feature vector at time step $i+1$.

		* The feature vector $O_{\text{total},i}$ can include:

		* a constant term,
		* linear terms (the reservoir states $r_i$),
		* and possibly nonlinear functions of $r_i$.

		* In the standard reservoir computing setup:

		* the reservoir nodes use a nonlinear activation function, commonly
		$$
		f(x) = \tanh(x),
		$$
		* the output feature vector is usually linear, meaning it directly uses the reservoir state:
		$$
		O_{\text{total},i} = r_i.
		$$

		* Training is supervised: the model is given input–output pairs and learns to match the output $Y_i$ to a desired target $\hat{Y}_i$.

		* All training feature vectors are collected into a matrix $O_{\text{total}}$, and the output weights are learned using regularized least-squares regression (ridge/Tikhonov regression).

		* The solution for the output weight matrix is
		$$
		W_{\text{out}} = Y_d\, O_{\text{total}}^{T}\left( O_{\text{total}} O_{\text{total}}^{T} + \alpha I \right)^{-1},
		$$
		where:

		* $Y_d$ is the matrix of desired outputs (the training targets),
		* $\alpha$ is the regularization (ridge) parameter,
		* $I$ is the identity matrix.

		* The role of the regularization parameter $\alpha$ is to prevent overfitting and make the training robust to noise.

		* Only the output weights $W_{\text{out}}$ are trained; all reservoir and input matrices remain fixed.
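The points above can be combined into a small end-to-end sketch: a leaky reservoir update followed by the ridge solution for $W_{\text{out}}$. The reservoir size, $\gamma$, $\alpha$, the spectral-radius scaling, the warm-up length, and the sine-wave task are all illustrative assumptions rather than values from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
N, gamma, alpha = 100, 0.5, 1e-6  # illustrative values (assumed)

A = rng.normal(size=(N, N))
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))  # spectral radius 0.9 (assumed)
W = rng.uniform(-0.5, 0.5, size=(N, 1))
b = 0.1  # the same bias for every node


def step(r, x):
    """r_{i+1} = (1 - gamma) r_i + gamma * tanh(A r_i + W X_i + b)."""
    return (1.0 - gamma) * r + gamma * np.tanh(A @ r + W @ x + b)


# Toy task: one-step-ahead prediction of a sine wave.
X = np.sin(0.1 * np.arange(1200)).reshape(-1, 1)

r = np.zeros(N)
states = []
for x in X[:-1]:
    r = step(r, x)  # state r_{i+1}, computed after seeing X_i
    states.append(r)

warmup = 200  # discard states still influenced by the initial condition
O_total = np.array(states[warmup:]).T  # shape (N, T); columns are r_{i+1}
Y_d = X[1 + warmup:].T                 # shape (1, T); desired outputs X_{i+1}

# W_out = Y_d O^T (O O^T + alpha I)^{-1}, evaluated with a linear solve
# (valid because O O^T + alpha I is symmetric) instead of an explicit inverse.
G = O_total @ O_total.T + alpha * np.eye(N)
W_out = np.linalg.solve(G, O_total @ Y_d.T).T

Y = W_out @ O_total
rmse = np.sqrt(np.mean((Y - Y_d) ** 2))
print(rmse)
```

Only `W_out` is computed from data; `A`, `W`, and `b` are never updated. The reported `rmse` is a training error, so it says nothing about generalization on its own.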


		#### Linear Reservoir with Nonlinear Output

		An alternative formulation of Reservoir Computing shifts the nonlinearity from the reservoir to the output layer. In this approach, the reservoir nodes use a linear activation function, while the output feature vector becomes nonlinear. Despite this change, the model remains an equivalently powerful universal approximator and can achieve performance comparable to standard RC.

		A simple example is to extend the standard linear feature vector by including quadratic terms of the reservoir states, obtained using the Hadamard (element-wise) product. The resulting nonlinear feature vector is

		$$
		\mathcal{O}_i = r_i \oplus (r_i \odot r_i) = [r_{1,i}, r_{2,i}, \dots, r_{N,i}, r_{1,i}^2, r_{2,i}^2, \dots, r_{N,i}^2]^T
		$$

		where $\oplus$ denotes vector concatenation and $\odot$ denotes the Hadamard product.

		In this formulation, the reservoir dynamics remain linear, and all nonlinearity is introduced at the output stage through the feature construction. This approach preserves the simplicity of linear training while retaining the expressive power of classical Reservoir Computing.
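The nonlinear output feature vector is simple to build in code. The sketch below, with a hypothetical helper name, concatenates a reservoir state with its element-wise square.

```python
import numpy as np


def quadratic_readout_features(r):
    """Return the state r followed by its element-wise (Hadamard) square."""
    r = np.asarray(r, dtype=float)
    return np.concatenate([r, r * r])


features = quadratic_readout_features([1.0, -2.0, 0.5])
print(features.tolist())  # [1.0, -2.0, 0.5, 1.0, 4.0, 0.25]
```

An $N$-dimensional state therefore yields $2N$ output features, and the readout weights can still be found by ridge regression because the model remains linear in these features.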

README.md

+58 −93

		## Next-Generation Reservoir Computing


		This project studies Next-Generation Reservoir Computing (NGRC) as a data-driven approach for modeling and predicting nonlinear dynamical systems. The focus is on conceptual understanding, mathematical intuition, and a machine-learning-oriented perspective, supported by simple numerical experiments.

		### [Chapter 1. Introduction and Motivation](background.md)
		- Dynamical systems and time series in science and engineering
		- Observations versus underlying system dynamics
		- Why modeling temporal dynamics is challenging
		- Limitations of standard machine learning models for time series
		- Motivation for data-driven and equation-free modeling
		- Positioning of reservoir-based methods
		- Scope, objectives, and structure of the project


		### [Chapter 2. Conceptual Background: Reservoir Computing and NGRC](RC.md)
		- Intuitive idea of reservoir computing
		- Separation of dynamical transformation and learning
		- Role of memory in time series modelling
		- Role of nonlinearity in representing dynamics
		- Limitations of classical reservoir computing
		- Conceptual motivation for Next Generation Reservoir Computing


		### Chapter 3. Representing Dynamical Systems from Data
		- Observed measurements versus underlying system state
		- State space view of dynamical systems
		- Time delay embedding as a representation of memory
		- Intuition behind state reconstruction from time-series
		- Implications for data-driven modeling

		### Chapter 4. Learning and Stability
		- Linear regression as readout learning
		- Role of regularization in controlling complexity
		- Numerical stability and conditioning
		- Interpretability of the learned model

		### Chapter 5. NGRC Model Construction
		- Explicit construction of system state from delayed inputs
		- Nonlinear feature expansion as state enrichment
		- Choice of feature families and dimensionality
		- High-level structure of the NGRC model

		### Chapter 6. Dynamical Prediction and Experiments
		- One-step versus iterative prediction
		- Short-term predictability of nonlinear systems
		- Error growth and divergence in chaotic dynamics
		- Qualitative comparison with simple baseline models

		### Chapter 7. Discussion and Conclusion
		- Summary of main results and observations
		- Strengths and limitations of NGRC
		- Conceptual insights from a modeling perspective
		- Directions for future work

		### [Resource](resource.md)
		### Appendix
		- Implementation details

background.md

0 → 100644

+169 −0

File added.

Preview size limit exceeded, changes collapsed.