1.3.5. Long term data storage enabler
1.3.5.1. Introduction
The role of the Long-Term Storage Enabler (LTSE) is to serve as secure and resilient storage, offering different storage sizes and individual storage space to other enablers (which can request it when they are initialised in Kubernetes pods). It is therefore one of the ASSIST-IoT enablers envisioned to be deployed in the cloud rather than at the edge. The next figure depicts a high-level overview of the LTSE components.
As can be seen, it consists of three main components:
LTSE Gateway: The entry point to the LTSE. It acts as a proxy for ASSIST-IoT enablers and external services whose data should be stored either in SQL server databases or in noSQL cluster nodes. To do so, the LTSE Gateway exposes a REST API with separate SQL and noSQL endpoints. Furthermore, the LTSE Gateway guarantees that the data is kept safe in the face of unauthorised access requests or hardware failures, by only allowing access to the data once the Identity Manager and the Authorisation enablers have confirmed the requester's access rights.
LTSE noSQL cluster: A group of one or more connected LTSE noSQL node instances that distributes tasks, searching and indexing across all the noSQL nodes. By default, every noSQL node in the cluster handles both HTTP and transport traffic. The transport layer is used exclusively for communication between nodes; the HTTP layer is used by REST clients, which is how external enablers reach the nodes through the LTSE Gateway. The full hierarchy is therefore noSQL_Cluster > noSQL_Node > noSQL_Index > noSQL_document. For High Availability (HA), the noSQL_document entries of a noSQL_Index may be distributed across multiple shards, which in turn are distributed across multiple nodes, if configured.
LTSE SQL server: It manages the SQL databases, which hold the data tables of the different enablers, and therefore performs database backup actions on behalf of the enablers. The SQL_Server can handle multiple concurrent connections from external enablers via the LTSE Gateway. In general, the full hierarchy is: SQL_Cluster > SQL_Database > SQL_schema > SQL_table > SQL_row. For High Availability, a master database with one or more standby servers can be set up.
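The noSQL hierarchy described above (documents spread over shards, shards spread over nodes) can be illustrated with a small sketch. This is a simplified model only: the CRC32 hash and the round-robin shard placement are stand-ins chosen for illustration, not the routing that Elasticsearch actually uses, and the function names are hypothetical.

```python
import zlib


def shard_for(doc_id, num_shards):
    """Map a document id to a shard number in [0, num_shards)."""
    # Stand-in hash (CRC32); Elasticsearch itself uses a different hash function.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def node_for(shard, nodes):
    """Assign a shard to a node using simplified round-robin placement."""
    return nodes[shard % len(nodes)]


def place(doc_ids, num_shards, nodes):
    """Return {doc_id: (shard, node)} for every document id."""
    placement = {}
    for doc_id in doc_ids:
        shard = shard_for(doc_id, num_shards)
        placement[doc_id] = (shard, node_for(shard, nodes))
    return placement


# Example: three documents, three shards, two nodes (all names are made up).
for doc_id, (shard, node) in place(["a", "b", "c"], 3, ["node-0", "node-1"]).items():
    print(doc_id, "-> shard", shard, "on", node)
```

The point of the sketch is that losing one node only affects the shards placed on it, which is why distributing shards across several nodes supports High Availability.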
1.3.5.2. Features
The LTSE is packaged as a Helm chart made up of several subcharts. The following image illustrates the frameworks/charts used to instantiate the LTSE components.
As can be seen, the main structure is the following:
SQL server -> PostgreSQL 14.2.0
NoSQL cluster -> Elasticsearch 7.16.3
LTSE gateway -> Custom HTTP web framework (Gin) written in Go
1.3.5.3. Place in architecture
The Long-Term Storage Enabler is part of the Data Management Plane of ASSIST-IoT. The Data Management Plane encompasses any process in which data is processed to deliver features concerning data interoperability, annotation, security, acquisition, provenance, aggregation, fusion, etc. This enabler serves as the secure and resilient storage of any ASSIST-IoT deployment.
1.3.5.4. User guide
The entry point to the LTSE is the LTSE Gateway. The currently supported REST API endpoints are listed below:
1.3.5.4.1. REST API endpoints
| Method | Endpoint | Description | Payload (if needed) | Response format |
|---|---|---|---|---|
| POST | /sql/databases | Creates a databasename in the LTSE SQL cluster | databasename | |
| POST | /sql/databases/:databasename/tables/ | Creates a tablename in the databasename of the LTSE SQL server | tablename | |
| POST | /sql/databases/:databasename/tables/:tablename/data | Inserts data into the tablename on the databasename of the LTSE SQL server | data | |
| GET | /sql/databases/:databasename/tables/ | Obtains all the data contained within the tablename of the databasename of the LTSE SQL server | tablename | |
| PUT | /nosql/index/ | Creates a new index indexname in the LTSE noSQL cluster. When creating an index, you can specify the settings for the index, mappings for fields in the index, and index aliases | indexname | |
| GET | /nosql/index/ | Returns information about the indexname index from the LTSE noSQL cluster | indexname | |
| PUT | /nosql/index/<indexname>/document/ | Adds a JSON document to the specified <indexname> index of the LTSE noSQL cluster and makes it searchable with an <_id> | _id | |
| GET | /nosql/index/<indexname>/_doc/ | Retrieves the specified JSON document <_id> from the indexname of the LTSE noSQL cluster | _id | |
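As an illustration of how a client might address the SQL endpoints listed above, the following is a minimal sketch that only builds the HTTP requests without sending them. The gateway address, the helper names, and the choice of a JSON body keyed by the payload name are assumptions; the table does not specify the payload encoding or the response format.

```python
import json
from urllib.request import Request

# Hypothetical gateway address; replace with the address of your deployment.
BASE_URL = "http://ltse-gateway.example:8080"


def build_request(method, path, payload=None):
    """Build (but do not send) an HTTP request for an LTSE Gateway endpoint."""
    # Assumption: payloads travel as JSON bodies; the source does not say so.
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


def create_database(databasename):
    """POST /sql/databases"""
    return build_request("POST", "/sql/databases", {"databasename": databasename})


def create_table(databasename, tablename):
    """POST /sql/databases/:databasename/tables/"""
    path = f"/sql/databases/{databasename}/tables/"
    return build_request("POST", path, {"tablename": tablename})


def insert_data(databasename, tablename, data):
    """POST /sql/databases/:databasename/tables/:tablename/data"""
    path = f"/sql/databases/{databasename}/tables/{tablename}/data"
    return build_request("POST", path, {"data": data})


# Example with made-up database and table names.
req = create_table("sensors", "readings")
print(req.get_method(), req.full_url)
```

A request built this way could be sent with `urllib.request.urlopen(req)` once the gateway address and payload format have been confirmed for the actual deployment.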
1.3.5.5. Prerequisites
Kubernetes 1.19+
Helm 3.2.0+
PV provisioner support in the underlying infrastructure
Minimum cluster requirements include the following to run this chart with default settings. All of these settings are configurable:
Three Kubernetes nodes to respect the default “hard” affinity settings
1GB of RAM for the JVM heap
1.3.5.6. Installation
1.3.5.6.1. Installing the chart
The enabler is provided as a Helm chart. To install the chart with the release name my-ltse:
helm install my-ltse ltse
The command deploys the LTSE components (including PostgreSQL) on the Kubernetes cluster in the default configuration. The Parameters section lists the parameters that can be configured during installation.
Alternatively, a YAML file that specifies the values for the parameters can be provided while installing the chart. For example:
helm install my-ltse -f values.yaml ltse
Tip: List all releases using
helm list
1.3.5.6.2. Verification / Visualization
To better understand the LTSE deployment, you can install a separate client for each of the two storage services of the LTSE, as well as an API client. The following applications are proposed:
Kibana: It provides visualization capabilities on top of the content indexed on an Elasticsearch cluster (i.e., LTSE noSQL cluster). Kibana Helm Chart
pgAdmin: The most popular and feature rich Open Source administration and development platform for PostgreSQL (i.e., LTSE SQL server). Helm Chart for pgAdmin
Postman: an API platform for testing APIs (i.e., LTSE gateway). Postman API client
1.3.5.6.3. Uninstalling the Chart
To uninstall/delete the my-ltse deployment:
helm delete my-ltse
The command removes all the Kubernetes components associated with the chart, except the PVCs, and deletes the release.
To delete the PVCs associated with my-ltse:
kubectl delete pvc -l release=my-ltse
Note: Deleting the PVCs will delete the PostgreSQL data as well. Please be cautious before doing so.
1.3.5.7. Configuration options
The chart deploys a StatefulSet and by default will do an automated rolling update of your cluster. It does this by waiting for the cluster health to become green after each instance is updated.
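The rolling update behaviour described above can be sketched as a simple loop: update one instance, wait until the cluster reports green health, then move on to the next. This is an illustrative model only, not the chart's or Kubernetes' actual implementation; `update` and `get_health` are hypothetical callables supplied by the caller.

```python
import time


def rolling_update(instances, update, get_health, poll_seconds=0.0, max_polls=60):
    """Update instances one at a time, waiting for green health after each."""
    updated = []
    for instance in instances:
        update(instance)
        # Poll the cluster health until it is green or the poll budget runs out.
        for _ in range(max_polls):
            if get_health() == "green":
                break
            time.sleep(poll_seconds)
        else:
            raise TimeoutError(f"cluster not green after updating {instance}")
        updated.append(instance)
    return updated


# Example with a simulated health sequence (instance names are made up).
statuses = iter(["yellow", "green", "green", "red", "green"])
done = rolling_update(["es-0", "es-1", "es-2"], lambda i: None, lambda: next(statuses))
print(done)
```

Waiting for green between instances means that at most one instance is restarting at any time, so the remaining instances can keep serving requests.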
It is important to verify the JVM heap size in esJavaOpts and to set the CPU/memory resources to something suitable for your cluster.
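One way to keep the heap size and the container memory consistent is to derive both from a single number. The sketch below assumes the common Elasticsearch guidance of setting the minimum and maximum heap to the same value and giving the heap no more than half of the container memory. The helper names are hypothetical, and the exact position of these keys inside the LTSE values.yaml is an assumption that should be checked against the chart.

```python
def es_java_opts(container_memory_mib):
    """Return JVM options that give the heap half of the container memory."""
    heap_mib = container_memory_mib // 2
    # Same value for -Xms and -Xmx so the heap is not resized at runtime.
    return f"-Xms{heap_mib}m -Xmx{heap_mib}m"


def elasticsearch_values(container_memory_mib, cpu="1000m"):
    """Build a values fragment with matching heap and resource settings."""
    memory = f"{container_memory_mib}Mi"
    return {
        "esJavaOpts": es_java_opts(container_memory_mib),
        "resources": {
            "requests": {"cpu": cpu, "memory": memory},
            "limits": {"cpu": cpu, "memory": memory},
        },
    }


# Example: a 2 GiB container yields the 1 GB heap listed in the prerequisites.
print(elasticsearch_values(2048))
```

The resulting dictionary can be written out as YAML and merged into the values file passed to `helm install -f`.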
1.3.5.8. Developer guide
The following options are supported for each subchart. See values.yaml for more detailed documentation and examples:
1.3.5.8.1. noSQL cluster (Elasticsearch helm chart configuration)
1.3.5.8.2. SQL server (PostgreSQL helm chart configuration)
1.3.5.8.2.1. Global parameters
| Name | Description | Value |
|---|---|---|
| | Global Docker image registry | "" |
| | Global Docker registry secret names as an array | [] |
| | Global StorageClass for Persistent Volume(s) | "" |
| | Password for the "postgres" admin user (overrides auth.postgresPassword) | "" |
| | Name for a custom user to create (overrides auth.username) | "" |
| | Password for the custom user to create (overrides auth.password) | "" |
| | Name for a custom database to create (overrides auth.database) | "" |
| | Name of existing secret to use for PostgreSQL credentials (overrides auth.existingSecret) | "" |
| | PostgreSQL service port (overrides service.ports.postgresql) | "" |
1.3.5.8.2.2. Common parameters
| Name | Description | Value |
|---|---|---|
| | Override Kubernetes version | "" |
| | String to partially override common.names.fullname template (will maintain the release name) | "" |
| | String to fully override common.names.fullname template | "" |
| | Kubernetes Cluster Domain | cluster.local |
| | Array of extra objects to deploy with the release (evaluated as a template) | [] |
| | Add labels to all the deployed resources | {} |
| | Add annotations to all the deployed resources | {} |
| | Enable diagnostic mode (all probes will be disabled and the command will be overridden) | false |
| | Command to override all containers in the statefulset | ["sleep"] |
| | Args to override all containers in the statefulset | ["infinity"] |
1.3.5.8.3. SQL API (PostgREST helm chart configuration)
| Parameter | Description | Default |
|---|---|---|
| | Number of replicas | 1 |
| | The image to run | postgrest/postgrest |
| | The image tag to pull | v5.2.0 |
| | IfNotPresent | |
| | n/a | |
| | n/a | |
| | Type of Service | ClusterIP |
| | Port for Kubernetes service | 80 |
| | Annotations to add to the service | {} |
| | PostgREST Ingress host names | [] |
| | PostgREST Ingress TLS configuration (YAML) | [] |
| | CPU/Memory resource requests/limits | {} |
| | Settings for nodeselector | {} |
| | Settings for toleration | [] |
| | Settings for affinity | {} |
| | PostgreSQL connection | n/a |
| | Database schema to expose | public |
| | Number of connections to keep open | 100 |
| | Where to bind the PostgREST web server | *4 |
| | The port to bind the web server | 3000 |
| | Overrides the base URL | n/a |
| | The secret or JSON Web Key (JWK) used to decode JWT tokens | n/a |
| | false | |
| | n/a | |
| | A hard limit to the number of rows PostgREST will fetch | n/a |
| | A schema-qualified stored procedure | n/a |
| | .role | |
1.3.5.9. Version control and release
Version 1.0. Currently, the LTSE does not communicate with the Cybersecurity enablers. In addition, the API can only create databases and tables; the rest of the API functionality is not yet supported.
Improvements and new functionalities will be added in future versions.
1.3.5.10. License
Will be determined after the release of the enabler.
1.3.5.11. Notice (dependencies)
Will be determined after the release of the enabler.