GTS Handling and Storage
In this tutorial, you will learn how to retrieve, create, modify, and store GTS. You will understand the link between the token system and GTS storage. It is a key milestone in your learning path.
For this tutorial, you need your own running instance of Warp 10 (see getting started). You should also have read the basic concepts and the token mechanism pages.
GTS internals
The JSON representation of a GTS object is close to the image above. The GTS contains:
- "c": the classname, a string
- "l": the labels, a MAP of string/string key/value
- "a": the attributes, a MAP of string/string key/value
- "v": a table of values.
The table of values can have two to five columns:
- [ 0 3.14 ]: timestamp, value
- [ 0 12 3.14 ]: timestamp, elevation, value
- [ 0 -45.0 0.2 3.14 ]: timestamp, latitude, longitude, value
- [ 0 48.44218 -4.41427 80000 3.14 ]: timestamp, latitude, longitude, elevation, value
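The four row shapes listed above can be told apart by their length alone. Here is a minimal Python sketch of that rule; `unpack_point` is a hypothetical helper written for this tutorial, not part of any Warp 10 client library.

```python
def unpack_point(row):
    """Name the columns of one row of the "v" table (two to five columns)."""
    if not 2 <= len(row) <= 5:
        raise ValueError("a row of values has two to five columns")
    # The timestamp is always first and the value is always last.
    point = {"timestamp": row[0], "latitude": None, "longitude": None,
             "elevation": None, "value": row[-1]}
    if len(row) == 3:
        # timestamp, elevation, value
        point["elevation"] = row[1]
    elif len(row) == 4:
        # timestamp, latitude, longitude, value
        point["latitude"], point["longitude"] = row[1], row[2]
    elif len(row) == 5:
        # timestamp, latitude, longitude, elevation, value
        point["latitude"], point["longitude"], point["elevation"] = row[1:4]
    return point
```

For example, `unpack_point([0, 12, 3.14])` reads 12 as the elevation, while `unpack_point([0, -45.0, 0.2, 3.14])` reads -45.0 and 0.2 as the latitude and longitude.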
The timestamp is a signed LONG. The default platform configuration uses microseconds. With this configuration, you cover from year -290308 to year 294247. Depending on your domain, you may need nanosecond precision, which restricts the range to years 1677 through 2262. If you plan to use archaeological data, you can also change the platform settings to millisecond timestamps.
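These ranges follow directly from the size of a signed 64-bit integer counted from the 1970 epoch. The Python sketch below redoes the arithmetic, assuming an average Gregorian year of 365.2425 days; `year_range` is a name chosen for this illustration.

```python
import math

SECONDS_PER_YEAR = 365.2425 * 86400   # average Gregorian year (assumption)
MAX_LONG = 2**63 - 1                  # largest signed 64-bit value

def year_range(units_per_second):
    """Return the (first, last) calendar year reachable around the 1970 epoch."""
    span_years = MAX_LONG / units_per_second / SECONDS_PER_YEAR
    return math.floor(1970 - span_years), math.floor(1970 + span_years)

microseconds = year_range(10**6)   # default platform configuration
nanoseconds = year_range(10**9)
milliseconds = year_range(10**3)
```

With microseconds the span is about 292,277 years on each side of 1970, with nanoseconds about 292 years, and with milliseconds about 292 million years.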
Latitude and longitude are double precision inputs, but they are stored internally as a highly optimized HHCode. This format is tailor-made for latitude and longitude. Do not try to store other values here.
Elevation is a long, in millimeters. You can store any other long value inside this field.
The value can be one of the four allowed types: BOOLEAN, LONG, DOUBLE, STRING.
In a GTS, you can put any of the four allowed types and even mix them, but this is not advised. The type of the GTS is determined at read time. The type of the first read value will set the GTS type.
When you use the text input format and plan to store a DOUBLE, remember to add .0 to tell Warp 10 explicitly that you want a DOUBLE. This error is common when you use the API update endpoint. Here is a WarpScript which takes the same input format and parses it to build a GTS:
You see that the values are rounded to the nearest integer.
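If you generate the text input format from a client program, the safest approach is to format each value according to its type, so that a DOUBLE never loses its decimal point. The Python sketch below is one way to do it. `format_value` is a hypothetical helper, and it assumes the usual input format conventions: T or F for a BOOLEAN, and a single-quoted, percent-encoded STRING. Check these against your Warp 10 version.

```python
from urllib.parse import quote

def format_value(value):
    """Render one value for the text input format (sketch, see assumptions)."""
    if isinstance(value, bool):          # test bool first: bool is an int in Python
        return "T" if value else "F"
    if isinstance(value, int):
        return str(value)                # no decimal point: stored as a LONG
    if isinstance(value, float):
        text = repr(value)               # repr(3.0) is "3.0", not "3"
        if "." not in text or "e" in text:
            # nan, inf and exponent notation are out of scope for this sketch
            raise ValueError("unsupported float: " + text)
        return text
    if isinstance(value, str):
        return "'" + quote(value, safe="") + "'"
    raise TypeError("allowed types are BOOLEAN, LONG, DOUBLE, STRING")
```

The important case is `format_value(3.0)`, which yields `3.0` and not `3`, so the first value of a new GTS is read as a DOUBLE.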
RAM versus Storage
The storage part of Warp 10 is designed to store GTS. When you store a GTS, you cannot have different values for the same timestamp. When you manipulate a GTS in RAM (on the stack), the same timestamp can appear several times. When you store such a GTS with the UPDATE function, the storage layer removes duplicate timestamps, keeping the value of the last one.
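The "last value wins" rule can be pictured with a few lines of Python. This is only a model of the behavior described above, not the storage engine's code; `deduplicate_last_wins` is a name made up for the illustration.

```python
def deduplicate_last_wins(points):
    """Keep one value per timestamp; a later point replaces an earlier one."""
    stored = {}
    for timestamp, value in points:
        stored[timestamp] = value        # overwrites any previous value
    return sorted(stored.items())

# Timestamp 0 appears twice on the stack; only its last value survives.
in_memory = [(0, 1), (10, 2), (0, 3)]
on_disk = deduplicate_last_wins(in_memory)
```

Here `on_disk` holds two points, and the point at timestamp 0 carries the value 3.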
The examples below cannot be executed here; you need to practice with your own instance and your own tokens.
Hidden property RENAMED
As explained in another tutorial, erasing your own data with UPDATE is easy. For example, you FETCH some data, apply a mean mapper, then UPDATE the result: oops, you have just erased your original data. Warp 10 has a security mechanism to prevent this: you cannot UPDATE a GTS unless you rename it first. A GTS object has an 'I was renamed' security flag. You cannot see it in the stack JSON representation.
Hidden property BUCKETIZED
When you fetch a GTS, or create it in your WarpScript, it is a raw GTS. It may have a regular period, possibly with missing points, and its timestamps may or may not be aligned on a clock (a 10.005 ms period, for example).
If you want to MAP or REDUCE, you need aligned series that share common ticks. To realign GTS, use BUCKETIZE. A bucketized Geo Time Series has at most one measurement per bucket, and some buckets may have no measurement. As soon as the GTS is bucketized, you can use specific functions to fill empty buckets, detect outliers, or detect patterns. A GTS object has an 'I am bucketized with this bucket span and this bucket end' flag. You cannot see it in the stack JSON representation.
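To make the idea of buckets concrete, here is a rough Python model of bucketizing with a mean. It assumes that each bucket is identified by its end tick and covers the ticks from end - span (excluded) to end (included). `bucketize_mean` and its parameter names are illustrative and are not the WarpScript signature.

```python
from collections import defaultdict

def bucketize_mean(points, last_bucket, bucket_span, bucket_count):
    """Group (timestamp, value) points into aligned buckets and average them."""
    first_start = last_bucket - bucket_span * bucket_count
    groups = defaultdict(list)
    for timestamp, value in points:
        if not first_start < timestamp <= last_bucket:
            continue                     # outside the requested buckets
        # Number of whole spans between this tick and the end of the last bucket.
        offset = (last_bucket - timestamp) // bucket_span
        groups[last_bucket - offset * bucket_span].append(value)
    # At most one value per bucket; empty buckets are simply absent.
    return {end: sum(vals) / len(vals) for end, vals in sorted(groups.items())}

raw = [(1, 1.0), (5, 2.0), (10, 3.0), (11, 4.0), (25, 9.0)]
aligned = bucketize_mean(raw, last_bucket=20, bucket_span=10, bucket_count=2)
```

The three points at ticks 1, 5, and 10 collapse into the bucket ending at 10, the point at tick 11 lands in the bucket ending at 20, and the point at tick 25 is ignored because it lies after the last bucket.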
Hidden attributes
A stored GTS has three labels linked to the storage security model:
- .app: the application name. You can see it in the stack JSON representation. Even if you try to override it manually, it is enforced by the app name contained in your write token.
- .owner: the owner UUID. You cannot see it; it is enforced by your write token.
- .producer: the producer UUID. You cannot see it; it is enforced by your write token.
When you create a GTS on the stack with WarpScript, these labels are not defined. As you will see below, write tokens may add other labels, which will be visible in the stack JSON representation. The following chapter explains in detail how tokens interact with GTS storage.
GTS Storage
Write process
When you push a GTS to the update endpoint, or with the UPDATE function, what exactly does the Warp 10 storage engine do?
- Decrypt the token content. If the token is not encrypted with the platform key (warp.aes.token in your configuration) or if it is a read token, it stops there and raises an invalid token error.
- In the write token, Warp 10 extracts the expiry timestamp. If the token is expired, it stops there with an error.
- In the write token, Warp 10 extracts the .app, .owner, and .producer labels, and adds or overwrites them in the GTS object.
- In the write token, Warp 10 extracts the labels MAP, which can contain other labels you may have defined when you created the token, and adds or overwrites them in the GTS object.
Once the GTS object is complete, the Warp 10 storage engine computes a hash of the classname and all the label names and values. If this hash is unknown in the system, it creates a new GTS. If it exists, it updates the values. Then it updates the internal Sensision metrics.
Token content directly contributes to the uniqueness of a GTS in the Warp 10 storage system.
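The consequence of the steps above is that the same classname and labels, pushed with two different write tokens, end up as two different GTS. The Python model below illustrates this. It is a sketch only: the token is a plain dict with hypothetical keys, and SHA-256 stands in for whatever hash Warp 10 really uses.

```python
import hashlib

def apply_write_token(gts, token):
    """Return a copy of the GTS with the labels enforced by the write token."""
    labels = dict(gts["l"])
    labels.update({".app": token["app"], ".owner": token["owner"],
                   ".producer": token["producer"]})
    labels.update(token.get("labels", {}))   # extra labels defined in the token
    return {"c": gts["c"], "l": labels, "a": dict(gts.get("a", {}))}

def gts_identity(gts):
    """Illustrative identity: classname plus sorted label names and values."""
    parts = [gts["c"]] + [f"{k}={v}" for k, v in sorted(gts["l"].items())]
    return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()

gts = {"c": "temperature", "l": {"sensor": "42"}, "a": {"room": "lab"}}
alice = {"app": "demo", "owner": "uuid-alice", "producer": "uuid-alice"}
bob = {"app": "demo", "owner": "uuid-bob", "producer": "uuid-bob"}

id_alice = gts_identity(apply_write_token(gts, alice))
id_bob = gts_identity(apply_write_token(gts, bob))
```

`id_alice` and `id_bob` differ although the script pushed the same classname and labels. Attributes are deliberately left out of `gts_identity`, so changing them does not create a new GTS.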
Delete process
When you want to delete a GTS, or an interval of time in a GTS, you also need a write token. What exactly does the Warp 10 storage engine do?
- Decrypt the token content. If the token is not encrypted with the platform key (warp.aes.token in your configuration) or if it is a read token, it stops there and raises an invalid token error.
- In the write token, Warp 10 extracts the expiry timestamp. If the token is expired, it stops there with an error.
- In the write token, Warp 10 extracts the .app, .owner, and .producer labels. If the owner UUID is different from the producer UUID, it stops there and raises an error.
- In the write token, Warp 10 extracts the labels MAP, which can contain other labels you may have defined when you created the token.
- In the write token, Warp 10 extracts the attributes MAP. If it contains the special key .nodelete, the delete operation is canceled even if the owner is the same as the producer.
The input selector is completed with .app, .owner, .producer, and labels. Then it performs a dry delete. If the number of GTS found does not exceed the expected count given as input, it performs the delete. If it is a full delete, the entry is removed from the directory component of Warp 10. Then it updates the internal Sensision metrics.
Meta write process
The meta endpoint is used to update the attributes. Attributes are stored in the directory component of Warp 10. You also need a write token to perform an attribute update. What exactly does the Warp 10 storage engine do?
- Decrypt the token content. If the token is not encrypted with the platform key (warp.aes.token in your configuration) or if it is a read token, it stops there and raises an invalid token error.
- In the write token, Warp 10 extracts the expiry timestamp. If the token is expired, it stops there with an error.
- In the write token, Warp 10 extracts the .app, .owner, and .producer labels, and adds or overwrites them in the GTS object.
- In the write token, Warp 10 extracts the labels MAP, which can contain other labels you may have defined when you created the token, and adds or overwrites them in the GTS object.
- The attributes contained in the write token have no link with GTS attributes. Token attributes are just a way to add more information to a token.
Once the GTS object is complete, the Warp 10 storage engine computes a hash of the classname and all the label names and values. If this hash is unknown in the system, it does nothing. If it exists, it updates the attribute values.
Read process
The fetch endpoint is used to read GTS from the Warp 10 storage engine. What exactly does the Warp 10 storage engine do?
- Decrypt the token content. If the token is not encrypted with the platform key (warp.aes.token in your configuration) or if it is a write token, it stops there and raises an invalid token error.
- In the read token, Warp 10 extracts the expiry timestamp. If the token is expired, it stops there with an error.
- In the read token, Warp 10 extracts the .app and .owner labels. This application and owner are not used for access control, but for the billing system in the internal Sensision metrics.
- In the read token, Warp 10 extracts the labels, owners, producers, and applications lists, if they exist. If these lists do not exist, the read token is considered a wildcard token.
The Warp 10 storage engine computes a hash of the classnames and all the label names and values, and asks the directory component for their locations. Then it performs the fetch and returns a list of GTS. Then it updates the internal Sensision metrics for the .owner of the token.
Tokens can finely restrict what you are allowed to read. Read tokens are designed to set up a pay-as-you-fetch system.
GTS storage from WarpScript
UPDATE
The following WarpScript builds 50 GTS and stores them in your local Warp 10 instance:
The classname is the same (same sensors) and the unit is the same, but the sensor ID defines a unique GTS. As timestamps are fixed from 0 to 1000, running the same script again just overwrites the data in the storage.
Basic FETCH
The FETCH instruction takes a LIST input.
[
token STRING
classname STRING
labels MAP
start date or end timestamp
end date or span timestamp or number of points
] FETCH
- The token must be a valid read token.
- The classname can be a regular expression, which lets you select several classnames based on a pattern or on alternations. To use a regular expression, start your string with ~. Here is a list of valid classname selectors:
'sensor1' //only sensor1. no need for a regular expression
'~.*' //all the possible classnames (dot means any character) (star means zero or more)
'~input.*' //all the classnames starting with input
'~.*temperature.*' // all the classnames which contains temperature
'~sensor[1-9]' //sensor1, sensor2, ... sensor8, sensor9
'~sensor(3|4|9)' //sensor3, sensor4 and sensor9
'~(temperature28|temperature29|sensor2|sensor4|sensor42)' //logical OR to select only these classnames
- labels is a MAP of labels. It can contain a mix of GTS attributes and labels; there is no difference in the syntax. It can also contain regular expressions. Here is a list of valid label selectors:
{} //every possible labels
{ 'unit' 'km/h' 'type' 'sedan' } //select only data in km/h of sedan type cars
{ 'vin' '~VF1BK5.*' } //select only vehicle numbers starting with VF1BK5
- The fourth object in the list can be a STRING or a NUMBER. See the examples below.
- The fifth object in the list can be a STRING or a NUMBER. See the examples below.
FETCH always returns a list of GTS. The list can be empty.
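If you want to check what a classname or label selector would pick up before running a FETCH, you can reproduce the matching rule offline. The Python sketch below assumes that a selector starting with ~ must match the whole string, which is consistent with the selector examples above. `matches` and `select` are illustrative helpers, and Python's `re` module only approximates the regular expression dialect of the platform.

```python
import re

def matches(selector, text):
    """Exact comparison, or whole-string regular expression after a leading ~."""
    if selector.startswith("~"):
        return re.fullmatch(selector[1:], text) is not None
    return selector == text

def select(class_selector, label_selectors, series):
    """Filter GTS dicts ("c", "l", "a" keys) the way a selector would."""
    kept = []
    for gts in series:
        # Labels and attributes share the same selector syntax.
        metadata = {**gts.get("a", {}), **gts.get("l", {})}
        if matches(class_selector, gts["c"]) and all(
                key in metadata and matches(sel, metadata[key])
                for key, sel in label_selectors.items()):
            kept.append(gts)
    return kept

cars = [
    {"c": "speed", "l": {"vin": "VF1BK5XYZ", "unit": "km/h"}, "a": {}},
    {"c": "speed", "l": {"vin": "WAUZZZ123", "unit": "km/h"}, "a": {}},
    {"c": "sensor10", "l": {}, "a": {}},
]
renault = select("speed", {"vin": "~VF1BK5.*"}, cars)
```

`renault` contains only the first series. Note that `matches("~sensor[1-9]", "sensor10")` is false under the whole-string assumption, because the pattern accepts a single digit.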
Display the number of datapoints in each FETCH request:
Advanced FETCH
FETCH can also take a MAP input. This MAP is more readable and gives you finer-grained choices than the LIST. Beware of how the time range and count parameters combine:
- end is mandatory.
- If timespan is defined, start and count are ignored.
- If timespan is not defined, count has priority over start.
FETCH always starts from end and walks back in time.
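The priority rules above are easy to get wrong when a MAP is built dynamically. The Python sketch below encodes them as stated, so you can check which parameter will actually drive a request. `resolve_fetch_window` is a hypothetical helper that models the rules only; it does not call Warp 10.

```python
def resolve_fetch_window(params):
    """Return (end, mode, value): which parameter bounds the walk back in time."""
    if "end" not in params:
        raise ValueError("'end' is mandatory")
    end = params["end"]
    if "timespan" in params:
        # timespan wins: start and count are ignored.
        return end, "timespan", params["timespan"]
    if "count" in params:
        # Without timespan, count has priority over start.
        return end, "count", params["count"]
    if "start" in params:
        return end, "start", params["start"]
    raise ValueError("provide one of 'timespan', 'count' or 'start'")

window = resolve_fetch_window({"end": 1000, "start": 0, "count": 5})
```

Here `window` reports that the request is bounded by `count`, and the `start` key is silently ignored.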
GTS attributes
To read only GTS attributes, the fastest way is to use the FIND function, which only queries the directory component.
You should have one GTS with an attribute named 'attr' from a previous example. Use SETATTRIBUTES to change, add, or erase attributes, then META to write the attributes to the Warp 10 storage.
If you run the previous example again, you should now have a new attribute.
Explore your database
The directory indexes the GTS in the Warp 10 storage. If you want to build an interactive GUI, use the FINDSETS function. For a given selection, it returns every possible attribute, label, and classname. When the end user selects a label, a classname, or an attribute, you can quickly update the other drop-down menu elements.
On a distributed version, FINDSTATS extracts cardinality statistics, which can be very useful when working with Warp 10 experts.
We also talk about GTS on the blog.
Key points
- A GTS is unique in the storage if its classname and label values are unique.
- Token content adds some labels.
- For your security, you cannot UPDATE a GTS if you did not rename it after a FETCH.
- Attributes are sticky notes you can put on or remove from a GTS. Attributes do not affect GTS uniqueness.
- In a FETCH or a FIND, you can also filter by attributes.