AOQL
AOQL is the language developed and used by Kevel Audience to query event databases. Its main objective is to provide a set of operations common to all data storage backends in use, such as relational databases and Elasticsearch.
The language is a subset of SQL and its semantics try to mimic the latter's behavior on a standard relational database. It is intended to be suitable both for "table scanning" queries (e.g. selections, filters) and aggregation queries (with GROUP BY). It most notably differs from SQL due to the absence of subqueries – AOQL intends to be targeted at data retrieval with a low server-side computational cost.
Using AOQL
AOQL can be used through gRPC operations exposed by the Event Storage module. The endpoint in which each service resides is client-dependent.
AOQL Relations
Through the Event Storage module, Kevel Audience provides querying access to three AOQL relations. Each relation exposes an event dataset, whose schema is not necessarily the same as the real schema used by their backend. The representation of a full record ( SELECT * ) may be different from the representation of individual projections, both for efficiency and convenience purposes.
Match Events
The relation providing access to match events is named MatchEvents
and can be accessed using
the matchQuerySource
gRPC operation. Its star (full record) projection is a MatchEvent
.
Columns
- requestId (String) : Optional. the identifier of the request that triggered the match event
- providerId (String) : the identifier of the match provider;
- userIds (Set[UserId]) : the set of matched user IDs;
- hasAttributes (Boolean) : whether the user has attributes after the match operation;
- timestamp (Long) : the timestamp of the match, in milliseconds.
Examples
Return a stream of distinct providers of matches in January 2018:
SELECT DISTINCT providerId FROM MatchEvents WHERE timestamp >= 1514764800000 AND timestamp < 1517443200000
Return an all-time count of events per match provider:
SELECT providerId, COUNT(*) FROM MatchEvents GROUP BY providerId
Activation Events
The relation providing access to activation events is named ActivationEvents
and can be accessed
using the activationQuerySource
gRPC operation. Its star (full record) projection is an ActivationEvent
.
Columns
- requestId (String) : Optional. the identifier of the request that triggered the activation event
- receiverId (String) : the identifier of the data receiver;
- userId (UserId) : the activated user ID;
- linkedUserIds (Set[UserId]) : the user IDs linked with the activated one at the time of the activation;
- activatedAttributes (UserData) : the activated attributes before the application of any mapping rule;
- timestamp (Long) : the timestamp of the activation, in milliseconds.
Examples
Return a stream of distinct user IDs of matches in January 2018:
SELECT DISTINCT userId FROM ActivationEvents WHERE timestamp >= 1514764800000 AND timestamp < 1517443200000
Return an all-time count of events per data receiver:
SELECT receiverId, COUNT(*) FROM ActivationEvents GROUP BY receiverId
CookieSync Events
The relation providing access to cookiesync events is named CookieSyncEvents
and can be accessed
using the cookieSyncQuerySource
gRPC operation. Its star (full record) projection is a CookieSyncEvent
.
Columns
- requestId (String) : Optional. The identifier of the request that triggered the Cookie Sync event;
- cookieSyncId (String) : the identifier of the cookie sync configuration;
- primaryUserId (String) : the primary user id used when syncing;
- syncedUserId (SyncedUserId) : the id to which the primary user id was synced to;
- link (Boolean) : whether the request signaled the intent to link the two user ids internally;
- timestamp (Long) : the timestamp of the cookie sync, in milliseconds.
Examples
Return a stream of distinct primary user ids used in syncing in January 2018:
SELECT DISTINCT primaryUserId FROM CookieSyncEvents WHERE timestamp >= 1514764800000 AND timestamp < 1517443200000
Return an all-time count of cookie syncs per cookie sync ID:
SELECT cookieSyncId, COUNT(*) FROM CookieSyncEvents GROUP BY cookieSyncId
Tracking Events
The relation providing access to tracking events is named TrackingEvents
and can be accessed
using the trackingQuerySource
gRPC operation. Its star (full record) projection is a JSON
object (the exact type depends on the programming language).
The JSON object contains all the information of an HTTP request, as well as metadata externally collected or extracted from the event itself. The "Columns" section below contains some of the most common fields.
Columns
- meta.timestamp (Long) : the timestamp of the event, in milliseconds;
- meta.user (UserId) : the tracked user ID;
- meta.type (String) : the type of the tracking event;
- meta.clientId (String) : the client ID specified in the request;
- meta.siteId (String) : the site ID specified in the request;
- meta.country (String) : the country where the request was made, if available;
- meta.cookie.sticky (Boolean) : whether the browser has cookies enabled or not, as detected by the server;
- meta.user–device.os–family (String) : the operating system provided in the User–Agentheader, as detected by the server;
- uri.query.<name> (String) : the value of the query param
name
, if available; - headers.<name> (String) : the value of the HTTP header
name
, if available.
Examples
Return a stream of distinct tracked user IDs in January 2018:
SELECT DISTINCT meta.user FROM TrackingEvents WHERE meta.timestamp >= 1514764800000 AND meta.timestamp < 1517443200000
Return an all-time count of tracking events per client ID:
SELECT meta.clientId, COUNT(*) FROM TrackingEvents GROUP BY meta.clientId
Error Events
The relation providing access to error events is named ErrorEvents
and can be accessed using
the errorQuerySource
gRPC operation. Its star (full record) projection is an ErrorEvent
.
Columns
- entityId (String) : the id of the entity that caused the error;
- entityType (String) : the type of the entity that caused the error;
- errorType (ErrorType) : type of error;
- payload (Json) : complementary data for the error, in JSON;
- timestamp (Long) : the timestamp of the error occurrence, in milliseconds.
Examples
Return a stream of distinct type of errors in January 2018:
SELECT DISTINCT errorType FROM ErrorEvents WHERE timestamp >= 1514764800000 AND timestamp < 1517443200000
Return an all-time count of error events per type of entity that caused the error:
SELECT entityType, COUNT(*) FROM ErrorEvents GROUP BY entityType
Extended* Relations
AOQL provides an extra Extended* relation for each of the relations described above:
- ExtendedMatchEvents;
- ExtendedActivationEvents;
- ExtendedCookieSyncEvents.
- ExtendedTrackingEvents.
- ExtendedErrorEvents.
These extended relations augment the original relations with an internal ID. The star projection becomes:
- id
- event
where event
encapsulates all the original event columns. Thus, when using these relations, one
should reference the event columns by correctly namespacing them. For example:
SELECT receiverId, COUNT(*) FROM ActivationEvents GROUP BY receiverId
should become:
SELECT event.receiverId, COUNT(*) FROM ExtendedActivationEvents GROUP BY event.receiverId
when using the ActivationEvents
extended version.
These extra relations allows one to inspect the internal event IDs. These IDs can be used on subsequent queries to match against a specific event. For example:
Return the latest match:
SELECT * FROM ExtendedMatchEvents ORDER BY event.timestamp DESC limit 1
Use a given ID to match against a specific match event:
SELECT * FROM ExtendedMatchEvents WHERE id='2030'