Skip to content

Catalogs

Marcus Henriksson edited this page Sep 24, 2024 · 4 revisions

Contents

System

Catalog that add support for various system tables

  • catalogs (List registered catalogs in session)
  • functions (List built in functions)
  • tables (List temp tables in session)
  • caches (List sessions caches)
  • cachekeys (List sessions cache keys)

This table also redirects table queries to other registered catalogs (if they implement it). For example:

-- List various system tables for catalog registed with alias 'es'

select *
from sys#es.tables

select *
from sys#es.columns

select *
from sys#es.indices

These table queries can be implemented by catalogs via Catalog#getSystemTableDataSource The system catalog is automatically registered in the session with alias sys

Filesystem

Simple catalog that provides file system access. Does not provide any tables.

Catalog Properties

N/A

Table Qualified Name

N/A

Functions

Name Type Description Arguments Example
list Table Lists files in provided path path (String) [, recursive (Boolean)] select *
from fs#list('/tmp', true)
file Table Opens a single file with provided path path (String) select *
from table t
cross apply fs#file(t.file) f
contents Scalar Returns contents of provided path as a String.
NOTE! This function returns a java.io.InputStream as an Any value to be passed to functions that can read those. Ie. openjson/csv/xml functions. The wrapped value fallbaks to reading the file into a String value.
path (String) select fs#contents('/tmp/file').

Elasticsearch

Catalog that provides access to elasticsearch.

Catalog Properties

Name Description
endpoint (string) Endpoint to elasticsearch
index (string) Index in elasticsearch
type (string) Type in index
cache.mappings.ttl (integer) Cache TTL for mappings. To avoid excessive queries against ES for mappings etc. that is needed to properly utilize indices, predicate push downs, order by's etc. that information is put to PLB's GENERIC cache. Defaults to 60 minutes.

Table Qualified Name

Parts Description Example
1 Type
Expected catalog properties: endpoint, index
use es.endpont = 'http://localhost:9200'
use.es.index = 'myIndex'
select * from type
2 Index, Type
Expected catalog properties: endpoint
use es.endpont = 'http://localhost:9200'
select * from myIndex.type
3 Endpoint, Index, Type select * from "http://localhost:9200".myIndex.type
... Not supported

NOTE! For ES versions where type is not longer present the qualified name _doc is used.

Functions

Name Type Description Arguments Example
mustachecompile Scalar Compiles provided mustasch template with arguments template (Sting), model (Map)
rendertemplate Scalar Render provided mustasch template with arguments template (Sting), model (Map)
search Table Queries Elasticsearch with a nativ query Named arguments:
endpoint (String) (Optional if provided in catalog properties
index (String) (Optional if provided in catalog properties
type (String)
body (String) Mutual exclusive with template
template (String) Mutual exclusive with body
scroll (Boolean)
params (Map) Model provided to template
use es.endpoint = 'http://localhost:9200'
use es.index='myIndex'

select *
from es#search(
body: '{ "filter": { "match_all": {} }',
scroll: true
)
match Scalar Function that utilizes ES match operator as a predicate. NOTE! Only applicable in query predicates for ES catalog tables. matchFields (Qualified name or String with comma separated field names. If multiple fields are used then a multi_match query is used.)
field(s) (String), query (String)
select *
from es#_doc
where match(name, 'some phrase')

select *
from es#_doc
where es#match('name,message', 'some phrase')
query Scalar Function that utilizes ES query_string operator as a predicate. NOTE! Only applicable in query predicates for ES catalog tables. query (String)
select *
from es#_doc
where es#query('type:log AND severity:200')
cat Table Function that exposes elastic search cat-api as a table optional endpoint (String), catspec (String) select * from es#cat('nodes?s=name:desc')

Misc

LIKE operator is pushed down as a Elastic WILDCARD query

Jdbc

Catalog that provides access to Jdbc compliant systems

Table Qualified Name

Fully qualified name is sent to Jdbc as is

Catalog Properties

Name Description
driverclassname Jdbc driver class name
url Jdbc connection url
database Jdbc catalog to query
username Username for connection
password Password for connection

Functions

Name Type Description Arguments Example
query Table Queries Jdbc with a native query query (String)
parameters (List) Optional parameters for the prepared statement in query
use jdbc.url = 'jdbc://........'
use jdbc.username = 'user'
use jdbc.password = 'pass'

select *
from jdbc#query('select * from table where col1 = ?', array(1337))

Http

Catalog that transforms HTTP requests into table sources.

Table Qualified Name

The qualified name must be of size 1 and acts as the endpoint to call if the qualifier starts with http else the qualifier is treated as a catalog property, this to enable dynamic endpoints since the qualifiers is not expressions.

Ie.

select *
from http#"http://path.to/endpoint" x

or

use http.endpoint = 'http://path.to/endpoint' -- this one can be set by Java code to enable configurable endpoints
select *
from http#endpoint x

The http requests can be modifed by specifying a set of table source options.

Option Description Example
header."header name" Provides Http headers to the request. Each option where the first qualified part equals header is added as a Http request header header."Content-Type" = 'application/json'
method Specifies which Http method that should be used. Supported methods: get/post/put/patch. Default: get method = 'POST'
querypattern Appends a dynamic/evaluated part to the query of the request. Placeholders can be added here that will hint to the framework that a column is available as predicate/index querypattern = '/api/v1/ids/{{id}}'
This will enable column id to be used as a predicate or indexed column
bodypattern Body to send in request. See querypattern for placeholder description. bodypattern = `{ "filter": { "countryCode": "${@countryCode}", "ids":[ {{id}} ] } }`
This will enable column id to be used as a predicate or indexed column
failonnon200 Should request cause en exception on non 200 response or not. If false then a 0 rows result will be returned. Defaults to: true failonnon200 = true

NOTE! If placeholders are used then ALL placeholders in both query and body parts must be processed by either a predicate or an index else a Http datasource cannot be created.

NOTE! Placeholders must be present as values in payload else predicates (WHERE/JOIN) will be evaluated to false and no result will be returned from query.

NOTE! Only IN and EQUAL predicate types are supported.

Response

Depending on Accept/Content-Type headers in request/response the stream will be processed by core table valued functions.

  • application/json will be sent to openjson function
  • text/xml will be sent to openxml function
  • text/csv will be sent to opencsv function. If non matches the response will be transformed into a single row with a body column and a headers column.

Table options are forwarded to core table functions such as xmlPath/jsonPath etc.

Examples

Predicate

SET @customer = 'tenant-xyz' -- Can be provided by code

SELECT *
FROM http#"http://path.to.endpoint/api/v1" x
WITH
(
  method = 'GET',
  header."Accept" = 'application/json',
  header."x-tenant" = @customer,
  querypattern = '/products/{{productId}}'
)
WHERE x.productId IN ('product-010203', 'product-405060')

This will send a GET request like this:

GET http://path.to.endpoint/v1/api/products/product-010203,product-405060
Accept: application/json
x-tenant: tenant-xyz

Index

use http.orderservice = 'http://orderservice.svc.int.company/api'

SELECT *
FROM jdbc#query('
select orderId
from order
where createdAt > DATEADD(HOUR, -5, GETUTCDATE())
') order
INNER JOIN http#orderservice orderSummary
WITH
(
  method = 'POST',
  header."Content-Type" = 'application/json',
  header."Accept" = 'application/json',
  querypattern = '/orderSummary',
  bodypattern = '
  {
    "query": {
      "orderIds": [{{orderId}}]
    }
  }'
)
  ON orderSummary.orderId = order.orderId

This will send a POST request for each batch like this:

POST http://orderservice.svc.int.company/api/orderSummary
Accept: application/json
Content-Type: application/json

{
  "query": {
    "orderIds": [100,200,300,400]
  }
}

Functions

Name Type Description Arguments Example
query Table Performs a HTTP query. As with table sources this function takes a set of options to customize Http request.
- header."header-name" See previous section
- method See previous section
- body Sets body to send
- failonnon200 See previous section
endpoint (string) select *
from http#query('http://path.to.endpoint/') x
WITH
(
header."Accept" = 'application/json'
)