SlicingDice Documentation

SlicingDice - The All-in-One Solution Documentation

Welcome to the SlicingDice documentation. You'll find comprehensive guides and documentation to help you start working with SlicingDice as quickly as possible, as well as support if you get stuck. Let's jump right in!

Get Started    Guides

Aggregations


A SlicingDice aggregation query is equivalent to a SQL query containing GROUP BY clauses, like this:

SELECT Count(DISTINCT User.UserID), 
       Country 
FROM   User 
GROUP  BY Country 
ORDER  BY Count(UserID) DESC; 

Query purpose

SlicingDice aggregation is really powerful for generating insights from the data by calculating metrics, such as maximum, minimum, sum and average, as well as crossing data from different columns to generate pivot tables. If you are used to Relational Databases, imagine the power of crossing data from different tables without the performance hit of multiple JOIN operations.

Aggregation queries are ideal to answer this kind of questions:

  • What is the quantity of users grouped by city and age?
  • What is the quantity of users that accessed /online-demo or /platform at least 3 times in the last 7 days grouped by their browser-version and os-version?
  • What is the sum of all purchase-values from orders between {% convert_date 'YYYY-MM-DDT00:00:00Z' '-30d' %} and {% convert_date 'YYYY-MM-DDT00:00:00Z' '-1d' %}, grouped by gender and age?

Multi-level Aggregations are not commutative

Keep in mind the results are generated by recursively applying aggregations, i.e., when making an aggregation between the age and city columns, it first find the top N values for age and, based on these values, it will find the top N values for city. In this sense, multi-level aggregation are not commutative.


Query endpoint

In order to make a aggregation query, you simply need to send a POST request to https://api.slicingdice.com/v1/query/aggregation with a single JSON object containing a list of columns and predicates.

Pro Tip

Remember that you can also use test databases to query before inserting real data to your production database.

Consider generating a free demo database key for yourself to use in all examples below, as indicated on the try before you buy page.


Query syntax and structure

In this query below we will count how many unique entities there are on the dimension users, grouped by the top 2 most common values for the purchased-products column between {% convert_date 'YYYY-MM-DDT00:00:00Z' '-20d' %} and {% convert_date 'YYYY-MM-DDT00:00:00Z' '-1d' %}, considering only entities where country column is equals to USA.

SELECT [purchased-products.value] FROM users WHERE country='USA' GROUP BY [purchased-products.value] HAVING [purchased-products.date] BETWEEN '{% convert_date 'YYYY-MM-DDT00:00:00Z' '-20d' %}' AND '{% convert_date 'YYYY-MM-DDT00:00:00Z' '-1d' %}' LIMIT 2;
$ curl -X POST https://api.slicingdice.com/v1/sql \
    -H 'Authorization: MASTER_OR_WRITE_API_KEY' \
    -H 'Content-Type: application/sql' \
    -d 'SELECT [purchased-products.value] FROM users WHERE country='USA' GROUP BY [purchased-products.value] HAVING [purchased-products.date] BETWEEN '{% convert_date 'YYYY-MM-DDT00:00:00Z' '-20d' %}' AND '{% convert_date 'YYYY-MM-DDT00:00:00Z' '-1d' %}' LIMIT 2;'
curl -X POST https://api.slicingdice.com/v1/query/aggregation \
    -H 'Authorization: MASTER_OR_READ_API_KEY' \
    -H 'Content-Type: application/json' \
    -d '{
        "filter": [
            {
                "country": { "equals": "USA" }
            }
        ],
        "query": [
            {
                "purchased-products": 2,
                "between": [
                    "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-20d' %}",
                    "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-1d' %}"
                ]
            }
        ]
    }'
from pyslicer import SlicingDice
slicingdice = SlicingDice(master_key='MASTER_API_KEY')

query_data = {
    "filter": [
        {
            "country": {
                "equals": "USA"
            }
        }
    ],
    "query": [
        {
            "purchased-products": 2,
            "between": [
                "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-20d' %}", 
                "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-1d' %}"
            ]
        }
    ]
}

print(slicingdice.aggregation(query_data))

import com.slicingdice.jslicer.SlicingDice;
import java.io.IOException;
import org.json.JSONObject;
import org.json.JSONArray;

public class Example {
    public static void main(String[] args) throws IOException {
        SlicingDice slicingdice = new SlicingDice("MASTER_API_KEY");

        JSONObject queryData = new JSONObject()
        .put("filter", new JSONArray()
            .put(new JSONObject()
                .put("country", new JSONObject()
                    .put("equals", "USA"))))
        .put("query", new JSONArray()
            .put(new JSONObject()
                .put("purchased-products", 2)
                .put("between", new JSONArray()
                    .put("{% convert_date 'YYYY-MM-DDT00:00:00Z' '-20d' %}")
                    .put("{% convert_date 'YYYY-MM-DDT00:00:00Z' '-1d' %}"))));
        JSONObject result = slicingdice.aggregation(queryData);
        System.out.println(result.toString());
    }
}
require 'rbslicer'
slicingdice = SlicingDice.new(master_key: 'MASTER_API_KEY')

query_data = {
    "filter" => [
        {
            "country" => {
                "equals" => "USA"
            }
        }
    ],
    "query" => [
        {
            "purchased-products" => 2,
            "between" => [
                "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-20d' %}", 
                "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-1d' %}"
            ]
        }
    ]
}

puts slicingdice.aggregation(query_data)

const SlicingDice = require('slicerjs');

const slicingdice = new SlicingDice({masterKey: 'MASTER_API_KEY'});

const queryData = {
    "filter": [
        {
            "country": {
                "equals": "USA"
            }
        }
    ],
    "query": [
        {
            "purchased-products": 2,
            "between": [
                "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-20d' %}", 
                "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-1d' %}"
            ]
        }
    ]
};

slicingdice.aggregation(queryData).then((resp) => {
    console.log(resp);
}, (err) => {
    console.error(err);
});

<?php
use Slicer\SlicingDice;
$slicingdice = new SlicingDice(array("masterKey" => "MASTER_API_KEY"));

$queryData = array(
    "filter" => array(
        array(
            "country" => array(
                "equals" => "USA"
            )
        )
    ),
    "query" => array(
        array(
            "purchased-products" => 2,
            "between" => array(
                "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-20d' %}", 
                "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-1d' %}"
            )
        )
    )
);

print_r($slicingdice->aggregation($queryData));
?>
using System.Collections.Generic;
using Slicer;
using Newtonsoft.Json;

namespace SlicerTester.Console
{
    class Program
    {
        static void Main(string[] args)
        {
            var slicingdice = new SlicingDice(masterKey: "MASTER_API_KEY");

            var queryData = new Dictionary<string, dynamic>{
                {"filter", new List<dynamic>{
                    new Dictionary<string, dynamic>{
                        {"country", new Dictionary<string, dynamic>{
                            {"equals", "USA"}
                        }}
                    }
                }},
                {"query", new List<dynamic>{
                    new Dictionary<string, dynamic>{
                        {"purchased-products", 2},
                        {"between", new List<dynamic>{
                            "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-20d' %}", 
                            "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-1d' %}"
                        }}
                    }
                }}
            };

            var result = slicingdice.Aggregation(queryData);
            System.Console.WriteLine(JsonConvert.SerializeObject(result).ToString());
        }
    }
}
package main
import (
    "fmt"
    "github.com/SlicingDice/slicingdice-go/slicingdice"
)
func main() {
    keys := new(slicingdice.APIKey)
    keys.MasterKey = "MASTER_API_KEY"
    slicingdice := slicingdice.New(keys, 60)

    queryData := map[string]interface{}{
        "filter": []interface{}{
            map[string]interface{}{
                "country": map[string]interface{}{
                    "equals": "USA",
                },
            },
        },
        "query": []interface{}{
            map[string]interface{}{
                "purchased-products": 2,
                "between": []interface{}{
                    "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-20d' %}",
                    "{% convert_date 'YYYY-MM-DDT00:00:00Z' '-1d' %}",
                },
            },
        },
    }

    fmt.Println(slicingdice.Aggregation(queryData))
}

Let's understand each component of this API request:

  • filter - An optional query to filter entities you want to perform the aggregation over (the same query you would pass to a count entities query)
  • query - A list of JSON objects representing aggregation levels

The API response for this query above will be in the format below:

{
    "status": "success",
    "result": {
        "purchased-products": [
            {
                "quantity": 1084,
                "value": "Laptop"
            }, {
                "quantity": 842,
                "value": "Mobile phone"
            }
        ]
    },
    "took": 0.103
}

Let's understand each component of the API response:

  • status - Indicating whether the request was successfully processed or not.
  • result - JSON object containing columns and their matches.
    • <column-name> - The name of the column you are querying over.
      • quantity - Number of unique entities or occurrences of value, depending on the aggregation type.
      • value - Value stored on SlicingDice.
  • took - Time spent processing this request, in seconds.

Quantity vs. Count metrics

Be aware that the quantity value represents how many entities (if it's an attribute column) or events (if it's an event column) are present in that aggregation level for that particular value. Since count-events receive its own between parameter, its value may differ from the quantity value. And since count-entities can be performed over event columns, its value may differ as well.


Query operators, predicates and parameters

This section is focused on API-based JSON endpoint if you're using SQL to make your queries feel free to skip this part.

Aggregation operators

At this time it's not possible to use operators such and, or, not or freqgroup on the aggregation itself, but it is possible to used them on the filter part of the filtered aggregation query type, as we described below.

Filtered aggregation operators

This is the list of supported operators to be used exclusively in the filter level of the filtered aggregation query.

Note that filter receives the same query you would pass when performing a count entities or Data extraction query.

and

Usage example: Filter all entities that have the name equal to "Jeff" and age equal to 18.

or

Usage example: Filter all entities that have the name equal to "Jeff" or age equal to 18,

not

Usage example: Filter all entities that name equal to "Jeff" but age not equal to 18.

freqgroup

Usage example: Filter all entities that have purchased on "NY" or on "CA" at least two times when summing up both occurrences

starts-with

Selects only values that start with the requested substring.

Usage example: What users have a name that starts with j?

ends-with

Selects only values that end with the requested substring.

Usage example: What products have a name that ends with ona?

contains

Selects only values that contain the requested substring.

Usage example: What countries contain m?

not-contains

Selects only values that do not contain the requested substring.

Usage example: What countries not contains m?

Note that minfreq parameter is required to define a minimum frequency when evaluating the summed up frequencies.


Aggregation predicates

This is the list of supported predicates for the aggregation query (they must be inside query objects).

equals (as a string list)

Selects only values that are within the equals list.

not-equals (as a string list)

Selects only values that are not within the not-equals list.

range

Selects only values that are within the given range.

gt or greater-than

Selects only values that are greater than given gt or greater-than value.

gte or greater-than-or-equal

Selects only values that are greater than or equal to given gte or greater-than-or-equal value.

lt or less-than

Selects only values that are less than given lt or less-than value.

lte or less-than-or-equal

Selects only values that are less than or equal to given lte or less-than-or-equal value.

between

Selects only data stored within given between time ranges.

Note that this predicate is required for event columns and can receive multiple time ranges at a time for metrics aggregations (your results will be aggregated by each time range), such as:

{
  "query": [
    {
      "my-event-column": "count-entities",
      "between": [
        ["now", "-1 day"],
        ["now", "-7 days"],
        ["now", "-30 days"]
      ]
    }
  ]
}

order

Defines how aggregation response will be ordered in each aggregation level.

Date histogram aggregation predicates

interval

Usage example: What are the daily averages of purchases in the last 30 days?

Metrics aggregation predicates

min

Answers the question: What is the minimum stored value in the age column?

Only supported by the following data types:

max

Answers the question: What is the maximum stored value in the age column?

Only supported by the following data types:

sum

Answers the question: What is the sum of all stored values in the purchases column?

Only supported by the following data types:

avg

Answers the question: What is the average of all stored values in the age column?

Only supported by the following data types:

count-entities

Answers the question: How many entities have at least one value in the signed-up column?

Supported by all data types.

count-events

Answers the question: How many stored values does the purchases have?

Supported by all event data types.

count-distinct

Answers the question: How many unique values were inserted in the signed-up column?

Supported by all data types.

General parameters

This is the list of general parameters that can be used on most queries.

bypass-cache (optional)

Usage example: Boolean value defining to bypass or not the cache mechanism. Use "bypass-cache": true to ignore cached data. bypass-cache is false by default, meaning that all queries will be cached.

hide-empty-values (optional):

Usage example: Boolean value defining whether the aggregation should return empty values when there aren't enough stored values to fulfill the desired number of values. For instance, this may happen when we want to aggregate the top 5 countries with the top 10 age, but there are only 3 stored age values for a particular country. Use "hide-empty-values": true to omit empty values from the response message. hide-empty-values is true by default, meaning that aggregations won't return the empty values.

dimension (optional)

In which the query should be performed over.

If you do not provide a value to dimension, the query will be performed over the default dimension.


Query examples

You can check the Aggregation query examples to see a lot of aggregation query examples.


What's Next

Top values