December 10, 2019

cube-js/cube.js

Cube.js - Open Source Analytics Framework

repo name cube-js/cube.js
repo link https://github.com/cube-js/cube.js
homepage https://cube.dev
language JavaScript
size (curr.) 26565 kB
stars (curr.) 7386
created 2018-09-16
license Other

Cube.js is an open source modular framework to build analytical web applications. It is primarily used to build internal business intelligence tools or to add customer-facing analytics to an existing application.

Cube.js was designed to work with serverless query engines like AWS Athena and Google BigQuery. Its multi-stage querying approach makes it suitable for handling trillions of data points. Most modern RDBMSs work with Cube.js as well and can be tuned for adequate performance.

Unlike other tools, it is not a monolithic application but a set of modules, each of which does one thing well. Cube.js provides modules for running transformations and modeling in the data warehouse, querying and caching, managing the API gateway, and building a UI on top of that.

Cube.js Backend

  • Cube.js Schema. It acts as an ORM for analytics and lets you model everything from simple counts to cohort retention and funnel analysis.
  • Cube.js Query Orchestration and Cache. It optimizes query execution by breaking queries into small, fast, reusable and materialized pieces.
  • Cube.js API Gateway. It provides an idempotent long-polling API that guarantees delivery of analytic query results without request time frame limitations and tolerates connectivity issues.
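The long-polling behaviour of the API Gateway can be sketched from the client side. This is a minimal sketch, not the official client: it assumes the REST endpoint is `<apiUrl>/load`, that the server answers with `{ error: 'Continue wait' }` while a query is still running, and that `fetchJson` is a hypothetical function you supply which performs the HTTP request and returns the parsed JSON body.

```javascript
// Poll the load endpoint until the server returns a final result.
// `fetchJson` is a hypothetical injected function: (url) => Promise<object>.
async function loadWithPolling(fetchJson, apiUrl, query, options = {}) {
  const { maxAttempts = 20, delayMs = 500 } = options;
  const url = `${apiUrl}/load?query=${encodeURIComponent(JSON.stringify(query))}`;

  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const body = await fetchJson(url);

    // Assumed "still working" signal: repeat the identical request.
    // The API is described as idempotent, so repeating it is safe.
    if (body.error === 'Continue wait') {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      continue;
    }
    if (body.error) {
      throw new Error(body.error);
    }
    return body;
  }
  throw new Error(`No result after ${maxAttempts} attempts`);
}
```

Because every attempt is a short request, no single HTTP call has to stay open for the full duration of a slow query, and a dropped connection only costs one retry.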

Cube.js Frontend

  • Cube.js JavaScript Client. Core set of methods to access the Cube.js API Gateway and to work with query result sets.
  • Cube.js React, Angular and Vue. Framework-specific wrappers for the Cube.js API.

Why Cube.js?

If you are building your own business intelligence tool or customer-facing analytics, you will most probably face the following problems:

  1. Performance. Most of the effort in modern analytics software development goes into providing adequate time to insight. In a world where every company’s data is big data, just writing a SQL query to get insight isn’t enough anymore.
  2. SQL code organization. Modelling even a dozen metrics with a dozen dimensions in pure SQL sooner or later becomes a maintenance nightmare, which ends in building a modelling framework.
  3. Infrastructure. Key components every production-ready analytics solution requires: analytic SQL generation, query results caching and execution orchestration, data pre-aggregation, security, API for query results fetch, and visualization.

Cube.js has the necessary infrastructure for every analytic application. It relies heavily on its caching and pre-aggregation layer to provide a raw-data-to-insight delay of several minutes and sub-second API response times at the scale of a trillion data points.
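To see why a pre-aggregation layer helps with response times, consider this conceptual sketch. It is not Cube.js internals: the sample rows and both function names are made up for illustration. The idea is that raw rows are summarized once into a small rollup, and later queries read the rollup instead of rescanning the raw data.

```javascript
// Raw events as they might sit in the warehouse (made-up sample rows).
const rawEvents = [
  { time: '2015-01-03', country: 'US' },
  { time: '2015-01-17', country: 'US' },
  { time: '2015-02-02', country: 'DE' },
  { time: '2015-02-09', country: 'US' },
];

// Build the rollup once: one entry per (month, country) holding a count.
function buildMonthlyRollup(events) {
  const rollup = new Map();
  for (const { time, country } of events) {
    const key = `${time.slice(0, 7)}|${country}`;
    rollup.set(key, (rollup.get(key) || 0) + 1);
  }
  return rollup;
}

// Answer "count per month" from the rollup rather than from the raw rows.
function countByMonth(rollup) {
  const result = {};
  for (const [key, count] of rollup) {
    const month = key.split('|')[0];
    result[month] = (result[month] || 0) + count;
  }
  return result;
}

const rollup = buildMonthlyRollup(rawEvents);
console.log(countByMonth(rollup));
// { '2015-01': 2, '2015-02': 2 }
```

The rollup grows with the number of distinct (month, country) pairs, not with the number of raw rows, which is what keeps queries against it fast.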

Getting Started

1. Install with NPM or Yarn

$ npm install -g cubejs-cli
# or
$ yarn global add cubejs-cli

2. Connect to Your Database

Run the following command to create a new Cube.js project:

$ cubejs create <project name> -d <database type>

Specify the project name and, with the -d flag, your database type. Available options:

  • postgres
  • mysql
  • mssql
  • athena
  • mongobi
  • bigquery
  • redshift
  • clickhouse
  • hive
  • snowflake
  • prestodb
  • oracle

For example,

$ cubejs create hello-world -d postgres

Once run, the create command will create a new project directory that contains the scaffolding for your new Cube.js project. This includes all the files necessary to spin up the Cube.js backend, example frontend code for displaying the results of Cube.js queries in a React app, and some example schema files to highlight the format of the Cube.js Data Schema layer.

The .env file in this project directory contains placeholders for the relevant database credentials. For MySQL and PostgreSQL, you’ll need to fill in the target host, database name, user and password. For Athena, you’ll need to specify the AWS access and secret keys with the access necessary to run Athena queries, and the target AWS region and S3 output location where query results are stored.
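Before starting the server, it can help to check that the placeholders have actually been filled in. The sketch below assumes the variable names shown (`CUBEJS_DB_*` for MySQL and PostgreSQL, `CUBEJS_AWS_*` for Athena); treat them as assumptions and confirm them against the .env file that the create command generated for you. `missingEnv` is a hypothetical helper, not part of Cube.js.

```javascript
// Assumed variable names; verify against your generated .env file.
const SQL_DB_VARS = ['CUBEJS_DB_HOST', 'CUBEJS_DB_NAME', 'CUBEJS_DB_USER', 'CUBEJS_DB_PASS'];

const REQUIRED_ENV = {
  postgres: SQL_DB_VARS,
  mysql: SQL_DB_VARS,
  athena: [
    'CUBEJS_AWS_KEY',
    'CUBEJS_AWS_SECRET',
    'CUBEJS_AWS_REGION',
    'CUBEJS_AWS_S3_OUTPUT_LOCATION',
  ],
};

// Return the names of required variables that are unset or empty.
function missingEnv(dbType, env = process.env) {
  const required = REQUIRED_ENV[dbType];
  if (!required) {
    throw new Error(`No checklist for database type: ${dbType}`);
  }
  return required.filter((name) => !env[name]);
}
```

For example, `missingEnv('postgres')` returns an empty array once all four database variables are set in the environment.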

3. Define Your Data Schema

Cube.js uses Data Schema to generate and execute SQL.

It acts as an ORM for your database and it is flexible enough to model everything from simple counts to cohort retention and funnel analysis. Read more about Cube.js Schema.

You can generate schema files using the Developer Playground. To do so, start the dev server from the project directory:

$ npm run dev

Then go to http://localhost:4000 and use the UI to generate schema files.

Manually creating Data Schema files

You can also add schema files to the schema folder manually:

// schema/users.js

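cube(`Users`, {
   // Assumption: the data lives in a `users` table; adjust to your database.
   sql: `SELECT * FROM users`,

   measures: {
     // Each measure needs a name; `count` is the key used in queries.
     count: {
       type: `count`
     }
   },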

   dimensions: {
     age: {
       type: `number`,
       sql: `age`
     },

     createdAt: {
       type: `time`,
       sql: `createdAt`
     },

     country: {
       type: `string`,
       sql: `country`
     }
   }
});
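Once a cube is defined, queries refer to its members as `CubeName.memberName`. Here is a minimal sketch of a query against the Users cube, assuming it exposes a measure named `count`. The filter shape (`dimension`, `operator`, `values`) is an assumption about the query format of this Cube.js version and should be checked against the query format docs. `referencedMembers` is a hypothetical helper that lists every member a query touches.

```javascript
// A query object of the kind passed to the Cube.js client's load() method.
const usersByCountry = {
  measures: ['Users.count'],
  dimensions: ['Users.country'],
  timeDimensions: [{
    dimension: 'Users.createdAt',
    dateRange: ['2019-01-01', '2019-12-31'],
    granularity: 'month',
  }],
  // Assumed filter shape; values are passed as strings.
  filters: [{ dimension: 'Users.age', operator: 'gt', values: ['18'] }],
};

// Collect every cube member the query references (hypothetical helper).
function referencedMembers(query) {
  return [
    ...(query.measures || []),
    ...(query.dimensions || []),
    ...(query.timeDimensions || []).map((td) => td.dimension),
    ...(query.filters || []).map((f) => f.dimension),
  ];
}

console.log(referencedMembers(usersByCountry));
```

Every name in that list must exist in the schema file, which makes a helper like this a quick sanity check when a query fails.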

4. Visualize Results

The Cube.js client connects to the Cube.js Backend and lets you visualize your data. This section shows how to use the Cube.js JavaScript client.

As a shortcut you can run your dev server first:

$ npm run dev

Then open http://localhost:4000 to see visualization examples. This will open a Developer Playground app. You can change the metrics and dimensions of the example to use the schema you defined above, change the chart types, generate sample code out of it and more!

Cube.js Client Installation

Vanilla JS:

$ npm i --save @cubejs-client/core

React:

$ npm i --save @cubejs-client/core
$ npm i --save @cubejs-client/react

Example Usage

Vanilla JavaScript

Instantiate the Cube.js API client and then use it to fetch data:

import cubejs from '@cubejs-client/core';
import Chart from 'chart.js';
import chartjsConfig from './toChartjsData';

const cubejsApi = cubejs(
  'YOUR-CUBEJS-API-TOKEN',
  { apiUrl: 'http://localhost:4000/cubejs-api/v1' },
);

const resultSet = await cubejsApi.load({
  measures: ['Stories.count'],
  timeDimensions: [{
    dimension: 'Stories.time',
    dateRange: ['2015-01-01', '2015-12-31'],
    granularity: 'month'
  }]
});
const context = document.getElementById('myChart');
new Chart(context, chartjsConfig(resultSet));

React

Import cubejs and the QueryRenderer component, and use them to fetch the data. In the example below we use Recharts to visualize data.

import React from 'react';
import { LineChart, Line, XAxis, YAxis } from 'recharts';
import cubejs from '@cubejs-client/core';
import { QueryRenderer } from '@cubejs-client/react';

const cubejsApi = cubejs(
  'YOUR-CUBEJS-API-TOKEN',
  { apiUrl: 'http://localhost:4000/cubejs-api/v1' },
);

export default () => {
  return (
    <QueryRenderer
      query={{
        measures: ['Stories.count'],
        dimensions: ['Stories.time.month']
      }}
      cubejsApi={cubejsApi}
      render={({ resultSet }) => {
        if (!resultSet) {
          return 'Loading...';
        }

        return (
          <LineChart data={resultSet.rawData()}>
            <XAxis dataKey="Stories.time"/>
            <YAxis/>
            <Line type="monotone" dataKey="Stories.count" stroke="#8884d8"/>
          </LineChart>
        );
      }}
    />
  )
}
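The vanilla JavaScript example imports a local `./toChartjsData` helper that this page does not show. The following is one possible shape for it, offered as a sketch rather than the project's actual file. It assumes the result set exposes `chartPivot()` (rows with an `x` label plus one value per series key) and `seriesNames()` (objects with `key` and `title`), and it returns a Chart.js line chart configuration.

```javascript
// Possible contents of toChartjsData.js (hypothetical implementation).
// `resultSet` is any object exposing chartPivot() and seriesNames().
function toChartjsData(resultSet) {
  const rows = resultSet.chartPivot();
  return {
    type: 'line',
    data: {
      // One label per row, taken from the pivoted x value.
      labels: rows.map((row) => row.x),
      // One dataset per series, reading each row's value by series key.
      datasets: resultSet.seriesNames().map(({ key, title }) => ({
        label: title,
        data: rows.map((row) => row[key]),
        fill: false,
      })),
    },
    options: {},
  };
}

// In an ES module, expose it with: export default toChartjsData;
```

Keeping the conversion in its own function means the data-fetching code stays independent of the chart library, in line with the client being visualization agnostic.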

Examples

Demo | Code | Description
Real-Time Dashboard | real-time-dashboard | Real-Time Dashboard Demo using Web Sockets transport
React Dashboard | react-dashboard | Dynamic dashboard with React, GraphQL, and Cube.js
D3 Dashboard | d3-dashboard | Dashboard with Cube.js, D3, and Material UI
Stripe Dashboard | stripe-dashboard | Stripe Demo Dashboard built with Cube.js and Recharts
Event Analytics | event-analytics | Mixpanel-like Event Analytics App built with Cube.js and Snowplow
Node Express Dashboard | node-express-dashboard | Analytics Dashboard with Node, Express, and Cube.js
External Rollups | external-rollups | Compare performance of direct BigQuery querying vs MySQL cached version for the same data
AWS Web Analytics | aws-web-analytics | Web Analytics with AWS Lambda, Athena, Kinesis and Cube.js
Examples Gallery | examples-gallery | Examples Gallery with different visualization libraries

Tutorials

Getting Started Tutorials

Advanced

Community

If you have any questions or need help, please join our Slack community of amazing developers and contributors.

Architecture

Cube.js acts as an analytics backend, translating business logic (metrics and dimensions) into SQL and handling the database connection.

The Cube.js JavaScript Client performs queries, expressed via dimensions, measures, and filters. The Server uses the Cube.js Schema to generate SQL code, which is executed by your database. The Server handles all database connections, as well as the pre-aggregation and caching layers. The result is then sent back to the Client. The Client itself is visualization agnostic and works well with any chart library.
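The step where the Server turns a query into SQL can be illustrated with a toy generator. This is a deliberately simplified sketch and not how Cube.js generates SQL: the cube object shape, the `table` field, and `generateSql` are all hypothetical, and real generation also has to cover joins, filters, time dimensions and pre-aggregations.

```javascript
// Toy cube definition: member name -> SQL fragment (hypothetical shape).
const usersCube = {
  table: 'users',
  measures: { count: 'COUNT(*)' },
  dimensions: { country: 'country', age: 'age' },
};

// Build a SELECT from the requested measures and dimensions.
function generateSql(cube, { measures = [], dimensions = [] }) {
  const dimSql = dimensions.map((name) => `${cube.dimensions[name]} AS "${name}"`);
  const measureSql = measures.map((name) => `${cube.measures[name]} AS "${name}"`);
  const select = [...dimSql, ...measureSql].join(', ');

  // Group by column position so every selected dimension is a grouping key.
  const groupBy = dimensions.length
    ? ` GROUP BY ${dimensions.map((_, i) => i + 1).join(', ')}`
    : '';

  return `SELECT ${select} FROM ${cube.table}${groupBy}`;
}

console.log(generateSql(usersCube, { measures: ['count'], dimensions: ['country'] }));
// SELECT country AS "country", COUNT(*) AS "count" FROM users GROUP BY 1
```

The point of the sketch is the division of labour: the client only names business-level members, while the SQL fragments live in one place on the server.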

Contributing

How you can help:

  1. Upvote issues with the 👍 reaction so we know the demand for a particular issue and can prioritize it within the road map.
  2. Create issues every time you feel something is missing or goes wrong.
  3. Ask questions on Stack Overflow with the cube.js tag if others may have the same questions.
  4. Provide pull requests for all open issues, especially those with the help wanted and good first issue labels.

All sorts of contributions are welcome and extremely helpful 🙌 Please refer to the contribution guide for more information.

License

Cube.js Client is MIT licensed.

Cube.js Backend is Apache 2.0 licensed.