Loading data from Amazon Redshift

On this page, you'll learn how to add a new Amazon Redshift data source to SlicingDice and create a loading job that uses it to synchronize your source with your databases.

Whitelist SlicingDice's IPs on Redshift

Before loading data from your Redshift source, you'll need to whitelist our IP addresses so that the Data Loading & Preparation Module can access your data.

How to whitelist the Data Loading IPs in Redshift

First, open your Amazon EC2 service panel and go to the Security Groups section of the menu. Then follow the steps below to configure a security group for Redshift.

  • Create or edit a security group
    • Select the VPC that your Redshift cluster uses
    • In the Inbound section, add 4 new rules (one rule for each IP address)
    • For each rule, select Redshift as the type
    • For each rule, enter one of the following IPs (one IP per rule):

    The following image shows what your security group will look like:

    [Image: security group with the inbound rules configured]
  • Update your VPC security groups in your Redshift cluster

    If you created a new security group, go to the Cluster section and click on Cluster. Select the Modify cluster option, choose the new security group under VPC Security Groups, and click on Modify.

    Now SlicingDice's Data Loading & Preparation Module will be able to access your Redshift clusters as soon as you create a data source on SlicingDice. This is what we'll do next!
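
The console steps above can also be scripted. The following is a minimal sketch, assuming you use boto3 (the AWS SDK for Python) and that your cluster listens on the default port 5439. The addresses in `PLACEHOLDER_IPS` are documentation-only examples from the 203.0.113.0/24 range, not SlicingDice's real IPs, so replace them with the four SlicingDice addresses. The helper names and the security group ID you pass in are hypothetical.

```python
# Placeholder addresses only (reserved documentation range), NOT SlicingDice's IPs.
PLACEHOLDER_IPS = [
    "203.0.113.10",
    "203.0.113.11",
    "203.0.113.12",
    "203.0.113.13",
]

REDSHIFT_PORT = 5439  # Redshift's default port


def build_redshift_ingress_rules(ips, port=REDSHIFT_PORT):
    """Return one inbound TCP rule per IP, in the shape the EC2 API expects."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [
                {"CidrIp": f"{ip}/32", "Description": "SlicingDice data loading"}
            ],
        }
        for ip in ips
    ]


def authorize_rules(group_id, ips):
    """Add the rules to an existing security group (requires AWS credentials)."""
    import boto3  # imported lazily so the rule builder works without boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=build_redshift_ingress_rules(ips),
    )


if __name__ == "__main__":
    # Preview the rules without calling AWS.
    for rule in build_redshift_ingress_rules(PLACEHOLDER_IPS):
        print(rule["IpRanges"][0]["CidrIp"], rule["FromPort"])
```

Keeping the rule builder separate from the AWS call lets you review the rules before anything is changed in your account.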

Add an Amazon Redshift Data Source

Before adding your Amazon Redshift data source on SlicingDice, you need to be logged in to our Control Panel. Then go to the Data Sources page to start this tutorial.

How to add a Redshift data source on SlicingDice

Before creating your data source, gather the following information, which you can find in the Amazon Redshift console:
- Your Amazon Redshift cluster endpoint
- The port on which Redshift is running (the default is 5439)
- Your Redshift username and password
- Your database name
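
The Redshift console typically shows the endpoint as a single string of the form `host:port/database`, so it can help to split it into the separate values SlicingDice asks for. Below is a minimal sketch, assuming that format. The function name and the sample endpoint are made up for illustration.

```python
DEFAULT_REDSHIFT_PORT = 5439


def split_redshift_endpoint(endpoint):
    """Split 'host[:port][/database]' into a (host, port, database) tuple."""
    host_port, _, database = endpoint.strip().partition("/")
    host, _, port_text = host_port.partition(":")
    if not host:
        raise ValueError("endpoint has no host part")
    # Fall back to the default port when the endpoint does not include one.
    port = int(port_text) if port_text else DEFAULT_REDSHIFT_PORT
    return host, port, database or None


if __name__ == "__main__":
    # Hypothetical endpoint, used only to show the expected shape.
    sample = "examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439/dev"
    host, port, database = split_redshift_endpoint(sample)
    print("Server:", host)
    print("Port:", port)
    print("Database:", database)
```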

Now start creating your data source by clicking on Create new data source in SlicingDice's Data Source section.

  • Data Source setup

    The first step is to configure how your data source is identified on SlicingDice. The following screen shows step 1.

    [Image: step 1, Data Source setup screen]

    Three fields will appear. The table below describes each field.

    Field | Description
    Data Source Name | The name of your data source. Can be edited at any time. Mandatory.
    Data Source Labels/Tags | Labels or tags you can associate with a source to organize your sources. Can be edited at any time. Optional.
    Data Source Description | The description of your data source. Can be edited at any time. Optional.

    When ready, click on Save & Continue to go to Step 2.

  • Data Source Details

    Below is an example of the information and credentials you should provide so that SlicingDice can connect to your Amazon Redshift source.

    [Image: step 2, Data Source Details screen]

    Field | Description
    Data Source Type | The type of data source. In this case we're using Amazon Redshift.
    Server | The Amazon Redshift cluster endpoint.
    Port | The port where Redshift is running. The default Redshift port is 5439.
    Username | The username used to log in to Redshift.
    Password | The password used to log in to Redshift.
    Database | The name of your Redshift database.

    You can test the connection by clicking on Test Connection. If the connection succeeds, you'll see a success message.

    Now you can go to the next step by clicking on Save & Continue.

  • Confirmation

    Here you'll see a summary of the configurations defined for this data source before you finally create it.

    The following image shows an example of a confirmation screen for a data source named drinks.

    [Image: confirmation screen for the drinks data source]

    If everything looks correct, click on Submit and you'll see a message confirming that the data source was created.

    You'll now find your new data source in the data sources list, as shown in the following image.

    [Image: Redshift source registered on the data sources list]

    That's it! The next step is to load your Amazon Redshift data into SlicingDice by creating a loading job.
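
If you prefer to collect these values before opening the form, the fields from the steps above can be modeled as a small Python dataclass that checks the mandatory ones. This is only an illustrative sketch: the class, its field names, and the validation rules are assumptions based on the field tables above, not part of any SlicingDice API. In particular, treating every connection detail as required is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RedshiftDataSource:
    # Step 1: identification (only the name is marked as mandatory)
    name: str
    # Step 2: connection details (treated as required here, which is an assumption)
    server: str
    username: str
    password: str
    database: str
    port: int = 5439  # Redshift's default port
    labels: List[str] = field(default_factory=list)
    description: str = ""

    def missing_fields(self):
        """Return the names of required fields that are empty or blank."""
        required = ("name", "server", "username", "password", "database")
        return [key for key in required if not getattr(self, key).strip()]

    def summary(self):
        """Return a confirmation-style summary that leaves out the password."""
        return {
            "name": self.name,
            "type": "Amazon Redshift",
            "server": self.server,
            "port": self.port,
            "username": self.username,
            "database": self.database,
            "labels": list(self.labels),
            "description": self.description,
        }


if __name__ == "__main__":
    # Hypothetical values; "drinks" matches the example name used in this guide.
    source = RedshiftDataSource(
        name="drinks",
        server="examplecluster.example.com",
        username="loader",
        password="change-me",
        database="dev",
    )
    print(source.missing_fields())
    print(source.summary())
```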

Add a new loading job using Redshift source

Now that the connection to your Amazon Redshift source is configured, the next step is to create and run a loading job using the Redshift data source you've set up on SlicingDice.

Here are the creation guides for each type of loading job. Choose the one that best fits your use case:

  • One-time loading job: The one-time loading job loads your data once and must be started manually. This loading job type is useful if you don't update your data frequently.
  • Manual incremental loading job: The manual incremental loading job loads your data on demand when new data needs to be inserted into SlicingDice. You need to start this loading job manually.
    Unlike a one-time loading job, only new rows will be inserted into the database. Your dataset needs to have a timestamp column in order to use this loading job type.
  • Automatic loading job: The automatic loading job loads your data periodically, at a predetermined time interval. You don't need to start this loading job manually, as it runs automatically.
    Your dataset needs to have a timestamp column in order to use this loading job type.
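
The role of the timestamp column is easiest to see in a small example. The sketch below is a conceptual illustration only, not SlicingDice's implementation: a first load takes every row, while an incremental load keeps only the rows whose timestamp is newer than the latest one already loaded. The row layout and the `updated_at` column name are hypothetical.

```python
from datetime import datetime


def select_rows(rows, last_loaded_at=None, timestamp_column="updated_at"):
    """Return all rows for a first load, or only rows newer than last_loaded_at."""
    if last_loaded_at is None:
        return list(rows)
    return [row for row in rows if row[timestamp_column] > last_loaded_at]


def newest_timestamp(rows, timestamp_column="updated_at"):
    """Return the latest timestamp among rows, or None if there are no rows."""
    return max((row[timestamp_column] for row in rows), default=None)


if __name__ == "__main__":
    rows = [
        {"id": 1, "updated_at": datetime(2024, 1, 1)},
        {"id": 2, "updated_at": datetime(2024, 1, 2)},
    ]
    # First run: everything is loaded and the newest timestamp is remembered.
    first_load = select_rows(rows)
    checkpoint = newest_timestamp(first_load)

    # Later run: a new row has arrived in the source.
    rows.append({"id": 3, "updated_at": datetime(2024, 1, 3)})
    print([row["id"] for row in select_rows(rows, checkpoint)])
```

Without a timestamp column there would be nothing to compare against the checkpoint, which is why the incremental and automatic job types require one.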