    Connecting to Azure Data Lake Storage Gen2

    Overview

    Azure Data Lake Storage Gen2 makes Azure Storage the foundation for building enterprise data lakes on Azure. It is built on Azure Blob Storage and is dedicated to big data analytics.

    Azure Data Lake Storage Gen2 provides file system semantics, file-level security, and scale. You can use SkyPoint's built-in connector to import data from ADLS Gen2. SkyPoint's Modern Data Stack Platform collects and analyzes that data, transforming it into meaningful information and valuable insights.

    Prerequisites

    You will need the following details to configure and import data using the Azure Data Lake Storage Gen2 connector (a quick way to verify these values is sketched after this list):

    • Storage account name
    • Account key details
    • Storage path
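
    If you want to confirm the storage account name and account key before setting up the connector, the following minimal sketch (not part of SkyPoint itself; it assumes the azure-storage-file-datalake Python package and placeholder values for the account name and key) lists the file systems the credentials can reach:

        from azure.storage.filedatalake import DataLakeServiceClient

        # Placeholder values -- replace with your storage account name and account key.
        ACCOUNT_NAME = "<storage-account-name>"
        ACCOUNT_KEY = "<account-key>"

        # ADLS Gen2 endpoints use the .dfs.core.windows.net suffix.
        service = DataLakeServiceClient(
            account_url=f"https://{ACCOUNT_NAME}.dfs.core.windows.net",
            credential=ACCOUNT_KEY,
        )

        # Listing the file systems (containers) confirms that the name/key pair works.
        for file_system in service.list_file_systems():
            print(file_system.name)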

    Import data using the Azure Data Lake Storage Gen2 connector

    Follow these steps to create a new dataflow for the Azure Data Lake Storage Gen2 import connector:

    1. Go to Dataflow > Imports.
    2. Click New dataflow.

    The Set dataflow name page appears.


    3. In the Set dataflow name page, type a name for the dataflow in the Name text area.
    4. Click Next.

    The Choose connector page appears.


    To choose the Azure Data Lake Storage Gen2 connector

    1. In the Choose connector page, select Azure Data Lake Storage Gen2 connector.

    ❕ Note: You can use the Search feature to find the connector. Also, the Azure Data Lake Storage Gen2 connector is available under both Analytics and Cloud categories.


    The Set dataflow name page appears.


    2. Type a Display Name for your dataflow in the text area.
    3. Type a Description for your dataflow in the text area.
    4. Click Next.

    The Configuration page appears.


    To configure Azure Data Lake Storage Gen2

    Follow these steps to configure the connection to Azure Data Lake Storage Gen2:

    1. Type the Storage account name in the text area.
    2. Type the Account key in the text area.
    3. Click the Folder icon in the Storage path text area.

    Once you select your Storage path, the Table Details columns appear.
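
    To double-check that the storage path points at the files you expect, here is a minimal sketch (again assuming the azure-storage-file-datalake Python package, with placeholder names for the container and folder; it is independent of the SkyPoint connector) that lists the files under a given path:

        from azure.storage.filedatalake import DataLakeServiceClient

        # Placeholder values -- replace with your own account, key, container, and folder.
        ACCOUNT_NAME = "<storage-account-name>"
        ACCOUNT_KEY = "<account-key>"
        FILE_SYSTEM = "<container-name>"    # the ADLS Gen2 file system (container)
        FOLDER = "<path/to/import/folder>"  # the storage path chosen in the connector

        service = DataLakeServiceClient(
            account_url=f"https://{ACCOUNT_NAME}.dfs.core.windows.net",
            credential=ACCOUNT_KEY,
        )
        file_system = service.get_file_system_client(FILE_SYSTEM)

        # List every file under the folder so you can confirm the path before importing.
        for path in file_system.get_paths(path=FOLDER):
            if not path.is_directory:
                print(path.name)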


    4. Enter the Table Details to process the data.

    • Purpose: Option to assign a purpose (Data or Metadata) to each table.
        ◦ Data: Loads customer data.
        ◦ Metadata: Loads metadata.
    • File Name: Displays the name of the file that you imported.
    • Table Name: Displays the imported table name.
    • Datetime format: Lists the available datetime formats. By default, SkyPoint's Modern Data Stack Platform detects the format automatically.
    • Delimiter: Displays the available separators for the values in the imported data.
    • First Row as Header: Select the checkbox to treat the first row of the file as the column headers.
    • Advanced Settings: Options to fine-tune the import process.

    5. If necessary, apply the Advanced settings to modify the default settings.

    The Advanced settings pop-up appears.


    ❕ Note: Advanced settings allow you to modify the default settings and give you more flexibility for advanced use cases. However, the default settings are adequate for most imports. A short example of how these settings affect file parsing follows the steps below.


    If you want to change any of these settings, then:

    • Compress the imported data: Select from the Compression type list. Compression reduces the size of the data.
    • Change the delimiter: Select from the Row delimiter list. By default, values are separated with a comma.
    • Change the character encoding: Select from the Encoding list. By default, UTF-8 encoding is selected.
    • Change the escape character, such as backslash (\) or slash (/): Select from the Escape character list.
    • Change the quote character, such as single quote (') or double quote ("): Select from the Quote character list.

    6. Click Save.
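
    As an illustration of what the delimiter, escape, quote, and header settings control, the following minimal sketch (a hypothetical sample file, parsed with Python's standard csv module; it is not part of the SkyPoint platform) reads a comma-delimited, UTF-8 file whose first row is the header and whose values are protected with double quotes:

        import csv
        import io

        # Hypothetical sample content mirroring the default settings:
        # comma delimiter, double-quote character, first row as header.
        # (When reading from a file on disk, open it with encoding="utf-8".)
        sample = io.StringIO(
            'customer_id,name,signup_date\n'
            '101,"Smith, Jane",2023-07-01\n'
            '102,"Doe, John",2023-07-15\n'
        )

        # DictReader uses the first row as the column headers automatically.
        reader = csv.DictReader(
            sample,
            delimiter=",",    # column delimiter
            quotechar='"',    # quote character protecting embedded commas
            escapechar="\\",  # escape character (backslash)
        )

        for row in reader:
            print(row["customer_id"], row["name"], row["signup_date"])

    Changing the Delimiter, Quote character, or Escape character options in the connector corresponds to changing these parsing parameters.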

    Run, edit, and delete the imported data

    Once you save the connector, the Azure Data Lake Storage Gen2 dataflow is displayed in the list of created dataflows on the Dataflow page.


    Item descriptions:

    • Name: Displays the name of the imported dataflow.
    • Type: Displays the connector type symbol.
    • Status: Indicates whether the data was imported successfully.
    • Tables Count: Displays the number of tables.
    • Created Date: Displays the date of creation.
    • Last refresh type: Displays whether the last refresh of the data was Full or Incremental.
    • Updated Date: Displays the last modified date.
    • Last Refresh: Displays the latest refresh date. This date is updated whenever you refresh the data.
    • Group by: Option to view the items in a specific group (for example, by name, type, or status).
    • Select the horizontal ellipsis in the Actions column and do any of the following:
        ◦ To modify the dataflow, select Edit, make your changes, and click Save.
        ◦ To execute the dataflow, select Run.
        ◦ To bring the data back to its previous state, select Rollback.
        ◦ To delete the dataflow, select Remove and then click the Delete button. All tables in the data source get deleted.
        ◦ To see the run history of the dataflow, select Run history.

    Next step

    After completing the data import, start the Master Data Management (MDM) - Stitch process to develop a unified view of your customers.
