Once GoodData.CN is running, the first step is to connect it to a database. This guide uses Postgres as the example database. A data source is an entity registered in GoodData.CN that can be connected to multiple workspaces. Once the data source is registered, you need to have GoodData.CN scan the database and store the result under the data source entity before you can use the data source in individual workspaces.
You can run connect_postgres.sh to create a data source in GoodData.CN. The script uses four API calls to complete the task:
- Establish the connection between GoodData.CN and Postgres
- Scan the physical model in Postgres and save it as JSON
- Upload the layout of the physical model found in Postgres to GoodData.CN
- Refresh the data source to clear its cache (this step is optional)
See connect_drill.sh for an example of connecting GoodData.CN to Apache Drill.
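The four steps above can be sketched as follows. This is a minimal illustration, not a copy of connect_postgres.sh: the host, token, JDBC URL, credentials, endpoint paths, and the helper names `gd_api`, `registration_payload`, and `connect_postgres` are all assumptions that should be checked against the script and the documentation for your GoodData.CN version.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the four-step flow. Every value and endpoint path
# below is an assumption; verify against connect_postgres.sh before use.

GD_HOST="${GD_HOST:-http://localhost:3000}"   # assumed GoodData.CN address
GD_TOKEN="${GD_TOKEN:-changeme}"              # assumed API token
DS_ID="${DS_ID:-ps-gooddata-8731}"            # data source ID used in this guide
DRY_RUN="${DRY_RUN:-1}"                       # 1 prints each request, 0 sends it

# gd_api METHOD PATH CONTENT_TYPE [BODY] [OUTFILE]
gd_api() {
  local method="$1" path="$2" ctype="$3" body="${4:-}" outfile="${5:-}"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$method $GD_HOST$path"
    return 0
  fi
  local args=(--silent --show-error --fail --request "$method"
    --header "Authorization: Bearer $GD_TOKEN"
    --header "Content-Type: $ctype")
  if [ -n "$body" ]; then args+=(--data-binary "$body"); fi
  if [ -n "$outfile" ]; then args+=(--output "$outfile"); fi
  curl "${args[@]}" "$GD_HOST$path"
}

# JSON body for step 1. The JDBC URL, schema, and credentials are placeholders.
registration_payload() {
  cat <<EOF
{"data": {"id": "$DS_ID", "type": "dataSource", "attributes": {
  "name": "$DS_ID", "type": "POSTGRESQL",
  "url": "jdbc:postgresql://localhost:5432/demo", "schema": "public",
  "username": "demouser", "password": "demopass"}}}
EOF
}

connect_postgres() {
  local pdm_file="${1:-pdm.json}"
  # 1. Register the data source.
  gd_api POST "/api/v1/entities/dataSources" \
    "application/vnd.gooddata.api+json" "$(registration_payload)" || return 1
  # 2. Scan the database and save the physical model as JSON.
  gd_api POST "/api/v1/actions/dataSources/$DS_ID/scan" "application/json" \
    '{"separator": "__", "scanTables": true, "scanViews": false}' \
    "$pdm_file" || return 1
  # 3. Upload the saved physical model under the data source.
  gd_api PUT "/api/v1/layout/dataSources/$DS_ID/physicalModel" \
    "application/json" "@$pdm_file" || return 1
  # 4. Optional: clear the cache for this data source.
  gd_api POST "/api/v1/actions/dataSources/$DS_ID/uploadNotification" \
    "application/json"
}
```

With the default `DRY_RUN=1`, calling `connect_postgres` only prints the four requests, which makes it easy to review them before setting `DRY_RUN=0` to send them.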
Documentation link
Once you have registered a data source, you must generate a physical data model (PDM) so that GoodData.CN can map its metadata to the database before you work on a logical data model (LDM). You can use scan_ds.sh as an example script for scanning the data source. If you cannot or do not want to use this API, you can do the same in the UI, in the LDM Modeler.
Documentation link
Once you have registered a data source, you can test its connectivity. You can use test_ds.sh as an example script; it tests the data source ps-gooddata-8731, so be sure to update the data source ID.
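A connectivity check along these lines might look like the sketch below. It is not the content of test_ds.sh: the endpoint path, the HTTP method (some versions may expect GET), the `"successful"` field in the response, and the helper names are assumptions to confirm against the documentation.

```shell
#!/usr/bin/env bash
# Hypothetical connectivity check; endpoint, method, and response shape are assumed.

GD_HOST="${GD_HOST:-http://localhost:3000}"   # assumed GoodData.CN address
GD_TOKEN="${GD_TOKEN:-changeme}"              # assumed API token
TEST_METHOD="${TEST_METHOD:-POST}"            # assumption: may be GET on some versions

# Build the test URL for a given data source ID.
test_url() {
  printf '%s/api/v1/actions/dataSources/%s/test\n' "$GD_HOST" "$1"
}

# Read a JSON response on stdin; succeed only if it reports "successful": true.
connection_ok() {
  grep -Eq '"successful"[[:space:]]*:[[:space:]]*true'
}

# Call the test endpoint and return 0 only when the connection works.
test_data_source() {
  local ds_id="$1"
  curl --silent --show-error --fail --request "$TEST_METHOD" \
    --header "Authorization: Bearer $GD_TOKEN" \
    "$(test_url "$ds_id")" | connection_ok
}
```

For example, `test_data_source ps-gooddata-8731 && echo "connection OK"` prints the message only when the response reports success.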
Documentation link
GoodData.CN supports mainstream databases and data warehouses such as:
- Postgres
- AWS Redshift
- Google BigQuery
- Vertica
These are only examples of the databases and data warehouses that GoodData.CN supports; check the documentation for the current list.
You can run get_datasources.sh to list the data sources registered in GoodData.CN. The list is saved as datasources.json in the current directory.
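Since several scripts need a data source ID, it can help to pull the IDs out of the saved list. The sketch below assumes the list endpoint path and a response in which each data source has a top-level `"id"` key. The helper names are hypothetical, and the grep-based extraction is a deliberately simple stand-in for a real JSON parser.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; endpoint path and response layout are assumptions.

GD_HOST="${GD_HOST:-http://localhost:3000}"   # assumed GoodData.CN address
GD_TOKEN="${GD_TOKEN:-changeme}"              # assumed API token

# Download the list of registered data sources into a file.
fetch_data_sources() {
  curl --silent --show-error --fail \
    --header "Authorization: Bearer $GD_TOKEN" \
    --output "${1:-datasources.json}" \
    "$GD_HOST/api/v1/entities/dataSources"
}

# Print every "id" value found in the saved JSON, one per line.
list_data_source_ids() {
  grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' "${1:-datasources.json}" |
    sed -E 's/.*"([^"]*)"$/\1/'
}
```

The extraction matches any `"id"` key in the file, so a JSON-aware tool is the safer choice if the response nests other objects that carry an ID.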
You can run delete_datasource.sh to delete a data source. The example script deletes the data source ps-gooddata-8731 created by connect_postgres.sh, so be sure to update the data source ID.
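Because deletion is destructive, a variant that refuses to run without an explicit ID may be safer than a hard-coded one. The following is a sketch under the assumption that data sources are deleted through the entities endpoint shown; the function names are hypothetical and this is not the content of delete_datasource.sh.

```shell
#!/usr/bin/env bash
# Hypothetical delete helper; the endpoint path is an assumption.

GD_HOST="${GD_HOST:-http://localhost:3000}"   # assumed GoodData.CN address
GD_TOKEN="${GD_TOKEN:-changeme}"              # assumed API token

# Build the entity URL for a given data source ID.
delete_url() {
  printf '%s/api/v1/entities/dataSources/%s\n' "$GD_HOST" "$1"
}

# Delete one data source; return 2 without sending anything if no ID is given.
delete_data_source() {
  local ds_id="${1:-}"
  if [ -z "$ds_id" ]; then
    echo "usage: delete_data_source <data-source-id>" >&2
    return 2
  fi
  curl --silent --show-error --fail --request DELETE \
    --header "Authorization: Bearer $GD_TOKEN" \
    "$(delete_url "$ds_id")"
}
```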
Whenever a visualization is rendered in GoodData.CN, the result is cached in the data source object. Even if the underlying data is updated, the change is not reflected in GoodData.CN until you refresh the data source, which clears its cache. You can run refresh_datasource.sh to do so. The example script refreshes the data source ps-gooddata-8731, so be sure to update the data source ID.
Documentation link
Once you have registered a data source in GoodData.CN, you can expose it to workspaces. The next step, in the Workspace folder, covers how to create one or more workspaces and build a logical data model (LDM) on top of the data source.