Configuring Your BigQuery Connector to DataHub
Now that you have created a Service Account and Service Account Key in BigQuery in the prior step, it's time to set up a connection via the DataHub UI.
Configure Secrets
- Within DataHub, navigate to the Ingestion tab in the top-right corner of your screen
If you do not see the Ingestion tab, please contact your DataHub admin to grant you the correct permissions
- Navigate to the Secrets tab and click Create new secret
- Create a Private Key secret
This will securely store your BigQuery Service Account Private Key within DataHub
- Enter a name like BIGQUERY_PRIVATE_KEY; we will use this name later to refer to the secret
- Copy and paste the private_key value from your Service Account Key
- Optionally add a description
- Click Create
- Create a Private Key ID secret
This will securely store your BigQuery Service Account Private Key ID within DataHub
- Click Create new secret again
- Enter a name like BIGQUERY_PRIVATE_KEY_ID; we will use this name later to refer to the secret
- Copy and paste the private_key_id value from your Service Account Key
- Optionally add a description
- Click Create
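If you would rather not copy the two values out of the key file by hand, the following minimal Python sketch shows one way to pull them out. It assumes the standard JSON layout of a Google Service Account Key (with `private_key` and `private_key_id` fields); the function name `secret_values` is hypothetical, and the secret names are simply the examples suggested above.

```python
import json

# Secret names suggested in the steps above, mapped to the key-file field
# whose value should be pasted into each secret.
SECRET_FIELDS = {
    "BIGQUERY_PRIVATE_KEY": "private_key",
    "BIGQUERY_PRIVATE_KEY_ID": "private_key_id",
}


def secret_values(key_json: str) -> dict[str, str]:
    """Map each DataHub secret name to the value to paste into it.

    `key_json` is the text content of the Service Account Key file.
    """
    key = json.loads(key_json)
    # Fail early if the key file is missing (or has empty) required fields.
    missing = [field for field in SECRET_FIELDS.values() if not key.get(field)]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    return {name: key[field] for name, field in SECRET_FIELDS.items()}
```

To use it, read your downloaded key file as text and pass the contents to `secret_values`. Treat the returned values as sensitive: paste them into the DataHub secret form rather than printing or logging them.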
Configure Recipe
- Navigate to the Sources tab and click Create new source
- Select BigQuery
- Fill out the BigQuery Recipe
You can find the following details in your Service Account Key file:
- Project ID
- Client Email
- Client ID
Populate the Secret Fields by selecting the Private Key and Private Key ID secrets you created in the Configure Secrets section above.
- Click Test Connection
This step will ensure you have configured your credentials accurately and confirm you have the required permissions to extract all relevant metadata.
After you have successfully tested your connection, click Next.
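To make the relationship between the key file, the secrets, and the recipe form more concrete, here is a rough Python sketch that assembles a recipe-shaped dictionary. The nested layout (`source` / `config` / `credential`) and the `${SECRET_NAME}` reference syntax are assumptions for illustration; the form in the UI is the authoritative place to enter these values, and `build_recipe` is a hypothetical helper, not part of DataHub.

```python
import json


def build_recipe(key: dict) -> dict:
    """Assemble a recipe-shaped dict from a parsed Service Account Key.

    The structure below is an assumed layout for illustration only.
    """
    return {
        "source": {
            "type": "bigquery",
            "config": {
                "credential": {
                    # Plain values copied from the Service Account Key file.
                    "project_id": key["project_id"],
                    "client_email": key["client_email"],
                    "client_id": key["client_id"],
                    # References to the secrets created earlier, so the key
                    # material itself never appears in the recipe
                    # (assumed placeholder syntax).
                    "private_key": "${BIGQUERY_PRIVATE_KEY}",
                    "private_key_id": "${BIGQUERY_PRIVATE_KEY_ID}",
                }
            },
        }
    }
```

Calling `json.dumps(build_recipe(key), indent=2)` on a parsed key file gives a readable view of which fields come from the key file and which are secret references.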
Schedule Execution
Now it's time to schedule a recurring ingestion pipeline to regularly extract metadata from your BigQuery instance.
- Decide how regularly you want this ingestion to run (day, month, year, hour, minute, etc.), then select a frequency from the dropdown
- Ensure you've configured the correct timezone
- Click Next when you are done
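Recurring schedules like the one chosen above are commonly written as five-field cron expressions. The sketch below is an illustration under that assumption, not a description of what the dropdown produces; `cron_expression` is a hypothetical helper covering only hourly, daily, and monthly runs.

```python
def cron_expression(every: str, minute: int = 0, hour: int = 0, day: int = 1) -> str:
    """Return a five-field cron string.

    Field order: minute, hour, day-of-month, month, day-of-week.
    """
    if not (0 <= minute <= 59 and 0 <= hour <= 23 and 1 <= day <= 31):
        raise ValueError("minute, hour, or day is out of range")
    patterns = {
        "hour": f"{minute} * * * *",            # every hour at the given minute
        "day": f"{minute} {hour} * * *",        # every day at hour:minute
        "month": f"{minute} {hour} {day} * *",  # given day of each month
    }
    if every not in patterns:
        raise ValueError(f"unsupported frequency: {every!r}")
    return patterns[every]
```

For example, `cron_expression("day", hour=9)` describes a run every day at 09:00. Which instant that is depends on the timezone attached to the schedule, which is why the timezone setting matters.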
Finish Up
- Name your ingestion source, then click Save and Run
You will now find your new ingestion source running
Validate Ingestion Runs
- View the latest status of ingestion runs on the Ingestion page
- Click the plus sign to expand the full list of historical runs and outcomes; click Details to see the outcomes of a specific run
- From the Ingestion Run Details page, pick View All to see which entities were ingested
- Pick an entity from the list to manually validate that it contains the details you expected
Congratulations! You've successfully set up BigQuery as an ingestion source for DataHub!
Need more help? Join the conversation in Slack!