Sample Configuration Files
Oracle
GCP Airflow Foundations supports data warehouse migration from an Oracle database to BigQuery using Dataflow.
For a detailed description and data type of each configuration field,
please refer to gcp_airflow_foundations.base_class.dataflow_job_config.DataflowJobConfig
.
source:
name: CSG
source_type: ORACLE
ingest_schedule: "@daily"
start_date: "2021-01-01"
extra_options:
dataflow_job_config:
system_name: CSG
region: us-central1
bq_load_temp_directory: <GCS directory for loading temporary Dataflow files>
template_path: <GCS path to Dataflow template>
jdbc_driver_class: oracle.jdbc.driver.OracleDriver
jdbc_jar_path: <the GCS path to the driver .jar file>
jdbc_url: <a valid JDBC url for connecting to the database>
jdbc_user: <the database username>
jdbc_pass_secret_name: <the database password>
kms_key_path: <the KMS key path for encrypting/decrypting JDBC credentials>
sql_casts: {"DATE": "to_char(COLUMN, 'yyyy-mm-dd') as COLUMN"}
table_type_casts: {}
bq_schema_table: ALL_TAB_COLUMNS
database_owner: <owner of the tables to query (query scope)>
location: US
dataset_data_name: oracle
connection: google_cloud_default
landing_zone_options:
landing_zone_dataset: staging_zone
tables:
- table_name: oracle_table
ingestion_type: FULL
surrogate_keys: []
hds_config:
hds_table_type: SNAPSHOT
hds_table_time_partitioning: DAY
Salesforce
For a detailed description and data type of each configuration field,
please refer to gcp_airflow_foundations.base_class.salesforce_ingestion_config.SalesforceIngestionConfig
.
source:
name: salesforce
source_type: SALESFORCE
ingest_schedule: "@daily"
start_date: "2021-01-01"
extra_options:
gcs_bucket: data-lake-bucket
location: US
dataset_data_name: salesforce
landing_zone_options:
landing_zone_dataset: landing_zone
tables:
- table_name: Opportunity
ingestion_type: FULL
surrogate_keys: []
hds_config:
hds_table_type: SNAPSHOT
hds_table_time_partitioning: DAY
extra_options:
sf_config:
ingest_all_columns: False
fields_to_omit: []
field_names: ["Id", "OwnerId", "Name", "Amount", "StageName"]
api_table_name: Opportunity
- table_name: Account
ingestion_type: FULL
surrogate_keys: []
hds_config:
hds_table_type: SNAPSHOT
hds_table_time_partitioning: DAY
extra_options:
sf_config:
ingest_all_columns: False
fields_to_omit: []
field_names: ["Id","Name"]
api_table_name: Account