Connect to BigQuery

Objective

The Procore Analytics Cloud Connect Access tool is a notebook that helps you configure and manage data transfers from Procore to BigQuery with Procore Analytics 2.0. 

Prerequisites

  • Procore Analytics 2.0 SKU
  • Python 3.8 or higher
  • Access to Google Cloud Platform (GCP)
  • Required permissions on both Delta Share and BigQuery
  • The zipped package, downloaded from the company-level Procore Analytics tool (via Procore Analytics > Getting Started > Connection Options > BigQuery)

Steps

  1. Set Up Configuration
  2. Run the BigQuery Application

Set Up Configuration 

Delta Share Configuration 

  1. Create a file named config.share with your Delta Share credentials in JSON format.
  2. Fill in the following required fields.
    Note: These details can be obtained from the Procore Analytics web application.
    • bearerToken: Your Delta Share access token.
    • endpoint: Your Delta Share endpoint URL.
    • shareCredentialsVersion: Version number (currently 1).
Example config.share File

{
  "shareCredentialsVersion": 1,
  "bearerToken": "",
  "endpoint": ""
}
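
Before moving on, it can help to confirm that config.share parses as JSON and that none of the three fields is missing or empty. The following is a minimal sketch that uses only the Python standard library. The function name check_share_profile is hypothetical and is not part of the Procore package.

```python
import json
from pathlib import Path

# Field names taken from the example config.share file above.
REQUIRED_FIELDS = ("shareCredentialsVersion", "bearerToken", "endpoint")


def check_share_profile(text):
    """Return a list of problems found in the JSON text of a config.share file."""
    try:
        profile = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(profile, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in profile:
            problems.append(f"missing field: {field}")
        elif profile[field] in ("", None):
            problems.append(f"empty field: {field}")
    return problems


if __name__ == "__main__":
    path = Path("config.share")  # assumed to sit in the current directory
    if path.exists():
        for problem in check_share_profile(path.read_text(encoding="utf-8")):
            print(problem)
```

An empty list means the file has the expected shape. It does not confirm that the token or endpoint is accepted by the Delta Share server.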

BigQuery Configuration

  1. Download the bigquery.zip file from the Procore Analytics web application. 
    Note: You can download the zipped package from the company level Procore Analytics tool (via Procore Analytics > Getting Started > Connection Options > BigQuery).
  2. Extract the package to a directory of your choice.
  3. Open the config.yaml file and modify the following parameters:
    • source_config.config_path: Path to Delta Share configuration file.
    • source_config.tables: Optional list of specific tables to process. Leave it empty to process all tables.
    • target_config.project_id: GCP project ID for BigQuery.
    • target_config.dataset: BigQuery dataset name.
    • target_config.threads: Number of concurrent table processes.
Example config.yaml File

source_config:
  config_path: "<path_to_delta_share_config>"
  tables: # Optional - list of specific tables to process
    - "table1"
    - "table2"

target_config:
  project_id: "<your-gcp-project-id>"
  dataset: "<bigquery-dataset-name>"
target_type: bigquery
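
A quick check of the parsed configuration can catch mistakes before a long transfer starts. The sketch below assumes config.yaml has already been loaded into a Python dictionary (for example with a YAML parser). The function name check_config is hypothetical. Treating config_path, project_id, and dataset as mandatory is an assumption based on the parameter list above, not documented behavior of the application.

```python
from typing import Any, Dict, List


def check_config(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in an already-parsed config.yaml."""
    problems = []
    source = cfg.get("source_config") or {}
    target = cfg.get("target_config") or {}

    # Assumption: the path to the Delta Share configuration is mandatory.
    if not source.get("config_path"):
        problems.append("source_config.config_path is required")

    # tables is optional; when present it should be a list of table names.
    tables = source.get("tables")
    if tables is not None and not isinstance(tables, list):
        problems.append("source_config.tables must be a list when present")

    # Assumption: both BigQuery identifiers are mandatory.
    for key in ("project_id", "dataset"):
        if not target.get(key):
            problems.append(f"target_config.{key} is required")

    # threads is optional; when present it should be a positive integer.
    threads = target.get("threads")
    if threads is not None:
        if isinstance(threads, bool) or not isinstance(threads, int) or threads < 1:
            problems.append("target_config.threads must be a positive integer")
    return problems


example = {
    "source_config": {"config_path": "config.share", "tables": ["table1", "table2"]},
    "target_config": {"project_id": "my-project", "dataset": "my_dataset", "threads": 4},
}
print(check_config(example))
```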

Upload Configuration File
  1. Upload both the config.yaml and config.share files to a Google Cloud Storage (GCS) bucket.
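
The upload can also be scripted instead of done through the console. The following is a minimal sketch, assuming the google-cloud-storage package is installed and that default GCP credentials are available in the environment. The helper names, bucket name, and prefix are placeholders rather than values from the Procore package.

```python
from pathlib import Path


def object_name(prefix, local_path):
    """Build the object name inside the bucket from an optional prefix and a local file path."""
    name = Path(local_path).name
    cleaned = prefix.strip("/")
    return f"{cleaned}/{name}" if cleaned else name


def upload_configs(bucket_name, local_paths, prefix=""):
    """Upload each local file to the bucket and return the resulting gs:// URIs."""
    # Imported here so the helper above works without the GCS client installed.
    from google.cloud import storage

    bucket = storage.Client().bucket(bucket_name)
    uris = []
    for local_path in local_paths:
        blob = bucket.blob(object_name(prefix, local_path))
        blob.upload_from_filename(str(local_path))
        uris.append(f"gs://{bucket_name}/{blob.name}")
    return uris


# Example call (placeholder bucket name and prefix):
# upload_configs("my-bucket", ["config.yaml", "config.share"], prefix="procore")
```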

Run the BigQuery Application

  1. Create a Python notebook and install the following packages:
    • %pip install delta-sharing
    • %pip install pandas-gbq -U
  2. Copy the code from delta_share_to_bq.py, paste it into your notebook, update the configuration path so that it points to your config.yaml file, and run it.
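
For orientation, the kind of transfer the two packages make possible can be sketched as follows. This is not the code in delta_share_to_bq.py. It is an illustrative outline that assumes the Delta Share profile is readable at a local path and that each table is fully replaced in BigQuery on every run. The function names are hypothetical.

```python
def table_url(profile_path, share, schema, table):
    """Build the '<profile>#<share>.<schema>.<table>' address that delta-sharing expects."""
    return f"{profile_path}#{share}.{schema}.{table}"


def copy_table(profile_path, table, project_id, dataset):
    """Load one shared table into a DataFrame and write it to BigQuery."""
    # Imported lazily so table_url can be used without the packages installed.
    import delta_sharing
    import pandas_gbq

    df = delta_sharing.load_as_pandas(
        table_url(profile_path, table.share, table.schema, table.name)
    )
    # Assumption: overwrite the destination table on each run.
    pandas_gbq.to_gbq(
        df, f"{dataset}.{table.name}", project_id=project_id, if_exists="replace"
    )
    return len(df)


def copy_all(profile_path, project_id, dataset, wanted=None):
    """Copy every shared table, or only those whose names appear in wanted."""
    import delta_sharing

    client = delta_sharing.SharingClient(profile_path)
    for table in client.list_all_tables():
        if wanted and table.name not in wanted:
            continue
        rows = copy_table(profile_path, table, project_id, dataset)
        print(f"{table.name}: {rows} rows")
```

Loading a whole table into memory with load_as_pandas is the simplest approach, and it is the reason memory usage matters for large tables.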

Monitoring and Logging

The application provides detailed logging with:

  • Processing status for each table.
  • Error messages and exceptions.
  • Concurrent processing information.
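
If you adapt the script or wrap it in your own notebook code, similar logging can be produced with the standard logging module. The sketch below is an assumption about one reasonable setup, not the application's actual logging configuration. The logger name and helper names are hypothetical.

```python
import logging


def make_logger(name="delta_share_to_bq"):
    """Create a logger whose records include the thread name, useful for concurrent runs."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid duplicate handlers when a notebook cell is re-run
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s [%(threadName)s] %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


def process_with_logging(logger, table_name, work):
    """Run work() for one table, logging start, success, or the full exception."""
    logger.info("started %s", table_name)
    try:
        work()
    except Exception:
        logger.exception("failed %s", table_name)
        return False
    logger.info("finished %s", table_name)
    return True
```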

Best Practices

  • Performance Optimization
    • Adjust thread count based on system resources.
    • Monitor memory usage with large tables.
    • Consider table sizes when setting concurrent processes.
  • Error Management
    • Monitor application logs.
    • Set up appropriate alerting.
    • Maintain backup configurations.
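
To make the thread-count advice concrete, the following sketch shows one common way to process tables concurrently with a bounded number of workers while keeping one failed table from stopping the rest. It is an illustration built on the standard library, not the application's own implementation. The function name run_concurrently and the default of 4 threads are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_concurrently(table_names, process_table, threads=4):
    """Run process_table(name) for each table with at most `threads` workers.

    Returns a dict mapping each table name to its result, or to the exception
    it raised, so failures can be reported after all tables have been attempted.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = {pool.submit(process_table, name): name for name in table_names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                results[name] = exc
    return results
```

Because each worker may hold a full table in memory, lowering the thread count is usually the first adjustment to try when memory is tight.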

Troubleshooting

Common issues and solutions:

  • Connection Failures
    • Verify network connectivity.
    • Check credential validity.
    • Confirm service account permissions.
  • Processing Errors
    • Verify table existence.
    • Check table access permissions.
    • Validate configuration settings.
  • Performance Issues
    • Reduce concurrent threads.
    • Monitor system resources.

Support

For additional help:

  • Review application logs for error details.
  • Verify configuration settings.
  • Ensure all prerequisites are met.
  • Contact your system administrator for permission-related issues.