This guide will walk you through installing the Lume Python SDK, configuring your credentials, and running a complete data transformation pipeline.

1. Prerequisites

Before you begin, ensure you have:

  1. Python 3.8+ installed on your system.
  2. A Lume Account with access to the Python SDK.
  3. A Flow Version created in the Lume UI. This includes setting up your Source and Target Connectors. If you haven’t done this, please refer to the Flow creation guide in the main documentation.
  4. Your Lume API Key. You can find this in your Lume account settings under “API Keys”.

2. Installation

⚠️ Private SDK Access - Contact your Lume representative for access and installation instructions for the SDK.

3. Configuration

There are two primary ways to configure your credentials:

Option A: Environment Variable

Set your Lume API key as an environment variable. The SDK will automatically detect and use it.

export LUME_API_KEY="your_api_key_here"
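Before running your script, you can confirm the variable is actually visible to your Python process. This is a standard-library-only check (the helper name `lume_key_present` is just for illustration, not part of the SDK):

```python
import os

def lume_key_present() -> bool:
    # Returns True if LUME_API_KEY is set and non-empty in this process.
    # If this is False, the SDK will not find a key automatically.
    return bool(os.environ.get("LUME_API_KEY"))

print("LUME_API_KEY set:", lume_key_present())
```

Remember that environment variables set in one shell session are not visible to processes started elsewhere (e.g., an IDE launched from the dock).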

Option B: Programmatic Initialization

Alternatively, you can initialize the client directly in your code. This is useful for environments where you can’t set environment variables, like some serverless functions.

import lume

lume.init(api_key="your_api_key_here")

This explicit initialization overrides any environment variables.
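The precedence rule (explicit key wins, environment variable as fallback) can be sketched in plain Python. `resolve_api_key` below is a hypothetical helper to illustrate the behavior, not an SDK function:

```python
import os
from typing import Optional

def resolve_api_key(explicit_key: Optional[str] = None) -> Optional[str]:
    # An explicitly passed key takes precedence, mirroring how
    # lume.init(api_key=...) overrides the LUME_API_KEY variable.
    return explicit_key or os.environ.get("LUME_API_KEY")
```

This pattern is handy when the same script runs both locally (env var) and in a serverless function (explicit key from a secrets manager).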

4. Triggering Your First Run

With the SDK installed and configured, you can trigger a pipeline with just a few lines of code.

For this example, we’ll assume:

  • You have a Flow Version named customer_ingest:v1.
  • Your source Connector is configured to an S3 bucket.
  • The source_path points to a specific file you want to process: s3://my-customer-data/new_records.csv.

import lume

# --- Trigger the Pipeline ---
# This call initiates the run on Lume's servers and returns immediately.
# The `source_path` tells Lume which data to process.
run = lume.run(
    flow_version="customer_ingest:v1",
    source_path="s3://my-customer-data/new_records.csv"
)

print(f"Successfully initiated run: {run.id}")
print(f"Current status: {run.status}")

# --- Monitor the Run ---
# The run executes asynchronously on Lume. Use run.wait() to block
# until the pipeline reaches a final state (e.g., SUCCEEDED or FAILED).
print("Waiting for run to complete...")
run.wait()

# --- Check the Final Status ---
# After wait() completes, the run object is updated with the final status.
print(f"Run {run.id} finished with status: {run.status}")

# You can now access detailed results from the metadata object.
if run.status == 'SUCCEEDED':
    print("Metrics:", run.metadata['metrics'])

Pro Tip: Use Webhooks for Production

While run.wait() is great for simple scripts and getting started, we strongly recommend using Webhooks for production applications. They are more scalable and efficient than continuous polling.
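As a rough illustration, a webhook receiver is a small HTTP endpoint that parses the event Lume sends when a run reaches a final state. The payload field names below (run_id, status) are assumptions for this sketch; consult the webhooks documentation for the actual schema and any signature-verification requirements.

```python
import json

def handle_lume_webhook(raw_body: bytes) -> str:
    # Parse the JSON event body. The field names ("run_id", "status")
    # are hypothetical; verify them against the real webhook payload.
    event = json.loads(raw_body)
    run_id = event["run_id"]
    status = event["status"]
    if status == "SUCCEEDED":
        return f"Run {run_id} succeeded"
    return f"Run {run_id} finished with status {status}"
```

In production this function would sit behind a route in your web framework of choice, returning a 2xx response quickly and deferring heavy work to a background job.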

5. Understanding the Process

When you call lume.run(), you are not running the transformation locally. Instead, you are sending a request to the Lume platform to execute the following steps:

  1. SYNCING_SOURCE: Lume’s secure agent uses the pre-configured Connector to ingest the data from your source_path into a temporary, isolated staging area.
  2. TRANSFORMING: The Lume engine applies the logic from your customer_ingest:v1 Flow Version to the staged data.
  3. SYNCING_TARGET: Lume’s agent writes the transformed data to your destination system, as defined in the Flow Version’s Target Connector.

This “Sync-Transform-Sync” model ensures that your environment only needs to hold credentials to Lume, not to your source and target data systems. You can read more about this secure architecture in our Production Guide.
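The stage progression above can be observed with a simple polling loop. The sketch below is framework-free: `get_status` stands in for refreshing and reading `run.status`, and the stage names follow the list above (a minimal sketch, assuming those are the exact status strings the API reports):

```python
import time
from typing import Callable, List

TERMINAL_STATES = {"SUCCEEDED", "FAILED"}

def watch_stages(get_status: Callable[[], str],
                 interval: float = 5.0,
                 sleep: Callable[[float], None] = time.sleep) -> List[str]:
    # Record each distinct stage (e.g. SYNCING_SOURCE -> TRANSFORMING ->
    # SYNCING_TARGET) until the run reaches a terminal state.
    stages: List[str] = []
    while True:
        status = get_status()
        if not stages or stages[-1] != status:
            stages.append(status)
        if status in TERMINAL_STATES:
            return stages
        sleep(interval)
```

For anything beyond a quick script, prefer webhooks over a loop like this, as noted in the Pro Tip above.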

Next Steps

You’ve successfully run your first pipeline! Now you can explore more advanced topics:

  • API Reference: Dive deeper into the available functions and objects.
  • Advanced Topics: Learn about webhooks, handling partial failures, and the metadata schema.
  • Production Guide: Best practices for running the SDK in a production environment.
  • Examples: See complete examples for different use cases.