# Building a Target Schema

> Learn how to create effective target schemas for your data transformations

Whether you are building a schema from scratch or customizing an existing one, this guide walks you through the process.

## Schema Basics

Target schemas in Lume are written in YAML and define your desired output structure. A target schema requires a `models` section, and each model entry requires a `name` field and a `columns` section. (Some short snippets in this guide omit the model `name` for brevity.) Each column entry contains the following:

1. A field name
2. A clear description of the field's business meaning and context
3. A set of tests to validate your transformed data

<CodeGroup>
  ```yaml Basic Field theme={null}
  models:
    - columns:
        - name: customer_name
          description: "The full legal name of the customer as it appears on official documents"
  ```

  ```yaml Business Context theme={null}
  models:
    - columns:
        - name: revenue
          description: "Monthly recurring revenue in USD, calculated at the end of each calendar month. Excludes one-time purchases and refunds"
          tests:
          - not_null
  ```
</CodeGroup>

<Info>
  Write clear, specific descriptions that explain your business's unique requirements and context. For example, specify if "revenue" means monthly recurring revenue, annual revenue, or revenue before returns. Learn more about writing effective descriptions in our [Creating Field Descriptions](/pages/documentation/project_guides/creating_descriptions) guide.
</Info>
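Putting the required pieces together, a minimal model entry might look like the sketch below. The model name `customers` and the choice of `not_null` as the test are illustrative, not requirements.

```yaml
# Minimal sketch of a target schema: one model with one fully specified column.
models:
  - name: customers            # required: model name (illustrative value)
    columns:                   # required: list of column entries
      - name: customer_name    # 1. field name
        # 2. description of the field's business meaning and context
        description: "The full legal name of the customer as it appears on official documents"
        tests:                 # 3. tests that validate the transformed data
          - not_null
```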

## Defining Enums for Classifications

Within your target schema, you can define an enum set with the `accepted_values` test, which triggers Lume's classification module. The module maps each transformed source value to one of your options when a suitable match exists. The following example defines the enum set Apparel, Electronics, and Perishable for the `category` field:

<CodeGroup>
  ```yaml Classifications theme={null}
  models:
    - columns:
        - name: category
          description: "Category of the product"
          tests:
            - accepted_values:
                values:
                  - Apparel
                  - Electronics
                  - Perishable
  ```
</CodeGroup>

<Info>
  Classification for SQL projects is coming soon!
</Info>
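Because classification chooses among your enum options, it may help to state in the description what each option covers. The sketch below assumes the description is taken into account during classification, which this guide does not confirm. The model name `products` and the option definitions are illustrative.

```yaml
models:
  - name: products   # illustrative model name
    columns:
      - name: category
        # Folded scalar: the three lines below form one description string.
        description: >-
          Category of the product. Apparel covers clothing and footwear,
          Electronics covers powered devices and accessories, and
          Perishable covers food and other goods with an expiry date.
        tests:
          - accepted_values:
              values:
                - Apparel
                - Electronics
                - Perishable
```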

## Defining Code Generation Language Preference

You can specify, per model, the language in which Lume's AI engine generates code. Here is a quick example:

```yaml Language Specification theme={null}
models:
  - name: orders
    language: python
    columns:
      - name: order_id
```

<Info>
  Lume currently supports code generation in both SQL and Python.
</Info>
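Since the preference is set per model, a single schema could mix languages. The following is a sketch only: the value `sql` is an assumption based on the supported languages listed above (this guide shows only `python` explicitly), and the model and column names are illustrative.

```yaml
models:
  - name: orders
    language: python   # value shown in the example above
    columns:
      - name: order_id
        description: "Primary key for the order"

  - name: customers
    language: sql      # assumed spelling of the SQL option
    columns:
      - name: customer_id
        description: "Unique identifier for each customer"
```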

## Types of Default Tests

The YAML schema provides four built-in tests: `unique`, `not_null`, `accepted_values`, and `relationships`. The examples below apply each test to an `orders` model:

<AccordionGroup>
  <Accordion title="Uniqueness Test">
    ```yaml theme={null}
    models:
      - name: orders
        columns:
          - name: order_id
            tests:
              - unique
    ```
  </Accordion>

  <Accordion title="Nullability Test">
    ```yaml theme={null}
    models:
      - name: orders
        columns:
          - name: order_id
            tests:
              - not_null
    ```
  </Accordion>

  <Accordion title="Accepted Values Test">
    ```yaml theme={null}
    models:
      - name: orders
        columns:
          - name: status
            tests:
              - accepted_values:
                  values: ['placed', 'shipped', 'completed', 'returned']
    ```
  </Accordion>

  <Accordion title="Relationships Test">
    ```yaml theme={null}
    models:
      - name: orders
        columns:
          - name: customer_id
            tests:
              - relationships:
                  to: ref('customers')
                  field: id
    ```
  </Accordion>
</AccordionGroup>

Lume also provides built-in support for [dbt-utils tests](https://github.com/dbt-labs/dbt-utils?tab=readme-ov-file#generic-tests).
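As a sketch of what a dbt-utils test could look like alongside the default tests, the example below uses `dbt_utils.accepted_range`, a generic test from the dbt-utils package. It assumes Lume accepts the same syntax that dbt uses for package tests. The `order_total` column is hypothetical.

```yaml
models:
  - name: orders
    columns:
      - name: order_total   # hypothetical column
        description: "Total amount charged for the order in USD, including tax and shipping"
        tests:
          - not_null
          - dbt_utils.accepted_range:   # generic test from dbt-utils
              min_value: 0              # reject negative totals
              inclusive: true           # a total of exactly 0 passes
```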

## Complete Example

Here's a complete target schema example:

```yaml theme={null}
models:
  - name: customers
    description: "Customer records and metadata"
    columns:
      - name: customer_id
        description: "Unique identifier for each customer"
        tests:
          - not_null
          - unique

      - name: customer_name
        description: "Full legal name of the customer"
        tests:
          - not_null

      - name: customer_type
        description: "Type of customer (e.g., individual, business)"
        tests:
          - accepted_values:
              values: ["individual", "business"]

  - name: orders
    description: "All customer orders"
    columns:
      - name: order_id
        description: "Primary key for the order"
        tests:
          - not_null
          - unique

      - name: customer_id
        description: "Foreign key to customers"
        tests:
          - not_null
          - relationships:
              to: ref('customers')
              field: customer_id

      - name: status
        description: "Current status of the order"
        tests:
          - accepted_values:
              values: ["pending", "shipped", "delivered", "cancelled"]

  - name: payments
    description: "Payments made toward orders"
    columns:
      - name: payment_id
        description: "Unique ID for the payment record"
        tests:
          - not_null
          - unique

      - name: order_id
        description: "Associated order ID"
        tests:
          - relationships:
              to: ref('orders')
              field: order_id

      - name: payment_method
        description: "Method of payment (e.g., credit card, PayPal)"
        tests:
          - accepted_values:
              values: ["credit_card", "paypal", "bank_transfer"]
```

<Accordion title="Ecommerce Product Catalog Mapping">
  When working with ecommerce data, product catalogs often require specific schema structures to handle product attributes, variants, and categorization. Here's an example schema that demonstrates common ecommerce patterns:

  ```yaml theme={null}
  models:
    - name: products
      description: "Core product information and metadata"
      columns:
        - name: product_id
          description: "Unique identifier for each product (SKU)"
          tests:
            - not_null
            - unique

        - name: product_name
          description: "Display name of the product"
          tests:
            - not_null

        - name: product_type
          description: "Kind of product sold (e.g., physical, digital, subscription)"
          tests:
            - accepted_values:
                values: ["physical", "digital", "subscription", "service"]

        - name: brand
          description: "Manufacturer or brand name"
          tests:
            - not_null

        - name: status
          description: "Current product status in the catalog"
          tests:
            - accepted_values:
                values: ["active", "draft", "archived", "discontinued"]

        - name: category
          description: "Primary product category"
          tests:
            - accepted_values:
                values: ["clothing", "electronics", "home", "beauty", "sports"]

    - name: product_variants
      description: "Product variations (size, color, etc.)"
      columns:
        - name: variant_id
          description: "Unique identifier for the variant"
          tests:
            - not_null
            - unique

        - name: product_id
          description: "Reference to parent product"
          tests:
            - not_null
            - relationships:
                to: ref('products')
                field: product_id

        - name: color
          description: "Product color variant"
          tests:
            - accepted_values:
                values: ["red", "blue", "green", "black", "white", "yellow"]

        - name: size
          description: "Product size variant"
          tests:
            - accepted_values:
                values: ["XS", "S", "M", "L", "XL", "XXL"]

        - name: material
          description: "Product material variant"
          tests:
            - accepted_values:
                values: ["cotton", "polyester", "wool", "silk", "leather"]
  ```

  <Warning>
    Lume currently supports only single-level category hierarchies in the schema definition. If your product catalog requires multiple category levels (e.g., Clothing > Men > Shirts > T-Shirts), please contact Lume support for assistance with implementing a custom solution.
  </Warning>

  <Info>
    For ecommerce catalogs, pay special attention to:

    * Product variants and their relationships
    * Category enumerations
    * Product status workflows
    * Brand and manufacturer relationships
  </Info>
</Accordion>

## Best Practices

1. **Clear Descriptions**: Write clear, specific descriptions that explain the business meaning of each field
2. **Test Rules**: Add test rules where appropriate to ensure data quality
3. **Consistent Naming**: Use consistent field naming conventions throughout your schema

<Note>
  Remember: Focus on describing what each field means, not how to transform it. Lume handles the transformation logic automatically!
</Note>
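To illustrate the difference between describing meaning and describing a transformation, here is a small contrast. The column `order_total` and the source field `amount_cents` are hypothetical names used only for this sketch.

```yaml
models:
  - name: orders
    columns:
      # Less helpful: describes a transformation step instead of the meaning.
      # - name: order_total
      #   description: "Divide amount_cents by 100 and round to 2 decimals"

      # More helpful: states what the value represents in business terms.
      - name: order_total
        description: "Total amount charged for the order in USD, including tax and shipping"
        tests:
          - not_null
```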

<Note>
  Property names in Lume's API cannot contain periods (.).
</Note>
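A minimal sketch of a period-free name, assuming the restriction applies to column names in the target schema. The field shown is hypothetical.

```yaml
models:
  - name: customers
    columns:
      # Avoid a name such as "billing.postal_code", which contains a period.
      - name: billing_postal_code   # hypothetical field; underscore replaces the period
        description: "Postal code of the customer's billing address"
```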

<Warning>
  Lume does not currently support custom tests or macros.

  * ❌ `our_custom_macros_test`
  * ✅ `not_null`
  * ✅ `unique`
</Warning>
