# Building a Target Schema

> Learn how to create effective target schemas for your data transformations

<Note>
  The quickest way to get started with flat, tabular data is to upload a sample CSV file with your desired output format. Lume will automatically generate a target schema for you! For nested or complex data structures, we recommend building your schema manually using this guide.
</Note>

If you prefer to build your schema manually or need to customize an existing one, this guide will walk you through the process.

## Schema Basics

Target schemas in Lume use [JSON Schema](https://json-schema.org/understanding-json-schema/) format to define your desired output structure. Each field requires:

1. A field name
2. One or more data types
3. A clear description of the field's business meaning and context

<CodeGroup>
  ```json Basic Field theme={null}
  {
    "customer_name": {
      "type": ["string"],
      "description": "The full legal name of the customer as it appears on official documents"
    }
  }
  ```

  ```json Business Context theme={null}
  {
    "revenue": {
      "type": ["number"],
      "description": "Monthly recurring revenue in USD, calculated at the end of each calendar month. Excludes one-time purchases and refunds"
    }
  }
  ```
</CodeGroup>

<Info>
  Write clear, specific descriptions that explain your business's unique requirements and context. For example, specify if "revenue" means monthly recurring revenue, annual revenue, or revenue before returns. Learn more about writing effective descriptions in our [Creating Field Descriptions](/pages/documentation/guides/creating_descriptions) guide.
</Info>
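
As a rough illustration of the three requirements above, the following Python sketch flags fields that lack a type list or a description. The `find_incomplete_fields` helper is hypothetical and not part of Lume; it only inspects a plain dictionary of field definitions.

```python
def find_incomplete_fields(fields):
    """Return a dict mapping field name -> list of problems found.

    Hypothetical helper for illustration; not a Lume API.
    """
    problems = {}
    for name, spec in fields.items():
        issues = []
        types = spec.get("type")
        # This guide writes types as a non-empty list, e.g. ["string"].
        if not isinstance(types, list) or not types:
            issues.append("missing type list")
        description = spec.get("description", "")
        if not isinstance(description, str) or not description.strip():
            issues.append("missing description")
        if issues:
            problems[name] = issues
    return problems


fields = {
    "customer_name": {
        "type": ["string"],
        "description": "The full legal name of the customer as it appears on official documents",
    },
    "revenue": {"type": ["number"]},  # description left out on purpose
}

report = find_incomplete_fields(fields)
print(report)
```

Running a check like this before uploading a schema can catch fields whose description was forgotten.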

## Field Types

Common JSON Schema types include:

* `string`: Text data
* `number`: Numeric values, including decimals
* `integer`: Whole numbers
* `boolean`: True/false values
* `array`: Lists of values
* `object`: Nested structures
* `null`: An explicit null value; combine it with another type, such as `["string", "null"]`, to make a field nullable
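
To make the type names concrete, here is a minimal Python sketch of how a parsed JSON value could be compared against a list of type names. The `matches_type` function is a hypothetical helper, not how Lume evaluates types, and it simplifies one detail: floats such as `3.0` are never treated as integers.

```python
def matches_type(value, type_names):
    """Return True if value fits at least one JSON Schema type name."""

    def fits(name):
        if name == "null":
            return value is None
        if name == "boolean":
            return isinstance(value, bool)
        if name == "string":
            return isinstance(value, str)
        if name == "integer":
            # bool is a subclass of int in Python, so exclude it explicitly.
            return isinstance(value, int) and not isinstance(value, bool)
        if name == "number":
            return isinstance(value, (int, float)) and not isinstance(value, bool)
        if name == "array":
            return isinstance(value, list)
        if name == "object":
            return isinstance(value, dict)
        return False  # unknown type name

    return any(fits(name) for name in type_names)


print(matches_type(None, ["string", "null"]))  # nullable string accepts None
print(matches_type(2.5, ["integer"]))          # a decimal is not an integer
```

Listing more than one type name widens what a field accepts, which is why a nullable field lists both its main type and `null`.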

## Data Classification with Enums

Use the `enum` keyword to restrict a field to a fixed set of categories:

```json theme={null}
{
  "subscription_tier": {
    "type": ["string"],
    "description": "The customer's subscription level",
    "enum": ["free", "basic", "premium", "enterprise"]
  }
}
```
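
An enum comparison is an exact match, so casing and spelling matter. The Python sketch below shows the idea with a hypothetical `check_enum` helper that is not part of Lume.

```python
subscription_tier = {
    "type": ["string"],
    "description": "The customer's subscription level",
    "enum": ["free", "basic", "premium", "enterprise"],
}


def check_enum(value, spec):
    """Return True when the value is one of the spec's enum options.

    Fields without an "enum" keyword accept any value here.
    """
    allowed = spec.get("enum")
    return allowed is None or value in allowed


print(check_enum("premium", subscription_tier))  # True
print(check_enum("Premium", subscription_tier))  # False: comparison is case-sensitive
print(check_enum("gold", subscription_tier))     # False: not in the list
```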

## Validation Rules

JSON Schema provides validation keywords for strings, numbers, and common formats:

<AccordionGroup>
  <Accordion title="String Validation">
    ```json theme={null}
    {
      "phone_number": {
        "type": ["string"],
        "description": "Customer's contact phone number in E.164 format",
        "pattern": "^\\+?[1-9]\\d{1,14}$",
        "minLength": 10,
        "maxLength": 16
      }
    }
    ```
  </Accordion>

  <Accordion title="Numeric Validation">
    ```json theme={null}
    {
      "age": {
        "type": ["integer"],
        "description": "Customer's age in years",
        "minimum": 0,
        "maximum": 120
      },
      "success_rate": {
        "type": ["number"],
        "description": "Success rate as a decimal fraction from 0 up to, but not including, 1",
        "minimum": 0,
        "exclusiveMaximum": 1
      }
    }
    ```
  </Accordion>

  <Accordion title="Format Specifications">
    ```json theme={null}
    {
      "email": {
        "type": ["string"],
        "description": "Primary contact email address",
        "format": "email"
      },
      "website": {
        "type": ["string"],
        "description": "Company website URL",
        "format": "uri"
      },
      "created_at": {
        "type": ["string"],
        "description": "Account creation timestamp",
        "format": "date-time"
      }
    }
    ```
  </Accordion>
</AccordionGroup>
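
To see what the `pattern`, `minimum`, and `maximum` keywords accept, the following Python sketch tries a few values. It uses Python's `re` module, which is close to but not identical with the ECMAScript regex dialect that JSON Schema recommends, so treat it as an approximation. The helper names are hypothetical.

```python
import re

# Pattern copied from the phone_number example. In JSON the backslashes are
# doubled; in a Python raw string they are written once.
PHONE_PATTERN = r"^\+?[1-9]\d{1,14}$"


def matches_pattern(value, pattern):
    """JSON Schema patterns are not implicitly anchored, so use search()."""
    return re.search(pattern, value) is not None


def in_range(value, minimum=None, maximum=None):
    """Inclusive bounds check, mirroring "minimum" and "maximum"."""
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True


print(matches_pattern("+14155552671", PHONE_PATTERN))  # True
print(matches_pattern("415-555-2671", PHONE_PATTERN))  # False: hyphens not allowed
print(in_range(120, minimum=0, maximum=120))           # True: bounds are inclusive
print(in_range(121, minimum=0, maximum=120))           # False
```

Because the pattern carries its own `^` and `$` anchors, it must match the whole value rather than a substring.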

## Complete Example

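Here's a complete target schema. The field definitions sit under `properties` on a top-level object, and `required` lists the fields that must always be present: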

```json theme={null}
{
  "type": "object",
  "properties": {
    "customer_id": {
      "type": ["string"],
      "description": "Unique identifier for the customer",
      "pattern": "^CUST\\d{6}$"
    },
    "full_name": {
      "type": ["string"],
      "description": "Customer's full legal name"
    },
    "email": {
      "type": ["string", "null"],
      "description": "Primary contact email address",
      "format": "email"
    },
    "account_type": {
      "type": ["string"],
      "description": "Type of account held by the customer",
      "enum": ["personal", "business", "enterprise"]
    },
    "monthly_spend": {
      "type": ["number"],
      "description": "Average monthly spend in USD",
      "minimum": 0
    },
    "is_active": {
      "type": ["boolean"],
      "description": "Whether the customer account is currently active"
    }
  },
  "required": ["customer_id", "full_name", "account_type"]
}
```
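
The `required` keyword checks only that a key is present; whether the value may be null is controlled separately by the field's type list. The Python sketch below illustrates this with a trimmed copy of the schema above. The `missing_required` helper and the sample record are made up for illustration.

```python
# Trimmed copy of the complete example: only the keys this sketch needs.
schema = {
    "type": "object",
    "properties": {
        "customer_id": {"type": ["string"]},
        "full_name": {"type": ["string"]},
        "email": {"type": ["string", "null"]},
        "account_type": {"type": ["string"]},
    },
    "required": ["customer_id", "full_name", "account_type"],
}


def missing_required(record, schema):
    """List required field names that are absent from the record."""
    return [name for name in schema.get("required", []) if name not in record]


# Sample record invented for this sketch: email is null, account_type is absent.
record = {"customer_id": "CUST000123", "full_name": "Ada Lovelace", "email": None}
print(missing_required(record, schema))  # ['account_type']
```

Here `email` raises no complaint because it is not listed in `required`, while the absent `account_type` is reported.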

## Best Practices

1. **Clear Descriptions**: Explain the business meaning of each field in specific terms, including units and how the value is calculated
2. **Appropriate Types**: Use the most specific type that fits each field, such as `integer` instead of `number` for counts
3. **Validation Rules**: Add validation rules such as `pattern`, `minimum`, or `enum` where they reflect real constraints, to ensure data quality
4. **Required Fields**: List essential fields in the schema's `required` array
5. **Consistent Naming**: Pick one naming convention, such as snake_case, and use it for every field

<Note>
  Remember: Focus on describing what each field means, not how to transform it. Lume handles the transformation logic automatically!
</Note>

## Advanced Schema Structures

### Nested Objects

Your schema can include nested objects to represent complex data structures:

```json theme={null}
{
  "billing_address": {
    "type": ["object"],
    "description": "Customer's billing address details",
    "properties": {
      "street": {
        "type": ["string"],
        "description": "Street address including unit number"
      },
      "city": {
        "type": ["string"],
        "description": "City name"
      },
      "state": {
        "type": ["string"],
        "description": "State or province code",
        "minLength": 2,
        "maxLength": 2
      },
      "postal_code": {
        "type": ["string"],
        "description": "Postal or ZIP code"
      }
    }
  }
}
```
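
When a schema nests objects, it can help to list every leaf field to confirm nothing was missed. The Python sketch below walks a dictionary of field definitions and yields each leaf as a tuple of names. The `leaf_paths` helper is hypothetical and not part of Lume.

```python
def leaf_paths(fields, prefix=()):
    """Yield a tuple of field names for every non-object field."""
    for name, spec in fields.items():
        path = prefix + (name,)
        if "object" in spec.get("type", []) and "properties" in spec:
            # Descend into the nested object's own field definitions.
            yield from leaf_paths(spec["properties"], path)
        else:
            yield path


fields = {
    "billing_address": {
        "type": ["object"],
        "description": "Customer's billing address details",
        "properties": {
            "street": {"type": ["string"], "description": "Street address including unit number"},
            "city": {"type": ["string"], "description": "City name"},
        },
    },
    "full_name": {"type": ["string"], "description": "Customer's full legal name"},
}

for path in leaf_paths(fields):
    print(path)
```

Paths are kept as tuples rather than joined into a single string, which avoids choosing a separator character.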

### Arrays

Use arrays to represent lists of values or objects:

<CodeGroup>
  ```json Simple Array theme={null}
  {
    "tags": {
      "type": ["array"],
      "description": "List of tags associated with the customer",
      "items": {
        "type": ["string"]
      }
    }
  }
  ```

  ```json Array of Objects theme={null}
  {
    "order_history": {
      "type": ["array"],
      "description": "Customer's previous orders",
      "items": {
        "type": ["object"],
        "properties": {
          "order_id": {
            "type": ["string"],
            "description": "Unique identifier for the order"
          },
          "amount": {
            "type": ["number"],
            "description": "Order total in USD"
          },
          "item_count": {
            "type": ["integer"],
            "description": "Number of items in the order"
          }
        }
      }
    }
  }
  ```
</CodeGroup>
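
The `items` keyword applies one schema to every element of an array. As a minimal sketch, the Python below reports which elements of a list do not match the declared item types. The `invalid_item_indexes` helper is hypothetical, and its `CHECKS` table covers only the few types this example needs.

```python
# Type checks for the handful of type names used in this sketch.
CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "null": lambda v: v is None,
}

tags_spec = {
    "type": ["array"],
    "description": "List of tags associated with the customer",
    "items": {"type": ["string"]},
}


def invalid_item_indexes(values, spec):
    """Return the positions of elements that match none of the item types."""
    item_types = spec["items"]["type"]
    return [
        index
        for index, value in enumerate(values)
        if not any(CHECKS[name](value) for name in item_types)
    ]


print(invalid_item_indexes(["vip", "beta", 42, None], tags_spec))  # [2, 3]
```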

### Database-Based Schemas

Your schema can mirror database tables and relationships:

```json theme={null}
{
  "user": {
    "type": ["object"],
    "description": "User record from the database",
    "properties": {
      "id": {
        "type": ["integer"],
        "description": "Primary key from users table"
      },
      "departments": {
        "type": ["array"],
        "description": "Departments this user belongs to",
        "items": {
          "type": ["object"],
          "properties": {
            "dept_id": {
              "type": ["integer"],
              "description": "Foreign key to departments table"
            },
            "role": {
              "type": ["string"],
              "description": "User's role in this department",
              "enum": ["member", "lead", "manager"]
            }
          }
        }
      }
    }
  }
}
```
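
If the tables already exist, a schema skeleton can be drafted from their column definitions and then completed by hand. The Python sketch below does this for an in-memory SQLite table. The `users` table, the `SQL_TO_JSON` mapping, and the `properties_from_table` helper are all assumptions made for illustration, and the descriptions still have to be written manually.

```python
import sqlite3

# Assumed mapping from SQLite declared types to JSON Schema type names.
SQL_TO_JSON = {"INTEGER": "integer", "TEXT": "string", "REAL": "number"}


def properties_from_table(conn, table):
    """Build a field skeleton from SQLite's PRAGMA table_info output."""
    properties = {}
    # The table name is interpolated directly, so only pass trusted names.
    rows = conn.execute(f"PRAGMA table_info({table})")
    for _cid, name, sql_type, notnull, _default, pk in rows:
        types = [SQL_TO_JSON.get(sql_type.upper(), "string")]
        # Columns that are neither NOT NULL nor part of the primary key
        # are treated as nullable.
        if not notnull and not pk:
            types.append("null")
        properties[name] = {"type": types, "description": "TODO: describe this field"}
    return properties


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT NOT NULL)")
props = properties_from_table(conn, "users")
for field, spec in props.items():
    print(field, spec["type"])
```

The placeholder descriptions are deliberate: column names alone rarely carry the business meaning that a target schema needs.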

<Warning>
  Field names cannot contain periods (`.`) because the period is a protected character in Lume. Use underscores or camelCase instead:

  * ❌ `user.first.name`
  * ✅ `user_first_name`
  * ✅ `userFirstName`
</Warning>
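
A quick scan can catch offending names before a schema is submitted. The Python sketch below walks a dictionary of field definitions, including nested objects and arrays of objects, and returns every field name that contains a period. The `names_with_periods` helper is hypothetical and not part of Lume.

```python
def names_with_periods(fields, prefix=()):
    """Return paths (as tuples) of field names that contain a period."""
    found = []
    for name, spec in fields.items():
        path = prefix + (name,)
        if "." in name:
            found.append(path)
        # Recurse into nested objects and into objects inside arrays.
        nested = spec.get("properties") or spec.get("items", {}).get("properties")
        if nested:
            found.extend(names_with_periods(nested, path))
    return found


fields = {
    "user.first.name": {"type": ["string"], "description": "User's first name"},
    "billing_address": {
        "type": ["object"],
        "description": "Customer's billing address details",
        "properties": {
            "postal.code": {"type": ["string"], "description": "Postal or ZIP code"},
            "city": {"type": ["string"], "description": "City name"},
        },
    },
}

for path in names_with_periods(fields):
    # Suggest the underscore form recommended above.
    print(path, "->", path[-1].replace(".", "_"))
```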
