
# Core Concepts

<Note>
  Understanding these foundational concepts will help you make the most of Lume's capabilities. This guide introduces the key components and how they work together.
</Note>

## Project Basics

<CardGroup>
  <Card title="Source Data" icon="database" href="#source-data">
    The input data you want to transform
  </Card>

  <Card title="Target Schema" icon="bullseye" href="#target-schema">
    The desired structure for your output data
  </Card>
</CardGroup>

### Source Data

Source data is any user-provided data that you want to interpret or transform. Lume currently supports CSV files, and you can upload multiple CSV files to work with. All data must be structured, meaning:

* The first row must contain column headers/names
* Each subsequent row must follow the same structure
* Data should be organized in a tabular format
* Each column should contain consistent data types
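
The structural requirements above can be checked before upload. The following is a minimal sketch using only the Python standard library; `check_tabular` is a hypothetical helper written for illustration and is not part of Lume. It covers the header and row-shape rules only, not data-type consistency.

```python
import csv
import io


def check_tabular(csv_text):
    """Return a list of structural problems found in csv_text (empty if none)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]

    header = rows[0]
    problems = []

    # The first row must contain column names, so none of them may be blank.
    if any(not name.strip() for name in header):
        problems.append("header contains an empty column name")

    # Every subsequent row must have the same number of fields as the header.
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {line_no} has {len(row)} fields, expected {len(header)}"
            )
    return problems


good = "customer_id,customer_name\nC1,Blue Sky Ventures LLC\n"
bad = "customer_id,customer_name\nC1\n"

print(check_tabular(good))  # no problems reported
print(check_tabular(bad))   # reports the short row
```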

There are two types of source files you can work with:

1. **Source Data**: Your primary customer or business data that needs to be transformed
2. **Seed Data**: External or internal enhancement data that can be used to enrich your source data. This includes:
   * Reference data (e.g., country codes, state abbreviations)
   * Lookup tables
   * Master data
   * Any non-customer data that helps enhance your primary dataset

<Info>
  While Lume only requires a single record to generate mapping logic, providing larger data samples improves mapping accuracy through better pattern recognition.
</Info>

<Note>
  Support for JSON and XML formats is coming soon. In the meantime, we recommend converting these files to CSV. If you need help with conversion, or need support for additional data formats, contact the Lume team.
</Note>

<Accordion title="Example Source Data">
  ```csv
  # customers.csv - Primary source data
  customer_id,customer_name,industry_code,region_code
  0018y000008hFqqAAE,Blue Sky Ventures LLC,IND001,REG01
  0018y000008nrMrAAI,Green Acres Holdings LLC,IND002,REG02
  0018y000008pQrStAAE,Sunset Technologies Inc,IND001,REG03
  ```

  ```csv
  # orders.csv - Additional source data
  order_id,customer_id,order_date,order_amount,status
  ORD001,0018y000008hFqqAAE,2024-01-15,1500.00,completed
  ORD002,0018y000008hFqqAAE,2024-02-01,2300.00,pending
  ORD003,0018y000008nrMrAAI,2024-01-20,950.00,completed
  ORD004,0018y000008pQrStAAE,2024-02-05,3200.00,processing
  ```

  ```csv
  # industry_codes.csv - Seed data for enrichment
  industry_code,industry_name,industry_category
  IND001,Technology,Professional Services
  IND002,Healthcare,Healthcare
  IND003,Manufacturing,Industrial
  ```
</Accordion>
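
To make the role of seed data concrete, here is a minimal sketch of the kind of enrichment a seed file enables: a left join from customer rows to the industry lookup on `industry_code`. It reuses a subset of the sample rows above. The `read_rows` and `enrich` helpers are hypothetical names for illustration; Lume generates its own transformation logic, and this is not that logic.

```python
import csv
import io

CUSTOMERS_CSV = """customer_id,customer_name,industry_code,region_code
0018y000008hFqqAAE,Blue Sky Ventures LLC,IND001,REG01
0018y000008nrMrAAI,Green Acres Holdings LLC,IND002,REG02
"""

INDUSTRY_CODES_CSV = """industry_code,industry_name,industry_category
IND001,Technology,Professional Services
IND002,Healthcare,Healthcare
"""


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def enrich(customers, industry_codes):
    """Left-join customers to the seed table on industry_code."""
    lookup = {row["industry_code"]: row for row in industry_codes}
    enriched = []
    for customer in customers:
        # Customers with no matching seed row keep None for the new columns.
        seed = lookup.get(customer["industry_code"], {})
        enriched.append({
            **customer,
            "industry_name": seed.get("industry_name"),
            "industry_category": seed.get("industry_category"),
        })
    return enriched


enriched = enrich(read_rows(CUSTOMERS_CSV), read_rows(INDUSTRY_CODES_CSV))
for row in enriched:
    print(row["customer_name"], "->", row["industry_name"])
```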

### Target Schema

A target schema defines the desired output format for your transformed data. It uses YAML format to specify:

* Target Model Name
* Column Names
* Test Rules
* Business Logic and Descriptions

For more details on building target schemas, see our [Building Target Schemas](/pages/documentation/project_guides/building_target_schema) guide.

<Accordion title="Example Target Schema">
  ```yaml
  models:
  - name: customers
    description: "Customer records and metadata"
    columns:
      - name: customer_id
        description: "Unique identifier for each customer"
        tests:
          - not_null
          - unique

      - name: customer_name
        description: "Full legal name of the customer"
        tests:
          - not_null

      - name: customer_type
        description: "Type of customer (e.g., individual, business)"
        tests:
          - accepted_values:
              values: ["individual", "business"]

  - name: orders
    description: "All customer orders"
    columns:
      - name: order_id
        description: "Primary key for the order"
        tests:
          - not_null
          - unique

      - name: customer_id
        description: "Foreign key to customers"
        tests:
          - not_null
          - relationships:
              to: ref('customers')
              field: customer_id

      - name: status
        description: "Current status of the order"
        tests:
          - accepted_values:
              values: ["pending", "shipped", "delivered", "cancelled"]

  - name: payments
    description: "Payments made toward orders"
    columns:
      - name: payment_id
        description: "Unique ID for the payment record"
        tests:
          - not_null
          - unique

      - name: order_id
        description: "Associated order ID"
        tests:
          - relationships:
              to: ref('orders')
              field: order_id

      - name: payment_method
        description: "Method of payment (e.g., credit card, PayPal)"
        tests:
          - accepted_values:
              values: ["credit_card", "paypal", "bank_transfer"]
  ```
</Accordion>
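
The test rules in a target schema describe conditions the output rows should satisfy. The sketch below illustrates what `not_null`, `unique`, and `accepted_values` mean when applied to rows. It assumes dbt-style semantics (for example, `unique` ignoring empty values), which is an assumption rather than a statement of how Lume evaluates tests. `run_tests` and `ORDERS_COLUMNS` are hypothetical names, and the column definitions are written as a Python structure mirroring the `orders` model above rather than parsed from YAML.

```python
# Mirrors two columns of the `orders` model from the example schema.
ORDERS_COLUMNS = [
    {"name": "order_id", "tests": ["not_null", "unique"]},
    {
        "name": "status",
        "tests": [
            {"accepted_values": {
                "values": ["pending", "shipped", "delivered", "cancelled"]
            }}
        ],
    },
]


def run_tests(rows, columns):
    """Return human-readable failures for the tests declared on each column."""
    failures = []
    for column in columns:
        name = column["name"]
        values = [row.get(name) for row in rows]
        present = [v for v in values if v not in (None, "")]
        for test in column.get("tests", []):
            if test == "not_null":
                if len(present) != len(values):
                    failures.append(f"{name}: not_null failed")
            elif test == "unique":
                # Assumed semantics: empty values are ignored by `unique`.
                if len(set(present)) != len(present):
                    failures.append(f"{name}: unique failed")
            elif isinstance(test, dict) and "accepted_values" in test:
                allowed = set(test["accepted_values"]["values"])
                bad = sorted({v for v in present if v not in allowed})
                if bad:
                    failures.append(f"{name}: accepted_values failed for {bad}")
    return failures


rows = [
    {"order_id": "ORD001", "status": "pending"},
    {"order_id": "ORD001", "status": "completed"},  # duplicate id, unknown status
]
for failure in run_tests(rows, ORDERS_COLUMNS):
    print(failure)
```

In this sample the duplicate `order_id` and the `completed` status are both reported, which shows why source values often need to be mapped onto the accepted values during transformation.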

## Key Components

<CardGroup>
  <Card title="Projects" icon="diagram-project" href="#projects">
    Orchestrate your data transformation journey
  </Card>

  <Card title="Project Versions" icon="database" href="#project-versions">
    Manage versions of a Project
  </Card>

  <Card title="File Manager" icon="check-double" href="#file-manager">
    Manage files for a given project
  </Card>

  <Card title="Workbook" icon="wand-magic-sparkles" href="#workbook">
    AI-powered data transformation
  </Card>

  <Card title="Code" icon="play" href="#code">
    Execute and monitor your transformations
  </Card>

  <Card title="Lineage" icon="chart-line" href="#lineage">
    Track table and column lineage
  </Card>

  <Card title="Data" icon="rotate" href="#data">
    Easily view the transformed data
  </Card>
</CardGroup>

### Projects

A Project is your complete data transformation pipeline. It can:

* Accept multiple file inputs (sources and seeds)
* Include multiple transformation steps
* Join and combine data
* Produce final mapped output

Projects help you organize related transformations into logical sequences. Complex transformations can be broken down into manageable steps, making them easier to maintain and modify.

<img
  src="https://mintcdn.com/lume/uQrHt4yL2wp-pTuy/images/projects.png?fit=max&auto=format&n=uQrHt4yL2wp-pTuy&q=85&s=ac2308c6ec5d9edab3e9146cab753f69"
  alt="Projects View"
  style={{ 
border: '16px solid #e2e8f0',
borderRadius: '8px',
width: '100%',
maxWidth: '100%'
}}
  width="1721"
  height="928"
  data-path="images/projects.png"
/>

### Project Versions

The Project Versions view is where you manage the different versions of your project and the runs associated with each version. Lume automatically snapshots a version of the project whenever an edit results in a change to the code. Such edits include changes to:

* Code
* Source Schema
* Target Schema
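
One way to picture change-triggered snapshots is a fingerprint over the three inputs listed above: a new version is recorded only when the fingerprint differs from the latest one. This is a conceptual sketch, not Lume's implementation. `fingerprint` and `VersionHistory` are hypothetical names, and the schemas are simplified to plain dictionaries.

```python
import hashlib
import json


def fingerprint(code, source_schema, target_schema):
    """Hash the three inputs so that any change yields a different digest."""
    payload = json.dumps(
        {
            "code": code,
            "source_schema": source_schema,
            "target_schema": target_schema,
        },
        sort_keys=True,  # key order must not affect the digest
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class VersionHistory:
    """Keeps one digest per snapshot, oldest first."""

    def __init__(self):
        self.versions = []

    def record(self, code, source_schema, target_schema):
        """Snapshot if anything changed; return True when a version is added."""
        digest = fingerprint(code, source_schema, target_schema)
        if self.versions and self.versions[-1] == digest:
            return False
        self.versions.append(digest)
        return True


history = VersionHistory()
source = {"customer_id": "text"}
target = {"customer_id": "text"}

print(history.record("select customer_id from customers", source, target))
print(history.record("select customer_id from customers", source, target))
target["customer_name"] = "text"  # a target schema edit
print(history.record("select customer_id from customers", source, target))
```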

<img
  src="https://mintcdn.com/lume/uQrHt4yL2wp-pTuy/images/project_versions.png?fit=max&auto=format&n=uQrHt4yL2wp-pTuy&q=85&s=6ee5b761ae9d9891c995d57bc4e37610"
  alt="Project Versions View"
  style={{ 
border: '16px solid #e2e8f0',
borderRadius: '8px',
width: '100%',
maxWidth: '100%'
}}
  width="5112"
  height="2634"
  data-path="images/project_versions.png"
/>

### File Manager

The File Manager is where you manage and access your uploaded data. In it, you can:

* View metadata per model: row count, column count, and file size.
* Insert, upsert, and remove source tables and seed files.
* Add context to the source table description to guide the AI generation.
* Provide column-level metadata on data type, nullability, and additional notes.

<img
  src="https://mintcdn.com/lume/uQrHt4yL2wp-pTuy/images/file_manager.png?fit=max&auto=format&n=uQrHt4yL2wp-pTuy&q=85&s=a2c35a49983482e348d09f4f99db69e0"
  alt="File Manager View"
  style={{ 
border: '16px solid #e2e8f0',
borderRadius: '8px',
width: '100%',
maxWidth: '100%'
}}
  width="3452"
  height="1912"
  data-path="images/file_manager.png"
/>

### Workbook

Lume generates a spreadsheet-style artifact called a Workbook. You don't need to be a programmer to use it effectively. The Workbook provides:

**Core Concepts:**

* Data lineage showing how fields map between source and target
* Sample data previews for cursory visual inspection
* Natural language explanations of the transformation logic
* Interactive edit interface for adjusting or providing additional mapping context
* AI Chat to explore data and gain a deeper understanding

<img
  src="https://mintcdn.com/lume/uQrHt4yL2wp-pTuy/images/workbook.png?fit=max&auto=format&n=uQrHt4yL2wp-pTuy&q=85&s=912f2246ea1e01b2495646a710a29bd2"
  alt="Workbook View"
  style={{ 
border: '16px solid #e2e8f0',
borderRadius: '8px',
width: '100%',
maxWidth: '100%'
}}
  width="1723"
  height="935"
  data-path="images/workbook.png"
/>

### Code

The Code section gives you insight into the SQL models Lume produces and the tests that validate them. It includes:

* Compiled Code
* Data Preview
* Lineage
* Validation

<img
  src="https://mintcdn.com/lume/uQrHt4yL2wp-pTuy/images/code.png?fit=max&auto=format&n=uQrHt4yL2wp-pTuy&q=85&s=31d415df854ba4eba1016a744633b271"
  alt="Code View"
  style={{ 
border: '16px solid #e2e8f0',
borderRadius: '8px',
width: '100%',
maxWidth: '100%'
}}
  width="3456"
  height="1916"
  data-path="images/code.png"
/>

### Lineage

The Lineage view is a visual representation of table-level and column-level lineage. It helps you understand the relationships between the transformations that Lume's AI engine created.
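
As a rough mental model, column-level lineage can be thought of as a mapping from each target column to the source columns it is derived from, and table-level lineage as the same edges rolled up by table. The sketch below illustrates that relationship only. The column names and the `table_lineage` helper are hypothetical, and nothing here reflects how Lume stores or renders lineage.

```python
# Hypothetical column-level lineage: target column -> contributing source columns.
COLUMN_LINEAGE = {
    "customers.customer_id": ["customers_csv.customer_id"],
    "customers.customer_name": ["customers_csv.customer_name"],
    "customers.industry_name": [
        "customers_csv.industry_code",
        "industry_codes_csv.industry_name",
    ],
}


def table_lineage(column_lineage):
    """Roll column-level edges up to table-level edges."""
    tables = {}
    for target, sources in column_lineage.items():
        target_table = target.split(".", 1)[0]
        for source in sources:
            source_table = source.split(".", 1)[0]
            tables.setdefault(target_table, set()).add(source_table)
    # Sort for stable, readable output.
    return {table: sorted(deps) for table, deps in tables.items()}


print(table_lineage(COLUMN_LINEAGE))
```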

<img
  src="https://mintcdn.com/lume/uQrHt4yL2wp-pTuy/images/lineage.png?fit=max&auto=format&n=uQrHt4yL2wp-pTuy&q=85&s=69ac14f09627f15ebb0a7e18ad6d4242"
  alt="Lineage View"
  style={{ 
border: '16px solid #e2e8f0',
borderRadius: '8px',
width: '100%',
maxWidth: '100%'
}}
  width="3456"
  height="1914"
  data-path="images/lineage.png"
/>

### Data

The Data view lets you review the transformed target data. You can scan the output to confirm that it passes a quick visual inspection.

<img
  src="https://mintcdn.com/lume/uQrHt4yL2wp-pTuy/images/data.png?fit=max&auto=format&n=uQrHt4yL2wp-pTuy&q=85&s=f33b4d98da02dff8ad0e4b31fc782979"
  alt="Data View"
  style={{ 
border: '16px solid #e2e8f0',
borderRadius: '8px',
width: '100%',
maxWidth: '100%'
}}
  width="3456"
  height="1920"
  data-path="images/data.png"
/>
