Structure

Below is the structure of the replication configuration file.

Root Level

At the root level, we have the following keys:

# 'source', 'target' and 'streams' keys are required
source: <connection name>
target: <connection name>

defaults: <replication stream map>

hooks: <replication level hooks map>

streams:
  <stream name>: <replication stream map>

env:
  <variable name>: <variable value>

The <stream name> identifies the stream to replicate. This can be a source table name, a file path, or a wildcard pattern using *. Wildcards allow matching multiple tables within a schema or multiple files within a directory. For example, my_schema.* matches all tables in my_schema, while data/*.csv matches all CSV files in the data directory. See Tags & Wildcards for more details.

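For illustration, a minimal configuration following this structure could look like the one below (a sketch; the connection and table names are hypothetical, and {stream_table} is a runtime variable):

source: MY_POSTGRES
target: MY_SNOWFLAKE

defaults:
  mode: full-refresh
  object: 'analytics.{stream_table}'  # resolved per stream at runtime

streams:
  public.accounts:
  finance.*:        # wildcard: expands to every table in the finance schema

env:
  START_DATE: '2024-01-01'  # example variable made available to the run
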
Stream Level

The <replication stream map> is a map object which accepts the following keys:

object: <target table or file name>
mode: full-refresh | incremental | truncate | snapshot | backfill
description: <stream description>
disabled: true | false

primary_key: [<array of column names to use as primary key>]
update_key: <column name to use as incremental key>

columns: {<map of column name to data type>}
select: [<array of column names to include or exclude>]
files: [<array of file paths to include or exclude>]
where: <SQL where clause>
single: true | false
sql: <source custom SQL query>
transforms: [<array of transforms or map of column name to array of transforms>]
hooks: <stream level hooks map>

source_options: <source options map>
target_options: <target options map>

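For instance, a single stream entry using several of these keys could look like this (a sketch; the table and column names are hypothetical):

streams:
  public.accounts:
    object: analytics.accounts
    mode: incremental
    primary_key: [id]
    update_key: updated_at
    select: [id, name, email, updated_at]
    where: "status != 'deleted'"
    source_options:
      empty_as_null: true
    target_options:
      add_new_columns: true
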
Hooks

The <replication level hooks map> and <stream level hooks map> accept the keys below. See Hooks for more details.

# replication level, at start and end of replication
start: [<array of hooks>]
end: [<array of hooks>]

# stream level, before and after a stream run
pre: [<array of hooks>]
post: [<array of hooks>]

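As a brief sketch of both levels, using the log hook type (one of the types listed under Hooks / Steps; the stream and messages are hypothetical):

# replication level
hooks:
  start:
    - type: log
      message: 'replication starting'
  end:
    - type: log
      message: 'replication done'

streams:
  public.accounts:
    # stream level
    hooks:
      pre:
        - type: log
          message: 'loading public.accounts'
      post:
        - type: log
          message: 'finished public.accounts'
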
Source Options

The <source options map> accepts the keys below. See Source Options for more details.

compression: auto | none | zip | gzip | snappy | zstd
chunk_size: <backfill chunk size>
datetime_format: auto | <ISO 8601 date format>
delimiter: <character to use as flat file delimiter>
empty_as_null: true | false
escape: <character to use as flat file quote escape>
flatten: true | false
format: csv | xml | xlsx | json | parquet | avro | sas7bdat | jsonlines | arrow | delta | raw
header: true | false
jmespath: <JMESPath expression>
limit: <integer>
null_if: <null_if expression>
range: <backfill range expression>
sheet: <excel sheet/range expression>
skip_blank_lines: true | false

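For example, reading pipe-delimited CSV files could be configured as follows (values are illustrative):

source_options:
  format: csv
  delimiter: '|'
  header: true
  empty_as_null: true
  skip_blank_lines: true
  datetime_format: auto
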
Target Options

The <target options map> accepts the keys below. See Target Options for more details.

add_new_columns: true | false
adjust_column_type: true | false
batch_limit: <integer>
column_casing: source | target | snake | upper | lower
column_typing: {map of column type generation configuration}
compression: auto | none | gzip | snappy | zstd
datetime_format: auto | <ISO 8601 date format>
delimiter: <character to use as flat file delimiter>
delete_missing: hard | soft
file_max_bytes: <integer>
file_max_rows: <integer>
format: csv | xlsx | json | parquet | raw
header: true | false
ignore_existing: true | false
post_sql: <sql query>
pre_sql: <sql query>
table_ddl: <ddl sql query>
table_keys: {map of table key type to array of column names}
table_tmp: <name of table>
use_bulk: true | false

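For example, a database target could be tuned as follows (a sketch; the key type under table_keys and the pre_sql statement are assumptions):

target_options:
  column_casing: snake    # normalize column names to snake_case
  add_new_columns: true   # add columns that later appear in the source
  table_keys:
    primary: [id]         # map of key type to column names (assumed)
  pre_sql: 'truncate table staging.tmp_accounts'  # hypothetical statement
  use_bulk: true
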
Replication Specification

Below are the definitions for the accepted keys.

source
The source connection (name, conn string or URL).

target
The target connection (name, conn string or URL).

hooks
The replication level hooks to apply (at start & end of replication). See Hooks for details.

streams.<key>
The source table (schema.table) or local / cloud file path. Use file:// for local paths.

streams.<key>.object or defaults.object
The target table (schema.table) or local / cloud file path. Use file:// for local paths.

streams.<key>.columns or defaults.columns
The column types map. See Columns for details.

streams.<key>.transforms or defaults.transforms
The transforms to apply. See Transforms for details.

streams.<key>.hooks or defaults.hooks
The stream level hooks to apply (pre & post stream run). See Hooks for details.

streams.<key>.mode or defaults.mode
The target load mode to use: incremental, truncate, full-refresh, backfill or snapshot. Default is full-refresh. See Modes for details.

streams.<key>.select or defaults.select
Select or exclude specific columns from the source stream. Use the - prefix to exclude.

streams.<key>.single or defaults.single
When using a wildcard (*) in the stream name, treat the match as a single stream (don't expand into many streams).

streams.<key>.sql or defaults.sql
The custom SQL query to use. Accepts file://path/to.query.sql as well.

streams.<key>.primary_key or defaults.primary_key
The column(s) to use as primary key. For a composite key, use an array.

streams.<key>.update_key or defaults.update_key
The column to use as update key (for incremental mode).

streams.<key>.source_options or defaults.source_options
Options to further configure the source. See Source Options for details.

streams.<key>.target_options or defaults.target_options
Options to further configure the target. See Target Options for details.

env
Environment variables to use for the replication.
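
As a final sketch combining several of these keys, a file wildcard collapsed into one stream (the connection names, path, and excluded column are hypothetical):

source: MY_AWS_S3
target: MY_POSTGRES

streams:
  data/*.csv:               # wildcard matching many CSV files
    object: public.all_data
    single: true            # load all matched files as a single stream
    select: [-_file_note]   # the - prefix excludes this hypothetical column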

