
CreatePipeline Embeddable

Currently, it's not possible to create pipelines via the Ingestro API. To allow users, whether internal teams or your customers, to create pipelines, you need to integrate the CreatePipeline component. The component is easy to integrate into your existing application and provides an intuitive workflow:

  • Select the pipeline's input and output connectors, as well as its target data model (TDM)
  • Configure how the input data should be transformed based on the selected TDM
  • Set an execution schedule and an error threshold (optional)
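In its simplest form, you embed the component with nothing but your access token; everything else is then configured by the user in the flow. A minimal sketch (the token is a placeholder):

<CreatePipeline accessToken="ACCESS_TOKEN" />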

Depending on your use case, for example, whether your customers go through the flow or, say, your internal customer success team, you can configure the component in different ways. You can use the embeddable as-is, with a linked pipeline template, and/or by injecting parts of the configuration at the component level. If specific components are predefined in the template or within the component itself, they will not be shown in the flow.

You can optionally execute the pipeline immediately after it is created by setting settings.runPipelineOnCreation to true. When this setting is enabled, users can review and manually adjust individual entries during the pipeline creation process. These manual changes will then be applied to the pipeline's initial execution.
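A minimal sketch of enabling this behavior via the settings prop (the access token is a placeholder):

<CreatePipeline
  accessToken="ACCESS_TOKEN"
  settings={{
    runPipelineOnCreation: true, // execute the pipeline immediately after creation
  }}
/>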

The CreatePipeline component supports a streamlined upload flow when all required pipeline components are preconfigured: the name, manual input connector, output connector, target data model, and error threshold. If the input connector has node.type == MANUAL, the component immediately displays a file upload step as the entry point to pipeline creation. Users can upload their file first, then proceed to header selection and data transformation. This mirrors an importer-like experience, providing a simpler, more intuitive setup flow tailored for manual data ingestion.
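A sketch of such a fully preconfigured setup; all IDs are placeholders, and the referenced input connector is assumed to be of type MANUAL:

<CreatePipeline
  accessToken="ACCESS_TOKEN"
  configuration={{
    name: "Manual product upload", // placeholder name
    inputConnectorId: "MANUAL_INPUT_CONNECTOR_ID", // assumed to be a MANUAL connector
    outputConnectorId: "OUTPUT_CONNECTOR_ID",
    tdmId: "TDM_ID",
    errorConfig: {
      error_threshold: 10, // allow up to 10% erroneous cells
    },
  }}
/>

With every required component preconfigured, the flow opens directly on the file upload step.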

info

A pipeline template is the blueprint of a pipeline. You can predefine certain components so that the user going through the "Create Pipeline" flow doesn't have to configure them manually.

info

Input connectors always trigger executions for all pipelines they are connected to.

Preview Mode for Large Datasets

The CreatePipeline component can automatically activate preview mode to optimize performance and provide a faster configuration experience when working with large datasets. Preview mode allows users to configure transformations on a sample of the data before applying them to the entire dataset.

How Preview Mode Works

Preview mode is automatically activated when the number of input rows exceeds the sampleSize setting (default: 10,000 rows). Set sampleSize to null to disable preview mode entirely.
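For example, to raise the activation threshold or switch preview mode off entirely (values are illustrative):

<CreatePipeline
  accessToken="ACCESS_TOKEN"
  settings={{
    sampleSize: 50000, // activate preview mode only above 50,000 input rows
    // sampleSize: null, // or disable preview mode entirely
  }}
/>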

When active and runPipelineOnCreation is set to false:

  1. Sample Data Throughout: The entire pipeline creation flow uses only a sample of the data (limited to the number of rows specified by sampleSize).
  2. Info Banner: An info banner is displayed at the top of the header selection and data transformation steps, indicating that you're working with a sample of the data.
  3. Configuring Transformations: Users can set up mappings and transformations using the sample data, ensuring a responsive and efficient workflow.

When active and runPipelineOnCreation is set to true:

  1. Initial Data Transformation: Only the initial data transformation step uses a sample of the data (limited to the number of rows specified by sampleSize).
  2. Info Banner: An info banner is displayed at the top of the data transformation step, indicating that you're working with a sample of the data.
  3. Configuring Transformations: Users can set up mappings and transformations using the sample data, ensuring a responsive and efficient workflow.
  4. Applying to All Rows: After configuring transformations, users can apply the changes to the entire dataset. The transformations are then processed across all input rows.
  5. Continuing the Flow: Once transformations are applied to all rows, the flow continues normally with the complete dataset, and preview mode is no longer active for subsequent steps.

info

Preview mode is designed to improve performance when working with large files while maintaining the same level of control over data transformation and mapping.

Configure the component based on your specific use case

Add the code snippet below and insert the component on the page where you want it to appear:

Fields

accessToken

Add the access token you obtained in Step 3

templateId

The ID of the template you want to use when a pipeline is created

configuration

The configuration determines whether certain settings or components, such as connectors, the target data model, or the schedule config, are already set for the pipeline, meaning that users going through the flow won't have to set them themselves

developerMode

Set developer mode to true to test pipelines in your testing environment. Pipeline executions in developer mode are free of charge. Set it to false for production use. Please note that pipelines executed in developer mode will only output 100 rows

name

The name of the pipeline

tdmId

The ID of the target data model that should be used for the created pipeline. If this is set, the user won't be able to select another target data model

inputConnectorId

The ID of the input connector that should be used for the created pipeline. If this is set, the user won't be able to select another input connector

outputConnectorId

The ID of the output connector that should be used for the created pipeline. If this is set, the user won't be able to select another output connector

errorConfig

Defines how the pipeline should handle errors that might occur during pipeline execution

errorThreshold

A float between 0 and 100, representing the allowed percentage of erroneous cells during a pipeline execution. For example, if it is set to 10, it means that pipeline executions with less than 10% erroneous cells will be considered successful and will not fail

scheduleConfig

Defines when the pipeline is executed for the first and last time, as well as the interval at which it is executed

frequency

Sets how often the pipeline is executed. It works in combination with interval. For example, if frequency is set to HOURLY and interval is set to 2, the pipeline is executed every 2 hours:

  • HOURLY
  • DAILY
  • WEEKLY
  • MONTHLY

interval

Sets the interval based on the frequency at which the pipeline is executed. For example, if interval is set to 2 and frequency is set to HOURLY, the pipeline is executed every 2 hours. The next execution cannot be scheduled further into the future than 1 year from the set start date and time

startsOn

The date and time when the pipeline is first executed, provided as a timestamp in UTC (e.g. 2024-09-02T13:26:13.642Z). The date and time cannot be in the past

endsOn

The date and time when the pipeline is last executed, provided as a timestamp in UTC (e.g. 2024-09-02T13:26:13.642Z). This date and time cannot be earlier than the start date and time
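Putting the schedule fields together, a pipeline that runs every 2 hours for one month could be configured as follows; the dates are placeholders, and the Date values follow the types declared in the snippet at the end of this page:

configuration={{
  scheduleConfig: {
    frequency: "HOURLY",
    interval: 2, // combined with HOURLY: every 2 hours
    startsOn: new Date("2024-09-02T13:26:13.642Z"), // first execution (UTC, must not be in the past)
    endsOn: new Date("2024-10-02T13:26:13.642Z"), // last execution (must not precede startsOn)
  },
}}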

settings

i18nOverrides

Allows you to override each text element in the interface

language

Defines the language of the embeddable (so far we only support English ("en"))

modal

Defines whether the component is shown inline (false) or within a modal view (true)

allowTdmCreation

Defines whether the "Create target data model" button is shown in the TDM selection dropdown

allowInputConnectorCreation

Defines whether the "Create connector" button is shown in the input connector selection dropdown

allowOutputConnectorCreation

Defines whether the "Create connector" button is shown in the output connector selection dropdown

runPipelineOnCreation

Defines whether the pipeline is executed after it is created.

sampleSize

Defines the maximum number of rows to use during the data transformation preview phase when working with large datasets. Default: 10000 (integer or null). When the number of input rows exceeds this value, preview mode is automatically activated, allowing users to configure transformations on a sample before applying them to the entire dataset. Set to null to disable preview mode entirely.

onPipelineCreate

Runs after the user has confirmed the final step of the flow to create a pipeline

onClose

Runs when the user attempts to exit the "Create Pipeline" flow by clicking "Cancel" or closing the modal using the "X" button

onConnectorCreate

Runs when the user clicks on "Create connector" when selecting an input or output connector
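A sketch of wiring this hook to your own connector-creation UI; openConnectorModal is a hypothetical helper in your application that resolves once the connector has been created:

onConnectorCreate={async ({ reload, connectorType }) => {
  // connectorType: "input" or "output"
  await openConnectorModal(connectorType); // hypothetical helper in your app
  reload(); // refetch the connectors so the new one appears in the dropdown
}}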

onTdmCreate

Runs when the user clicks on "Create target data model" when selecting a target data model

onExecutionView

Runs when the user clicks on the "View" or "Fix" button of the execution that was created after the new pipeline was created. When defined, each execution element shows a "View" or "Fix" button that triggers this hook when clicked.
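For example, you might route the user to your own execution detail view; the id property on the execution object is an assumption here, as is the route:

onExecutionView={({ data }) => {
  // data: the selected execution; an "id" property is assumed
  window.location.href = `/executions/${data.id}`; // hypothetical route in your app
}}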

onSuccessContinue

Runs when the user clicks the "Continue" button on the success screen after successfully creating a pipeline.

React
Angular
Vue
VanillaJS
<CreatePipeline
  accessToken="ACCESS_TOKEN"
  templateId="TEMPLATE_ID"
  configuration={{
    developerMode: boolean (default: false),
    name: string,
    tdmId: string,
    outputConnectorId: string,
    inputConnectorId: string,
    errorConfig: {
      error_threshold: number,
    },
    scheduleConfig: {
      frequency: "HOURLY" | "DAILY" | "WEEKLY" | "MONTHLY",
      interval: number,
      startsOn: Date,
      endsOn: Date,
    }
  }}
  settings={{
    i18nOverrides: {},
    language: "en",
    modal: boolean (default: true),
    allowTdmCreation: boolean (default: false),
    allowInputConnectorCreation: boolean (default: false),
    allowOutputConnectorCreation: boolean (default: false),
    runPipelineOnCreation: boolean (default: false),
    sampleSize: number (default: 10000)
  }}
  onPipelineCreate={({ data }) => {
    // runs after the user has confirmed the final step of the flow to create a pipeline
    // data: pipeline object after creation
  }}
  onClose={() => {
    // runs when the creation workflow is closed via the "Cancel" button or the "X" button
  }}
  onConnectorCreate={({ reload, connectorType }) => {
    // runs when the user clicks on "Create connector" when selecting an input or output connector
    // reload: on function call, refetch the connectors
    // connectorType: "input" or "output"
  }}
  onTdmCreate={({ reload }) => {
    // runs when the user clicks on "Create target data model" when selecting a target data model
    // reload: on function call, refetch the TDMs
  }}
  onExecutionView={({ data }) => {
    // runs when the user selects an execution from the list of triggered pipelines
    // data: object of the selected execution
  }}
  onSuccessContinue={({ data }) => {
    // runs when the user clicks the "Continue" button on the success screen after pipeline creation
    // data: object of the created pipeline
  }}
/>
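
For reference, here is a filled-in sketch combining the fields and hooks described above. All IDs, names, and routes are placeholders to replace with your own values:

<CreatePipeline
  accessToken="ACCESS_TOKEN"
  templateId="TEMPLATE_ID"
  configuration={{
    developerMode: true, // free test executions, output limited to 100 rows
    name: "Customer order import",
    tdmId: "TDM_ID",
    inputConnectorId: "INPUT_CONNECTOR_ID",
    outputConnectorId: "OUTPUT_CONNECTOR_ID",
    errorConfig: {
      error_threshold: 10, // executions with less than 10% erroneous cells succeed
    },
  }}
  settings={{
    language: "en",
    modal: true,
    runPipelineOnCreation: true, // execute the pipeline immediately after creation
    sampleSize: 10000, // preview mode activates above 10,000 input rows
  }}
  onPipelineCreate={({ data }) => {
    console.log("Pipeline created:", data);
  }}
  onClose={() => {
    // hide the component via your own state management
  }}
  onSuccessContinue={() => {
    window.location.href = "/pipelines"; // placeholder route in your app
  }}
/>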