CCC ETL Capabilities
Capabilities for ETL technologies, as defined by the FINOS Common Cloud Controls project.
- ID: CCC.ETL.CP
- Version: v2026.06-rc4
- Gemara version: v1.2.0
- Author: FINOS Common Cloud Controls
Data Processing
The Data Processing group covers entries related to transforming, enriching, and moving data through pipelines. This includes ETL/ELT, stream and batch processing, data lineage, schema evolution, and workflow orchestration for data workloads.
CCC.ETL.CP01 Batch Processing
Supports the processing of bounded (batch) data sources using a consistent programming model or engine.
CCC.ETL.CP02 Stream Processing
Supports the processing of unbounded (streaming) data sources using a consistent programming model or engine.
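As an illustration of a "consistent programming model" across CP01 and CP02, the sketch below applies one transformation to both a bounded list and an unbounded generator. It is a minimal sketch using only the Python standard library; the names `keep_positive`, `bounded_source`, and `unbounded_source` are hypothetical and do not come from any particular engine.

```python
from itertools import count, islice
from typing import Dict, Iterable, Iterator, List


def keep_positive(records: Iterable[Dict]) -> Iterator[Dict]:
    """Shared transformation: drop non-positive amounts, add a derived field."""
    for record in records:
        if record["amount"] > 0:
            yield {**record, "amount_cents": record["amount"] * 100}


def bounded_source() -> List[Dict]:
    """Hypothetical batch input: a finite list that is fully known up front."""
    return [{"id": 1, "amount": 5}, {"id": 2, "amount": -3}, {"id": 3, "amount": 7}]


def unbounded_source() -> Iterator[Dict]:
    """Hypothetical streaming input: a generator that never terminates."""
    for i in count(1):
        yield {"id": i, "amount": i if i % 2 else -i}


# The same function handles both inputs; only the source differs.
batch_result = list(keep_positive(bounded_source()))
# A stream has no end, so take the first three results for the demonstration.
stream_result = list(islice(keep_positive(unbounded_source()), 3))
```

Because the transformation is lazy, it never needs to know whether its input ends, which is the property that lets one definition serve both modes.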
CCC.ETL.CP03 Schema Evolution
Automatically detects source data structures and manages changes in schema (e.g., column additions) over time without pipeline failure.
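One way to picture schema evolution for additive changes is shown below. This is a simplified sketch, assuming records arrive as dictionaries and that a new column is back-filled with `None` for earlier rows; the helpers `evolve_schema`, `conform`, and `load` are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Tuple


def evolve_schema(schema: List[str], record: Dict) -> List[str]:
    """Return the schema extended with any columns seen for the first time."""
    new_columns = [name for name in record if name not in schema]
    return schema + new_columns


def conform(record: Dict, schema: List[str]) -> Dict:
    """Project a record onto the schema, filling absent columns with None."""
    return {name: record.get(name) for name in schema}


def load(records: List[Dict]) -> Tuple[List[str], List[Dict]]:
    """Detect the schema while reading, then emit rows with a uniform shape."""
    schema: List[str] = []
    for record in records:
        schema = evolve_schema(schema, record)
    rows = [conform(record, schema) for record in records]
    return schema, rows


# The second record adds an "email" column; the load does not fail.
records = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b", "email": "b@example.com"},
]
schema, rows = load(records)
```

This sketch covers column additions only. Type changes, renames, and column removals usually need an explicit policy and are not handled here.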
CCC.ETL.CP04 Distributed Data Shuffling
Provides an internal service to re-partition and group data across distributed workers for complex operations like joins and aggregations.
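The re-partitioning step can be sketched as a hash partition followed by a per-partition aggregation. The example below is an in-process approximation under stated assumptions: worker outputs are plain lists of `(key, value)` pairs, and `partition_for`, `shuffle`, and `sum_by_key` are hypothetical helper names. It uses `zlib.crc32` rather than Python's built-in `hash` so that the key-to-partition mapping is stable across processes.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[str, int]


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition deterministically."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def shuffle(worker_outputs: List[List[Pair]], num_partitions: int) -> List[List[Pair]]:
    """Route every pair to the partition that owns its key."""
    partitions: List[List[Pair]] = [[] for _ in range(num_partitions)]
    for output in worker_outputs:
        for key, value in output:
            partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


def sum_by_key(partition: List[Pair]) -> Dict[str, int]:
    """Aggregate locally; safe because all values for a key are co-located."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in partition:
        totals[key] += value
    return dict(totals)


# Key "a" is produced by two different workers before the shuffle.
worker_outputs = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]
partitions = shuffle(worker_outputs, num_partitions=3)
per_partition_totals = [sum_by_key(p) for p in partitions]

merged: Dict[str, int] = {}
for totals in per_partition_totals:
    merged.update(totals)
```

After the shuffle each key lives in exactly one partition, so the per-partition totals can be combined without any further reconciliation.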
CCC.ETL.CP05 Windowing and Event-Time Processing
Enables grouping of data based on time attributes, supporting tumbling, hopping, and session windows with late-data handling (watermarking).
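The interaction between tumbling windows and a watermark can be sketched as follows. This is a simplified model, assuming integer event times, a watermark defined as the largest event time seen so far minus an allowed lateness, and a policy of setting aside events whose window has already closed. The function `tumbling_window_sums` is a hypothetical name; hopping and session windows are not shown.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[int, int]  # (event_time, value)


def tumbling_window_sums(
    events: List[Event], window_size: int, allowed_lateness: int
) -> Tuple[Dict[int, int], List[Event]]:
    """Sum values per tumbling window, keyed by window start time."""
    sums: Dict[int, int] = defaultdict(int)
    late: List[Event] = []
    max_event_time = None
    for event_time, value in events:
        if max_event_time is None or event_time > max_event_time:
            max_event_time = event_time
        # The watermark trails the newest event time by the allowed lateness.
        watermark = max_event_time - allowed_lateness
        window_start = (event_time // window_size) * window_size
        window_end = window_start + window_size
        if window_end <= watermark:
            # The window closed before this event arrived: treat it as late.
            late.append((event_time, value))
        else:
            sums[window_start] += value
    return dict(sums), late


# (9, 1) arrives out of order but within the allowed lateness;
# (3, 100) arrives after its window [0, 10) has been closed.
events = [(1, 10), (4, 20), (12, 5), (9, 1), (27, 3), (3, 100)]
window_sums, late_events = tumbling_window_sums(events, window_size=10, allowed_lateness=5)
```

Grouping uses the event's own timestamp rather than its arrival order, which is why the out-of-order event at time 9 still lands in the first window.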
CCC.ETL.CP08 Job Bookmarks
Persists the state of a processing job (e.g., via checkpoints or bookmarks) so that a restarted job resumes from where it left off, supporting exactly-once processing and fault tolerance.
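A bookmark can be as small as a persisted offset. The sketch below is an assumption-laden illustration: it stores the next offset in a local JSON file, and `load_bookmark`, `save_bookmark`, and `run_job` are hypothetical names. Because the sink write and the bookmark write are not atomic here, a crash between them would replay one record, so this sketch gives at-least-once behavior unless the sink is idempotent.

```python
import json
import os
import tempfile
from typing import List, Optional


def load_bookmark(path: str) -> int:
    """Return the next offset to process, or 0 if the job has never run."""
    if not os.path.exists(path):
        return 0
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)["next_offset"]


def save_bookmark(path: str, next_offset: int) -> None:
    """Write to a temporary file, then rename, so a crash cannot leave a partial file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump({"next_offset": next_offset}, handle)
    os.replace(tmp_path, path)


def run_job(records: List[str], bookmark_path: str, sink: List[str],
            max_records: Optional[int] = None) -> int:
    """Process records from the bookmarked offset; return how many were handled."""
    start = load_bookmark(bookmark_path)
    processed = 0
    for offset in range(start, len(records)):
        if max_records is not None and processed >= max_records:
            break  # Simulates the job stopping partway through.
        sink.append(records[offset].upper())
        save_bookmark(bookmark_path, offset + 1)
        processed += 1
    return processed


with tempfile.TemporaryDirectory() as workdir:
    bookmark_path = os.path.join(workdir, "job.bookmark.json")
    records = ["a", "b", "c", "d"]
    sink: List[str] = []
    first_run = run_job(records, bookmark_path, sink, max_records=2)
    second_run = run_job(records, bookmark_path, sink)  # Resumes at offset 2.
    third_run = run_job(records, bookmark_path, sink)   # Nothing left to do.
```

The second run picks up at the third record rather than starting over, and the third run is a no-op.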
CCC.ETL.CP09 Pushdown Optimization
Translates transformation logic into the native language of the source or sink (e.g., SQL) so that filtering and aggregation run where the data resides, minimizing data movement.
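The effect of pushdown can be seen by comparing two ways of computing the same total. The sketch below uses Python's standard `sqlite3` module with an in-memory database as a stand-in source; the `orders` table and both function names are hypothetical. Each function returns the total together with the number of rows that crossed from the database to the pipeline.

```python
import sqlite3
from typing import Tuple

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 10), (2, "US", 20), (3, "EU", 30)],
)


def total_without_pushdown(connection: sqlite3.Connection, region: str) -> Tuple[int, int]:
    """Fetch every row, then filter and sum inside the pipeline."""
    rows = connection.execute("SELECT id, region, amount FROM orders").fetchall()
    total = sum(amount for _, row_region, amount in rows if row_region == region)
    return total, len(rows)


def total_with_pushdown(connection: sqlite3.Connection, region: str) -> Tuple[int, int]:
    """Express the filter and the sum as SQL so the source does the work."""
    row = connection.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE region = ?",
        (region,),
    ).fetchone()
    return row[0], 1


naive = total_without_pushdown(conn, "EU")
pushed = total_with_pushdown(conn, "EU")
```

Both calls produce the same total, but the pushed-down version transfers a single result row instead of the whole table.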
CCC.ETL.CP11 Data Lineage & Metadata Tracking
Captures and exports metadata regarding the data sources, the transformation steps, and the final destination (sink), showing the flow from source to destination for compliance and debugging.
CCC.ETL.CP12 User-Defined Function (UDF) Support
Allows developers to inject custom logic (e.g., written in Python, Java, or SQL) into the managed processing pipeline for complex transformations.
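A common shape for UDF support is a registry that the pipeline consults by name. The sketch below is illustrative only: `UDF_REGISTRY`, the `udf` decorator, `mask_email`, and `run_pipeline` are hypothetical and do not correspond to any specific service's API.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping a UDF name to its implementation.
UDF_REGISTRY: Dict[str, Callable[[str], str]] = {}


def udf(name: str) -> Callable:
    """Decorator that registers a function under the given name."""
    def register(func: Callable[[str], str]) -> Callable[[str], str]:
        UDF_REGISTRY[name] = func
        return func
    return register


@udf("mask_email")
def mask_email(value: str) -> str:
    """Custom logic supplied by the developer: hide most of the local part."""
    local, _, domain = value.partition("@")
    return local[:1] + "***@" + domain


def run_pipeline(rows: List[Dict], column: str, udf_name: str) -> List[Dict]:
    """Pipeline step that looks up a UDF by name and applies it to one column."""
    func = UDF_REGISTRY[udf_name]
    return [{**row, column: func(row[column])} for row in rows]


masked = run_pipeline([{"id": 1, "email": "alice@example.com"}], "email", "mask_email")
```

The pipeline code never references `mask_email` directly, which is what lets custom logic be added without changing the engine.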
Ingestion
The Ingestion group covers entries related to how a service receives or retrieves data, inputs, or commands for processing. This includes both active (pull-based) and passive (push-based) ingestion patterns.
CCC.ETL.CP06 Change Data Capture (CDC) Integration
Supports incremental data ingestion by tracking changes in source transaction logs rather than full table scans.
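The idea of applying only the changes since the last position in a log can be sketched as below. This is a simplified model: the change log is an in-memory list rather than a real database transaction log, `lsn` stands for a monotonically increasing log sequence number, and `apply_changes` is a hypothetical name.

```python
from typing import Dict, List


def apply_changes(replica: Dict[int, Dict], change_log: List[Dict],
                  last_applied_lsn: int) -> int:
    """Apply changes newer than last_applied_lsn; return the new position."""
    for change in change_log:
        if change["lsn"] <= last_applied_lsn:
            continue  # Already applied in an earlier run.
        op = change["op"]
        if op in ("insert", "update"):
            replica[change["key"]] = change["row"]
        elif op == "delete":
            replica.pop(change["key"], None)
        else:
            raise ValueError("unknown operation: " + str(op))
        last_applied_lsn = change["lsn"]
    return last_applied_lsn


change_log = [
    {"lsn": 1, "op": "insert", "key": 1, "row": {"name": "a"}},
    {"lsn": 2, "op": "insert", "key": 2, "row": {"name": "b"}},
    {"lsn": 3, "op": "update", "key": 1, "row": {"name": "a2"}},
    {"lsn": 4, "op": "delete", "key": 2},
]

replica: Dict[int, Dict] = {}
# First run sees only the first two entries of the log.
checkpoint = apply_changes(replica, change_log[:2], last_applied_lsn=0)
after_first_run = dict(replica)
# Second run receives the full log but skips what was already applied.
checkpoint = apply_changes(replica, change_log, last_applied_lsn=checkpoint)
```

The second run touches only two entries even though it is handed the whole log, which is the saving over rescanning the full table.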
CCC.ETL.CP07 Connectivity and Connector Library
Provides pre-built, managed connectors for a variety of sources and sinks (e.g., Object Storage, RDBMS, NoSQL, Pub/Sub).
Orchestration
The Orchestration group covers entries related to coordinating and managing workloads across distributed systems. This includes container orchestration, job scheduling, CI/CD pipelines, build automation, and service mesh management.
CCC.ETL.CP10 Visual Orchestration
Provides a graphical interface to define dependencies between extraction, transformation, and loading tasks.
CCC.ETL.CP13 Time-Based Job Triggering
Supports time-based mechanisms (e.g., cron schedules) to initiate data processing workflows.
CCC.ETL.CP14 Event-Based Job Triggering
Supports event-based mechanisms (e.g., file arrival) to initiate data processing workflows.
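A file-arrival trigger can be approximated by comparing directory listings between polls. The sketch below is a stand-in under stated assumptions: managed services typically rely on storage notifications rather than polling, and `detect_new_files` and `poll_once` are hypothetical names.

```python
import os
import tempfile
from typing import Callable, List, Set


def detect_new_files(directory: str, already_seen: Set[str]) -> List[str]:
    """Return names present in the directory that have not been handled yet."""
    current = set(os.listdir(directory))
    return sorted(current - already_seen)


def poll_once(directory: str, already_seen: Set[str],
              handler: Callable[[str], None]) -> List[str]:
    """Invoke the handler once for each newly arrived file."""
    arrivals = detect_new_files(directory, already_seen)
    for name in arrivals:
        handler(os.path.join(directory, name))
        already_seen.add(name)
    return arrivals


with tempfile.TemporaryDirectory() as landing_dir:
    seen: Set[str] = set()
    triggered: List[str] = []
    first_poll = poll_once(landing_dir, seen, triggered.append)   # Empty directory.
    with open(os.path.join(landing_dir, "orders.csv"), "w", encoding="utf-8") as handle:
        handle.write("id,amount\n1,10\n")
    second_poll = poll_once(landing_dir, seen, triggered.append)  # File has arrived.
    third_poll = poll_once(landing_dir, seen, triggered.append)   # Already handled.
    triggered_names = [os.path.basename(path) for path in triggered]
```

The workflow starts only when a file appears and is not started again for the same file on later polls.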