Apache SeaTunnel 2.3.12 Released! Core Engine Upgraded, Connector Ecosystem Expanded Further
Apache SeaTunnel 2.3.12 has been officially released, the next iteration after version 2.3.11. During this cycle, 82 PRs were merged, delivering 9 new features, over 30 feature enhancements, more than 20 documentation corrections, and 43 bug fixes. Key improvements include integration with the SensorsData and Databend ecosystems, expanded read/write capabilities for connectors such as Paimon, ClickHouse, and MaxCompute, enhancements to SQL Transform syntax and vector functions, finer-grained Checkpoint monitoring for the Zeta engine, and more usable REST interfaces.
Highlights Overview
Apache SeaTunnel 2.3.12 boasts numerous highlights, categorized as follows:
New Connectors: SensorsData and Databend connectors.
Connector Capability Expansion:
Paimon: Multi-source concurrency, permission verification, and LIKE/IN predicate pushdown.
ClickHouse: Parallel reading of multiple tables and parallel fetching of table structures.
MaxCompute Sink: Added upsert/delete session mode and timestamp field writing.
Transform/SQL Syntax & Vector Function Enhancements: SQL Transform adds COALESCE type conversion, multi_if, vector functions, and Murmur64 hashing.
Zeta Engine Observability Enhancement: REST API can return results in SQL format, job information includes startTime by default, and task queue size is observable.
File Connector Enhancement: Support for binary chunking, custom CSV delimiters, and file filtering by last modification time.
Documentation Corrections & Supplements: Added explanations for Iceberg S3 Tables, JDBC GenericDialect, and StarRocks mandatory schema.
Bug Fixes: 43 fixes covering scenarios such as Iceberg time zones, Kafka offsets, Oracle CDC, and Transform vector dimensions.
Feature Update List
【New Connectors】
SensorsData Source/Sink (#9432)
Databend Source/Sink (#9331)
【Connector Capability Expansion】
ClickHouse: Concurrent reading of multiple tables + parallel reading of table structures (#9704 #9446)
Paimon: Multi-source concurrency, permission control, LIKE/IN pushdown, version upgraded to 1.1.1 (#9759 #9722 #9484 #9379 #8074)
MaxCompute: Upsert/delete session mode, timestamp field writing, tunnel endpoint option (#9462 #9234 #9548)
Hudi: Pre-merge field option (#9496)
HdfsFile: Concurrent writing of multiple tables (#9651)
Hive Sink: Support for overwrite mode (#7891)
Kudu: Filter pushdown (#9405)
TDengine: Sub-table and fieldNames mapping (#9593)
MySQL CDC: Startup by time offset, Tinyint(1) reading as byte, compatibility with MySQL 8.4+ (#9735 #9373 #9720)
Redis Hash: Support for key_field_name option, key included in result records (#9642 #9574)
File/CSV/Excel: Binary chunking, custom CSV delimiters, filtering by modification time, configurable maximum rows per Excel table (#9668 #9608 #9526 #9391)
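The idea behind filtering files by modification time, listed above, can be sketched in a few lines of Python. This is only an illustration of the concept, not SeaTunnel's implementation or its option names: the function filter_by_mtime and its start/end parameters are hypothetical, and the half-open window [start, end) is an assumption made for this sketch.

```python
from datetime import datetime
from pathlib import Path
from typing import Iterable, List, Optional


def filter_by_mtime(
    paths: Iterable[Path],
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> List[Path]:
    """Keep files whose last-modified time lies in [start, end).

    Hypothetical helper for illustration only. A bound of None means
    "unbounded" on that side. Times are naive local datetimes.
    """
    kept = []
    for path in paths:
        # st_mtime is seconds since the epoch; convert to local time.
        mtime = datetime.fromtimestamp(path.stat().st_mtime)
        if start is not None and mtime < start:
            continue  # modified before the window opens
        if end is not None and mtime >= end:
            continue  # modified at or after the window closes
        kept.append(path)
    return kept
```

A source that runs repeatedly could pass the previous run's timestamp as start so that only files modified since then are read again.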
【Transform & SQL】
Added COALESCE type conversion, multi_if, TRIM_SCALE, Murmur64, vector dimensionality reduction, vector functions, JSONPath multi-field extraction, and Data Validator transformation (#9299 #9154 #9700 #9748 #9783 #9765 #9712 #9445)
SQL Transform EXTRACT function extended for field support, pre-check for cast failures (#9342 #9600)
【Zeta Engine & Core】
REST API support for SQL format result return (#9802)
Added startTime to job information (#9400)
Exposure of task intermediate queue size metrics (#9550)
Expansion of JobStateEvent event listening (#9689)
Plugin directory isolation: Each connector can have an independent lib directory (#9650)
Cluster scripts support displaying member information (#9502)
Default slot-num changed to CPU cores × 2 (#9601)
Local mode documentation and default configuration optimization (#9770)
Stability optimization for CheckpointErrorRestoreEndTest cases (#9619)
Removal of distributed locks when storing metrics (#9776)
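The new slot-num default listed above is simple arithmetic, and a short Python sketch shows what it works out to on a given machine. It assumes that "CPU cores" means the logical processor count reported by the operating system; the function name default_slot_num is hypothetical and is not part of SeaTunnel.

```python
import os
from typing import Optional


def default_slot_num(cpu_cores: Optional[int] = None) -> int:
    """Return CPU cores x 2, the default described in the release notes.

    Hypothetical helper: when cpu_cores is omitted, fall back to the
    logical CPU count from the OS (or 1 if it cannot be determined).
    """
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    if cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return cores * 2
```

For example, a node with 8 cores would get 16 slots under this rule, unless slot-num is set explicitly.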
【Format & Serialization】
Maxwell/Canal/Debezium JSON formats supplemented with ts_ms and table fields; File Sink supports output in corresponding formats (#9701 #9278 #9336)
【Dependencies & Build】
AWS SDK v2 uniformly upgraded to 2.31.30 (#9698)
Apache Commons migrated to Commons-Lang3 (#9694)
Spotless automatically replaces shaded package imports (#9655)
CDC JAR size optimization (#9546)
Documentation Optimization
Added documentation for Iceberg S3 Tables REST Catalog (#9686)
New description of JDBC GenericDialect support (#9763)
Corrected StarRocks documentation to set the schema necessity to true (#9656)
Fixed missing SAVEPOINT_DONE field in REST API finished-jobs (#9676)
Corrected title hierarchy of transform-v2 TableFilter (#9528)
Fixed plugin_input configuration example error in Sink plugins (#9492)
Updated Paimon projection pushdown documentation (#9425)
Updated JDBC end-to-end documentation (#9679)
Updated SQL function return type descriptions (#9703 #9711)
Added description of multi-modal support (#9652)
Added SeaTunnel toolchain to README (#9707)
Corrected parameter type formatting (#9753)
Fixed deepwiki link errors (#9356)
Fixed DynamoDB parameter errors (#9447)
Fixed 404 links in documentation (#9561)
New Zeta tuning guide (#9539)
Bug Fixes
Iceberg: Illegal provider-class in ORC writing, time zone offset, version upgraded to 1.6.1 (#9588 #9460 #9451)
Kafka: Exception when offset=-1, partition filter blocking, incorrect starting offset for job recovery (#9376 #9598 #9736)
Paimon: DECIMAL precision loss, dynamic bucketing exception, duplicate submission exception (#9452 #9480 #9595 #9665)
ClickHouse: Incorrect SeaTunnelRow tableId setting (#9585)
File/Parquet: Invalid custom schema, null pointer in binary reading strategy (#9596 #9391)
HTTP: Null pointer in pageField, pagination infinite loop, missing content-type, invalid mime type, inconsistent field count (#9498 #9504 #9497 #9363 #9103)
JDBC: Read/write of Postgres network address type, invalid Vertica upsert, Float→BigDecimal precision loss (#9618 #9607 #9670)
Oracle CDC: Unupdated transaction commit when LOB is enabled (#9412)
Mongo-CDC: Exception caused by default exactly-once=true (#9454)
Redis: Missing key field in Hash reading (#9642)
Elasticsearch: Incorrect vector column definition generation (#9470 #9471)
Prometheus: Failed double parsing for time (#9311)
OceanBase Oracle: Creation of unsupported data types (#9383)
RabbitMQ: Missing default values for durable/exclusive/auto-delete (#9631)
Transform-V2: Vector dimension precision, custom UDF exception, date format 'T' handling, integer input for from_unixtime (#9646 #9195 #9406 #9738)
Spark: Unapplied source parallelism (#9319)
Zeta: Excessive reading after Checkpoint disablement, unending local mode, master switch thread leak, Imap resource leak, missing pending status in job state retrieval, invalid HTTPS custom port (#9552 #9549 #9464 #9696 #9489 #9705)
Other fixes for CI, E2E, packaging, and dependency conflicts (details omitted).
Acknowledgments
Lao Wang, Adam Wang, alberne wang, chestnufang, corgy-w, CosmosNi, David Zollo, dy102, dyp12, e-mhui, Emmanuel, hailin0, huangkuilin, Jarvis, Jast, Jeremy, JeremyXin, Jia Fan, jiachuan.zhu, Junxin Xiao, Leon Yoah, litiliu, liucongjy, liuwei178, loupipalien, Luigi Durso, misi, Nana Jerde, ocean-zhc, Osiris, Parkjihun, SEZ, sohurdc, suntectec, wanmingshi, WenDing-Y, wgzhao, wildpea, xiaochen, yzeng1618, ZHANG YINGHONG, zhangdonghao, zhangqingsong, zhenyue-xu, Zhilin Li, Zmm
Special thanks to the release manager, Jia Fan (GitHub ID: Jia Fan)! We also appreciate all the code and documentation contributors listed above. Every commit, review, and test you contributed helped version 2.3.12 ship on schedule and with high quality. Apache SeaTunnel becomes more powerful because of you. Let's keep working together on the next version!
For the complete list of changes, please visit the official Release page:
https://github.com/apache/seatunnel/releases/tag/2.3.12
About Apache SeaTunnel
Apache SeaTunnel is an easy-to-use, ultra-high-performance distributed data integration platform that supports real-time synchronization of massive amounts of data and can stably and efficiently synchronize hundreds of billions of data records per day.
You are welcome to fill out this form to become a speaker for Apache SeaTunnel: https://forms.gle/vtpQS6ZuxqXMt6DT6 :)
Why do we need Apache SeaTunnel?
Apache SeaTunnel does everything it can to solve the problems you may encounter in synchronizing massive amounts of data.
Data loss and duplication
Task buildup and latency
Low throughput
Long application-to-production cycle time
Lack of application status monitoring
Apache SeaTunnel Usage Scenarios
Massive data synchronization
Massive data integration
ETL of large volumes of data
Massive data aggregation
Multi-source data processing
Features of Apache SeaTunnel
Rich components
High scalability
Easy to use
Mature and stable
How to get started with Apache SeaTunnel quickly?
Want to experience Apache SeaTunnel quickly? SeaTunnel 2.1.0 takes 10 seconds to get you up and running.
https://seatunnel.apache.org/docs/2.1.0/developement/setup
How can I contribute?
We invite everyone interested in taking local open source global to join the Apache SeaTunnel contributor family and grow open source together!
Submit an issue:
https://github.com/apache/seatunnel/issues
Contribute code to:
https://github.com/apache/seatunnel/pulls
Subscribe to the community development mailing list:
dev-subscribe@seatunnel.apache.org
Development Mailing List:
dev@seatunnel.apache.org
Join Slack:
https://join.slack.com/t/apacheseatunnel/shared_invite/zt-1kcxzyrxz-lKcF3BAyzHEmpcc4OSaCjQ
Follow Twitter:
https://twitter.com/ASFSeaTunnel
Join us now!❤️❤️


