Sharp ETL is an ETL framework that simplifies writing and executing ETL jobs: you only need to write SQL workflow files. The workflow file format combines your favourite SQL dialect with a small amount of configuration.
docker run --name sharp_etl_db -d -p 3306:3306 -e MYSQL_ROOT_PASSWORD=root -e MYSQL_DATABASE=sharp_etl mysql:5.7

Build from source or download the jar from releases:

./gradlew buildJars -PscalaVersion=2.12 -PsparkVersion=3.3.0 -PscalaCompt=2.12.15

cat spark/src/main/resources/tasks/hello_world.sql

You will see the following contents:

-- workflow=hello_world
-- loadType=incremental
-- logDrivenType=timewindow
-- step=define variable
-- source=temp
-- target=variables
SELECT 'RESULT' AS `OUTPUT_COL`;
-- step=print SUCCESS to console
-- source=temp
-- target=console
SELECT 'SUCCESS' AS `${OUTPUT_COL}`;

spark-submit --master local --class com.github.sharpdata.sharpetl.spark.Entrypoint spark/build/libs/sharp-etl-spark-standalone-3.3.0_2.12-0.1.0.jar single-job --name=hello_world --period=1440 --default-start-time="2022-07-01 00:00:00" --once --local

You will see output like:

== Physical Plan ==
*(1) Project [SUCCESS AS RESULT#17167]
+- Scan OneRowRelation[]
root
|-- RESULT: string (nullable = false)
+-------+
|RESULT |
+-------+
|SUCCESS|
+-------+
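
The output column is named RESULT rather than OUTPUT_COL, which suggests that the first step stores its result as a variable and the second step substitutes it into `${OUTPUT_COL}` before the query runs. The following Python snippet is only a rough sketch of that reading, not Sharp ETL's actual implementation: `parse_workflow` is a hypothetical helper, the `variables` dictionary is written by hand instead of being produced by running step one, and `string.Template` stands in for whatever substitution mechanism the framework really uses.

```python
from string import Template

# The hello_world workflow file, copied from above.
WORKFLOW = """\
-- workflow=hello_world
-- loadType=incremental
-- logDrivenType=timewindow
-- step=define variable
-- source=temp
-- target=variables
SELECT 'RESULT' AS `OUTPUT_COL`;
-- step=print SUCCESS to console
-- source=temp
-- target=console
SELECT 'SUCCESS' AS `${OUTPUT_COL}`;
"""


def parse_workflow(text):
    """Hypothetical helper: split a workflow file into header options and steps."""
    header, steps, current = {}, [], None
    for line in text.splitlines():
        if line.startswith("-- ") and "=" in line:
            # A "-- key=value" comment line carries configuration.
            key, value = line[3:].split("=", 1)
            if key == "step":
                # A new step starts; later options and SQL belong to it.
                current = {"step": value, "sql": []}
                steps.append(current)
            elif current is None:
                header[key] = value
            else:
                current[key] = value
        elif current is not None and line.strip():
            # Any other non-empty line is part of the current step's SQL.
            current["sql"].append(line)
    for step in steps:
        step["sql"] = "\n".join(step["sql"])
    return header, steps


header, steps = parse_workflow(WORKFLOW)

# Assumption: step one yields a single row whose column OUTPUT_COL holds
# 'RESULT', and that row becomes the variable set for later steps.
variables = {"OUTPUT_COL": "RESULT"}

# Substitute the variable into the SQL of step two.
rendered = Template(steps[1]["sql"]).substitute(variables)
print(rendered)
```

Under these assumptions the second step is rendered as a query that selects 'SUCCESS' under the column name RESULT, which matches the schema and table printed above.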
The compatible versions of Spark are as follows:
