@slfan1989
Contributor

@slfan1989 slfan1989 commented Oct 5, 2025

Which issue does this PR close?

Closes #1404.

Rationale for this change

[AURON#1404] Support for Spark 4.0.1 Compatibility in Auron.

What changes are included in this PR?

Auron needs to be adapted to support Spark 4. Celeborn already supports Spark 4.0, and Iceberg has supported it for some time. The Iceberg community has also voted to deprecate Spark 3.4 support, which will be removed soon.

For this PR, I have made the following changes:

  • Created a new module: I added a new module, spark-extension-shims-spark4. I considered modifying the existing spark-extension-shims-spark3 module, but a separate module seemed the better approach because the existing one specifically targets Spark 3.

  • Removed support for lower versions of Spark: In the spark-extension-shims-spark4 module, I removed all references to, and compatibility code for, Spark 3.0 through 3.5, so that the module supports Spark 4.0 onward.

  • Three changes were required to get the module to compile:

    • NativeShuffleExchangeExec#ShuffleWriteProcessor: SPARK-44605 restructured the write method in the API, so I refactored the partition and rdd handling here to retrieve them from dependencies, which keeps the code compatible with the other interfaces. In the future, we should switch to the new interface and make further changes to nativeRssShuffleWrite / nativeShuffleWrite.

    • NativeBroadcastExchangeBase#getBroadcastTimeout: In Spark 4.0, the broadcast timeout needs to be read from the session returned by getActiveSession.

    • NativeBroadcastExchangeBase#getRelationFuture: In Spark 4.0, the type of SparkSession has changed to org.apache.spark.sql.classic.SparkSession, so I adjusted how the session is accessed.
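
As a rough illustration of the getBroadcastTimeout item above, the sketch below reads the timeout through the active session using only public Spark APIs (SparkSession.getActiveSession, RuntimeConfig.get, JavaUtils.timeStringAsSec). It is not the code in this PR: the object name BroadcastTimeoutSketch, the helper broadcastTimeoutSeconds, and the fallback to 300 seconds when no session is active are assumptions made for the example, and it needs Spark on the classpath.

```scala
import org.apache.spark.network.util.JavaUtils
import org.apache.spark.sql.SparkSession

// Hypothetical sketch, not Auron's implementation.
object BroadcastTimeoutSketch {
  // Public Spark SQL config key for the broadcast timeout.
  private val Key = "spark.sql.broadcastTimeout"
  // Assumed fallback, matching Spark's documented default of 300 seconds.
  private val DefaultSeconds = "300"

  /** Read the broadcast timeout (in seconds) from the thread's active session, if any. */
  def broadcastTimeoutSeconds(): Long = {
    val raw = SparkSession.getActiveSession
      .map(_.conf.get(Key, DefaultSeconds))
      .getOrElse(DefaultSeconds) // no active session on this thread
    // Accepts plain numbers ("300") as well as suffixed values ("5min").
    JavaUtils.timeStringAsSec(raw)
  }
}
```

Going through the public conf API avoids touching session internals whose types differ between Spark 3.x and 4.0. The sketch skips details a real shim would need, such as treating a negative timeout as unlimited.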

Are there any user-facing changes?

No.

How was this patch tested?

CI.

@slfan1989
Contributor Author

slfan1989 commented Oct 5, 2025

This PR is still being debugged, but it is largely compatible with Spark 4. Some fine-tuning is still required.

./auron-build.sh --pre --sparkver 4.0 --scalaver 2.13 -DskipBuildNative

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Auron Parent Pom 7.0.0-SNAPSHOT:
[INFO] 
[INFO] Auron Parent Pom ................................... SUCCESS [  1.855 s]
[INFO] proto .............................................. SUCCESS [ 12.516 s]
[INFO] hadoop-shim_2.13 ................................... SUCCESS [  4.145 s]
[INFO] auron-core ......................................... SUCCESS [  4.969 s]
[INFO] auron-common_2.13 .................................. SUCCESS [  5.419 s]
[INFO] spark-version-annotation-macros_2.13 ............... SUCCESS [  3.169 s]
[INFO] spark-extension_2.13 ............................... SUCCESS [ 16.443 s]
[INFO] spark-extension-shims-spark4_2.13 .................. SUCCESS [ 13.889 s]
[INFO] assembly ........................................... SUCCESS [  9.909 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  01:12 min
[INFO] Finished at: 2025-10-05T17:11:14+08:00
[INFO] ------------------------------------------------------------------------

./auron-build.sh --pre --sparkver 4.0 --scalaver 2.13

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Auron Parent Pom 7.0.0-SNAPSHOT:
[INFO] 
[INFO] Auron Parent Pom ................................... SUCCESS [  2.034 s]
[INFO] proto .............................................. SUCCESS [ 14.988 s]
[INFO] hadoop-shim_2.13 ................................... SUCCESS [  5.040 s]
[INFO] auron-core ......................................... SUCCESS [08:13 min]
[INFO] auron-common_2.13 .................................. SUCCESS [  9.084 s]
[INFO] spark-version-annotation-macros_2.13 ............... SUCCESS [  3.782 s]
[INFO] spark-extension_2.13 ............................... SUCCESS [ 21.663 s]
[INFO] spark-extension-shims-spark4_2.13 .................. SUCCESS [ 16.130 s]
[INFO] assembly ........................................... SUCCESS [ 13.870 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  09:40 min
[INFO] Finished at: 2025-10-05T17:28:57+08:00
[INFO] ------------------------------------------------------------------------

@slfan1989 slfan1989 marked this pull request as ready for review October 6, 2025 00:08
slfan1989 and others added 2 commits October 8, 2025 16:48
Co-authored-by: cxzl25 <3898450+cxzl25@users.noreply.github.com>
@slfan1989
Contributor Author

@richox @cxzl25 @SteNicholas I believe we should continue moving forward with support for Spark 4.0. Although this is only initial support and may have some issues, we should keep pushing ahead. I would appreciate hearing your thoughts and suggestions.

This PR requires #1399 to be merged first.

cc: @guixiaowen

@merrily01 merrily01 changed the title from "[AURON#1404] Support for Spark 4.0.1 Compatibility in Auron." to "[AURON #1404] Support for Spark 4.0.1 Compatibility in Auron." on Oct 13, 2025
@guixiaowen
Contributor

@slfan1989 LGTM,LGTM,LGTM

@richox
Contributor

richox commented Oct 15, 2025

Is there a big difference between the Spark 4.0 and Spark 3.5 APIs? Can we use a single shims-spark package for both versions?

@slfan1989
Contributor Author

I’ve discussed this privately with @richox. For the Auron project, we’ve decided to build a unified shim rather than introduce a separate new module. I’ll continue to follow up on the progress of this PR.

