

pom.xml 119.17 KB
<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Licensed to the Apache Software Foundation (ASF) under one or more
~ contributor license agreements. See the NOTICE file distributed with
~ this work for additional information regarding copyright ownership.
~ The ASF licenses this file to You under the Apache License, Version 2.0
~ (the "License"); you may not use this file except in compliance with
~ the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<name>Spark Project Parent POM</name>
<name>Apache 2.0 License</name>
<name>Matei Zaharia</name>
<organization>Apache Software Foundation</organization>
<name>Dev Mailing List</name>
<name>User Mailing List</name>
<name>Commits Mailing List</name>
<!-- See additional modules enabled by profiles below -->
<!-- Version used in Maven Hive dependency -->
<!-- Version used for internal directory structure -->
<!-- note that this should be compatible with Kafka brokers version 0.10 and up -->
<!-- If you change codahale.metrics.version, you also need to change
the link to metrics.dropwizard.io in docs/monitoring.md. -->
<!-- Should be consistent with Kinesis client dependency -->
<!-- the producer is used in tests -->
<!-- org.apache.httpcomponents/httpclient-->
<!-- commons-httpclient/commons-httpclient-->
<!-- managed up from 3.2.1 for SPARK-11652 -->
<!-- for now, not running scalafmt as part of default verify pipeline -->
<!-- org.apache.commons/commons-lang/-->
<!-- org.apache.commons/commons-lang3/-->
<!-- org.apache.commons/commons-pool2/-->
<!-- Managed up from older version from Avro; sync with jackson-module-paranamer dependency version -->
<!-- If you are changing the Arrow version specification, please check
./python/pyspark/sql/pandas/utils.py and ./python/setup.py too. -->
<!-- org.fusesource.leveldbjni will be used except on arm64 platform. -->
<!-- Some UI tests require Chrome and Chrome driver installed so those tests are disabled by default. -->
<!-- Package to use when relocating shaded classes. -->
<!-- Modules that copy jars to the build directory should do so under this location. -->
<!-- Allow modules to enable / disable certain build plugins easily. -->
<!-- Dependency scopes that can be overridden by enabling certain profiles. These profiles are
declared in the projects that build assemblies.
For other projects the scope should remain as "compile"; otherwise the classes are not available
during compilation if the dependency is transitive (e.g. "graphx/" depending on "core/" and
needing Hadoop classes in the classpath to compile). -->
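<!--
Illustrative sketch (not part of the original POM) of how such a scope override could be
wired up. The property name hadoop.deps.scope and the profile id hadoop-provided are
assumptions, since this excerpt does not show them.

In the parent POM, the default scope:
  <properties>
    <hadoop.deps.scope>compile</hadoop.deps.scope>
  </properties>

In a module that builds an assembly, a profile that overrides it:
  <profile>
    <id>hadoop-provided</id>
    <properties>
      <hadoop.deps.scope>provided</hadoop.deps.scope>
    </properties>
  </profile>

A dependency would then pick up the value through <scope>${hadoop.deps.scope}</scope>.
-->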
<!-- These default to the Hadoop 3.x shaded client/minicluster jars, but are switched to hadoop-client
when the Hadoop profile is hadoop-2.7, because the shaded jars are only available in 3.x. Note that,
as a result, we have to include the same hadoop-client dependency multiple times in hadoop-2.7. -->
<!-- Overridable test home, so that you can call individual POM files directly without
things breaking. -->
<!-- Needed for consistent times -->
<maven.build.timestamp.format>yyyy-MM-dd HH:mm:ss z</maven.build.timestamp.format>
<!-- Google Mirror of Maven Central, placed first so that it's used instead of flaky Maven Central.
See https://storage-download.googleapis.com/maven-central/index.html -->
<name>GCS Maven Central mirror</name>
<!-- This is used as a fallback when the first try fails. -->
<name>Maven Repository</name>
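<!--
Illustrative sketch (not part of the original POM) of the ordering described above: the
mirror is listed first and Maven Central second, so Maven would typically try them in that
order. The <id> values and the URLs are assumptions; only the <name> values appear in this
excerpt.

<repositories>
  <repository>
    <id>gcs-maven-central-mirror</id>
    <name>GCS Maven Central mirror</name>
    <url>https://maven-central.storage-download.googleapis.com/maven2/</url>
  </repository>
  <repository>
    <id>central</id>
    <name>Maven Repository</name>
    <url>https://repo.maven.apache.org/maven2</url>
  </repository>
</repositories>
-->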
<!-- Google Mirror of Maven Central, placed first so that it's used instead of flaky Maven Central.
See https://storage-download.googleapis.com/maven-central/index.html -->
<name>GCS Maven Central mirror</name>
<!-- This is a dummy dependency that is used to trigger the maven-shade plugin so that Spark's
published POMs are flattened and do not contain variables. Without this dependency, some
subprojects' published POMs would contain variables like ${scala.binary.version} that will
be substituted according to the default properties instead of the ones determined by the
profiles that were active during publishing, causing the Scala 2.10 build's POMs to have 2.11
dependencies due to the incorrect substitutions. By ensuring that maven-shade runs for all
subprojects, we eliminate this problem because the substitutions are baked into the final POM.
For more details, see SPARK-3812 and MNG-2971. -->
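<!--
Illustrative sketch (not part of the original POM): a placeholder dependency of this kind is
an ordinary <dependency> entry on a small artifact with no useful content. The coordinates
below are assumptions, since this excerpt does not show them.

<dependency>
  <groupId>org.spark-project.spark</groupId>
  <artifactId>unused</artifactId>
  <version>1.0.0</version>
</dependency>
-->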
<!-- This is needed by the scalatest plugin, and so is declared here to be available in
all child modules, just as scalatest is run in all children. -->
<!-- This artifact is a shaded version of ASM 7.x. The POM that was used to produce this
is at https://github.com/apache/geronimo-xbean/tree/trunk/xbean-asm7-shaded
For context on why we shade ASM, see SPARK-782 and SPARK-6152. -->
<!-- Shaded deps marked as provided. These are promoted to compile scope
in the modules where we want the shaded classes to appear in the
associated jar. -->
<!-- End of shaded deps -->
<!-- Provide a JAXB impl; no longer auto available in Java 9+ in the JDK -->
<!-- for now, we only write XML in PMML export, and these can be excluded -->
<!-- SPARK-27611: Exclude redundant javax.activation implementation, which
conflicts with the existing javax.activation:activation:1.1.1 dependency. -->
<!-- Hive uses commons-logging 1.1.3 from 0.13 to 1.2 -->
<!-- Update htmlunit dependency that selenium uses for better JS support -->
<!-- Added for selenium only, and should match its dependent version: -->
<!-- runtime scope is appropriate, but causes SBT build problems -->
<!-- Only HyperLogLogPlus is used, which doesn't depend on fastutil -->
<!-- In theory we need not directly depend on protobuf since Spark does not directly
use it. However, when building with Hadoop/YARN 2.2 Maven doesn't correctly bump
the protobuf version up from the one Mesos gives. For now we include this variable
to explicitly bump the version when building with YARN. It would be nice to figure
out why Maven can't resolve this correctly (like SBT does). -->
<!-- Guava is excluded because of SPARK-6149. The Guava version referenced in this module is
15.0, which causes runtime incompatibility issues. -->
<!-- SPARK-28765 Unused JDK11-specific dependency -->
<!-- SPARK-28765 Unused JDK11-specific dependency -->
<!-- Hadoop 3.x dependencies -->
<!-- End of Hadoop 3.x dependencies -->
<!-- BeanUtils >= 1.9.0 no longer splits out -core; exclude it -->
<!-- Hadoop-3.2 -->
<!-- Managed up to match Hadoop in HADOOP-16530 -->
<!-- avro-mapred for some reason depends on avro-ipc's test jar, so undo that. -->
<!-- See SPARK-23654 for info on this dependency;
it is used to keep javax.activation at v1.1.1 after dropping
jets3t as a dependency. -->
<!-- Hack to exclude org.apache.hadoop:hadoop-yarn-server-resourcemanager:jar:tests.
For some reason, SBT started to pull the dependencies of 'hadoop-yarn-server-tests' above
with the 'tests' classifier after upgrading to SBT 1.3 (SPARK-21708). Without this exclusion,
some tests might fail; see also SPARK-33104. -->
<!-- Hadoop-3.2 -->
<!-- Begin of Hive 2.3 exclusion -->
<!-- ORC is needed, but the version should be consistent with the `sql/core` ORC data source.
This looks safe; see the major changes from ORC 1.3.3 to 1.5.4:
HIVE-17631 and HIVE-19465 -->
<!-- jetty-all conflicts with jetty 9.4.12.v20180830 -->
<!-- org.apache.logging.log4j:* conflicts with log4j 1.2.17 -->
<!-- Hive includes javax.servlet to fix the Hive on Spark test failure; see HIVE-12783 -->
<!-- hive-storage-api is needed and must be explicitly included later -->
<!-- End of Hive 2.3 exclusion -->
<!-- pull this in when needed; the explicit definition culls the surplus -->
<!-- break the loop -->
<!-- Excluded dependencies and their transitive dependencies.
Some may need to be explicitly included. -->
<!-- Do not need Calcite because we disabled hive.cbo.enable -->
<!-- Cat X license now; see SPARK-18262 -->
<!-- Begin of Hive 2.3 exclusion -->
<!-- Do not need Tez -->
<!-- Do not need Calcite, see SPARK-27054 -->
<!-- org.apache.logging.log4j:* conflicts with log4j 1.2.17 -->
<!-- End of Hive 2.3 exclusion -->
<!-- Begin of Hive 2.3 exclusion -->
<!-- Hive removes the HBase Metastore; see HIVE-17234 -->
<!-- End of Hive 2.3 exclusion -->
<!-- Begin of Hive 2.3 exclusion -->
<!-- parquet-hadoop-bundle:1.8.1 conflicts with 1.10.1 -->
<!-- Do not need Jasper, see HIVE-19799 -->
<!-- End of Hive 2.3 exclusion -->
<!-- hive shims pulls in hive 0.23 and a transitive dependency of the Hadoop version
Hive was built against. This dependency cuts out the YARN/hadoop dependency, which
is needed by Hive to submit work to a YARN cluster.-->
<!-- Begin of Hive 2.3 exclusion -->
<!-- Exclude log4j-slf4j-impl; otherwise a NoClassDefFoundError (NCDFE) is thrown when starting spark-shell -->
<!-- End of Hive 2.3 exclusion -->
<!-- hive-llap-common is needed when registering UDFs in Hive 2.3.
We add it here, otherwise -Phive-provided won't work. -->
<!-- hive-llap-client is needed when running MapReduce tests in Hive 2.3. -->
<groupId>com.esotericsoftware</groupId>
<skipMain>true</skipMain> <!-- skip compile -->
<skip>true</skip> <!-- skip testCompile -->
<!-- Surefire runs all Java tests -->
<!-- Note config is repeated in scalatest config -->
<argLine>-ea -Xmx4g -Xss4m -XX:ReservedCodeCacheSize=${CodeCacheSize} -Dio.netty.tryReflectionSetAccessible=true</argLine>
<!-- Setting SPARK_DIST_CLASSPATH is a simple way to make sure any child processes
launched by the tests have access to the correct test-time classpath. -->
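<!--
Illustrative sketch (not part of the original POM): with the Surefire plugin, environment
variables for the forked test JVM can be set under <environmentVariables>. The property name
test_classpath is an assumption, standing in for wherever the build stores the test classpath.

<configuration>
  <environmentVariables>
    <SPARK_DIST_CLASSPATH>${test_classpath}</SPARK_DIST_CLASSPATH>
  </environmentVariables>
</configuration>
-->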
<!-- Needed by sql/hive tests. -->
<!-- Scalatest runs all Scala tests -->
<!-- Note config is repeated in surefire config -->
<argLine>-ea -Xmx4g -Xss4m -XX:ReservedCodeCacheSize=${CodeCacheSize} -Dio.netty.tryReflectionSetAccessible=true</argLine>
<!-- Setting SPARK_DIST_CLASSPATH is a simple way to make sure any child processes
launched by the tests have access to the correct test-time classpath. -->
<!-- Needed by sql/hive tests. -->
<!-- This includes dependencies with 'runtime' and 'compile' scopes;
see the docs for includeScope for more details -->
<!-- This plugin's configuration is used to store Eclipse m2e settings only. -->
<!-- It has no influence on the Maven build itself. -->
<!-- This plugin dumps the test classpath into a file -->
<!-- The shade plug-in is used here to create effective POMs (see SPARK-3812), and also to
remove references to the shaded libraries from artifacts published by Spark. -->
<mkdir dir="${project.build.directory}/tmp" />
<!-- Enable surefire and scalatest in all children, in one place: -->
<!-- Build test-jars for all projects, since some projects depend on tests from others -->
<parameters>${scalafmt.parameters}</parameters> <!-- (Optional) Additional command line arguments -->
<skip>${scalafmt.skip}</skip> <!-- (Optional) skip formatting -->
<configLocation>dev/.scalafmt.conf</configLocation> <!-- (Optional) config location -->
<!-- A couple of dependencies come in bundle format (a bundle is just a normal jar that
contains OSGi metadata in the manifest). If you don't use OSGi, a bundle works like
any other jar. Since Maven doesn't have native bundle support, it needs an external plugin
to handle it. If the plugin is not added, the build can't resolve bundle dependencies. -->
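<!--
Illustrative sketch (not part of the original POM): the Apache Felix maven-bundle-plugin is
one plugin that provides this support. Whether it is the plugin used here is an assumption,
since this excerpt does not name it. Setting <extensions>true</extensions> lets Maven resolve
dependencies whose packaging type is "bundle".

<plugin>
  <groupId>org.apache.felix</groupId>
  <artifactId>maven-bundle-plugin</artifactId>
  <extensions>true</extensions>
</plugin>
-->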
<!-- This profile is enabled automatically by the sbt build. It changes the scope for shaded
dependencies, since we don't shade them in the artifacts generated by the sbt build. -->
<!-- Ganglia integration is not included by default due to LGPL-licensed code -->
<!-- Kinesis integration is not included by default due to ASL-licensed code -->
<!-- A series of build profiles where customizations for particular Hadoop releases can be made -->
<!-- Hadoop-a.b.c dependencies can be found at
<!-- Default hadoop profile. Uses global properties. -->
<!-- Default hive profile. Uses global properties. -->
<!-- generally not enabled for automated builds, but will run k8s tests -->
<!-- This is a profile to enable the use of the ASF snapshot and staging repositories
during a build. It is useful when testing against nightly or RC releases of dependencies.
It MUST NOT be used when building copies of Spark to use in production or for distribution. -->
<!-- override point for ASF staging/snapshot repos -->
<id>ASF Staging</id>
<id>ASF Snapshots</id>
<id>ASF Staging</id>
<id>ASF Snapshots</id>
<!-- These empty profiles are available in some sub-modules. Declare them here so that
Maven does not complain when they're provided on the command line for a sub-module
that does not have them. -->
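<!--
Illustrative sketch (not part of the original POM): an empty profile needs only an <id>.
The id below is a hypothetical example, since this excerpt does not list the profile names.

<profile>
  <id>hadoop-provided</id>
</profile>

With such a declaration in the parent, a command like "mvn -Phadoop-provided package" should
not warn about an unknown profile in modules that do not define it themselves.
-->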
<!-- use org.openlabtesting.leveldbjni on aarch64 platform -->
