1 Star 0 Fork 0

yunnai/kokkos

加入 Gitee
与超过 1200万 开发者一起发现、参与优秀开源项目,私有仓库也完全免费 :)
免费加入
克隆/下载
贡献代码
同步代码
取消
提示: 由于 Git 不支持空文件夾,创建文件夹后会生成空的 .keep 文件
Loading...
README
BSD-3-Clause
Kokkos Core implements a programming model in C++ for writing performance portable
applications targeting all major HPC platforms. For that purpose it provides
abstractions for both parallel execution of code and data management.
Kokkos is designed to target complex node architectures with N-level memory
hierarchies and multiple types of execution resources. It currently can use
OpenMP, Pthreads and CUDA as backend programming models.

Kokkos Core is part of the Kokkos C++ Performance Portability Programming EcoSystem,
which also provides math kernels (https://github.com/kokkos/kokkos-kernels), as well as 
profiling and debugging tools (https://github.com/kokkos/kokkos-tools).  

# Learning about Kokkos

A programming guide can be found on the Wiki, the API reference is under development.

For questions find us on Slack: https://kokkosteam.slack.com or open a github issue.

For non-public questions send an email to
crtrott(at)sandia.gov

A separate repository with extensive tutorial material can be found under 
https://github.com/kokkos/kokkos-tutorials.

Furthermore, the 'example/tutorial' directory provides step by step tutorial
examples which explain many of the features of Kokkos. They work with
simple Makefiles. To build with g++ and OpenMP simply type 'make'
in the 'example/tutorial' directory. This will build all examples in the
subfolders. To change the build options refer to the Programming Guide
in the compilation section.

To learn more about Kokkos consider watching one of our presentations:
* GTC 2015:
  - http://on-demand.gputechconf.com/gtc/2015/video/S5166.html
  - http://on-demand.gputechconf.com/gtc/2015/presentation/S5166-H-Carter-Edwards.pdf


# Contributing to Kokkos

We are open and try to encourage contributions from external developers. 
To do so please first open an issue describing the contribution and then issue
a pull request against the develop branch. For larger features it may be good
to get guidance from the core development team first through the github issue. 

Note that Kokkos Core is licensed under standard 3-clause BSD terms of use. 
Which means contributing to Kokkos allows anyone else to use your contributions
not just for public purposes but also for closed source commercial projects.
For specifics see the LICENSE file contained in the repository or distribution.

# Requirements

### Primary tested compilers on X86 are:
  * GCC 4.8.4
  * GCC 4.9.3
  * GCC 5.1.0
  * GCC 5.5.0
  * GCC 6.1.0
  * GCC 7.2.0
  * GCC 7.3.0
  * GCC 8.1.0
  * Intel 15.0.2
  * Intel 16.0.1
  * Intel 17.0.1
  * Intel 17.4.196
  * Intel 18.2.128
  * Clang 3.6.1
  * Clang 3.7.1
  * Clang 3.8.1
  * Clang 3.9.0
  * Clang 4.0.0
  * Clang 6.0.0 for CUDA (CUDA Toolkit 9.0)
  * Clang 7.0.0 for CUDA (CUDA Toolkit 9.1)
  * PGI 18.7
  * NVCC 7.5 for CUDA (with gcc 4.8.4)
  * NVCC 8.0.44 for CUDA (with gcc 5.3.0)
  * NVCC 9.1 for CUDA (with gcc 6.1.0)
  * NVCC 9.2 for CUDA (with gcc 7.2.0)
  * NVCC 10.0 for CUDA (with gcc 7.4.0)

### Primary tested compilers on Power 8 are:
  * GCC 6.4.0 (OpenMP,Serial)
  * GCC 7.2.0 (OpenMP,Serial)
  * IBM XL 16.1.0 (OpenMP, Serial)
  * NVCC 9.2.88 for CUDA (with gcc 7.2.0 and XL 16.1.0)

### Primary tested compilers on Intel KNL are:
  * Intel 16.4.258 (with gcc 4.7.2)
  * Intel 17.2.174 (with gcc 4.9.3)
  * Intel 18.2.199 (with gcc 4.9.3)

### Primary tested compilers on ARM (Cavium ThunderX2)
  * GCC 7.2.0 
  * ARM/Clang 18.4.0
  
### Other compilers working:
  * X86:
   - Cygwin 2.1.0 64bit with gcc 4.9.3
   - GCC 8.1.0 (not warning free)

### Known non-working combinations:
  * Power8:
   - Pthreads backend
  * ARM
   - Pthreads backend


Primary tested compiler are passing in release mode
with warnings as errors. They also are tested with a comprehensive set of 
backend combinations (i.e. OpenMP, Pthreads, Serial, OpenMP+Serial, ...).
We are using the following set of flags:
GCC:   -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits
       -Wignored-qualifiers -Wempty-body -Wclobbered -Wuninitialized
Intel: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized
Clang: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized
NVCC:  -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized

Other compilers are tested occasionally, in particular when pushing from develop to 
master branch, without -Werror and only for a select set of backends.

# Running Unit Tests

To run the unit tests create a build directory and run the following commands

KOKKOS_PATH/generate_makefile.bash
make build-test
make test

Run KOKKOS_PATH/generate_makefile.bash --help for more detailed options such as
changing the device type for which to build.

# Installing the library

To install Kokkos as a library create a build directory and run the following

KOKKOS_PATH/generate_makefile.bash --prefix=INSTALL_PATH
make kokkoslib
make install

KOKKOS_PATH/generate_makefile.bash --help for more detailed options such as
changing the device type for which to build.

Note that in many cases it is preferable to build Kokkos inline with an 
application. The main reason is that you may otherwise need many different
configurations of Kokkos installed depending on the required compile time
features an application needs. For example there is only one default 
execution space, which means you need different installations to have OpenMP
or Pthreads as the default space. Also for the CUDA backend there are certain
choices, such as allowing relocatable device code, which must be made at 
installation time. Building Kokkos inline uses largely the same process
as compiling an application against an installed Kokkos library. See for 
example benchmarks/bytes_and_flops/Makefile which can be used with an installed
library and for an inline build.  

### CMake

Kokkos supports being build as part of a CMake applications. An example can 
be found in example/cmake_build. 

# Kokkos and CUDA UVM

Kokkos does support UVM as a specific memory space called CudaUVMSpace. 
Allocations made with that space are accessible from host and device. 
You can tell Kokkos to use that as the default space for Cuda allocations.
In either case UVM comes with a number of restrictions:
(i) You can't access allocations on the host while a kernel is potentially 
running. This will lead to segfaults. To avoid that you either need to 
call Kokkos::Cuda::fence() (or just Kokkos::fence()), after kernels, or
you can set the environment variable CUDA_LAUNCH_BLOCKING=1.
Furthermore in multi socket multi GPU machines without NVLINK, UVM defaults 
to using zero copy allocations for technical reasons related to using multiple
GPUs from the same process. If an executable doesn't do that (e.g. each
MPI rank of an application uses a single GPU [can be the same GPU for 
multiple MPI ranks]) you can set CUDA_MANAGED_FORCE_DEVICE_ALLOC=1.
This will enforce proper UVM allocations, but can lead to errors if 
more than a single GPU is used by a single process.


# Citing Kokkos

If you publish work which mentions Kokkos, please cite the following paper:

@article{CarterEdwards20143202,
title = "Kokkos: Enabling manycore performance portability through polymorphic memory access patterns ",
journal = "Journal of Parallel and Distributed Computing ",
volume = "74",
number = "12",
pages = "3202 - 3216",
year = "2014",
note = "Domain-Specific Languages and High-Level Frameworks for High-Performance Computing ",
issn = "0743-7315",
doi = "https://doi.org/10.1016/j.jpdc.2014.07.003",
url = "http://www.sciencedirect.com/science/article/pii/S0743731514001257",
author = "H. Carter Edwards and Christian R. Trott and Daniel Sunderland"
}
//@HEADER // ************************************************************************ // // Kokkos v. 2.0 // Copyright (2014) Sandia Corporation // // Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation, // the U.S. Government retains certain rights in this software. // // Redistribution and use in source and binary forms, with or without // modification, are permitted provided that the following conditions are // met: // // 1. Redistributions of source code must retain the above copyright // notice, this list of conditions and the following disclaimer. // // 2. Redistributions in binary form must reproduce the above copyright // notice, this list of conditions and the following disclaimer in the // documentation and/or other materials provided with the distribution. // // 3. Neither the name of the Corporation nor the names of the // contributors may be used to endorse or promote products derived from // this software without specific prior written permission. // // THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY // EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR // PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE // CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, // EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, // PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR // PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF // LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // // Questions? Contact Christian R. Trott (crtrott@sandia.gov) // // ************************************************************************ //@HEADER

简介

Kokkos C++ Performance Portability Programming EcoSystem: The Programming Model - Parallel Execution and Memory Abstraction 展开 收起
C++
BSD-3-Clause
取消

发行版

暂无发行版

贡献者

全部

近期动态

加载更多
不能加载更多了
马建仓 AI 助手
尝试更多
代码解读
代码找茬
代码优化
C++
1
https://gitee.com/yunnai/kokkos.git
git@gitee.com:yunnai/kokkos.git
yunnai
kokkos
kokkos
master

搜索帮助

0d507c66 1850385 C8b1a773 1850385