This repository was archived by the owner on Jan 26, 2026. It is now read-only.

Commit f34f5ea

Updating README (#71)
* Updating README, fixing a few oversights when renaming sharpy
1 parent e03d50b

File tree: 3 files changed, +52 -31 lines

README.md

Lines changed: 49 additions & 28 deletions
@@ -1,13 +1,13 @@
-[![.github/workflows/ci.yml](https://github.com/intel-sandbox/personal.fschlimb.sharpy/actions/workflows/ci.yml/badge.svg)](https://github.com/intel-sandbox/personal.fschlimb.sharpy/actions/workflows/ci.yml)
-# Distributed Data-Parallel Python Array
+[![.github/workflows/ci.yml](https://github.com/intel-sandbox/sharpy/actions/workflows/ci.yml/badge.svg)](https://github.com/intel-sandbox/sharpy/actions/workflows/ci.yml)
+# Distributed Python Array
 An array implementation following the [array API as defined by the data-API consortium](https://data-apis.org/array-api/latest/index.html).
-It supports a controller-worker execution model as well as a CSP-like execution.
+Parallel and distributed execution is currently MPI/CSP-like. Support for a controller-worker execution model will be added in a later version.
 
 ## Setting up build environment
 Install MLIR/LLVM and IMEX from branch dist-ndarray (see https://github.com/intel-innersource/frameworks.ai.mlir.mlir-extensions/tree/dist-ndarray).
 ```bash
-git --recurse-submodules clone https://github.com/intel-sandbox/personal.fschlimb.sharpy
-cd personal.fschlimb.sharpy
+git clone --recurse-submodules https://github.com/intel-sandbox/sharpy
+cd sharpy
 git checkout jit
 conda env create -f conda-env.yml -n sharpy
 conda activate sharpy
@@ -19,28 +19,16 @@ export IMEXROOT=<your-IMEX-install-dir>
 ```bash
 python setup.py develop
 ```
-If your compiler does not default to a recent version, try something like `CC=gcc-9 CXX=g++-9 python setup.py develop`
-
-## Running Tests [non functional]
-__Test are currently not operational on this branch.__
+If your compiler does not default to a recent version (e.g. g++ >= 9), try something like `CC=gcc-9 CXX=g++-9 python setup.py develop`.
 
+## Running Tests
 ```bash
 # single rank
 pytest test
-# multiple ranks, controller-worker, controller spawns ranks
-SHARPY_MPI_SPAWN=$NoW PYTHON_EXE=`which python` pytest test
-# multiple ranks, controller-worker, mpirun
-mpirun -n $N python -m pytest test
-# multiple ranks, CSP
-SHARPY_CW=0 mpirun -n $N python -m pytest test
+# distributed on multiple ($N) ranks/processes
+SHARPY_IDTR_SO=`pwd`/sharpy/libidtr.so mpirun -n $N python -m pytest test
 ```
 
-If SHARPY_MPI_SPAWN is set it spawns the provided number of MPI processes.
-By default new processes launch python executing a worker loop.
-This requires setting PYTHON_EXE.
-Alternatively SHARPY_MPI_EXECUTABLE and SHARPY_MPI_EXE_ARGS are used.
-Additionally SHARPY_MPI_HOSTS can be used to control the host to use for spawning processes.
-
 ## Running
 ```python
 import sharpy as sp
@@ -65,6 +53,21 @@ and multi-process run is executed like
 SHARPY_IDTR_SO=`pwd`/sharpy/libidtr.so mpirun -n 5 python simple.py
 ```
 
+### Distributed Execution without mpirun
+Instead of using mpirun to launch a set of ranks/processes, you can tell the runtime to
+spawn ranks/processes for you by setting SHARPY_MPI_SPAWN to the number of desired MPI processes.
+In that case, also set SHARPY_MPI_EXECUTABLE and SHARPY_MPI_EXE_ARGS.
+Additionally, SHARPY_MPI_HOSTS can be used to control the hosts to use for spawning processes.
+
+The following command will run the stencil example on 3 MPI ranks (the launching process plus 2 spawned processes):
+```bash
+SHARPY_IDTR_SO=`pwd`/sharpy/libidtr.so \
+SHARPY_MPI_SPAWN=2 \
+SHARPY_MPI_EXECUTABLE=`which python` \
+SHARPY_MPI_EXE_ARGS="examples/stencil-2d.py 10 2000 star 2" \
+python examples/stencil-2d.py 10 2000 star 2
+```
+
 ## Contributing
 Please setup precommit hooks like this
 ```
@@ -78,11 +81,10 @@ Typically, sharpy operations do not get executed immediately. Instead, the funct
 the actual computation gets deferred by creating a promise/deferred object and queuing it for later. This is not visible to users, they can use it as any other numpy-like library.
 
 Only when actual data is needed, computation will happen; that is when
-- the values of array elements are casted to bool int or float
-- the array is printed
+- the values of array elements are cast to bool, int, float or string
+- the array is printed (which implies such a cast)
 
-In the background a worker thread handles deferred objects. Until computation is needed it dequeues deferred objects from the FIFO queue and asks them to generate MLIR.
-Objects can either generate MLIR or instead provide a run() function to immediately execute. For the latter case the current MLIR function gets executed before calling run() to make sure potential dependences are met.
+In the background a worker thread handles deferred objects. Until computation is needed it dequeues deferred objects from the FIFO queue and asks them to generate MLIR. Objects can either generate MLIR or instead provide a run() function to execute immediately. In the latter case the current MLIR function gets executed before calling run() to make sure potential dependencies are met.
 
 ### Distribution
 Arrays and operations on them get transparently distributed across multiple processes. Respective functionality is partly handled by this library and partly by the IMEX dist dialect.
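The deferred-execution model described above can be pictured with a minimal sketch. This is not code from the repository: `ones` and `sum` are taken from the Array API coverage list further below, while default dtypes and any required initialization calls are assumptions.

```python
import sharpy as sp

# Minimal sketch of the lazy-evaluation behaviour; any required runtime
# initialization (e.g. an sp.init(...) call as seen in src/MPITransceiver.cpp)
# is omitted and assumed to have happened.

a = sp.ones((4, 4))   # queued as a deferred object, nothing is computed yet
b = a + a             # still deferred; recorded for later MLIR generation
s = sp.sum(b)         # reductions are deferred as well

# Casting array data to a Python bool/int/float/str (printing included)
# is what forces the queued operations to be compiled and executed.
print(float(s))       # triggers execution; expected to print 32.0
```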
@@ -91,6 +93,25 @@ sharpy provides this library functionality in a separate dynamic library "idtr".
 
 Right now, data is split in the first dimension (only). Each process knows the partition it owns. For optimization partitions can actually overlap.
 
-sharpy supports to execution modes:
-1. CSP/SPMD/explicitly-distributed execution, meaning all processes execute the same program, execution is replicated on all processes. Data is typically not replicated but distributed among processes.
-2. Controller-Worker/implicitly-distributed execution, meaning only a single process executes the program and it distributes data and work to worker processes.
+sharpy currently supports one execution mode: CSP/SPMD/explicitly-distributed execution, meaning all processes execute the same program; execution is replicated on all processes. Data is typically not replicated but distributed among processes. The distribution is handled automatically by sharpy, and all operations on sharpy arrays can be viewed as collective operations.
+
+Later, we'll add a Controller-Worker/implicitly-distributed execution mode, meaning only a single process executes the program and it distributes data and work to worker processes.
+
+### Array API Coverage
+Currently only a subset of the Array API is covered by sharpy:
+- elementwise binary operations
+- elementwise unary operations
+- subviews (getitem with slices)
+- assignment (setitem with slices)
+- `empty`, `zeros`, `ones`, `linspace`, `arange`
+- reduction operations over all dimensions (max, min, sum, ...)
+- type promotion
+- many cases of shape broadcasting
+
+### Other Functionality
+- `sharpy.to_numpy` converts a sharpy array into a numpy array.
+- `sharpy.numpy.from_function` allows creating a sharpy array from a function (similar to numpy).
+- In addition to the Array API, sharpy also provides functionality that facilitates interacting with sharpy arrays in a distributed environment.
+- `sharpy.spmd.gather` gathers the distributed array and forms a single, local and contiguous copy of the data as a numpy array.
+- `sharpy.spmd.get_locals` returns the local part of the distributed array as a numpy array.
+- sharpy allows providing a fallback array implementation. By setting SHARPY_FALLBACK to a python package, sharpy will call that package whenever a given function is not provided by sharpy itself, passing sharpy arrays as (gathered) numpy arrays.
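To make the coverage list above concrete, the following sketch sticks to the listed features (creation functions, elementwise operations, slicing, reductions). It is illustrative only; default dtypes and signatures beyond the listed names are assumptions.

```python
import sharpy as sp

a = sp.arange(0, 16, 1)   # listed creation function
b = sp.zeros((16,))       # another listed creation function
c = a + b                 # elementwise binary operation (type promotion may apply)
d = -c                    # elementwise unary operation
d[0:4] = c[8:12]          # getitem/setitem with slices (subview + assignment)
m = sp.max(d)             # reduction over all dimensions
print(int(m))             # the int cast forces the deferred computation (expected 11)
```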

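The SPMD helpers listed under "Other Functionality" might be used from an mpirun-launched script roughly as sketched below (launched via mpirun as shown in the Running section). Only names given above are used; the exact per-rank return types of `gather` and `get_locals` are assumptions.

```python
import sharpy as sp

# Sketch of the distributed helpers; run under mpirun as shown in the README.
a = sp.ones((8, 8))            # distributed array; data is split along the first dimension
b = a + a                      # operations act as collectives across all ranks

full = sp.spmd.gather(b)       # a single, local and contiguous numpy copy of the whole array
local = sp.spmd.get_locals(b)  # the local partition owned by this rank, as numpy data
as_np = sp.to_numpy(b)         # plain conversion of a sharpy array into a numpy array

print(type(full), type(local), type(as_np))
```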
src/MPITransceiver.cpp

Lines changed: 2 additions & 2 deletions
@@ -57,9 +57,9 @@ MPITransceiver::MPITransceiver(bool is_cw)
 "'SHARPY_MPI_EXECUTABLE' or 'PYTHON_EXE'");
 clientExe = _tmp;
 // 2. arguments
-_tmp = "-c import FutureArray as dt; dt.init(True)";
+_tmp = "-c import sharpy as sp; sp.init(True)";
 args.push_back("-c");
-args.push_back("import FutureArray as dt; dt.init(True)");
+args.push_back("import sharpy as sp; sp.init(True)");
 } else {
 clientExe = _tmp;
 // 2. arguments

test/test_red.py

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-import sharpy as dt
+import sharpy as sp
 from utils import runAndCompare
 import pytest
 