This repository has been archived by the owner on Nov 25, 2018. It is now read-only.

merge
usewits committed Oct 21, 2015
2 parents 2fd3a6c + 56ca284 commit 8cb4c0f
Showing 12 changed files with 200 additions and 177 deletions.
2 changes: 1 addition & 1 deletion docs/basic.rst
@@ -5,7 +5,7 @@
Getting started: The Basics
===========================

EBSP programs are written in single-program multiple-data *(SPMD)* style. This means that each core runs the same code, but obtains different data. Later we will see how we can transfer data to and from the Epiphany cores, but for now our first step will be to get the cores to output their designated core number (called ``pid`` for *processor identifier*). Like all programs written for the Parallella, an EBSP program consists of two parts. One part contains the code that runs on the *host processor*, the ARM chip that hosts the Linux OS. The other part contains the code that runs on each Epiphany core. In heterogeneous computing it is common to call this second part the *kernel*.
EBSP programs are written in *SPMD* (single-program multiple-data) style. This means that each core runs the same code, but obtains different data. Later we will see how we can transfer data to and from the Epiphany cores, but for now our first step will be to get the cores to output their designated core number (called ``pid`` for *processor identifier*). Like all programs written for the Parallella, an EBSP program consists of two parts. One part contains the code that runs on the *host processor*, the ARM chip that hosts the Linux OS. The other part contains the code that runs on each Epiphany core. In heterogeneous computing it is common to call this second part the *kernel*.
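
Review note on the SPMD paragraph: a minimal host-only sketch in plain C may help make "same code, different data" concrete. It does not use the EBSP API. `kernel`, `run_all`, `NPROCS`, and the block distribution are illustrative assumptions, and the sequential loop only stands in for cores that would run in parallel on the Epiphany.

```c
#include <assert.h>

#define NPROCS 4 /* assumed core count, for illustration only */

/* The same function runs for every pid; only the slice of data differs.
 * Here each "core" sums its own contiguous block of the input. */
int kernel(int pid, const int* data, int n) {
    assert(pid >= 0 && pid < NPROCS);
    assert(n % NPROCS == 0); /* keep the sketch simple: equal block sizes */
    int block = n / NPROCS;
    int sum = 0;
    for (int i = pid * block; i < (pid + 1) * block; ++i)
        sum += data[i];
    return sum;
}

/* Stand-in for the hardware: call the kernel once per pid, sequentially. */
int run_all(const int* data, int n) {
    int total = 0;
    for (int pid = 0; pid < NPROCS; ++pid)
        total += kernel(pid, data, n);
    return total;
}
```

In a real EBSP kernel the pid would come from `bsp_pid()` and the partial results would be combined through communication rather than a loop.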

Hello World!
------------
Binary file modified docs/img/coduin_logo.png
Binary file modified docs/img/coduin_logo_small.png
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -23,7 +23,7 @@ The EBSP library is released under the LGPLv3. The `source code <https://github.
About Coduin
------------

Coduin (formerly Buurlage Wits) is a small company based in Utrecht, the Netherlands. Next to our work on software libraries and models for many-core processors in embedded systems, we are also active in the area of data analysis and predictive modelling.
`Coduin <http://codu.in>`_ (formerly Buurlage Wits) is a small company based in Utrecht, the Netherlands. Next to our work on software libraries and models for many-core processors in embedded systems, we are also active in the area of data analysis and predictive modelling.

.. image:: img/coduin_logo.png
:width: 250 px
8 changes: 4 additions & 4 deletions docs/introduction.rst
@@ -17,10 +17,10 @@ occurance of all communications within a step. The BSP model can be used to writ
:width: 450 px
:align: center

This library (EBSP) provides an implementation of the model on top of the Epiphany SDK (ESDK).
Our library (EBSP) provides an implementation of the model on top of the Epiphany SDK (ESDK).
This allows the BSP computing model to be used with the Epiphany
architecture developed by Adapteva_.
In particular this library has been implemented and tested on the
In particular EBSP has been implemented and tested on the
Parallella_ board. Our goal is to
allow current BSP programs to be run on the Epiphany architecture
with minimal modifications.
@@ -95,5 +95,5 @@ For more advanced use you can download the latest EBSP release from the `release
rm -r bin

.. _BSP: http://en.wikipedia.org/wiki/Bulk_synchronous_parallel
.. _Adapteva:
.. _Parallella:
.. _Adapteva: http://www.adapteva.com
.. _Parallella: http://www.parallella.org
149 changes: 101 additions & 48 deletions examples/lu_decomposition/e_lu_decomposition.c
@@ -1,9 +1,6 @@
/*
This file is part of the Epiphany BSP library.
Copyright (C) 2014-2015 Buurlage Wits
Support e-mail: <[email protected]>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License (LGPL)
as published by the Free Software Foundation, either version 3 of the
@@ -30,14 +27,14 @@ see the files COPYING and COPYING.LESSER. If not, see
#include "e-lib.h"

#include <math.h>
#include "common.h"

int M = 0;
int N = 0;
int dim = 0;
int s = 0;
int t = 0;
int entries_per_col = 0;
float* matrix;

int proc_id(int s, int t) { return s * M + t; }

@@ -53,62 +50,96 @@ int gtl(int i, int j) {
return (i / M) * (dim / M) + (j / M);
}

float* a(int i, int j) { return (float*)_LOC_MATRIX + gtl(i, j); }
float* a(int i, int j) { return (float*)matrix + gtl(i, j); }
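
Review note on the index helpers: they map a global matrix entry to an owning core and to an offset in that core's local block. Below is a standalone sketch of a 2D cyclic distribution, assuming an N x M core grid and a `dim` divisible by both N and M. `owner_pid` and `local_index` are hypothetical names, not functions from this example. One difference worth checking: this sketch divides the row index by N, while `gtl` in the diff divides it by M, so the two agree only on a square grid (N == M).

```c
#include <assert.h>

/* Core that owns global entry (i, j) on an N x M grid: row i lives on
 * grid row i % N, column j on grid column j % M. */
int owner_pid(int i, int j, int N, int M) {
    assert(i >= 0 && j >= 0 && N > 0 && M > 0);
    return (i % N) * M + (j % M);
}

/* Offset of global entry (i, j) inside the owner's row-major local block,
 * which holds dim / N rows and dim / M columns. */
int local_index(int i, int j, int N, int M, int dim) {
    assert(N > 0 && M > 0 && dim % N == 0 && dim % M == 0);
    assert(i >= 0 && i < dim && j >= 0 && j < dim);
    return (i / N) * (dim / M) + (j / M);
}
```

Taken together, the pair (`owner_pid`, `local_index`) should hit every slot of every core exactly once, which is a cheap property to assert in a host-side test.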

void get_initial_data(int* M, int* N, int* dim, float* matrix) {
int packets = 0;
int accum_bytes = 0;
int status = 0;
int tag = 0;

bsp_qsize(&packets, &accum_bytes);
for (int i = 0; i < packets; i++) {
bsp_get_tag(&status, &tag);
if (tag == 0)
bsp_move(M, sizeof(int));
if (tag == 1)
bsp_move(N, sizeof(int));
if (tag == 2)
bsp_move(dim, sizeof(int));
if (tag == 3) {
bsp_move(matrix, status);
}
}
}
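
Review note on `get_initial_data`: the function drains the message queue and routes each payload by its tag. The sketch below shows the same dispatch in plain C with a simulated queue, so it can be tested off the board. `message_t`, the `TAG_*` names, and `unpack_initial_data` are hypothetical; on the device, `bsp_get_tag` and `bsp_move` do the work that the struct fields and `memcpy` do here. Also worth a second look: the call site passes `&N, &M`, while the parameters are declared in the order `M, N`. Whether that swap is intentional is not clear from this diff.

```c
#include <assert.h>
#include <string.h>

enum { TAG_M = 0, TAG_N = 1, TAG_DIM = 2, TAG_MATRIX = 3 };

/* Hypothetical stand-in for one queued BSP message. */
typedef struct {
    int tag;
    const void* payload;
    int nbytes;
} message_t;

/* Copy each message into the destination selected by its tag.
 * Returns the number of messages that were recognised and copied. */
int unpack_initial_data(const message_t* queue, int packets,
                        int* M, int* N, int* dim,
                        float* matrix, int matrix_capacity_bytes) {
    int handled = 0;
    for (int i = 0; i < packets; ++i) {
        const message_t* msg = &queue[i];
        switch (msg->tag) {
        case TAG_M:
            assert(msg->nbytes == (int)sizeof(int));
            memcpy(M, msg->payload, sizeof(int));
            break;
        case TAG_N:
            assert(msg->nbytes == (int)sizeof(int));
            memcpy(N, msg->payload, sizeof(int));
            break;
        case TAG_DIM:
            assert(msg->nbytes == (int)sizeof(int));
            memcpy(dim, msg->payload, sizeof(int));
            break;
        case TAG_MATRIX:
            /* Guard against a payload larger than the local buffer. */
            assert(msg->nbytes >= 0 && msg->nbytes <= matrix_capacity_bytes);
            memcpy(matrix, msg->payload, (size_t)msg->nbytes);
            break;
        default:
            continue; /* unknown tag: skip without counting it */
        }
        ++handled;
    }
    return handled;
}
```

A `switch` makes it explicit that the tags are mutually exclusive, and the return value gives the caller a way to detect missing or unexpected messages.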

int main() {
bsp_begin();

int p = bsp_pid();

M = (*(int*)_LOC_M);
N = (*(int*)_LOC_N);
dim = (*(int*)_LOC_DIM);
int* pi_out;
ebsp_open_up_stream((void**)&matrix, 0);
get_initial_data(&N, &M, &dim, matrix);

// cache data locations
int* LOC_RS = (int*)_LOC_RS;
float* LOC_ARK = (float*)_LOC_ARK;
int* LOC_R = (int*)_LOC_R;
int* LOC_PI = (int*)_LOC_PI;
int* LOC_PI_IN = (int*)_LOC_PI_IN;
float* LOC_ROW_IN = (float*)_LOC_ROW_IN;
float* LOC_COL_IN = (float*)_LOC_COL_IN;
entries_per_col = dim / N;

s = p / M;
t = p % M;

entries_per_col = dim / N;
if (t == 0)
ebsp_open_up_stream((void**)&pi_out, 1);

// cache data locations
int* loc_rs = ebsp_malloc(M * sizeof(int));
float* loc_ark = ebsp_malloc(M * sizeof(float));
int r = 0;
int* loc_pi = ebsp_malloc(entries_per_col * sizeof(int));
int* loc_pi_in = ebsp_malloc(2 * sizeof(int));
float* loc_row_in = ebsp_malloc(sizeof(float) * dim);
float* loc_col_in = ebsp_malloc(sizeof(float) * dim);

// register variable to store r and a_rk
// need arrays equal to number of procs in our proc column
bsp_push_reg((void*)LOC_RS, sizeof(int) * N);
bsp_push_reg((void*)loc_rs, sizeof(int) * N);
bsp_sync();

bsp_push_reg((void*)LOC_ARK, sizeof(float) * N);
bsp_push_reg((void*)loc_ark, sizeof(float) * N);
bsp_sync();

bsp_push_reg((void*)LOC_R, sizeof(int));
bsp_push_reg((void*)&r, sizeof(int));
bsp_sync();

bsp_push_reg((void*)LOC_PI_IN, sizeof(int));
bsp_push_reg((void*)loc_pi_in, sizeof(int));
bsp_sync();

bsp_push_reg((void*)LOC_ROW_IN, sizeof(int));
bsp_push_reg((void*)loc_row_in, sizeof(int));
bsp_sync();

bsp_push_reg((void*)LOC_COL_IN, sizeof(int));
bsp_push_reg((void*)loc_col_in, sizeof(int));
bsp_sync();


// also initialize pi as identity
if (t == 0)
for (int i = 0; i < entries_per_col; ++i)
*((int*)LOC_PI + i) = s + i * N;
for (int i = 0; i < entries_per_col; ++i) {
loc_pi[i] = s + i * N;

}

for (int k = 0; k < dim; ++k) {
if (t == 0)
for (int i = 0; i < entries_per_col; ++i) {
}


ebsp_barrier();

//----------------------
// STAGE 1: Pivot search
//----------------------
if (k % M == t) {

// COMPUTE PIVOT IN COLUMN K
int rs = -1;
float a_rk = -1.0f;
@@ -128,32 +159,34 @@ int main() {
// HORIZONTAL COMMUNICATION
for (int j = 0; j < N; ++j) {
// put r_s in P(*,t)
bsp_hpput(proc_id(j, t), &rs, (void*)LOC_RS, s * sizeof(int),
bsp_hpput(proc_id(j, t), &rs, (void*)loc_rs, s * sizeof(int),
sizeof(int));

// put a_(r_s, k) in P(*,t)
bsp_hpput(proc_id(j, t), &a_rk, (void*)LOC_ARK,
bsp_hpput(proc_id(j, t), &a_rk, (void*)loc_ark,
s * sizeof(float), sizeof(float));
}


bsp_sync(); // (0) + (1)

a_rk = -1.0f;
for (int j = 0; j < N; ++j) {
if (*((int*)LOC_RS + j) < 0)
if (*((int*)loc_rs + j) < 0)
continue;

float val = fabsf(*(((float*)LOC_ARK + j)));
float val = fabsf(*(((float*)loc_ark + j)));

if (val > a_rk) {
a_rk = val;
rs = *((int*)LOC_RS + j);
rs = *((int*)loc_rs + j);
}
}


// put r in P(s, *)
for (int j = 0; j < M; ++j) {
bsp_hpput(proc_id(s, j), &rs, (void*)LOC_R, 0, sizeof(int));
bsp_hpput(proc_id(s, j), &rs, (void*)&r, 0, sizeof(int));
}

bsp_sync(); // (2) + (3)
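
Review note on stage 1: after the first sync, every core in the column holds one candidate row and value per grid row and reduces them to a single pivot. A standalone sketch of that reduction follows. `select_pivot_row` is a hypothetical helper, not part of the example, and it computes the absolute value by hand so it needs no math library.

```c
#include <assert.h>

/* Among n candidates, pick the row whose value has the largest magnitude.
 * A negative row index marks "no candidate from this core" and is skipped.
 * Returns the winning row index, or -1 if every candidate was empty. */
int select_pivot_row(const int* rows, const float* values, int n) {
    assert(n >= 0);
    int best_row = -1;
    float best_abs = -1.0f;
    for (int j = 0; j < n; ++j) {
        if (rows[j] < 0)
            continue;
        float v = values[j] < 0.0f ? -values[j] : values[j];
        if (v > best_abs) { /* strict comparison: ties keep the first row */
            best_abs = v;
            best_row = rows[j];
        }
    }
    return best_row;
}
```

Because every core in the column sees the same candidates in the same order, each one arrives at the same pivot without further communication.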
@@ -165,38 +198,37 @@ int main() {
// ----------------------------
// STAGE 2: Index and row swaps
// ----------------------------
int r = *((int*)LOC_R);

if (k % N == s && t == 0) {
bsp_hpput(proc_id(r % N, 0), ((int*)LOC_PI + (k / N)),
(void*)LOC_PI_IN, 0, sizeof(int));
bsp_hpput(proc_id(r % N, 0), ((int*)loc_pi + (k / N)),
(void*)loc_pi_in, 0, sizeof(int));
}

if (r % N == s && t == 0) {
// here offset is set to one in case k % N == r % N
bsp_hpput(proc_id(k % N, 0), ((int*)LOC_PI + (r / N)),
(void*)LOC_PI_IN, sizeof(int), sizeof(int));
bsp_hpput(proc_id(k % N, 0), ((int*)loc_pi + (r / N)),
(void*)loc_pi_in, sizeof(int), sizeof(int));
}

bsp_sync(); // (4)

if (k % N == s && t == 0) {
*((int*)LOC_PI + (k / N)) = *((int*)LOC_PI_IN + 1);
*((int*)loc_pi + (k / N)) = *((int*)loc_pi_in + 1);
}

if (r % N == s && t == 0)
*((int*)LOC_PI + (r / N)) = *((int*)LOC_PI_IN);
*((int*)loc_pi + (r / N)) = *((int*)loc_pi_in);

if (k % N == s) { // need to swap rows with row r
for (int j = t; j < dim; j += M) {
bsp_hpput(proc_id(r % N, t), a(k, j), (void*)LOC_ROW_IN,
bsp_hpput(proc_id(r % N, t), a(k, j), (void*)loc_row_in,
sizeof(float) * (j - t) / M, sizeof(float));
}
}

if (r % N == s) { // need to swap rows with row k
for (int j = t; j < dim; j += M) {
bsp_hpput(proc_id(k % N, t), a(r, j), (void*)LOC_COL_IN,
bsp_hpput(proc_id(k % N, t), a(r, j), (void*)loc_col_in,
sizeof(float) * (j - t) / M, sizeof(float));
}
}
@@ -205,12 +237,12 @@ int main() {

if (k % N == s) {
for (int j = t; j < dim; j += M) {
*a(k, j) = *((float*)LOC_COL_IN + (j - t) / M);
*a(k, j) = *((float*)loc_col_in + (j - t) / M);
}
}
if (r % N == s) {
for (int j = t; j < dim; j += M) {
*a(r, j) = *((float*)LOC_ROW_IN + (j - t) / M);
*a(r, j) = *((float*)loc_row_in + (j - t) / M);
}
}
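
Review note on stage 2: the distributed swap sends each piece of rows k and r to the partner's receive buffer and copies it back after the sync. For comparison, here is the sequential counterpart on a full row-major matrix, which can serve as a reference when checking results on the host. `swap_rows` is a hypothetical helper and assumes the whole matrix and permutation vector sit in one address space.

```c
#include <assert.h>

/* Swap rows k and r of a dim x dim row-major matrix and record the swap
 * in the permutation vector pi. */
void swap_rows(float* a, int* pi, int dim, int k, int r) {
    assert(k >= 0 && k < dim && r >= 0 && r < dim);
    if (k == r)
        return; /* nothing to do when the pivot is already in place */
    int tmp_pi = pi[k];
    pi[k] = pi[r];
    pi[r] = tmp_pi;
    for (int j = 0; j < dim; ++j) {
        float tmp = a[k * dim + j];
        a[k * dim + j] = a[r * dim + j];
        a[r * dim + j] = tmp;
    }
}
```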

@@ -222,7 +254,7 @@ int main() {
if (k % N == s && k % M == t) {
// put a_kk in P(*, t)
for (int j = 0; j < N; ++j) {
bsp_hpput(proc_id(j, t), a(k, k), (void*)LOC_ROW_IN, 0,
bsp_hpput(proc_id(j, t), a(k, k), (void*)loc_row_in, 0,
sizeof(float));
}
}
@@ -239,7 +271,7 @@ int main() {

if (k % M == t) {
for (int i = start_idx; i < dim; i += N) {
*a(i, k) = *a(i, k) / (*((float*)LOC_ROW_IN));
*a(i, k) = *a(i, k) / (*((float*)loc_row_in));
}
}

@@ -248,7 +280,7 @@ int main() {
// put a_ik in P(s, *)
for (int i = start_idx; i < dim; i += N) {
for (int sj = 0; sj < M; ++sj) {
bsp_hpput(proc_id(s, sj), a(i, k), (void*)LOC_COL_IN,
bsp_hpput(proc_id(s, sj), a(i, k), (void*)loc_col_in,
sizeof(float) * i, sizeof(float));
}
}
@@ -259,7 +291,7 @@ int main() {
// put a_ki in P(*, t)
for (int j = start_jdx; j < dim; j += M) {
for (int si = 0; si < N; ++si) {
bsp_hpput(proc_id(si, t), a(k, j), (void*)LOC_ROW_IN,
bsp_hpput(proc_id(si, t), a(k, j), (void*)loc_row_in,
sizeof(float) * j, sizeof(float));
}
}
@@ -269,13 +301,34 @@ int main() {

for (int i = start_idx; i < dim; i += N) {
for (int j = start_jdx; j < dim; j += M) {
float a_ik = *((float*)LOC_COL_IN + i);
float a_kj = *((float*)LOC_ROW_IN + j);
float a_ik = *((float*)loc_col_in + i);
float a_kj = *((float*)loc_row_in + j);
*a(i, j) = *a(i, j) - a_ik * a_kj;
}
}
}
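
Review note on the update loop: after the pivot row and column have been broadcast, each core scales its part of column k and applies a rank-1 update to its part of the trailing block. The sequential form of one such step is sketched below as a host-side reference. `eliminate_step` is a hypothetical name, and the sketch assumes pivoting has already moved a nonzero entry onto the diagonal.

```c
#include <assert.h>

/* One step k of in-place LU on a dim x dim row-major matrix:
 * scale column k below the diagonal by the pivot, then subtract the
 * outer product of that column and row k from the trailing block. */
void eliminate_step(float* a, int dim, int k) {
    assert(k >= 0 && k < dim);
    float pivot = a[k * dim + k];
    assert(pivot != 0.0f); /* pivoting is assumed to have happened already */
    for (int i = k + 1; i < dim; ++i)
        a[i * dim + k] /= pivot;
    for (int i = k + 1; i < dim; ++i) {
        float a_ik = a[i * dim + k];
        for (int j = k + 1; j < dim; ++j)
            a[i * dim + j] -= a_ik * a[k * dim + j];
    }
}
```

Running this for k = 0 .. dim - 1, with a row swap before each step, yields the L and U factors in place, which gives a result to compare against what the cores stream back.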

ebsp_move_chunk_up((void**)&matrix, 0, 0);

if (t == 0) {
for (int i = 0; i < dim / 4; ++i) {
pi_out[i] = loc_pi[i];
}
ebsp_move_chunk_up((void**)&pi_out, 1, 0);
}

ebsp_free(matrix);
ebsp_free(loc_rs);
ebsp_free(loc_ark);
ebsp_free(loc_pi);
ebsp_free(loc_pi_in);
ebsp_free(loc_row_in);
ebsp_free(loc_col_in);

ebsp_close_up_stream(0);
if (t == 0)
ebsp_close_up_stream(1);

bsp_end(); // (11)

return 0;