About Me

Michael Zucchi

 B.E. (Comp. Sys. Eng.)

  also known as Zed
  to his mates & enemies!

notzed at gmail >
fosstodon.org/@notzed >


Monday, 30 June 2014, 12:39

static resource init

I noticed some activity on a thread in the parallella forum and whilst following that up I had another look at how to get the file loader to automagically map symbols across multiple cores.

Right now the loader will simply resolve the address to a core-local address and the on-core code needs to manually do any resolution to remote cores using ez_global_address() and related functions.
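As a rough illustration of what that manual resolution amounts to: on Epiphany the top 12 bits of a 32-bit address select the core (6-bit row, 6-bit column) and the low 20 bits are the offset within that core. The helper below is a hypothetical stand-in written for this example, not the signature of ez_global_address(), and it takes absolute chip coordinates rather than workgroup-relative ones.

```c
#include <assert.h>
#include <stdint.h>

/* Low 20 bits of an Epiphany address are the offset inside one core. */
#define LOCAL_MASK 0x000fffffu

/* Hypothetical helper (not the ezesdk API): rebase a core-local address
 * onto the core at absolute chip coordinates (row, col). */
static uint32_t to_global(uint32_t local, unsigned row, unsigned col)
{
    uint32_t coreid;

    assert(row < 64 && col < 64);          /* 6 bits each */
    coreid = ((uint32_t)row << 6) | col;   /* 12-bit core id */
    return (coreid << 20) | (local & LOCAL_MASK);
}
```

For example, to_global(0x2000, 32, 8) gives 0x80802000, which is offset 0x2000 on the first core of a 16-core Parallella.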

First I tried creating runtime meta-data for symbols which let the loader know how the address should be mapped on a given core. Once I settled on some basic requirements I started coding something up but quickly realised that it was going to take quite a bit of work. The first approach was to use address matching, so that for example any address which matched a remapped address could be rewritten as necessary. But this won't work because in-struct addresses could fall through. So then I thought about using the reloc record symbol index. This could work fine if the address is in the same program on a different core, but it falls over if the label is in a different program, which is the primary case I'm interested in. Then I looked into just including a symbolic name, but this is a bit of a pain because the loader then needs to link the string pointers before they can be used. So I just copied the symbol name into a fixed-size buffer. But after all that I threw it away as there were still a couple of fiddly cases to handle and it just seemed like too much effort.

So then I investigated an earlier idea I had which was to include some meta-data in a mangled symbol name. One reason I never followed it up was it seemed too inconvenient but then I realised I could use some macros to do some of the work:

  #define REF(var, row, col) _$_ ## row ## $_ ## col ## $_ ## var

  // define variable a stored on core 0,0
  extern int REF(a, 0, 0);

  // access variable a on core 0,0
  int b = REF(a, 0, 0);

This just creates a variable named "_$_0$_0$_a" and when the loader goes to resolve this symbol it can be mapped to the variable 'a' on the specific core as requested (I'm using _N to map group-relative, or N to map this-core-relative). The same mechanism just works directly for symbols defined in the same program as well, since the linker doesn't know of any relation between the symbols.
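A minimal sketch of how a loader might pick such a name apart. The post doesn't spell out the exact grammar, so this assumes `_$` followed by a row field, `$`, a column field, `$_` and then the plain symbol name, where a field written `_N` is group-relative and a bare `N` is relative to the core being loaded. The struct and function names are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One parsed coordinate (assumed reading: _N is group-relative,
 * N is relative to the core being loaded). */
struct ref_coord {
    int group_relative;
    int value;
};

struct ref_symbol {
    struct ref_coord row, col;
    const char *name;            /* points into the original string */
};

/* Parse "_N$" or "N$"; returns the character after the '$',
 * or NULL if the field is malformed. */
static const char *parse_coord(const char *s, struct ref_coord *c)
{
    char *end;

    c->group_relative = (*s == '_');
    if (c->group_relative)
        s++;
    if (*s < '0' || *s > '9')
        return NULL;
    c->value = (int)strtol(s, &end, 10);
    return *end == '$' ? end + 1 : NULL;
}

/* Returns 1 and fills *out if sym is a mangled reference, else 0. */
static int parse_ref(const char *sym, struct ref_symbol *out)
{
    const char *s;

    assert(sym != NULL && out != NULL);
    if (strncmp(sym, "_$", 2) != 0)
        return 0;
    s = parse_coord(sym + 2, &out->row);
    if (s == NULL)
        return 0;
    s = parse_coord(s, &out->col);
    if (s == NULL || s[0] != '_' || s[1] == '\0')
        return 0;
    out->name = s + 1;           /* skip the '_' that follows the last '$' */
    return 1;
}
```

Under these assumptions "_$_0$_0$_a" parses as group-relative row 0, group-relative column 0, symbol "a", while an ordinary symbol such as "main" is rejected and can be resolved the usual way.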

So although this looked relatively straightforward and would probably work ... it would mean that the loader would need to explicitly relink (at least some of) the code for each core at load-time rather than just linking each program once and copying it over as a block; adding a lot of on-host overheads which can't be discounted. It didn't really seem to make much difference to code size in some simple tests either.

So that idea went out the window as well. I think I'll just stick to hand coding this stuff for now.


However I thought I would revisit the meta-data section idea but use it in a higher level way: to define resources which need to be initialised before the code executes - e.g. to avoid race conditions or other messy setup issues.

I coded something up and it seems to be fairly clean and lean; I might keep this in. It still needs a bit of manipulation of the core image before it gets copied to each core but the processing is easier to separate and shouldn't have much overhead.
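The kind of per-core image manipulation this implies might look roughly like the sketch below: before each core's copy of the image is written out, every record names one 32-bit pointer in the image whose core-id bits get replaced with the target core. The record layout, flag values and function name are assumptions made for the example, not the ezesdk format, and absolute coordinates are treated as chip coordinates for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RES_REL_ROW 1u   /* row is an offset from the core being loaded */
#define RES_REL_COL 2u   /* column likewise */

/* Hypothetical host-side view of one resource record. */
struct res_record {
    uint32_t flags;
    uint32_t row, col;
    uint32_t offset;     /* image offset of the pointer to rewrite */
};

/* Patch one core's private copy of the image in place. */
static void res_apply(uint8_t *image, size_t size,
                      const struct res_record *recs, size_t count,
                      unsigned row, unsigned col)
{
    for (size_t i = 0; i < count; i++) {
        const struct res_record *r = &recs[i];
        uint32_t trow = (r->flags & RES_REL_ROW) ? row + r->row : r->row;
        uint32_t tcol = (r->flags & RES_REL_COL) ? col + r->col : r->col;
        uint32_t coreid = ((trow & 0x3f) << 6) | (tcol & 0x3f);
        uint32_t ptr;

        assert(r->offset + sizeof(ptr) <= size);
        memcpy(&ptr, image + r->offset, sizeof(ptr));
        /* keep the core-local offset, replace the top 12 bits */
        ptr = (ptr & 0x000fffffu) | (coreid << 20);
        memcpy(image + r->offset, &ptr, sizeof(ptr));
    }
}
```

Because the records only touch a handful of data words, the program is still linked once and the patching is a cheap pass over each core's copy, unlike the per-core relink the mangled-name scheme would have needed.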

An example is initialising a port pair as used in the workgroup-test sample. There is actually no race condition here because each end only needs to initialise itself but needing two separate steps is error prone.

First the local endpoint is defined with a static initialiser and then its reference to the remote port is set up at the beginning of main.

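extern ez_port_t bport EZ_REMOTE;

// local end of the port pair; .other is filled in at runtime below
ez_port_t aport = {
  .mask = 4-1
};

int main(void) {
    int lid = ez_get_core_row();

    // point the local port at its partner on the other column
    ez_port_setup(&aport, ez_global_core(&bport, lid, 0));

    // ... can now use the port
}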

This sets up one end of a port-pair between two columns of cores. Each column is loading a different program which communicates with the other via a port and some variables.

Using a resource record, this can be rewritten as:

extern ez_port_t bport EZ_REMOTE;

// includes the remote port which the loader/linker will resolve to the
// lower 15 bits of the address
ez_port_t aport = {
    .other = &bport,
    .mask = 4-1
};

// resource record goes into .eze.restab section which is used but not loaded
// the row coordinate is relative, i.e. 'this.row + 0',
// and the column of the target is absolute '1'.
static const struct ez_restab __r_aport
__attribute__((used,section(".eze.restab,\"\",@progbits ;"))) = {
    .flags = EZ_RES_REL_ROW | EZ_RES_PORT,
    .row = 0,
    .col = 1,
    .target= &aport
};

int main(void) {
    // ... can now use the port
}

It looks a bit bulky but it can be hidden in a single macro. I guess it could also update an arbitrary pointer too, and this isn't far off what I started with - but it can only update data pointers, not the in-code immediate pointer loads that ELF reloc records allow for.
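One possible shape for such a macro is sketched below. The type definitions and flag values are stand-ins so the example compiles on its own (the real ones live in the ezesdk headers and may differ), EZ_PORT_PAIR and EZ_RESTAB_ATTR are names invented here, and the record name avoids the reserved `__` prefix. On the target, EZ_RESTAB_ATTR would carry the section attribute shown in the record above.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in definitions, assumed for this sketch only. */
typedef struct ez_port { struct ez_port *other; uint32_t mask; } ez_port_t;
struct ez_restab { uint32_t flags; uint16_t row, col; void *target; };
#define EZ_RES_REL_ROW 1
#define EZ_RES_PORT    0x100

/* Override this with the used+section attribute when building for the target. */
#ifndef EZ_RESTAB_ATTR
#define EZ_RESTAB_ATTR __attribute__((used))
#endif

/* Define a local port plus the resource record that tells the loader
 * which core the remote end lives on. */
#define EZ_PORT_PAIR(local, remote, rflags, rrow, rcol, size)            \
    static_assert(((size) & ((size) - 1)) == 0,                          \
                  "port size must be a power of two");                   \
    ez_port_t local = {                                                  \
        .other = &remote,                                                \
        .mask = (size) - 1                                               \
    };                                                                   \
    static const struct ez_restab ez_res_##local EZ_RESTAB_ATTR = {      \
        .flags = (rflags) | EZ_RES_PORT,                                 \
        .row = (rrow),                                                   \
        .col = (rcol),                                                   \
        .target = &local                                                 \
    }

/* In the post this is 'extern ez_port_t bport EZ_REMOTE;'. */
ez_port_t bport;

/* Same port as above: row relative (+0), column absolute 1, 4 slots. */
EZ_PORT_PAIR(aport, bport, EZ_RES_REL_ROW, 0, 1, 4);
```

The macro leaves off the trailing semicolon so that a use site reads like an ordinary declaration.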

Not sure what other resources I might want to add though and it's probably not worth it for this alone.

Tagged hacking, parallella.
Copyright (C) 2019 Michael Zucchi, All Rights Reserved. Powered by gcc & me!