About Me

Michael Zucchi

 B.E. (Comp. Sys. Eng.)

  also known as Zed
  to his mates & enemies!

notzed at gmail >
fosstodon.org/@notzed >


android (44)
beagle (63)
biographical (104)
blogz (9)
business (1)
code (77)
compilerz (1)
cooking (31)
dez (7)
dusk (31)
esp32 (4)
extensionz (1)
ffts (3)
forth (3)
free software (4)
games (32)
gloat (2)
globalisation (1)
gnu (4)
graphics (16)
gsoc (4)
hacking (459)
haiku (2)
horticulture (10)
house (23)
hsa (6)
humour (7)
imagez (28)
java (231)
java ee (3)
javafx (49)
jjmpeg (81)
junk (3)
kobo (15)
libeze (7)
linux (5)
mediaz (27)
ml (15)
nativez (10)
opencl (120)
os (17)
panamaz (5)
parallella (97)
pdfz (8)
philosophy (26)
picfx (2)
players (1)
playerz (2)
politics (7)
ps3 (12)
puppybits (17)
rants (137)
readerz (8)
rez (1)
socles (36)
termz (3)
videoz (6)
vulkan (3)
wanki (3)
workshop (3)
zcl (4)
zedzone (26)
Monday, 09 May 2016, 12:57

java, fft, and jvm

After quite a break i finally got stuck into a bit of hacking over the weekend. Actually i literally spent every waking hour of the weekend tinkering on the keyboard, about 25 hours or so. Yes my arse is a bit sore.

I will probably end up writing more about it but for now I'm just going to summarise some findings and results of some tinkering with FFT routines in Java. It may sound oxymoronic but i've been a bit under the weather so this was something "easy" to play with which doesn't require too much hard thought.

Initially i was working on an out-of-place in-order implementation but I hit some problems with 2D transforms so decided to settle on in-place out-of-order instead. Using a decimation in frequency (DIF) for the forward transform and a decimation in time (DIT) for the inverse allows the same routine to be used for all passes with no bit-reversal needed. I only implemented the basic Cooley-Tukey algorithm and only with radix-2 or radix-4 kernels and with special case code for the first (DIT) or last (DIF) pass. I'm only really interested in "video-image-sized" routines at this point.

I normally use jtransforms (2.4) for (java) fft's so i used that as a performance basis. Whilst the out-of-order results aren't identical they are just as useful for my needs so I think it's fair to compare them. In all cases i'm talking about single-threaded performance and complex input/output (jtransforms.complexForward/complexInverse), and i've only looked at single precision so far.

The performance inflexion point (1D) is about 2^14 elements, beyond that jtransforms pulls (well) ahead. I suspect this is related to the cache size and such minutiae is something i just haven't explored.

Some other java/jvm related observations:

Of course 'performance java' isn't exactly a hot topic anymore and i should be focusing on massively-parallel devices but I was only doing it to pass the time and I can tackle OpenCL another wet weekend. If I can identify isolated cases for the the optimiser-degredations I came across I will see what the java forums have to say about them.

I've nearly exhausted my interest for now but i still have a couple more things to try and also to identify what if any of it I want to keep. If I get really keen I will see about some sort of reusable library for special purpose fft use - one problem with jtransforms (and indeed other fft libraries) is that you need to marshal the data into/from the correct layout which can be quite costly and negate much of the performance they purport to provide. Big if on that though.

Tagged hacking, java.
radix-4 stuff | my lonesome easter
Copyright (C) 2019 Michael Zucchi, All Rights Reserved. Powered by gcc & me!