Adobe, MAKE SOME NOISE

Since I had my benchmark code already warmed up from Friday and a photoshop doc open with my histograms, I figured I should make good on the updated speed tests that I promised a year ago.

Here is basic math, receiving a huge boost between debug and release modes. The shaded bar is the duration of the call, so shorter means faster execution.

speed benchmark: actionscript math operators

From here down, everything is running 10m iterations. The scale of these graphs is 560ms. Here's the difference in calling a static public variable, versus a local copy of same.

speed benchmark: actionscript local var vs static public var

Finally, replacing the Math helper methods with some logic and basic math:

speed benchmark: actionscript math.pow squared
speed benchmark: actionscript math.pow cubed
speed benchmark: actionscript math.min
speed benchmark: actionscript math.max
speed benchmark: actionscript math.abs
speed benchmark: actionscript math.floor
speed benchmark: actionscript math.ceil
speed benchmark: actionscript math.round

These stack up pretty well with what I saw in the debug player a year ago (although there is an overall speed increase in all the tests).

If anyone is curious, there is a slowdown in calculating powers of a number (i * i * i * i) as you have to reference that variable over and over - for me the break-even between stringing on more 'i's and just calling Math.pow() was around 60.

I was messing around with some bitmap code this evening and I came across a performance bottleneck with some repeated calls to Math.random(). Since I haven't posted anything in a while, I decided to do some benchmarking and try out an alternate (psuedo) random number generator.

I chose the XOR algorithm, primarily because AS3 is lightning-quick with bit operations, and the algorithm is only 4 lines of code. Here are two versions I tried:

  1. // UINT
  2.  
  3. const MAX_RATIO:Number = 1 / uint.MAX_VALUE;
  4. var r:uint = Math.random() * uint.MAX_VALUE;
  5.  
  6. // returns a number from 0 - 1
  7. function XORandom():Number{
  8. r ^= (r << 21);
  9. r ^= (r >>> 35);
  10. r ^= (r << 4);
  11. return (r * MAX_RATIO);
  12. }

  1. // INT
  2.  
  3. const MAX_RATIO:Number = 1 / int.MAX_VALUE;
  4. const NEGA_MAX_RATIO:Number = -MAX_RATIO;
  5. var r:int = Math.random() * int.MAX_VALUE;
  6.  
  7. // returns a number from 0 - 1
  8. // comment out line 13 for -1 - 1
  9. function XORandom():Number{
  10. r ^= (r << 21);
  11. r ^= (r >>> 35);
  12. r ^= (r << 4);
  13. if(r < 0) return r * MAX_RATIO;
  14. return r * NEGA_MAX_RATIO;
  15. }

It's worth noting that this algorithm works from a seed number. For my purposes, picking one random seed and iterating is fine, but this function is predictable enough that you wouldn't want to use this method for crypto or anything where you'd want "very random" data sets. This also could be useful in some situations where you'd want to replay a set of random numbers - such as generating enemies or levels in a game and then creating a replay by saving the seed number.

For my benchmark, I ran 25 sets of 10 million iterations on each function, using the 10.0 release player in Safari.

acctionscript math.random optimization

Interestingly, the logic to gate negative numbers in the signed-int version was enough to push it out quite a bit past the uint version. All told, the uint algorithm runs in just under one fourth the time it takes to call Math.random().

As always, Wikipedia will tell you as much as you want to know about the ins and outs of this algorithm and how it stacks up against other generators. And, just for fun, here's a frequency graph of the output.

Get Adobe Flash player

Flattening arrays

September 21st, 2009

A big danger when working with large sets of data is the possibility of introducing a small slowdown that's compounded with every piece of data in the set. A common example is the multidimensional array; it's a powerful tool, but every time you look into the array you have to go at least two lookups deep to find your data. Normally that's not a big deal but if you're touching the set thousands of times per frame, it adds up quickly.

A fairly drastic solution is to flatten your arrays - it will give you a nice performance boost when you access the data, but it comes at the cost of obfuscation. Here's how it works:

flat_array

(Sorry the numbers are goofy between the diagram and the code - CS4 crashed and I didn't feel like re-making that graphic)

In the traditional 2D view each "row" is a new Array object - this is where the added computing cost comes in. By flattening the data into one super-long array, you can access any piece of data with just one lookup. Finding the correct cell is a little more tricky though.

  1. var myMDArray:Array = [
  2. ['0a', '0b', '0c', '0d', '0e'],
  3. ['1a', '1b', '1c', '1d', '1e'],
  4. ['2a', '2b', '2c', '2d', '2e']];
  5.  
  6. var myFlatArray:Array = ['0a', '0b', '0c', '0d', '0e', '1a', '1b', '1c', '1d', '1e', '2a', '2b', '2c', '2d', '2e'];
  7.  
  8. var columns:int = 5;
  9.  
  10. trace(myMDArray[2][4]);
  11. trace(myFlatArray[14]);
  12. trace(myFlatArray[(2 * columns) + 4]);
  13. // 2e, 2e, 2e

Now - supposing you have an index for something and want to reverse engineer the row and column - just break out the modulo grid code:

  1. var row:int = int(14 / columns);
  2. var col:int = 14 % columns;
  3.  
  4. trace(row, col);
  5. // 2, 4

Modulo Grids

August 1st, 2009

Here's a quick optimization for people making grids of things. The traditional method is a pair of loops, nested to make rows and columns:

  1. const ROWS:int = 30;
  2. const COLUMNS:int = 88;
  3.  
  4. for(var row:int = 0; row < ROWS; row++){
  5. for(var column:int = 0; column < COLUMNS; column++){
  6. var s:Shape = new Shape();
  7. s.x = column * 5;
  8. s.y = row * 5;
  9.  
  10. s.graphics.beginFill(0x666666);
  11. s.graphics.drawRect(0, 0, 4, 4);
  12. addChild(s);
  13. }
  14. }

This can be simplified and sped up by taking advantage of the relationship between division and modulus to generate the row and column dynamically:

  1. const ROWS:int = 30;
  2. const COLUMNS:int = 88;
  3. const ITEMS:int = ROWS * COLUMNS;
  4.  
  5. for(var i:int = 0; i < ITEMS; i++){
  6. var s:Shape = new Shape();
  7. s.x = int(i / ROWS) * 5; // this is the same as Math.floor(i / ROWS), just faster
  8. s.y = (i % ROWS) * 5;
  9.  
  10. s.graphics.beginFill(0x666666);
  11. s.graphics.drawRect(0, 0, 4, 4);
  12. addChild(s);
  13. }

By upgrading to loop-invariants, we can cut the nested for-loop for a nice boost. And, as a bonus, this new code can work with sets that don't end squarely at the end of a column.

FIVe3D Extreme Optimization

July 11th, 2009

It's not often that my experimental projects line up with my day job, but a couple months ago I talked the higher-ups into letting me do some crossover work with 3D terrain rendering.

The obvious issue with Flash and 3D is performance. The maps we wanted to show we're rectilinear height-maps, 256 vertices to a side. I already knew there was no way to do this in a photo-realistic way (256² is just over 65,000 squares, which would be 130,050 triangles to render...last I checked, Papervision gets unusable around 2000) so I decided to go with a wireframe style instead.

Since I didn't need camera's or any of the weight of the bigger engines, I opted to use Mathieu Badimon's very lightweight and (imo) thoughtfully laid out FIVe3D library as my starting point. From there I got a quick proof of concept up and realized there was still no way I could get away with the full 256² grid so I started cutting down the resolution. Using every third node (64²) struck a good compromise with the fidelity of the original mesh, and the performance limitations I needed to stay in to work for the majority of our users.

Off the shelf, FIVe3D was giving me roughly 12fps and consuming a pretty weighty chunk of system memory, so I started digging in and looking for optimizations I could make. To start with, I created a specific class for my model to eliminate all the cross-class calls between Shape3D and Graphics3D. I also decided the Point3D class had to go since every render pass would create tens of thousands of throwaway objects in order to calculate the z-translations for my nodes. After internalizing those, I brought in the Matrix3D class as well and started combining some of the heavy-use functions so the AVM could stay within one scope for the majority of the calculation and drawing.

That gave me a boost of about 50% but I still needed a little more oomph. Finally, I took the dive and flattened my arrays and ended up restructuring the core FIVe3D engine so it was operating a little more specifically to my needs (mostly hard typing some things and taking out optional calls to the flat-fill shader and the like). At this point, the code is fairly incomprehensible and some of the extensibility of the engine was lost, but it did put my frame rate just under 30fps so I can't complain.

Here is a standalone copy showing the final mesh, and if you want to melt your processor, click for a version with the original full-detail grid.

Get Adobe Flash player