
July-September 2005 (oldest-to-newest)

27 Jul 2005 - Rolling in his grave

So I'm sitting here at work listening to the Defender 2000 soundtrack. One of the tracks is a remix of Beethoven's 5th. You know, the one in that level of Dragon's Lair 2 where you're a mini-Dirk being chased by a cat while carried through the air on a flying piano. Erm. Anyway, I'm sitting there trying to imagine what Beethoven would think if he heard his masterpiece played to a decidedly hip-hop beat. Makes me smile :)

1 Aug 2005 - Figured it out yet? Here's a couple more hints.
Game Progress


1 Aug 2005 - Next games

Game Progress Okay, so the secret's out. I am in the process of porting two Yaroze games to NUON using the library I created for Decaying Orbit. Why go to the trouble of making the library and only port one game, right?

The first game is Ben James' "Katapila". The second is Philippe-Andre Lorin's "Invs". Both gents have graciously offered help as I take their respective source code and get the games running on NUON.

Katapila is nearly done. Just a few details left to include. Invs is ported, but not optimized. I need to spend some time improving the frame rate. No guess on a release date for either one yet.

16 Aug 2005 - Let the optimization commence

Game Progress I fixed the last known bug with my library's handling of Invs. Now I need to optimize the hell out of it so that it runs at 30fps throughout. I'm actually looking forward to this as I find it fun.

One of the major time sinks is that, rather than clearing the screen every frame, the background is copied over from another buffer. *PLUS* the background buffer is faded slightly every frame. Certain sprites are drawn to the background buffer in addition to the main buffer and they leave trails as it fades out. Very cool effect, but the current brute force implementation is too slow.

As it stands now the copy is done on MPE3 in C code rather than on the renderers. This is bad because the DMAs are serialized and blocking. The fade is applied some time later by drawing a translucent box over the background buffer. This part is done on the renderers, but requires another set of read/modify/write. In fact it does two reads because it needs to read both the background buffer and the sprite image, which is just a box of solid color.

I plan to speed this up tremendously by performing all the operations in one pass on the renderers. I will read in the background buffer, write it to the destination, then fade the pixels and write it back to the background buffer. Only one read and two writes are necessary, which is the bare minimum.
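
Roughly, the single pass could look like the sketch below. It is plain C over a flat byte array with a simple subtractive fade, just to show the one-read, two-write shape. The function name, buffer layout, and fade step are assumptions for illustration; it says nothing about how the NUON renderers or their DMA transfers are actually set up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-pass clear: read each background byte once, write it
 * unchanged to the frame buffer, then write a faded copy back to the
 * background buffer. One read and two writes per byte. */
void copy_and_fade(uint8_t *background, uint8_t *frame, size_t count,
                   uint8_t fade)
{
    for (size_t i = 0; i < count; i++) {
        uint8_t v = background[i];                 /* the single read      */
        frame[i] = v;                              /* write 1: display copy */
        background[i] = (v > fade) ? (uint8_t)(v - fade) : 0; /* write 2   */
    }
}
```

Anything drawn into the background buffer shows up in the frame at full strength first and only then starts to fade, which is what produces the trails.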

If you remember ("An intriguing idea"), I modified the sprite library to use overlays for the sprite renderers. Each sprite type can have its own renderer that is loaded when needed. That way I can have a virtually unlimited number of renderers without taking up precious local RAM.

Anyway, I plan to do the same thing with the code that does background clearing. At the moment the clear routine is fixed to clear to the given color. I will change that to perform the steps described two paragraphs up. It will use overlays as well so that any number of screen clearing methods can be created as the need arises.

I'm excited to see what kind of speed boost this will achieve.

18 Aug 2005 - Timing

Game Progress I timed the various portions of the Invs main loop. The initial frames of the first level take about 80ms. As suspected the background handling occupies a tremendous amount of that time. Both the copy and fade take over 20ms each. To hit 30fps each frame needs to complete in 33ms so you can see how optimizing this will help.

Another optimization is to allow the previous frame to render while the current frame is computing. At the moment they are done serially. That should buy another 10-20ms.

22 Aug 2005 - Firefly

I borrowed a friend's DVD set of Firefly. You can guess where this is going. I've been hooked and feel the need to complete my addiction post haste. At least there's only 14 episodes so it won't take long.

Game Progress I'm just starting to test out the new clear routine as described earlier. At the moment it just copies the background buffer to the current frame buffer. Once I get that working I'll make the pixels fade out.

24 Aug 2005 - Bonus MPE

An unexpected sick day yesterday let me finish watching the rest of Firefly. What a great series. I hope the movie does well enough to bring it back. Knowing how these things typically go, however, I'm not holding my breath.

Game Progress Oh yes, the background renderer is quite the speedup. It went from 80ms per frame on the opening level to about 65ms. Then I realized I was only using two of the three MPEs for rendering. Adding the third dropped that to just under 50ms.

That 50ms includes about 25ms of rendering time and 20ms of main loop execution. Once I overlap those two the smaller one should hide inside the bigger. So a sub-33ms frame should be very doable. And that's without optimizing the background renderer.

I'm doing things the hard (but accurate) way right now. The original Invs code calls for the background to be faded by 8 on R, G and B every frame. Because NUON does everything in YCrCb (curses) I must first convert each pixel to RGB, subtract 8 from each, saturate to zero in case they go negative, and convert back to YCrCb. It takes about 29 cycles per pixel, which is just an atrocious amount given I have to do this for the entire screen. At 320x240 and using 3 rendering MPEs that's 742k cycles per MPE (29 * 320 * 240 / 3). Given that there's only 1800k cycles to play with in a given frame (54,000,000 / 30) that means the clear renderer takes 41% of the allowed time. Not ideal.
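
The per-pixel steps and the cycle budget can be sketched in C as below. The conversion uses the common full-range BT.601 (JFIF) coefficients in floating point, which is an assumption; NUON's exact matrix, value ranges, and pixel layout may differ, and the real renderer is assembly, not C. The struct and function names are made up for the example.

```c
#include <stdint.h>

/* Assumed pixel layout, for illustration only. */
typedef struct { uint8_t y, cb, cr; } PixelYCC;

/* Round and clamp a float into 0..255. */
static uint8_t clamp_u8(float x)
{
    if (x < 0.0f)   return 0;
    if (x > 255.0f) return 255;
    return (uint8_t)(x + 0.5f);
}

/* Subtract with saturation at zero. */
static float sub_sat(float v, float amount)
{
    return (v > amount) ? v - amount : 0.0f;
}

/* Fade one pixel by `amount` on R, G and B:
 * YCbCr -> RGB, saturating subtract, RGB -> YCbCr. */
PixelYCC fade_subtractive(PixelYCC p, float amount)
{
    float cb = p.cb - 128.0f;
    float cr = p.cr - 128.0f;
    float r = sub_sat(p.y + 1.402f * cr, amount);
    float g = sub_sat(p.y - 0.344136f * cb - 0.714136f * cr, amount);
    float b = sub_sat(p.y + 1.772f * cb, amount);

    PixelYCC out;
    out.y  = clamp_u8( 0.299f    * r + 0.587f    * g + 0.114f    * b);
    out.cb = clamp_u8(-0.168736f * r - 0.331264f * g + 0.5f      * b + 128.0f);
    out.cr = clamp_u8( 0.5f      * r - 0.418688f * g - 0.081312f * b + 128.0f);
    return out;
}

/* Cycles each MPE spends when every pixel costs `cycles_per_pixel`
 * and the screen is split evenly across `mpes` processors. */
long cycles_per_mpe(long cycles_per_pixel, long width, long height, long mpes)
{
    return cycles_per_pixel * width * height / mpes;
}
```

With the figures from the text, cycles_per_mpe(29, 320, 240, 3) gives 742,400 cycles, against a budget of 54,000,000 / 30 = 1,800,000 cycles per frame.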

I can optimize what I have now and get it down to 18 cycles pretty easily (25%). However, I'm thinking of changing the fade to a multiplicative effect rather than subtractive (fading by X% rather than X brightness levels). That would let me use matrices to combine all the above steps into one, which would be significantly faster. I'm guessing on the order of 7 cycles per pixel (10%).
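
Why the multiplicative version is so much cheaper: the RGB to YCbCr conversion is a linear matrix plus a fixed offset on the chroma channels, so scaling R, G and B by the same factor is the same as scaling Y and scaling each chroma value's distance from neutral. No round trip through RGB is needed. The sketch below assumes neutral chroma sits at 128 and black is Y = 0, which may not match NUON's formats, and it is only one way to express the idea, not necessarily the matrix setup described above. The names are hypothetical.

```c
#include <stdint.h>

/* Assumed pixel layout, for illustration only. */
typedef struct { uint8_t y, cb, cr; } PixelYCC;

/* Scale brightness by keep/256 (keep in 0..256) without leaving YCbCr.
 * Chroma is scaled about its neutral value of 128. */
PixelYCC fade_multiplicative(PixelYCC p, int keep)
{
    PixelYCC out;
    out.y  = (uint8_t)((p.y * keep) / 256);
    out.cb = (uint8_t)(128 + ((p.cb - 128) * keep) / 256);
    out.cr = (uint8_t)(128 + ((p.cr - 128) * keep) / 256);
    return out;
}
```

A keep value of 256 leaves the pixel untouched, and 0 collapses it to black with neutral chroma.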

Let's see if I can get it at full speed using the accurate method first.

25 Aug 2005 - Full speed

Game Progress Using the multiplicative method sped up rendering by 40% (from 25ms to 15ms). Overlapping rendering with the main code then hid that 15ms behind the 20ms of processing. w00t as they say. That put the game at 30fps during the opening seconds of the first wave. I played for a while and it did dip down below 30fps when things got more hectic.

I have some ideas on speeding up drawing of points and lines, which Invs uses quite a bit. Hopefully that will help.

I hit a crash bug, but I think I'm just running out of sprites. Time to bump up that limit methinks.

28 Aug 2005 - Speeding up the points

Game Progress The crash I saw earlier was a bug in the Yaroze library. Not too difficult to find and fix in the end. Turns out I was running out of sprites, but not because of the hardcoded limit. In some cases the sprite library was losing track of a sprite. Eventually it lost track of so many that it couldn't render a screen.

I created a special sprite renderer to handle points. Invs uses them a lot to represent energy that floats down for you to pick up. Before, each one was an individual sprite and went through the whole sprite pipeline, including reading a source image of constant white.

I changed it so that most of the sprite pipeline is skipped. What's the point of rotating a single pixel? I also allowed twelve points to be packed together in a single sprite structure. That leads to fewer sprites, which is less memory overhead and DMAs. Now the screen can be littered with them and the frame rate doesn't take a hit.
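
The packing idea can be sketched like this. Only the figure of twelve points per sprite comes from the text; the struct layout, field names, and the add_point helper are invented for the example and are not the sprite library's real structures.

```c
#include <stddef.h>
#include <stdint.h>

#define POINTS_PER_SPRITE 12  /* twelve points share one sprite structure */

typedef struct { int16_t x, y; uint32_t color; } Point;

typedef struct {
    int   count;                       /* points used in this sprite */
    Point points[POINTS_PER_SPRITE];
} PointSprite;

typedef struct {
    PointSprite *sprites;   /* caller-provided storage           */
    size_t       capacity;  /* number of PointSprite slots       */
    size_t       used;      /* slots currently holding points    */
} PointList;

/* Add a point, opening a new sprite only when the current one is full.
 * Returns 0 on success, -1 when the list has no room left. */
int add_point(PointList *list, int16_t x, int16_t y, uint32_t color)
{
    if (list->used == 0 ||
        list->sprites[list->used - 1].count == POINTS_PER_SPRITE) {
        if (list->used == list->capacity)
            return -1;
        list->sprites[list->used].count = 0;
        list->used++;
    }
    PointSprite *s = &list->sprites[list->used - 1];
    s->points[s->count].x = x;
    s->points[s->count].y = y;
    s->points[s->count].color = color;
    s->count++;
    return 0;
}
```

Twenty-five points end up in three sprite structures instead of twenty-five, so the per-sprite overhead is paid three times rather than once per point.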

I need to implement something similar for lines as the game uses them a lot too. They work as is, but the current method has never gotten them exactly right.

30 Aug 2005 - Bah

Game Progress Due to a bug in my timing/profiling functions, the reported time to render a frame was more optimistic than reality. I am getting some slowdown pretty early on. I went back and played the original Yaroze version and noticed how much faster it seemed. Turns out that was true.

On the other hand, I fixed a couple remaining bugs. One dealt with the pause screen and how it copies the current frame over to another buffer. The other wasn't so much a bug as an unimplemented feature. The Yaroze has the ability to draw lines that transition between two different colors from end-to-end. Until I can write a renderer to handle that properly I just make it draw a normal line using the average of the two colors.
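
For reference, here is a small C sketch of both colorings: the averaged stand-in and a per-step linear blend that a proper two-color line renderer could use. It assumes 8-bit RGB endpoints and simple integer interpolation; the type and function names are hypothetical, and the real renderer works on NUON's own pixel format.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } Color;

/* Stand-in color: per-channel average of the two endpoints. */
Color average_color(Color a, Color b)
{
    Color c;
    c.r = (uint8_t)((a.r + b.r) / 2);
    c.g = (uint8_t)((a.g + b.g) / 2);
    c.b = (uint8_t)((a.b + b.b) / 2);
    return c;
}

/* Color at step i of n along the line: i = 0 gives a, i = n gives b. */
Color gradient_color(Color a, Color b, int i, int n)
{
    Color c;
    if (n <= 0)
        return a;
    c.r = (uint8_t)(a.r + (b.r - a.r) * i / n);
    c.g = (uint8_t)(a.g + (b.g - a.g) * i / n);
    c.b = (uint8_t)(a.b + (b.b - a.b) * i / n);
    return c;
}
```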

1 Sep 2005 - Lines

Game Progress I got the line renderer going yesterday. I need to do some profiling to find out how much it helps speed-wise. At least now lines are drawn more accurately than the old sprite-as-line approximation. I need to add another renderer to handle gradated lines, but that shouldn't take long since I can just modify the one I have.

I fixed the code that times code segments and can see where the game is slowing down. Some of the one-off cases can be fixed by changing from double to quad buffering. For the rest I need to find out exactly what is the major time sink. It seems to be in the processing rather than the rendering. That gives me hope that it can be fixed.

6 Sep 2005 - Whoooooops

Game Progress Gradated lines are in. Next I'm thinking about creating a special renderer for non-rotating, non-scaling sprites. They basically just copy pixels from the source image to the frame buffer. Transparency and translucency make things complicated, but I can start simple. Invs doesn't use translucency on the invaders anyway.

The big oops I discovered is that the original game runs at 60fps, not 30 as I suspected. Well, since the game is PAL, technically it's 50fps, but regardless it spells trouble. I need to do some heavy-duty optimizing to have a chance at hitting full speed.

The above sprite renderer will help a bit I think. Especially if I can group sprites that use the same source image together in the same sprite structure. Even better would be if the sprite is small enough that I can load the whole thing into local ram and blit it out multiple times. The invaders would definitely benefit from that.
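
A plain-copy sprite with simple transparency boils down to something like the C sketch below. It assumes 32-bit pixels, a single transparent key color, and no clipping, and the function name is made up. The point is only that once the source image is resident, drawing it again at another position is just a second call with no reload.

```c
#include <stdint.h>

/* Copy a w x h sprite to (dx, dy) in a frame buffer that is frame_w pixels
 * wide, skipping source pixels equal to `key`. No clipping is done; the
 * caller must make sure the sprite fits. */
void blit_keyed(uint32_t *frame, int frame_w,
                const uint32_t *src, int w, int h,
                int dx, int dy, uint32_t key)
{
    for (int row = 0; row < h; row++) {
        uint32_t *dst = frame + (dy + row) * frame_w + dx;
        const uint32_t *s = src + row * w;
        for (int col = 0; col < w; col++) {
            if (s[col] != key)
                dst[col] = s[col];   /* opaque pixel: straight copy */
        }
    }
}
```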

7 Sep 2005 - Some hope

Game Progress It seems that function calls are really expensive. Adding ~40 sprites to the display list was taking longer than expected. I put my timing routines around a function call and got about 7 to 8ms per frame (cumulative over all sprites).

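/* Timing wrapped around the call: the call overhead is measured too. */
void B(void);

void A(void)
{
  startTiming();
  B();
  stopTiming();
}

void B(void)
{
  ...
}

I put the same routines just inside that function and it drops to 3 to 4ms.

/* Timing inside the callee: only the body of B is measured. */
void B(void);

void A(void)
{
  B();
}

void B(void)
{
  startTiming();
  ...
  stopTiming();
}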

Next I tried unrolling everything I could, inlining all the functions. I got the time down to around 2ms.

I tell you this shocked the hell out of me. I knew there was some overhead involved. Heck I'm willing to live with it if it means my code is cleaner and easier to maintain. But 4x the time? Sorry, that's too much.

So now my goal is to inline as much of the sprite library within my Yaroze library as possible. That means I won't be able to use libsprite2.a any more. I need to actually inline the sprite lib into my code so they get compiled together in one function. Hopefully this will remove the C code as the bottleneck and put the burden back on the renderers.

12 Sep 2005 - Progress

Game Progress I'm working hard to optimize things as much as possible. One thing I added, that I should have done before, is have the Yaroze sprite structure keep a link to the NUON image data. Before, each sprite had to go searching for the appropriate image every frame. I did that because in Yaroze-land there's no guarantee that the sprite structure will persist for any length of time; it could be destroyed immediately after the call to insert it into the display list.

In Invs, most sprites are semi-permanent. I can take advantage of that by keeping a link to the image data as described above. If a sprite hasn't changed from frame-to-frame then it doesn't need to go searching for the image data.

The wrinkle is when the library's automatic memory recovery kicks in. If a game uses a lot of images then the lib deallocates old, unused images to make room for the new ones. In that case the image data might not be there. I added a check to make sure that the image data is still valid.

This helped, but I took things a step further. I can flag certain image data as "permanent" - ones that I know are needed quite often. This tells the lib to never deallocate them. That way I can guarantee that the image data is always present and valid and just return it immediately. This offers a tremendous speed boost when inserting sprites to the display list.
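
The caching scheme can be sketched in C as follows. All names and structures here are hypothetical, image ids are assumed to be non-negative, and the sketch folds the validity check into a single id comparison rather than special-casing permanent images on lookup. The permanent flag only stops the memory recovery from freeing a slot.

```c
#include <stddef.h>

typedef struct {
    int id;         /* image held in this slot, -1 when free */
    int permanent;  /* non-zero: never deallocate            */
} ImageSlot;

typedef struct {
    int        image_id;  /* image this sprite wants                  */
    ImageSlot *cached;    /* link kept from an earlier frame, or NULL */
} Sprite;

/* Slow path: search every slot for the image. */
static ImageSlot *find_image(ImageSlot *slots, size_t n, int id)
{
    for (size_t i = 0; i < n; i++) {
        if (slots[i].id == id)
            return &slots[i];
    }
    return NULL;
}

/* Memory recovery: free a slot unless it is flagged permanent.
 * Returns 1 if the slot was freed, 0 if it was kept. */
int evict(ImageSlot *slot)
{
    if (slot->permanent)
        return 0;
    slot->id = -1;
    return 1;
}

/* Fast path first: reuse the cached link if it still points at the
 * right image, otherwise fall back to the search and refresh the link. */
ImageSlot *sprite_image(Sprite *s, ImageSlot *slots, size_t n)
{
    if (s->cached != NULL && s->cached->id == s->image_id)
        return s->cached;
    s->cached = find_image(slots, n, s->image_id);
    return s->cached;
}
```

A sprite whose image has been evicted fails the id comparison and drops back to the search, while a sprite linked to a permanent image never leaves the fast path.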

12 Sep 2005 - Pondering

Game Progress I'm starting to wonder if the optimizations I'm making for Invs would help get Decaying Orbit to run at 60fps. Something to investigate later...

15 Sep 2005 - Render bound

Game Progress Good progress, as the current round of optimizations has pushed the bottleneck onto rendering rather than processing. I'm going to create a special renderer that handles non-rotated, non-scaled sprites more quickly. That should make me processing-bound again, but hopefully further optimization is not necessary. It's getting tougher to find places to speed up the main loop.

28 Sep 2005 - Everything

It's been a while since my last update. There are a couple reasons for that. And there's another reason why it might be a while until the next.

First, work has been very busy. There's been a lot of pressure to get a task done. That has sucked up much of my free time lately. Thankfully it's pretty much wrapped up now.

Second, my hard drive went kaput. Fortunately I heard it making noise before this happened and had a replacement on hand. So I really only lost maybe a day to that annoyance. Still, I'm happy to now have 100GB instead of 60 and that the new drive is quiet as a mouse. It's just a pain reinstalling everything again.

The reason for future productivity loss is that my wife is in the early stages of labor at the moment. Our son will be born either today or tomorrow most likely. I'll be understandably involved in diapers, spit-ups, and sleep loss again for a while.

Game Progress I did get the renderer finished that I talked about earlier. It's hard to tell how much it helps, but I believe it does. Next I want to find a way to pin the frame rate at 50fps since that's the speed of the original Invs. Getting 50fps on NTSC involves skipping every sixth frame. I need to figure out a way to do that.
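
One possible way to do the skip, purely as a sketch: count the 60Hz fields and leave the game logic alone on every sixth one. The function name and the idea of driving it from a vertical-blank counter are assumptions, and what gets shown on the skipped field (most likely the previous frame again) is left open.

```c
/* Returns 1 when the game logic should run on this 60Hz field and 0 on
 * every sixth field, giving 50 updates per 60 fields. */
int should_update(unsigned long vblank_count)
{
    return (vblank_count % 6) != 5;
}
```

Over any run of 60 consecutive fields this allows exactly 50 updates, matching the PAL rate of the original.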

Also, I'm at the point where I'm sick of optimizing to eke out another one or two fps. I'll probably get it running at a decent speed for most of it, allow some slowdown to happen on the busier screens, and just release the thing.



This web page and all other pages on this site are © 1999-2007 Scott Cartier