Optimizing the RDP

MarathonMan
Site Admin
Posts: 692
Joined: Fri Oct 04, 2013 4:49 pm

Optimizing the RDP

Post by MarathonMan » Sat Jun 18, 2016 12:56 pm

AIO has made some excellent points about optimizing the RDP. Combining some of his algorithms with my own approaches, I have gotten performance boosts that are starting to look quite convincing.

With a single-threaded build, the speedup over master is about 2 VI/s on my desktop at the moment. You can download a build with the RDP optimizations here:
http://downloads.cen64.com/cen64-linux6 ... mental_rdp
http://downloads.cen64.com/cen64-linux6 ... mental_rdp
http://downloads.cen64.com/cen64-win64- ... al_rdp.exe
http://downloads.cen64.com/cen64-win64- ... al_rdp.exe

The optimizations do not support SSE2 or SSSE3 builds right now. I may look into this in the future.
Attachments
angrylion-rdp_2.png (43.56 KiB)

juef
Posts: 31
Joined: Sun Oct 27, 2013 10:19 pm

Re: Optimizing the RDP

Post by juef » Sat Jun 18, 2016 8:29 pm

That's great progress, thanks for sharing! :D

Snowstorm64
Posts: 303
Joined: Sun Oct 20, 2013 8:22 pm

Re: Optimizing the RDP

Post by Snowstorm64 » Sun Jun 19, 2016 10:39 am

This is exciting! Do you think these optimizations in the end will yield enough boost so that games can be run at 60 VI/s with the single-threaded build?
OS: Debian GNU/Linux Jessie (8.0)
CPU: Intel i7 4770K @ 3.5 GHz
Build: AVX (compiled from git)

AIO
Posts: 51
Joined: Wed Nov 05, 2014 4:56 pm

Re: Optimizing the RDP

Post by AIO » Sun Jun 19, 2016 1:38 pm

Snowstorm64 wrote:Do you think these optimizations in the end will yield enough boost so that games can be run at 60 VI/s with the single-threaded build?
A lot of 2D games should be able to run at full speed after optimizing. It seems that games that use a lot of rectangles also use the RSP less (Yoshi's Story, Bangaioh, Tower and Shaft, etc.), so multi-threading won't be necessary for many 2D games.

There are certain games that appear to have a variable frame rate, that seem to have 0 chance of running full speed. I'm starting to wonder if these games actually frame-skip (and how often), on the console. I'm thinking maybe the reason they have no chance of full speed is because HLE emulators may be running some of these games at a higher speed. Even after factoring in frame-skip though, some games will still have practically no chance of ever running full speed. I'd appreciate it if someone could confirm how much the N64 frame skips in some of these games with seemingly variable frame rate. Some games I have in mind are Vigilante 8, Goldeneye, and Star Wars Ep 1 Racer.

MarathonMan
Site Admin
Posts: 692
Joined: Fri Oct 04, 2013 4:49 pm

Re: Optimizing the RDP

Post by MarathonMan » Sun Jun 19, 2016 3:21 pm

Snowstorm64 wrote:This is exciting! Do you think these optimizations in the end will yield enough boost so that games can be run at 60 VI/s with the single-threaded build?
I'll let you know when I'm able to use two threads to run things at 60 VI/s, let alone one. ;)

In all seriousness, AIO is correct. Some 2D titles (Rampage: World Tour) already run at 60 VI/s for me. OTOH, SPLiT's Nacho demo is a different story...

Snowstorm64
Posts: 303
Joined: Sun Oct 20, 2013 8:22 pm

Re: Optimizing the RDP

Post by Snowstorm64 » Sun Jun 19, 2016 5:34 pm

AIO wrote:I'd appreciate it if someone could confirm how much the N64 frame skips in some of these games with seemingly variable frame rate. Some games I have in mind are Vigilante 8, Goldeneye, and Star Wars Ep 1 Racer.
I think Banjo-Kazooie has that too, and maybe also other Rare games.

On the other hand, Super Mario 64 seems to be more performant with these RDP optimizations and with -multithread option enabled (~5 VI/s boost, average is 50-60 VI/s in most levels, with peaks = 80 VI/s and drops = 40 VI/s). I have to say SM64 is quite playable even at 50 VI/s, at least for me. :D

MarathonMan
Site Admin
Posts: 692
Joined: Fri Oct 04, 2013 4:49 pm

Re: Optimizing the RDP

Post by MarathonMan » Sun Jun 19, 2016 9:15 pm

Super Smash Bros. also seems to have gotten a nice boost, at least for me.

MarathonMan
Site Admin
Posts: 692
Joined: Fri Oct 04, 2013 4:49 pm

Re: Optimizing the RDP

Post by MarathonMan » Thu Jun 23, 2016 3:33 pm

Update 06/23/2016

Looks like a movdqa alignment issue crept into the Linux builds at some point? There is now a 5+% VI/s gain over master.

Overall, performance is not much higher than in the last posting, but it is better in low-VI/s areas.
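
For context on the movdqa remark: movdqa is the SSE2 aligned 128-bit move, and it faults when its memory operand is not on a 16-byte boundary, whereas movdqu accepts any address. The following is a minimal C sketch of that distinction, not the actual CEN64 change. The names is_aligned16, copy16, and demo_misaligned_copy are hypothetical, and the intrinsics path is only compiled when the compiler defines __SSE2__.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

/* Returns 1 if p sits on a 16-byte boundary, which is what movdqa requires. */
int is_aligned16(const void *p) {
  return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: copy 16 bytes, using the aligned form only when both
 * pointers qualify and the unaligned form otherwise. */
void copy16(void *dst, const void *src) {
#ifdef __SSE2__
  if (is_aligned16(dst) && is_aligned16(src)) {
    __m128i v = _mm_load_si128((const __m128i *)src);   /* aligned load (movdqa) */
    _mm_store_si128((__m128i *)dst, v);                 /* aligned store (movdqa) */
  } else {
    __m128i v = _mm_loadu_si128((const __m128i *)src);  /* unaligned load (movdqu) */
    _mm_storeu_si128((__m128i *)dst, v);                /* unaligned store (movdqu) */
  }
#else
  memcpy(dst, src, 16);  /* portable fallback when SSE2 is unavailable */
#endif
}

/* Returns 1 if a copy from a deliberately misaligned source round-trips. */
int demo_misaligned_copy(void) {
  alignas(16) uint8_t buf[32];
  alignas(16) uint8_t out[16];
  for (int i = 0; i < 32; i++)
    buf[i] = (uint8_t)i;
  assert(is_aligned16(buf));  /* alignas(16) guarantees this */
  copy16(out, buf + 1);       /* buf + 1 is misaligned by construction */
  return memcmp(out, buf + 1, 16) == 0;
}
```

A buffer declared with alignas(16) can take the aligned path; a pointer offset by one byte cannot. On Linux, an aligned move applied to a misaligned address is usually reported as a segmentation fault.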
Attachments
angrylion-rdp_3.png (43.04 KiB)

Snowstorm64
Posts: 303
Joined: Sun Oct 20, 2013 8:22 pm

Re: Optimizing the RDP

Post by Snowstorm64 » Thu Jun 23, 2016 4:06 pm

With latest commits, SM64 has just gained another +1 VI/s boost. :) But SSB64 is now broken...

EDIT: Paper Mario and Mario Party are causing a segfault too.
EDIT2: This is the faulty commit.
EDIT3: Wrong commit, I have updated the link.

MarathonMan
Site Admin
Posts: 692
Joined: Fri Oct 04, 2013 4:49 pm

Re: Optimizing the RDP

Post by MarathonMan » Thu Jun 23, 2016 4:38 pm

Snowstorm64 wrote:But SSB64 is now broken...
Looks a-ok to me... can you please provide more info?

EDIT: Unless it's a segfault... I know the cause of that and just caught it a bit ago myself.

Snowstorm64
Posts: 303
Joined: Sun Oct 20, 2013 8:22 pm

Re: Optimizing the RDP

Post by Snowstorm64 » Thu Jun 23, 2016 4:52 pm

Looks like I didn't make it in time to edit the message... :P However, is this the commit you're referring to as the cause?

MarathonMan
Site Admin
Posts: 692
Joined: Fri Oct 04, 2013 4:49 pm

Re: Optimizing the RDP

Post by MarathonMan » Thu Jun 23, 2016 5:51 pm

Snowstorm64 wrote:Looks like I didn't make it in time to edit the message... :P However, is this the commit you're referring to as the cause?
That commit causes a segfault sometimes, yep.

AIO
Posts: 51
Joined: Wed Nov 05, 2014 4:56 pm

Re: Optimizing the RDP

Post by AIO » Fri Jun 24, 2016 6:49 pm

Snowstorm64 wrote:Do you think these optimizations in the end will yield enough boost so that games can be run at 60 VI/s with the single-threaded build?
What games do you have in mind? I'm willing to profile and examine a few popular games that are very slow, to see what can be done. I'm not really concerned about games like Star Fox, SM64, OOT, etc. because those are relatively lightweight games tbh. Those can already run at full speed once optimizations are done. Although I may, at some point, profile the explosions in Star Fox again so that the VI/s don't drop.

Nintendo Maniac 64
Posts: 185
Joined: Fri Oct 04, 2013 11:37 pm

Re: Optimizing the RDP

Post by Nintendo Maniac 64 » Sat Jun 25, 2016 2:20 am

AIO wrote:I'd appreciate it if someone could confirm how much the N64 frame skips in some of these games with seemingly variable frame rate. Some games I have in mind are Vigilante 8, Goldeneye, and Star Wars Ep 1 Racer.
I don't own Vigilante 8, and I never noticed frame skipping when I last played it 10-15 years ago (but I was much less sensitive to such things), but anyone who's done 3-4 player splitscreen in GoldenEye with any sort of explosive weapon in a level with exploding scenery will know very well that GoldenEye has an extremely variable framerate - it can and will drop down to what has to be like 5fps when things are really blowing up.

And to clarify, the game does not slow down (like some NES games) but rather will become choppy (like most PC games), thereby implying an intentionally variable framerate - it's very noticeable when you're trying to aim your rocket launcher at the attacker(s) causing the mini WW3 around you and one moment your rocket launcher is pointing slightly to the right and then, half a second later when the next frame finally arrives, your giant bazooka barrel is pointing clear across your entire screen to the far left.
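
The distinction drawn above (slowdown versus choppiness) can be sketched in C. This is a hypothetical model, not code from any game or emulator: time is counted in VIs (1/60 s on NTSC), and UNITS_PER_VI, DESIGN_VIS_PER_FRAME, and the function names are made up for illustration.

```c
/* Hypothetical constants: distance moved per VI at the designed speed, and
 * the number of VIs per frame the game was designed around (2 = 30 fps). */
enum { UNITS_PER_VI = 3, DESIGN_VIS_PER_FRAME = 2 };

/* Frame-locked logic: every rendered frame advances the world by a fixed
 * amount, no matter how long the frame took. Slow frames slow the game. */
int advance_frame_locked(int pos) {
  return pos + UNITS_PER_VI * DESIGN_VIS_PER_FRAME;
}

/* Variable-rate logic: each frame advances the world by the time that
 * actually elapsed. Slow frames look choppy, but game speed is preserved. */
int advance_variable(int pos, int vis_elapsed) {
  return pos + UNITS_PER_VI * vis_elapsed;
}

/* Simulate total_vis of real time when every frame takes vis_per_frame VIs
 * to render. Returns the distance covered by a moving object. */
int simulate(int total_vis, int vis_per_frame, int variable) {
  int pos = 0;
  for (int vi = 0; vi + vis_per_frame <= total_vis; vi += vis_per_frame) {
    pos = variable ? advance_variable(pos, vis_per_frame)
                   : advance_frame_locked(pos);
  }
  return pos;
}
```

With these constants, simulate(60, 12, 0) models a frame-locked game rendering at 5 fps and covers only a sixth of the distance of simulate(60, 12, 1), which covers the full distance in five large jumps.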
CEN64 Forum's resident straight-male kuutsundere
(just "tsundere" makes people think of "Shana clones" *shivers*)

CPU+iGPU: Pentium G3258 @ 4.6GHz/1.281v
dGPU: Radeon HD5870 1GB
RAM: Vengeance 1600 4x4GB
OS: Windows 7

Snowstorm64
Posts: 303
Joined: Sun Oct 20, 2013 8:22 pm

Re: Optimizing the RDP

Post by Snowstorm64 » Sun Jul 03, 2016 5:46 am

AIO wrote:What games do you have in mind? I'm willing to profile and examine a few popular games that are very slow, to see what can be done. I'm not really concerned about games like Star Fox, SM64, OOT, etc. because those are relatively lightweight games tbh. Those can already run at full speed once optimizations are done. Although I may, at some point, profile the explosions in Star Fox again so that the VI/s don't drop.
Other than the games that are mentioned in this thread (like Vigilante 8), I can think of F-Zero X (the cartridge port of the Expansion Kit, because the N64 version isn't working right now), the Clock Town part in Majora's Mask, and the world hub in Mario Party. There's also Banjo-Kazooie, Doom 64, Star Wars Episode 1 - Racer (this is especially slow!), and Iggy's Reckin' Balls (although I don't think this is a popular game, nor is it slow, there's a scene after the end of the level where a bunch of colorful explosions makes the VI/s drop, like in Star Fox 64). All those games, except Iggy's Reckin' Balls, rarely pass the 50 VI/s point, even with -multithread enabled on my PC.

AIO
Posts: 51
Joined: Wed Nov 05, 2014 4:56 pm

Re: Optimizing the RDP

Post by AIO » Sun Jul 03, 2016 2:05 pm

Nintendo Maniac 64 wrote: I don't own Vigilante 8, and I never noticed frame skipping when I last played it 10-15 years ago (but I was much less sensitive to such things), but anyone who's done 3-4 player splitscreen in GoldenEye with any sort of explosive weapon in a level with exploding scenery will know very well that GoldenEye has an extremely variable framerate - it can and will drop down to what has to be like 5fps when things are really blowing up.

And to clarify, the game does not slow down (like some NES games) but rather will become choppy (like most PC games), thereby implying an intentionally variable framerate - it's very noticeable when you're trying to aim your rocket launcher at the attacker(s) causing the mini WW3 around you and one moment your rocket launcher is pointing slightly to the right and then, half a second later when the next frame finally arrives, your giant bazooka barrel is pointing clear across your entire screen to the far left.
I'm glad you brought up Goldeneye. I'm guessing that some of these games run really poorly with Angrylion's partly because they also ran poorly on the console.
Snowstorm64 wrote:Other than the games that are mentioned in this thread (like Vigilante 8), I can think of F-Zero X (the cartridge port of the Expansion Kit, because the N64 version isn't working right now), the Clock Town part in Majora's Mask, and the world hub in Mario Party. There's also Banjo-Kazooie, Doom 64, Star Wars Episode 1 - Racer (this is especially slow!), and Iggy's Reckin' Balls (although I don't think this is a popular game, nor is it slow, there's a scene after the end of the level where a bunch of colorful explosions makes the VI/s drop, like in Star Fox 64). All those games, except Iggy's Reckin' Balls, rarely pass the 50 VI/s point, even with -multithread enabled on my PC.
Mario Party should be full speed once the optimizations are done. That game isn't too intensive. F-Zero is going to be tougher. I hardly ever tested Iggy's, Banjo, or Doom 64. I haven't tested Clock Town, but I'm sure Zelda MM can run full speed after applying more optimizations.

Star Wars Episode 1 - Racer is another one of those games that may have frameskip on console. This needs to be investigated. Interestingly, some parts of DK64 may also have frameskip (like the part where he misses the vine on most emulators).

Snowstorm64
Posts: 303
Joined: Sun Oct 20, 2013 8:22 pm

Re: Optimizing the RDP

Post by Snowstorm64 » Sun Jul 03, 2016 2:29 pm

AIO wrote: Mario Party should be full speed once the optimizations are done. That game isn't too intensive. F-Zero is going to be tougher. I hardly ever tested Iggy's, Banjo, or Doom 64. I haven't tested Clock Town, but I'm sure Zelda MM can run full speed after applying more optimizations.

Star Wars Episode 1 - Racer is another one of those games that may have frameskip on console. This needs to be investigated. Interestingly, some parts of DK64 may also have frameskip (like the part where he misses the vine on most emulators).
True, Mario Party isn't that intensive, but it becomes slow in that particular place I have mentioned before, the world hub where there are the tube, the bank, the raft and some other buildings. I cannot think of any other similar places where the VI/s drops, though.

Zelda MM is a bit more intensive but playable enough; however, it slows down more in Clock Town, especially in the south sector, where it can drop to nearly 40 VI/s.

AIO
Posts: 51
Joined: Wed Nov 05, 2014 4:56 pm

Re: Optimizing the RDP

Post by AIO » Sun Jul 03, 2016 2:58 pm

Snowstorm64 wrote: True, Mario Party isn't that intensive, but it becomes slow in that particular place I have mentioned before, the world hub where there are the tube, the bank, the raft and some other buildings. I cannot think of any other similar places where the VI/s drops, though.

Zelda MM is a bit more intensive but playable enough; however, it slows down more in Clock Town, especially in the south sector, where it can drop to nearly 40 VI/s.
In some cases it's hard to speculate because, on one hand, there are a lot of optimizations that even I haven't done yet. At the same time, idk exactly how much slower it will be after achieving cycle accuracy. I tested that world hub scene and it has a lot of room for improvement.

An optimized dynarec will even allow you to run games much faster (especially those 2D games). I'll try profiling Clock Town sometime this week.

Nintendo Maniac 64
Posts: 185
Joined: Fri Oct 04, 2013 11:37 pm

Re: Optimizing the RDP

Post by Nintendo Maniac 64 » Mon Jul 04, 2016 5:30 pm

AIO wrote:Star Wars Episode 1 - Racer is another one of those games that may have frameskip on console. This needs to be investigated. Interestingly, some parts of DK64 may also have frameskip (like the part where he misses the vine on most emulators).
I could probably check both of these since I have both games and my N64 is even hooked up and the like, though you might have to wait at least 2 days before I get results.

MarathonMan
Site Admin
Posts: 692
Joined: Fri Oct 04, 2013 4:49 pm

Re: Optimizing the RDP

Post by MarathonMan » Mon Jul 04, 2016 7:03 pm

AIO wrote:Interestingly, some parts of DK64 may also have frameskip (like the part where he misses the vine on most emulators).
This one instance is not frameskip - it's related to memory and DMA timings. If I fiddle with the memory latency in CEN64, I can get DK to grab the vine.

AIO
Posts: 51
Joined: Wed Nov 05, 2014 4:56 pm

Re: Optimizing the RDP

Post by AIO » Tue Jul 05, 2016 2:17 am

Snowstorm64 wrote: Zelda MM is a bit more intensive but playable enough; however, it slows down more in Clock Town, especially in the south sector, where it can drop to nearly 40 VI/s.
I tried running around Clock Town today and the game doesn't seem intensive tbh. I'm honestly surprised you don't get full speed with multi-threading. I profiled and saw that it largely used functions I haven't bothered optimizing yet, which is good news I guess, since that means there is a lot of room for improvement.
Nintendo Maniac 64 wrote: I could probably check both of these since I have both games and my N64 is even hooked up and the like, though you might have to wait at least 2 days before I get results.
Nice! That would be cool if you tested :D . I'm patient, so you can take your time.
MarathonMan wrote:This one instance is not frameskip - it's related to memory and DMA timings. If I fiddle with the memory latency in CEN64, I can get DK to grab the vine.
I can't say I am sure, but it seems like that part of the game is running extra slow. When using counter factor 1 or 2 in 1964, he misses the vine and the frame rate during that scene seems good. But if I use CF 3, the game runs at a slower frame rate during that scene, but DK doesn't miss the vine. When I watched a YouTube video, it seemed that the console also has a bad frame rate in that scene.

Nintendo Maniac 64
Posts: 185
Joined: Fri Oct 04, 2013 11:37 pm

Re: Optimizing the RDP

Post by Nintendo Maniac 64 » Sat Jul 09, 2016 7:40 pm

AIO wrote:Nice! That would be cool if you tested :D . I'm patient, so you can take your time.
From what I can tell, in both SWEp1R and DK64, the gameplay itself slows down when the framerate drops, thereby implying a non-variable framerate.

However, I think both games might have a variable framerate when running above what seems to be 20fps. I'm less confident that DK64 does this, but I'm pretty sure SWEp1R does.

MarathonMan
Site Admin
Posts: 692
Joined: Fri Oct 04, 2013 4:49 pm

Re: Optimizing the RDP

Post by MarathonMan » Sun Jul 10, 2016 9:15 pm

Update 07/10/2016

Thought of some more ideas today. Now 7-8% faster than master.
Attachments
angrylion-rdp_4.png (40.69 KiB)

Nintendo Maniac 64
Posts: 185
Joined: Fri Oct 04, 2013 11:37 pm

Re: Optimizing the RDP

Post by Nintendo Maniac 64 » Sun Jul 10, 2016 10:53 pm

MarathonMan wrote:07/10/2016
This is going to seem incredibly off-topic... MarathonMan, maybe I'm thinking of a completely different guy, but I thought you were a native of South America? I say this because only someone native to the US and/or its territories (and maybe Canada or Mexico) would use the M/D/Y format (AFAIK it's Y/M/D or D/M/Y everywhere else).

MarathonMan
Site Admin
Posts: 692
Joined: Fri Oct 04, 2013 4:49 pm

Re: Optimizing the RDP

Post by MarathonMan » Mon Jul 11, 2016 12:31 am

Nintendo Maniac 64 wrote:
MarathonMan wrote:07/10/2016
This is going to seem incredibly off-topic... MarathonMan, maybe I'm thinking of a completely different guy, but I thought you were a native of South America? I say this because only someone native to the US and/or its territories (and maybe Canada or Mexico) would use the M/D/Y format (AFAIK it's Y/M/D or D/M/Y everywhere else).
http://orig04.deviantart.net/95cb/f/201 ... 8flm4d.jpg

Nintendo Maniac 64
Posts: 185
Joined: Fri Oct 04, 2013 11:37 pm

Re: Optimizing the RDP

Post by Nintendo Maniac 64 » Mon Jul 11, 2016 12:32 am

Hey, I myself am a native of northeast Ohio. ;)

Snowstorm64
Posts: 303
Joined: Sun Oct 20, 2013 8:22 pm

Re: Optimizing the RDP

Post by Snowstorm64 » Sat Jul 16, 2016 1:30 pm

With today's commits:
MarioKart120.png (47.65 KiB)
(Okay, I admit I have cheated a bit with the time trial mode, but hey, it's still good :P)

MarathonMan
Site Admin
Posts: 692
Joined: Fri Oct 04, 2013 4:49 pm

Re: Optimizing the RDP

Post by MarathonMan » Sat Jul 16, 2016 2:32 pm

Have you noticed any problem with frameskipping? It looks like games are sometimes cutting frames (e.g., SPLiT's Nacho demo). I have to compare to the console and verify.

Snowstorm64
Posts: 303
Joined: Sun Oct 20, 2013 8:22 pm

Re: Optimizing the RDP

Post by Snowstorm64 » Sat Jul 16, 2016 3:32 pm

MarathonMan wrote:Have you noticed any problem with frameskipping? It looks like games are sometimes cutting frames (e.g., SPLiT's Nacho demo). I have to compare to the console and verify.
I haven't tried too many games, but I believe what you are saying about frameskipping could be true. But it's hard for me to compare the overclocked 60 Hz version on CEN64 with the standard 50 Hz version of the same games on my N64...
