Search found 25 matches

by izy
Sun May 07, 2017 9:38 pm
Forum: Bugs/Issues
Topic: Are the recent 2017 Windows builds broken? Or is AVX now required or something?
Replies: 7
Views: 6627

Re: Are the recent 2017 Windows builds broken? Or is AVX now required or something?

On Windows you could avoid creating a new dependency on iconv for the old cen64 (also, the call to iconv_close is nowhere to be seen). It is easier to add an exception... include windows.h, add a few #ifdef _WIN32 blocks and just call MultiByteToWideChar to do the conversion. It is straightforward after ...
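A minimal sketch of that suggestion, assuming the goal is to turn a multibyte string into wide characters without iconv on Windows. The helper name to_wide and its codepage parameter are hypothetical, and the non-Windows branch uses mbstowcs purely as a stand-in so the snippet builds everywhere; it is not what cen64 does.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: convert a NUL-terminated multibyte string to wide
 * characters. Returns the number of wide characters written (terminator
 * excluded), or -1 on failure or if dst is too small. */
static int to_wide(unsigned codepage, const char *src, wchar_t *dst, int dst_cap)
{
    assert(src != NULL && dst != NULL && dst_cap > 0);
#ifdef _WIN32
    /* cbMultiByte = -1: convert the whole string, terminator included. */
    int n = MultiByteToWideChar(codepage, 0, src, -1, dst, dst_cap);
    return n > 0 ? n - 1 : -1;
#else
    (void)codepage; /* stand-in path: relies on the current C locale */
    size_t n = mbstowcs(dst, src, (size_t)dst_cap);
    if (n == (size_t)-1 || n >= (size_t)dst_cap)
        return -1;
    return (int)n;
#endif
}
```

Which code page to pass depends on the encoding of the input, which the excerpt does not state.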
by izy
Thu May 04, 2017 7:55 pm
Forum: Bugs/Issues
Topic: Are the recent 2017 Windows builds broken? Or is AVX now required or something?
Replies: 7
Views: 6627

Re: Are the recent 2017 Windows builds broken? Or is AVX now required or something?

It seems cen64 is gonna be a compiler, an operating system and, at some point in an undisclosed future, also an emulator. At the moment it is in its first stage -- a compiler. If you want to see something from the current cen64, you should run cmake on it with the debug flag in order to get some verbose ...
by izy
Sun Jul 17, 2016 10:14 am
Forum: RCP
Topic: How To Triangle
Replies: 6
Views: 7623

How not To Mistake

Let me ruin the party :D There is a missing link. You turned this:

    int32_t crmajm = cr1+(cr3-cr1)*(y2-y1)/(y3-y1);
    int32_t cgmajm = cg1+(cg3-cg1)*(y2-y1)/(y3-y1);
    int32_t cbmajm = cb1+(cb3-cb1)*(y2-y1)/(y3-y1);
    int32_t camajm = ca1+(ca3-ca1)*(y2-y1)/(y3-y1);

into this:

    int32_t crde = (cr3-cr1)/(y3-y...
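The excerpt is cut off before it names the missing link, so the following is not a reconstruction of the post's argument. It only illustrates one general property of such a rewrite when everything is int32_t: dividing before multiplying truncates the per-line delta early. The function names and the 16.16 fixed-point variant are assumptions added for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply first, divide last: a single truncation at the very end. */
static int32_t interp_direct(int32_t c1, int32_t c3,
                             int32_t y1, int32_t y2, int32_t y3)
{
    assert(y3 != y1);
    return c1 + (c3 - c1) * (y2 - y1) / (y3 - y1);
}

/* Divide first: the per-line delta loses its fraction before scaling. */
static int32_t interp_stepped(int32_t c1, int32_t c3,
                              int32_t y1, int32_t y2, int32_t y3)
{
    assert(y3 != y1);
    int32_t de = (c3 - c1) / (y3 - y1);
    return c1 + de * (y2 - y1);
}

/* Same stepping idea with a 16.16 fixed-point delta that keeps the
 * fraction; 64-bit intermediates avoid overflow for small inputs. */
static int32_t interp_stepped_fixed(int32_t c1, int32_t c3,
                                    int32_t y1, int32_t y2, int32_t y3)
{
    assert(y3 != y1);
    int64_t de = (int64_t)(c3 - c1) * 65536 / (y3 - y1);
    return c1 + (int32_t)(de * (y2 - y1) / 65536);
}
```

With c1 = 0, c3 = 255, y1 = 0, y2 = 50, y3 = 100, the direct form gives 127, the integer-delta form gives 100, and the fixed-point delta gives 127 again.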
by izy
Wed Jul 13, 2016 9:45 am
Forum: Development
Topic: eventfd-based barrier patch (untested)
Replies: 16
Views: 10814

Re: eventfd-based barrier patch (untested)

I updated the patch with a new version that does not use a counter. It uses XOR to toggle between two eventfds. Although it requires two fds and is full of global variables, it makes a single system call per passage. In the end it gives little advantage over the pthread version, since it looks like it ...
by izy
Sun Apr 17, 2016 10:50 am
Forum: Development
Topic: eventfd-based barrier patch (untested)
Replies: 16
Views: 10814

Re: eventfd-based barrier patch (untested)

Snowstorm64, would you mind testing this little benchmark program https://issues.apache.org/jira/browse/TS-2137 ? Because your results are really hard to understand (and likely wrong). The pthreads APIs cannot be faster than an eventfd. At least an atomic operation and a futex word are required for ...
by izy
Tue Feb 23, 2016 4:24 pm
Forum: Compatibility
Topic: Compatibility on new 2016 builds
Replies: 58
Views: 21312

Re: Compatibility on new 2016 builds

MarathonMan wrote: I also fixed some audio-related bugs
Are you editing ai/controller.c? We could conflict; I am also writing a patch for that file.
by izy
Mon Feb 22, 2016 8:05 pm
Forum: Development
Topic: eventfd-based barrier patch (untested)
Replies: 16
Views: 10814

Re: eventfd-based barrier patch (untested)

301 VI/s is only possible if the RCP thread is free-running and the barrier is skipped altogether. I suggest you pack up your PC and throw it in the trash. Get a TV instead.
by izy
Mon Feb 22, 2016 8:04 pm
Forum: Compatibility
Topic: Compatibility on new 2016 builds
Replies: 58
Views: 21312

Re: Compatibility on new 2016 builds

Yes, I'm aware of this. Not sure what the cause of it is... guessing something with interrupts. I don't know if you are thinking about this, but the notification functions in vr4300/interface.c aren't really atomic. I guess this is why there is a FIXME after each check_for_interrupts() call. In t...
by izy
Sun Feb 21, 2016 1:08 pm
Forum: Development
Topic: eventfd-based barrier patch (untested)
Replies: 16
Views: 10814

Re: eventfd-based barrier patch (untested)

You didn't notice that the two eventfd tests gave different results, as if they ran on different computers. Something has surely gone wrong. It is OK if you disable the audio. See what the results are if you run: taskset 1 ./cen64 -multithread.. and with taskset 3.
by izy
Fri Feb 19, 2016 1:57 pm
Forum: Development
Topic: eventfd-based barrier patch (untested)
Replies: 16
Views: 10814

Re: eventfd-based barrier patch (untested)

Judging from your results, either the eventfd version performs better than vanilla or programs running in the background biased the results. You log sixty entries (sixty VI/s). Paste them into a LibreOffice spreadsheet. Sum all sixty VI/second entries to obtain a VI/minute entry. Second(time) pthread (VI/s) eventfd (VI/s) 1 ...
by izy
Thu Feb 18, 2016 8:24 am
Forum: Development
Topic: eventfd-based barrier patch (untested)
Replies: 16
Views: 10814

Re: eventfd-based barrier patch (untested)

I don't know if this is the correct way to test cen64, but using Super Smash Bros I get these values:

    ./cen64.original -multithread -headless ./pifdata.bin ./ssb.z64
    Using NTSC-U PIFROM
    VI/s: 0.01
    VI/s: 12.95
    VI/s: 10.53
    VI/s: 10.73
    VI/s: 10.61
    VI/s: 10.64
    VI/s: 10.66
    VI/s: 9.76
    VI/s: 9.54
    VI/s: 9.5...
by izy
Wed Feb 17, 2016 12:08 pm
Forum: Development
Topic: eventfd-based barrier patch (untested)
Replies: 16
Views: 10814

Re: eventfd-based barrier patch (untested)

That sounds a little weird to me. A thread barrier is always a bottleneck; moreover, it is a bottleneck by definition (a "barrier"). I did a sort of speed test in cen64 this way and I get a noticeable VI/s difference: ./cen64 -multithread -headless ./pifdata.bin ./mgc_2011.z64 but I don't know if th...
by izy
Tue Feb 16, 2016 12:59 pm
Forum: Development
Topic: eventfd-based barrier patch (untested)
Replies: 16
Views: 10814

eventfd-based barrier patch (untested)

This is an eventfd-based barrier that has only been compile-tested. I don't know whether it performs better than pthread_barrier_wait, or even whether it actually works, since I cannot really test the code.

    diff -rpBcN cen64-304d711-old/device/device.c cen64-304d711-new/device/device.c
    *** cen64-304d711-old/de...
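Since the diff is truncated here, the following is not the patch from the post. It is a minimal Linux-only sketch of one way two threads can rendezvous through a pair of eventfds: each side posts to the peer's fd and then blocks on its own. All names (efd_pair, efd_rendezvous and so on) are hypothetical, EFD_SEMAPHORE is an assumption chosen so that each read consumes exactly one post, and EINTR handling is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* One eventfd per thread; thread 0 waits on fd[0], thread 1 on fd[1]. */
struct efd_pair { int fd[2]; };

static int efd_pair_init(struct efd_pair *p)
{
    assert(p != NULL);
    p->fd[0] = eventfd(0, EFD_SEMAPHORE);
    p->fd[1] = eventfd(0, EFD_SEMAPHORE);
    if (p->fd[0] < 0 || p->fd[1] < 0) {
        if (p->fd[0] >= 0) close(p->fd[0]);
        if (p->fd[1] >= 0) close(p->fd[1]);
        return -1;
    }
    return 0;
}

/* Add one to the eventfd counter, waking a blocked reader if any. */
static int efd_post(int fd)
{
    uint64_t one = 1;
    return write(fd, &one, sizeof one) == (ssize_t)sizeof one ? 0 : -1;
}

/* Block until the counter is nonzero, then decrement it by one. */
static int efd_take(int fd)
{
    uint64_t value;
    return read(fd, &value, sizeof value) == (ssize_t)sizeof value ? 0 : -1;
}

/* Called by thread `self` (0 or 1): tell the peer we arrived, then wait
 * until the peer has arrived as well. */
static int efd_rendezvous(struct efd_pair *p, int self)
{
    assert(p != NULL && (self == 0 || self == 1));
    if (efd_post(p->fd[self ^ 1]) != 0)
        return -1;
    return efd_take(p->fd[self]);
}
```

This form costs two system calls per passage (one write, one read) and only covers the two-thread case.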
by izy
Tue Feb 16, 2016 12:58 pm
Forum: Open Discussion
Topic: Thinking about multi-threading...
Replies: 41
Views: 48350

Re: Thinking about multi-threading...

I see that the multithreading code has been updated now. I don't know if that code was written that way for any particular reason, such as to leave room for adding new features in the future. At the moment it seems to me that the code is only used to create a thread barrier? Using a mutex (a locking ...
by izy
Mon Feb 15, 2016 1:16 pm
Forum: Open Discussion
Topic: Thinking about multi-threading...
Replies: 41
Views: 48350

Re: Thinking about multi-threading...

MarathonMan wrote: the sync overhead
How is that done?

Edit: I can't find that code in the patch http://forums.cen64.com/viewtopic.php?f=5&p=2406#p1867
by izy
Mon Feb 08, 2016 12:33 pm
Forum: Suggestions
Topic: cold and hot attributes in programs?
Replies: 2
Views: 1980

Re: cold and hot attributes in programs?

I think that what you have in mind is possible. Btw, note that if you call a function from one section to another, for example to signal an interrupt to another component, it'll result in a very far function call. This could be avoided by forcing inlining or with some code redundancy, though. Another way...
by izy
Mon Feb 01, 2016 2:52 pm
Forum: Suggestions
Topic: cold and hot attributes in programs?
Replies: 2
Views: 1980

cold and hot attributes in programs?

I saw that cen64_cold is applied to all run-once functions with different and unrelated functionality (such as initialization/finalization functions) and to similar functions that the program uses very rarely (load/save functions). The hot/cold attributes exist to specify a priorit...
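For readers unfamiliar with these attributes, here is a small sketch of how GCC-style hot/cold annotations are typically attached. The macro names my_cold and my_hot and both functions are hypothetical; this does not show how cen64_cold itself is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper macros; they expand to nothing on compilers
 * that do not understand GNU attributes. */
#if defined(__GNUC__)
#define my_cold __attribute__((cold))
#define my_hot  __attribute__((hot))
#else
#define my_cold
#define my_hot
#endif

/* Run-once setup: cold functions are optimized for size and, on many
 * targets, grouped in a separate subsection of the text section. */
my_cold static void table_init(uint32_t *table, size_t n)
{
    assert(table != NULL);
    for (size_t i = 0; i < n; i++)
        table[i] = (uint32_t)(i * i);
}

/* Called all the time: hot functions are optimized more aggressively. */
my_hot static uint32_t table_lookup(const uint32_t *table, size_t n, size_t i)
{
    assert(table != NULL && n > 0);
    return table[i % n];
}
```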
by izy
Sun Sep 27, 2015 9:05 am
Forum: Compatibility
Topic: Support .n64
Replies: 26
Views: 34097

Re: Support .n64

Once you've loaded an N64 ROM image into RAM, you can swap the words there. It is easy and fast: two lines of code. That's all it takes to turn an array from little to big endian, or vice versa. I saw that CEN64 memory-maps the ROM file being used. Prove me wrong when I say that tha...
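As a rough illustration of the in-memory swap described above, assuming the image sits in a writable buffer of 32-bit words (the function names are made up for this sketch):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit word. */
static uint32_t swap32(uint32_t w)
{
    return (w >> 24) | ((w >> 8) & 0x0000ff00u) |
           ((w << 8) & 0x00ff0000u) | (w << 24);
}

/* Swap every word of the buffer in place; applying it twice restores
 * the original contents. */
static void swap_words(uint32_t *words, size_t count)
{
    assert(words != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        words[i] = swap32(words[i]);
}
```

The loop body is the "two lines" part; everything else is scaffolding. A read-only mapping of the file would first need to be copied or mapped writable before this can run.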
by izy
Sun Jul 12, 2015 10:31 am
Forum: Open Discussion
Topic: Contributing to CEN64 + Emulator architecture expanation
Replies: 7
Views: 6134

Re: Contributing to CEN64 + Emulator architecture expanation

If I use a LUT, I basically lock myself into a specific approach. Adding code doesn't remove yours. If you're interested, the tree could be kept and used for tests. All you'd need is to add a few "ifdef"s in order to decide which code (tree/LUT) to use. Also, I've actually tried implementing a LUT, ...
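A small sketch of the "few ifdefs" idea. What the tree and the LUT actually compute is not shown in the excerpt, so decode2 below is a made-up stand-in, and USE_LUT is a hypothetical build switch (for example -DUSE_LUT on the compiler command line).

```c
#include <assert.h>
#include <stdint.h>

#ifdef USE_LUT
/* Table-driven variant: one indexed load. */
static const uint8_t decode2_lut[4] = { 0, 10, 20, 30 };

static unsigned decode2(unsigned v)
{
    assert(v < 4);
    return decode2_lut[v];
}
#else
/* Branch-tree variant: kept around so both paths can be compared. */
static unsigned decode2(unsigned v)
{
    assert(v < 4);
    if (v & 2)
        return (v & 1) ? 30 : 20;
    return (v & 1) ? 10 : 0;
}
#endif
```

Both variants share one signature, so callers and tests do not change when the switch is flipped.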
by izy
Fri Jul 10, 2015 10:34 am
Forum: Open Discussion
Topic: Contributing to CEN64 + Emulator architecture expanation
Replies: 7
Views: 6134

Re: Contributing to CEN64 + Emulator architecture expanation

I started looking at bus_read_word/bus_write_word. Following the first call into the file ri/controller.c, I noticed that there are calls to byteswap_32 being made there. byteswap_32 uses __builtin_bswap32, which uses bswap r32. Bswap is a slow instruction. It isn't a bad thing to break the...
by izy
Thu Jul 02, 2015 3:44 pm
Forum: Open Discussion
Topic: Future state of the project
Replies: 53
Views: 96591

Re: Future state of the project

EDIT: I was trying to remove the LEAs since IIRC only more-recent Intel micro-architectures are good at munching through LEAs (see: http://www.realworldtech.com/includes/images/articles/sandy-bridge-5.png?71da3d). No, always use LEA if possible. Also, that picture is wrong. What's wrong with it? Lin...
by izy
Mon Jun 29, 2015 11:45 am
Forum: Open Discussion
Topic: Future state of the project
Replies: 53
Views: 96591

Re: Future state of the project

I found the code you were talking about in a fork of mupen64plus. It's hard to say whether the LUT is advantageous, because results differ a lot among scenarios and CPUs. It's really possible that the code with the LUT performs better in a real-world application. Measuring the CPU time by hand is...
by izy
Fri Jun 26, 2015 10:31 am
Forum: Open Discussion
Topic: Future state of the project
Replies: 53
Views: 96591

Re: Future state of the project

MarathonMan, your code works exactly like mine, just in reverse order. However, working in my direction, shifting left (multiply), is always a better idea than shifting right (divide), because it offers more chances for optimization. That is why your code generates 7 assembly instructions, and my code ...
by izy
Mon Jun 22, 2015 10:54 am
Forum: Open Discussion
Topic: Future state of the project
Replies: 53
Views: 96591

Re: Future state of the project

Hello AIO.

    c = ((s & 1)) ? (byteval & 0xf) : (byteval >> 4);
    c |= (c << 4);

I don't know if replacing 8 instructions with a LUT of 4 instructions, one of which involves memory access, is advantageous. However, it's possible to use "ADD" in place of "OR".

    c = LUT[((s & 1) << 8) | byteval];

the code could...
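To make the comparison concrete, here is a sketch that builds the 512-entry table from the branch version and shows the ADD form next to the OR form. The function names and the lut_ready flag are additions for illustration; only the two expressions come from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Branch version: pick the low or high nibble, then replicate it. */
static uint8_t expand_branch(unsigned s, uint8_t byteval)
{
    uint8_t c = (s & 1) ? (byteval & 0xf) : (byteval >> 4);
    c |= (uint8_t)(c << 4);
    return c;
}

/* Same result with ADD: c is at most 0xf, so c and c << 4 never share
 * a set bit and adding them cannot carry. */
static uint8_t expand_add(unsigned s, uint8_t byteval)
{
    uint8_t c = (s & 1) ? (byteval & 0xf) : (byteval >> 4);
    return (uint8_t)(c + (c << 4));
}

static uint8_t lut[512];
static int lut_ready;

/* Fill the table once: index is ((s & 1) << 8) | byteval. */
static void build_lut(void)
{
    for (unsigned s = 0; s < 2; s++)
        for (unsigned b = 0; b < 256; b++)
            lut[(s << 8) | b] = expand_branch(s, (uint8_t)b);
    lut_ready = 1;
}

/* Table version: one index computation and one load. */
static uint8_t expand_lut(unsigned s, uint8_t byteval)
{
    assert(lut_ready);
    return lut[((s & 1u) << 8) | byteval];
}
```

Whether the table wins depends on cache behaviour and the surrounding code, so this only shows that the forms agree, not which is faster.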
by izy
Tue Jun 02, 2015 12:08 pm
Forum: Open Discussion
Topic: Future state of the project
Replies: 53
Views: 96591

Re: Future state of the project

3) There is some dead prelude and epilogue code for the main CPU. So there is some I-Cache pollution. This could be avoided if GCC supported naked functions on x86, but it doesn't. Hello p4plus2, I looked at your code, but it may be possible to write it better. Let me help you. 1) The GCC compiler ...