Linux Archive

Linux Archive (http://www.linux-archive.org/)

Quicksort 01-06-2012 11:35 PM

Fedora 15 x64 stunning performance (vs Win7).
 
Hardware platform: Dell Studio XPS, Core i7 950, 8 MB L3 cache.
Operating systems: Fedora 15 x64 / Windows 7 x64 Ultimate

Hi everybody,

I have developed a sorting/searching library written in assembly
language. As long as one stays in the L1 cache (in-place physical sorts),
speeds are identical, but when the proportion of L1 cache misses
is high (sorts by reference, which return an ordering vector as APL
sorts do), Fedora dramatically outperforms Win7.
This performance gap is stunning but consistent, and I am not
exaggerating. Something weird occurs when one leaves the L1 cache. As long
as one remains in the L3 cache (my Core i7 950 has an 8 MB cache), Win7's
performance penalty is about 33%; with mixed L3 cache/main memory accesses
it grows to 50-60%! I wish some x64 Linux kernel developer could enlighten
me. The assembly code is exactly the same in both cases (except, of course,
for API calls being replaced with Linux system calls), and the JWASM
assembler is used for both. No disk swapping, large/huge pages, or virtual
machine is involved, and my test program is a plain application run from
the command line.
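
For readers unfamiliar with the distinction drawn above, here is a minimal
sketch of the two kinds of sort. It is in Python rather than assembly, it is
not the library's actual code, and the function names grade_up and
sort_in_place are made up for illustration. A sort by reference leaves the
data where it is and returns an ordering vector of indices (what APL calls
a grade); reading the data through that vector then jumps around memory in
a data-dependent order, which is presumably where the extra cache misses
come from.

```python
def grade_up(keys):
    """Sort by reference: return the ordering vector for `keys`.

    The result lists the indices that would put `keys` in ascending
    order. `keys` itself is not modified.
    """
    # sorted() is stable, so equal keys keep their original relative order.
    return sorted(range(len(keys)), key=keys.__getitem__)


def sort_in_place(keys):
    """Physical sort: reorder the data itself, returning nothing."""
    keys.sort()


keys = [30, 10, 40, 20]
order = grade_up(keys)

print(order)                     # [1, 3, 0, 2]
print([keys[i] for i in order])  # [10, 20, 30, 40] (indirect, scattered reads)
print(keys)                      # [30, 10, 40, 20] (original data untouched)

sort_in_place(keys)
print(keys)                      # [10, 20, 30, 40] (data physically moved)
```

The indirect read keys[i] for each i in order is the access pattern that
differs from the in-place case.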

Any thoughts?

Quicksort



--
users mailing list
users@lists.fedoraproject.org
To unsubscribe or change subscription options:
https://admin.fedoraproject.org/mailman/listinfo/users
Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines
Have a question? Ask away: http://ask.fedoraproject.org

Konstantin Svist 01-07-2012 12:25 AM

You would probably have much better luck posting this question to LKML -
that's where most kernel developers lurk.

http://vger.kernel.org/vger-lists.html#linux-kernel

HTH


