NLMISC::CFastMem Class Reference

Functions for fast memory manipulation on Pentium-class processors.

#include <fast_mem.h>

Static Public Member Functions

static void precache (const void *src, uint nbytes)
 Fast precaching of memory in L1 cache using SSE or MMX where available (NB: the other methods do not perform this check). nbytes should not exceed 4K.
static void * memcpySSE (void *dst, const void *src, size_t nbytes)
 Fast memcpy using SSE instructions: prefetchnta and movntq.
static void precacheSSE (const void *src, uint nbytes)
 Fast precaching of memory in L1 cache using MMX/SSE instructions: movq and prefetchnta. Typical throughput is 880 MB/s (probably lower in practice because of overhead).
static void precacheMMX (const void *src, uint nbytes)
 Fast precaching of memory in L1 cache using MMX instructions only: movq. Typical throughput is 720 MB/s (probably lower in practice because of overhead).

Static Public Attributes

static void *(* memcpy )(void *dts, const void *src, size_t nbytes) = findBestmemcpy ()
 This is a function pointer that points to the best memcpy function available for the current OS and processor.

Detailed Description

Functions for fast memory manipulation on Pentium-class processors.

From http://www.sgi.com/developers/technology/irix/resources/asc_cpu.html

Author:
Lionel Berenguier
Nevrax France
Date:
2002

Definition at line 42 of file fast_mem.h.


Member Function Documentation

void * NLMISC::CFastMem::memcpySSE ( void *  dst,
const void *  src,
size_t  nbytes 
) [static]

Fast memcpy using SSE instructions: prefetchnta and movntq.

Call this only if both SSE and MMX are supported. NB: the copy proceeds in blocks of 4K through the L1 cache. Typical throughput is 420 MB/s instead of 150 MB/s.
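The block-wise copy described here can be illustrated with a portable sketch. It is not the NeL implementation, which relies on the SSE instructions named above; it only shows the structure of copying one 4K block at a time after first pulling that block into cache. The names blockedCopySketch, touchBlock, kBlockSize and kCacheLine are hypothetical, and the 64-byte cache line is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical block size, matching the 4K figure quoted in the description.
static const std::size_t kBlockSize = 4096;
// Assumed cache-line size; real hardware may differ.
static const std::size_t kCacheLine = 64;

// Read one byte per cache line so the block is pulled into cache before
// it is copied. A portable stand-in for a prefetch instruction.
static void touchBlock(const unsigned char *src, std::size_t nbytes)
{
    volatile unsigned char sink = 0;
    for (std::size_t i = 0; i < nbytes; i += kCacheLine)
        sink = src[i];
    (void)sink;
}

// Copy nbytes from src to dst, one block of at most 4K at a time.
// Returns dst, like std::memcpy.
void *blockedCopySketch(void *dst, const void *src, std::size_t nbytes)
{
    assert(dst != nullptr && src != nullptr);
    unsigned char *d = static_cast<unsigned char *>(dst);
    const unsigned char *s = static_cast<const unsigned char *>(src);

    while (nbytes > 0)
    {
        const std::size_t n = nbytes < kBlockSize ? nbytes : kBlockSize;
        touchBlock(s, n);       // stage 1: warm the cache for this block
        std::memcpy(d, s, n);   // stage 2: copy the block (plain copy, not movntq)
        d += n;
        s += n;
        nbytes -= n;
    }
    return dst;
}
```

On modern compilers and CPUs std::memcpy alone is likely to be at least as fast as this sketch; it is meant to make the two-stage structure visible, not to reproduce the quoted throughput.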

Definition at line 203 of file fast_mem.cpp.

References memcpy.

Referenced by NLMISC::findBestmemcpy().

void NLMISC::CFastMem::precache ( const void *  src,
uint  nbytes 
) [static]

Fast precaching of memory in L1 cache using SSE or MMX where available (NB: the other methods do not perform this check). nbytes should not exceed 4K.
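Because of the 4K limit on nbytes, a caller working on a larger buffer would typically precache and process it one block at a time. The sketch below shows one way to do that splitting. The helper forEachPrecachedBlock and the constant kMaxPrecacheBytes are hypothetical names, and the precache step is passed in as a callable so the example does not depend on the NeL headers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical limit taken from the "4K" note in the description.
static const unsigned int kMaxPrecacheBytes = 4096;

// Walks [src, src + nbytes) in pieces of at most 4K. For each piece it
// first calls precache(ptr, n), then work(ptr, n), so the data is consumed
// while it is still expected to be in L1. Returns the number of pieces.
template <typename Precache, typename Work>
std::size_t forEachPrecachedBlock(const void *src, std::size_t nbytes,
                                  Precache precache, Work work)
{
    assert(src != nullptr || nbytes == 0);
    const unsigned char *p = static_cast<const unsigned char *>(src);
    std::size_t blocks = 0;
    while (nbytes > 0)
    {
        const unsigned int n = nbytes < kMaxPrecacheBytes
            ? static_cast<unsigned int>(nbytes)
            : kMaxPrecacheBytes;
        precache(p, n);   // e.g. a function with the precache() signature
        work(p, n);       // caller-supplied processing of the same piece
        p += n;
        nbytes -= n;
        ++blocks;
    }
    return blocks;
}
```

Given the signature documented above, passing NLMISC::CFastMem::precache as the first callable should fit, though this has not been checked against the NeL headers.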

Definition at line 216 of file fast_mem.cpp.

Referenced by NL3D::CShadowSkin::applySkin(), NL3D::CRayMesh::fastIntersect(), NL3D::CTextureDLM::modulateAndfillRect565(), NL3D::CTextureDLM::modulateAndfillRect8888(), and NL3D::CTextureDLM::modulateConstantAndfillRect().

void NLMISC::CFastMem::precacheMMX ( const void *  src,
uint  nbytes 
) [static]

Fast precaching of memory in L1 cache using MMX instructions only: movq. Typical throughput is 720 MB/s (probably lower in practice because of overhead).

Hence prefer precacheSSE() when available. nbytes should not exceed 4K.

Definition at line 212 of file fast_mem.cpp.

void NLMISC::CFastMem::precacheSSE ( const void *  src,
uint  nbytes 
) [static]

Fast precaching of memory in L1 cache using MMX/SSE instructions: movq and prefetchnta. Typical throughput is 880 MB/s (probably lower in practice because of overhead).

nbytes should not exceed 4K.

Definition at line 208 of file fast_mem.cpp.


Member Data Documentation

void *(* NLMISC::CFastMem::memcpy)(void *dts, const void *src, size_t nbytes) = findBestmemcpy () [static]

This is a function pointer that points to the best memcpy function available for the current OS and processor.
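The pattern behind this member, a static function pointer initialised once by a selection function, can be sketched in isolation as follows. This is an illustration under assumptions rather than the NeL source: FastMemSketch, findBestCopySketch, cpuHasSSESketch, fastCopySketch and plainCopySketch are hypothetical names, and the CPU check is a stub that always returns false so the example stays portable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

typedef void *(*CopyFn)(void *dst, const void *src, std::size_t nbytes);

// Hypothetical CPU feature check. A real one would query the processor;
// this stub always reports "no SSE".
static bool cpuHasSSESketch() { return false; }

// Stand-in for a specialised routine such as memcpySSE.
static void *fastCopySketch(void *dst, const void *src, std::size_t nbytes)
{
    return std::memcpy(dst, src, nbytes);
}

// Fallback that simply forwards to the standard library.
static void *plainCopySketch(void *dst, const void *src, std::size_t nbytes)
{
    assert(dst != nullptr && src != nullptr);
    return std::memcpy(dst, src, nbytes);
}

// Chooses an implementation based on what the CPU check reports.
static CopyFn findBestCopySketch()
{
    return cpuHasSSESketch() ? &fastCopySketch : &plainCopySketch;
}

struct FastMemSketch
{
    // Callers go through this pointer and never repeat the CPU check.
    static CopyFn copy;
};

// Static initialisation runs the selection once, before main().
CopyFn FastMemSketch::copy = findBestCopySketch();
```

The benefit of this design is that the capability test is paid once at start-up; every later call is a single indirect call through the pointer.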

The documentation for this class was generated from the following files:

fast_mem.h
fast_mem.cpp

Generated on Thu Jan 7 08:30:17 2010 for NeL by  doxygen 1.6.1