Index
Amdahl's law
Understanding Parallel Speedup and Amdahl's Law
execution time given n and p
Predicting Execution Time with n CPUs
parallel fraction p
Understanding Amdahl's Law
speedup(n ) given p
Understanding Amdahl's Law
superlinear speedup
Understanding Superlinear Speedup
application placement and I/O resources
Application Placement and I/O Resources
application tuning process
About Performance Analysis and Debugging
automatic parallelization
limitations
Using Compiler Options
avoiding segmentation faults
Avoiding Segmentation Faults
cache bank conflicts
Tuning the Cache Performance
cache coherency
Cache Coherency
Cache coherent non-uniform memory access (ccNUMA) systems
MPI Job Problems and Application Design
cache performance
Tuning the Cache Performance
ccNUMA
MPI Job Problems and Application Design
See also cache coherent non-uniform memory access
ccNUMA architecture
ccNUMA Architecture
cgroups
About cpusets and Control Groups (cgroups)
commands
dlook
dlook Command
dplace
dplace Command
common compiler options
Compiler Overview
compiler command line
Compiler Overview
compiler libraries
C/C++
C/C++ Libraries
dynamic libraries
Dynamic Libraries
message passing
SHMEM Message Passing Libraries
overview
Library Overview
compiler libraries
static libraries
Static Libraries
compiler options
tracing and porting
Getting the Correct Results
compiler options for tuning
Using Compiler Options to Optimize Performance
compiling environment
The SGI Compiling Environment
compiler overview
Compiler Overview
debugger overview
About Debugging
libraries
Library Overview
modules
Environment Modules
Configuring MPT
OFED
OFED Tuning Requirements for SHMEM
CPU-bound processes
Sources of Performance Problems
cpusets
About cpusets and Control Groups (cgroups)
data decomposition
Data Decomposition
data dependency
Identifying Opportunities for Loop Parallelism in Existing Code
data parallelism
Data Decomposition
data placement practices
About the Data and Process Placement Tools
data placement tools
Data Process and Placement Tools
cpusets
About the Data and Process Placement Tools
dplace
About the Data and Process Placement Tools
overview
About Nonuniform Memory Access (NUMA) Computers
taskset
About the Data and Process Placement Tools
debugger overview
About Debugging
debuggers
gdb
About Debugging
idb
About Debugging
TotalView
About Debugging
denormalized arithmetic
Compiler Overview
determining parallel code amount
Measuring Parallelization and Parallelizing Your Code
determining tuning needs
tools used
Determining Tuning Needs
distributed shared memory (DSM)
Distributed Shared Memory (DSM)
dlook command
dlook Command
dplace command
dplace Command
Environment variables
Environment Variables for Performance Tuning
explicit data decomposition
Data Decomposition
False sharing
Fixing False Sharing
file limit resources
resetting
Resetting the File Limit Resource Default
Flexible File I/O (FFIO)
Multithreading Considerations
environment variables to set
Environment Variables
operation
About FFIO
overview
About FFIO
simple examples
Simple Examples
floating-point programs
Floating-point Program Performance
Floating-Point Software Assist
Floating-point Program Performance
FPSWA
See Floating-Point Software Assist
functional parallelism
Data Decomposition
Global reference unit (GRU)
MPI Application Communication on SGI Hardware
GNU debugger
About Debugging
Gustafson's law
Gustafson's Law
implicit data decomposition
Data Decomposition
I/O tuning
application placement
About I/O Tuning
layout of filesystems
Layout of Filesystems and XVM for Multiple RAIDs
I/O-bound processes
Sources of Performance Problems
iostat command
Using the iostat(1) command
Java environment variables
setting
Setting Java Environment Variables
layout of filesystems
Layout of Filesystems and XVM for Multiple RAIDs
limits
system
Resetting System Limits
Linux shared memory accounting
Linux Shared Memory Accounting
memory
cache coherency
Cache Coherency
ccNUMA architecture
ccNUMA Architecture
distributed shared memory (DSM)
Distributed Shared Memory (DSM)
non-uniform memory access (NUMA)
Non-uniform Memory Access (NUMA)
memory accounting
Linux Shared Memory Accounting
memory management
About the Compiling Environment
Memory Use Strategies
memory page
About the Compiling Environment
memory strides
Tuning the Cache Performance
memory-bound processes
Sources of Performance Problems
Message Passing Toolkit
for parallelization
Using SGI MPI
modules
Environment Modules
command examples
Environment Modules
MPI on SGI UV systems
general considerations
About MPI Application Tuning
job performance types
MPI Job Problems and Application Design
other ccNUMA performance issues
MPI Job Problems and Application Design
MPI on UV systems
MPI Application Communication on SGI Hardware
MPI profiling
MPI Performance Tools
MPInside profiling tool
MPI Performance Tools
non-uniform memory access (NUMA)
Non-uniform Memory Access (NUMA)
NUMA Tools
command
dlook
dlook Command
dplace
dplace Command
OFED configuration for MPT
OFED Tuning Requirements for SHMEM
OpenMP
Using OpenMP
environment variables
Environment Variables for Performance Tuning
parallel execution
Amdahl's law
Understanding Parallel Speedup and Amdahl's Law
parallel fraction p
Understanding Amdahl's Law
parallel speedup
Understanding Parallel Speedup
parallelization
automatic
Using Compiler Options
using MPI
Using SGI MPI
using OpenMP
Using OpenMP
perf tool
Profiling with perf
performance
VTune
Other Performance Analysis Tools
performance analysis
About Performance Analysis and Debugging
performance gains
types of
About Performance Analysis and Debugging
performance problems
sources
Sources of Performance Problems
PerfSuite script
Profiling with PerfSuite
process placement
determining
Determining Process Placement
set-up
Determining Process Placement
using OpenMP
Example Using OpenMP
using pthreads
Example Using pthreads
profiling
MPI
MPI Performance Tools
perf
Profiling with perf
PerfSuite
Profiling with PerfSuite
ps command
Using the ps(1) Command
resetting default system stack size
Resetting the Default Stack Size
resetting file limit resources
Resetting the File Limit Resource Default
resetting system limit resources
Resetting System Limits
resetting virtual memory size
Resetting Virtual Memory Size
resident set size
About the Compiling Environment
sar command
Using the sar(1) command
segmentation faults
Avoiding Segmentation Faults
setting Java environment variables
Setting Java Environment Variables
SGI PerfBoost
MPI Performance Tools
SGI PerfCatcher
MPI Performance Tools
SHMEM
SHMEM Message Passing Libraries
shortening execution time
Adding CPUs to Shorten Execution Time
stack size
resetting
Resetting the Default Stack Size
suggested shortcuts and workarounds
Suggested Shortcuts and Workarounds
superlinear speedup
Understanding Superlinear Speedup
swap space
About the Compiling Environment
system
overview
About the Compiling Environment
system configuration
Determining System Configuration
system limit resources
resetting
Resetting System Limits
system limits
address space limit
Resetting System Limits
core file size
Resetting System Limits
CPU time
Resetting System Limits
data size
Resetting System Limits
file locks
Resetting System Limits
file size
Resetting System Limits
locked-in-memory address space
Resetting System Limits
number of logins
Resetting System Limits
number of open files
Resetting System Limits
number of processes
Resetting System Limits
priority of user process
Resetting System Limits
resetting
Resetting System Limits
resident set size
Resetting System Limits
stack size
Resetting System Limits
system monitoring tools
About the Operating System Monitoring Commands
system usage commands
Operating System Monitoring Commands
iostat
Using the iostat(1) command
ps
Using the ps(1) Command
sar
Using the sar(1) command
vmstat
Using the vmstat(8) Command
w
Using the w(1) command
taskset command
taskset Command
tools
perf
Profiling with perf
PerfSuite
Profiling with PerfSuite
VTune
Other Performance Analysis Tools
tuning
cache performance
Tuning the Cache Performance
environment variables
Environment Variables for Performance Tuning
false sharing
Fixing False Sharing
heap corruption
Managing Heap Corruption Problems
managing memory
Memory Use Strategies
multiprocessor code
Tuning Multiprocessor Codes
parallelization
Measuring Parallelization and Parallelizing Your Code
profiling
perf
Profiling with perf
PerfSuite script
Profiling with PerfSuite
VTune analyzer
Other Performance Analysis Tools
single processor code
Single Processor Code Tuning
using compiler options
Using Compiler Options to Optimize Performance
using math functions
Using Tuned Code
verifying correct results
Getting the Correct Results
uname command
Determining System Configuration
underflow arithmetic
effects of
Compiler Overview
UV Hub
MPI Application Communication on SGI Hardware
virtual addressing
About the Compiling Environment
virtual memory
About the Compiling Environment
vmstat command
Using the vmstat(8) Command
VTune performance analyzer
Other Performance Analysis Tools
w command
Using the w(1) command