Why Intel Trace Analyzer and Collector 7.2?

Analyze MPI performance. Speed up parallel application runs. Locate hotspots and bottlenecks. Compare trace files using graphics that provide detailed analysis and aligned timelines.

- Supported on Linux* and Microsoft* Windows* (Windows Compute Cluster Server* 2003, Windows XP, and Windows Server* 2003)
- Intuitive, full-color, customizable GUI with many drill-down view options
- Highly scalable, with low overhead and efficient memory usage
- Easy run-time loading, or instrumentation of an MPI application executable
- MPI Correctness Checking Library detects many types of errors in communication
- Integrated online help
- Easy installation and full documentation
- Full tracing and/or lightweight statistics gathering

What's New?

- Correctness Checking reports now available in the Intel Trace Analyzer GUI
- Migration to Trolltech* Qt 4.x for a refreshed look and feel
- Added support for:
  - Intel® Compilers 11.0
  - Microsoft* Windows Vista and HPC Server 2008

Many features, many options, major performance improvements.

- PIN-based binary instrumentation
- Runtime behavior displayed by function, process, thread, timeline, cluster, or node
- Multiple types of filtering (functions, processes, messages) and aggregation
- Recorded performance counter data can be displayed as a timeline
- Trace data is cached to reduce runtime overhead and memory consumption
- Event-based tracing of multi-threaded MPI applications as well as non-MPI applications
- Fail-safe tracing
- Support for MPI-1, SHMEM, MPI-IO, ROMIO
- Distributed memory checking with the MPI Correctness Checking library

Trace Collector

- Automated MPI tracing and MPI Correctness Checking
- Generic distributed (non-MPI) and single-process tracing
- Thread-level tracing, with traces created even if the application crashes
- HPM data collection (PAPI, rusage, OS counters)
- Configurable tracefile parameters
- Feature disabling/enabling
- Tuning parameters
- Distributed memory checking with Valgrind*
- Binary runtime instrumentation
- Compiler instrumentation
  - icc/ifort/icpc: -tcollect
  - gcc/g++: -finstrument-functions
- API: source code instrumentation (counter, function, message, and collective operation logging)

Trace Analyzer

- Event, Quantitative, Qualitative, and Counter Timelines
- Flexible message and collective operation Profiles
- Function Profile (call graph, call tree, flat, and load balance)
- Detailed comparison of two traces
- Multi-level source code visualization with a full text browser
- Flexible and powerful event tagging and filtering
- Hierarchical grouping and aggregation of data across functions or processes
- Large set of configuration parameters per chart
- Export profiling data as text; export charts to graphics or printer
- Command line interface

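
To make the idea of a message profile concrete, here is a minimal Python sketch that aggregates point-to-point message records into per-pair message counts and byte totals. The record layout (`sender`, `receiver`, `nbytes`) and the name `message_profile` are assumptions made for illustration only; they do not reflect the Intel Trace Analyzer trace format or any of its interfaces.

```python
from collections import defaultdict


def message_profile(messages):
    """Aggregate (sender, receiver, nbytes) records per sender/receiver pair.

    Returns a dict mapping (sender, receiver) -> (message_count, total_bytes).
    The record layout is hypothetical, chosen only for this sketch.
    """
    counts = defaultdict(int)
    volume = defaultdict(int)
    for sender, receiver, nbytes in messages:
        counts[(sender, receiver)] += 1
        volume[(sender, receiver)] += nbytes
    return {pair: (counts[pair], volume[pair]) for pair in counts}


# Hypothetical trace excerpt: rank 0 sends twice to rank 1, rank 1 replies once.
sample = [(0, 1, 1024), (0, 1, 2048), (1, 0, 512)]
profile = message_profile(sample)
for (src, dst), (n, total) in sorted(profile.items()):
    print(f"{src} -> {dst}: {n} messages, {total} bytes")
```

A profile of this shape is what makes an unbalanced communication pattern visible: one pair carrying far more traffic than the rest stands out at a glance.
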
MPI Checking

Intel Trace Analyzer and Collector includes a unique MPI correctness checker that detects deadlocks, data corruption, and errors with MPI parameters, data types, buffers, communicators, point-to-point messages, and collective operations. Because the checks run at run time and errors are reported as they are detected, debugging is greatly expedited. The correctness checker also allows debugger breakpoints to aid in-place analysis, yet has a small enough footprint for use during production runs. The true benefit of the Intel Trace Analyzer and Collector Correctness Checker is its potential to scale to extremely large systems and its ability to detect errors even among a large number of processes. The checker can be configured to detect deadlocks regardless of fabric type.
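
One general way to reason about deadlock detection that does not depend on the interconnect is a wait-for graph: each blocked process points at the process it is waiting on, and a cycle means none of the processes on it can proceed. The Python sketch below illustrates that general idea only. It is not a description of how the Intel checker is implemented, and `find_deadlock` and its input format are hypothetical.

```python
def find_deadlock(waits_for):
    """Return the ranks forming a cycle in the wait-for graph, or None.

    waits_for maps each blocked rank to the single rank it is waiting on
    (a hypothetical input format used only for this sketch).
    """
    for start in waits_for:
        path = []
        seen = set()
        rank = start
        # Follow the chain of waits until it ends or revisits a rank.
        while rank in waits_for and rank not in seen:
            seen.add(rank)
            path.append(rank)
            rank = waits_for[rank]
        if rank in seen:
            # Trim any lead-in so only the ranks on the cycle remain.
            return path[path.index(rank):]
    return None


# Rank 0 waits on rank 1 while rank 1 waits on rank 0: a classic deadlock.
print(find_deadlock({0: 1, 1: 0}))  # prints [0, 1]
```

A chain such as `{0: 1, 1: 2}` yields `None`, because rank 2 is not blocked and the waits can eventually complete.
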

The tracking of data types, requests, and communicators and the wrapping of MPI calls are reused from the Trace Collector. (The checking library is compiled from the source code of the performance data collection library.) The Analyzer can unwind the call stack very rapidly, with or without frame pointers, and use debug information to map instruction addresses to source code.

With both command-line and graphical interfaces, the user can also set up batch runs or debug interactively. The timeline view shows actual function calls and process interactions, highlighting excessive delays or errors that stem from improper execution ordering.

See screenshots of various displays, including metrics tracking, timeline views, and parallel displays.

Instrumentation and Tracing

Intel Trace Analyzer and Collector specializes in low-intrusion binary instrumentation. It can create this instrumentation and add it to existing statically and dynamically linked executables, allowing automatic monitoring of MPI as well as function entry and exit points. This includes tracing and recording performance data from parallel threads in C, C++, and Fortran.

Intel Trace Analyzer and Collector supports both MPI applications and distributed non-MPI applications in C, C++, and Fortran. For applications running with Intel® MPI Library, this includes tracing of internal MPI states.
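
The notion of automatically monitoring function entry and exit points can be illustrated without any MPI tooling. The minimal sketch below uses Python's standard `sys.setprofile` hook as a stand-in for instrumentation: the traced functions are left unmodified, and a callback records a timestamped event on each entry and exit. This is an analogy only, not the product's mechanism, and the names `trace_calls`, `_record`, and `events` are hypothetical.

```python
import sys
import time

events = []  # (timestamp, kind, function_name)


def _record(frame, event, arg):
    # Keep only Python-level entry and exit; skip c_call/c_return/c_exception.
    if event in ("call", "return"):
        events.append((time.perf_counter(), event, frame.f_code.co_name))


def trace_calls(func, *args, **kwargs):
    """Run func with entry/exit recording enabled, then restore the old hook."""
    previous = sys.getprofile()
    sys.setprofile(_record)
    try:
        return func(*args, **kwargs)
    finally:
        sys.setprofile(previous)


def inner():
    return 21


def outer():
    return 2 * inner()


result = trace_calls(outer)
for stamp, kind, name in events:
    print(f"{stamp:.6f} {kind:>6} {name}")
```

Pairing each entry with its matching exit gives the time spent in every call, which is the raw material for a function profile or a timeline view.
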