THE COMPUTER TRANSITION SYSTEMS REPORT - AUGUST 2002
COMPUTER TRANSITION SYSTEMS, BOX 4553, MELBOURNE, VICTORIA, 3001
http://www.cts.com.au
--- phone (03) 9530 6633 --- fax (03) 9530 6644 --- email: info@cts.com.au
TECPLOT NEWS
LF95 for LINUX - Cluster Pricing
INTEL Software
WINTERACTER Version 4.1
LF95 Version 5.7 for Windows
CPU PERFORMANCE
ABSOFT FORTRAN NEWS
FORCHECK Version 13
FORTRAN PLUS Version 2.5
plusFORT Version 6.5
PC PERFORMANCE
All prices are based on A$=US$0.54 and will vary with the exchange rate
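As a rough illustration of how a listed price might move with the exchange rate, the sketch below rescales an A$ price quoted at A$1 = US$0.54 to another rate. It assumes, purely for illustration, that a listed price is a fixed US$ amount converted at the exchange rate with no other components. The function name `rescale_aud_price` is hypothetical, and actual prices should be confirmed with Computer Transition Systems.

```python
# Reference exchange rate stated in this report: A$1 = US$0.54.
REPORT_RATE = 0.54


def rescale_aud_price(listed_aud, current_rate, report_rate=REPORT_RATE):
    """Rescale an A$ price quoted at report_rate to a different A$/US$ rate.

    Assumption (not stated in the report): the A$ price is a fixed US$
    amount converted at the exchange rate, with nothing else added.
    """
    usd_equivalent = listed_aud * report_rate
    return usd_equivalent / current_rate


if __name__ == "__main__":
    # LF95 Express for Windows is listed at A$506 in this report.
    for rate in (0.50, 0.54, 0.60):
        adjusted = rescale_aud_price(506, rate)
        print(f"A$1 = US${rate:.2f}: about A${adjusted:.0f}")
```

Under this assumption a weaker Australian dollar raises the A$ price and a stronger one lowers it.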
TECPLOT NEWS
David Fletcher at the University of Sydney has found a simple method for directly importing CFX version 5 files into Tecplot. This is done by means of the cgns file import capability in Tecplot. You prepare a 'res' file with CFX5 and then export it in cgns format. Tecplot can then read it.
The CFD Analyzer is a Tecplot Add-on with extensive capabilities for post-processing CFD results.
Version 3.0 of CFD Analyzer has just been released for Tecplot. The
new features are detailed at
http://www.amtec.com/Product_pages/cfd_analyzer/newfeatures.html.
They include a mass option for particle path calculations, flow
feature extraction, improvements in the integration routines,
calculation enhancements, particle path and streamline enhancements,
shadow graph and Schlieren plots, and GUI enhancements.
LF95 for Linux – Cluster Pricing
The base price of LF95 for Linux allows compiled programs to be run on clusters of 4 cpus or fewer. The price of an LF95 Pro 2 user licence for clusters of 16 cpus or fewer is $4279 (academic $3058). The corresponding LF95 Express licence is $1529. The price of an LF95 Pro 2 user licence for clusters of 64 cpus or fewer is $6204 (academic $4609). The corresponding LF95 Express licence is $2288.
INTEL Software
We are now able to offer the Intel compilers (both Fortran and C++) for Windows and Linux as well as the Intel VTune Performance Analyzer (a profiler), the Intel Math Kernel Library, and the Intel Integrated Performance Primitives. Prices are given in the Catalogue Section – see page 6.
WINTERACTER Version 4.1
Winteracter is a very flexible graphics and user interface library for Windows and Linux. Version 4.1 was released at the end of May. Version 4.1 upgrades are $341 ($253 academic) from Winteracter 4.0; $748 ($561) from Winteracter 3.x; and $935 ($715) from earlier versions of Winteracter. New copies of Winteracter are $1485 ($1122 academic). The major improvements in version 4.1 are:
- The Winteracter Development Environment (WiDE) now incorporates a fully integrated version of the resource editor.
- The text editor has been substantially upgraded:
  - Winteracter library calls, Winteracter symbolic names, resource identifiers, message processing, new subroutine/function, file, and date/time can now be inserted in a single step.
  - A "Goto Routine" option greatly simplifies navigation in multi-routine source files.
  - The "Find All" option now allows the routine which contains each matched line to be reported.
  - A number of new options have been included: multiple file buffers, find-difference, save-selection, delete/join/transpose lines, indent/unindent, comment/uncomment, case conversion, open recent files, font selection, preference management and online help.
- Graphics metafiles may now be created in memory.
- A set of curve plotting and point calculation routines has been added to draw Bezier, cubic B-spline, and coincident curves.
- The number of line types in GDI and CGM has been increased to the full set of 7.
- A full set of text handling routines is now available in OpenGL.
- Fortran oriented descriptions of all OpenGL functions have been added to the online help. Normally all OpenGL descriptions are C oriented.
- Support has been added for the new LF95 version 5.7 (as well as earlier versions of LF95 and LF90).
LF95 version 5.7 for Windows
Lahey has just released version 5.7 for Windows. The accompanying Lahey newsletter provides details of what is new in this release. Essentially there have been various improvements which should lead to some improvement in execution speed of compiled programs, particularly those which make considerable use of complex logical expressions. Not mentioned in the newsletter is that the external 4 GB file size limit has been eliminated – the limit is now '18 exabytes'. Another very recent improvement is that changing preconnected units 5 and 6 will no longer affect unit=*.
In version 5.7 the Microsoft linker is used. This means that anyone upgrading to version 5.7 will also need to upgrade any third party library they use. The advantage of the Microsoft linker is that it should make a much larger range of third party software compatible with LF95. It will also provide greater flexibility in mixed language support.
The Fortran runtime can now be implemented as a DLL. Dynamic linking with the Fortran runtime allows multiple executables to use the same copy of the runtime, thus decreasing the size of each executable. The default stack size has been increased from 100K to 1MB. This can lead to much faster load times for LF95 executables.
The Winteracter Starter kit in version 5.7 is now based on Winteracter 4.1. This means substantial improvements in the Resource Editor and the Application Wizard, and new graphics text subroutines. There are numerous minor enhancements.
The documentation has been improved in version 5.7, with more mixed language programming examples. Note that printed documentation is no longer provided with the Standard Edition of version 5.7. All documentation is provided in .pdf format so hard copy can be easily generated on any printer.
The price of LF95 for Windows remains unchanged - Pro $1661 (educational $1199); Standard $1210 (academic $869); Express $506. Updates from LF95 version 5.6 are $440. Updates from earlier versions of LF95 and any version of LF90 are $638.
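The two file size figures quoted for LF95 version 5.7 match the ranges of 32-bit and 64-bit unsigned byte counts. That correspondence is an assumption made here for illustration, not something stated by Lahey. A minimal sketch of the arithmetic, with hypothetical constant names:

```python
# Assumption: the old and new limits correspond to 32-bit and 64-bit
# unsigned byte counts. The report itself does not say this.
OLD_LIMIT_BYTES = 2**32
NEW_LIMIT_BYTES = 2**64


def as_binary_gigabytes(n_bytes):
    """Express a byte count in units of 2**30 bytes."""
    return n_bytes / 2**30


def as_decimal_exabytes(n_bytes):
    """Express a byte count in units of 10**18 bytes."""
    return n_bytes / 10**18


if __name__ == "__main__":
    print(f"old limit: {as_binary_gigabytes(OLD_LIMIT_BYTES):.0f} GB")
    print(f"new limit: {as_decimal_exabytes(NEW_LIMIT_BYTES):.1f} exabytes")
```

On that reading, the "18 exabytes" figure is 2**64 bytes expressed in decimal exabytes and rounded down.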
CPU PERFORMANCE
The SPEC 2000 floating point benchmark is composed of Fortran programs and so can be quite representative of the capability of cpus to run Fortran programs. The best results for current processors are 1356 for the Itanium2 1 GHz, 1202 for the IBM Power4 1.3 GHz, 878 for the P4 2.53 GHz, 776 for the Alpha 21264 1 GHz, 711 for the Ultrasparc III Cu 1.05 GHz, 703 for the Itanium 0.8 GHz, 624 for the Athlon XP2200+ (1.8 GHz), 499 for the SGI 14K 0.6 GHz, 464 for the PA8700 0.75 GHz, 483 for the SPARC64 0.8 GHz, and 329 for the P3 1 GHz. The results here are for the fastest example of each cpu. However, computers with a 2.0 GHz P4 cost less than $1500. The SPEC 2000 floating point result for the 2.0 GHz P4 is 715. One of the least expensive Itanium2 based machines is a 900 MHz model selling for US$5865 (A$11,950).
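One way to read the figures quoted above is to divide each SPEC 2000 floating point result by the clock rate, giving a crude "result per GHz" for each processor. The sketch below does this with the numbers from this report only. The per-GHz metric and the function name `score_per_ghz` are illustrative choices made here and are not part of SPEC's methodology.

```python
# (processor, clock rate in GHz, SPEC 2000 floating point result),
# taken from the figures quoted in this report.
RESULTS = [
    ("Itanium2", 1.0, 1356),
    ("IBM Power4", 1.3, 1202),
    ("P4 2.53 GHz", 2.53, 878),
    ("Alpha 21264", 1.0, 776),
    ("P4 2.0 GHz", 2.0, 715),
    ("Ultrasparc III Cu", 1.05, 711),
    ("Itanium", 0.8, 703),
    ("Athlon XP2200+", 1.8, 624),
    ("SGI 14K", 0.6, 499),
    ("SPARC64", 0.8, 483),
    ("PA8700", 0.75, 464),
    ("P3", 1.0, 329),
]


def score_per_ghz(results):
    """Return (name, result per GHz) pairs, highest first."""
    scaled = [(name, score / clock) for name, clock, score in results]
    return sorted(scaled, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, per_ghz in score_per_ghz(RESULTS):
        print(f"{name:<20s} {per_ghz:7.1f} per GHz")
```

On this measure the two P4 entries come out within about 3% of each other (roughly 347 and 358 per GHz), while the P3 sits only slightly lower at 329 per GHz.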
ABSOFT FORTRAN NEWS
At the end of July Absoft will release version 8 of Pro Fortran for the Macintosh under OSX and OS9. Improvements to the release for OSX include compiler improvements, a programmer's editor, runtime stack trace, a plot2D and 3D graphics library, OpenGL support, an improved IDE, detailed error message reporting, IDE plugins which make VAST easier to use, and BLAS routines optimised for the G4. Version 8 for OS9 has the improved compiler and the 2D/3D graphics library. Upgrades from version 7.2 and later are $715. Upgrades from version 6.2 and earlier are $913.
FORCHECK Version 13
Forcheck is the best static analyzer for Fortran. Those with programs of more than a few thousand lines will find the task of program maintenance greatly reduced if Forcheck is used regularly. Forcheck is available for most platforms. The Windows version is $990 (academic $605). The Linux version is $1342 (academic $759).
With Version 13 the Windows IDE has been improved considerably - project files are displayed as a tree, more windows can be opened to view output and edit source files concurrently, report files can be stored for later use, and by double clicking on the diagnostic message one can jump to the location of a problem.
Major improvements in version 13 include:
- Improved handling of F90/95 features and conformance verification, e.g. comparison of nested imported derived types, pointer components, generic names and resolving generic procedures, improved support of nested interfaces, renamed intrinsic procedures, verification of constant and initialisation expressions, and support for names up to 64 characters long.
- Improved handling of legacy Fortran syntax, e.g. improved accounting for equivalences, detection of defined and referenced depending on data types, improved verification of constraints of statement functions, and improved verification of type and type parameters of entries.
- Improved verification of procedure references and argument lists.
- Improved tuning and software engineering facilities, e.g. selective suppression of diagnostic messages, and improved reporting: self-explanatory messages, verbose messages, reporting of filename and line number with messages, generation of a report file, and a reference structure suited to long, qualified procedure names.
FORTRAN PLUS Version 2.5
Version 2.5 of FortranPlus, the economical Fortran 95 compiler for Windows and Linux produced by NA Software, was released recently. The major advance is the inclusion of an integrated development environment in the Windows release. A number of additional command line options have been added to the compiler. FortranPlus is available in three configurations: the 'student pack', which restricts compilation to source files no longer than 2000 lines (but any number of object files can be linked together) ($209); the standard pack ($616 – academic $462); and the professional pack ($1034 – academic $825).
plusFORT Version 6.5
Like earlier versions, plusFORT 6.50 contains extensive facilities for unscrambling legacy Fortran spaghetti code, and for translating Fortran 66 to 77 and on to Fortran 90 and 95. plusFORT also includes a static analysis tool, facilities for dynamic analysis (to identify tricky "used before set" errors at run-time) and coverage analysis (to identify untested code and execution hot-spots).
The new version greatly extends the ability of
plusFORT to handle new Fortran features. SPAG, the plusFORT
source code transformation tool, now processes almost all Fortran 90
code, including modules, derived types, dynamic allocation, array
language, and all code produced by SPAG when translating older
programs to Fortran 90. The main limitation is that SPAG does
not yet recognise declarations relating to overloaded operators and
procedures. SPAG does dynamic and coverage analysis only for
Fortran 77 code.
The plusFORT AUTOMAKE tool solves a problem
which will be familiar to most Fortran 90 users - the requirement
that modules be compiled before the code that USEs them. A
similar requirement applies to SPAG. For example, if a module
is changed, it is important that the file containing the source for
the module be re-processed before any files that USE the module. The
new "autospag" command uses the AUTOMAKE tool to ensure
that module dependencies are handled correctly when SPAG processes
Fortran 90 input (for example to ensure that modules are processed
before the subprograms that USE them).
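The ordering requirement described above amounts to a topological sort of the "USEs" relationship between source files. The sketch below illustrates the idea only and is not how AUTOMAKE or autospag is implemented. The file names and the `processing_order` helper are hypothetical, and the dependency table is written by hand rather than scanned from Fortran source.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def processing_order(uses):
    """Return an order in which every file comes after the files it USEs.

    `uses` maps each source file to the set of files that define the
    modules it USEs.
    """
    return list(TopologicalSorter(uses).static_order())


if __name__ == "__main__":
    # Hypothetical project: two modules share a kinds module, and the
    # main program USEs both of them.
    uses = {
        "main.f90": {"solver_mod.f90", "io_mod.f90"},
        "solver_mod.f90": {"kinds_mod.f90"},
        "io_mod.f90": {"kinds_mod.f90"},
        "kinds_mod.f90": set(),
    }
    for source_file in processing_order(uses):
        print(source_file)
```

A circular USE chain would make `static_order` raise `graphlib.CycleError`, which corresponds to a dependency that no processing order can satisfy.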
plusFORT 6.50 is
available immediately for Windows (95 and up), Linux (on x86
platforms), and most varieties of Unix. The Windows version
includes an optional GUI interface and configuration file editor.
PC PERFORMANCE
Many people use Fortran because they want very fast execution speed for programs where a considerable amount of time is spent doing calculations which involve floating point numbers. Fortran compiler writers have spent decades optimizing the speed of DO loops, which are used to handle operations on arrays. This is the primary reason that Fortran in general will provide much better execution speed for these types of programs. Arguably Fortran also has the additional advantage of being more easily maintained than most other languages.
There are essentially three ways to make a program run faster. Firstly, the code can be modified so that it is more efficient. Secondly, one can switch to a compiler that produces faster executing code. Thirdly, one can switch to (or wait for) a faster computer. The intent of the investigation here was to determine the relative advantages, for running Fortran programs, of the various cpu/memory combinations used in today's personal computers.
For Windows, LF95 version 5.6 was used with default compiler settings except for -nap -nchk -nchkglobal -ncover -ng -o1 -npca -nsav -nstchk -ntrace -stack 24000000. For Linux, LF95 6.1 was used. For the Macintosh G3 and G4 based machines, Absoft Pro Fortran version 7.0 for OSX was used. Defaults plus the -o -s switches were used, and in some cases -B19 and -xINTEGER were required to get proper execution. A single set of switches which made all benchmarks compile correctly could not be found for the Absoft compiler.
Benchmarks came from two sources - John Campbell of GHD and Polyhedron. John Campbell's programs are basic tests for sorting, random number generation, vector operations, and disk speed. The other 24 programs are from a variety of original sources and are used by Polyhedron (www.polyhedron.com) to compare the capability of Fortran compilers. These programs were slightly modified by the inclusion of internal timing routines. This eliminated the program loading time from the measured execution time. The effect of file i/o was minimised where possible by positioning the timing start after any large data file reading and the timing termination before any large output file writing. Also, all screen output was redirected to a scratch file to remove the effects of the graphics card speed.
33 programs were run on 27 personal computers – 12 P4 based, 7 P3 based, 2 Celeron, 3 Athlon, and 3 Macintoshes. In addition, some of the benchmarks were run on a 500 MHz Alpha based machine.
It would have been ideal if the same motherboard, hard disk drive, and operating system had been used in each computer. However, this was not the case. A variety of Windows operating systems (W95, W98, W2000, WNT, WMe, and XP) and Linux were used. Polyhedron found that at least 10 runs were needed to obtain an accurate execution time for the benchmarks it uses. To keep the total machine time for the benchmarks to no more than an hour, the time consuming hard disk test programs were run twice and all individual benchmark programs from Polyhedron were run three times. In a limited study it was found that the fastest time for each benchmark was very close to the most consistent execution time found when individual benchmarks were run many times. A summary of the benchmark results was prepared in which the fastest execution time for each benchmark was noted, and our analysis is based on this summary. A copy of the summary is available on request from Computer Transition Systems. The executable code for Windows, Linux and OSX is also available on request, as is the source code.
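The measurement approach described above (timing only the computation, repeating each benchmark a few times, and keeping the fastest run) can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study, where the benchmarks were Fortran programs with internal timing routines. The names `best_time`, `make_data` and `vector_kernel` are hypothetical.

```python
import time


def best_time(setup, kernel, runs=3):
    """Return the fastest wall-clock time of `kernel` over `runs` repeats.

    `setup` runs once, outside the timed region, in the same spirit as
    starting the timer after any large data file has been read.
    """
    data = setup()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        kernel(data)
        timings.append(time.perf_counter() - start)
    return min(timings)


def make_data(n=200_000):
    """Stand-in for reading input data before timing starts."""
    return [float(i) for i in range(n)]


def vector_kernel(values):
    """Stand-in for a floating point vector operation benchmark."""
    return sum(2.0 * x + 1.0 for x in values)


if __name__ == "__main__":
    fastest = best_time(make_data, vector_kernel, runs=3)
    print(f"fastest of 3 runs: {fastest:.4f} s")
```

Taking the minimum rather than the mean reduces the influence of background activity on the machine, which only ever makes a run slower.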
OBSERVATIONS
G3 and G4 cpus – There are no SPEC results for Macintoshes. From our study it appears that the reason is simple: Fortran programs do not run very fast on these processors. A 400 MHz G3 ran all benchmarks appreciably slower than a 500 MHz P3. A 733 MHz G4 ran the benchmarks 1.5 times as fast as the 400 MHz G3. The G4 was similar in performance to a 600 MHz P3.
P3 vs P4 – A comparison of the closest two P3 and P4 machines, a 1000 MHz P3 with dynamic memory and a 1500 MHz P4 with the same type of memory, shows that if one adjusts for clock rate the P3 performed at a very similar rate to that of the P4. It seems that if P3 cpus were produced with the same clock rate as P4s, the P3 based machines would execute programs nearly as fast as P4 based machines. The SPEC 2000 FP results are in agreement – a 1 GHz P3 was about half the speed of a 1.8 GHz P4.
Athlon vs P4 – The XP1800+ Athlon based PC using ddr performed similarly to the 1.8 GHz P4 machine with ddr. Interestingly, this is not confirmed by the SPEC 2000 FP benchmark test, where the XP1800+ produced a result which was only 91% of that for the 1.5 GHz P4.
Celeron vs P3 – On all except one benchmark (PROTEIN) a PC with a P3 processor was no faster than a Celeron based machine with the same clock rate and the same type and speed of memory.
The Alpha – The 500 MHz Alpha computer ran the benchmarks at the same speed as a 500 MHz P3 based machine.
dram vs ddr – ddr typically provides execution speed increases of 20% or more for the vast majority of benchmarks. Improvements are more than 60% for benchmarks that are limited by memory speed.
RDRAM vs ddr – The only direct comparison available in the data set was 133 MHz ddr and 100 MHz RD memory. For the most part the ddr memory provided better performance.
Linux vs Windows – Except for the PNPOLY benchmark little difference in execution speed between the two operating systems was evident. This is reasonable in view of the Linux compiler having more efficient optimization of logical expressions (PNPOLY uses complex logical expressions extensively). In the latest release of LF95 for Windows this drawback has been removed and we would expect the PNPOLY results to then be the same for Windows and Linux.
Windows2000 vs XP – There was no difference in execution speed.
Windows98 vs 2000 – More programs ran faster under W98 than 2000.
CPU speed – The benchmark results from P4 based machines using RD or ddr ram revealed that benchmark execution speed was directly proportional to cpu clock rate (ie not limited by other factors).
Updated 13 August 2002