------- Article 1723 (8 more) in comp.benchmarks:
From: Jeffrey Reilly
Subject: FAQ: SPEC
Followup-To: comp.benchmarks
Summary: SPEC, benchmarks, CINT92, CFP92, SFS, SDM
Nntp-Posting-Host: mipos2
Organization: Intel Corporation, Santa Clara, CA USA
Date: Fri, 4 Jun 1993 16:22:49 GMT
Lines: 494

Attached is the FAQ posting for the Standard Performance Evaluation
Corporation (SPEC) and the benchmarks they produce.

Jeff
SPEC CINT92/CFP92 Release Manager

Jeff Reilly                 | "There is something fascinating about
Intel Corporation           | science. One gets such wholesale returns
jwreilly@mipos2.intel.com   | of conjecture out of such a trifling
(408) 765 - 5909            | investment of fact" - M. Twain

=========================================================================

Answers to Frequently Asked Questions about SPEC Benchmarks
===========================================================

The following text is updated regularly by active SPEC members.
Last update: 4/09/1993.

Contents:
1. What is SPEC
2. How to Contact SPEC
3. SPEC Products and Services
4. Current SPEC Benchmarks
   4.1 CPU Benchmarks
   4.2 SDM Benchmarks
   4.3 SFS Benchmarks
5. Outdated SPEC Benchmarks
6. Forthcoming SPEC Benchmarks
7. Membership in SPEC
8. Acknowledgments

1. What is SPEC
===============

SPEC, the Standard Performance Evaluation Corporation, is a non-profit
corporation formed to "establish, maintain and endorse a standardized
set of relevant benchmarks that can be applied to the newest generation
of high-performance computers" (from SPEC's bylaws). The founders of
this organization believe that the user community will benefit greatly
from an objective series of applications-oriented tests, which can
serve as common reference points and be considered during the
evaluation process. While no one benchmark can fully characterize
overall system performance, the results of a variety of realistic
benchmarks can give valuable insight into expected real performance.

Members of SPEC are currently: AT&T/NCR, Auspex, Bull, Compaq, Control
Data, Data General, DEC, EDS, Fujitsu, HaL Computer, Hewlett-Packard,
IBM, Intel, Intergraph, Kubota Pacific, Motorola, NeXT, Network
Appliance, Novell, Olivetti, Siemens Nixdorf, Silicon Graphics,
Solbourne, Sun, Unisys, Ziff-Davis.

SPEC Associates are currently: Center for Scientific Computing (Espoo,
Finland), Leibniz Computing Center of the Bavarian Academy of Science
(Munich, Germany), National Taiwan University (Taiwan), SERC Daresbury
Laboratory (England).

Legally, SPEC is a non-profit corporation registered in California.
SPEC basically does two things:

- SPEC develops suites of benchmarks intended to measure computer
  performance. These suites are packaged with source code and tools
  and are extensively tested for portability before release. They are
  available to the public for a fee covering development and
  administration costs. By license agreement, SPEC members and
  customers agree to run and report results as specified in each
  benchmark suite's documentation.

- SPEC publishes a quarterly report of SPEC news and benchmark
  results: the SPEC Newsletter. This provides a centralized source of
  information for SPEC benchmark results. Both SPEC members and
  non-SPEC members may publish in the SPEC Newsletter, though there is
  a fee for non-members. (Note that results may be published elsewhere
  as long as the format specified in the SPEC Run Rules is followed.)
2. How to Contact SPEC
======================

SPEC [Standard Performance Evaluation Corporation]
c/o NCGA [National Computer Graphics Association]
2722 Merrilee Drive, Suite 200
Fairfax, VA 22031, USA

Phone:  +1-703-698-9600 Ext. 318
FAX:    +1-703-560-2752
E-Mail: spec-ncga@cup.portal.com

For technical questions regarding the SPEC benchmarks (e.g., problems
with execution of the benchmarks), Dianne Dean (the person normally
handling SPEC matters at NCGA) refers the caller to an expert at a
SPEC member company.

3. SPEC Products and Services
=============================

The SPEC benchmark sources are generally available, but not free. SPEC
charges separately for its benchmark suites; the income from the
benchmark source tapes is intended to support the administrative costs
of the corporation - making tapes, answering questions about the
benchmarks, and so on. Buyers of the benchmark tapes have to sign a
license stating the conditions of use (site license only) and the
rules for result publication. All benchmark suites come on QIC 24
tapes, written in UNIX tar format. Accredited universities receive a
50% discount on SPEC tape products. Current prices are:

CINT92        $  425.00   (CPU-intensive integer benchmarks)
CFP92         $  575.00   (CPU-intensive floating-point benchmarks)
CINT92&CFP92  $  900.00
SDM           $ 1450.00   (UNIX software development workloads)
SFS           $ 1200.00   (System-level file server (NFS) workload)

The SPEC Newsletter appears quarterly; it contains result publications
for a variety of machines (typically about 50-70 pages of results per
issue) as well as articles dealing with SPEC and benchmarking.

Newsletter    $  550.00   (1-year subscription, 4 issues)

4. Current SPEC Benchmarks
==========================

4.1 CPU Benchmarks
==================

There are currently two suites of compute-intensive SPEC benchmarks,
measuring the performance of the CPU, memory system, and compiler code
generation. They normally use UNIX as the portability vehicle, but
they have been ported to other operating systems as well. The
percentage of time spent in operating system and I/O functions is
generally negligible.

CINT92, current release: Rel. 1.1
---------------------------------

This suite contains 6 benchmarks performing integer computations, all
of them written in C. The individual programs are:

Number and name   Application         Approx. size
                                      gross     net
008.espresso      Logic Design        14800   11000
022.li            Interpreter          7700    5000
023.eqntott       Logic Design         3600    2600
026.compress      Data Compression     1500    1000
072.sc            Spreadsheet          8500    7100
085.gcc           Compiler            87800   58800
                                     ------   -----
                                     123900   85500

The approximate static size is given in number of source code lines,
including declarations (header files). "Gross" numbers include
comments and blank lines; "net" numbers exclude them.

A somewhat more detailed, though still short, description of the
benchmarks (from an article by Jeff Reilly in SPEC Newsletter vol. 4,
no. 4):

008.espresso  Generates and optimizes Programmable Logic Arrays.
022.li        Uses a LISP interpreter to solve the nine queens
              problem, using a recursive backtracking algorithm.
023.eqntott   Translates a logical representation of a Boolean
              equation to a truth table.
026.compress  Reduces the size of input files by using Lempel-Ziv
              coding.
072.sc        Calculates budgets, SPEC metrics and amortization
              schedules in a spreadsheet based on the UNIX
              cursor-controlled package "curses".
085.gcc       Translates preprocessed C source files into optimized
              Sun-3 assembly language output.
CFP92, current release: Rel. 1.1
--------------------------------

This suite contains 14 benchmarks performing floating-point
computations; 12 of them are written in Fortran, 2 in C. The
individual programs are:

Number and name   Application            Lang.  Approx. size
                                                gross    net
013.spice2g6      Circuit Design           F    18900  15000
015.doduc         Simulation               F     5300   5300
034.mdljdp2       Quantum Chemistry        F     4500   3600
039.wave5         Electromagnetism         F     7600   6400
047.tomcatv       Geometric Translation    F      200    100
048.ora           Optics                   F      500    300
052.alvinn        Robotics                 C      300    200
056.ear           Medical Simulation       C     5200   3300
077.mdljsp2       Quantum Chemistry        F     3900   3100
078.swm256        Simulation               F      500    300
089.su2cor        Quantum Physics          F     2500   1700
090.hydro2d       Astrophysics             F     4500   1700
093.nasa7         NASA Kernels             F     1300    800
094.fpppp         Quantum Chemistry        F     2700   2100
                                                -----  -----
                                                57900  43900

Short description of the benchmarks:

013.spice2g6  Simulates analog circuits (double precision).
015.doduc     Performs Monte-Carlo simulation of the time evolution
              of a thermo-hydraulic model for a nuclear reactor's
              component (double precision).
034.mdljdp2   Solves motion equations for a model of 500 atoms
              interacting through the idealized Lennard-Jones
              potential (double precision).
039.wave5     Solves particle and Maxwell's equations on a Cartesian
              mesh (single precision).
047.tomcatv   Generates two-dimensional, boundary-fitted coordinate
              systems around general geometric domains (vectorizable,
              double precision).
048.ora       Traces rays through an optical system containing
              spherical and planar surfaces (double precision).
052.alvinn    Trains a neural network using back propagation (single
              precision).
056.ear       Simulates the human ear by converting a sound file to a
              cochleogram using Fast Fourier Transforms and other math
              library functions (single precision).
077.mdljsp2   Similar to 034.mdljdp2; solves motion equations for a
              model of 500 atoms (single precision).
078.swm256    Solves the system of shallow-water equations using
              finite difference approximations (single precision).
089.su2cor    Calculates masses of elementary particles in the
              framework of the Quark-Gluon theory (vectorizable,
              double precision).
090.hydro2d   Uses the hydrodynamical Navier-Stokes equations to
              calculate galactical jets (vectorizable, double
              precision).
093.nasa7     Executes seven program kernels of operations used
              frequently in NASA applications, such as Fourier
              transforms and matrix manipulations (double precision).
094.fpppp     Calculates multi-electron integral derivatives (double
              precision).

More information about the individual benchmarks is contained in
description files in each benchmark's subdirectory on the SPEC
benchmark tape.

The CPU benchmarks can be used for measurement in two ways:

- Speed measurement
- Throughput measurement

Speed Measurement
-----------------

The results ("SPEC Ratio" for each individual benchmark) are expressed
as the ratio of a fixed "SPEC reference time" (chosen early on as the
execution time on a VAX 11/780) to the wall clock time needed to
execute one single copy of the benchmark. As is apparent from result
publications, the different SPEC ratios for a given machine can vary
widely. SPEC encourages the public to look at the individual results
for each benchmark; users should compare the characteristics of their
workload with those of the individual SPEC benchmarks and consider
those benchmarks that best approximate their jobs.

However, SPEC also recognizes the demand for aggregate result numbers
and has defined the integer and floating-point averages

SPECint92 = geometric average of the 6 SPEC ratios from CINT92
SPECfp92  = geometric average of the 14 SPEC ratios from CFP92
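For readers who want to see the arithmetic spelled out, the following
short C program is a minimal sketch of this calculation. It is NOT
part of SPEC's tools, and all times in it are invented for
illustration: it derives a SPEC ratio for each benchmark as reference
time divided by measured time, then forms the geometric average.

    /* Minimal sketch -- NOT SPEC's official tooling.  All times
     * below are invented.  A SPEC ratio is the reference time
     * (VAX 11/780) divided by the measured wall clock time;
     * SPECint92 is the geometric average of the six CINT92 ratios.
     */
    #include <math.h>
    #include <stdio.h>

    #define NBENCH 6

    int main(void)
    {
        double ref[NBENCH]  = { 2270.0, 6210.0, 1100.0,   /* hypothetical    */
                                2770.0, 4530.0, 5460.0 }; /* reference times */
        double time[NBENCH] = {   64.0,  155.0,   31.0,   /* hypothetical    */
                                  75.0,  128.0,  140.0 }; /* measured times  */
        double log_sum = 0.0;
        int i;

        for (i = 0; i < NBENCH; i++)
            log_sum += log(ref[i] / time[i]);  /* log of each SPEC ratio */

        /* geometric average = exp(arithmetic mean of the logs) */
        printf("geometric average of ratios = %.1f\n",
               exp(log_sum / NBENCH));
        return 0;
    }

A useful property of the geometric average of ratios is that relative
comparisons between machines do not depend on which machine is chosen
as the reference.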
Throughput Measurement
----------------------

With this measurement method, called the "homogeneous capacity
method", several copies of a given benchmark are executed; this method
is particularly suitable for multiprocessor systems. The results,
called SPEC rates, express how many jobs of a particular type
(characterized by the individual benchmark) can be executed in a given
time. (The SPEC reference time happens to be a week; the execution
times are normalized with respect to a VAX 11/780.) The SPEC rates
therefore characterize the capacity of a system for compute-intensive
jobs of similar characteristics (a sketch of this calculation appears
at the end of this section).

As with the speed metric, SPEC has defined the averages

SPECrate_int92 = geometric average of the 6 SPEC rates from CINT92
SPECrate_fp92  = geometric average of the 14 SPEC rates from CFP92

Because of the different units, the values SPECint92/SPECfp92 and
SPECrate_int92/SPECrate_fp92 cannot be compared directly.
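To make the capacity idea concrete, here is a second minimal C sketch
(again not SPEC's tooling; the run shown is hypothetical, and the
exact normalization is defined in the suite's run rules). It computes
the raw quantity a SPEC rate is built on: the number of jobs of one
type a system completes per week.

    /* Illustrative sketch only; SPEC's run rules define the exact
     * normalization.  If `copies` concurrent copies of one benchmark
     * all complete within `elapsed_sec` seconds, this estimates how
     * many such jobs the system finishes per week -- the raw figure
     * behind a SPEC rate, which is then normalized to the VAX 11/780.
     */
    #include <stdio.h>

    #define SECONDS_PER_WEEK (7.0 * 24.0 * 3600.0)   /* 604800 */

    static double jobs_per_week(int copies, double elapsed_sec)
    {
        return copies * (SECONDS_PER_WEEK / elapsed_sec);
    }

    int main(void)
    {
        /* hypothetical run: 4 concurrent copies finish in 300 s */
        printf("approx. jobs per week = %.0f\n",
               jobs_per_week(4, 300.0));   /* prints 8064 */
        return 0;
    }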
Comprehensive result lists
--------------------------

Comprehensive lists of all CINT92/CFP92 result average values (speed
measurements as well as throughput measurements) have been published
in the December 1992 issue of the SPEC Newsletter (vol. 4, no. 4, pp.
6-8). They contain the system name, the respective average value, and
a pointer to the Newsletter issue where the full result report can be
found. Readers are warned that "SPEC does not recommend that readers
use any one value for making comparisons; there is a wealth of
information in the Reporting Pages which cannot be easily reduced to
summary values".

No more SPECmark computation
----------------------------

While the old average "SPECmark[89]" has been popular with the
industry and the press (see section 5: Outdated SPEC Benchmarks), SPEC
has intentionally *not* defined an average "SPECmark92" over all CPU
benchmarks of the 1992 suites, for the following reasons:

- With 6 integer and 14 floating-point benchmarks, the average would
  be biased too much towards floating point,

- Customers' workloads are different: some integer-only, some
  floating-point intensive, some mixed,

- Current processors have developed their strengths in a more diverse
  way (some emphasizing integer performance more, some floating-point
  performance) than in 1989.

4.2 SDM Benchmarks
==================

SDM stands for "Systems Development Multiuser"; the benchmarks in this
suite (current release: 1.1) characterize the capacity of a system in
a multiuser UNIX environment. Contrary to the CPU benchmarks, the SDM
benchmarks contain UNIX shell scripts (consisting of commands like
"cd", "mkdir", "find", "cc", "nroff", etc.) that exercise the
operating system as well as the CPU and I/O components of the system.
The suite contains two benchmarks:

057.sdet     Represents a large commercial UNIX/C based software
             development environment. This characterization is based
             on AT&T analysis and models developed by Steve Gaede,
             formerly with AT&T Bell Laboratories.

061.kenbus1  Represents UNIX/C usage in a Research and Development
             environment. This characterization is based on data
             collection and analysis at Monash University by Ken
             McDonell.

For each benchmark, throughput numbers (scripts, i.e. simulated user
loads, per hour) are given for several values of concurrent workloads.
The reader can determine the peak throughput as well as the ability of
a system to sustain throughput over a range of concurrent workloads.
Since the workloads for the two benchmarks are different, their
throughput values are also different and cannot be compared directly.

4.3 SFS Benchmark Suite
=======================

SFS stands for "system-level file server"; SFS Release 1.0 is designed
to provide a fair, consistent and complete method for measuring and
reporting NFS file server performance. SFS Release 1.0 contains one
benchmark, 097.LADDIS.

097.LADDIS measures NFS file server performance in terms of NFS
response time and throughput. It does this by generating a synthetic
NFS workload based on a workload abstraction of an NFS operation mix
and an NFS operation request rate. Running 097.LADDIS requires a file
server (the entity being measured) and two or more "load generators"
connected to the file server via a network medium. The load generators
are each loaded with 097.LADDIS and perform the 097.LADDIS workload on
file systems exported by the file server. SFS Release 1.0 results
include full server configuration information (hardware and software)
and a graph of server response time versus NFS load for the 097.LADDIS
operation mix.

5. Outdated SPEC Benchmarks
===========================

SPEC published its first CPU benchmark suite in 1989; its last release
is 1.2b. It contains 10 compute-intensive programs, 4 integer (written
in C) and 6 floating-point (written in Fortran). The following average
values had been defined:

SPECint89  = geometric average of the SPEC ratios of the 4 integer
             programs in rel. 1.2b (CPU suite of 1989)
SPECfp89   = geometric average of the SPEC ratios of the 6
             floating-point programs in rel. 1.2b
SPECmark89 = geometric average of all 10 SPEC ratios of the programs
             in rel. 1.2b

In addition, there was the possibility of throughput measurements,
with 2 copies of a benchmark running per CPU, called "Thruput Method
A" (there was never a "Method B"). The following average values had
been defined:

SPECintThruput89 = geometric average of the Thruput Ratios of the 4
                   integer programs
SPECfpThruput89  = geometric average of the Thruput Ratios of the 6
                   floating-point programs
SPECThruput89 ("aggregate thruput")
                 = geometric average of the Thruput Ratios of all 10
                   programs

SPEC now discourages use of the 1989 benchmark suite and recommends
use of the CINT92 and CFP92 suites, for the following reasons:

- The new suites cover a wider range of programs (20 programs instead
  of 10),

- The execution times for some of the old benchmarks became too short
  on today's fast machines, with the danger of timing inaccuracies,

- Input files have now been provided for most benchmarks in the 1992
  suites, eliminating the danger of unintended compiler optimizations
  (constant propagation),

- The new suites no longer contain a benchmark (030.matrix300) that
  was too much influenced by a particular compiler optimization. This
  optimization, while legal and a significant step in compiler
  technology (it is still often used with the benchmarks of 1992),
  inflated the SPEC ratio for this benchmark since it executed only
  code susceptible to this optimization.

However, SPEC is aware of the fact that results with the old benchmark
suite will still be quoted for a while and used for comparison
purposes. SPEC has discontinued offering Rel. 1.2b tapes after
December 1992, will label result publications in the Newsletter with
"Benchmark Obsolete", and will finally discontinue result publications
for it after June 1993.
6. Forthcoming SPEC Benchmarks
==============================

A number of areas have been considered or are being considered by SPEC
for future benchmark efforts:

- Client/Server benchmarks
- Commercial computing benchmarks
- RTE-based benchmarks
- I/O benchmarks

SPEC is always open to suggestions from the computing community for
future benchmarking directions. Even more welcome, of course, are
proposals for actual programs that can be used as benchmarks.

7. Membership in SPEC
=====================

The costs for SPEC membership are:

Annual Dues     $ 5000.00
Initiation Fee  $ 1000.00

There is also the category of a "SPEC Associate", intended for
accredited educational institutions or non-profit organizations:

Annual Dues     $ 1000.00
Initiation Fee  $  500.00

Associates have no voting privileges; otherwise they have the same
benefits as SPEC members: Newsletters and benchmark tapes as they
become available, with a company-wide license. Probably more important
are early access to benchmarks that are being developed, and the
opportunity to participate in the technical work on the benchmarks.
The intention for associates is that they can act in an advisory
capacity to SPEC, getting first-hand experience in an area that is
widely neglected in academia but nevertheless very important in the
"real world", and providing technical input to SPEC's task.

SPEC meetings are held about every seven weeks, for technical work and
decisions about the benchmarks. Every member or associate can
participate and make proposals; decisions are made by a Steering
Committee (9 members) elected by the general membership at the Annual
Meeting. All members vote before a benchmark is finally accepted.

8. Acknowledgments
==================

This summary of SPEC's activities was written initially by Reinhold
Weicker (Siemens Nixdorf). Portions of the text have been carried over
from earlier Usenet postings (Answers to Frequently Asked Questions,
No. 16: SPEC) by Eugene Miya (NASA Ames Research Center). Additional
input has been provided by Jeff Reilly (Intel). This summary is
regularly updated by Jeff Reilly, Reinhold Weicker and possibly other
SPEC people.

Managerial and technical inquiries about SPEC should be directed to
NCGA (see section 2). E-mail questions that do not interfere too much
with our real work :-) can also be mailed to

jwreilly@mipos2.intel.com           from North America
Reinhold.Weicker@stm.mchp.sni.de    from Europe or elsewhere

-------