Newsgroups: comp.parallel,comp.sys.super
From: eugene@sally.nas.nasa.gov (Eugene N. Miya)
Reply-To: eugene@george.arc.nasa.gov (Eugene N. Miya)
Subject: [l/m xx/xx/xx] Dead Comp. Arch. Society c.par/c.s.super (26/28) FAQ
Organization: NASA Ames Research Center, Moffett Field, CA
Date: 26 May 1998 12:03:05 GMT
Message-ID: <6keb1p$1sj$1@sun500.nas.nasa.gov>
Archive-Name: superpar-faq
Last-modified: 30 Apr 1998

26 Dead computer architecture society    < * This Panel * >
27 Special call
28 Dedications
 2 Introduction and Table of Contents and justification
 4 Comp.parallel news group history
 6 parlib
 8 comp.parallel group dynamics
10 Related news groups, archives and references
12
14
16
18 Supercomputing and Crayisms
20 IBM and Amdahl
22 Grand challenges and HPCC
24 Suggested (required) readings

This space intentionally left blank (temporarily).

UNDER DEVELOPMENT

This is a roughly chronological list of past supercomputer, parallel
computer, or especially "interesting" architectures, not paper designs
(see panel 14 for references on those).  Computer archeology is important
(not merely interesting), because it is in the failed projects that real
learning takes place.  Even Seymour Cray designed "failed" machines.

DCAS took its name from a so-so Robin Williams movie, Dead Poets Society
(DPS), which nerdy CS students went to see (trust me, he's better in live
performance).  The dead-architecture, lessons-learned discussion started
in comp.arch later that same year.  The idea was to collect material from
knowledgeable ex-engineers and former scientists, anonymously if need be,
before it was lost (since the companies had either died or evolved).  The
problem is that the academic and commercial literature is fraught with
useless, glowing marketing/sales language.  We (the net; I didn't do this
alone) collected comments, anonymously where necessary, so the lessons
would not be lost.  The idea was that anyone could comment.  Netters had
hashed over this material so many times before that it seemed useful to
capture it (like an FAQ ;^).

We assembled a list of architectures.  Maybe a third of the way through
the list, I was asked by certain people within CRI to suspend the
discussion, because CRI was starting to acquire Supertek (which I
personally always thought was a mistake).  We never resumed.  We lost the
inertia.

Ever hear of the Gibbs Project?  If not, you should not be surprised.

Around that same time, ASPLOS came to Santa Clara, where they held a Dead
Computer Architecture Society panel session.  I had a meeting of some sort
(possibly SIGGRAPH) and missed the starting hour.  I gave Peter Capek of
IBM TJW a video camera, but I did not keep the tape, because I merely
wanted to see what I had missed (if I had kept it, I would have given it
to J. Fisher, who sat on the panel).  I did not regard that as recording
history.  The panel session discussed the various failed
minisupercomputer firms (perhaps I should use more flowery marketing
language like "attempted?").  Either way, the lessons were there in front
of 200+ architects, OS and language designers.  Perhaps there was another
video camera in the room.....

Let's see, what were the four architectures represented?
  Elxsi
  ...
  Multiflow
  ...

One poster asked: "Why no mention of the Symbolics 3600, LMI, or TI LISP
machines?"  I am not averse to including the lessons from those machines;
however, the DCAS discussion was about minisupercomputers.
The 3600 and other LISP machines fell more into the class of workstations
of their time, competing with the Xerox "D-machines" [Dorado, Dolphin, and
Dandelion], Sun, SGI, VAXstation, etc.  Most at the time were not even
parallel machines.  But if you can pitch me a good case, I'll consider
them.  Do it.  Also useful: old header files for those systems which ran
C compilers.

Most recently, I am reminded of a warm fall Saturday morning in a house on
a hill overlooking the beautiful Santa Barbara Channel.  George Michael,
whom I had driven down just to see Glen Culler (who had suffered a stroke
some time back), was talking about "war stories."  Ms. Culler [wife and
David's mother] chimed in: "I really think you need a better title for
your book {one GAM was working on}.  No one will buy it with a word like
'war stories' in the title...."  The three of us in the room chuckled.
She is great.

The Dead Computer Architecture Society
======================================

Floating Point Systems (FPS)
----------------------------
(Purchased by Cray Research)

FPS AP-series (Culler-based design with VLIW attributes):
7600 performance attached to your PDP-11.  Roots with Culler-Harris
(CHI), Inc.  FPS started with specialized attached processors such as the
FPS AP-120B, and scaled from there to the FPS T-series hypercubes.  The
AP-120 line could be attached to machines as small as a PDP-11.  They
were controlled by specialized Fortran (and later C) system calls (a
software emulator existed for code development: obviously slow).  Known
as an FFT and MXM box.  It was marketed in 1977 in Scientific American as
7600 power on your minicomputer, and showed quite respectable, but
economical, number-crunching power (I/O was still a problem).  38-bit
words.  Pipelined; a precursor to VLIW?  Perhaps.

Later models: FPS-164, FPS-264, FPS-500, APP.
Larger 64-bit attached processors.  Pre-IEEE FP.  Attached processors
became useful and popular for signal processing and medical applications.

FPS T-series (hypercubes):
Someone else (maybe Stevenson) can write a T-series paragraph.

Absorbed by Cray Research.  This business unit was sold by SGI to Sun at
the time of the SGI/Cray merger, 7/96.  [Current living incarnation.]
The former CS-6400 line: the current living incarnation is the
UltraEnterprise 10000 and UltraHPC 10000 (two different names for two
different markets, same box).

Denelcor
--------
The Denelcor Heterogeneous Element Processor (HEP) was perhaps the most
unusual architecture a student will never get a chance to see.  My first
knowledge of this machine came from Mike Muuss (BRL, scheduled to get one
[4 PEMs delivered]) at a time when the DEC VAX-11/780 was the only VAX
around.  Later I would invite representatives to Ames.

7600-class scalar CPUs at a time when the Cray-1 was out and the X-MP was
just being delivered.  64-bit machine.  1978-1984.  Full/Empty bits on
the memory, which go way beyond mere Test-and-Set instructions (see the
sketch at the end of this entry).  Separate physical instruction (128 MB)
and data (1 GB) memory.  Based in Aurora, CO, east of Denver.  Operating
systems: HEP-OS and HEP Unix.  Programming and architecture manuals at
the Museum.  Keywords: dataflow (limited), 13 systems delivered.  Photos.

Sites (Messina list, 13 sites):
  BRL (only 4 PEMs)
  Argonne
  LANL
  GIT
  XXX (probably)
  Lufthansa
  7 to go.

Problems: somewhat underpowered for its time, programming difficulties.
Hardware deadlock.  Early inexperience with serious parallel systems.
Software.  Ambitious.  Pipelining.  Dataflow.

I would hope that a HEP simulator sees the light of day, one of these
days.
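The HEP's Full/Empty bits turned every word of data memory into a
synchronization point: a load could be made to wait until the word was
"full" and leave it "empty", while a store waited for "empty" and left it
"full".  As a rough illustration (a software sketch using POSIX threads,
not HEP hardware, HEP-OS, or any Denelcor interface), the C fragment below
emulates a single full/empty word with a mutex and condition variable; a
producer and a consumer hand values through it in strict alternation.

/* A minimal sketch of full/empty-bit semantics (illustration only): a
 * "read" waits until the word is FULL and leaves it EMPTY, a "write"
 * waits until it is EMPTY and leaves it FULL. */
#include <pthread.h>
#include <stdio.h>

typedef struct {
    long value;
    int full;                    /* the full/empty bit */
    pthread_mutex_t lock;
    pthread_cond_t  changed;
} fe_word;

static void fe_init(fe_word *w) {
    w->full = 0;
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->changed, NULL);
}

/* Writer: wait for EMPTY, store, mark FULL. */
static void fe_write(fe_word *w, long v) {
    pthread_mutex_lock(&w->lock);
    while (w->full)
        pthread_cond_wait(&w->changed, &w->lock);
    w->value = v;
    w->full = 1;
    pthread_cond_broadcast(&w->changed);
    pthread_mutex_unlock(&w->lock);
}

/* Reader: wait for FULL, load, mark EMPTY. */
static long fe_read(fe_word *w) {
    pthread_mutex_lock(&w->lock);
    while (!w->full)
        pthread_cond_wait(&w->changed, &w->lock);
    long v = w->value;
    w->full = 0;
    pthread_cond_broadcast(&w->changed);
    pthread_mutex_unlock(&w->lock);
    return v;
}

static fe_word channel;

static void *producer(void *arg) {
    (void)arg;
    for (long i = 1; i <= 5; i++)
        fe_write(&channel, i);   /* blocks until the consumer empties it */
    return NULL;
}

int main(void) {
    fe_init(&channel);
    pthread_t t;
    pthread_create(&t, NULL, producer, NULL);
    for (int i = 0; i < 5; i++)
        printf("consumed %ld\n", fe_read(&channel));
    pthread_join(t, NULL);
    return 0;
}

On the real machine the handshake was a hardware property of memory
words, which is what put it "way beyond mere Test-and-Set"; compile the
sketch with something like cc -pthread to watch the alternation.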
It is suggested that the Horizon simulator is a close approximation to
the HEP.  I do not know how to obtain it (but I know roughly where); I
just don't have the time.  Successor machines: HEP-2 (design 70%
complete?) and HEP-3.  Horizon (paper design).  Tera (1 machine).
Keywords: learning to live with latency.  See Dennis Shasha's book Out of
Their Minds for a barely adequate profile of Burton Smith (too short).

Elxsi
-----
Sunnyvale, CA based super-minicomputer.  ECL technology, bus-oriented,
true 64-bit, the first IEEE 754 FP machine, SISD (non-vector) CPUs (1-10,
later 1-16 CPUs).  Impressive for its time (designed to compete against
the VAX-11/780 AND low-end CDC supercomputers).
EMBOS ("Unix-like" operating system.  "We renamed `grep` to `find`."
"Ah?  And what did you rename `find` to?")
ENIX
Tata Elxsi, sites in Australia and India.  Over 200 sites, and many CPUs.
1 CPU per board.  Photos exist.  The firm dissolved in the late 1980s;
people went to H-P.

Personal experience: saw and briefly used a 4-processor system which
replaced a Cyber 172 (since replaced by networked workstations).  That
application was real-time flight data analysis on experimental aircraft.

Lessons?
--------
ECL is expensive.
Don't screw around with OSes.
Understand the market.

Alliant
-------
Once called Data Flow Systems.  comp.sys.alliant
FX/8, FX/80, FX/2800, etc.
The FX/8 (the first architecture) had a particularly slow scalar system,
using MC68008 CPUs for the interactive front-end processors, at a time
when Sun workstations had the better-known 68010 processors.  The
multiple back-end Computational Engines (CEs) were a proprietary design
with vector instructions.  The Berkeley Unix port was a mixed beast.
Basis for the U. Ill. Cedar Project.  Fizzled?  Acquired Raster
Technologies (graphics).

The Friday before Memorial Day 1992.  At least that's when 80% of us got
laid off.
1) Undercapitalized in a market not as big as it first appeared.
   > I disagree about the "undercapitalized".
2) Technology changing faster than we could keep up with.  (Small
   Unix-CPU systems can be designed and shipped far faster than a
   parallel system.)
   > True.
3) Relying on Intel for a part that didn't _end_ in "86".
   > True.
4) Long lead time on sales of MPP systems.

#include "alliant.h"
Surviving news group: comp.sys.alliant.
Museum has one FX/8 ("Do not run classified data on this machine") and
one FX/1 (former Wallach desk stand).

Multiflow
---------
VLIW.
A couple of us pondered whatever happened to SN#1 of this machine.
I saw it!  Even typed 'ls' on it!
Is the third flag at the assembly room at Convex Computer: Maryland?
Museum History Center (Trace 14).
Lessons?  Among others: waited too long to go to ECL.

Myrias
------
What was the US DOD doing funding a Canadian company?  That was the first
question that ever came up on the net.
Home base was Edmonton, Alberta, Canada.
Formed by some academic types from U of Edmonton ca 1984?  (or was it U
of Alberta at Edmonton?)
Original design "on a napkin at a bar"...

Weakness: Hardware.
  SPS-1  68000, "proof of concept": a hierarchy of busses, 4 68k's per
         bus, 16 of those busses on another bus in a box (called a
         "cage"); hook as many cages together as you want/can afford.
         ca 1986?, none installed.
  SPS-2  as above but with 68020 + 68881/2 + MMU, "production system",
         4 MB per 68020.  The largest system actually built was about
         1088 CPUs (~ 1024 + 64), a "benchmark system", proof of concept
         (again).  ca 1988?, <~10 installed.
  SPS-3  as above but with 68040; one or two actually built.  ca 1990?,
         ~0 installed.

Strength: Software.
Basic idea: take VM Unix, remove the pager/swapper, and replace the pager
with a custom pager which swaps pages between processors according to
rules which create the illusion of a single address space: SPMD.  Hey,
this is pre-SCI, KSR, et al.

Neatest thing: the debugger could make "ghost pages" (one ghost word per
data word), which contained a count of the number of reads/writes per
word; it could find uninitialized words easier than anyone.  (Notice I
didn't say "faster".)  A small sketch of the idea appears at the end of
this entry.

Funniest thing: we had more trouble getting our Canadian friends into
NASA than into NSA...

Q: What did "SPS" stand for?
A: Oh, Scalable Processor System, or Super Parallel System, or something.
   If you can make up something sensible from "SPS", someone at Myrias
   probably used the term at least once...

Q: Why did US DoD fund a Canadian startup?
A: No one else had a provably _scalable_ system at the time.

They went bankrupt without warning: got a call at 1430 to come to the
office, "you're out of work at 1700".  Don't like it?  Sue us in
Edmonton.  "You're toast, eh?"

AFAIK, today they're still alive and well, selling the software for
workstation clusters.  You can see them at Supercomputing '9?, perhaps
sharing a small booth.

Lessons Learnt: Myrias: Hardware matters.  The best software in the world
needs something to run on.  300 Kflops/processor hasn't been
supercomputing for quite a while now.
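To make the "ghost pages" idea above concrete, here is a minimal
user-level sketch (my illustration, not Myrias code; the helper names and
array sizes are made up): shadow counters record reads and writes per
word, so a word read before it has ever been written can be reported,
which is exactly the class of bug the Myrias debugger was good at
finding.

/* Ghost-word sketch (illustration only): one read counter and one write
 * counter per data word; a load of a never-written word is flagged. */
#include <stdio.h>

#define NWORDS 8

static double data[NWORDS];
static unsigned ghost_reads[NWORDS];   /* reads per word  */
static unsigned ghost_writes[NWORDS];  /* writes per word */

static void store(int i, double v) {
    ghost_writes[i]++;
    data[i] = v;
}

static double load(int i) {
    ghost_reads[i]++;
    if (ghost_writes[i] == 0)
        fprintf(stderr, "ghost: word %d read before ever written\n", i);
    return data[i];
}

int main(void) {
    for (int i = 0; i < NWORDS - 1; i++)   /* "forget" the last word */
        store(i, (double)i);

    double sum = 0.0;
    for (int i = 0; i < NWORDS; i++)       /* trips the check on word 7 */
        sum += load(i);

    printf("sum = %g\n", sum);
    for (int i = 0; i < NWORDS; i++)
        printf("word %d: %u reads, %u writes\n",
               i, ghost_reads[i], ghost_writes[i]);
    return 0;
}

The real system kept the counters in separate ghost pages, one ghost word
per data word, rather than in ordinary arrays as here.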
Flexible Computer (FLEX)
------------------------
FLEX-32.
Saw it!  (Langley Research Center)  Even typed 'ls' on it!
NS32032 processors on a bus.  Competitor to Sequent.

Scientific Computer Systems (SCS)
---------------------------------
SCS-40, SCS-30 (in order of development time, not performance).
An attempt at a low-cost, binary-compatible Cray X-MP clone, mostly for
software development, but also marketed to those who could not afford a
full-sized Cray.  This was probably a bad business decision on their
part.  "Come down from your Cray...."  Kiss of death.  The term
"Crayette" was first used with this machine.  The last hosts running COS
and CTSS (CIVIC: the Fortran which replaced LRLTRAN).  Licensed COS 1.13
from CRI.  Meanwhile CRI was transitioning to UNICOS (tm).  They secretly
hoped UNICOS was going to fail.  Then they hoped for a remaining
(surviving) COS/CTSS market: also failed.  Shipped a few dozen Cray
clones.  The first machines developed were delivered to San Diego.  Roots
also in Portland, OR, and Boeing (which also purchased a couple) in
Seattle.

Supertek
--------
(Purchased by Cray Research)
Another attempt at a low-cost, binary-compatible Cray X-MP clone.  Mike
Fung of H-P tried this.  Santa Clara, CA.
S-1 (not to be confused with the LLNL S-1 project), S-2.
The last host running CTSS.
You should probably note that the S-1 was sold by Cray Research after the
buyout as the "XMS", and the S-2 (which was still under development at
the time of purchase) was sold as the "Y-MP/EL".

Culler-Harris Inc. (CHI)
------------------------

Culler Scientific
-----------------

Ametek
------
A conglomerate with something like 40 divisions, one of which produced a
hypercube clone similar to Intel's and NCUBE's, in Arcadia, CA.  Used at
Caltech and other sites.  Never saw one.

Guiltech
--------
Based in Santa Clara, CA.  A somewhat mysterious company (Guilfoyle).
Originally an optical interconnect; it changed to a systolic design.  The
only VMS-based "supercomputer."  Two? delivered (JPL and TRW) in beta
test.  Its last PR gasp was when an employee sold a manual to the Soviets
in the mid-80s.  That employee was sent to prison for violating export
control laws.

Cydrome
-------
Milpitas, CA.  Hosted SIGBIG meetings.
Cydra 5 (black boxes).  Two delivered?  Pittsburgh Supercomputer Center
(PSC), and one Cydrome was delivered to Yale, where a water pipe running
through the machine room burst over it.
  Bill Gropp (at Yale 1982-1990) -- http://www.mcs.anl.gov/~gropp
Hashed addresses: the Cydra 5 (aka MXCL 5) did this.  It was one of the
things that made the memory system expensive (it didn't take 0 cycles,
but it did make access to memory pretty uniform, independent of stride).
I should try to find Richard about this and also see if he retains any
old manuals.
Museum History Center (Cydra 5).

Cray Computer Corporation
-------------------------
Colorado Springs, CO.
Cray-3
Cray-4
A computer company doing research when the parent research company was
doing computers.  Forked from its parent CRI sometime after an
unsuccessful Cray Labs in Boulder, CO, about the same time as SSI (1).
The Cray-3 was intended to be a 16-processor machine with a 2.0 ns clock
cycle (1 instruction per cycle, unlike the Cray-2).  The Cray-4 was to be
a 1 cubic foot cube.  The 4 abandoned the local memory and brought back
the B and T registers.  GaAs technology (Vitesse).  The founder was
killed by complications from injuries suffered in a car crash.
Successor: SRC Computer.
Cray-5
Cray-6

Supercomputer Systems Inc. (1)
------------------------------
Eau Claire, WI.  Steve Chen.
Heavily funded by IBM, with not a lot to show for it.  One 2-CPU
prototype.  Photo in BusinessWeek inside a Faraday cage.  Stories that a
hunk of the machine was not properly cooled on first power-up, and that
the hulk was later found abandoned by the side of a road.  Scheduled to
be a ramped-up 64-processor Y-MP with another memory stage.

Cray Research Incorporated
--------------------------
Acquired by Silicon Graphics.

Still Birth
===========

American Supercomputer
----------------------
A project by Mike Flynn (Stanford).

CHoPP
-----
Forgot what the CH stands for... PP was parallel processing.  1970s era,
by Sullivan.

Supercomputer Systems Inc. (2)
------------------------------
San Diego, CA.  Very little is known about this firm.

Half-alive companies (software, services, different products only)
====================

CDC (now CDS)
---
This section will be added later.  comp.sys.cdc

ICL DAP (not totally dead)
-------
ICL, sometimes called the IBM of England.  Sometimes considered
competition to the Goodyear/Loral MPP.  An English SIMD machine, later
a.k.a. Active Memory Technology (Irvine offices).
Versions: Transputer, SPARC?

Inmos Transputers
-----------------
Not a supercomputer per se, but an interesting attempt at a component
with real concern for I/O.  A popular processor (1982) in some circles.
A well-thought-out communication scheme, but problems with scalability.
Crippled by its lack of US popularity.

|However, you must mention Transputers (something developed in EUROPE,
|outside of the U.S.A.; the name comes from TRANSistor and comPUTER) and
|the related companies:
|* INMOS (from GB), the inventor and sole manufacturer of transputers,
|  now bought by SGS-Thomson (French)
|* Parsytec (still alive, but does not use Transputers any more, Germany)
|* Meiko (GB) produced the "computing surface"
|* IBM had an internal project (codenamed VICTOR)
|and there are many more.  Transputers had a built-in communication agent
|and it was very easy to connect them together to form large
|message-passing machines in regular, fixed topologies.  INMOS' idea was
|that they should be the "transistors" (building blocks) of the new
|parallel computers.
The Inmos transputer has earned a place in this file now that SGS-Thomson
has issued the last-time-buy warning (end of '98; last deliveries end of
'99).  The moral of this one: don't try to change everything at once
(language, processing model, hardware).

SIMD machines in general

Thinking Machines Corp.
-----------------------
Thinking Machines was founded by Danny Hillis to develop the concept of
massively parallel (SIMD) computers.  TMC sold over 100 systems called
"Connection Machines" between 1989 and 1996.
  CM-2  up to 65,536 single-bit computers with FP accelerator
  CM-5  up to 1024 32-bit (SPARC) computers with vector accelerator
They went out of the computer business in 1996 and are still alive
(barely) making data mining software.
CM-1, CM-200, special projects.

MasPar
------
Now a data mining software company.  A SIMD mini-Connection Machine-1,
also resold by DEC minus the lights and the black cabinet.

KSR1/KSR2
---------
Home base was Waltham, MA, USA.  It was a nondescript, plain red-brick
building at the end of a long driveway past other office buildings.
There was _no_ identifying signage, and no indication which door was the
front door.  Formed by some academic types from Cambridge; the first
office was actually on Kendall Square, hence the name.

Strength: AllCache
The goal is to have a logically shared memory in a scalable architecture.
So you connect your processors, with their caches, to main memory.  What
does main memory do?
  1. It gives you a bottleneck, and
  2. It provides the value which any datum is assumed to have, and
  3. It doubles the memory costs of your computer.
Thus, if you can figure out how to do (2), you can eliminate (1) and (3).
The AllCache Engine solves (2) as follows: connect the processors
together using a high-speed, unidirectional ring to give high bandwidth
and allow all processor caches to stay coherent.  The size of the ring
was 34 = 32 + 2 nodes.  Use 32 of the nodes for processors, and the other
two for linking to other rings.  Configure the rings in a hierarchical
fashion, using 32-processor rings as the base level, rings connecting to
rings with 32 processors as the next higher level, rings connected to
those rings as the next higher level, etc.  Tell yourself that "data
locality" means you'll rarely have a memory access go through the higher
levels of the rings.  Voila: scalable shared memory.

AllCache level 0 is a ring of (up to) 32 processors.
AllCache level 1 is a ring whose nodes connect to the rings with 32
processors.  Any KSR1 with more than 32 processors had level 1.
Max is 32 * 34 = 1088.
AllCache level >1 was never built, but was allowed by the architecture.

AllCache moved cache lines, which were 128 bytes.  (Not to be confused
with subcache lines, which were 64 bytes.)
  Subcache size:     512 KB
  Cache size:        32 MB (i.e., "per processor")
  Level 0 AllCache:  1 GB  (32 processors)
  Level 1 AllCache:  34 GB (34 * Level 0)
(A back-of-envelope sketch pulling these numbers together appears at the
end of this entry.)

KSR1 - 20 MHz processors; the largest ever built was installed at BRL and
       was 384 processors.  Sites included CTC, ORNL, GT, NCSC, UMi, UFl,
       and a few more.
KSR2 - 40 MHz processors --> but the same speed as the KSR1 memory system
       !?!?  (Some of my sales friends say it worked.  None of my
       sysadmin friends ever said it worked.)
KSR3 - same as KSR1 & KSR2, but would use IBM PPC processors.

Weakness: Implementation
KSR made their own processors: 20 MHz with a fused fadd/fmul instruction
gave a luminal speed of 40 Mflops per processor.  Two instruction
streams: arithmetic and memory.  IEEE FP.  It was a 64-bit processor with
64-bit addresses in 1991.
It had no speculative execution or branch prediction, etc.  I/O ran
through the processor and worked by "cycle stealing": when the I/O
subsystem wanted the processor to do something, it would stop instruction
issue and insert its own instructions in the memory-op instruction
stream.

AllCache latencies were approximate (_no_ memory time on the KSR1 was
determinate; all were averages; too many microstates for the same
macrostate):
  data item is somewhere in subcache              -   2 clocks
  data item is somewhere in cache                 -  20 clocks
  data item is somewhere in your level 0 AllCache -  50 clocks
  data item is somewhere in your level 1 AllCache - 150 clocks
The subcache was two-way set associative with random replacement.  The
cache was 16-way set associative with random replacement, but 4 of the 16
ways were tied down by the OS.  The processor didn't have a scoreboard,
and nobody really knew just exactly where, at any time, a data item might
be located, so a subcache miss stalled the processor for _at least_ tens
of clocks.

The bottom line was that the KSR1 was a difficult beast to program *for
high efficiency*.  The programmer had to keep in mind which subcache line
a data item would use, and which cache line a data item would use, all
the while trying to make (typically vector) code have behavior resembling
cache re-use.  One thing which was supposed to help was an instruction
called "prefetch", which could move a data item to where it was needed
prior to the actual data request.  In Fortran, a prefetch looked like a
function call (which the compiler would silently ignore).  It didn't work
in general, and who wants to code prefetches?  Why not just go with
message passing?

Neatest thing: free lunches in house.  This saved the company a lot of
time (employees didn't spend it driving to restaurants or stuck in
traffic) and kept people from talking shop where others could overhear.
It was a good meal (and I'm a fussy eater).

Funniest thing: where other startups had newspaper clippings on the walls
describing some victory the company achieved, at KSR the most popular
clipping was a gag article some local business reporter wrote on how fast
startup CEOs drove their fancy sports cars on Rt. 128.  Henry claimed 138
mph (speed limit 55 mph).

Q: If AllCache was such a good idea, why did KSR die?
A: They were caught inflating the revenue the company had actually
   received.  They were sued by the stockholders, who were paid off
   largely in stock.  The day after the court finalized the settlement,
   the company declared bankruptcy (the capital didn't give any more to
   management).

Q: Why was KSR so secretive?
A: AllCache is a simple idea, and it's not clear the patents would be
   upheld in court (post Myrias, SCI).

Was laid off in true KSR style.  Found out when my login didn't work
anymore.

Lessons Learnt: KSR: Don't assume that having troubles in the Numerical
World means you're ready for the Transaction/Data Mining World (if you
can't make an NSF computer center happy, how are you going to make a bank
happy?); use, ahem, standard accounting practices, at least after you go
public.
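As an addendum, here is a back-of-envelope sketch (mine, not KSR's)
pulling together the AllCache figures quoted in this entry: it recomputes
the 1088-processor maximum and the level-0/level-1 capacities from the
ring sizes, and uses the latency figures above to show how even a small
miss rate out of the subcache drags the average load time well past 2
clocks.  The hit-rate mix in it is invented purely for illustration.

/* Back-of-envelope arithmetic on the AllCache numbers quoted above.
 * Only the sizes and clock counts come from the text; the hit fractions
 * in the latency estimate are invented for illustration. */
#include <stdio.h>

int main(void) {
    const int ring_nodes = 34;            /* 32 processor slots + 2 links */
    const int procs_ring = 32;
    const int max_procs  = procs_ring * ring_nodes;          /* 1088   */

    const double cache_mb = 32.0;         /* per-processor cache, MB     */
    const double lvl0_gb  = procs_ring * cache_mb / 1024.0;  /* 1 GB   */
    const double lvl1_gb  = ring_nodes * lvl0_gb;             /* 34 GB  */

    /* Latencies (clocks) from the text; hit fractions are assumptions. */
    const double lat[4]  = { 2.0, 20.0, 50.0, 150.0 };
    const double frac[4] = { 0.90, 0.07, 0.02, 0.01 };        /* invented */
    double avg = 0.0;
    for (int i = 0; i < 4; i++)
        avg += frac[i] * lat[i];

    printf("max processors   : %d\n", max_procs);
    printf("level-0 AllCache : %.0f GB\n", lvl0_gb);
    printf("level-1 AllCache : %.0f GB\n", lvl1_gb);
    printf("avg load latency : %.1f clocks (assumed hit mix)\n", avg);
    return 0;
}

With the invented 90/7/2/1 percent mix the average comes out near 6
clocks, which is one way to read the "difficult beast to program for high
efficiency" remark.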
Evans and Sutherland Computer Division ES-1
-------------------------------------------
E&S is well-known in the computer graphics community for making some of
the finest high-performance real-time computer graphics hardware.  This
image generation hardware is used in $10-100M flight (and other)
simulators.  When it was announced that E&S was getting into the
supercomputer arena, they were perceived as a serious/credible new
contender.  Gordon Bell, however, takes a dimmer view of them.
One representative machine is in storage at the Museum (over Gordon's
dead body).

ES-1
----
Jean Yves LeClerk studied under Dave Evans and, upon getting his degree,
went back to France.  When he got an idea of how to build a
supercomputer, he came back to Evans for advice on how to raise capital
to fund the project.  Evans said "I won't tell you how to raise money;
I'll fund you myself."  Thus, Evans & Sutherland got a computer division,
ESCD.  It was located in Mountain View, CA, USA, just off US 101.  Formed
ca 1986; product shipped 1989, near midnight at the end of September (so
shipped in the quarter promised).

Basic idea: the building block was an 8 x 8 nonblocking crossbar, which
could also connect to another similar crossbar without using one of the
8 x 8 connections.  Use two crossbars to connect 16 processors to a
memory system with 16 banks.  Virtual memory, with translation done on
the memory side of the crossbar (to allow faster context switches).  The
processor had a small TLB (512 ? page entries).  Use another 8 x 8
crossbar to connect 8 of those together, and you have 128 processors in
one system.  Note that this scheme may be extended: 8 128-processor
systems could be connected by another 8 x 8 crossbar, etc.  Great for
data-parallel; too bad there wasn't any HPF in 1988 :-(  (ESCD played
around with PCF, IIRC.)  The system was a (theoretically) scalable
shared-memory NUMA computer.

ESCD had a unique nomenclature: the processors were called "computational
units", and the set of 16 computational units and memory was called a
"processor".  Memory was 128 MB per "processor".  Use MACH for your OS,
so you'll have a "parallel" Unix.  You need the parallel file system to
drive the very good I/O subsystem, which was rated at 200 MB/s per
processor (1.6 GB/s per full system).

Neatly finessing the issue of custom v. off-the-shelf processors, ESCD
made their own processor, but used Weitek chips for the floating point.
(This was back in the days when a processor was a "chip set", rather than
the single die for sale today.)  During development, the clock was 40
MHz, with plans to go to 50 MHz by the time of production.  32-bit words,
but the Weiteks would do 64-bit FP nicely.  Luminal speed was about 12
Mflops per computational unit.  Measured speeds (i.e., operands in memory
rather than in registers) were more like 2-3 Mflops per.  There were some
unexpected problems with the pipelines in the processors: certain
instructions couldn't be issued at particular clocks after the issue of
certain other instructions.  The French called this "pipeline hazards";
the Californians called it "cursed instruction sequences".  It was a
closely guarded secret, and caused ESCD to release neither the
instruction set nor the assembler.

Biggest problem with the design was memory access:
  processor to memory: return the physical address of this virtual
                       address
  memory to processor: here's the physical address
  processor to memory: read/write data to/from this physical address
  memory to processor (or processor to memory): here's the data
So a memory op was actually 4 messages rather than 2 any time the
physical address wasn't in the processor's small TLB.  Think Linpack.
512 pages just ain't supercomputin'.
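As a rough illustration of that point (my sketch, not ESCD code, with an
invented page count and a trivial FIFO replacement policy): each
reference whose page misses a 512-entry TLB costs 4 messages instead of
2, and a Linpack-style sweep over a working set larger than the TLB
misses essentially every time.

/* Toy model (illustration only) of the ES-1 memory path: a reference
 * whose page is in the TLB costs 2 messages, a miss costs 4 (address
 * request, address reply, data request, data reply).  A working set
 * larger than the TLB is swept repeatedly and the messages are tallied. */
#include <stdio.h>

#define TLB_ENTRIES 512
#define PAGES       4096          /* working set larger than the TLB */

static long tlb[TLB_ENTRIES];
static int  tlb_next;             /* FIFO replacement pointer */

static int tlb_hit(long page) {
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i] == page)
            return 1;
    tlb[tlb_next] = page;         /* miss: install the translation */
    tlb_next = (tlb_next + 1) % TLB_ENTRIES;
    return 0;
}

int main(void) {
    for (int i = 0; i < TLB_ENTRIES; i++)
        tlb[i] = -1;

    long messages = 0, refs = 0;
    for (int sweep = 0; sweep < 4; sweep++)        /* repeated sweeps */
        for (long page = 0; page < PAGES; page++) {
            messages += tlb_hit(page) ? 2 : 4;
            refs++;
        }

    printf("%ld references, %ld messages (%.2f per reference)\n",
           refs, messages, (double)messages / refs);
    return 0;
}

Run as written, every reference in the sweep misses, so the tally comes
out at 4 messages per reference rather than 2.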
Serials 1 and 2 went to Caltech and U. Colorado at Boulder (can't recall
which got which).  Up to about serial 7 or so were in some stage of
production when the project ended.  The ES-1 at CU Boulder was installed
right beside, and during the same week as, the Myrias SPS-2.
Head-to-head competition.  (Myrias was in and out in a couple of days.
ESCD needed a couple of weeks.)

Neatest thing: being within walking distance of Shoreline City Park, so a
walk would clear the head of the frustrations of working on beta (alpha?)
HW and SW.

Funniest thing: culturally, ESCD couldn't take a meal together, because
the French wouldn't eat without wine, which the Mormons wouldn't touch.
The Californians acted well-fed after mass quantities of pizza and beer,
while the sales team was out looking for a 3+ star restaurant to put on
their expense reports.

Mort d'ES-1
The project ended when Evans resigned from the Board of E&S.  Then a 4/3
vote in favor of the project became a 3/4 vote against.  We got 60 days'
notice (U.S. plant-closing law); it was announced at the Supercomputing
convention in Reno.  (If you have to end my project, end it at the
Supercomputing convention.  Most of the places I might look for work are
within walking distance.  For the record, Henry Cornelius was the first
headhunter to our booth after the announcement, within 3 or 4 minutes.)

Lessons Learnt: ESCD: Don't push too many technologies simultaneously.
Chips checked out on the silicon compiler, but were too dense (100K
transistors in 1986) for the foundries to make at acceptable yield; MACH
was not ready for commercial use in 1988.

End ES-1

[ My thanks to the moderator for allowing me to contribute my
reminiscences.  Next time I work for another startup with another Great
Idea, I'll take better notes...  (I didn't know there was going to be a
test ;-) ]

Astronautics
------------
ZS-1: no vectors.  Wisconsin.

Prisma
------
Colorado Springs.  GaAs.  Convex software.

Vitesse
-------
Vitesse Electronics was a startup to do two things: GaAs chips (that
business survives as Vitesse Semiconductor), and a mini-supercomputer
which never made it.  The architecture of the Vitesse Numerical Processor
(VNP) used a very deep pipeline, and attempted to bypass the latency
problems that arose from the deep pipeline with a so-called data-driven
optimizer (DADO).  The machine did not have registers, but used
three-address instructions directly involving memory.  The DADO kept
track of the interdependencies among source and destination addresses,
and issued instructions when it could.  The intent was to allow enough
instructions in the DADO to cover two (or perhaps even more) data
dependencies in tight loops without needing to stall the pipeline.

The processor did not have any I/O, but relied on a front end to do all
of that work and run most of the UNIX operating system.  System calls
from the back end were relayed to the front end.  The initial machine was
intended to be CMOS, with a view towards a later implementation in GaAs.

The machine was intended as an MP machine, and had a very interesting
interconnect.  Software was used to establish mappings from a local
processor's address space to so-called global virtual addresses.
Similarly, global virtual addresses could be mapped to local addresses.
The net effect was that the software could establish a form of
"carbon-copy" memory: writes from one CPU to a local address would also
show up, through the mappings, in a local address in one or more other
processors.  The mappings could be, but need not be, symmetric.

The machine was designed far enough to have an assembler, a compiler, and
an OS that booted, and it even ran a [trivial] job in simulation, but the
key chips were never fabbed.
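Since the DADO is described only in outline here, the following is a toy
sketch of the data-driven issue idea (my illustration, not VNP internals;
the instruction format, window behavior, and 4-cycle latency are all
invented): hold a window of three-address, memory-to-memory instructions
and issue any whose source addresses are not the destination of a write
still in the pipeline, so independent work keeps flowing while a
dependent instruction waits out the latency.

/* Toy data-driven issue window (illustration only): an op may issue when
 * none of its source addresses is the destination of an op still in
 * flight; a fixed latency stands in for the deep pipeline. */
#include <stdio.h>

#define LATENCY 4                 /* cycles an issued write stays in flight */
#define NOPS    4

typedef struct { int dst, src1, src2; } op;

/* c = a + b;  f = d + e;  g = c + f;  j = h + i   (addresses 0..9) */
static const op prog[NOPS] = { {2,0,1}, {5,3,4}, {6,2,5}, {9,7,8} };

static int issued[NOPS];          /* 1 once the op has been issued  */
static int ttl[NOPS];             /* cycles until its write retires */

static int ready(int k) {
    for (int i = 0; i < NOPS; i++)
        if (issued[i] && ttl[i] > 0 &&
            (prog[i].dst == prog[k].src1 || prog[i].dst == prog[k].src2))
            return 0;             /* a source is still being written */
    return 1;
}

int main(void) {
    int remaining = NOPS;
    int cycle = 0;
    for (;;) {
        int busy = 0;
        for (int i = 0; i < NOPS; i++) {   /* age the in-flight writes */
            if (ttl[i] > 0)
                ttl[i]--;
            busy |= (ttl[i] > 0);
        }
        for (int k = 0; k < NOPS; k++)     /* issue every ready op */
            if (!issued[k] && ready(k)) {
                issued[k] = 1;
                ttl[k] = LATENCY;
                busy = 1;
                remaining--;
                printf("cycle %d: issue op %d (writes address %d)\n",
                       cycle, k, prog[k].dst);
            }
        if (remaining == 0 && !busy)
            break;
        cycle++;
    }
    return 0;
}

In the toy trace the op that needs c and f waits several cycles while the
unrelated j = h + i issues immediately, which is the effect the DADO was
after.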
Applied Dynamics
----------------
The Applied Dynamics AD100, an ECL-based multiprocessor with 65-bit (yes,
65, not 64) floating point, did 20 MFlops in 1981.  There are a couple of
hundred installations, or more, the majority of which are in California.
The company was a University of Michigan Aerospace Engineering department
spinoff located in Ann Arbor, Michigan, and founded by three UM profs.
Their focus was/is on real-time applications; their system had lots of
special hardware to interface to real-time equipment.  The company still
exists, although they are not selling many of these expensive machines
any more, and they have a web site at http://www.adi.com.  It had a
minimal operating system and, in addition to Fortran, supported their
in-house parallel simulation language (ADSIM), derived from CSSL, for
systems of ODEs.

Q: Is it true that supercomputer programmers spend their nights in
   flophouses?
A: Only when coming up on a deadline.

Stan Lass