The Potential of Science With the Cell Processor

Posted by Zonk on Sunday May 28, 2006 @07:35AM from the making-a-station-play dept.

prostoalex writes "High Performance Computing Newswire is running an article on a paper by computer scientists at the U.S. Department of Energy's Lawrence Berkeley National Laboratory. They have evaluated the processor's performance in running several scientific application kernels, then compared this performance against other processor architectures. The full paper is available from the Computer Science department at Berkeley."

This discussion has been archived. No new comments can be posted.

Cell + Linux = success (Score:3, Funny)

by Anonymous Coward writes: on Sunday May 28, 2006 @07:37AM (#15419879)

OS X is closed source. This means that it is the work of the devil - its purpose is to make the end users eat babies.

Linux is the only free OS. Yes, the BSD licenses may appear more free, but as they have no restrictions, they are actually less free than the GPL. You see, restricting the end user more actually makes them more free than not putting restrictions on them. You must be a dumb luser for not understanding this.

And you obviously don't have a real job. A real job involves being a student or professional academic. You see, academics are the ones who know all about productivity - if you work for a commercial organisation you obviously do not know anything about computers. Usability is stupid. What's wrong with the command line? If you can't use the command line then you shouldn't be using a computer. vi should be the standard word processor - you are such a luser if you want to use Word. Installing software should have to involve recompiling the kernel of the OS. If you don't know how to do this, you are a stupid luser who should RTFM. Or go to a Linux IRC channel or newsgroup. After all, they are soooo friendly. If you don't know how the latest 2.6 kernel scheduling algorithm works then they will tell you to stop wasting their time, but they really are quite supportive.

Oh, and M$ is just as evil as Apple. Take LookOUT for instance. You could just as easily use Eudora. Who needs groupware anyway, a simple email client should be all we use (that's all we use as academics, why can't businesses be any different).

And trend setters - Linux is the trend setter. It may appear KDE is a ripoff from XP, but that's because M$ stole the KDE code. We all know they have GPL'ed code hidden in there somewhere (but not the things that don't work, only the things that work could possibly have GPL'ed code in it).

And Apple is the suxor because they charge people for their product. We all know that it's a much better business model to give all your products away for free. If you charge for anything, then you are allied with M$ and will burn in hell.

What about the compiler? (Score:2, Insightful)

by Watson Ladd ( 955755 ) writes:

The paper did a lot of hand-optimization, which is irrelevant to most programmers. What gcc -O3 does is way more important than what an assembly wizard can do for most projects.
- What about the programmer? (Score:5, Insightful)
  
  by Anonymous Coward writes: on Sunday May 28, 2006 @07:50AM (#15419919)
  
  "The paper did a lot of hand-optimization, which is irrelevant to most programmers."
  
  But not to programmers who do science.
  
  "What gcc -O3 does is way more important than what an assembly wizard can do for most projects."
  
  Not an insurmountable problem.
  
  - - Re:What about the programmer? (Score:3, Insightful)
      
      by zCyl ( 14362 ) writes:
      
      Hand optimization or writing portions of code in assembler is
      the last thing 85% of these people want to do. They don't want
      to be computing experts to do their science/research.
      
      When you're talking about reusable modules like an FFT or matrix multiplication, then many scientists doing simulations would love to have a hand-optimized FFT or matrix module to plug in as a simulation component. Even if they don't know a drop of assembly themselves, having the optimized module available can make a large differenc
- Re:What about the compiler? (Score:5, Insightful)
  
  by Anonymous Coward writes: on Sunday May 28, 2006 @07:55AM (#15419925)
  
  Hand optimization _is_ relevant to scientific programmers
  
  - Re:What about the compiler? (Score:3, Insightful)
    
    by penguin-collective ( 932038 ) writes:
    
    Except for a tiny minority of specialists, most scientific programmers, even those working on large-scale problems, have neither the time nor the expertise to hand-optimize. Many of them don't even know how to use optimized library routines properly.
- Re:What about the compiler? (Score:5, Insightful)
  
  by TommyBear ( 317561 ) writes: <tommybear2@gmail.com> on Sunday May 28, 2006 @08:07AM (#15419945) Homepage
  
  Hand optimizing code is what I do as a game developer and I can assure you that it is very relevant to my job.
  
  - Re:What about the compiler? (Score:1)
    
    by C.A. Nony Mouse ( 860026 ) writes:
    
    That games can be written to run well on Cell is not news. That the same might be true for scientific code is.
    - Re:What about the compiler? (Score:1, Insightful)
      
      by Anonymous Coward writes:
      
      Methinks that the point was that if a GAME development company is going to fork over the cash for ASM wizards, a company spending a few hundred mil. building a super-computer might just consider doing the same. Maybe.
      
      And I know from Uni that many profs WILL hand optimize code for complex, much used algorithms. Then again, some will just use matlab.
      - Re:What about the compiler? (Score:2)
        
        by SeeMyNuts! ( 955740 ) writes:
        
        "a company spending a few hundred mil. building a super-computer might just consider doing the same"
        
        Well, if they hire the typical contractor to do the work, $10 million goes towards the computer, $90 million goes towards a coffee service, and $300 million goes towards per diem.
  - Re:What about the compiler? (Score:4, Informative)
    
    by JanneM ( 7445 ) writes: on Sunday May 28, 2006 @10:09AM (#15420285) Homepage
    
    Hand optimizing code is what I do as a game developer and I can assure you that it is very relevant to my job.
    
    It makes sense for a game developer - and even more an embedded developer. You spend the time to optimize once, and then the code is run on hundreds of thousands or millions of sites, over years. The time you spend can effectively be amortized over all those customers.
    
    For scientific software the calculation generally changes. You write code, and that code is typically used in one single place (the lab where the code was written), and only run a comparatively few times, indeed sometimes only once.
    
    For a game developer to spend three extra months to shave a few seconds off one run of a piece of code makes perfect sense. For an embedded developer, spending a couple of months' worth of development cost to be able to use a slower, cheaper chip, shaving a dollar off the production cost of perhaps tens of millions of gadgets, makes sense.
    
    For a graduate student (cheap as they are in the funny-mirror economics of science) to spend three months to make one single run of a piece of software run a few hours faster does not make sense at all.
    
    In fact, disregarding the inherent coolness factor of custom hardware, in most situations it just doesn't pay to make custom stuff for science when you can just run it for a little longer to get the same result. Not infrequently I have heard about labs spending the time and effort to make custom stuff, only to find that by the time they were done, off-the-shelf hardware had already caught up.
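
A back-of-the-envelope version of the amortization argument above, as a sketch only. Every number here is hypothetical and chosen just to show the shape of the trade-off; the function name `net_hours_saved` is made up for illustration, and it naively counts a programmer hour and a machine hour as equal, which is itself debatable.

```python
def net_hours_saved(optimize_hours, hours_saved_per_run, runs):
    """Total run time saved across all runs, minus the time spent optimizing."""
    return hours_saved_per_run * runs - optimize_hours

# Hypothetical inputs, chosen only to illustrate the argument.
THREE_MONTHS = 3 * 4 * 40  # roughly 480 working hours of optimization effort

# Lab code: a few hours saved per run, but the job runs once.
lab = net_hours_saved(THREE_MONTHS, hours_saved_per_run=5, runs=1)

# Shipped code: two seconds saved per run, but a million runs.
shipped = net_hours_saved(THREE_MONTHS, hours_saved_per_run=2 / 3600, runs=1_000_000)

print(lab)      # negative: the optimization never pays for itself
print(shipped)  # positive: the same effort is amortized over many runs
```

With these made-up inputs the single lab run ends up hundreds of hours in the red, while the widely deployed code comes out ahead. The crossover depends entirely on the number of runs.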
    
    - - Re:What about the compiler? (Score:2)
        
        by m874t232 ( 973431 ) writes:
        
        It's not wasted time if the time spent optimizing is less than the time saved.
        
        Wrong. A programmer hour is much more valuable than a machine hour.
        
        And this hasn't been lost on scientists and engineers--hence the popularity of software like MATLAB.
        
        Re:What about the compiler? (Score:3, Informative)
        
        by jericho4.0 ( 565125 ) writes:
        
        Maybe true on our computers, but not on supercomputers.
        
        Re:What about the compiler? (Score:3, Informative)
        
        by Tough Love ( 215404 ) writes:
        
        A programmer hour is much more valuable than a machine hour
        
        You forgot to take into account the team of scientists waiting for the machine to produce a result.
        
        Re:What about the compiler? (Score:2)
        
        by plalonde2 ( 527372 ) writes:
        
        Oddly, on the Cell, most of the optimization is low-level algorithmic stuff. Yes, assembly gets you that last little boost, but most of the Cell optimizations I've worked with (for the last 15 months or so) have been data movement and data decomposition exercises. Breaking your data into SPU-sized chunks, or into SPU-streamable chunks is the hard part. It's also the part compilers are *useless* for.
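
To make the data decomposition step concrete, here is a minimal sketch of splitting a buffer into fixed-size chunks that each fit a per-chunk memory budget. It is plain Python rather than SPU code. The helper name `spu_chunks`, the 4-byte element size, and the 16 KB budget are all assumptions for illustration, not figures from the comment.

```python
def spu_chunks(data, element_bytes, budget_bytes):
    """Yield consecutive slices of `data` that each fit in `budget_bytes`."""
    per_chunk = budget_bytes // element_bytes
    if per_chunk < 1:
        raise ValueError("budget is smaller than one element")
    for start in range(0, len(data), per_chunk):
        yield data[start:start + per_chunk]

# Hypothetical numbers: 4-byte floats and a 16 KB per-chunk budget.
samples = [float(i) for i in range(10_000)]
chunks = list(spu_chunks(samples, element_bytes=4, budget_bytes=16 * 1024))

# Each chunk is processed independently; here the "kernel" is just a sum.
partial_sums = [sum(chunk) for chunk in chunks]
total = sum(partial_sums)
```

The part a compiler cannot do for you is choosing a decomposition where each chunk can be processed without touching the others. A sum is the easy case, and real kernels usually need overlap or halo regions between chunks.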
  - - Re:What about the compiler? (Score:2)
      
      by Shinobi ( 19308 ) writes:
      
      Well, that's where you're wrong. There are more people who hand-optimize than the academic world cares to admit, since admitting it would also mean admitting that the oh-so-sacred academic practices as well as compiler technology+libraries has some areas where they can't be applied efficiently.
      - Re:What about the compiler? (Score:2)
        
        by try_anything ( 880404 ) writes:
        
        I think you're mixing up CS theorists with the scientists and engineers who just want to crunch a bunch of numbers and get the answers. You'd like the latter group; they write horrible code and aren't ashamed of it.
        You know the saying, "You can write Fortran in any language?" Scientists judge a language by how easy it is to write Fortran in it. That's why C is their second-favorite language.
- Re:What about the compiler? (Score:3, Interesting)
  
  by suv4x4 ( 956391 ) writes:
  
  The paper did a lot of hand-optimization, which is irrelevant to most programmers. What gcc -O3 does is way more important than what an assembly wizard can do for most projects.
  
  Actually bullshit. We're talking scientific applications here, and it's not uncommon that programs written to run on supercomputers *are* optimized by an assembly wizard to squeeze every cycle out of it.
  - Re:What about the compiler? (Score:1)
    
    by Watson Ladd ( 955755 ) writes:
    
    Most projects, not some superexpensive code. Sure, fast APIs like BLAS will use hand-written assembler, but it takes a compiler to find those optimizations that are too complex to do by hand or hard to find while being easy to do. And the assembler advantage is negative on some RISC processors now due to advances in compiler design. So gcc -O3 might outperform asm, so then gcc -O3 is relevant as nobody will want to use asm as gcc can outperform it. But I haven't seen anything about how true this is for the C
    - Re:What about the compiler? (Score:2)
      
      by netwiz ( 33291 ) writes:
      
      it takes a compiler to find those optimizations that are too complex to do by hand or hard to find while being easy to do.
      
      Say what? Um, that type of optimization doesn't exist, unless the programmer is really untalented. Most of the big opportunities should stand out like a sore thumb on a trace. Once you know what's taking all the time in the code, you can look at the way it's put together to catch the low-hanging fruit. Generally, the first 10% of the work gets you 90% of the way there. Then there's
  - Re:What about the compiler? (Score:4, Informative)
    
    by adam31 ( 817930 ) writes: <adam31.gmail@com> on Sunday May 28, 2006 @12:42PM (#15420807)
    
    Actually bullshit.
    Actually, it's not bullshit. Simple C intrinsics code is the way to go to program the Cell... there's just no need for hand-optimized asm. Intrinsics has a poor rep on x86 because SSE sucks. 8 registers. A source operand must be modified on each instr, no MADD, MSUB, etc.
    But Cell has 128 registers and a full set of vector instructions. There's no danger of stack spills. As long as the compiler doesn't freak out about aliasing (which is easy), and it can inline everything, and you present it enough independent execution streams at once... the SPE compiler writes really, really nice code.
    The thing that does need to be hand-optimized still is the memory transfer. DMA can be overlapped with execution, but it has to be done explicitly. In fact, algorithms typically need to be designed from the start so that accesses are predictable and coherent and fit within ~180 KB. (Generally, someone seeking performance would do this step long before asm code on any platform anyway...)
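
The explicit overlap of transfer and execution described above is usually done with double buffering. The following is a rough sketch of the control flow only, in plain Python: a single worker thread stands in for the DMA engine, and `fetch`, `process`, and `double_buffered_sum` are hypothetical names, not part of any Cell SDK. Python threads will not give real overlap for this workload; the point is the ordering of "start the next transfer, then compute on the current block".

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(source, index, size):
    """Stand-in for a DMA 'get': copy one block out of main memory."""
    return list(source[index * size:(index + 1) * size])

def process(block):
    """Stand-in for the compute kernel that runs on the fetched block."""
    return sum(x * x for x in block)

def double_buffered_sum(source, size):
    n_blocks = (len(source) + size - 1) // size
    total = 0
    with ThreadPoolExecutor(max_workers=1) as dma:
        pending = dma.submit(fetch, source, 0, size)  # prime the first buffer
        for i in range(n_blocks):
            current = pending.result()                # wait for block i
            if i + 1 < n_blocks:
                # Start fetching block i+1 before computing on block i.
                pending = dma.submit(fetch, source, i + 1, size)
            total += process(current)
    return total

data = list(range(1000))
result = double_buffered_sum(data, size=128)
```

Only two blocks are alive at any time (the one being computed on and the one in flight), which is what lets the working set stay inside a small fixed budget.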
    
- Re:What about the compiler? (Score:1)
  
  by maximthemagnificent ( 847709 ) writes:
  
  Hard science is exactly the sort of application that would employ an assembly programmer to optimize code.
- Re:What about the compiler? (Score:2, Informative)
  
  by Anonymous Coward writes:
  
  Insightful? Ah... no.
  
  Scientific users code to the bleeding edge. You give them hardware that blows their hair back and they will figure out how to use it. You give them crappy painful hardware (Maspar, CM*) that is hard to optimize for, then they probably won't use it.
  
  Assembly language optimization is not a big deal. Right now the biggest thing bugging me is that I have to rewrite a core portion of a code to use SSE, since SSE is so limited for integer support. As this is a small amount of work, and th
  - Re:What about the compiler? (Score:1)
    
    by Gromius ( 677157 ) writes:
    
    I'm a particle physicist. Our computing needs are insane but massively parallel, basically the grid is being developed for us and us alone although we figure that some other people might find a use for it. We spend the vast majority of our day to day job programming. And we're, with only a few exceptions, piss poor at it. Forget hand optimized assembly, I'm currently fighting a losing battle to stop people using x = pow(y,2) (and I have found that in our base software package, one supposedly written by the
    - Re:What about the compiler? (Score:2)
      
      by statusbar ( 314703 ) writes:
      
      Jeez, that reminds me of the "Database Specialists" doing "SELECT * from mytable;" and then doing a java for() loop to find the rows they are interested in.. Then they complain about the database machine being too slow so they get it upgraded.
      
      How much do these new machines cost?
      
      How much does a competent programmer cost?
      
      Which one is the best option?
      
      --jeffk++
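
For anyone who has not seen the anti-pattern in the comment above, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The table name `mytable` comes from the comment; the `energy` column, the threshold, and the row count are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, energy REAL)")
conn.executemany(
    "INSERT INTO mytable (energy) VALUES (?)",
    [(float(i),) for i in range(1000)],
)

# Anti-pattern: pull every row, then filter in a client-side loop.
client_side = [
    row
    for row in conn.execute("SELECT * FROM mytable ORDER BY id")
    if row[1] > 990.0
]

# Let the database filter: only matching rows are returned to the client.
server_side = conn.execute(
    "SELECT * FROM mytable WHERE energy > ? ORDER BY id", (990.0,)
).fetchall()
```

Both give the same rows, but the first version transfers the whole table to find a handful of matches. On a real server with an index on the filtered column the second also avoids the full scan.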
- Re:What about the compiler? (Score:5, Insightful)
  
  by samkass ( 174571 ) writes: on Sunday May 28, 2006 @10:08AM (#15420280) Homepage Journal
  
  What seems to be more important than that is:
  
  "According to the authors, the current implementation of Cell is most often noted for its extremely high performance single-precision (32-bit) floating performance, but the majority of scientific applications require double precision (64-bit). Although Cell's peak double precision performance is still impressive relative to its commodity peers (eight SPEs at 3.2GHz = 14.6 Gflop/s), the group quantified how modest hardware changes, which they named Cell+, could improve double precision performance."
  
  So the Cell is great because there's going to be millions of them sold in PS3's so they'll be cheap. But it's only really great if a new custom variant is built. Sounds kind of contradictory.
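
The 14.6 Gflop/s figure in the quote can be reproduced with a little arithmetic. This is a sketch under stated assumptions: a 4-wide single precision fused multiply-add every cycle per SPE, and a 2-wide double precision fused multiply-add that can only issue once every 7 cycles. Those issue rates are my reading of how the number comes about, not something stated in the quoted text.

```python
CLOCK_GHZ = 3.2
SPES = 8

# Single precision (assumption): 4-wide SIMD fused multiply-add,
# 2 flops per lane, one instruction per cycle.
sp_flops_per_cycle = 4 * 2

# Double precision (assumption): 2-wide SIMD fused multiply-add,
# but only one such instruction can issue every 7 cycles.
dp_flops_per_cycle = (2 * 2) / 7

sp_peak = SPES * CLOCK_GHZ * sp_flops_per_cycle   # Gflop/s
dp_peak = SPES * CLOCK_GHZ * dp_flops_per_cycle   # Gflop/s

print(round(sp_peak, 1))            # 204.8
print(round(dp_peak, 1))            # 14.6
print(round(sp_peak / dp_peak, 1))  # 14.0
```

Under those assumptions the double precision peak is a factor of 14 below single precision, which is the gap the proposed Cell+ changes are aimed at.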
  
  - Re:What about the compiler? (Score:3, Informative)
    
    by FromWithin ( 627720 ) writes:
    
    So the Cell is great because there's going to be millions of them sold in PS3's so they'll be cheap. But it's only really great if a new custom variant is built. Sounds kind of contradictory.
    
    Did you not read the last bit?
    On average, Cell is eight times faster and at least eight times more power efficient than current Opteron and Itanium processors, despite the fact that Cell's peak double precision performance is fourteen times slower than its peak single precision performance. If Cell were to include
  - Re:What about the compiler? (Score:2, Interesting)
    
    by cfan ( 599825 ) writes:
    
    >So the Cell is great because there's going to be millions of them sold in PS3's so they'll be cheap. But it's only really great if a new custom variant is built. Sounds kind of contradictory.
    
    No, the Cell is great because, as the pdf shows, it has an incredible Gflops/Power ratio, even in its current configuration.
    
    For example, here are the Gflops (double precision) obtained in 2d FFT:
    
    Size   Cell+   Cell   X1E    AMD64   IA64
    1K^2   15.9    6.6    6.99   1.19    0.52
    2K^2   26.5    6.7    7.10
  - Re:What about the compiler? (Score:1)
    
    by Angstroman ( 747480 ) writes:
    
    So the Cell is great because there's going to be millions of them sold in PS3's so they'll be cheap. But it's only really great if a new custom variant is built. Sounds kind of contradictory.
    
    The HPC world is substantially different from either gaming or "normal" application programming. The strong draw of the cell is that it is a production core with characteristics that are important to High Performance Computing, particularly power dissipation per flop. While conventional applications target getting th
  - Re:What about the compiler? (Score:2)
    
    by fbg111 ( 529550 ) writes:
    
    It's a good start, a good platform upon which to expand. I would bet IBM would be willing to make Cell+, given their traditional involvement in scientific computing. But you left a key part out of your quote, that even in its current form, Cell appears to be 8x faster and more power efficient than current Opterons and Itaniums in double-precision calculations. Doubling that by making a few modifications to the silicon is probably not out of the question, though whether this would allow Cell+ the price re
  - Re:What about the compiler? (Score:2)
    
    by TopSpin ( 753 ) * writes:
    
    But it's only really great if a new custom variant is built.
    
    Cell had a specific problem domain to address during the design of the initial product. If Cell really is all that, there will be future revisions. These researchers are pointing out what is necessary to make Cell more viable to a broader base of users. They are putting themselves at the head of the line.
    
    They have evaluated the existing Cell, added their guesswork as to what could be done with modest changes and quantified the result relative to
- Re:What about the compiler? (Score:5, Interesting)
  
  by john.r.strohm ( 586791 ) writes: on Sunday May 28, 2006 @11:24AM (#15420522)
  
  Irrelevant to most C/C++ code wallahs doing yet another Web app, perhaps.
  
  Irrelevant to people doing serious high-performance computing, not hardly.
  
  I am currently doing embedded audio digital signal processing. On one of the algorithms I am doing, even with maximum optimization for speed, the C/C++ compiler generated about 12 instructions per data point, where I, an experienced assembly language programmer (although having no previous experience with this particular processor) did it in 4 instructions per point. That's a factor of 3 speedup for that algorithm. Considering that we are still running at high CPU utilization (pushing 90%), and taking into account the fact that we can't go to a faster processor because we can't handle the additional heat dissipation in this system, I'll take it.
  
  I have another algorithm in this system. Written in C, it is taking about 13% of my timeline. I am seriously considering an assembly language rewrite, to see if I can improve that. The C implementation as it stands is correct, straightforward, and clean, but the compiler can only do so much.
  
  In a previous incarnation, I was doing real-time video image processing on a TI 320C80. We were typically processing 256x256 frames at 60 Hz. That's a little under four million pixels per second. The C compiler for that beast was HOPELESS as far as generating optimal code for the image processing kernels. It was hand-tuned assembly language or nothing. (And yes, that experience was absolutely priceless when I landed on my current job.)
  
  - Re:What about the compiler? (Score:4, Informative)
    
    by adam31 ( 817930 ) writes: <adam31.gmail@com> on Sunday May 28, 2006 @04:10PM (#15421537)
    
    I am also an experienced assembly programmer, and I too shared your mistrust of the compiler. However, I started SPE programming several months ago and I promise you that the compiler can work magic with intrinsics now. Knowledge of assembly is still helpful, because you need to have in mind what you want the compiler to generate... make sure it sees enough independent execution clumps that it can cover latencies and fill both the integer pipe and FP pipe, understand SoA vs AoS, etc. But you get to write with real variable names, not worry about scheduling/pairing of individual instructions or loop unrolling issues.
    Some of my best VU routines that I spent a couple weeks hand-optimizing, I re-wrote with SPE intrinsics in an afternoon. After some initial time figuring out exactly how the compiler likes to see things, it was a total breeze. My VU code ran in 700 usec while my SPE code ran in 30 usec (@ ~1.3 IPC! Good work, compiler).
    The real worry now is becoming DMA-bound. For example, assume you're running all 8 SPEs full-bore, and you write as much data as you read. At 25.6 GB/s, you get 3.2 GB/s per SPE, so 1.6 GB/s in each direction (assuming perfect bus utilization), so @3.2 GHz, that's 0.5 bytes/cycle. So, for a 16-byte register, you need to execute 32 instructions minimum or you're DMA-bound!
    Food for thought.
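
The DMA-bound estimate above, spelled out step by step. All inputs are the figures from the comment; the only thing added is the assumption that "32 instructions" means 32 cycles at roughly one instruction per cycle.

```python
BUS_GB_PER_S = 25.6   # aggregate memory bandwidth quoted in the comment
SPES = 8
CLOCK_GHZ = 3.2
REGISTER_BYTES = 16

per_spe = BUS_GB_PER_S / SPES                 # GB/s available to each SPE
per_direction = per_spe / 2                   # reads and writes split evenly
bytes_per_cycle = per_direction / CLOCK_GHZ   # bytes arriving per clock cycle
cycles_per_register = REGISTER_BYTES / bytes_per_cycle

print(per_spe, per_direction, bytes_per_cycle, cycles_per_register)
```

So each 16-byte register's worth of data takes about 32 cycles to stream in, and a kernel that does less work than that per register is waiting on memory rather than on the ALUs.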
    
- No, this is why we have subroutine libraries (Score:5, Interesting)
  
  by golodh ( 893453 ) writes: on Sunday May 28, 2006 @11:26AM (#15420527)
  
  Although I agree with your point that crafting optimised assembly language routines is way beyond most users (and indeed a waste of time for all but an expert) there are certain "standard operations" that
  (a) lend themselves extremely well to optimisation
  (b) lend themselves extremely well to incorporation in subroutine libraries
  (c) tend to isolate the most compute-intensive low-level operations used in scientific computation
  SGEMM
  If you read the article, you will find (among others) a reference to an operation called "SGEMM". This stands for Single precision General Matrix Multiplication. This is the sort of routine that makes up the BLAS library (Basic Linear Algebra Subprograms; see e.g. http://www.netlib.org/blas/). High performance computation typically starts with creating optimised implementations of the BLAS routines (if necessary handcoded at assembler level), sparse-matrix equivalents of them, Fast Fourier routines, and the LAPACK library.
  ATLAS
  There is a general movement away from optimised assembly language coding for the BLAS, as embodied in the ATLAS software package (Automatically Tuned Linear Algebra Software; see e.g. http://math-atlas.sourceforge.net/). The ATLAS package provides the BLAS routines but produces fairly optimal code on any machine using nothing but ordinary compilers. How? If you run a makefile for the ATLAS package, it may take about 12 hours or so to compile (depending on your computer of course; this is a typical number for a PC). In this time the makefile will simply run through multiple compiler switches for the BLAS routines and run test suites for all its routines for varying problem sizes. Then it picks the best combination of switches for each routine and each problem size for the machine architecture on which it's being run. In particular it takes account of the size of caches. That's why it produces much faster subroutine libraries than those produced by simply compiling e.g. the BLAS routines with an -O3 optimisation switch thrown in.
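
The "try the variants, time them, keep the fastest" idea described above can be sketched in a few lines. This is not ATLAS's actual search, which covers far more parameters. It tunes a single knob, the tile size of a blocked matrix multiply, in pure Python, and the names `blocked_matmul` and `autotune` and the candidate sizes are made up for illustration.

```python
import time

def blocked_matmul(a, b, n, block):
    """Multiply two n x n matrices (lists of lists) using square tiles."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + block, n)):
                        aik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + block, n)):
                            row_c[j] += aik * row_b[j]
    return c

def autotune(n, candidates, repeats=3):
    """Time each candidate tile size on this machine and return the fastest."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            blocked_matmul(a, b, n, block)
            best = min(best, time.perf_counter() - start)
        timings[block] = best
    return min(timings, key=timings.get), timings

best_block, timings = autotune(n=32, candidates=[4, 8, 16, 32])
print(best_block, timings)
```

The winner depends on the machine it runs on, which is the whole point: the tuned value is measured rather than predicted, so the same source adapts to different cache sizes.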
  Specially tuned versus automatic?: MATLAB
  The question is of course: who wins? Specially tuned code or automatic optimisation? This can be illustrated with the example of the well-known MATLAB package. Perhaps you have used MATLAB on PCs, and wondered why its matrix and vector operations are so fast? That's because for Intel and AMD processors it uses a special vendor-optimised subroutine library (see http://www.mathworks.com/access/helpdesk/help/techdoc/rn/r14sp1_v7_0_1_math.html). For SUN machines, it uses SUN's optimised subroutine library. For other processors (for which there are no optimised libraries) MATLAB uses the ATLAS routines. Despite the great progress and portability that the ATLAS library provides, carefully optimised libraries can still beat it (see the Intel Math Kernel Library at http://www.intel.com/cd/software/products/asmo-na/eng/266858.htm).
  Summary
  In summary:
  -large tracts of Scientific computation depend on optimised subroutine libraries
  -hand-crafted assembly-language optimisation can still outperform machine-optimised code.
  Therefore the objections that the hand-crafted routines described in the article distort the comparison or are not representative of real-world performance are invalid.
  However ... it's so expensive and difficult that you only ever want to do it if you absolutely must. For scientific computation this typically means that you only consider "inner loop primitives" such as the BLAS routines, FFTs, SPARSEPACK routines etc. for this treatment, and that you just don't attempt to do that yourself.
- Re:What about the compiler? (Score:2)
  
  by Frumious Wombat ( 845680 ) writes:
  
  Actually, for my field (Chemistry), what GCC -O3 does is irrelevant, except during the development phase of a program, or as a last resort for portability. We care about what the fastest native compiler we can find + optimized libraries does. The Cell will be no different; a few hand-optimized routines such as BLAS, FFTPack, etc, in libraries, then an auto-vectorizing Fortran-95 compiler on top. I will be interested in seeing how packages such as GAMESS or NWChem http://www.emsl.pnl.gov/docs/nwchem/nwche [pnl.gov]
- Re:What about the compiler? (Score:2)
  
  by Wolfier ( 94144 ) writes:
  
  What academics do with a new technology by hand is often what makes things you use daily, like -O3, possible.
  Sometimes people DO use published research results to construct compilers.
  
  -O3 is more important when the optimization is just a means to an end - however, when optimization is an end in itself, it's easy to see the value of disciplined hand tuning.
Doesn't it easily scale up? (Score:2, Interesting)

by Poromenos1 ( 830658 ) writes:

Doesn't the Cell's design mean that it can very easily scale up, without requiring any changes in the software? Just add more computing CPUs (SPEs they are called, I think?) and the Cell runs faster without changing your software.

I'm not entirely sure of this, can someone corroborate/disprove?
- Re:Doesn't it easily scale up? (Score:2)
  
  by owlstead ( 636356 ) writes:
  
  Yes, if there isn't any communication overhead between the processors. If you have 100 separate threads or processes, without (or almost without) any communication, then the application is perfect for multiple CPUs. If there is a lot of communication needed, then much less so. You cannot write an application for 8 cores with very fast communications and expect it to run on multiple processors without any modifications. That's why many parallel processor designs cost more for the networking part than for the
  - Hmm (Score:2)
    
    by Poromenos1 ( 830658 ) writes:
    
    Yes, but the Cell is designed to process data in independent packages which are scheduled and sent to processors by the central unit, it's not a traditional multiprocessor system. Hmm, I guess that from the specs the processors could be communicating via the network instead of just buses as well, which would make what you say correct. I guess we should wait and see.
    - 'designed', nothing (Score:2)
      
      by Szplug ( 8771 ) writes:
      
      All MP machines have: communication channels, and processors. If the designers envisioned it being used a certain way and optimized it for that, well, what of it? Maybe that's how the standard game API does things but, it's still processors and communication channels. It's more than likely you can get better performance out of it by adapting your problem for it specifically, minimizing communication and keeping processors busy as much as the problem allows, same as for all other MP systems.
    - Re:Hmm (Score:2)
      
      by owlstead ( 636356 ) writes:
      
      The cell architecture makes it easy to distribute workloads, that's true. But that's just the beginning of solving the parallel puzzle. The trick is to spread the workload in such a way that the communication overhead is minimal. Otherwise, it may be wiser to use a different architecture. My guess is that the cell processor is interesting to grid computing, but needs a serious platform, both hardware and software-wise to be viable for the more serious work. On the other hand, IBM should be big enough to han
  - Re:Doesn't it easily scale up? (Score:2)
    
    by jacksonj04 ( 800021 ) writes:
    
    It should be best suited to things needing concurrent, but not parallel processing. For example you could be running several simulations at once, none of which are interdependent. When one is done, the processor can be handed another instruction without needing to wait for the results from everything else.
    
    The code will be the tricky bit.
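
A toy sketch of the "independent simulations" case described above: a pool of workers, each picking up the next job as soon as it finishes its current one, with no job depending on another. It uses Python's standard `concurrent.futures`; the `simulate` function, the seed range, and the worker count of eight (loosely mirroring eight SPEs) are all invented for illustration and say nothing about how Cell software actually schedules work.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(seed, steps=1000):
    """Toy stand-in for one independent simulation: a linear congruential walk."""
    state = seed
    for _ in range(steps):
        state = (1103515245 * state + 12345) % 2**31
    return state

seeds = range(16)

# No task depends on another, so a worker takes the next seed
# as soon as it finishes its current one.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(simulate, seeds))
```

This is the easy end of the spectrum. As soon as the simulations need to exchange data mid-run, the communication cost discussed earlier in the thread starts to dominate.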
Not likely to be low cost CPUs (Score:2)

by maraist ( 68387 ) * writes:

An interesting point is that most consoles sell their hardware at a loss. At least the XBox does. This means that there is no guarantee that IBM is willing to sell their CPUs at the same price that one would believe they cost for the PS3.

Moreover, the scientific community is very likely to push their Cell+ architecture and I'm sure IBM would be more than happy to help... For a massive price.

So, when building an HPC system, you're likely to work around the best architecture (the more expensive cell+), and
- WTF? (Score:5, Insightful)
  
  by SmallFurryCreature ( 593017 ) writes: on Sunday May 28, 2006 @08:57AM (#15420057) Journal
  
  First off you are talking about consoles being sold at a loss. NOT their components.
  IF IBM was the maker of the chip they would most certainly not sell them at a loss. Why should they? Sony might sell the console at a loss to recoup the loss from game sales but IBM has no way to recoup any losses.
  Then again, IBM is in a partnership with Sony and Toshiba, so the chip is probably owned by this partnership and Sony will just be making the chips it needs itself.
  So any idea that IBM is selling Cells at a loss is insane.
  Then the cost of the PS3 is mostly claimed to be in the Blu-ray drive tech. Not going to be of much interest to a science setup, is it? Even if they want to use a Blu-ray drive they need just 1 in a 1000 cell rig. Not going to break the bank.
  No, the cell will be cheap because when you run an order of millions of identical CPUs, prices drop rapidly. There might even be a very real market for cheap cells. Regular CPUs always have lesser quality versions. Not a problem for an Intel or AMD, who just badge them Celeron or whatever, but you can't do that with a console processor. All cell processors destined for the PS3 must be of similar spec.
  So what to do with a cell chip that has one of the cores defective? Throw it away OR rebadge it and sell it for blade servers? That is where Celerons come from (defective cache).
  We already know that the cell processor is going to be sold for other purposes than the PS3. IBM has a line of blade servers coming up that will use the cell.
  No, I am afraid that it will be perfectly possible to buy Cells and they will be sold at a profit just like any other CPU. Nothing special about it. They will however benefit greatly from the fact that they already have a large customer lined up. Regular CPUs need to recover their costs as quickly as possible because their success will be uncertain. This is why regular top end CPUs are so fucking expensive. But the Cell already has an order for millions, meaning the costs can be spread out in advance over all those units.
  
  - Re:WTF? (Score:4, Insightful)
    
    by Kjella ( 173770 ) writes: on Sunday May 28, 2006 @10:03AM (#15420261) Homepage
    
    So what to do with a cell chip that has one of the cores defective? Throw it away OR rebadge it and sell it for blade servers?
    
    Use it. Seriously, that's why there's central + 7 of them, not 8. One is actually a spare so that unless it's either flawed in the central logic or two separate cores, the chip is still good. Good way to keep the yields up...
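    A rough way to see why a spare helps, as a Python sketch. It assumes, purely for illustration, that each SPE is independently good with probability `p_good`, and it ignores defects in the central logic; the 90% figure is made up and is not real yield data.

    ```python
    from math import comb

    def chip_yield(p_good, total_spes=8, needed=7):
        # Probability that at least `needed` of `total_spes` SPEs work,
        # treating SPE defects as independent (an assumption, not fab data).
        return sum(
            comb(total_spes, k) * p_good**k * (1 - p_good)**(total_spes - k)
            for k in range(needed, total_spes + 1)
        )

    # Hypothetical 90% per-SPE success rate.
    all_eight = chip_yield(0.9, needed=8)   # every SPE must work
    seven_of_eight = chip_yield(0.9)        # one defective SPE tolerated
    print(f"need 8 of 8: {all_eight:.3f}, need 7 of 8: {seven_of_eight:.3f}")
    ```

    Under those assumptions, tolerating one bad SPE lifts the usable fraction from roughly 43% to roughly 81%.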
    
  - Re:WTF? (Score:2)
    
    by epiphani ( 254981 ) writes:
    
    So what to do with a cell chip that has one of the cores defective? Throw it away OR rebadge it and sell it for blade servers? That is where celerons come from (defective cache)
    
    Actually, the cell has 8 SPU's on die. It only utilizes seven, specifically to handle the possibility of defective units. They throw the extra SPU on there to increase yields.
    - Re:WTF? (Score:2)
      
      by jericho4.0 ( 565125 ) writes:
      
      The Cell in the PS3 has 7 SPEs. The Cell as used in other places will likely have the full 8 available.
- Re:Not likely to be low cost CPUs (Score:2)
  
  by Oswald ( 235719 ) writes:
  
  Doesn't sound right. IBM isn't taking a loss on PS3 hardware. If anybody is, it's Sony, and they would be subsidizing the volume that would allow IBM to sell the chip (relatively) cheaply.
- Re:Not likely to be low cost CPUs (Score:2)
  
  by WindBourne ( 631190 ) writes:
  
  I just don't believe this "low cost" "high volume" statement.
  
  If not, then you are about the only one. Simply look at the top500.org to see what low cost, high volume produces. My bet is that IBM is using sony to get to high volume rather quickly. After that point, they will start using this in a number of their own systems. And you can bet that this will form the foundation of a very very fast parallel arch for top500. I also expect to see it upgraded to cell+ quickly.
  - Re:Not likely to be low cost CPUs (Score:2)
    
    by maraist ( 68387 ) * writes:
    
    I just don't believe this "low cost" "high volume" statement. If not, then you are about the only one.
    
    Well, I'm just saying, I wouldn't bet money on IBM coming out with their cell in a high volume enough way to provide ultra-low pricing as in the PowerPC or obviously x86 markets. History has shown time and time again, that innovation is not what is important, dominance is. Alpha had a superb chip but was in no way marketable. Apple has always had a better design in computer hardware, but will likely nev
Lattice QCD people: (Score:2)

by ettlz ( 639203 ) writes:

Isn't Cell similar to things like QCDOC (from what my LQCD colleagues tell me, it's based on PowerPC, but are there similarities in the wider architecture, interconnects, etc.)? Have any plans to use it here?
- Re:Lattice QCD people: (Score:1)
  
  by Watson Ladd ( 955755 ) writes:
  
  A little bit. The big difference is the Cell has SPE's which are like DSP's on the chip which are controlled by a PPC processor. QCDOC is a lot of PPC processors connected similarly. Also, memory is symmetric on QCDOC, while it is asymmetric on the Cell. The similarity is mostly in the kind of bus used. Think about one QCDOC node connected to seven QCDSP nodes and only the QCDOC node having a lot of memory and you will have the right idea. Ars Technica had a good review of the Cell.
- Re:Lattice QCD people: (Score:2)
  
  by Quiberon ( 633716 ) writes:
  
  QCDOC people ended up making BlueGene
Not the real issue (Score:1, Offtopic)

by argoff ( 142580 ) writes:

The real issue here has nothing to do with the performance and capabilities of the cell processor. The real issue is, can I make a copy, contract out my own fab, and make it without anyone else's permission. If I can, then it will be successful; if I can't, then it is just another proprietary technology that won't give the end user any real advantage over the long term - and thus no real reason to switch from more commoditized technologies.
When can we start Folding with it? (Score:1)

by BartonOC ( 977544 ) writes:

Sounds like this cpu would end up having great folding performance. I so hope the PS3 ends up being hackable and we get to throw Linux on it ;-)
- Re:When can we start Folding with it? (Score:1)
  
  by ahodes1 ( 880242 ) writes:
  
  Linux will be pre-installed on the PS3 HDD, no hacking needed: http://www.gamasutra.com/php-bin/news_index.php?story=9290 [gamasutra.com]
    - Re:When can we start Folding with it? (Score:1)
      
      by Xymor ( 943922 ) writes:
      
      E3: Kawanishi Talks Homebrew Linux PS3 Development [gamasutra.com]. There's also some talk of indie game development; just google PS3 + Linux
The ball is in the hands of developpers. (Score:2, Insightful)

by stengah ( 853601 ) writes:

The fact is that most scientists use high-level software (MATLAB, Femlab, ...) to do their simulations. Although these scientists may be interested in any potential speed-up to their workflow, they are not willing to invest any of their time in translating their whole codebase to asm-optimized C. Thus, the ball is in the hands of software developers, not scientists.
- Re:The ball is in the hands of developpers. (Score:4, Informative)
  
  by infolib ( 618234 ) writes: on Sunday May 28, 2006 @11:09AM (#15420465)
  
  The fact is that most scientists use high-level software (MATLAB, Femlab, ...) to do their simulations.
  
  Indeed, most scientists. They also know very little about profiling but since the simulation is used only maybe a hundred times that hardly matters.
  
  The cases we're talking about here are where thousands of processors grind the same program (or evolved versions of it) for years as the terabytes of data roll in. Such is the situation in weather modelling, high energy physics and several other disciplines. That's not a "program" in the usual sense, but rather a "research program" occupying a whole department including everyone from "domain-knowledge" scientists down to some very long haired programmers who will not shy away from a bit of ASM. If you're a developer good at optimization and parallelism there might just be a job for you.
  
- Re:The ball is in the hands of developpers. (Score:2)
  
  by Surt ( 22457 ) writes:
  
  In the article they mentioned that they had ported several scientific kernels to cell, so presumably the porting work isn't going to be the core of the challenge. It sounds like the real work to be done will be convincing sony to make modifications to the next generation of cell processors to improve the double precision performance.
- Femlab? (Score:2)
  
  by colinrichardday ( 768814 ) writes:
  
  Did you mean Fermilab, or am I not keeping up with scientific progress? :-)
- Re:The ball is in the hands of developpers. (Score:2)
  
  by ceoyoyo ( 59147 ) writes:
  
  Those scientists are NOT high performance computing scientists.
  
  I do a bit of HPC. I wouldn't touch Matlab with a ten foot pole. Of course, I wouldn't touch Matlab with a ten foot pole for non-HPC stuff either.
Ease of Programming? (Score:3, Interesting)

by MOBE2001 ( 263700 ) writes: on Sunday May 28, 2006 @09:56AM (#15420235) Homepage Journal

FTA: While their current analysis uses hand-optimized code on a set of small scientific kernels, the results are striking. On average, Cell is eight times faster and at least eight times more power efficient than current Opteron and Itanium processors,

The Cell processor may be faster but how easy is it to implement an optimizing development system that eliminates the need to hand-optimize the code? Is not programming productivity just as important as performance? I suspect that the Cell's design is not as elegant (from a programmer's POV) as it could have been, only because it was not designed with an elegant software model in mind. I don't think it is a good idea to design a software model around a CPU. It is much wiser to design the CPU around an established model. In this vein, I don't see the cell as a truly revolutionary processor because, like every other processor in existence, it is optimized for the algorithmic software model. A truly innovative design would have embraced a non-algorithmic, reactive, synchronous model, thereby killing two birds with one stone: solving the current software reliability crisis while leaving other processors in the dust in terms of performance. One man's opinion.

- Re:Ease of Programming? (Score:2)
  
  by adam31 ( 817930 ) writes:
  
  I suspect that the Cell's design is not as elegant (from a programmer's POV) as it could have been, only because it was not designed with an elegant software model in mind.
  It's possible that this is the case, however IBM is actively working on compiler technology [ibm.com] to abstract the complexity of an unshared memory architecture from developers whose goal isn't to squeeze the processor:
  When compiling SPE code, the compiler identifies data references in system memory that have not been optimized by using ex
- Re:Ease of Programming? (Score:2)
  
  by Chris Snook ( 872473 ) writes:
  
  I suspect that the Cell's design is not as elegant (from a programmer's POV) as it could have been, only because it was not designed with an elegant software model in mind. I don't think it is a good idea to design a software model around a CPU. It is much wiser to design the CPU around an established model. In this vein, I don't see the cell as a truly revolutionary processor because, like every other processor in existence, it is optimized for the algorithmic software model. A truly innovative design woul
  - That's why F0rtran really doesn't matter here (Score:2)
    
    by billstewart ( 78916 ) writes:
    
    There's a lot of scientific programming that's complex, but a lot of it really involves doing lots of setup and transformation twiddling that hands big chunks of data to a standard package like a matrix multiplier or a Fourier Transformer or Linear Programmer etc. that really burns most of the CPU cycles. Or maybe you're doing graphics and it's a ray tracer / shader / lighter / etc., but you've still got one side of your program that's harder-to-parallelize complexity and another that's just raw standard
- Re:Ease of Programming? (Score:2)
  
  by zCyl ( 14362 ) writes:
  
  Is not programming productivity just as important as performance?
  
  When you're talking about scientific computations which can sometimes take a month or more to do one run, then suddenly it can become worth it to sacrifice a bit of programmer time if it can make a substantial increase in performance. If you can do a run in a week instead of a month, then that makes a huge difference in what you can investigate. Often it's not a question of just buying more machines because sometimes you need to know the ans
- Re:Ease of Programming? (Score:2)
  
  by jthill ( 303417 ) writes:
  
  how easy is it to implement an optimizing development system that eliminates the need to hand-optimize the code?
  
  Not much payoff optimizing development systems for slow hardware. Cray tout the X1E as offering "Unrivalled Vector Processing and Scalability for Extreme Performance" [cray.com]. These guys smoked one for dinner, woke up the next day, rebuilt their code from the ground up a completely different way and smoked it again for lunch.
  It took them a month to figure out how to do that, on maybe $3K worth
- Re:Ease of Programming? (Score:2)
  
  by Lars T. ( 470328 ) writes:
  
  The Cell processor may be faster but how easy is it to implement an optimizing development system that eliminates the need to hand-optimize the code? [...] I suspect that the Cell's design is not as elegant (from a programmer's POV) as it could have been, only because it was not designed with an elegant software model in mind.
  Hunh? From an (assembler) programmer's POV we have something close to AltiVec/VMX vs. x86 and EPIC - and you ask which is easier?
And why Apple going Intel was so sad (Score:1, Insightful)

by Anonymous Coward writes:

x86, the commodity, has registers from the days when RAM was faster than the CPU (ie 8-bit days)

The tacked on FPU, MMX, SSE SIMD stuff whilst welcome still leaves few registers for program use

The PowerPC on the other hand has a nice collection of regs, and as good if not better SIMD--The CELL goes a big step further

More regs = more variables in the CPU = higher bandwidth of calculation
be they regular regs or SIMD regs.
That plus the way it handles cache
Could be a pig to program without the right kind o
bang, buck, effort (Score:4, Informative)

by penguin-collective ( 932038 ) writes: on Sunday May 28, 2006 @10:35AM (#15420361)

Over the last several decades, there have been lots of parallel architectures, many significantly more innovative and powerful than Cell. If Cell succeeds, it's not because of any innovation, but because it contains fairly little innovation and therefore doesn't require people to change their code too much.

One thing that Cell has that previous processors didn't is that the PS3 tie-in and IBM's backing may convince people that it's going to be around for a while; most previous efforts suffered from the problem that nobody wanted to invest time in adapting their code to an architecture that was not going to be around in a few years anyway.

single threaded vs multithreaded (Score:1)

by abigsmurf ( 919188 ) writes:

I thought the Cell's performance was mediocre if you only had a single task going on at a time. Given that scientific simulations aren't real time, it doesn't need to be hugely multithreaded as it's better for each tick/frame/etc of the simulation to be done one after the other.
- Re:single threaded vs multithreaded (Score:2)
  
  by be-fan ( 61476 ) writes:
  
  1) Cell's performance is mediocre on typical single-threaded applications (eg: AI). Not because it has inherently bad single-threaded performance, but because most single-threaded code happens to be integer code, and the SPE's integer and branching performance sucks.
  
  2) Most simulations are highly parallel. There are lots of cases where you can simulate many parts of the system simultaneously, and only synchronize state at certain points.
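  
  The second point can be sketched with a toy example in Python. The 1D smoothing "simulation", the names `step_chunk` and `simulate`, and the four-way split are all hypothetical choices made for illustration; nothing here is Cell-specific.

  ```python
  from concurrent.futures import ThreadPoolExecutor

  def step_chunk(grid, start, stop):
      # Update cells [start, stop) from the previous state only, so chunks
      # never depend on each other's in-progress results.
      out = []
      for i in range(start, stop):
          left = grid[i - 1] if i > 0 else grid[i]
          right = grid[i + 1] if i < len(grid) - 1 else grid[i]
          out.append((left + grid[i] + right) / 3.0)
      return out

  def simulate(grid, steps, chunks=4):
      n = len(grid)
      bounds = [(c * n // chunks, (c + 1) * n // chunks) for c in range(chunks)]
      with ThreadPoolExecutor(max_workers=chunks) as pool:
          for _ in range(steps):
              # All chunks run independently within a time step...
              parts = pool.map(lambda b: step_chunk(grid, *b), bounds)
              # ...and collecting them is the synchronization point.
              grid = [x for part in parts for x in part]
      return grid
  ```

  Splitting the grid four ways gives the same answer as running it in one piece, because every chunk reads only the state left by the previous synchronization point.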
Ran simulations, not code (Score:5, Insightful)

by jmichaelg ( 148257 ) writes: on Sunday May 28, 2006 @11:41AM (#15420570) Journal

Lest anyone think they actually ran "several scientific application kernels" on the Cell/AMD/Intel chips, what they actually did was run simulations of several different tasks such as FFT and matrix multiplication. Since they didn't actually run the code, they had to guess as to some parameters like DMA overhead. They also came up with a couple of hypothetical Cell processors that dispatched double precision instructions differently than how the Cell actually does it and present those results as well. They also said that IBM ran some prototype hardware that came within 2% of their simulation results, though they didn't say which hypothetical Cell the prototype hardware was implementing.
By the end of the article, I was looking for their idea of a hypothetical best-case pony.

- Re:Ran simulations, not code (Score:2)
  
  by the_ed_dawg ( 596318 ) writes:
  
  Lest anyone think they actually ran "several scientific application kernels" on the Cell/AMD/Intel chips, what they actually did was run simulations of several different tasks such as FFT and matrix multiplication.
  
  Simulation makes computer architecture research possible because researchers don't have access to prototype hardware. If we insisted that all experiments run on real hardware, the only people who could possibly do research are Intel, AMD, and IBM because they have access to the fab and mask
  - Re:Ran simulations, not code (Score:3, Insightful)
    
    by Sycraft-fu ( 314770 ) writes:
    
    Hey it makes a real difference. There's a great quote that shows up on /. from time to time that goes along the lines of "The difference between theory and reality is that in theory there's no difference but in reality there is."
    
    Researchers are very good at simulating things that have little or nothing to do with reality. It all looks good in theory according to their formulas, but they fail to take something in to account. As an example take the defunct Elbrus E2K computer chip. It was supposed to be an aw
    - Re:Ran simulations, not code (Score:2)
      
      by adam31 ( 817930 ) writes:
      
      Sycraft-fu, I understand your skepticism, and I think it's unfortunate that they didn't publish physical timings. Your post has 3 main points: 1) Their simulations don't factor in something that will account for additional slow-down, 2) Their compilers aren't adapted, and that will contribute to slowdown. 3) Realistic improvements are incremental.
      1) The Cell is actually a pretty simple architecture. Once memory is transferred to SPE local store, performance is deterministic within a fraction of a %.
      - Re:Ran simulations, not code (Score:2)
        
        by Sycraft-fu ( 314770 ) writes:
        
        No I can't point out for sure the major bottleneck, I don't claim to be a chip engineer. However I can point out one that might not have been considered by the simulation: The registers. While tons o' registers sounds like nothing but a boon, you have to remember that on any system you are likely to see today, you are going to be running a multi-tasking OS. Well, that of course means every time the OS switches tasks, all the registers need to be saved, so the task can resume properly when it switches back. N
        
        Re:Ran simulations, not code (Score:2)
        
        by adam31 ( 817930 ) writes:
        
        The point about context switching is a good one. Not only do all the registers need to be saved, but the entire 256 KB of local store! That's a hugely non-trivial feat, but I think performance applications will be written to avoid context switches entirely.
        The RAM is XDR. The IOIF (to talk to other Cells) connection is 2 FlexIO ports. The bus itself (called the EIB [ibm.com]) is something like 300 GB/s. I agree that peak is never achievable, but it should be possible to get around 18 GB/s or so.
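        
        A back-of-the-envelope sketch in Python of what saving one local store might cost. It assumes the ~18 GB/s estimate applies to this traffic, takes 1 GB as 1e9 bytes, and ignores DMA setup cost and bus contention, so treat the result as an order-of-magnitude guess rather than a measurement.

        ```python
        LOCAL_STORE_BYTES = 256 * 1024   # 256 KB of SPE local store
        BANDWIDTH_BYTES_PER_S = 18e9     # assumed ~18 GB/s, with 1 GB = 1e9 bytes

        def local_store_copy_time(n_bytes=LOCAL_STORE_BYTES, bandwidth=BANDWIDTH_BYTES_PER_S):
            # Time to stream one local store out to (or in from) memory,
            # ignoring DMA setup cost and bus contention.
            return n_bytes / bandwidth

        one_way = local_store_copy_time()
        print(f"save: {one_way * 1e6:.1f} us, save + restore: {2 * one_way * 1e6:.1f} us")
        ```

        Under those assumptions a full save plus restore comes to roughly 29 microseconds per SPE.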
        
        Re:Ran simulations, not code (Score:2)
        
        by Sycraft-fu ( 314770 ) writes:
        
        The problem I see with the "let's just not context switch" idea is how do you do that, barring using the chip as a dedicated DSP? If you want to use it as a CPU, it's going to context switch. A lot. That's just how it works on a modern OS. If nothing else, the kernel wants to check on things periodically. I don't know how often most OSes reenter their kernel, but I'd bet it's multiple times per second. Then of course there's the hardware. Every time the hardware needs attention, which is again multiple times p
        
        Re:Ran simulations, not code (Score:2)
        
        by jthill ( 303417 ) writes:
        
        Full-system emulators are just that. They model bus contention and DRAM refresh and everything else. If anything at all shows up in the actual hardware that those emulators didn't predict, the engineers figure it out and fix it; they don't like not understanding the hardware they're building, and IBM aren't the only ones who've been doing things like this for a while now.
        The LBNL guys started with a simple model. Their model generally predicted performance within 2% of what the full emulator said. It
        
        Re:Ran simulations, not code (Score:2)
        
        by ivan256 ( 17499 ) * writes:
        
        However I can point out one that might not have been considered by the simulation: The registers. While tons o' registers sounds like nothing but a boon, you have to remember that on any system you are likely to see today, you are going to be running a multi-tasking OS. Well, that of course means every time the OS switches tasks, all the registers need to be saved, so the task can resume properly when it switches back. Not a big deal if you are saving the 30 or so registers most processors have. Gets to be a
        
        Re:Ran simulations, not code (Score:2)
        
        by Bert64 ( 520050 ) writes:
        
        Well, actually on high-end servers the memory will still be faster overall due to a number of things:
        
        Interleaving
        NUMA (one memory controller per cpu)
        Wider memory bus width
    - Re:Ran simulations, not code (Score:2)
      
      by egghat ( 73643 ) writes:
      
      Elbrus may have "failed" because market leader Intel chose to buy [xbitlabs.com] them.
      
      Bye egghat.
14 times slower vs 8 times faster (Score:2)

by Kell_pt ( 789485 ) writes:

On average, Cell is eight times faster and at least eight times more power efficient than current Opteron and Itanium processors, despite the fact that Cell's peak double precision performance is fourteen times slower than its peak single precision performance.
So, that means that the cell in its current design is 14/8 = 1.75x slower for double precision than an Opteron/Itanium is for single precision. I searched around but couldn't find a good answer on what is the ratio between an Opteron/Itanium s
- Re:14 times slower vs 8 times faster (Score:2)
  
  by be-fan ( 61476 ) writes:
  
  The Opteron/Itanium's SP/DP performance is about the same.
  
  And you misread the statement. It said that Cell was 8 times faster than Opteron in DP.
  - Re:14 times slower vs 8 times faster (Score:2)
    
    by Kell_pt ( 789485 ) writes:
    
    Aye, seems I misunderstood, thanks. That "despite" word in there makes a difference. :)
    Still, it would seem that Cell is 1.75x (14/8) slower for double precision, although on average it's 8x faster (which makes sense, because its single precision speed is enough to raise the average).
Benchmark (Score:1)

by roadrouter ( 953107 ) writes:

I don't understand how they can compare the new Cell with an amd64 or an Itanium and be so happy.
Cell has 8 vector processors and something like a PPC to "control" all of them; it's built specifically for FP operations. It's like comparing a GPU with a CPU, which doesn't make much sense.
Ignore everything important? (Score:3, Interesting)

by Duncan3 ( 10537 ) writes: on Sunday May 28, 2006 @01:26PM (#15420953) Homepage

I love how they manage to completely ignore all the other vector-type architectures already in the market, and just compare it to Intel/AMD which are not even designed for floating point performance.

Scream "my computer beats your abacus" all you want.

But then it is from Berkeley, so that's normal. ;)

- Re:Ignore everything important? (Score:2)
  
  by jthill ( 303417 ) writes:
  
  I have to wonder whether the poster, the modder or both are actively committing slashdot self-parody, because this is just screamingly funny.
not a fair comparison (Score:2, Insightful)

by MonaLisa ( 190059 ) writes:

The authors discuss hand tuning and assembler coding for Cell, but not necessarily for the other processors. Their 2D FFT results, for example, are a factor of 10 slower than others I have seen. Also, for the IA64 and Opteron, the performance of many of these numerical kernels is highly dependent on the compiler used. The IA64 especially is very sensitive to compiler optimization to keep the 6 pipeline slots busy and also generate memory prefetch instructions at the right time to prevent stalling. As often
- Re:Xbox 2 is a "commodity" (Score:1, Offtopic)
  
  by Adult film producer ( 866485 ) writes:
  
  word,
  
  John Carmack on PS3 vs 360 [youtube.com]
  
  Metal Gear 4 demo vid.. 8 or 9 mins long, very cool. [youtube.com]
  - Re:Xbox 2 is a "commodity" (Score:2)
    
    by PhotoBoy ( 684898 ) writes:
    
    Except neither of those links point to anything that proves the Cell is good for High Performance Computing which is the point of the article. This isn't anything to do with 360 vs PS3. If MS wanted to design a CPU that could be scaled up for HPC they would have done, instead they just got IBM to customise a PPC chip for their games console because their goal is dominance in the living room, not to become the next Intel.
    
    To be honest I question the validity of this study anyway, I seem to recall lots of pape
  - Re:Xbox 2 is a "commodity" (Score:1, Offtopic)
    
    by vertinox ( 846076 ) writes:
    
    Wow, if nothing else the MGS4 demo has left me jaw dropped. That is some friggin high poly count. I was kind of doubtful of the PS3 thinking it would be just a Xbox 360, but that video looked awesome.
    
    (Although, I dunno if it is still worth the price tag though)
    - Re:Xbox 2 is a "commodity" (Score:3, Informative)
      
      by Darkfred ( 245270 ) writes:
      
      Did Sony pay you or did Mr. Kutaragi come over to your house and type it for you?
      
      Have you seriously never seen anything like this before? As a professional ps2/360/ps3 developer I have to say that I was seriously underwhelmed by this demo. Every one of the effects has been used before. The original xbox has every effect he mentioned. And HL2 has a significantly more complex lighting system and postprocessing effects.
      The demo appears to be a single high-poly character in a texture mapped box. The demoer admi
      - Re:Xbox 2 is a "commodity" (Score:2)
        
        by Darkfred ( 245270 ) writes:
        
        I will try to clear up a little of your confusion.
        
        > You assumed that the MGS4 trailer was pre-rendered cutscene,
        > that obviously shows that you have little knowledge of the PlayStation
        > and MGS. MGS has NEVER used pre-rendered cutscenes.. blah blah blah
        
        I never said it was prerendered. You simply misunderstood the way these things work. In-game cut scenes use different models than the regular game. That is because the artists need more detailed control of the animations. They can be much more compl
- Re:Xbox 2 is a "commodity" (Score:3, Insightful)
  
  by MooUK ( 905450 ) writes:
  
  I think you misunderstand what HPC actually is.
  
  High performance computing is that which you'd want to throw a huge Beowulf cluster at, or possibly a supercomputer or twenty. Not three small pathetic cores.
  - Re:Xbox 2 is a "commodity" (Score:1)
    
    by Anonymous Coward writes:
    
    "We also conclude that Cell's heterogeneous multi-core implementation is inherently better suited to the HPC environment than homogeneous commodity multi-core processors."
    
    Whether or not HPC is something you'd want to throw 20 or more supercomputers at in a Beowulf cluster, at least you know that the PS3 is really the only next-generation video game system because nobody concerned with raw performance and power efficiency would want to use the Xbox 2 in a HPC environment.
    - Re:Xbox 2 is a "commodity" (Score:2)
      
      by KitesWorld ( 901626 ) writes:
      
      at least you know that the PS3 is really the only next-generation video game system because nobody concerned with raw performance and power efficiency would want to use the Xbox 2 in a HPC environment.
      
      Not quite. What they're saying is that the Cell is better suited to parallel applications, like physics simulations, and that it is more scalable - ie, easier to build supercomputers or distributed computing nodes from.
      
      However, that has no bearing upon what 'generation' the host console is - largely be
    - Re:PS3 will rule in 2008 (Score:1, Offtopic)
      
      by Lonewolf666 ( 259450 ) writes:
      
      The last part of the puzzle is how cheap 1080P TV's will get in the next 5 years. It isn't out of the question to hook up a keyboard, mouse and "cheap" 1080P LCD or Plasma TV to a PS3 and have a computer. This is a giant leap forward for consoles, and Sony's first attempt to bridge the gap between console, computer and DVR type of device.
      Whether this is worthwhile for users will depend a lot on how open the console is for third-party software. Usually consoles are designed to run only software licensed by the cons
