• 3 Posts
  • 31 Comments
Joined 1 year ago
Cake day: October 25th, 2023

  • Talk about burying the lede in the last segment: Asus isn’t using the official connector, and every other vendor thinks Asus’s connector is risky and probably defective. That’s not on Nvidia, other than allowing it (and this is the reason they ride partners’ asses sometimes on approvals/etc.).

    The rest of the stuff is Igor still grinding the same old axe (pretty sure Astron knows how to make a connector; if the connector were so delicate, it would have been broken by GN’s physical testing; etc.). But if Asus isn’t using the official connector, and they’re disproportionately making up a huge share of the failures, that’s really an Asus problem.



  • This has been my take; it’s an obvious case of the 80-20 rule. During times of breakthrough/flux, NVIDIA benefits from having both the research community onboard and a full set of functionality, great tooling, etc. When things slow back down, you’ll see Google come out with a new TPU, Amazon will have a new Graviton, etc.

    It’s not that hard in principle to staple an accelerator to an ARM core; actually, that’s kind of a major marketing point for ARM. And nowadays you’d want an interconnect too. A decently large number of companies can sustain such a thing at reasonably market-competitive prices. So once the market settles, the margins will decline.

    On the other hand, if you are building large, training-focused accelerators etc… it is also going to be a case of convergent evolution. In the abstract, we are talking about massively parallel accelerator units with some large memory subsystem to keep them fed, and some type of local command processor to handle the low-level scheduling and latency-hiding. Which, gosh, sounds like a GPGPU.

    If you are giving it any degree of general programmability then it just starts to look very much like a GPU. If you aren’t, then you risk falling off the innovation curve the next time someone has a clever idea, just like previous generations of “ASICs”. And you are doing your tooling and infrastructure and debugging all from scratch too, with much less support and resources. GPGPU is turnkey at this stage, do you want your engineers building CUDA or do you want them building your product?





  • Ironically, COD did the “Cold War gone hot” thing not too long ago, lol.

    I actually think smaller-scale conflicts would be a good fit for Battlefield gameplay. The series has eternally struggled to balance aircraft: having jet fighters boom-and-zoom and then repair in the endzone where they’re untouchable is no good; they just don’t couple to the battlefield very well. Even tanks and helicopters carry risk, but planes just fly away and repair. And if you make them weaker, then they’re not any good.

    (It’s very similar to sniper rifles: either they one-hit you, and then they’re not fun for anyone else, or they require multiple hits, and then they’re just not good compared to DMRs and the like, which let you spam shots and achieve generally lower TTKs on average once you assume one or two misses.)

    But if you do smaller-scale conflicts, then aircraft can be older, slower stuff: Harriers or A-6 Intruders, propeller aircraft, helicopters, etc. If planes can’t just disappear over the battlefield in 5 seconds flat, people on the ground have more of a chance to actually coordinate against them, and that gets you away from the “sniper-rifle problem.”



  • In some senses you end up with convergent design: it’s not a GPU, it’s just a control system that commands a bunch of accelerator units with a high-bandwidth memory subsystem. But that could be ARM plus an accelerator unit, etc. You’d probably need fast networking too.

    But it’s overall a crazy proposition to me. First off, Google and Amazon are going to beat you to market on anything that looks good; you have no real moat other than “I’m Sam Altman”; and there’s no real market penetration of the thing (or support in execution, let alone actual research). Training is a really hard problem to crack because right now it’s absolutely, firmly rooted in the CUDA ecosystem. Supposedly there may be a GPU Ocelot thing once again at some point, but everyone just works with Nvidia because they’re the GPGPU ecosystem that matters.

    Like, if you wanted to do this, you’d do what Tesla did and have Jim Keller design you a big fancy architecture for training fast at scale (Dojo). I guess they walked away from it or just didn’t care anymore? Oops.

    But that’s the problem: it’s expensive to stay at the cutting edge. It’s expensive to get the first chip out, and you’ll be going against competitors who have the scale to make their own in-house anyway. It’s a crazy business decision to throw yourself on the silicon treadmill against intense competition just to give Nvidia the finger. wack, hemad.






  • All of these processors were utterly wiped out by the “spend $100 more on an 8700K and overclock” option.

    There is such a thing as false economy: sometimes spending more money gets you a thing that lasts longer and gives better results throughout that whole lifespan… classic “boots theory” stuff.

    Having your $2000 build last 2-3 years less than it otherwise could have, because you didn’t spend the extra $100 on the 8700K when you had the option, is stupid and not good value. Reviewers over-fixated on minmaxing; the order of the day was “cut out everything else and shunt it all into your GPU,” and some reviewers took it as far as saying you should cut down to a 500W PSU or even less. Today that extra $100 you spent on your GPU is utterly wasted, while going from an 8600K to an 8700K, or buying a PSU that doesn’t cut out on transient loads, got you a much better system in the long term, even if it didn’t win the benchmarks on day 1.

    (and yes, transients were already problematic back then, and in fact have remained pretty constant at around 2x average power draw…)
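    To make the transient point concrete, here’s a minimal sketch of the headroom math, assuming the ~2x rule of thumb above. The wattages are hypothetical, purely for illustration:

    ```python
    # Rough PSU headroom check. Assumes transient spikes reach roughly 2x the
    # GPU's average draw, per the rule of thumb above; all wattages are made-up.
    def transient_peak_w(gpu_avg_w: float, rest_of_system_w: float,
                         factor: float = 2.0) -> float:
        """Worst-case momentary system draw in watts."""
        return gpu_avg_w * factor + rest_of_system_w

    # Hypothetical build: 250 W average GPU draw, 150 W for CPU/board/drives.
    peak = transient_peak_w(250, 150)
    print(peak)  # 650.0
    ```

    Which is exactly why a PSU sized to the average draw can cut out: a 500 W unit sees a 650 W spike and its over-current protection trips.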







  • GA102 to AD102 increased by about 80%, but the jump from AD102 to GB202 is only slightly above 30%.

    Maybe GB202 is not the top chip, and the top chip is named GB200.

    I mean, you’d expect this die to be called GB102 based on the recent numbering scheme, right? Why jump to 202 right out of the gate? They haven’t done that in the past: AD100 is the compute die, and AD102, 103, 104… are the gaming dies. In fact this has been extremely consistent all the way back to Pascal; even when there’s a compute uarch variant that’s different (and GP100 is quite different from GP102, etc.), it’s still called the 100.

    But if there were another die above it, you’d call it GB100 (like Maxwell GM200 or Fermi GF100). That’s obviously already taken: GB100 is the compute die. So you bump the whole numbering series to 200, meaning the top gaming die is GB200.

    There is also precedent for calling the biggest gaming die the x110, like GK110 or Fermi’s GF110 (in the 500 series). But they haven’t done that in a long time, not since Kepler, probably because it ruins the “bigger number = smaller die” rule of thumb.

    Of course it’s possible the 512-bit rumor was bullshit, or this one is. But it’s certainly an odd flavor of bullshit: if you were making something up, wouldn’t you make up something that made sense? You’d call it GB102 if you were inventing it, so odd details like this potentially lend it credibility. It will also be easy to corroborate against future rumors: if nobody ever mentions GB200-series chips again, then this was probably bullshit, and vice versa. Just like Angstronomics and the RDNA3 leak: once he’d nailed the first product, the N32/N33 information was highly credible.
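    The numbering pattern described above can be sketched as a simple mapping. Die names are taken from the comment itself; this is just an illustration of the convention, not an authoritative product list:

    ```python
    # Compute die vs. top gaming die per generation, as described above.
    # Blackwell entries reflect the rumor being discussed, not confirmed parts.
    COMPUTE_DIE = {
        "Pascal":    "GP100",
        "Blackwell": "GB100",
    }
    TOP_GAMING_DIE = {
        "Pascal":    "GP102",
        "Ampere":    "GA102",
        "Ada":       "AD102",
        "Blackwell": "GB202",  # series bumped to 2xx since GB100 is taken
    }

    def implied_top_die(gen_prefix: str, series: int) -> str:
        """If the gaming line starts at x202, the die above it would be x200."""
        return f"{gen_prefix}{series}00"

    print(implied_top_die("GB", 2))  # GB200
    ```

    The point of the sketch: once the gaming series is 2xx instead of 1xx, the "one die above" slot lands on GB200, which is what makes the rumor internally consistent.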