Wednesday, March 20, 2024

Exploding Chips, Meta's AR Hardware, and More




Stephen Cass: Hello and welcome to Fixing the Future, an IEEE Spectrum podcast where we look at concrete solutions to some big problems. I’m your host Stephen Cass, a senior editor at IEEE Spectrum. And before we start, I just want to tell you that you can get the latest coverage from some of Spectrum’s most important beats, including AI, climate change and robotics, by signing up for one of our free newsletters. Just go to spectrum.ieee.org/newsletters to subscribe. Today we’re going to be talking with Samuel K. Moore, who follows a semiconductor beat for us like a charge carrier in an electric field. Sam, welcome to the show.

Samuel K. Moore: Thanks, Stephen. Good to be here.

Cass: Sam, you recently attended the Big Kahuna Conference of the semiconductor research world, ISSCC. What exactly is that, and why is it so important?

Moore: Well, besides being a difficult-to-say acronym, it actually stands for the IEEE International Solid-State Circuits Conference. And this is really one of the big three of the semiconductor research world. It’s been going on for more than 70 years, which means it’s technically older than the IEEE in some ways. We’re not going to get into that. And it really is sort of the crème de la crème if you are doing circuits research. So there is another conference for inventing new kinds of transistors and other sorts of devices. This is the conference that’s about the circuits you can make from them. And as such, it’s got all kinds of cool stuff. I mean, we’re talking about like 200 or so talks about processors, memories, radio circuits, power circuits, brain-computer interfaces. There’s really something for everybody.

Cass: So while you’re there, we send you into this monster thing and ask you to fish out the highlights— They’re not all going to be— Let’s be honest. They’re not all going to be gangbusters. What were the ones that really caught your eye?

Moore: All right. So I’m going to tell you actually about a few things. First off, there’s a potential revolution in analog circuits that’s brewing. Just saw the beginnings of it. There’s a cool upcoming chip that does AI super efficiently by mixing its memory and computing resources. We had a peek at Meta’s future AR glasses or the chip for them anyways. And finally, there was a bunch of very cool security stuff, including a circuit that self-destructs.

Cass: Oh, that sounds cool. Well, let’s start off with the analog stuff, because you were saying this is really almost a way of saying bye-bye to some analog electronics. So this is fascinating.

Moore: Yeah. So this really kind of kicked the conference off with a bang because it was one of the plenary sessions. It was literally one of the first things that was said. And it had to come from the right person, and it kind of did. It was IEEE Fellow and sort of analog establishment figure from the Netherlands, Bram Nauta. And it was kind of a real “We’re doing it all wrong” moment, but it was important because the stakes are pretty high. Basically, Moore’s Law has been really good for digital circuits, the stuff that you use to make the processing parts of CPUs, and in its own way for memory, but not so much for analog. Basically, you look down the road and you are really not getting any better transistors and processes for analog going forward. And you’re starting to see this in places, even in high-end processors, in the parts that do the I/O. They’re just not advancing. They’re using super cutting-edge processes for the compute part and using the same I/O chiplet for like four or five generations.

Cass: So this is like when you’re trying to see things from the outside world. So like your smartphone, it needs these converters to digitize your voice but also to handle the radio signal and so on.

Moore: Exactly. Exactly. As they say, the world is analog. You have to make it digital to do the computing on it. So what you’re saying about a radio circuit is actually a great example because you’ve got the antenna and then you have to amplify, you have to mix in the carrier signal and stuff, but you have to amplify it. You have to amplify it really nicely quite linearly and everything like that. And then you feed it to your analog to digital converter. What Nauta is pointing out is that we’re not really going to get any better with this amplifier. It’s going to continue to burn tens or hundreds of times more power than any of the digital circuits. And so his idea is let’s get rid of it. No more linear amplifiers. Forget it. Instead, what he’s proposing is that we invent an analog-to-digital converter that doesn’t need one. So literally--

Cass: Well, why haven’t we done this before? It sounds very obvious. You don’t like a component. You throw it out. But obviously, it was doing something. And how do you make up that difference with the pure analog-to-digital converter?

Moore: Well, I can’t tell you completely how it’s done, especially because he’s still working on it. But his math basically checks out. And this is really a question— this is really a question of Moore’s Law. It’s not so much, “Well, what are we doing now?” It’s, “What can we do in the future?” If we can’t get any better with our analog parts in the future, let’s make everything out of digital, digitize immediately. And let’s not worry about any of the amplification part.

Cass: But is there some kind of trade-off being made here?

Moore: There is. So right now, you’ve got your linear amplifier consuming milliwatts and your analog-to-digital converter, which is a thing that can take advantage of Moore’s Law going forward because it’s mostly just comparators and capacitors and stuff that you can deal with. And that consumes only microwatts. So what he’s saying is, “We’ll make the analog-to-digital converter a little bit worse. It’s going to consume a little more power. But the overall system is going to consume less if you take the whole system as a piece.” And part of the problem has been that the figures of merit, the things that measure how good your linear amplifier is, are really just about the linear amplifier rather than about what the whole system is consuming. And it looks like, if you care about the whole system, which is what you kind of have to do, then the old approach no longer really makes sense.
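To make that system-level argument concrete, here is a minimal back-of-the-envelope sketch in Python. All of the power numbers are illustrative assumptions, not figures from Nauta’s talk.

```python
# Back-of-the-envelope receiver front-end power budgets.
# Every number here is an illustrative assumption.

def front_end_power_uw(amplifier_uw: float, adc_uw: float) -> float:
    """Total front-end power in microwatts: amplifier plus ADC."""
    return amplifier_uw + adc_uw

# Conventional chain: a linear amplifier burning milliwatts feeding a
# modest analog-to-digital converter burning microwatts.
conventional = front_end_power_uw(amplifier_uw=5_000.0, adc_uw=200.0)

# Amplifier-less chain: no linear amplifier at all, but the ADC has to
# work harder, so assume it costs several times more than before.
amplifier_less = front_end_power_uw(amplifier_uw=0.0, adc_uw=800.0)

print(f"conventional:   {conventional:7.0f} uW")    # 5200 uW
print(f"amplifier-less: {amplifier_less:7.0f} uW")  #  800 uW
# Even a 4x "worse" ADC wins once you count the whole chain, which is
# the system-level figure-of-merit point being made here.
```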

Cass: This also sounds like it gets closer to the dream of the pure software-defined radio, which is basically the idea that you take your CPU, connect one pin to an antenna, and then, almost from DC to daylight, you’re able to handle everything in software-defined functions.

Moore: That’s right. That’s right. Digital can take advantage of Moore’s Law. Moore’s Law is continuing. It’s slowing, but it’s continuing. And so that’s just sort of how things have been creeping along. And now it’s finally getting kind of to the edge, to that first amplifier. So anyways, he was kind of apprehensive about giving this talk because it is poo-pooing on quite a lot of things actually at this conference. So he told me he was actually pretty nervous about it. But it had some interest. I mean, there were some engineers from Apple and others that approached him that said, “Yeah, this kind of makes sense. And maybe we’ll take a look at this.”

Cass: So fascinating. So that appears to address one bottleneck, the efficiency of the linear amplifier. But there was another bottleneck that you mentioned, which is the memory wall.

Moore: Yes.

Cass: It’s a memory wall.

Moore: Right. So the memory wall is this sort of longstanding issue in computing. Particularly, it started off in high-performance computing, but it’s kind of in all computing now, where the amount of time and energy needed to move a bit from memory to the CPU or the GPU is so much bigger than the amount of time and energy needed to move a bit from one part of the GPU or CPU to another part of the GPU or CPU, staying on the silicon, essentially.

Cass: Going off silicon has a penalty.

Moore: That’s a huge penalty.

Cass: And this is why, in traditional CPUs, you have these like caches, L1. You hear these words, L1 cache, L2 cache, L3 cache. But this goes much further. What you’re talking about is much further than just having a little blob of memory near the CPU.

Moore: Yes, yes. So the memory wall is this general problem. And people have been trying to solve it in all kinds of ways. You see it in the latest NVIDIA GPUs, which basically have all of their DRAM right on a silicon interposer with the GPU. They couldn’t be connected any more closely. You see it in that giant chip. If you remember, Cerebras has a wafer-size chip. It’s as big as your face. And that is—

Cass: Oh, that is an incredible chip. And we’ll definitely put the link to that in the show notes, because there’s a great picture. It has to be seen to be believed, I think. There’s a great picture of this monster, monster thing. But sorry.

Moore: Yeah, and that is an extreme solution to the memory wall problem. But there’s all sorts of other cool research in this. And one of the best approaches is to bring the compute to the memory so that your bits just don’t have to move very far. There’s a bunch of different— well, a whole mess of different ways to do this. There were like nine talks or something on this when I was there, and there are even very cool ways, which we’ve written about in Spectrum, where you can actually do sort of AI calculations in memory using analog, where the—

Cass: Oh, so now we’re back to analog! Let’s creep it back in.

Moore: Yeah, no, it’s cool. I mean, it’s cool that, sort of coincidentally, the multiply-and-accumulate task, which is the fundamental crux of all the matrix stuff that runs AI, is something you can do basically with Ohm’s Law and Kirchhoff’s Law. They just kind of dovetail into this wonderful thing. But it’s very fiddly. Trying to do anything in analog is always [crosstalk].
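For readers who want to see why those two laws give you a multiply-accumulate essentially for free, here is a minimal toy model in Python. It is a sketch of the general analog in-memory idea, not of any specific chip, and the numbers are arbitrary.

```python
# Toy model of an analog in-memory multiply-accumulate.
# Weights are stored as conductances G (siemens), inputs are applied as
# voltages V (volts). Ohm's law gives each cell's current I = G * V, and
# Kirchhoff's current law sums those currents on a shared bit line, so the
# bit-line current is the dot product of the weights and the inputs.
# The values below are arbitrary illustrative numbers.

weights_siemens = [1e-6, 2e-6, 0.5e-6]   # stored conductances (the weights)
inputs_volts    = [0.3, 0.1, 0.4]        # applied voltages (the inputs)

bitline_current = sum(g * v for g, v in zip(weights_siemens, inputs_volts))
print(f"bit-line current = {bitline_current:.3e} A")  # = dot(weights, inputs)
```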

Cass: So before digital computers, like right up into the ‘70s, analog computers were actually quite competitive, whereby you set up your problem using operational amplifiers, which is why op amps are called operational amplifiers. You set your equation all up, and then you produce results. And this is basically like taking one of those analog operations, where the behavior of the components models a particular mathematical equation, and putting a little bit of analog computing back in because it matches one particular calculation that’s used in AI.

Moore: Exactly, yeah, yeah. So it’s a very fruitful field, and people are still chugging along at it. I met a guy at ISSCC. His name is Evangelos Eleftheriou. He is the CTO of a company called Axelera, and he is a veteran of one of these projects that was doing analog AI at IBM. And he came to the conclusion that it was just not ready for prime time. So instead, he found himself a digital way of doing the AI compute in memory. And it hinges on basically interleaving the compute so tightly with the cache memory that they’re kind of a part of each other. That required, of course, coming up with a sort of new kind of SRAM, which he was very hush-hush about, and also kind of doing things in integer math instead of floating point math. Most of what you see in the AI world, like NVIDIA and stuff like that, their primary calculations are in floating point numbers. Now, those floating point numbers are getting smaller and smaller. They’re doing more and more in just 8-bit floating point, but it’s still floating point. This depends on integers instead, just because the architecture depends on it.

Cass: Yeah, no, I like integer math, actually, because I do a lot of retrocomputing, where you end up doing a lot of integer math. And the truth is that you realize, oh, the Forth programming language is also famously very integer-based. And for a lot of real-world problems, you can find a perfectly acceptable scale factor that lets you use integers with no appreciable difference in precision. Floating point is kind of more general purpose. But this really had some impressive trade-offs in the benchmarks.
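As a quick illustration of the scale-factor trick Cass mentions, here is a minimal fixed-point sketch in Python. The input values and the choice of 8 fractional bits are arbitrary examples.

```python
# Fixed-point sketch: pick a scale factor, do the math in integers,
# then rescale at the end. Scale and inputs are arbitrary examples.

SCALE = 256  # 8 fractional bits

def to_fixed(x: float) -> int:
    """Convert a real number to a scaled integer."""
    return round(x * SCALE)

def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point numbers, rescaling to keep the format."""
    return (a * b) // SCALE

a, b = 3.14159, 2.71828
product = fixed_mul(to_fixed(a), to_fixed(b)) / SCALE
print(product, a * b)  # the integer result is close to the float product,
                       # off only by a small quantization error
```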

Moore: Yeah, whatever trade-offs they might have had to make for the math, they actually did very well. Now this is for— their aim is what’s called an edge computer. So it’s the kind of thing that would be running a bunch of cameras in sort of a traffic management situation or things like that. It was very machine-vision-oriented, but it’s like a computer or a card that you’d stick into a server that’s going to sit on-premises and do its thing. And when they ran a typical machine vision benchmark, they were able to do 2,500 frames per second. So that’s a lot of cameras potentially, especially when you consider most of these cameras are like— they’re not going 240.

Cass: Even if you take it at a standard frame rate of, say, 25 frames per second, that’s 100 cameras that you’re processing simultaneously.

Moore: Yeah, yeah. And they were able to actually do this at like 353 frames per watt, which is a very good figure. And it’s performance per watt that really is driving everything at the edge. If you ever want this sort of thing to go in a car or any kind of moving vehicle, everybody’s counting the watts. So that’s the thing. Anyways, I would really keep my eyes out for them. They are taping out this year. Should have some silicon later. Could be very cool.
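Here is the quick arithmetic behind the camera-count claim, as a tiny Python sketch; the 25-frames-per-second camera rate is an assumption about a typical camera, while the benchmark number comes from the conversation.

```python
# Arithmetic behind the "100 cameras" claim.
benchmark_fps = 2_500   # frames per second the accelerator processed
camera_fps = 25         # assumed frame rate of a single camera
print(benchmark_fps // camera_fps)  # -> 100 cameras handled at once
```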

Cass: So speaking of getting into the chips and making a difference there, you can make changes sort of in the plane of the chip. But you also found some interesting stuff on 3D chip technology, which I know has been a thread of your coverage in recent years.

Moore: Yeah, I’m all about the 3D chip technology. You’re finding 3D chip technology all the time pretty much in advanced processors. If you look at what Intel’s doing with its AI accelerators for supercomputers, if you look at what AMD is doing for basically all of its stuff now, they’re really taking advantage of being able to stack one chip on top of another. And this is, again, Moore’s Law slowing down, not getting as much in the two-dimensional shrinking as we used to. And we really can’t expect to get that much. And so if you want more transistors per square millimeter, which really is how you get more compute, you’ve got to start putting one slice of silicon on top of the other slice of silicon.

Cass: So as we’re getting towards—instead of transistors per square millimeter, it’s going to be per cubic millimeter in the future.

Moore: You could measure it that way. Thankfully, these things are so slim and sort of—

Cass: Right. So it looks like a—

Moore: Yeah, it looks basically the same form factor as a regular chip. So this 3D tech, the most advanced part of it anyways, is powered by something called hybrid bonding, which I’m afraid I have failed to understand where the word hybrid comes in at all. But really it is kind of making a cold weld between the copper pads on top of one chip and the copper pads on another one.

Cass: Just explain what a cold weld is, because I have heard about cold welds, but actually, when it comes to— it’s a problem when you’re building things in outer space.

Moore: Oh, oh, that. Exactly that. So how it works here is— so picture you build your transistors on the plane of the silicon and then you’ve got layer upon layer of interconnects. And those terminate in a set of sort of pads at the top, okay? You’ve got the same thing on your other chip. And what you do is you put them face-to-face, and there’s going to be like a little bit of gap between the copper on one and the copper on the other, but the insulation around them will just stick together. Then you heat them up just a little bit and the copper expands and just kind of jams itself together and sticks.

Cass: Oh, it’s almost like brazing, actually.

Moore: I’ll take your word for it. I genuinely don’t know what that is.

Cass: I could be wrong. I’m sure a nice metallurgist out there will correct me. But yes, I see what you’re getting at. You just need a little bit of whoosh, and then everything kind of sticks together. You don’t have to go in with your soldering iron and do the heavy—

Moore: There’s no solder involved. And that is actually really, really key because it means almost an order of magnitude increase in the density at which you can have these connections. We’re talking about having one connection every few microns. So that adds up to like 200,000 connections per square millimeter if my math is right. It’s actually quite a lot. And it’s really enough to make the distances from one part of one piece of silicon to a part of another the same kind of thing as if they were all just built on one piece of silicon. It’s like Cerebras did it all big in two dimensions. This is folding it up and getting essentially the same kind of connectivity, the same low energy per bit, the same low latency per bit.
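As a quick sanity check on that density figure, here is a small Python sketch; the exact bond pitch is an assumption, since Moore only says “every few microns.”

```python
# Sanity check on the connection-density figure. The pitch is an
# illustrative assumption, not a number from the talk.
pitch_um = 2.2                    # assumed bond pitch in microns
per_mm = 1_000 / pitch_um         # connections along one millimeter
density_per_mm2 = per_mm ** 2     # connections per square millimeter
print(f"{density_per_mm2:,.0f} connections/mm^2")  # roughly 200,000
```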

Cass: And this is where Meta came in.

Moore: Yeah. So Meta has been showing up at this conference and other conferences sort of. I’ve noticed them on panels sort of talking about what they would want from chip technology for the ideal pair of augmented reality glasses. The talk they gave today was like— the point was you really just don’t want a shoebox walking around on your face. That’s just not how—

Cass: That sounds like a very pointed jab at the moment, perhaps.

Moore: Right, it does. Anyways, it turns out what they want is 3D technology because it allows them to pack more silicon performance into an area that might actually fit into something that looks like a pair of glasses you might actually want to wear. And again, not flinging the bits around as far would probably reduce the power consumption of said chip, which is very important because you don’t want it to be really hot. You don’t want a really hot shoebox on your face. And you want it to last a long time, so you don’t have to keep charging it. So what they showed for the first time, as far as I can tell, is sort of the silicon that they’ve been working on for this. This is a custom machine learning chip. It’s meant to do the kind of neural network stuff that you just absolutely need for augmented reality. And what they had was a roughly four-millimeter-by-four-millimeter chip that’s actually made up of two chips that are hybrid bonded together.

Cass: And you need this stuff because you need the chip to be able to do all this computer vision processing to process what’s going on in the environment and produce some sort of semantic understanding that you can overlay things on. This is why machine learning, or AI in general, is so important to these applications. Yeah.

Moore: Exactly, yeah. And you need that AI to be right there in your glasses as opposed to out in the cloud or even in a nearby server. Anything other than actually in the device is not going to give you enough latency and such, or it’s going to give you too much latency, excuse me. Anyway, so this chip was actually two 3D stacked chips. And what was very cool about this is they really made the 3D point because they had a version that was just the 2D, just like they had half of it. They tested the combined one, and they tested the half one. So the 3D stacked one was amazingly better. It wasn’t just twice as good. Basically, in their test, they tracked two hands, which is very important, obviously, for augmented reality. It has to know where your hands are. So that was the thing they tested. So the 3D chip was able to track two hands, and it used less energy than the ordinary 2D chip did when it was only tracking one hand. So 3D is a win for Meta clearly. We’ll see what the final project is like and whether anybody actually wants to use it. But it’s clear that this is the technology that’s going to get them there if they’re ever going to get there.

Cass: So jumping to another track, you mentioned security at the top. And I love the security stuff because there seems to be no limit to how paranoid you can be and yet still not always be able to keep up with the real world. Spectrum has long covered the history of electronic intelligence and spying. We had this great piece on how the Russians spied on American typewriters by embedding circuitry directly into the covers of the typewriters. It’s a crazy story. But you attended the chip security track, and I’m really eager to hear about the crazy ideas you heard there— or, as it turns out, not-so-crazy ideas.

Moore: Right. You’re not paranoid if they’re really out to get you. So yeah, no, this was some real Mission Impossible stuff. I mean, you could kind of envision Ving Rhames and Simon Pegg hunched over a circuit board while Tom Cruise was running in the background. It was very cool. So I want to start with that vision of somebody hunched over a circuit board that they’ve stolen, and they’re trying to crack an encryption code or whatever, and they’ve got a little probe on one exposed piece of copper. A group at Columbia and Intel came up with countermeasures for that. They invented a circuit that would reside basically on each pin going out of a processor, or you could have it on the memory side if you wanted, that can actually detect even the most advanced probe. So when you touch these probes to the line, there’s a very, very slight change in capacitance. I mean, if you’re using a really high-end probe, it’s very, very slight. Larger probes, it’s huge. [laughter] You never think that the CPU is actually paying attention when you’re doing this. With this circuit, it could. It will know that there’s a probe on a line, and it can take countermeasures like, “Oh, I’m just going to scramble everything. You’re never going to find any secrets from this.” So again, the countermeasures, what it triggers, they left up to you. But the circuit was very cool because now your CPU can know when someone’s trying to hack it.

Cass: My CPU always knows I’m trying to hack it. It’s evil. But yes, I’m just trying to debug it, not everything else. But that’s actually pretty cool. And then there was another one where, yeah, again, you were going after another— the University of Texas at Austin was also doing this thing that could go after even non-physical probes, I think.

Moore: So you don’t have to— you don’t always have to touch things. You can use the electromagnetic emissions from a chip for what’s called a side channel attack. Just the changes in the emissions from the chip when it’s doing particular things can leak information. So what the UT Austin team did was basically take the circuitry that does the encryption, the sort of key encryption circuitry, and modify it in a way so that the signature was just sort of a blur. And it still worked well. It did its job in a timely manner and stuff like that. But if you hold your EM sniffer up to it, you’re never going to figure out what the encryption key is.

Cass: But I think you said you had one that was your absolute favorite.

Moore: Yes. It’s totally my favorite. I mean, come on. How could I not like this? They invented a circuit that self-destructs. I got to tell you what the circuit is first because this is also a cool and—

Cass: This is a different group.

Moore: This is a group at University of Vermont and Marvell Technology. And what they came up with was a physical unclonable function circuit that—

Cass: You’re going to have to unpack that.

Moore: Yeah, let me start with that. A physical unclonable function works because there are always going to be very, very slight differences in each device on a chip, such that if you were to measure those differences, every chip would be different. Every chip would have its own unique fingerprint. So people have invented these physical unclonable function circuits. And they work great in some ways, but they’re actually very hard to make consistent. You don’t want to use this chip fingerprint as your security key if that fingerprint changes with temperature or as the chip ages. [laughter] Those are problems that different groups have come up with different solutions for. The Vermont group had their own solution. It was cool. But what I loved the most was that if the key is compromised or in danger of being compromised, for instance, if somebody’s got a probe on it, [laughter] the circuit will actually destroy itself, literally destroy itself. Not in a sparks and smoke kind of way.

Cass: Boo.

Moore: I know. But at the micro level, it’s kind of like that. Basically, they just jammed the voltage up so high that there’s enough current in the long lines that copper atoms will actually be blown out of position. It will literally create voids and open circuits. At the same time, the voltage is again so high that the insulation in the transistors will start to get compromised, which is an ordinary aging effect, but they’re accelerating it greatly. And so you wind up basically with gobbledygook. Your fingerprint is gone. You could never countermeasure— sorry, you could never counterfeit this chip. You couldn’t say, well, “I got this,” because it’ll have a different fingerprint. It’s definitely not like— it won’t register as the same chip.

Cass: So not only will it not work, but it’s not like blowing fuses either. There are memory protection systems where, because you don’t want someone downloading your firmware, you send a little pulse through that blows a fuse. But with those, if you really wanted to, you could crack it open. You could decap that chip and see what’s going on. This is scorched earth internally.

Moore: Right, right. At least for the part that makes the physical unclonable function, that is essentially destroyed. And so if you encounter that chip and it doesn’t have the right fingerprint, which it won’t, you know it’s been compromised.
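For readers curious how a fingerprint can come out of nothing but manufacturing noise, here is a minimal simulation in Python of the general physical-unclonable-function idea. It is not the Vermont and Marvell design, whose details weren’t disclosed; the device-variation model and the pairwise comparison scheme are assumptions for illustration.

```python
# Toy simulation of a physical unclonable function (PUF). This is NOT the
# Vermont/Marvell circuit; it just shows how tiny, random manufacturing
# differences can yield a per-chip fingerprint.
import random

def make_chip(n_pairs: int = 64, seed=None) -> list[float]:
    """Model a chip as a list of device delays with random process variation."""
    rng = random.Random(seed)
    return [1.0 + rng.gauss(0, 0.01) for _ in range(2 * n_pairs)]

def fingerprint(devices: list[float]) -> str:
    """Bit i is 1 if device 2i is faster than device 2i+1."""
    return "".join(
        "1" if devices[2 * i] < devices[2 * i + 1] else "0"
        for i in range(len(devices) // 2)
    )

chip_a = make_chip(seed=1)
chip_b = make_chip(seed=2)
print(fingerprint(chip_a))                          # this chip's unique key
print(fingerprint(chip_a) == fingerprint(chip_b))   # False: another chip differs
```

In this picture, the self-destruct trick amounts to physically damaging the underlying devices, so the comparisons flip and the fingerprint no longer matches anything that was previously enrolled.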

Cass: Wow. Well, that is fascinating and very cool. But I’m afraid that’s all we have time for today. So thanks so much for coming on and talking about IISSCC.

Moore: ISSCC. Oh, yeah. Thanks, Stephen. It was a great time.

Cass: So today on Fixing the Future, we were talking with Samuel K. Moore about the latest developments in semiconductor technology. For IEEE Spectrum’s Fixing the Future, I’m Stephen Cass, and I hope you’ll join us next time.


