Caching & Memory Hierarchy: How CPUs, RAM, & Disks Boost Performance

Name: Caching & Memory Hierarchy: How CPUs, RAM, & Disks Boost Performance
Uploaded: 2025-06-27T04:27:27+00:00
Duration: 20 min 10 s
Channel: Neural Lantern

Join us for a fun dive into caching & the memory hierarchy! Learn how CPU registers, RAM, & hard drives balance speed & storage to make your computer lightning fast. From cache hits to internet browsing, we break it down in a clear, beginner-friendly way. Subscribe for more tech explainers & hit that like button to support the channel! Scan the QR code for more tutorials & resources. Let's geek out together!

Introduction to Caching 00:00:00
What is Caching? 00:00:10
Memory Units: Size vs. Speed 00:00:14
CPU Registers vs. System Memory 00:00:33
Benefits of Caching 00:01:05
Access Patterns and Algorithms 00:01:24
Cache Hit and Miss 00:02:48
Data Movement in Cache 00:04:51
Cache Purging 00:05:57
Multiple Cache Levels 00:07:15
Internet Caching Example 00:09:28
Memory Hierarchy Pyramid 00:12:34
Volatile Memory and Registers 00:13:16
Processor Cache and RAM 00:13:48
Slower Storage Types 00:14:38
Conclusion and Subscribe Request 00:15:12

Thanks for watching!

Find us on other social media here:
- https://www.NeuralLantern.com/social

Please help support us!

- Subscribing + Sharing on Social Media
- Leaving a comment or suggestion
- Subscribing to our Blog
- Watching the main "pinned" video of this channel for offers and extras

Category

🤖

Tech

Transcript

00:00Hello there. I'd like to talk to you about caching and the hierarchy of memory.

00:11So for starters, what is caching?

00:14Simply put, sometimes we have memory units or storage units

00:19that are very, very big because they're cheap,

00:22but that also makes them very, very slow.

00:24And sometimes we have storage units that are very, very fast

00:27and thus expensive and thus very, very, very small.

00:30For example, if we had like a, let's say a CPU

00:33and we were just talking about the memory that the CPU itself could hold

00:36in its CPU registers.

00:39We have a very small number of registers inside of the CPU

00:42compared to system RAM or disks or whatever.

00:45So we can't actually hold that much memory.

00:47But at the same time, the CPU registers are lightning fast,

00:50lightning fast, even compared to system memory.

00:53So that kind of sucks, right?

00:56Because the system memory is way slower than the CPU registers.

01:00Why can't we just have more registers?

01:02Well, it's too expensive.

01:03So caching is something that you can kind of use to get the best of both worlds.

01:08You can have something that's very, very fast,

01:11be your primary place that you try to store things

01:13and look for things to get them and set them.

01:15And then when it runs out of space, you can just kind of push things

01:19to something that's a little bit slower, but larger and cheaper.

01:23And then hopefully there's a pattern of access that actually makes sense.

01:27Maybe hopefully you're not accessing your data and instructions in a totally random way.

01:32So that when you access in a pattern, then algorithms inside the machine

01:37and the operating system can kind of figure out which items are best

01:40to take out of the small storage device.

01:43And that way we can sort of keep the most frequently or the most often

01:48or the most recently used things inside of the fastest device.

01:52This is kind of a way to get the best of both worlds,

01:54kind of a way to leverage the best of both worlds.
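
The idea of keeping the most recently used things in the fast device can be sketched in a few lines. This is a minimal illustration of one possible policy (least recently used, or LRU), assuming a tiny cache with a fixed capacity. The class name `LRUCache` and its methods are made up for this example; real hardware and operating systems use their own, more elaborate policies.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first, newest entry last

    def get(self, key):
        if key not in self.items:
            return None  # cache miss
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # touching "a" makes "b" the oldest entry
cache.put("c", 3)  # the cache is full, so "b" is evicted
print(list(cache.items))  # ['a', 'c']
```

If accesses were totally random, the evicted item would be as likely to be needed next as any other, which is why the pattern of access matters so much.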

01:57Okay. So what am I kind of talking about?

01:59Let's just suppose for the sake of argument, let me draw this a little bit.

02:03Suppose for the sake of argument, you have something that's really, really slow,

02:07but it's huge, a huge storage thing.

02:11Maybe this is your disk.

02:13Maybe this is your system RAM.

02:15I don't know.

02:16There's many levels to this.

02:18Then maybe you have something else that's a little bit faster

02:21and thus a little bit more expensive, but it's smaller.

02:25And then maybe you have something else that is like really, really fast,

02:28but it's super expensive.

02:30So we can only make a really, really small version of it, right?

02:33So imagine that, you know, you are the, I don't know, we'll say the user.

02:38The user of these things is actually usually just like the CPU or something.

02:42But let's say that you're the user and you want to look and see

02:45if some piece of data you want is stored in there.

02:49Maybe you have memory location 45 or something or 44 that you want to check for.

02:55You're like, what, what is the data at memory location 44?

02:58I'm totally making this up that we would look at memory location 44, but suppose you are.

03:03So if memory location 44 or this item, whatever that you're looking for

03:08is in the super small and fast thing, then awesome.

03:11You don't need to look any further, but if it's not, we'll call that a cache miss.

03:16We'll say, well, it's not in there.

03:18Oops.

03:19It's not in there.

03:20So we call that a miss for a cache.

03:23And that means we just have to check the next thing.

03:26We'll say, all right, well, if it's not in there, is it in the next fastest thing?

03:30The thing that's like a little bit slower and a little bit bigger.

03:33If it's in there, we call that a hit.

03:35We would call it a hit if we found it on the first try also.

03:38But if it's in there, we call it a hit.

03:40And if it's not in there, we call it another miss.

03:42So we'll say miss.

03:44And then finally, we'll check maybe the biggest thing that we have available

03:48and we'll go, all right, is it in the giant thing?

03:50And usually it is going to be in the giant thing because if we're talking about

03:54program instructions or program data or something like that, you know,

03:57the data is at least going to be on your disk drive in almost every case, right?

04:01So, or your hard drive.

04:03So you could imagine maybe this green thing is the CPU registers or the CPU cache

04:08or something very, very, very fast.

04:10And you can imagine this blue thing here is perhaps system RAM.

04:13It's like a little bit bigger, a little bit slower.

04:15And then the red thing is your hard disk, which does definitely have all the data,

04:19but it's really, really, really slow.

04:23And we're hoping not to try to touch the red thing because it's going to cost us a lot more time.

04:27And then same thing for the blue.

04:29So the first time we look for this piece of data, you know,

04:32we missed twice because it was only in the red thing, you know, the disk that we'll call it.

04:38But then suppose we come along later and we look for that piece of data.

04:41Again, maybe we're inside of a loop.

04:43Maybe our function gets called multiple times and for whatever reason we check.

04:49But when we originally found the data in the red box, by the way,

04:52it was sent up through the cache and sort of copied into all the other devices.

04:57So, you know, when we couldn't find our data anywhere in the green or blue,

05:02it was copied from the red to the blue and then to the green also.

05:06So it's actually everywhere.

05:08And then the next time we try to look for that same piece of data,

05:12we'll have a hit because it's just sitting in the CPU or sorry, not the CPU,

05:17but just the fastest thing,

05:18which you could imagine as being the CPU or the CPU cache or something, right?

05:22So notice how once we have a hit,

05:24there's not really any need to go look onto the slower devices.
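
The miss, miss, hit walk described above, followed by a hit on the second try, can be sketched like this. It is only an illustration: the `lookup` function and the green/blue/red dictionaries are invented for this example, and the fast levels have no size limit here, so nothing is ever purged.

```python
def lookup(address, levels):
    """Search levels from fastest to slowest; return (value, trace).

    `levels` is a list of (name, dict) pairs, fastest first. On a hit,
    the value is copied into every faster level that missed.
    """
    trace = []
    for depth, (name, store) in enumerate(levels):
        if address in store:
            trace.append((name, "hit"))
            value = store[address]
            for _, faster in levels[:depth]:
                faster[address] = value  # copy up toward the fastest level
            return value, trace
        trace.append((name, "miss"))
    raise KeyError(address)  # not stored anywhere


levels = [
    ("green", {}),             # smallest and fastest
    ("blue", {}),              # medium
    ("red", {44: "payload"}),  # largest and slowest, holds everything
]

print(lookup(44, levels)[1])  # [('green', 'miss'), ('blue', 'miss'), ('red', 'hit')]
print(lookup(44, levels)[1])  # [('green', 'hit')]
```

The second call never touches the blue or red dictionaries, which is the whole point of a hit in the fastest level.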

05:27So again, if your access patterns are not totally random,

05:30then you can have most of your important data sitting in the fastest device

05:35based on how often you need that data or how recently you've used that data or whatever.

05:41So great. We have a lot of hits.

05:43We're now getting lightning speed because we're just like reading only from that green area.

05:47Suppose some time passes and you start filling up the green thing,

05:52the fastest thing with lots of other data,

05:54and it overflows because it is a small place for data.

05:57So eventually we'll have to start purging some of the data that we're not using anymore.

06:01And maybe it's this 0x44, man, that X is awful.

06:05I'm still getting used to this pen.

06:06Let me just try one more time.

06:09Oh, when I lift up the pen, it kind of drags a bit.

06:13So we've kind of filled up the green area and there's not enough room for all the data that we want.

06:18The system realizes that 0x44 hasn't been used in a while.

06:21It doesn't seem to be a part of the current working set of data.

06:25And so we'll just purge it.

06:27So the system just says, all right, let's just erase that from the green.

06:30But even when it erases it from the green, the fastest storage device,

06:34it might not erase it from the blue because the blue is bigger.

06:38It can hold more data before we start having to erase things, right?

06:41So that means there's a pretty good chance it's sitting in the blue.

06:44So if we try to access it again, I'll just put a red X here to indicate that we have a miss.

06:48So we miss the fastest device that sucks.

06:52Then we check the next fastest device and it's in there.

06:56So we're going to call that a hit.

06:58So, you know, we didn't get to use the fastest device, but at the same time,

07:02we also didn't have to use the slowest device.

07:05So this is still kind of a win, right?

07:07Suppose, oh, and then by the way, once we find that hit,

07:11once we actually hit it from the blue device,

07:13then the data gets carried back to the green device, the fastest device,

07:17because the system now considers 0x44, that address or that value, whatever,

07:22to be part of the working set of data that we're currently using.

07:25So the moment we hit something, it gets brought up to the fastest device

07:29in the hopes that maybe we'll try to request it again in the near future

07:33and save a bunch of processing time.
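
Here is a rough sketch of that purge-and-promote behavior, assuming a least-recently-used purge policy and made-up capacities (2 slots for green, 4 for blue). The names `Level` and `read` are hypothetical, and a real system decides what to purge in its own way.

```python
from collections import OrderedDict


class Level:
    """One small cache level that purges its least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, addr):
        if addr in self.data:
            self.data.move_to_end(addr)  # mark as recently used
            return self.data[addr]
        return None

    def put(self, addr, value):
        self.data[addr] = value
        self.data.move_to_end(addr)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # purge the oldest entry


def read(addr, green, blue, red):
    """Return (value, name of the level that served the read)."""
    value = green.get(addr)
    if value is not None:
        return value, "green"
    value = blue.get(addr)
    if value is not None:
        green.put(addr, value)  # carry the data back up to green
        return value, "blue"
    value = red[addr]           # the big device always has it
    blue.put(addr, value)
    green.put(addr, value)
    return value, "red"


green, blue = Level(capacity=2), Level(capacity=4)
red = {addr: f"data@{addr:#x}" for addr in range(0x40, 0x50)}

print(read(0x44, green, blue, red)[1])  # red
print(read(0x44, green, blue, red)[1])  # green
read(0x45, green, blue, red)
read(0x46, green, blue, red)            # green only holds 2, so 0x44 is purged
print(read(0x44, green, blue, red)[1])  # blue
print(read(0x44, green, blue, red)[1])  # green
```

Reading enough other addresses to overflow blue as well would send the next read of 0x44 all the way back to red.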

07:36So suppose for the sake of argument, then a bunch more time passes

07:40and it has to be purged again from the fastest device.

07:46So, you know, we're accessing lots of data.

07:49We're filling up that green thing again.

07:51Eventually there's not enough room.

07:53We have to purge some old stuff.

07:54So we purge the 44 there.

07:56And then even more time passes.

07:59And we have to also purge it from the medium speed device, the blue square,

08:03because that also can fill up at some point.

08:06So that sucks.

08:07We'll never, we'll never purge from the giant storage device,

08:11which is like your disk drive usually, or your hard drive.

08:14So it'll still be there.

08:15Which means eventually when we try to request 44 again,

08:18we'll go through the original process.

08:19We'll just say, you know, going through.

08:22Well, we'll say, well, I'll just write the word miss.

08:25We'll just go miss here.

08:26And then we look here and it's another miss.

08:28And then we look here and it's a hit.

08:31I'll put hit.

08:32So in that case, when a lot of time passed and all the other caches kind of filled up,

08:38then we didn't really get any benefit, any performance boost from using caches.

08:43In fact, we took a little bit of a penalty on those two misses

08:46because we got to check one thing, whoops.

08:48And then check another thing, whoops.

08:50And then finally we hit the red thing, which is giant and has the data we wanted.

08:54We could have just gone there in the first place and saved a couple of,

08:57you know, some CPU cycles because we wouldn't have had to do the two misses.

09:01But checking and missing is part of the process of using a cache.

09:05And so it's worth it because, you know, most of the time we'll hopefully be hitting

09:10a lot of cache in the green area and even more cache in the blue area.
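
To see why the occasional double miss is worth it, here is a small worked calculation. Every number in it is an assumption chosen to keep the arithmetic easy (1 ns for green, 100 ns for blue, 100,000 ns for red, and 90 / 9 / 1 hits out of 100 reads); none of them are measurements of a real machine.

```python
# Assumed, illustrative latencies in nanoseconds (not measurements).
GREEN_NS, BLUE_NS, RED_NS = 1, 100, 100_000

# Assumed outcome of 100 reads: the level where each read finally hits.
hits = {"green": 90, "blue": 9, "red": 1}

# A read pays for every level it checks before it hits.
cost = {
    "green": GREEN_NS,
    "blue": GREEN_NS + BLUE_NS,
    "red": GREEN_NS + BLUE_NS + RED_NS,
}

total_reads = sum(hits.values())
with_cache = sum(hits[level] * cost[level] for level in hits) / total_reads
without_cache = RED_NS  # every read goes straight to the slowest device

print(with_cache)            # 1011.0
print(without_cache)         # 100000
print(cost["red"] - RED_NS)  # 101, the extra cost of the two misses
```

Under these assumptions the two wasted checks add about 0.1% to the worst-case read, while the average read gets roughly 99 times faster.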

09:15So that's the basic idea of caching kind of within a system.

09:20Let's see.

09:21What else did I want to show you here?

09:22We got slow, medium and large, and maybe just another example that you might relate to more.

09:28Suppose you go to an internet website and on the website there's lots of images and things like that.

09:33Well, those images are usually cached on your local machine inside of your browser's cache area.

09:40What am I even talking about?

09:42First time you go to a website ever, the images get loaded from the internet.

09:46Imagine the internet as being something that is like very, very, very big

09:49because it's like at this point in, well, I think since 2020, it was over 64 zettabytes.

09:55And a zettabyte is a trillion gigabytes.

09:57So 64 trillion gigabytes of data on the internet.
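
That conversion is easy to check, assuming decimal (SI) prefixes where a gigabyte is 10^9 bytes and a zettabyte is 10^21 bytes:

```python
GIGABYTE = 10**9    # bytes, decimal (SI) prefix
ZETTABYTE = 10**21  # bytes, decimal (SI) prefix

gigabytes_per_zettabyte = ZETTABYTE // GIGABYTE
print(gigabytes_per_zettabyte == 10**12)  # True: one trillion gigabytes
print(64 * gigabytes_per_zettabyte)       # 64000000000000
```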

10:01So huge, right?

10:02But the internet is kind of slow.

10:04Even if you have a lightning fast connection, it's probably still slower than accessing your local hard drive.

10:08Definitely way slower than accessing your local memory.

10:10Although most people, when I recorded this video, have internet connections that are significantly slower than their disk drive or their SSD.

10:20Anyway, so the internet is huge, but it's slow.

10:23And I guess it's cheap compared to the amount of storage and the speed you get.

10:28So you download the image from the website, from the internet.

10:32And that's like, takes a lot of time.

10:35If you do that every single time and there's lots of images, you might feel like your internet experience is slow.

10:39You might see images kind of like loading, you know, even in the modern era, if they take just like a moment to load,

10:45then you kind of think like, whoa, there's like a lot of delay here, right?

10:49So, but then the first time we would call that, I guess, a miss on your local cache.

10:55And then we would hit on the internet.

10:59So the first time you load it, it actually, your browser will actually save a copy of that data to your local disk area.

11:06And well, the next time you go to that website, instead of looking directly to the internet to get the images,

11:13your browser will look to see if you have a copy saved already locally on your, on your disk drive or your hard drive.

11:19And if that's true, then your computer will just load the images from the local machine and not from the internet.

11:25And then your pages will seem like they load a lot, lot faster.
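
The browser behavior described here can be sketched as follows. This is a simplified illustration, not how any particular browser is implemented: `download` is a hypothetical stand-in for a slow network request, the URL is a placeholder, and a temporary folder plays the role of the browser's cache area. Real browsers also track things like expiry, which this sketch ignores.

```python
import hashlib
import tempfile
from pathlib import Path

CACHE_DIR = Path(tempfile.mkdtemp())  # stand-in for the browser's cache folder
downloads = []  # records every simulated trip to the internet


def download(url):
    """Hypothetical slow network fetch; returns fake image bytes."""
    downloads.append(url)
    return b"image bytes for " + url.encode()


def load_image(url):
    """Return (data, source), checking the local disk cache first."""
    path = CACHE_DIR / hashlib.sha256(url.encode()).hexdigest()
    if path.exists():
        return path.read_bytes(), "disk cache"  # hit: no network needed
    data = download(url)    # miss: go out to the internet
    path.write_bytes(data)  # save a copy for next time
    return data, "internet"


url = "https://example.com/logo.png"  # placeholder URL
print(load_image(url)[1])  # internet
print(load_image(url)[1])  # disk cache
print(len(downloads))      # 1
```

Only the first load reaches the simulated network; the second one is served entirely from the local disk.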

11:29So that's kind of the concept of caching in a slightly different way.

11:33You know, the fast, expensive thing is your, is your hard drive.

11:36The slow, cheap thing is the internet.

11:38You can imagine there are many, many levels.

11:40I mean, this diagram that I drew, there's three levels.

11:43In your CPU, you have CPU registers, the fastest thing.

11:48And then you have L1 cache, which is almost as fast as the registers.

11:52And, and you have like some amount of kilobytes that you can store.

11:55And then your CPU, your CPU also has L2 cache usually.

11:59And then depending on what kind of CPU you have, you might have L3 cache.

12:02So that's four levels right there.

12:04And then beyond that, you know, there's system RAM that can cache some data for your CPU.

12:09It's a lot slower than the CPU caches, but you know, it's larger.

12:13And, you know, eventually we can just get basically down to your hard drive in the slowest case.

12:19So let's see, what else can I tell you real fast?

12:22Oh, I wanted to show you a diagram.

12:24Let's see.

12:26So let me show you this diagram right here.

12:28I found a diagram of this on Wikipedia.

12:31It's called computer memory hierarchy.

12:34And so this is what I talked about in the beginning of the video.

12:36We want to talk about like the hierarchy of memory.

12:38The hierarchy is kind of like a pyramid.

12:40The reason it's a pyramid is so that we can kind of remember that stuff on the top is very small.

12:45Notice how the thing on the top, it has less area to work with.

12:49And the things at the bottom have a lot more area to work with.

12:52So the thing that's written in the middle here, actually, let me go to, let me go to the actual image of it.

13:00Because I think I like it the way it looks a little bit better.

13:03Oh, no.

13:04I wanted to do this because I already had it.

13:07Okay.

13:09So the things at the top are, have a very small size and a very small capacity.

13:16And you can imagine some examples are processor registers like our CPU registers.

13:21But they're very, very fast and they're very expensive.

13:24And there's this little term written down here or this little description saying power on immediate term.

13:29That just basically means this is volatile memory.

13:31And so, you know, the moment you disconnect power from your computer,

13:38You just lose that.

13:39It's just gone forever.

13:40Then kind of, you know, getting a little bit bigger and also a little bit cheaper, but also a little bit slower.

13:45We have the small size, small capacity category.

13:48You could imagine this is your processor cache, which is just, you know, the cache on your processor is just a little data storage area.

13:56You know, like I said before, that sort of backs up the CPU registers.

13:59It's where the CPU first looks for data that isn't already in its registers.

14:03Still very fast and very expensive, only a little bit, you know, cheaper and slower than the CPU registers.

14:10Then we have like medium size and small size and large size and, you know, whatever different sizes and categories.

14:18So kind of going from the fastest to the slowest here, you know, once we hit cache, then we can consider this to be, you know, still pretty fast, way faster than the disk.

14:28But random access memory or system RAM is just a lot slower than the CPU, but it's, you know, faster than the disk.

14:36And then you can imagine other things that are a little bit slower, but a little bit larger could be flash memory, USB memory and so forth.

14:44And then and then pretty slow here.

14:46The slowest that I ever deal with personally are hard drives. Even your SSD, being way faster than your old disk drives, is still considered very slow compared to system RAM, but they're huge, right?

14:57And then kind of like at the very bottom of this pyramid is tape backup, incredibly slow, but incredibly large for the money.

15:05And that's why this is sometimes used still.

15:07And that, my friends, is the hierarchy of memory.

15:11The hierarchy of memory.

15:12I hope you learned a little bit of stuff and you had a little bit of fun.

15:16I'll see you in the next video.

15:20Hey, everybody.

15:21Thanks for watching this video again from the bottom of my heart.

15:24I really appreciate it.

15:25I do hope you did learn something and have some fun.

15:28If you could do me a small little favor, could you please subscribe and follow this channel or these videos or whatever it is you do on the current social media website that you're looking at right now?

15:39It would really mean the world to me and it'll help make more videos and grow this community.

15:44So we'll be able to do more videos, longer videos, better videos, or just I'll be able to keep making videos in general.

15:50So please do me a kindness and subscribe.

15:54You know, sometimes I'm sleeping in the middle of the night and I just wake up because I know somebody subscribed or followed.

16:00It just wakes me up and I get filled with joy.

16:02That's exactly what happens every single time.

16:04So you could do it as a nice favor to me, or you could troll me if you want to: just wake me up in the middle of the night. Just subscribe and then I'll just wake up.

16:11I promise that's what will happen.

16:13Also, if you look at the middle of the screen right now, you should see a QR code, which you can scan in order to go to the website, which I think is also named somewhere at the bottom of this video.

16:23And it'll take you to my main website where you can just kind of like see all the videos I published and the services and tutorials and things that I offer and all that good stuff.

16:32And if you have a suggestion for clarifications or errata or just future videos that you want to see, please leave a comment or if you just want to say, Hey, what's up?

16:44What's going on?

16:45You know, just send me a comment, whatever.

16:47I also wake up for those in the middle of the night.

16:49I wake up in a cold sweat and I'm like, it would really mean the world to me.

16:54I would really appreciate it.

16:55So again, thank you so much for watching this video and enjoy the cool music.

17:01As, as I fade into the darkness, which is coming for us all.
