Lecture Recording

Text Transcription

October 13, 2021, 8:05 PM | 57 min 9 s

Keywords: performance, computer performance, latency, computer using latency, throughput versus latency, car latency, performance metrics, throughput, instruction count, cycles per instruction, instruction, clock frequency, latency concept, run memory instruction, instruction cycle, integer instruction, cycles integer, computer architecture design


Jishen Zhao 00:00

Jishen Zhao 00:14
OK, I think the recording is already on. Last lecture, I gave you a very brief introduction to computer architecture and what it is. Basically, you can think of computer architecture as this picture: as the architects, we make a plan to connect all of those pieces. We have the goal of designing a type of computer and using that computer to run one or several types of applications. The materials are technologies, so over this quarter we’ll learn about some of the technologies we use to design a computer, especially those listed here. As you will see later on, the keywords are SRAM and DRAM. Those are the technologies we’ll touch on a little bit, but we will not get into too much detail about hardware such as logic gates and circuit technologies. Those are covered by hardware classes, and I will not deal with them. So we’ll just touch on the basic concepts of technology this quarter.

Jishen Zhao 01:29
OK, so today is the first lecture. Today we’ll be talking about one of the goals of designing a computer: making the computer run fast. That is what we call performance. The better the performance, the better the computer we design. So what are the components of performance, and how do we measure or evaluate performance if we want to design a computer? This is what we’ll be discussing today. Basically, we’ll discuss two types of performance metrics, latency and throughput, and based on those two basic metrics we’ll develop a more advanced metric, speedup, and do some calculations. After that, later in today’s class, we’ll focus on how to evaluate and measure the performance of a CPU. I think everyone knows what a CPU is; this is basically measuring the performance of a computer. So in

Jishen Zhao 02:33
today’s class we’ll have a lot of in-class exercises and examples. I’ll ask you to do some simple calculations. In computer architecture we don’t have very complex calculations, just simple ones. So I hope you engage yourself in the class and use class time really well. With those exercises, hopefully you can get all the material covered and won’t need to spend too much time reviewing after class.

Jishen Zhao 03:03
All right, so first of all, what are the two basic performance metrics? If you have a computer and you want to measure its performance, what do we use? We use latency and throughput. Some of you may have heard of them or still remember them from your undergraduate-level classes, but do you know the difference, and in

Jishen Zhao 03:24
what kind of situation we use which performance metric to measure the performance of a computer? Think about the difference between the two, latency versus throughput. Latency is a concept of time, while throughput is a rate. For latency, we focus on one fixed task and ask how long it takes to complete that task, while for throughput we focus on many tasks. This is the key difference: latency focuses on just one task, while with throughput

Jishen Zhao 04:00
we focus on a lot of tasks. So think of the difference by comparing it with driving your car on a highway. Say you drive on the highway, and there are a lot of other cars on the highway as well. If you want to measure the latency of driving from city A to city B, that is the latency: how long it takes. But if you want to measure the throughput of this highway, the performance of the highway, you actually measure how many cars are running at the same time over that section of highway. So that is throughput.

Jishen Zhao 04:39
And think more about the difference in terms of measuring the performance of a computer: which one should we use? There are different types of computers, and they run different types of applications. Typically, you use latency to measure applications that care more about how long it takes to complete one complex task, for example a scientific program that solves really complex equations. It takes quite a long time, but it is only a single task: a single equation set or a single scientific program. In this kind of application, we measure the computer using latency. But if you think of a computer running an application like a web server, like Google’s servers, you care more about throughput: how many web service requests you are accommodating at the same time. This is what we want to know or care about when we measure the performance of a web server. So now you know the difference, right?

Jishen Zhao 05:52
I’ll give you one more example, an in-class exercise to work on, to help you understand more of the difference between latency and throughput. Look at this example, and I’ll ask you to do some simple calculation. We’ll do an in-class exercise now. The question is: I have a job where I want to move people from one place to the other. The distance between the two places is 10 miles, and I want to move the people for a round trip. That means I will move the people 20 miles in total. I have two methods, car and bus, and here are the numbers for capacity and speed. So the question is: which one has better performance, car or bus? I’m going to open up a poll, a polling question for everyone to work on during class. I think I already started it, so click on whichever option you think has better performance: car, bus, both, or neither. I think it’s a multiple-choice answer. Answer based on your understanding and your

Jishen Zhao 07:19
calculation. It is an anonymous poll, so I will not require names and will not check your participation. Just feel free to select whichever choice you want.

Jishen Zhao 07:45
So I will close the poll in five seconds: 5, 4, 3, 2, 1. End the poll, OK. Share the results. So now I can see the results: car, bus, and both were all chosen. This question is tricky. I think there’s a question in the chat asking which answer is better.

Jishen Zhao 08:20
Yes, that’s fine. All right, so the question is tricky because, remember, we are discussing latency and throughput, and I didn’t say whether I want you to measure performance in terms of latency or throughput. So the answer is none of them, because I didn’t tell you which one I want to use to measure the performance.

Jishen Zhao 08:44
So when you calculate latency and when you calculate throughput, your answer to which one is better will be different. If we calculate the latency of moving one person, or a small number of people, from one place to the other, the car is definitely faster. But if we calculate throughput performance, the bus is definitely better, because with the bus we can move more people, or we can move the same group of people from one place to the other much faster. The way to calculate latency here is simply the distance divided by the speed, so the car is way better. But if we want to calculate throughput, you do the reverse and also consider the capacity, along with the speed and the distance. This is the result you will get: 15 people per hour versus 60 people per hour.
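
The calculation described above can be sketched as follows. The recording does not state the capacities and speeds shown on the slide, so the values below are assumptions, picked only because they reproduce the 15 and 60 people per hour quoted in the lecture; the function names are likewise illustrative.

```python
# Car-versus-bus exercise: latency vs. throughput.
# ASSUMPTION: the capacities and speeds are illustrative values, chosen so the
# throughputs match the 15 and 60 people/hour quoted in the lecture.
ONE_WAY_MILES = 10.0  # distance between the two places (from the lecture)


def latency_hours(speed_mph: float) -> float:
    """Time for one passenger to travel one way."""
    return ONE_WAY_MILES / speed_mph


def throughput_people_per_hour(capacity: int, speed_mph: float) -> float:
    """People delivered per hour when the vehicle shuttles back and forth.

    Each load costs a full round trip, because the vehicle has to return
    before it can pick up the next group.
    """
    round_trip_hours = 2 * ONE_WAY_MILES / speed_mph
    return capacity / round_trip_hours


car_latency = latency_hours(speed_mph=60.0)  # assumed car speed
bus_latency = latency_hours(speed_mph=20.0)  # assumed bus speed
car_throughput = throughput_people_per_hour(capacity=5, speed_mph=60.0)
bus_throughput = throughput_people_per_hour(capacity=60, speed_mph=20.0)

print(f"car: {car_latency * 60:.0f} min one way, {car_throughput:.0f} people/hour")
print(f"bus: {bus_latency * 60:.0f} min one way, {bus_throughput:.0f} people/hour")
```

Under these assumed numbers the car wins on latency and the bus wins on throughput, which is the point of the exercise: the ranking depends on the metric.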

Jishen Zhao 10:01
All right, so hopefully that gives you some idea of the difference between latency and throughput: focusing on just one person versus a lot of people. And the answer to which one is better can be different when you measure the same case with a different metric.

Jishen Zhao 10:24
Here is another example, so we’re going to do another polling question. Think of another case where you want to send data from one place to the other, say from the US to the UK, or from San Diego to San Jose. Which method do you want to use?

Jishen Zhao 10:49
And things could be different depending on where you want to send the data, how long the distance is, how fast your Internet is, and which method you want to use. So we’re going to do another polling question: which one do you want to use? And I will actually

Jishen Zhao 11:19
pull up the polling question. I think I launched the polling question, so feel free to pick your answer, whichever one you want. And someone in the chat suggests a pigeon with a USB stick. That’s a cute idea. I should add that as an option too.

Jishen Zhao 11:57
OK, I think we don’t have a clear winner. We have a lot of FTP, we have Google Drive, and others. I’m curious to learn what the other methods are. I am going to close the poll in five seconds: 5, 4, 3, 2, 1. Share the results. So here’s the result: the winner is FTP.

Jishen Zhao 12:38
I see there are more comments in the chat: glue the USB stick to a drifting coconut. Those are good answers, and attractive ones too. So I’ll show you my answer. I’m going to make it even easier: I just send it by FedEx overnight express shipping. Why is that, and what’s the relationship between this answer, or this question, and throughput versus latency?

Jishen Zhao 13:13
I can show you; just do the calculation. This is the amount of data you want to send, and this is the speed. That may remind you of the bus versus car example we saw earlier. In terms of sending the data at this speed over this distance, which one do you want to use to measure performance, latency or throughput? The difference is whether we care about just a single bit of data or a lot of bits of data. We definitely have a lot of bits of data: we have 10 terabytes of data. So that means we actually care about a lot of

Jishen Zhao 14:09
people, a lot of bits of data. So that means we should calculate the performance using throughput rather than latency. When we calculate using latency, sending over FTP or Google Drive and so forth over the Internet is definitely faster if we have a good Internet connection. But consider throughput: if we send the data over the Internet group by group, batch by batch, even

Jishen Zhao 14:39
at that speed it will still take a long time. So instead we can send the data by shipping, or by a pigeon, if the pigeon will not lose its way on the way over. It can carry the 10 terabytes of data all together.

Jishen Zhao 15:01
So it has better throughput. Actually, this is not just my solution; this is Amazon’s solution. If you go and check this Amazon tool, it can actually help you calculate which method is best for sending data. So next time you want to send data, just go and check this tool, which can help you make a decision about

Jishen Zhao 15:28
which type of method to use. All right, so far we have learned latency and throughput, and the most important thing for you is to understand the difference between them, so that next time, if I ask you to measure the performance of a certain computer design, or whatever, even in your daily life, you know which one is best to use to measure the performance. And if you are still wondering, think of our first example of driving on the highway. What do you care about? Do you care about only your car’s latency, or do you care about the performance of the highway, its throughput?

Jishen Zhao 16:13
All right, so now that you have learned about latency and throughput, the next thing is to think about what we can build on top of them and what other metrics may be helpful for measuring or comparing performance. So we’re going to introduce another metric, which is speedup. Speedup is something built on top of latency and throughput, which is why we have latency speedup versus throughput speedup. So if I ask what the speedup of computer A versus computer B is, you should ask which speedup I want you to measure, latency or throughput. When you calculate latency speedup, you just divide the two latencies, while to calculate throughput speedup you divide the two throughputs. People usually express speedup in different ways; you can think of speedup as “faster than” or “slower than”. Those are all the equations, but I don’t want you to memorize the equations; just understand their meaning, and it should be easy to figure them out yourself rather than remembering them. So in order to get a better understanding of speedup, forget about those equations and just think of what it means to be a speedup. I’ll have another example.

Jishen Zhao 17:44
Today we have a lot of examples; if you just follow the examples and do the calculations during class, I don’t think you even need to review after class. This is another example, an in-class exercise, before we move on to the next topic of today’s class. Rather than doing a polling question, I’ll just give you a couple of minutes to do your calculation and see if you can get the same answer. The question uses the same car versus bus example as our original first example, but this time I ask about speedup: latency speedup versus throughput speedup. So I’ll give you a couple of minutes, less than two minutes, to do the calculation. And now we’re moving on.
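
A minimal sketch of the two speedup calculations for the car versus bus exercise. The throughputs (15 and 60 people per hour) are the figures quoted earlier in the lecture; the one-way latencies of 10 and 30 minutes are assumed values consistent with those figures, since the slide numbers are not in the recording. The function names are illustrative.

```python
def latency_speedup(latency_a: float, latency_b: float) -> float:
    """Speedup of A over B by latency: B's time divided by A's time."""
    return latency_b / latency_a


def throughput_speedup(throughput_a: float, throughput_b: float) -> float:
    """Speedup of A over B by throughput: A's rate divided by B's rate."""
    return throughput_a / throughput_b


# ASSUMPTION: one-way trip times in minutes (not stated in the recording).
CAR_LATENCY_MIN, BUS_LATENCY_MIN = 10.0, 30.0
# People moved per hour, as quoted in the lecture.
CAR_PEOPLE_PER_HOUR, BUS_PEOPLE_PER_HOUR = 15.0, 60.0

car_over_bus_latency = latency_speedup(CAR_LATENCY_MIN, BUS_LATENCY_MIN)
bus_over_car_throughput = throughput_speedup(BUS_PEOPLE_PER_HOUR, CAR_PEOPLE_PER_HOUR)

print(f"latency speedup, car over bus: {car_over_bus_latency:.2f}x")
print(f"throughput speedup, bus over car: {bus_over_car_throughput:.2f}x")
```

Note the direction of each ratio: for latency the smaller number is better, so the slower system goes in the numerator; for throughput the larger number is better, so the faster system goes in the numerator.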

Jishen Zhao 19:27
OK, I think the question is relatively easy, so we’ll move on. But if you need more time to calculate, feel free to come back here and redo the calculation after class when you review the lecture videos and slides. This example is a little bit harder than the previous one, but not really that much harder, and it’s more related to computer performance. This question asks you to calculate speedup by comparing two programs, program A and program B. I also want to use this example to review some of the concepts

Jishen Zhao 20:11
we’re going to use in this class: cycles. Hopefully everyone still remembers what cycles are. But if you forget, cycles are execution time multiplied by clock frequency: seconds multiplied by hertz.

Jishen Zhao 20:28
So what is a cycle count, latency or throughput? Think about it. It’s latency. So for the speedup of A over B, you simply divide the two cycle counts and you get the latency speedup. You can also calculate the percentage speedup; I will not get into much detail, as I think it’s pretty straightforward.

Jishen Zhao 20:57
What is not as straightforward is this question. Think about it: if another program, program C, is 50% faster than A, how many cycles does C run for? I’ll give you one minute, also during class. Try to calculate it yourself: how many cycles does program C run for? I will show the answer shortly, and you can see if you got the right answer yourself.

Jishen Zhao 22:12
I think in the chat people have already sent in many answers, which is good. We have different answers: 133 versus 300.

Jishen Zhao 22:33
Yeah, originally this was not a polling question, but we have two different answers, 133 versus 300 cycles. Let’s do a poll. Let me see if I can add a new polling question. It seems I cannot, since one has already been started in this meeting.

Jishen Zhao 23:04
Yeah, there is a question in the chat: if C is faster than A, it must take fewer cycles than A, right? So the answer is 133. If you didn’t get 133 as your answer, you can calculate again after class. Think about how we did it with the speedup of A and B, and just do the reverse calculation, and you should be able to get 133 cycles.
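
A short sketch of the forward and reverse calculation, assuming program A takes 200 cycles and program B takes 350 cycles (the two cycle counts quoted later in the lecture; which program each belongs to is inferred from the 133-cycle answer). The variable names are illustrative.

```python
# ASSUMPTION: A = 200 cycles, B = 350 cycles, inferred from the figures
# quoted in the lecture and from the 133-cycle answer for program C.
cycles_a = 200
cycles_b = 350

# Cycle count is a latency, so the speedup of A over B is B's cycles / A's cycles.
speedup_a_over_b = cycles_b / cycles_a

# "C is 50% faster than A" means the speedup of C over A is 1.5:
#     cycles_a / cycles_c = 1.5   =>   cycles_c = cycles_a / 1.5
speedup_c_over_a = 1.5
cycles_c = cycles_a / speedup_c_over_a

print(f"speedup of A over B: {speedup_a_over_b:.2f}")
print(f"cycles for C: {cycles_c:.0f}")
```

The common wrong answer of 300 comes from multiplying A's cycle count by 1.5 instead of dividing; a faster program has to end up with fewer cycles, not more.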

Jishen Zhao 23:40
And if you still have an issue getting the answer, feel free to post the question on Piazza so that Ian and I can help you. You’ll have homework questions harder than this, not as straightforward as

Jishen Zhao 23:53
the examples here, but the key idea is the same as what we cover in the in-class exercises here. So this is speedup. Just one last note about speedup: what if you calculate the speedup and your answer is less than one? It’s easy, right? That means A is slower than B. All right, actually this is really the last note about speedup. Sometimes people ask the question in a tricky way: instead of speedup, or faster or slower than, people ask about increase or decrease. Those are just a wording game, but pay attention, because if people ask you about increase or decrease, the answer will be very different. Increase or decrease means you have a baseline, because you increase or decrease from a baseline, so depending on which baseline you choose, your answer will be different. If we calculate the increase in cycles, your baseline should be the smaller cycle count, so you divide by 200, which is the smaller number of cycles. But a decrease means your baseline is the larger cycle count, so you divide by 350 rather than 200. So your percentages in terms of increase or decrease are different. Just pay attention to the wording.
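
The baseline point can be illustrated with the 200 and 350 cycle counts from the lecture. This is a minimal sketch; the function names are illustrative.

```python
def percent_increase(smaller: float, larger: float) -> float:
    """Growth from the smaller value to the larger one; baseline = smaller."""
    return (larger - smaller) / smaller * 100.0


def percent_decrease(smaller: float, larger: float) -> float:
    """Drop from the larger value to the smaller one; baseline = larger."""
    return (larger - smaller) / larger * 100.0


FEWER_CYCLES, MORE_CYCLES = 200, 350  # cycle counts quoted in the lecture

increase = percent_increase(FEWER_CYCLES, MORE_CYCLES)  # difference / 200
decrease = percent_decrease(FEWER_CYCLES, MORE_CYCLES)  # difference / 350

print(f"going from 200 to 350 cycles is a {increase:.1f}% increase")
print(f"going from 350 to 200 cycles is a {decrease:.1f}% decrease")
```

The same 150-cycle difference gives two different percentages, because the divisor changes with the baseline.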

Jishen Zhao 25:40
Right, so this is what we have learned: speedup, which we build on top of the two very basic metrics, latency and throughput. So far you should have learned the three key metrics we use to measure performance, and these are all the performance metrics you need to know for this class this quarter, no more. With that, we can do fancier calculations, namely average performance. If you still remember your mathematics classes, there are different ways of calculating an average or mean: you have the arithmetic mean, the harmonic mean, and the geometric mean. So which equation is used to calculate which metric? Is it different?

Jishen Zhao 26:35
Just remember: when you calculate the average latency, use the arithmetic mean. When you calculate the average throughput, use the harmonic mean. And what is the harmonic mean? You take one over each throughput, add them together, and divide N by that sum. That sounds a little bit complex, but we’ll have more exercises, so hopefully that will help you get the hang of it. The reason for using the harmonic mean to calculate average throughput is this: when you calculate an average, you need to add things together, and if it’s the arithmetic mean you would

Jishen Zhao 27:14
add the throughputs together. There’s no physical meaning in adding throughputs, because the unit of throughput is one over time, one over seconds, one over latency, right? You can only add latencies. You can only add seconds, not one over seconds. That’s why you need to take the reciprocal

Jishen Zhao 27:37
of each throughput, add them together, and then take the reciprocal again, so that the addition has a physical meaning. That’s why we use the harmonic mean to calculate the average throughput. That is a little bit tricky, but for this class, at least remember that to calculate the average throughput you use the harmonic mean. That’s sufficient. And for the last one, speedup, we use the geometric mean, because speedup is a ratio, and whenever you calculate the average of ratios you use the geometric mean. So that’s pretty straightforward.
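
Written out, the three averages described above take the following standard forms. The symbols are assumptions introduced here for illustration: $L_i$ are latencies, $T_i$ are throughputs, $S_i$ are speedups, and $N$ is the number of measurements.

```latex
\[
\bar{L}_{\text{arithmetic}} = \frac{1}{N}\sum_{i=1}^{N} L_i,
\qquad
\bar{T}_{\text{harmonic}} = \frac{N}{\sum_{i=1}^{N} \dfrac{1}{T_i}},
\qquad
\bar{S}_{\text{geometric}} = \Bigl(\prod_{i=1}^{N} S_i\Bigr)^{1/N}
\]
```

In the harmonic mean, each $1/T_i$ has units of time per task, so the sum in the denominator is a sum of times, which is the physically meaningful addition the lecture refers to.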

Jishen Zhao 28:10
I think there’s one more example about why we want to use the harmonic mean to calculate average throughput. Again, it’s a daily-life example rather than a computer one, and hopefully that will be easier for everyone to understand. I think you do this; at least I do, to calculate my average speed sometimes when I drive my car over a distance at different speeds. When you calculate average speed, it’s not just adding them together and dividing by two. Actually, if you take the two speeds, each driven for one mile, add them together and divide by two, you get 60 miles per hour, but this is not your actual average speed.

Jishen Zhao 29:06
So think about why, and how you actually calculate average speed. Speed is distance over time, so when you calculate your average speed, you add the distances together, add the times together, and divide. If we do that, adding the two miles together and adding the times you spent on each mile, you get 45 miles per hour. And think about how you get this 45 miles per hour: this is exactly the harmonic mean equation. So hopefully this daily-life example

Jishen Zhao 29:49
gives you a better idea of why we use the harmonic mean, rather than the arithmetic mean, to calculate throughput. Speed is a throughput concept, right? It’s the inverse of time.
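
The driving example can be checked numerically. The recording gives only the two results (60 and 45 miles per hour), not the individual speeds, so the 30 mph and 90 mph used below are assumed values chosen to reproduce those results.

```python
# ASSUMPTION: speeds for each one-mile leg; the recording states only the
# resulting averages (60 mph arithmetic, 45 mph actual).
speeds_mph = [30.0, 90.0]
MILES_PER_LEG = 1.0

# Naive average: add the speeds and divide by their count.
arithmetic_mean = sum(speeds_mph) / len(speeds_mph)

# Actual average speed: total distance divided by total time.
total_miles = MILES_PER_LEG * len(speeds_mph)
total_hours = sum(MILES_PER_LEG / s for s in speeds_mph)
average_speed = total_miles / total_hours

# Harmonic mean: N divided by the sum of reciprocals.
harmonic_mean = len(speeds_mph) / sum(1.0 / s for s in speeds_mph)

print(f"arithmetic mean: {arithmetic_mean:.0f} mph")
print(f"distance / time: {average_speed:.0f} mph")
print(f"harmonic mean:   {harmonic_mean:.0f} mph")
```

Because each leg covers the same distance, total distance over total time reduces algebraically to the harmonic mean of the speeds, which is why the last two values agree.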

Jishen Zhao 30:07
Right, so this is what we learned: we calculate the average of the performance metrics latency, throughput, and speedup with different types of equations: arithmetic mean for latency, harmonic mean for throughput, and geometric mean for speedup. So next time I ask you a question in homework or exams, don’t mix up which equation to use to calculate the average. They are different. I think that’s pretty much it for the basic performance metrics. Before we continue to talk about CPU performance, let’s take a short break of 10 minutes. It’s now 4:08, so we’re going to come back after the break at 4:18 to continue discussing CPU performance. See you later. So in the next session of today’s class we’ll be discussing how to map the latency, throughput, and speedup performance metrics, which we just learned, to measuring or evaluating CPU or computer performance.

Jishen Zhao 31:35
All right, so when we measure the performance of a CPU or computer, we also use latency and throughput, but we just give them different names, or we have different variations of latency and throughput. So what are they? Let’s consider latency first, and then throughput is easy, because throughput is conceptually the inverse of latency. For latency, think of a computer program. What is the latency of running a program? It’s seconds per program, right? You want to know how long it takes in seconds.

Jishen Zhao 32:19
So it’s seconds per program: that’s the latency of a computer running that particular program. And we can factor it out with different concepts, or different variables, that we use in computers: program, instruction count, and cycles. We factor it out by introducing those new variables, but if we consider the equation itself, it is still seconds per program, because the instruction count and the cycles simply cancel out and you still get seconds per program.
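
The factored latency equation described above can be written as follows. This is a reconstruction from the spoken description (three brackets whose instruction and cycle terms cancel), not a copy of the slide.

```latex
\[
\frac{\text{seconds}}{\text{program}}
=
\frac{\text{instructions}}{\text{program}}
\times
\frac{\text{cycles}}{\text{instruction}}
\times
\frac{\text{seconds}}{\text{cycle}}
\]
```

The three factors are the instruction count, the cycles per instruction (CPI), and the clock period, which is the inverse of the clock frequency.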

Jishen Zhao 33:13
But why would we want to come up with this more complex equation by introducing instruction counts and cycles? Because if we look at those three brackets, they are different concepts that we use in computer

Jishen Zhao 33:26
design. So first take a look at the first bracket: that is the number of instructions per program, that is, how many instructions are in a program, or what we call the instruction count. Think about the instruction count.

Jishen Zhao 33:45
It can impact the latency, because the latency equation has this instruction count term. And think of what can impact the instruction count. It’s impacted by how you write the program: if you write a more complex program, you have more instructions. It’s impacted by how the compiler translates the program into machine code; we’ll learn about machine code and the ISA later on. And it’s also impacted by the ISA, meaning how you design the instructions that can be run by a particular computer or a particular type of computer architecture, which we’ll learn about shortly, next week. All of those can impact the instruction count per program, the first factor, while the second

Jishen Zhao 34:40
bracket is cycles per instruction. This is impacted, again, because there is the instruction count, by how you write the program, the compiler, and the ISA. But as for the cycles, we just had that example:

Jishen Zhao 34:57
we defined cycles as determined by the frequency of the computer and the latency of a program. The frequency of the computer is impacted by the architecture design, the hardware design, and how fast the hardware can run.

Jishen Zhao 35:16
So the microarchitecture, or architecture, also impacts the cycles-per-instruction factor, and we give cycles per instruction a particular name: CPI, cycles per instruction. This is something you should remember; it’s an important concept for latency evaluation in computer design. We’ll come back to it shortly, after finishing up the third bracket here: seconds per cycle. It’s called the clock period, which is the inverse of the clock frequency. This is impacted by the microarchitecture and by the hardware device technology, which is something

Jishen Zhao 36:10
we will not discuss in this class. For us, the focus is more on CPI and instruction count: the microarchitecture design that impacts the cycles, the ISA, and, touching on it a little bit, how the program gets translated by the compiler. All three of those will impact the latency. But given a computer design and a program that is already written, the instruction count is fixed,

Jishen Zhao 36:50
because it’s more of a software decision, and the clock frequency is also fixed, because it’s decided more by the hardware technology underneath the device. For computer architecture design, what we can play around with a lot is the cycles per instruction,

Jishen Zhao 37:09
the CPI. So when we evaluate or consider the performance of a computer design, we use CPI to measure the latency. So CPI is a very important latency concept for CPU performance. And in terms of how to think about better performance,

Jishen Zhao 37:41
it definitely means lower latency, and lower latency means we have lower CPI. But achieving lower latency is really difficult. I listed a couple of the difficulties here, but we’re not getting into detail today; I just put them here. We’ll actually be talking about the details of those issues, or challenges, later on over this quarter. So this is

Jishen Zhao 38:13
the concept of latency that we use to measure computer performance: we use CPI. Again, CPI is cycles per instruction. So when you calculate the latency of a computer, you use CPI. And like I mentioned earlier, throughput is, in terms of the unit or concept, the inverse of CPI, so we use IPC to calculate throughput. It’s one over CPI, or instructions per cycle.

Jishen Zhao 38:55
This may not be the whole truth, but you’ll find that when a lot of people measure or calculate performance, they like IPC more than CPI, because with IPC you get a bigger number: you have a lot more instructions divided by cycles. So now we know the two performance metrics we use specifically to calculate the performance of a computer design:

Jishen Zhao 39:28
CPI and IPC. I’ll give you one more example. It’s again a calculation that I will give you a little time to do yourself. It’s not a polling question, just a way for you to get a better understanding of what CPI is and how to calculate it when you have a computer and a program to run on it. So in this

Jishen Zhao 39:54
question that I want you to calculate yourself, the program is fixed, so the instruction count is fixed, and the clock period and clock frequency are already fixed, but I ask you to calculate the CPI. This program has different types of instructions: integer, memory, and floating-point operations. For each type of instruction, because of the hardware design, the compiler, and the ISA design, the number of cycles, the CPI, is different. Running an integer instruction takes one cycle, a memory instruction takes two cycles, and a floating-point instruction takes longer: it needs three cycles. So the question is: what is the overall CPI for running this one program, with all of these instructions, on this particular computer? I’ll give you two minutes to calculate it yourself and see if you can get the same answer as mine, or you can read my equation here and think about why I calculate CPI like this.

Jishen Zhao 41:12
The key idea behind calculating CPI is to think of the name: cycles per instruction. So when calculating CPI, you always work out what the cycles are and what the instructions are, and you divide the two. I’ll give you a couple of minutes to think about why.

Jishen Zhao 42:49
OK, that’s the two minutes. So think about why I calculate CPI like this. This is the method I typically use, since it’s easier to understand. I think of how many instructions there are in total in the program and how many cycles in total it takes to run those instructions, and then I divide the two. So I make an assumption: I have a program that has only three instructions, one integer, one floating-point, and one memory operation. That actually matches the question, right? I can assume any number of instructions, but I just make it simple: one integer instruction, one floating-point operation, one memory operation.

Jishen Zhao 43:32
So there are three instructions in total. Now think: to run those three instructions, how many cycles do I need? I need one cycle, two cycles, and three cycles. Add them together and divide, and I get a CPI of 2.

Jishen Zhao 43:45
So if you want to calculate CPI, you can use my method: assume a number of instructions in the program, calculate the total number of cycles it takes to run that many instructions, and divide the two.

Jishen Zhao 44:00
My TAs are smarter than me, so most of them like to use another method to calculate CPI: they use percentages. For any number of instructions in the program, 33%, one third, of the instructions are one-cycle integer operations, one third are memory operations, and one third are floating-point operations.

Jishen Zhao 44:32
So you can also calculate one third times one, plus one third times two, plus one third times three, and together it’s also 2, which is the CPI. Either way is fine. Next, I’m going to give you a more complex example, actually an in-class exercise for you. This example is more like the typical homework question or exam question you will have.
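
The two ways of computing CPI for the three-instruction-type exercise (1 cycle for integer, 2 for memory, 3 for floating point, in equal proportions) can be sketched as below. The function and variable names are illustrative, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

# Cycles per instruction type, as given in the lecture exercise.
CYCLES = {"integer": 1, "memory": 2, "floating_point": 3}


def cpi_from_counts(counts: dict) -> Fraction:
    """Total cycles divided by total instructions, for an assumed program."""
    total_cycles = sum(n * CYCLES[kind] for kind, n in counts.items())
    total_instructions = sum(counts.values())
    return Fraction(total_cycles, total_instructions)


def cpi_from_mix(mix: dict) -> Fraction:
    """Weighted sum: each type's share of instructions times its cycle cost."""
    return sum(share * CYCLES[kind] for kind, share in mix.items())


# Method 1: assume a tiny program with one instruction of each type.
cpi_counts = cpi_from_counts({"integer": 1, "memory": 1, "floating_point": 1})

# Method 2: use the instruction mix directly (one third each).
third = Fraction(1, 3)
cpi_mix = cpi_from_mix({"integer": third, "memory": third, "floating_point": third})

print(f"CPI from counts: {float(cpi_counts)}")
print(f"CPI from mix:    {float(cpi_mix)}")
```

Both methods are the same weighted average; assuming a concrete instruction count just makes the weights explicit.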

Jishen Zhao 45:00
So, given a processor, or a computer, with instruction percentages like this and the number of cycles for each type of instruction like these, which change would improve performance more, plan A or plan B? The solution, or the direction for thinking about this question, is to calculate the CPI or the IPC. It doesn’t matter which, because for computer designers the IPC is the inverse of CPI, so either one works.

Jishen Zhao 45:36
So I can calculate the CPI here, and the smaller the CPI, the better the performance, because CPI is a latency concept. I’ll again give you some time, a bit longer this time: four minutes to calculate it yourself. What is the CPI of plan A? What is the CPI of plan B? What is the CPI of the original design without any performance optimization? Then compare those and come up with the answer: which one is better, A or B?

Jishen Zhao 46:26
Anything already showing the base by accident. And this is my TAS measure. By percentage of 50%, 20%, 10%, 20%, so you can use my password. Also, so if I were to do the to calculate this question.

Jishen Zhao 46:52
I would assume that the program has 100 instructions in total: 50 are integer instructions, 20 are loads, 10 are stores, and 20 are branches. Then I calculate how many cycles in total it takes to run this whole program and divide. You will get 2 as well.

Jishen Zhao 48:09
Oh, we've started getting answers. Hold on, hold on to your answer yourself for now, and I'll show you my answer. I think we have pretty good answers already. Just wait a little longer, about one more minute.

Jishen Zhao 48:32
That's good, we're starting to get good answers. This is a very typical exam or homework question. A homework question may be a little bit harder, or more complex and fancier, but this is a very typical exam question. So if you're good with this in-class exercise,

Jishen Zhao 48:51
and you still remember it at the time of the final exam, or you can still find this example in the lecture materials, you should be good, at least with this type of question, for the final exam.

Jishen Zhao 49:08
And this is a very typical case where you would use this calculation. If you want to optimize your program, it's sometimes not only hardware optimization that you can do; there is also software optimization. Say, when you write your program, you can do less branching, fewer if-then-else statements, or you can do fewer memory accesses to optimize your performance. So this is a useful way to decide what type of optimization you want to use in your program as well.

Jishen Zhao 49:39
Alright, I think time is up. It's already 4 minutes, so I will show you my answer. The answers posted in the chat are correct: it's 1.8 for plan A and 1.6 for plan B. So which one is the winner? Obviously it's B. B is the winner because B has the lower CPI. If you still have an issue calculating the CPI, come back to this example question after class, review it, and calculate it again by yourself. We will have more exercises for you in the homework questions later during this quarter.
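
A minimal sketch of this exercise in Python, using the "assume 100 instructions" method. The instruction mix (50 integer, 20 load, 10 store, 20 branch) and the resulting CPIs (2, 1.8, and 1.6) come from the lecture. The per-type cycle costs and what each plan changes are not stated in this part of the recording, so the values below are assumptions chosen only because they reproduce those three results; the names `BASE_CYCLES`, `PLAN_A_CYCLES`, and `PLAN_B_CYCLES` are hypothetical.

```python
# Instruction mix from the lecture, expressed as instructions out of 100.
MIX = {"integer": 50, "load": 20, "store": 10, "branch": 20}

# ASSUMPTION: cycle costs per instruction type. These are not given in the
# transcript; they are picked so the base CPI comes out to the stated 2.
BASE_CYCLES = {"integer": 1, "load": 5, "store": 1, "branch": 2}

# ASSUMPTION: plan A makes branches cheaper, plan B makes loads cheaper.
# These choices reproduce the stated answers of 1.8 and 1.6.
PLAN_A_CYCLES = {**BASE_CYCLES, "branch": 1}
PLAN_B_CYCLES = {**BASE_CYCLES, "load": 3}


def cpi(mix, cycles):
    """Total cycles divided by total instructions for the given mix."""
    total_instructions = sum(mix.values())
    total_cycles = sum(mix[kind] * cycles[kind] for kind in mix)
    return total_cycles / total_instructions


results = {
    "base": cpi(MIX, BASE_CYCLES),
    "plan A": cpi(MIX, PLAN_A_CYCLES),
    "plan B": cpi(MIX, PLAN_B_CYCLES),
}

# Lower CPI means fewer cycles per instruction, so the minimum wins.
best = min(results, key=results.get)

for name, value in results.items():
    print(f"{name}: CPI = {value:.2f}")
print("lowest CPI:", best)
```

Under these assumed costs, plan B wins even though it touches only 20% of the instructions, because loads are the most expensive instruction type in the base design.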

Jishen Zhao 50:27
And a few weeks later on, I'll give you a chance to exercise more. OK, so just a little bit of information about how people calculate CPI, or how they measure the CPI of a computer in practice. You can use different methods, but there is one thing that I want to emphasize, in case you literally need to do your own experiment to measure the CPI by yourself for your own case. The I here, the instruction count here, don't get it wrong. It's not the instruction count that you can manually count after you get the assembly code, or even your program code. The instruction count here is what we call

Jishen Zhao 51:26
the dynamic instruction count. This concept of dynamic will come up again and again over this quarter, and there are a lot of dynamic-versus-static concepts. Dynamic here, in computer architecture, always means that you get the number after you start to run your computer, after you start to run your program. So the dynamic instruction count means the count of truly executed instructions. If you have a program with if-then-else, you have branches; when you run the program, you always take just one branch, one path, right? You will not take both paths. That means half, or approximately half, of the instructions, those in the other branch that you do not take, are not being executed, so they should not be counted in the I here, the instruction count here. So just pay attention to this: don't count all of the instructions in a program statically; rather, you want to count the actually executed instructions, meaning the truly executed instructions. Also, the cycles to run those instructions are not straightforward to measure or to count.
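
The difference between the static and the dynamic instruction count can be illustrated with a toy model in Python. Everything here is hypothetical: the block names, the instruction counts per block, and the path taken are made up for illustration and do not come from the lecture.

```python
# Hypothetical toy program: each named block holds some number of
# instructions. "then_branch" and "else_branch" are the two sides of
# a single if-then-else.
BLOCKS = {"setup": 4, "then_branch": 6, "else_branch": 6, "finish": 2}


def static_count(blocks):
    """Count every instruction that appears in the program text."""
    return sum(blocks.values())


def dynamic_count(blocks, executed_path):
    """Count only the instructions in blocks that actually ran."""
    return sum(blocks[name] for name in executed_path)


# One run of the program takes only one side of the if-then-else.
path_taken = ["setup", "then_branch", "finish"]

static_total = static_count(BLOCKS)
dynamic_total = dynamic_count(BLOCKS, path_taken)

print("static instruction count: ", static_total)
print("dynamic instruction count:", dynamic_total)
```

In this toy run the six instructions in `else_branch` never execute, so they appear in the static count but not in the dynamic count, and only the dynamic count should be used when computing CPI.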

Jishen Zhao 52:42
That is because you need to run a computer, and how do you count the cycles and the instruction numbers after you start running your program? There are different ways in the computer architecture design community. We can use something called hardware counters. It's something I think most of you will not really notice, but actually when you buy a computer, the hardware counters are already there. So next time, if you're interested, if you're curious, you can download a tool called VTune [tool name unclear in the recording] that automatically captures the numbers from the hardware counters of the computer. You can run your program and run that software in the background, and the software will tell you the hardware performance numbers of your computer, automatically captured by the hardware counters. That's a piece of hardware integrated with your computer already. It's just already there, but most of us will not use it or don't know about it. We can also use simulation to measure the performance. In computer architecture, simulators are also heavily used, especially in a case where you don't have a real computer, or where you have a new idea for optimizing the computer hardware, like plan A or plan B in the previous exercise, but you don't really have the money, or it costs too much, to build the real hardware just for testing or just for an experiment. So instead we run a simulation. You can think of a simulator like a virtual machine that has the virtual hardware features. It's not really your hardware computer, but you can simulate it and measure the

Jishen Zhao 54:40
performance in terms of simulation, and accordingly decide which optimization you want to choose. So those are the typical methods of measuring or evaluating the CPI or IPC of a computer. One more word in terms of CPI: you can think of CPI as the overall CPI, like we did in our exercise examples earlier, but there are also different types of CPI you can measure. You can measure the CPI of a particular component. Like I mentioned in Tuesday's class, there are different components in your computer; it could be the CPU, it could be memory. So we can calculate CPI per component, in addition to the overall CPI, if you are particularly interested in the performance of a particular component inside your computer. Alright, so putting it all together, we have learned a few things today. Performance metrics: you want to be very clear about the difference between latency and throughput, and when you measure the performance of something, you want to know which one to use, based on what you care about. Again, for latency, we care about just one thing: one task, one person, one bit of data. For throughput, we care rather about the bandwidth: running a lot of tasks, for a lot of people, or a lot of bits or bytes.

Jishen Zhao 56:25
And we learned about speedup: how to calculate speedup for latency versus throughput. And we learned about how to average the performance, and which equation we should use for latency, for throughput, and for speedup. They're different. And then lastly, we learned that if we want to measure the performance of a CPU,

Jishen Zhao 56:44
we filter out those things that, in computer architecture design, we don't really have any control over or don't care too much about. We just focus on what we typically care about in computer architecture design, and the metrics we use are CPI and IPC: cycles and instructions.