CSE258 Lecture 1 Recording

Lecture Recording

Twitch.tv (original lecture): https://www.twitch.tv/videos/1161172908
Lark Minutes w/ Transcription: https://rong.feishu.cn/minutes/obcno44468d6122536hv1854

The Lark Suite link might not work for all, see resources for more details.

Text Transcription

The text transcription was performed by ByteDance Lark Suite’s Meeting Minutes program on the original lecture video. Below is the exported text version with timestamps.

October 6, 2021, 6:19 PM | 1 hour 14 minutes 26 seconds

Keywords: fashion recommendation, text manipulation string, adjacency matrices edge, machine learning models, machine learning class, Losties assignment specs, Harry Potter movies, movie preferences, online advertising setting, coding questions, recommendation algorithms, Skype link Sir, data set, recommender systems type, bit operating system, learning algorithm, hand engineered solution, regression algorithm

Transcript:

Julian McAuley 01:49
Hi there, everyone on Twitch as well. Thanks to so many people for coming, and thanks to all of those who are watching remotely. Let’s see how this all goes. Please let me know if there are any audio problems, either here or virtually. I think it’s OK; someone says muted, but I think it’s their fault. All right, great. So here we are. I’m going to tell you a little bit about what this course is. I’ll get into all of the more mundane, boring details about the grading policy and so forth in about

Julian McAuley 02:30
20 minutes or so, but I just wanted to give you an intro to what we’re actually all doing here. This is really a class about machine learning and predictive analytics, and in particular about recommender systems. So we’re trying to build these kinds of models that help us to explore and analyze data sets in order to gain insights and make some predictions. What does that all mean? We can’t see half of the slide? Oh, I’m sorry, I’ll fix that in a second. Nothing’s going to be perfect. You know, this is very difficult. All right, screen sharing not aligned. There you go, easy, right? OK. So yeah, what does it mean to understand data and gain insights and make predictions and all that stuff? The kind of predictive model

Julian McAuley 03:27
we’re going to build in this class would be something like this. Given a data set of product reviews from Amazon, could we build a system that would predict something like a star rating, and estimate how I will rate a particular unseen product? So what rating would I give to a movie like Pitch Black? OK. So why would we want to do that? First, we might want to build things like recommender systems. We’d like to build predictive models that say: What will I like? What will I dislike? What will I click on? What will I purchase? How long will I spend visiting a website? Things like that. If I can do that, I can recommend content in a personalized way, help people find products they like, and so on and so forth, make a bunch of money, whatever you want. But I can also do sort of more sciency things. I can try and use the data to gain insights, so I can figure out: How do people’s opinions change over time? How do people’s opinions change as a function of different demographic factors like their gender, age, and location? What are the underlying dimensions that explain why different people like different things, in other words? OK, there are also tasks like social network analysis. You know, things like “people you may know” on Facebook, if anyone still uses that, are examples of predictive systems, or in another sense they are examples of recommender systems. We’re trying to estimate, given a particular individual, what person who they are not currently friends with they would be most likely to become friends with next. Can you move your camera to the top right? I mean, I guess so?

Julian McAuley 05:15
I’ll go up here. It will be a little bit smaller, so I don’t overlap with the slide. Do tell me if I block the slide at any point, I’m sorry. Great. So yeah, for one, we can make predictions. We can estimate who’s going to become friends with whom. You can also do this to do recommendations of, say, dating partners in online dating. But you can also gain insights about the dynamics of social networks. Why do certain people become friends with each other? Is it due to shared interests? Is it due to common properties? Is it due to physical proximity? So on and so forth. OK, and yeah, more mundane things like ad recommendation or ad targeting. That’s a classic example of a recommender system. Can I predict what ad you’re going to click on, in order to solve some predictive task like maximizing revenue? If you can do that, you can estimate things like co-purchases or seasonal trends. So what kinds of things do people tend to purchase at different times of year, so on and so forth. Thanks so much for all the subscriptions, it’s very kind. Good. You know, I think in a class like this, the data sets we’ll look at tend to be very much focused around things like e-commerce, predicting things on Amazon or Yelp or Google Maps or whatever. That’s largely just because that’s where the best publicly available data sets tend to be, but of course you can use recommender systems and predictive models to do useful stuff too, like medical informatics. Can you recommend things like courses of preventative treatment? Can we build systems that estimate, for a particular user, what kind of symptoms they are likely to exhibit next time they visit the doctor? OK, and if you can do that, you can understand how different groups of people with different comorbidities maybe progress through the stages of a disease or something like that. I think this application is still maybe some years out, whereas things like ad recommendation have been well and truly deployed for the last couple of decades.

Julian McAuley 07:31
OK, good. So what are the components of a predictive model? What do we actually need in order to build these recommender systems or these machine learning models that make predictions for us?

Julian McAuley 07:45
Sorry, it’s a bit noisy here. So first, we need basically the things we’re trying to predict. We need the labels. So in all these previous examples, I might have data about ratings people have given, I might have data about what they’ve clicked on, I might have data about the products they purchased or returned or how much they paid. You know, the quantities that I’d actually like to estimate in order for my model to be useful. I also need to have lots of data. I need to have a big data set of relatively independent instances.

Julian McAuley 08:16
So, you know, we can solve a task like predicting ratings using a large corpus of reviews from Amazon. The slides online are different from mine? I did update them a little bit earlier, so if you looked at them more than a week or so ago, just refresh the page and it should be all right. These are the intro slides, though, not the lecture one slides.

Julian McAuley 08:40
OK, so something like predicting a rating is a perfect example of a nice data mining or predictive analytics task, the kind of thing you might have seen in other machine learning classes. If you wanted to do something like predicting whether a review is sarcastic, you know, that’s a fun task, but I’m not sure that we can really solve that with these types of predictive analytics techniques. So what’s the problem? We don’t have any labels that say this review is a sarcastic review and this review is not a sarcastic review. Now, you can sit down and think: what are the qualities of a sarcastic review? Maybe if I use very positive language even though the star rating is negative, that would be an indicator of sarcasm. And yeah, OK, that sounds like a reasonable hypothesis, but

Julian McAuley 09:28
what do you do with it? How would you actually validate or evaluate whether that hypothesis was correct in a given data set? You couldn’t do that unless you had labeled data, or measurements really. OK, so the second component you need is an objective to be optimized, exactly what I just said. You need to be able to know if you have built a good model or an accurate model. You need to be able to take a measurement that says, you know, did you estimate the star ratings close to the truth, in other words. So don’t get too scared off by equations yet. There’s not much going on here. This is just an equation that would measure how different the prediction of some model is compared to the actual ground-truth ratings given by some users. That’s an example of a measurement we’d like to make. This one’s called the mean squared error.
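A minimal sketch of the mean squared error just described, assuming the model's predictions and the ground-truth ratings are held in two equal-length Python lists. The function name `mean_squared_error` and the example ratings are made up for illustration and are not taken from the lecture slides.

```python
def mean_squared_error(predictions, labels):
    """Average squared difference between predictions and ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predictions, labels)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical star ratings: what a model predicted vs. what users actually gave.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 2.0]

mse = mean_squared_error(predicted, actual)
print(mse)
```

A lower value means the predicted ratings sit closer to the true ones; a perfect model would score 0.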

Julian McAuley 10:24
OK. Yeah, so the next thing we need is lots of data. We need to have many measurements, and for the case of personalized models like recommender systems, we need to have many measurements associated with each individual. We need that so that we can make predictions that are statistically reliable, so that we can really establish correlations between features about a movie or user and feedback like ratings or purchases. And yeah, we need those features to actually be good. We need features that are really meaningfully correlated with

Julian McAuley 11:00
what we’re trying to predict. OK. So yeah, that’s our extremely broad-stroke summary of what predictive analytics really is, in this specific case of things like recommender systems. OK, so this course is all about teaching how we can model data sets in order to make these types of predictions, and how we can meaningfully validate that our predictions are actually good. And a little bit of, I don’t know, sort of more sciency topics. Can we actually reason about what’s going on? Can we take a machine learning model that we’ve trained and figure out why people behave the way they do, actually kind of understand what’s going on

Julian McAuley 11:46
under the hood? OK, so that’s really what data mining is, or maybe even more broadly things like computational social science. Great. So yeah, there’s a big focus in this class in particular on recommender systems on the web. It’s in the name of the class. I’ve actually increased the focus on recommender systems quite a lot this year compared to previous years. I’ll explain why a bit later on. But yeah, we do all kinds of predictive modeling of data sets like this: fashion recommendation from Amazon, movie recommendation from Netflix, lots of prediction about what these people will like. We’ll maybe get to that later today.

Julian McAuley 12:29
That kind of thing. Again, I think this is largely opportunistic. These are just the data sets that are most easily collected. We can actually go and scrape them or we can go and download them, and we can build real, kind of industry-scale predictive models out of them, which, you know, makes the class a lot more fun than working with toy data sets or anything like that.

Julian McAuley 12:55
OK, so we have this particular focus on things like e-commerce or recommender system scenarios, and because of that, we have a lot of predictive tasks that are concerned with

Julian McAuley 13:08
estimating things like: What ad will a person click on? What product will they purchase next? What star rating will they give? Which out of 2 movies are they more likely to watch? Things like that. So yeah, that’s really what recommender systems are all about. Great. So yeah, expected knowledge for this class. I think the word “expected” is a bit scary. These are not really things that I literally expect everyone to know. It’s more like, if you don’t know this stuff, you’ve got to learn it on your own. OK, so you should have some sort of familiarity with data processing. I’ll give you some examples of the kind of structured data sets we work with later on in the lecture. So you should be able to read in data sets from a file, convert them to structured information like key-value pairs, extract things like date information from them, stuff like that. So text manipulation, string processing, all that stuff.
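As a rough illustration of the data-processing skills listed above (reading structured records, turning them into key-value pairs, pulling out date information), here is a small sketch using only the Python standard library. The field names (`user`, `item`, `rating`, `date`) and the records themselves are hypothetical, and in-memory strings stand in for real files so the example runs on its own.

```python
import csv
import datetime
import io
import json

# Hypothetical review data, one JSON object per line (stands in for a file on disk).
json_lines = io.StringIO(
    '{"user": "u1", "item": "i1", "rating": 5, "date": "2021-10-06"}\n'
    '{"user": "u2", "item": "i1", "rating": 3, "date": "2020-09-30"}\n'
)
# Each line becomes a dictionary of key-value pairs.
reviews = [json.loads(line) for line in json_lines]

# Hypothetical tab-separated data with a header row.
tsv_text = io.StringIO("user\titem\trating\nu1\ti2\t4\nu3\ti1\t2\n")
rows = list(csv.DictReader(tsv_text, delimiter="\t"))
for row in rows:
    row["rating"] = int(row["rating"])  # csv yields strings, so convert numeric fields

# Extract date information: the year in which each review was written.
years = [datetime.date.fromisoformat(r["date"]).year for r in reviews]

print(reviews[0]["rating"], rows[0]["rating"], years)
```

With real data, `open(path)` would replace the `io.StringIO` objects; the parsing code stays the same.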

Julian McAuley 14:18
Some graph analysis, so, you know, can you represent graphs as adjacency matrices, edge lists, whatever? Nothing too scary. Dealing with structured files like JSON and CSV and TSV. I’ll give examples of all of this stuff, but if you’re not so familiar with it,

Julian McAuley 14:40
then, you know, that might be some reading you want to do on your own. Some math. I mean, I think this is about the scariest sort of equation we’re likely to see. Well, we’ll see it today, but it’s about as scary as it gets. You’ve all seen it before, probably; maybe you don’t remember. This is, you know, some matrix transposes and some matrix inversions, differentiating a matrix expression. You can kind of survive this class without knowing this stuff. I do like to give derivations of these different machine learning models for the sake of completeness. If you really don’t care about that stuff, you can by all means get through the class and use all of the library functions that implement the various machine learning models without completely understanding how the optimization routines and so forth really work under the hood. That’s fine if that’s how you prefer to do it. I think if you really want to become expert in machine learning, then you do want to have a pretty strong knowledge of these fundamentals. If that’s just not for you, you know, you’ll do OK. A bit of statistics, not too much.

Julian McAuley 15:54
Do I have a mention of p-values? I don’t think I do. Forget about p-values, nonsense. All right, coding. Everything is going to be done in Python. I think most people know Python. There are always some that have managed to escape it until now. I would strongly encourage that as something you take the time to study on your own. If you haven’t seen Python before, go ahead, set up an environment,

Julian McAuley 16:20
download all the code examples from the class web page, and make sure you can just get them running. It’s not too hard of a language to learn. There have been people who have taken this class and said the most valuable thing they got out of it was that they learned Python. I don’t know about the point of the rest of the material then, but OK. At least if you learn Python, you’ll have gotten value from the class. And yeah,

Julian McAuley 16:43
we’re reliant on standard Python libraries like NumPy and SciPy and NLTK, and fancier deep learning libraries like TensorFlow and so forth, which are sort of optional but useful if you know them. OK. So yeah, these are some examples of comments people have given about the class before, so you can kind of see how it is. It is difficult to create the perfect class in a classroom this large and this diverse, especially given the various other machine learning offerings around here.

Julian McAuley 17:20
On one hand, we spend the first few weeks covering, I don’t know, standard machine learning material that you’ll see in any other machine learning class on campus. So if you’ve taken a bunch of machine learning classes, you should find the first couple of weeks

Julian McAuley 17:37
pretty easy. If you’ve never taken a machine learning class before and this is your first one, you know, I think the ramp-up is pretty quick. The first 2 weeks will be hard. The good news is that after that, everyone is kind of going to be on equal footing, so you just have to struggle for 2 weeks and you’ll catch up, I think.

Julian McAuley 17:54
I think it’s a pretty good class to take as a first machine learning class, in the sense that it’s much more focused on building real working applications and less so on developing theory. So yeah, if it’s your first machine learning class, definitely don’t be scared off by some fairly fast material

Julian McAuley 18:20
for the first couple of weeks. Yeah, so these are kind of the comparison points. These are classes that a bunch of other people might have taken. I don’t know which of them are offered this quarter, but, you know, intro to AI and principles of AI and machine learning. You can take them in any order.

Julian McAuley 18:37
This is certainly the most hands-on out of all of these classes. It’s the most about just getting things to work and maybe the least about developing complex underlying mathematical theory, though there’s a bit of that too. I didn’t want to do that. OK, so yeah, there’s an undergraduate and a graduate version of this class. People have been asking what the difference is. This year they are all co-scheduled in one big happy family, which I think is the most fun way to do it. So the lectures and the course content are exactly the same. We did the same thing last year and it worked pretty well. The only difference is in the actual assessments. There’s going to be a single assignment spec, but it will have some questions that are 158-specific and some questions that are 258-specific. But even then, the difference is not too much. I think if you want to get a sense of the difference in difficulty, just go ahead and check last year’s course webpage for last year’s assignment specs. All right, so.

Julian McAuley 19:54
In lectures, I try to spend most of my time covering motivation and derivations and a little bit of time covering code examples. Later on, I’ll try to use the lectures to work through some of the homework problems that people found the most difficult, or anything that has been coming up during office hours. Yeah, like I said, you can survive the class without getting too deep into the derivations of these algorithms, but I think the purpose of the lecture is really to cover those fundamental concepts. All right, so let’s go through the course outline a little bit. I don’t think there are any questions popping up yet. Maybe people are a bit shy given it’s one of the first classes back in a physical classroom. If you don’t want to ask questions in a giant lecture theatre, that’s good. Feel free to pop on Twitch chat too, if that would be easier for you. Is there a way to remove advertisements from Twitch streams? There isn’t. They shouldn’t play midstream, though; they only play at the very beginning, which is why I start a few minutes early. You can watch the videos after the fact. You can watch everything on the UCSD podcasts page if you have really a strong moral aversion to

Julian McAuley 21:17
seeing any ads. Someone got one midstream? It generally shouldn’t happen. I mean, yeah, do install an ad blocker if it’s really bugging you. I think it historically hasn’t affected many people. Yeah, I don’t get ads when I view through the web page. Maybe explain on PRC how to get rid of them; they can be made to go away.

Julian McAuley 21:44
OK, so let’s go through the course outline a bit. Everything is right up there on the course web page. Shall we just take a quick look at it? Is it going to play my video? I’ll make it stop if it does. No, it’s not. All right, good. So yeah, the course webpage is here. All of the course outline is already up there.

Julian McAuley 22:13
This is more or less complete. There are still a few lectures I’m working on, so we’ll see how we go for time. This is going to have all of the textbook references and the code references, and the homeworks will be up there at the top. And the slides will be there each week. What else? Something I’ve done differently this year is I moved all of these extra resources into this nice little slidey pane here. If you ever have the experience of teaching a class, you know you want to help out people who are struggling with material. There’s a bunch of people who don’t know about Python or CSV or dealing with NumPy or anything like that, and it’s easy to say, oh, I’ll make your life easier for you.

Julian McAuley 23:03
Here are 30 videos to watch, right? And in practice that doesn’t work so great. It mostly just makes people more stressed out, and then they watch 30 videos. So yeah, I have them hidden behind this nice slidey thing this time, so they’re all going to go away.

Julian McAuley 23:19
I think, yeah, if you’re struggling, I’ve really tried to make it so this is like the minimal set of stuff you need that’s really part of this class. I don’t know who the extra resources are for, whether they’re for people who are struggling with the material or for people who have extra time on their hands and want to review a bit of extra stuff and revise some more things.

Julian McAuley 23:42
But yeah, I think really start with the code that’s provided on the course web page and the chapter references before worrying about going too much further. OK, so yeah, all the information is going to be on that course webpage. You can also check out last year’s course webpage. This year’s one is still kind of under development while I’m working on some new material, so there’s maybe 30 or 40% that will be different. So you can check out a more complete course webpage from last year.

Julian McAuley 24:19
OK, so this is kind of the syllabus, kind of the fundamental stuff. We’re going to start with regression and classification this week and next week. Regression and classification are very much boilerplate things that you get in every machine learning class. Recommender systems is probably the first content that’s more or less new to everyone here,

Julian McAuley 24:41
I think. And then for the second half of the class, we dive a lot deeper into all sorts of more complex types of recommender systems and other types of personalized machine learning and so forth.

Julian McAuley 24:54
This is what we had last year. Of course, if you didn’t take the class last year, maybe you don’t care. But just for the sake of comparison, to tell you what I’ve gotten rid of and what I’ve added: I don’t spend a week on dimensionality reduction up front. Too much math, I guess, for something that we don’t use too much of later on. I did remove some topics like social network analysis. Actually, you know, social network analysis was a kind of fun topic.

Julian McAuley 25:25
It was a fun lecture to give as well, but it ended up kind of not being very relevant. On the open-ended assignment at least, nobody ended up doing it on that topic, so I decided to kind of scrap it and focus much more on recommender systems. In practice, everyone was doing their open-ended assignment on some kind of recommender system, so I just decided to make that much more of a focus for the course overall. So yeah, hopefully it’s a bit more streamlined and every item is a bit more relevant to everyone. But let’s see. Happy to take feedback and see what we can do better. So yeah, more focus on recommender systems.

Julian McAuley 26:08
I did write this lovely textbook for this class that everyone can go and read. So this course outline is kind of streamlined with the content of the textbook, if you have time to read 20 or 30 pages every couple of weeks. Assignments are going to be pretty much the same as always. Assignment 2 has always been open-ended, but people always tend to do it on a recommender systems project, so I thought, let’s cover more recommender systems. Great. OK, so getting into some detail about what we actually cover.

Julian McAuley 26:44
Starting with, you know, regression and classification, kind of the fundamentals of machine learning. Standard stuff here: linear regression, some feature analysis, regularization, gradient descent. So how do we build predictive models that help us estimate real-valued quantities? If you want to estimate a star rating or a person’s height or a price or something, how do we build predictive systems that estimate real-valued quantities from some sort of features? How do we deal with things like overfitting? How do we validate that the regression model is actually doing a good job?
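To make the topics just listed a little more concrete, here is a minimal sketch of linear regression with a regularization term, fitted by gradient descent. It assumes a single feature and an objective of mean squared error plus `lam * theta1**2`; the function name `fit_ridge_gd`, the learning rate, the step count, and the toy data are all illustrative choices, not the formulation used in the course.

```python
def fit_ridge_gd(xs, ys, lam=0.0, lr=0.05, steps=5000):
    """Fit y ~ theta0 + theta1 * x by gradient descent.

    Objective: mean squared error + lam * theta1**2 (the offset is not penalized).
    """
    theta0, theta1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the objective with respect to each parameter.
        grad0 = 2.0 * sum(residuals) / n
        grad1 = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n + 2.0 * lam * theta1
        theta0 -= lr * grad0
        theta1 -= lr * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

theta0, theta1 = fit_ridge_gd(xs, ys)                # no regularization
reg_theta0, reg_theta1 = fit_ridge_gd(xs, ys, lam=1.0)  # slope is shrunk toward zero
print(theta0, theta1, reg_theta1)
```

Without regularization the fit recovers the generating line; with `lam=1.0` the slope is pulled toward zero, which is the mechanism regularization uses to trade a little training accuracy for less overfitting.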

Julian McAuley 27:23
Secondly, classifiers. We’ll mostly focus on logistic regression. It’s one among many, many choices of classifier, but it has the same mathematical foundations that we see coming up again and again later on in the class. And also, how do we evaluate classifiers in sort of non-standard settings? So, you know, can you predict the category of an object in an image? Will a person purchase a product? Will they click on an ad? Those are sort of binary or multiclass outcomes, so we use classifiers rather than regressors for that kind of predictive task.

Julian McAuley 28:01
And, you know, classifier evaluation. What do we do if we’re building a classifier in a setting where there’s 99.9 percent of things you didn’t click on and 0.1 percent of things you did click on, so your data is extremely imbalanced? What do you do about that?
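A small sketch of why plain accuracy is misleading in the imbalanced setting described above. It compares accuracy with the balanced error rate (the average of the false positive rate and the false negative rate), which is one common alternative and is named here as an illustration rather than as the metric the lecture goes on to use. The data are synthetic: 1 click in 1000 impressions, scored against a trivial classifier that always predicts "no click".

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def balanced_error_rate(predictions, labels):
    """Average of false positive rate and false negative rate.

    Assumes both classes appear at least once in `labels`.
    """
    tp = sum(1 for p, y in zip(predictions, labels) if p and y)
    tn = sum(1 for p, y in zip(predictions, labels) if not p and not y)
    fp = sum(1 for p, y in zip(predictions, labels) if p and not y)
    fn = sum(1 for p, y in zip(predictions, labels) if not p and y)
    false_positive_rate = fp / (fp + tn)
    false_negative_rate = fn / (fn + tp)
    return 0.5 * (false_positive_rate + false_negative_rate)


# Synthetic data: 0.1% positives (clicks), 99.9% negatives.
labels = [True] * 1 + [False] * 999
always_no = [False] * 1000  # trivial classifier: never predicts a click

acc = accuracy(always_no, labels)
ber = balanced_error_rate(always_no, labels)
print(acc, ber)
```

The trivial classifier looks excellent by accuracy yet is no better than chance by balanced error rate, because it misses every positive example.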

Julian McAuley 28:21
Yeah, someone’s asking about the textbook. It is definitely auxiliary material. I mean, the textbook didn’t exist until a year ago. I guess I was bored during the pandemic, and, you know, people got by just fine. Now there’s a textbook. It’s something you can read. Maybe you can buy a copy later on and put it on your shelf, keep it for the rest of your life, but you don’t need it, no. I think some people learn better from a textbook than lectures as well, so rather than going through a bunch of videos, just reading the text might be better for some people. Right, so in weeks 3 and 4 we’ll get into recommender systems. So, you know, things like

Julian McAuley 29:01
“people who bought X also bought Y”. This is from Amazon. Their recommendations for Harry Potter are the other Harry Potter movies. Their recommendations for a pair of jeans are other pairs of jeans. Nothing too shocking there, but how do they do that? It seems like magic that you look at jeans and they know you want jeans. What’s going on under the hood there?
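One simple way a "people who bought X also bought Y" list could be produced is to compare the sets of users who purchased each item. The sketch below uses Jaccard similarity (size of the intersection divided by size of the union) for that comparison. This is an illustrative technique under made-up data; the item names, user IDs, and helper names are hypothetical, and nothing here is claimed to be how Amazon actually computes its recommendations.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets: |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical purchase data: item -> set of users who bought it.
purchases = {
    "harry_potter_1": {"u1", "u2", "u3"},
    "harry_potter_2": {"u1", "u2", "u4"},
    "jeans": {"u4", "u5"},
}


def most_similar(item, purchases, k=1):
    """Return the k items whose buyer sets overlap most with `item`'s buyers."""
    scores = [
        (jaccard(purchases[item], users), other)
        for other, users in purchases.items()
        if other != item
    ]
    scores.sort(reverse=True)
    return [other for _, other in scores[:k]]


print(most_similar("harry_potter_1", purchases))
```

Here the two Harry Potter items share two of their four distinct buyers, so each is the other's closest match, while the jeans share no buyers with the first movie.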

Julian McAuley 29:22
That’s one class of recommendation, a very simple kind of recommendation that’s just based on looking at the overlap between what products were purchased by similar sets of users, or something like that, versus more complex machine-learning-based recommender systems that are actually finding these underlying low-dimensional structures of

Julian McAuley 29:47
why certain people like certain things. So, you know, can we figure out that in movie preferences there’s this latent preference towards the amount of romance, or the budget of a movie, or whether it has a certain actor or actress? Can we figure out that there are these kinds of dynamics at play and determine what sort of users are compatible with what sort of movies? So it’s really making predictions like this: given a user and an item, can we estimate how they will rate it, what price they would pay for it, or whether they will purchase it? It sounds very, very similar to a regressor or a classifier, except that we’re now looking at the interaction between a user and an item, right? Whereas a classifier is looking at just a bunch of features, and which features are associated with the label and which features are not, here we’re really talking about the interaction between a user and some content. We’ll see what that means later on, but that’s the fundamental characteristic of what a recommender system is and what makes it different from other types of machine learning: it’s a technique to model interaction data sets.
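The idea of latent preferences interacting with item properties can be sketched as a dot product between a user vector and an item vector, added to a global offset. In the sketch below the two latent dimensions are hand-labeled as "romance" and "big budget" and all numbers are invented for illustration; in a trained model these vectors would be learned from rating data and would not come with labels. The names `alpha`, `gamma_user`, and `gamma_item` are assumptions made for this example.

```python
def predict_rating(alpha, user_vec, item_vec):
    """Predicted rating = global offset + dot product of user and item factors."""
    return alpha + sum(u * i for u, i in zip(user_vec, item_vec))


# Hypothetical 2-dimensional factors: [romance, big budget].
gamma_user = {
    "alice": [1.0, -0.5],  # likes romance, mildly dislikes big-budget films
    "bob": [-1.0, 0.5],    # the opposite taste
}
gamma_item = {
    "romantic_comedy": [0.8, -0.2],
    "action_blockbuster": [-0.6, 1.0],
}
alpha = 3.5  # assumed average rating across all users and items

for user, u_vec in gamma_user.items():
    for item, i_vec in gamma_item.items():
        print(user, item, round(predict_rating(alpha, u_vec, i_vec), 2))
```

The prediction depends on the pairing: the same movie gets a high score for one user and a low score for the other, which is the user-item interaction that distinguishes this from a classifier over fixed features.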

Julian McAuley 30:58
OK, and then we’ll get into, you know, fancier, more complex recommender systems. What can you do when you have complex features? What do you do when you’re in sort of non-standard settings? What do you do if you’re building a recommender system for online dating, where you’re not recommending products to people but you’re recommending people to other people? You have all sorts of other constraints when you’re doing that. You can’t just find the best person and recommend them to everyone. That’s perfectly reasonable if you’re Netflix: you can find the most popular movie and recommend it to everyone. But you can’t do that in an online dating setting, or for that matter

Julian McAuley 31:31
inside an online advertising setting, where you have users’ preferences towards ads, but advertisers also have preferences towards users, and advertisers have limited budgets that maybe limit the number of ads they can actually afford to show.

Julian McAuley 31:48
Around that time, once we’re starting to get into the assignments, we’ll see how we go, but I like to use that time to discuss some tools and libraries and so forth. You know, how do we scrape data sets? How do we use these high-level machine learning libraries like TensorFlow or PyTorch or whatever? But we’ll see how we’re doing for time. So yeah, I’d like to explain to people (it’s not the most ethical lecture of all time) how you can build a web scraper and actually collect some of these data sets that we use in all of the class examples. That’s something some people like to do for their assignments, to collect

Julian McAuley 32:24
data sets. OK, so yeah, then we’ll get onto sort of more specialized stuff. There is a midterm in the middle somewhere. I haven’t exactly decided on the format of that. It’ll be a take-home, because we can’t do it in the classroom because half the people are remote, so we’ll have some remote, slightly asynchronous format. Last year we did a 24-hour midterm. I don’t want to say disaster; it was OK. Some people didn’t like it. Honestly, I just wanted to have a normal midterm but give 24 hours to do it, which I thought was exceedingly generous, and then what happens in practice is that people spend 24 hours doing it. It’s like, this is meant to be a short-answer question and you write this whole essay, and then you complain that I made you do 24 hours of midterm. So I think we’ll avoid that. It will probably have a, you know, block that is longer than the period of the lecture, just to give people some flexibility, and in the worst case, if you want to spend the whole block doing the midterm, then at least you won’t lose any sleep over it.

Julian McAuley 33:41
But it shouldn’t be too long. I did like the take-home midterm in the sense that you could use the midterm to do coding questions rather than just pen-and-paper questions, which makes a lot more sense in a class that’s really about applied machine learning: to actually show that you can use the skills you’ve learned to quickly make correct decisions about which machine learning algorithms to apply and build something that kind of works.

Julian McAuley 34:09
OK, and yes, later on in the class we’re going to get into this more specialized material. At this point in the class, the only assessment that’s really left is the open-ended assignment, so this is more like material where you don’t have to know all of it. You might try to follow some of it that is more interesting to you, that you want to go into in a bit more depth, to give yourself some ideas for the assignments. So we’ll look at temporal and sequence data.

Julian McAuley 34:35
How do you build predictive systems in temporally evolving settings, especially in temporally evolving recommender systems where people’s preferences are non-stationary and they’re changing all the time? How can you build models of that? So that’s a bad slide. Sorry.

Julian McAuley 34:55
So how do you adapt standard regression or classification algorithms to temporally evolving data, if you’re doing things like, I don’t know, time series forecasting: estimating the amount of traffic there’ll be at UCSD tomorrow as a function of recent traffic trends and weather and other variables? And also temporally evolving recommender systems, which turns out to be a huge factor in actually making recommender systems work and making them really deployable, is that people’s

Julian McAuley 35:24
opinions are extremely non-stationary. They just change preferences over time, but they might also change preferences very immediately due to exogenous events, or due to a celebrity tweeting something, or due to a product being on sale or just generally popular all of a sudden. Modeling those kinds of temporal dynamics is absolutely critical to get recommender systems to actually work and be deployable.

Julian McAuley 35:49
Week 8, we’ll look at text mining. I think this is also a fun topic, and it’s one that a lot of people end up basing their open-ended assignments on. So we’ll look at classical things like sentiment analysis.

Julian McAuley 36:05
How do you come up with effective text representations? How do you find the important words in a document so that you can retrieve semantically similar documents? Also a little bit about how we incorporate text into recommender systems and develop personalized text models. Obviously, you know, text

Julian McAuley 36:24
mining is a massive topic. There are entire quarter-long classes on it at UCSD, so this is really just a very simple introduction, but it’s the one that I think people benefit from a lot and get excited about in terms of actually coming up with assignment topics.

Julian McAuley 36:41
Week 9, there’s only one lecture in Week 9 because of Thanksgiving. I skip the lecture immediately before the Thanksgiving weekend since, surprisingly, no one shows up on a Wednesday evening

Julian McAuley 36:53
before a Thursday holiday. So we’ll spend maybe one lecture talking about visual models; we’ll see how we’re going for time at that point. So how do you do things like fashion recommendation? How do you actually build fashion recommenders that are aware of the visual characteristics of items and can handle a vocabulary of items that is changing all the time?

Julian McAuley 37:18
How do you do things like compatible item recommendation? How do you find things that visually go together to form good outfits? How do we use recommender systems to do personalized generation and design of new items? Probably most of the week will be spent looking at fashion, but really this is touching upon a broader topic of building visually aware models, visually aware recommender systems, or personalized models of visual data, where fashion is just a good use case.

Julian McAuley 37:49
How many people do we need for assignment 2? I think I’ll say more about that later, but between one and four is the answer. Plenty of people do it by themselves and that’s just fine. But you can do it in a group of up to four, and everyone gets graded the same.

Julian McAuley 38:08
Good. Yeah, in the last week we’ll look at ethics of recommender systems, which is a new topic, but I hope it’s a fun one. So what are the machine learning fairness problems of developing personalized recommendation algorithms? Some of the ones that are, I guess, popular in the news

Julian McAuley 38:27
lately are things like extremification and diversification problems of recommender systems. By following the recommendations of machine-learning-based recommender systems, how can we end up gradually navigating to more and more extreme content? That is something people have studied in the context of YouTube, and it really seems to be something that happens as a result of algorithmic recommendation. On the other hand, how can recommender systems lack diversity? If you like one video on YouTube about tiny houses, you get 50 other videos about tiny houses, right? Maybe that’s a good thing, maybe it’s a bad thing. But if you build a recommender system in a certain way, it’s going to focus people on extremely narrow, extremely similar content. If you build it in a different way, it’s going to lead people to much more extreme and maybe dangerous content. Why does that happen? How do you fix it? And other things like popularity bias, which basically says,

Julian McAuley 39:28
you know, recommender systems can take things that are popular and, by recommending them, reinforce how popular they already are. So popular things just become more and more popular as a result of the recommender system, more niche things never get recommended, and everything concentrates around a very small number of items, which is probably an undesirable outcome.

Julian McAuley 39:51
And also, how can recommender systems be bad, or how can they underperform, for users from underrepresented groups, or users who just have weird taste or something, and what can you do about that?

Julian McAuley 40:06
Lovely. OK, so yes, there is a textbook. It’s all available for free. You don’t have to pay for anything. Whenever I provide new content, on Coursera or the textbook or the Twitch stream, someone’s always worried that I’m making money off it. I’m not making any money off anything, so it’s all good. You can have the textbook for free; you can go and download it. It isn’t actually published yet, so that helps. Print it out, give it a read. There are also some other textbooks that I’ll sometimes provide references from. Those are now a bit more hidden on the class web page, but they’re there. This is a great introductory textbook. So yeah, I could make money from Twitch, but I don’t think there are going to be enough subscribers; don’t worry about it. This is a great introductory textbook for the first few weeks, and it’s a good one to keep on your shelf if you want to get into machine learning, but it’s certainly not compulsory.

Julian McAuley 41:08
I also give some references from an old class that used to be taught here on similar subject material. That’s now about 10 years old by this point. And yeah, there are all these Coursera lectures. They’re hidden on the course web page, but if you’re really struggling with the basic material, especially just getting up to speed with data processing and manipulating files in CSV and TSV, this is maybe a good place to start. Again, you can access everything for free. You don’t have to let any money flow to me. OK, so good. Alright, so yeah, here’s the evaluation.

Julian McAuley 41:47
Four homeworks, and we drop the lowest grade. There is a pretty strict late policy, which we try to make up for by just giving you an assignment you can skip. There’s going to be a midterm, which I talked about already, so a take-home in Week 6, and there are two assignments. The first assignment is about building a recommender system to solve a specific predictive task that I will give you, and you have to solve it as accurately as possible. And then there’s an open-ended assignment, which you can use to do whatever you like, due at the very end of the class. I think the buried lede here is there’s no final. You may have heard about a final for this class or something. The detail here is, if you make a class at UCSD, you have to register there as being a final in the course approval, just in case somebody else teaches the class in your place and wants to run a final. So we have to create this fake final. It gets scheduled and clashes with everything, but it doesn’t really exist, so don’t worry about it. In Week 10, you’re done. You can go on with your lives. Wonderful, so there you go. That’s what there is.

Julian McAuley 43:04
The purpose of the evaluation is kind of the following. The homework is just the homework. Most people should be getting close to full marks. It’s not particularly toughly graded or anything; it’s just to make sure everyone’s staying on top of the fundamentals week to week. The midterm is more about the foundational material. The assignment where we build a recommender system is just, I don’t care how you do it as long as you build something that works well. And then the second assignment’s more about applying knowledge creatively. Is assignment one still a Kaggle competition? Yes, it is. I don’t know if you’re asking that

Julian McAuley 43:39
in an excited way or a worried way, but yeah, I think I’ll mention that in a minute. So yeah, assignments are all due on Monday before the beginning of lecture. If it’s a few minutes late, it’s probably OK, but I do want people to show up to lectures and not spend the lectures doing the assignments or anything. Everything’s submitted on Gradescope. The Gradescope link is already on Piazza. Make sure you sign up for the correct Gradescope; there’s a different one for 158 and 258, so don’t submit it to the wrong place. Yeah, and those are your due dates. I’m trying to make it so that there’s never more than one thing due in a particular week. So yeah, previously, and this year too, the assignment has usually been run as a competition on Kaggle.

Julian McAuley 44:30
So, if you’ve never used Kaggle before: I give you this data set where I’ve withheld a fraction for testing. I give you a bunch of training examples and you have to predict the ratings, or the purchases, or what books people will read, or what clothing they’ll buy, as accurately as possible. You upload your predictions, you get scored, and it gets evaluated on this leaderboard. The leaderboard marks are not worth much, but it’s meant to be fun. That’s the idea at least. It’s usually fun. I think last year it was not as fun. That time it wasn’t any different, but I think people just maybe weren’t in the mood to be evaluated on a leaderboard during a pandemic, and so they didn’t want that. But the pandemic is long since history now, so we can do that again and all enjoy it. Alright. So yeah, I’ll show you later

Julian McAuley 45:34
some examples of the feedback people gave about the assignment. But the only thing I’d say in consolation is, look, the actual ranking compared to the peers in your class is a very small fraction of the assignment grade. You can have the bottom ranking and still pass the assignment just fine. The ranking-based component of the grade is just supposed to keep the competition lively and interesting, I guess.

Julian McAuley 46:01
Yeah, in defense of last year’s class, maybe I didn’t read the room right and they didn’t want that kind of liveliness, but it’s supposed to be fun, and under normal circumstances most people find it fun, I think. So yeah, last year was on Goodreads. I haven’t updated this for this year, but last year we did it on predicting what books people will read on Goodreads. This year we’ll do something else. Maybe we’ll do prediction on what videos people watch on Twitch or something, who knows. OK, so assignment 2 is more open-ended. Here are some examples; I’ll skim through these so I can move on to some real material.

Julian McAuley 46:42
You know, people have predicted when a wine is most fit to drink in terms of rating patterns over time. When does it become too old and turn to vinegar? How do ratings vary over time, or as a function of review length or other characteristics of data on a recommender-systems-type data set? Lots of assignments on things like sentiment analysis: can we understand the different textual, linguistic dimensions that explain why people like or dislike different things? Some weird stuff going on there. This one was about building recommender systems for geotagged data on Google Maps, which is a data set I give out.

Julian McAuley 47:25
So can we predict how different geographical features influence people’s preferences, and how people from different geographies will like different things? This is another assignment that does exactly the same thing, but in a different way. So they said, rather than explicitly modeling geography, can we model the extent to which one person’s preferences are similar to those of their geographical neighbors? So if someone is physically close to you, does that mean they have similar

Julian McAuley 47:54
preference dynamics? Yeah, lots of things using text models to analyze things like restaurant reviews. This is maybe more of a weird one that’s not so closely related to recommender systems, but it’s still building, I don’t know, predictive models. You can build whatever you like. This is from a game called Wikispeedia. I don’t know if people have heard of this, but it’s a really fun game; you should check it out. So you go on this page, Wikispeedia, and it gives you a start point, which is Asteroid or something, and it says you have to get to Vikings by clicking Wikipedia links only. And if you can get there in the fewest number of steps, you win. You win nothing, but, you know, you win. So they were predicting how people will do this.

Julian McAuley 48:45
And they find that the types of traces people leave consist of exploring more and more generic topics until they’re at a topic that’s so generic that it includes both asteroids and Vikings, right? So the solar system includes asteroids and Vikings. And then they narrow things down, or something like that. But yeah, they were trying to predict what strategies people use to navigate between pages or find information. Kind of fascinating.

Julian McAuley 49:18
Yeah, there are also lots of assignments about things like fashion recommendation. This one is from Chictopia, looking at social cues and how different tags can be used to predict different distributions, or figure out which kinds of images people will react positively to. Lots of things using public data sets released by cities, about predicting crime using a bunch of different temporal modeling techniques.

Julian McAuley 49:48
How do things vary over a long time? How do they vary seasonally? How do they vary weekly? How do they vary monthly? How do they vary as a function of the hour of the day? So on and so forth.

Julian McAuley 50:04
Comprehensive exam: good question. No, the midterm is not the comprehensive exam; assignment one is the comprehensive exam. So to pass the comprehensive exam, if you’re doing that, you have to get 15 out of 25 for assignment one, which is not too high a bar, and if you miss it, we’ll give you something to make it up. It’ll be alright. If you’ve never heard of the comprehensive exam, forget I said anything; it doesn’t apply to you.

Julian McAuley 50:35
Good. Yeah, this is a good data set people used. It’s a good example of a data set that includes all of the different characteristics we’re teaching in this class. So this is like predicting, given a taxi trip between point A and point B in Manhattan,

Julian McAuley 50:50
can you predict how large the tip will be? How is that a function of weather? How is it a function of seasonal or temporal factors? How is it a function of geographical factors? How do you appropriately transform your variables? In particular, do you predict a tip or do you predict a tip percentage? So on and so forth. Yeah, OK, that’s about it. Those are the two assignments.
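
As a rough illustration of the "tip or tip percentage" choice raised here, the sketch below builds both candidate targets from the same trips. The fares and tips are made-up values, not taken from any real taxi data set, and the variable names are hypothetical.

```python
# Hypothetical taxi trips as (fare in dollars, tip in dollars); values are made up.
trips = [(10.0, 2.0), (40.0, 8.0), (25.0, 2.5)]

# Option 1: use the raw tip amount as the label to predict.
tip_amounts = [tip for fare, tip in trips]

# Option 2: use the tip as a fraction of the fare as the label instead.
tip_fractions = [tip / fare for fare, tip in trips]

print(tip_amounts)
print(tip_fractions)
```

The first two trips have very different tip amounts but the same tip fraction, which is one reason the choice of target can matter.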

Julian McAuley 51:18
This is a long list of all of the TAs for both classes. Feel free to reach out to them individually, but I’ve also posted their office hour schedules. I think we have at least one office hour every day, so you should be able to find someone to help you. My office hours are going to be tomorrow morning. There’s a Zoom link already on Piazza. You can show up to my office if you like; I’ll usually be there. But Zoom is just fine, or if you can’t find me, try me on Zoom. As soon as I posted this link on Piazza, people kept joining my Zoom room all day, I don’t know why. Tuesday morning, OK. Good, I think all that information is on Piazza by now. So yeah, that’s about it. You can self-sign-up for Piazza and self-sign-up for Gradescope.

Julian McAuley 52:07
I’m sorry things are not integrated with Canvas, but it doesn’t seem to work properly for a class that’s actually several different sections, co-scheduled, but with different pieces for each one. That didn’t really work properly, so just go to the course web page and self-sign-up for everything. I should finally mention something I didn’t get to: I do go and put my last year’s course evaluations on my web page for full transparency, just so you know what you’re getting into. I do suggest you go and read those to understand some things you may like or dislike. Certainly I am going to change some of those things, especially in reaction to some negative feedback about, say, last year’s midterm. Other things, like the competition, are probably not going away, so if you don’t like that, give it some thought. Someone asks if you can join both Piazzas. You can definitely join both Piazzas; just don’t join both Gradescopes. Comprehensive exam: the bar for assignment one is 15 out of 25, so what does that correspond to? 60%. You know, last year not everyone liked taking a class on Twitch, I guess, or maybe people were just in kind of a down mood during the pandemic. Fair enough. It was tough going for me as well, so spare a thought. I did see some funny comments, like “his voice is overrided by baby cries”. Geez, too bad. I mean, come on, the baby was never in the same room as me. There were, what, 30 hours of lectures? I don’t think there was more than 10 minutes of crying babies. But you see “his voice is overrided by baby cries”, and the next comment is, you know, “not understanding of different people’s situations”. It’s a bit ironic. They’re different people, obviously, but yeah, it’s not so straightforward for me either. Not everyone liked Twitch. Yeah, fair enough, I think, at least.
It seems that the classroom is not at capacity, for those watching on Twitch. Maybe don’t show up this week, but if you do want a seat, there’s probably going to be one here for you. Just make sure the classroom’s not overfull. You can also watch things on the podcast or at home. Yeah, people didn’t like Kaggle last year. Normally people like Kaggle a lot, and it’s fun, and it’s only a tiny bit of your grade, but I think people were just not in the mood to see how they ranked compared to all other humans all at once. It was not the mood for 2020, I guess. OK, but yeah, take a look at that in your own time if you’re curious. Right. So we’re going to get on to some real material now, unless there are any more logistical questions about anything, from anyone in the room or anyone on Twitch. No? Well, OK, let’s talk about some regression for 15 minutes or so. OK, so this is our topic for the rest of this lecture and next lecture.

Julian McAuley 55:26
How can we build supervised learning algorithms that predict labels from a training set, and how can we understand the function that associates features to labels? “What is the window utility you’re using on your Mac?” This is not a Mac.

Julian McAuley 55:45
This is Windows 11, very exciting stuff. I do try to answer the questions even if they are a bit off topic. I’m using Windows 11 because I bought this computer, which is like a Microsoft Surface, but it’s some bizarre 32-bit PowerPC or ARM processor, and I couldn’t get TensorFlow to work on it. So I had to use a 64-bit operating system, and I had to sign up to a Microsoft beta program that allowed me to install Windows 11, so I had no choice. Where can we find the previous slides? Those are at the very top of the web page; they’re called course outline and introduction.

Julian McAuley 56:32
Great, so, supervised learning: we’re trying to predict labels from data. We’re trying to infer the underlying function that explains the association between some data, or some features in the data, and some labels that we’re trying to predict. In the case of regression, we’re going to be talking about real-valued outcomes like ratings. This would be an example of a supervised learning problem: can we build a recommender system that says which of these movies I will give the highest rating to?

Julian McAuley 57:03
OK. Yes, you can get an A with a bad ranking on Kaggle; you’ll do fine. The first assignment is only a small fraction of the grade. OK, so we’re trying to build a recommender system. We’re trying to build some predictive algorithm that says which film will receive the highest rating. So what are labels and what are features in a data set like this? Labels would be the ratings we’re trying to predict, and the data, or the features, would be information we extracted, maybe about the movies or maybe about the users. So what are the genres of the movie? Who is the director? What is the MPA rating of the movie? How long is it? What is its budget? Whatever you like. Those would be features that might be associated with the labels we’re trying to predict. Features associated with the user could also be predictive: different demographic information, like their age or gender, could be correlated with, or associated with, whether they’re likely to enjoy certain movies or not. OK, so really this problem of estimating labels from a data set is, in this case, one of predicting a star rating from a bunch of features associated with the user and a bunch of features associated with the movie.
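
To make the split between features and labels concrete, here is a minimal sketch. The records, field names, and values are all hypothetical and chosen only for illustration; a real data set would have its own fields.

```python
# Hypothetical rating records; every field name and value here is made up.
records = [
    {"user_age": 12, "movie_length_min": 95, "is_comedy": 1, "rating": 4.0},
    {"user_age": 34, "movie_length_min": 150, "is_comedy": 0, "rating": 2.0},
    {"user_age": 16, "movie_length_min": 110, "is_comedy": 1, "rating": 5.0},
]

def feature(record):
    """Turn one record into a feature vector (user and movie information)."""
    return [record["user_age"], record["movie_length_min"], record["is_comedy"]]

# X holds the features (what we observe); y holds the labels (what we predict).
X = [feature(r) for r in records]
y = [r["rating"] for r in records]

print(X)
print(y)
```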

Julian McAuley 58:22
So there are lots of ways we could build this function, right? A machine learning task is about figuring out automatically what this function is: what is the underlying function that explains the relationship between the features and the labels? There are tons of ways you can do that, and you don’t have to do it with machine learning if you don’t want to, right? So the alternative would be something like this: you design a predictive algorithm based on prior knowledge.

Julian McAuley 58:49
This is an extremely crude mockup, but you could say, look, if a user is very young, they’ll like G-rated movies. If they’re a teenager, maybe they’ll like PG-rated movies, so on and so forth. This would be a valid solution to this problem; it would be using the features to make a prediction about the rating. But it would not be what we’re going to call a supervised learning algorithm.
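
A hand-written rule of this kind might look like the sketch below. The age thresholds and the returned star values are invented for illustration; the lecture only describes the general idea of matching young users to G-rated movies and teenagers to PG-rated movies.

```python
def predict_rating_by_rule(user_age, mpa_rating):
    """Guess a star rating from hand-written rules; nothing is learned from data."""
    if user_age < 13:          # hypothetical cutoff for "very young"
        preferred = "G"
    elif user_age < 20:        # hypothetical cutoff for "teenager"
        preferred = "PG"
    else:
        return 3.0             # no rule for adults here: fall back to a neutral guess
    # Made-up scores: high if the movie matches the assumed preference, low otherwise.
    return 5.0 if mpa_rating == preferred else 2.0

print(predict_rating_by_rule(8, "G"))
print(predict_rating_by_rule(15, "G"))
```

Note that no ratings were consulted to choose these rules, which is exactly why this does not count as supervised learning.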

Julian McAuley 59:14
So why is it not a supervised learning algorithm? Because we’re not learning that relationship automatically; we’re not learning that relationship based on the data. We’re just inventing a solution out of our own heads rather than letting the data tell us what the relationship should be. That’s the difference between supervised learning and not supervised learning.

Julian McAuley 59:43
So yeah, this is an alternative solution. Maybe I could do something a bit more intelligent. I could say, well, I’ll collect text associated with the movie, and I’ll collect your social media posts from Facebook, and if they use common language, if the language in your social media posts is similar to the language used to describe the movie, maybe that means you’ll like it, right? That’s another,

Julian McAuley 01:00:09
another solution I could come up with to estimate this predictive function. OK. And you say, is that supervised learning or not? It’s a little bit smarter, but maybe it’s still not supervised learning. So what’s the difference between solution

Julian McAuley 01:00:26
one and solution two? In solution one, we just invented this association out of our heads, using common sense. We never looked at the data at all to do that, and we never looked at the labels at all to do that. To build the second solution, well, it’s getting a bit cleverer, since we did look at the data: we actually tried to find associations between users and movies. But we never actually looked at the labels; we never actually validated whether that was going to be a good association or not. And solution three, the one that is more like supervised machine learning, is going to say: can we actually figure out which attributes are positively associated or negatively associated with the ratings, and recommend things based on those attributes? So yeah, I’ll get to the definition of regression in a bit. The slides are not the same as the Coursera slides, but they overlap, and the Coursera slides are certainly slower and more introductory, which is why I said, if you’re a bit behind on the catch-up material in the first couple of weeks, that could be a place to look. There’s some overlap between topics, but those slides are slower and simpler.

Julian McAuley 01:01:48
OK, so yeah, this third example is supervised learning in the sense that we are actually learning associations between the training data we have and the labels we’re trying to predict. That is the essence of what a supervised learning algorithm actually is. OK, so, the thinking of today is that machine learning is fantastic and we should do it all the time. But why might you want to use supervised learning or not use supervised learning?

Julian McAuley 01:02:20
A disadvantage of this very first approach, where I just invented a solution, or invented this relationship, is that possibly the assumptions about that relationship are false. Maybe it’s not true that young people really like G-rated movies and that teenagers like PG-rated movies. Maybe that’s wrong. Secondly, this kind of solution can’t possibly adapt to new data sets. If you want to apply it to something even slightly different from movie recommendation, there’s nothing you can do; you have to invent a new set of rules that’s going to work. But yeah, it’s not all bad.

Julian McAuley 01:02:56
The advantage of this kind of solution, with not relying on the data, is you don’t need data to build it, right? If you’ve never seen anyone rate anything before and you haven’t collected any data of user reviews or social media posts or anything, you can still build this kind of trivial hand-engineered solution. So maybe it’s good for just initially booting up something that kind of works. Where can you find the review PDF? You can find it on my homepage. Take a look.

Julian McAuley 01:03:29
OK, so the second solution. Disadvantages: again, the assumptions could be false, that people’s social media posts are somehow related to the kind of movies they watch. A similar disadvantage: it may not be particularly adaptable to new settings. Even if this is true for movie reviews, it may not be true for what kind of car you want to buy or something. Who knows?
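
For reference, the second solution (comparing the language of a user’s posts with the language describing a movie) could be sketched as follows. The lecture does not say how similarity would be measured; Jaccard similarity over word sets is used here purely as an assumption, and all of the text is made up.

```python
def words(text):
    """Lowercase a string and split it into a set of words."""
    return set(text.lower().split())

def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical user posts and movie synopses.
posts = "i love space battles and robots"
synopsis_a = "robots and space battles in a distant galaxy"
synopsis_b = "a quiet romance in paris"

sim_a = jaccard(words(posts), words(synopsis_a))
sim_b = jaccard(words(posts), words(synopsis_b))

# Guess that the user prefers whichever movie is described in more similar language.
print(sim_a, sim_b)
```

This uses data (text) but never looks at any ratings, so nothing checks whether higher similarity actually goes with higher ratings.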

Julian McAuley 01:03:54
The advantage, similar to the first approach: this time we do require data. We require data of social media posts and we require data of movie reviews, or, sorry, movie synopses, I should say. But we don’t require any labeled data. We don’t actually need to see people rating movies. Maybe social media posts are relatively easy to collect, and actually seeing what people like and what they purchase is much more difficult information to harvest. So this is still potentially a reasonable solution if you haven’t been able to collect that data. And yes, solution three, the machine-learning-style solution to this problem,

Julian McAuley 01:04:35
kind of swaps these things. Disadvantages: it can be difficult to collect this kind of data. You need to have a large data set that has meaningful features in it, and it has to have the kinds of labels you’re trying to predict. But then the good thing about machine learning is that we can very directly optimize the kind of measure we really care about predicting, which in this case is estimating ratings, or in another setting might be predicting a purchase or a price paid or anything like that. And that makes it very easy to adapt to new settings. If you want to adapt this from movie recommendation to book recommendation or fashion recommendation, you need new data sets, but you don’t really need a new model; you just retrain the model. You don’t need to hand-engineer these relationships between features. The model will just see what relationships exist in the data and make a prediction as accurately as possible. OK, so yeah, that’s what learning is, and in particular what supervised learning is, which is nearly 100% of what we do in this class. There are also these related topics like unsupervised learning, where we’re trying to do essentially pattern discovery in data sets, finding things like low-dimensional structures or relationships in data, or topics in documents, or things like that, but where we don’t have a specific predictive task in mind, like estimating ratings. Versus supervised learning, where we’re trying to directly model this relationship between input and output variables, going from features or from data to things we want to predict. OK, and yeah, regression is going to be, if I still have time to talk about it (I’ve got a few more minutes), just the simplest, most basic version of supervised learning: how can we learn relationships between some features and some output variables we’re trying to predict?
So this is an example of what’s called a linear regression algorithm. First equation of the day; I haven’t been too hard on you. So we have a predictor of this form, which I’ll expand in a minute.

Julian McAuley 01:06:46
We say that the labels we’re trying to predict are based on the features, or the data we’ve observed, multiplied by some unknowns, which tell us which features are relevant: which features are positively associated with the labels we’re trying to predict, which features are negatively associated, and which features are not associated at all.
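
The slide’s equation is not captured in the transcript. One standard way to write a predictor of this kind, assumed here and using the letters m and b that the lecture uses for slopes and offset, with x_1, ..., x_K as hypothetical names for the features, is:

```latex
\text{label} \;\approx\; m_1 x_1 + m_2 x_2 + \dots + m_K x_K + b
```

Under this form, a positive m_k means feature x_k is positively associated with the label, a negative m_k means it is negatively associated, and m_k close to zero means it is not associated at all.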

Julian McAuley 01:07:11
So here is your simplest example of a regression algorithm, basically something like finding a line of best fit. Can we estimate, from a feature of our data, which is the height of a user or person, what their weight is? That’s an example of a regression algorithm, or if you like, it’s an example of a supervised learning algorithm. We’d like to find this line, in this case a straight line, of best fit, which best approximates the data. Maybe you’ve seen this in statistics or something, but to think about it in terms of machine learning, this is not just a line of best fit.

Julian McAuley 01:07:53
It’s also a predictive algorithm, right? Now, if you have a new person and I observe their height, then I can say what point that corresponds to on this line, and that is my prediction of their weight. So this line of best fit is really a simple predictive algorithm. And yeah, maybe you remember a little bit of algebra: the equation for a line is going to be y = mx + b, or in this case, we have

Julian McAuley 01:08:25
weight, the thing we’re trying to predict, is equal to m times height, plus b. OK. Great. Sorry, I do sometimes draw things twice, so you don’t have to depend on my handwriting. Alright, so yeah, if you can find the line, you can make predictions. That’s the simplest predictive model you can even come up with.
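
Once m and b are known, prediction is a one-line computation. In the sketch below the slope and offset are hypothetical values picked for illustration, not values fitted to any data, and the units (cm, kg) are likewise assumed.

```python
# Hypothetical slope and offset; these are NOT fitted to real data.
m = 0.5      # kg of weight per cm of height (made up)
b = -20.0    # offset in kg (made up)

def predict_weight(height_cm):
    """Read the predicted weight off the line: weight = m * height + b."""
    return m * height_cm + b

print(predict_weight(170))  # 65.0
```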

Julian McAuley 01:08:54
How do you actually formulate this problem? Obviously, a line of best fit is an approximation. It doesn't correspond to the data exactly, so how do you make sure your approximation is good? How do you decide which line is best? There are many different lines. How do you choose one? How do you come up with the measurement that says what the best is, and so on and so forth? Alright, so there's my equation from the previous slide. We're trying to fit this equation: y equals, sorry, weight is equal to m times height, plus b.
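The lecture has not yet said which measurement defines the "best" line. One common choice, assumed here purely for illustration, is the mean squared error between the line's predictions and the observed labels. The sketch below compares two candidate lines on a tiny made-up data set; the function name `mean_squared_error` and all numbers are hypothetical.

```python
def mean_squared_error(heights, weights, m, b):
    """Average squared gap between predictions m * h + b and observed weights."""
    squared_errors = [(m * h + b - w) ** 2 for h, w in zip(heights, weights)]
    return sum(squared_errors) / len(squared_errors)


# Made-up data (not from the lecture); these points lie exactly on
# weight = 0.5 * height - 20, so that line should have zero error.
heights = [150.0, 160.0, 170.0, 180.0]
weights = [55.0, 60.0, 65.0, 70.0]

# Two candidate lines, each given as (m, b).
line_a = (0.5, -20.0)
line_b = (0.4, -5.0)

mse_a = mean_squared_error(heights, weights, *line_a)
mse_b = mean_squared_error(heights, weights, *line_b)

# Under this measurement, the line with the lower error is the better fit.
print("line a is better" if mse_a < mse_b else "line b is better")
```

With real, noisy data no line reaches zero error, and the task becomes finding the `m` and `b` that make this number as small as possible.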

Julian McAuley 01:09:29
How do we go about observing a data set and choosing the best values for m and b? OK, so yeah, that's the problem I'm trying to solve when we're building a regression algorithm. That's at least the very simplest version of it. It can be hard to distinguish a 6 from a b; I will never use the constant 6 in any lecture. It's always a b. Very good, OK. So you know, you can generalize this kind of algorithm to more than a single feature. Previously we had height. You could have many features associated with a user, like their height and their age; in this case we would be fitting something like weight equals m nought times age, plus m one times height, plus b. Alright, it better be OK. It's lovely. So yeah, I guess that's not a line of best fit anymore. That would be like a plane or a hyperplane of best fit, which says if you have two points,

Julian McAuley 01:10:46
here a point for age and height, where does that map to in order to predict the weight? Right. Does supervised learning need to know what the labels are? Yeah. Yeah, I think that question's been answered, but let me know later if not. OK. So yeah, we're fitting this general form of an equation: y equals, we have a slope for every feature. You know, how tightly is age associated with the label, and how tightly is weight, sorry, height associated with the label. So that's this equation here. OK, I'll finish up pretty quickly, but we'll get there. So we can expand that equation out into sort of this inner product here,

Julian McAuley 01:11:42
which says the weight that we're trying to predict is a function of this vector of features x (these are my knowns, these are the things I observe, the height and the age) multiplied by a bunch of unknowns, which I'm going to call theta. Alright, so my line of best fit can be written down as an inner product between everything I observe, x, and these three unknowns, theta. And you'll notice there's this kind of one there, which seems out of place. That's one of the constants that does kind of show up a lot. But if you expand this equation out, if you expand out the inner product, yeah, it'll come out like this: m1 times the first feature, m2 times the second feature, and b times one. Right, so you always have this one sitting there in your feature vectors.
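To make the role of the constant one concrete, here is a small sketch checking that the inner product of a feature vector (with a trailing 1) and the vector of unknowns equals the expanded sum, with the intercept multiplied by one. The helper `inner_product` and the numeric values of the unknowns are hypothetical, picked only for illustration.

```python
def inner_product(x, theta):
    """Sum of elementwise products of two equal-length vectors."""
    return sum(x_i * theta_i for x_i, theta_i in zip(x, theta))


# Observed features for one person (made-up values).
age = 30.0
height = 170.0

# Feature vector with the constant 1 appended so the intercept can
# be treated as just another unknown.
x = [age, height, 1.0]

# Hypothetical unknowns: slope for age, slope for height, intercept.
m1, m2, b = 0.2, 0.5, -25.0
theta = [m1, m2, b]

prediction = inner_product(x, theta)

# Expanding the inner product term by term gives the same prediction.
expanded = m1 * age + m2 * height + b * 1.0
print(abs(prediction - expanded) < 1e-9)  # True
```

Appending the 1 is a notational convenience: it lets the slopes and the intercept live in a single vector, so the whole model is one inner product.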

Julian McAuley 01:12:37
OK, and that's the equation describing the relationship between a single feature, or age and height measurements, and a weight. So what you really have is a ton of observations of different people's weights, so I don't know, weights in pounds, whatever they are. I did use a 6, I'm sorry, I said I wouldn't. So y is equal to, what did we say? We have X, which is going to be a bunch of ones, an age, and what was the other one? Height. So height can be in centimeters or something, I don't know, so on and so forth. These are our observations X, and we have this vector of unknowns, which I could call m1, m2, b, or I can call them theta nought, theta one and theta two, which I'll just call theta. So I have this equation describing the relationship between my features and my labels in terms of this small vector of unknowns. So the question is, yeah, to solve regression, how do I solve that thing for theta? I think I can probably leave that to the next lecture, and I'll leave a few minutes for questions in case there are still any burning on people's minds. Not too much, so I'll let you go, but I'll be down here for the next 5 minutes or so if there's anything still missing. And thanks everyone on Twitch, see you on Wednesday.
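The setup described at the end of the lecture, many observations stacked into a matrix X with a column of ones and a label vector y, can be sketched as follows. The lecture defers how to solve for theta to the next session, so the use of NumPy's `np.linalg.lstsq` (a least-squares solver) here is an assumption, one standard approach and not necessarily the one the course presents. All data is synthetic and noise-free, generated from a hypothetical `true_theta` so that the recovered values can be checked.

```python
import numpy as np

# Synthetic observations (made up for illustration): ages in years, heights in cm.
ages = np.array([25.0, 32.0, 41.0, 29.0, 53.0])
heights = np.array([160.0, 175.0, 168.0, 182.0, 171.0])

# Hypothetical "true" unknowns used only to generate labels:
# slope for age, slope for height, intercept.
true_theta = np.array([0.2, 0.5, -25.0])

# One row per person: [age, height, 1]. The column of ones carries the intercept.
X = np.column_stack([ages, heights, np.ones_like(ages)])

# Noise-free labels, so a least-squares fit should recover true_theta.
y = X @ true_theta

# Assumed solver: least squares finds theta minimizing the squared error of X @ theta - y.
theta, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(theta, true_theta))  # True
```

With real measurements the labels would not lie exactly on a plane, and the recovered theta would be the best approximation under the squared-error criterion instead of an exact match.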
