Thursday, March 10, 2011

Final office hours

As mentioned in class, we'll have 3 office hours to help with the final HW and project plan...

Today at 7pm (in 30min)
Monday 7-9pm
Tuesday 7-9pm

Be sure to attend so that the TAs don't get lonely!

Wednesday, March 9, 2011

Clickmaniac final results

Well, now that the ad auction assignment is over, I've arrived at a name for the winners -- clickmaniacs.

I know I'm a little late in posting this, but here are the final results (for posterity):

Doris, Dai Wei, Wenqi: 4523 clicks, 2813 actions
Wen-Hao: 2557 clicks, 1153 actions
Emil: 2511 clicks, 1325 actions
Mihail & Nihar: 761 clicks, 382 actions
Jami & Giordon: 732 clicks, 374 actions
Zack & Ye: 360 clicks, 255 actions
Fred & Scott: 329 clicks, 142 actions
George & Matt: 274 clicks, 85 actions
Ben & Claudia: 169 clicks, 71 actions
The TAs: 108 clicks, 38 actions
Andrew: 94 clicks, 24 actions
Sasha & Alejandro: 61 clicks 74 actions
Isaac & Bo: 36 clicks, 7 actions

Great job everyone!  I'm blown away at how many likes/clicks you guys were able to get in a week.

Final lectures

We're into the last week, and this week we're focusing on what actually makes all the things we've talked about so far in the course possible -- Data Centers.  On Monday Jeremy gave an excellent overview of the MapReduce, the simple but powerful programming paradigm used for many tasks in data centers.  You'll get a lot of practice with MapReduce on the HW, so that hopefully you're comfortable with it by the end of the course.  In addition to the details of programming with MapReduce, Jeremy discussed the how MapReduce is actually implemented and gave some amazing statistics about its use within Google.

That brings us to today's lecture.  Today, I'll spend half the class giving an overview of the design of data centers themselves, with a focus on energy-efficiency issues.  If you find yourself interested in these issues, you should definitely sign up for 146 this spring since we'll be going into a lot more detail about research in this area. 

Then, the last thing we'll do is a course review, where I'll quiz you all about the "big" ideas from the course, and reward correct answers with candy... Please come so that you get a reminder of all the things we covered this term. 

Monday, March 7, 2011

HW 7 is out

The last Homework is now posted... As usual, ask your questions here.

This one has half about auctions (GSP) and half about MapReduce (which you'll learn in today's lecture).

This Homework and your project plan are the last things of the term.  Both are due at the end of finals, but there is no reason why you can't turn them in earlier if you don't want to hang around until then!

Monday, February 28, 2011

Project plan

As I mentioned in class, your "final exam" will be to complete either a project plan or a course module.  The details are now available on the course website. 

Basically, if you're taking 145 you'll do a 5-10 page plan for the project that you'll pursue in the spring term, and if not, you'll do a new course module (lecture plus a few HW problems).  If the course modules are good I may adopt them for next year!

Wednesday, February 23, 2011

Lecture 9

Today we finished up routing games and then started lecture 9 -- ad auctions.  We'll be doing ad auctions for the next 2-3 classes...so we'll be covering a lot about them, but still it's only the tip of the iceberg.

Today we covered a little bit of historical perspective about them and next class we'll get into the theory.  We'll start with the basics of auction theory and then move to the special case of ad auctions and ad exchanges (where we'll do a little in class experiment to give you a feel for the problem).

Note -- we will be having a make-up lecture on Friday 2/25 (this friday) at 1-2:30 in Ann 213.  Please be sure to attend since we'll be covering important material. 

Also, we will be having only 1 office hours this week (since the HW is not due until next friday).  Raga will be covering office hours Thursday 7-9.  Please stop by if you have questions about the ad auction competition or about the theory part of HW6.

Monday, February 21, 2011

Homework 6

Homework 6 is now out on the course website. Do get started on problem 4 immediately - even though the competition doesn't start till Wednesday, you need to start setting up your campaigns and thinking about your advertising strategies asap so that you have time to contact me (Raga) in case you run into difficulties, and sort things out. Good luck!

Thursday, February 17, 2011

Rankmaniac finale

We had the "final search" for rankmaniac 2011 yesterday and here are the results.  I hope you had fun and learned a lot...I was certainly impressed with a lot of your ideas!

Google
1) Giordan & Jamie
2) Mihail & Nihar ***
3) Zach & Ye
4) Fred & Scott
5) Sasha & Alejandro
6) Isaac & Bo
7) TAs
8) Wen-Hao
9) Doris, Dai Wei, Wenqi
10) Claudia & Ben
11) Emil
12) Andrew
13) George & Matt

Bing
1) Fred & Scott ***
2) Doris, Dai Wei, Wenqi
3) Wen-Hao
4) Zach & Ye
5) TAs
6) Emil
7) Isaac & Bo
8) Sasha & Alejandro
9) George & Matt
10) Giordon &  Jamie
11) Claudia & Ben
?? Andrew
?? Mihail & Nihar

*** means top result for "rankmaniac"

Monday, February 14, 2011

Lecture 8

Today we started the "network economics" section of the course in earnest.  Our "warmup" is the case of "load balancing games".  These games are motivated by the fact that a data center / server farm is simply too big to optimally assign jobs, so instead we could consider just using a greedy heuristic of sending jobs to the best server they can go.  Well, this algorithm can be viewed as a game where the players are the jobs...and we spent the lecture starting the analysis of this game. 

Remember -- Wednesday is the conclusion of "rankmania".  We'll have the ceremonial search at 12:30 in Ann 213.  So, feel free to bring your lunch...but make sure someone from your group is there so that we can easily identify which pages belong to which group and assign grades.

Also, remember that the project proposals are due wednesday at 12:30 too, so bring them along with you.

Friday, February 11, 2011

Rankmaniac reports

The rankmaniac reports from Homework 4 were due yesterday. I found two Rankmaniac reports yesterday outside Adam's office, and none today. I hope the rest of you have turned in your reports electronically to Bose (the Guru of Homework 4). If you have not submitted your reports yet, perhaps because you forgot about the report altogether, please do so immediately. Remember, the report alone is worth 50 points. If you don't submit your reports by tonight, you may not be able to claim any credit at all for a late submission.

Tuesday, February 8, 2011

Lecture 7

Yesterday's lecture was on "cascading behavior in networks", a.k.a. epidemics over networks, a.k.a., information cascades, a.k.a., diffusion over networks, ...  It goes by many names.

We looked at a few "toy" models that I hope gave you a flavor for what cascades are and how network structure (specifically clustering) affects their growth.  We really only touched the tip of the iceberg however, and so there are lots of related questions that would make great project ideas on this topic.  I hope to see a bunch of ideas here when you turn in your project proposals.

Remember: HWs are due on Friday this week, and so office hours are Wednesday and Thursday.  I'll be having mine on Wednesday, so please come early and get started on the HW (even though it's an easy week).  Otherwise I'll get lonely...
 

Tuesday, February 1, 2011

Homework 5

Homework 5 is now up on the course web page.  This HW is purely to get you up to speed with game theory.  If you've seen it before you may find it easy, but hopefully it will be fun. 

If the problems seem hard, or use terms that you don't know, be sure to come to Bose's "Intro to Game Theory" lecture this wednesday during the normal class time.  If you know all the terms in the homework and it's easy for you, there's no need to come on wednesday.

As always, post your questions about the homework as comments to this post.

Project ideas

In yesterday's lecture Michelle gave a great overview of a bunch of different project ideas and useful tools for you to consider when looking for your own project ideas.  I'm excited to see what you all come up with! 

If you're still looking for a group for the project proposals, post a comment here and I'm sure you'll find others looking.

Sunday, January 30, 2011

Homework 4 Update

Some of you had doubts regarding Q2 in HW4 for the case when a webpage does not have outgoing links. Please note that the homework has been updated with a clarification. Download the new version from the course website.

Wednesday, January 26, 2011

Homework 3 clafication

Clarification regarding Q4: Q4 mentions "We add a new central node B to the network, and connect it with every other node on the ring...". Note that the "every other node" should just read as "every node" on the ring. It does NOT mean alternate nodes. So every node on the ring is linked to B independently with probability p.

If anyone has already done with the other meaning, you do not need to redo it. Sorry for the confusion!

Homework 4

Homework 4 was released today and is now available on the course web page.  So begins rankmaniac 2011...  I look forward to seeing all the crazy things you do.

(Remember not to get me in too much trouble though!)

I will be willing to post links to peoples pages from the course website if so desired.  There is a limit of 1 page per group that I will link to though.

Lecture 6

Over Monday and Wednesday this week we made our first venture into "exploiting" network structure instead of just trying to understand it.  Our example of a case where using network structure makes a huge difference is "search" -- the big idea of google back in 1998 was to use network structure to identify important pages, and feed this information into their search engine.  In particular, they defined a notion called "pagerank", which we explored over the two classes. Be sure to take a look at the original papers by Brin & Page on Google...it's quite interesting to read now in the context of what google has become.

Note -- understanding pagerank and the other aspects of search engines that we discuss will be very important for you on HW4, which will be posted after class today.

Thursday, January 20, 2011

Homework 3

As you know, HW3 has been handed out and is available on the website. Its due next Thursday. It seems that some of you are still unsure of some basic concepts in probability. Problem 4 of this homework heavily relies on your understanding of random variables and expectations. Please, feel absolutely free to clarify your understanding during office hours with the TAs. If you prefer, you may also contact me (Raga) personally to set up a time to meet and clarify your doubts.

As usual, ask your questions on the homework as comments to this post... Good luck!

Wednesday, January 19, 2011

Lecture 5

Today our focus was on explaining small world properties.  Remember that, in isolation, neither small diameter or high clustering coefficient is particularly "surprising" or difficult to explain.  It is only the combination that is "surprising"... 

But, we saw that one possible explanation for small world properties is a combination of local correlation with long-range random links.  However, this combination only explains the existence of short paths, it doesn't explain why we can find them so easily. 

To explain this second observation (and to me the more interesting observation) we saw that the long range links have to be proportional to the distance in a very precise manner.  So, it is in some sense much more rare that we can find short paths with myopic algorithms than it is for the short paths to exist in the first place.

Remember though that the models we have discussed for small world graphs DO NOT have heavy-tailed degree distributions!  So, we really haven't given a "natural" mechanism that explains all four of our "universal" properties, we've just given mechanisms for each individually.   Can you think of how to combine our heavy-tailed degree distribution ideas with the small world ideas?

Wednesday, January 12, 2011

Lecture 4

Today we talked a LOT about heavy-tails.

The details of what we covered could fill a whole probability course, so don't worry if you don't feel that you could prove all the results -- focus on understanding the intuition behind what I told you.  (And if you want to learn more, ask me and I'll point you in the right direction.)

In my mind the key things from today are
1) Know some examples of heavy-tailed distributions (Pareto, Weibull, and LogNormal)
2) Know some generic processes that create heavy-tails -- additive process (generalized CLT), extremal processes, and multiplicative processes.
3) Know how to apply these ideas to networks in order to get a heavy-tailed degree distribution.  (We introduced Preferential attachment today, and I'll do the proof of the degree distribution next time.)

Like I said in class, I could go on-and-on about heavy-tails...  So, I probably tried to pack too much into 1 lecture.  You'll likely need to go back and look at the notes (which will be up soon) in order to really process everything we talked about. 

Monday, January 10, 2011

Homework 2

I know you're all still working on HW1, but HW 2 is now available on the website.  We've covered everything you need to do it at this point, so feel free to get started if you're looking for things to do!  It's due a week from Thursday.

As always, ask your questions as comments to this post... Good luck!

Lecture 3

Today I showed a bunch of examples of other complex networks besides the web and illustrated that the properties we saw in the web graph aren't special to the web -- they happen in most of these other networks too.  So, they are in some sense "universal".

This motivates two natural directions:
1) Trying to understand what causes these properties to emerge
2) Figuring out how to exploit these properties.

We'll to 1 and then 2, spending about 3-4 lectures on each.

We started on 1 today by exploring Erdos-Renyi random graphs.  What we saw is that through a very simple random graph model we could obtain connectivity structure (a giant component and a small diameter) that very closely mimic what we saw in the web graph and the other networks.  So, in some sense, there is a very simple explanation of these properties -- they result from independent random connections.

But, the Erdos-Renyi graph does not provide an explanation for clustering or heavy-tailed degree distributions.  Understanding why these occur so often will be the focus of the next two lectures.

Thursday, January 6, 2011

Bonus Lecture: Probability Refresher

This is a reminder for the bonus lecture (a probability refresher) I will be giving tomorrow (Friday, Jan 7) at 1pm in Annenberg 213.

Here is the plan - we'll start by recalling the definition of a random variable and then review what discrete and continuous random variables are. Following that, we'll revisit the different types of distribution functions (probability mass function, probability density function, cumulative distribution function) and discuss some standard and well known discrete and continuous random variables (Bernoulli, binomial, geometric, Poisson, uniform, exponential, normal). Then we'll define expectation of a random variable and review some of the important properties of the expectation, with particular emphasis on linearity and its applications using indicator random variables (useful for HW1).

You don't have to attend the lecture if you are really comfortable with these concepts (and can comfortably solve Problem 2 of HW1). But if you've never really kept in touch with probability for a long time, then you'd probably benefit from the lecture.

It'll be a short lecture (about one hour). The slides are pretty verbose, and could serve as a reference later on. They will be posted on the course website after the lecture tomorrow.

Update: They are now posted on the course website.

Wednesday, January 5, 2011

Lecture 2

The notes and slides for lecture 2 are now up.

We didn't get quite a far as I'd hoped, but we'll finish up the slides next class by looking at the structure of lots of networks other than the web.

The key take-aways from today's lecture are the following 4 interesting properties:

1) Connectivity:  The web has one "giant" connected component and all other connected components are tiny.

2) Diameter: The distance between nodes is quite small (~6 avg. distance between noes in the SCC) and, for at least the beginning decade of the web, grow logarithmically with the size of the graph.

3) Degrees:  The degree distribution is not Normal... Instead it's heavy-tailed.  This means we should expect a few nodes with really "huge" degree and "lots" of nodes with small degree.

4) Clustering:  The web exhibits strong "homophilly", i.e., if A is connected to B and C it is significantly more likely than random for B to be connected to C.

We'll see next time that these four properties are "universal"...and then we'll try to explain why they are so common.

Don't forget to attend Raga's probability refresher on Friday @ 1pm in Annenberg 213!

Tuesday, January 4, 2011

No office hours this week

Since we don't have a HW set due until next week, and we haven't really gotten going yet, I guess we won't have office hours this week (Tue or Wed), so the first office hours will be mine on the evening of Tues 1/11 at 7-8pm in Annenberg's first floor conference room.   The TAs will then have their office hours on Wed 1/12 7-9pm.

As I said in class, I really hope that most of you will come to the office hours, even if you don't have questions, and work on the HW there.  Then, when you have questions you can ask them of me or the TAs (or your classmates). 

Monday, January 3, 2011

Homework 1

Homework 1 was passed out in class today, and is available on the course website.

Note that there is a typo in the due date on the ones handed out in class -- it is due next Thursday, Jan 13 at 1:30pm.  Be sure to start early.

Also, if you're nervous about whether you know enough probability, try out the first two problems and attend Raga's Probability refresher on Friday if you have any trouble.

If you have any questions about the homework (or want to point out typos), write a comments on this post and  the TAs/prof will respond ASAP.

Lecture 1

Glad to see so many faces at the first lecture -- I hope I didn't scare too many of you away...

If you still have any questions about what the course is about or about the procedural details of the course, please post them as comments here.  Also, I really recommend reading through the handouts that I gave out in class.  For those of you who didn't get a copy, there are links to them posted on the course website now.

Also, be sure to remember that Raga will be giving a probability refresher on Friday at 1-2pm in Annenberg 213. If you haven't had any probability since Math 2, I highly recommend attending.