Wednesday, December 15, 2010

Welcome to 144, version 2.0

Hi everyone,

Welcome to the 2nd year of 144.  You'll have the benefit of last years students debugging some of the assignments and lectures, but the drawback that I'll be adding even more things to the course that we ran out of time for last year!

Most of the details of the course are on the course webpage, but this blog will give a place where I can post announcements, bug fixes for homeworks, etc.  We will typically use this blog instead of broadcasting emails, so you should check it very regularly...  Also, we hope that you use this as a place to post questions about the homeworks/lectures/etc so that others can benefit from the answers.

Friday, March 12, 2010

Course wrapup

At this point, all that is left for you all to turn in is your project plan/course module, which are due 3/17.

When you turn it in, you can also stop by to pick up the remaining graded assignments from my office.  Almost everything should be graded by that point.

If you have any questions about the project plan/course module or your grade, please email me.  I won't be around campus, but I'll be happy to discuss things with you anyway.

Rankmaniac reborn results

Our last experiment of the term would down yesterday and here're the final standings.  The TAs held their own, but Jonathan & Manuel ran away with it ... winning by 175 clicks!  And it seems that click fraud was indeed very hard -- teams that put a lot of effort into it didn't get many returns.

I hope it was fun for everyone...

Here are the final standings:

1) Jonathan & Manuel  (514 clicks!)
2) Daniel, Michelle, Jeremy, & Walter
3) Dallin, Rishi, & Anthony
4) Patrick & Rico
5) Marland, Christina, Jiyoung
6) TAs (91 clicks)
7) Bose, Michelle, Thomas, Sonal
8) Ram, Victor, Esther
9) Daniel, Chris, Andy
10) Fernando, Masoud, Zhenhua
------------40 clicks---------------
11) Hailey, Jim

Monday, March 8, 2010

an adwords suggestion

As some of you realized, your campaign in adwords may stop/reduce gaining clicks after some time. Remember that the difference of the bidding algorithms between google and yahoo. google uses your bid submitted times the clickthrough rate (CRT) to bid the ad slots. CRT is initialized some value and then measured during your campaign. If the CRT of some keywords in your campaign become very low, you may try some new keywords to increase your clicks.

Last lecture!

We had the last lecture of the term today.  We finished up our discussion of ad auctions by talking about VCG and the relative drawbacks/benefits of it compared with GSP designs. 

Then, we finished with a brief overview of the term...please look at the slides in loo of a final just to make sure you remind yourself of all the topics we covered this term :)

...and thanks again to everyone for all the hard work. 

Note -- I will be off campus for much of the next week and a half, so please email me if you'd like to meet to discuss the project plan or course module design.  I'll be happy to set up an appointment.  You can also discuss this with the TAs either Tues or Wed night (they will be covering my office hours for me this week).

Thursday, March 4, 2010

Monday is the last lecture

...as I announced in class yesterday.  Monday will be the last lecture. 

I'll present VCG and then we'll discuss possible reasons why Google/Yahoo aren't using it.

Also, I'll do a brief review of what we did during the term.  Please try to attend.  Since I'm not having a final, I'm hoping this lecture serves the goal of causing you to think back over what we've done during the term.

Wednesday, March 3, 2010

Bug in HW 6

We've found this bug in homework 6 : in Fig. 2, ignore the bit that says r_1 = 1.

This means r_1 can take any arbitrary value in the examples you create.

HW 7 is out

The last HW is now available on the web page...

Be sure to start the "rankmaniac revisited" problem TODAY! Otherwise you're missing a whole $1 of your budget...

Tuesday, March 2, 2010

Ceremonial conclusion to Rankmaniac

The rankmaniac competition is coming to a close this Thursday...

The official results will be gathered from a search at 12:15 in 209 Annenberg.  All comers are welcome.  I will have a prize for the winning team, so please come if you're near the top :) 

I'll order some pizza too since it's lunch time.  (I'll ask in class tomorrow whether you're planning to come or not so that I can get some kind of estimate on how much to order.)

Monday, March 1, 2010

Lecture 15: Auction basics

Today we went through a brief overview of sponsored search and then transitioned into the basics of auction theory.  We didn't quite make it through the "basics of auction theory" lecture notes, but I'll post them soon anyway.  Next class we'll finish up those notes and then move on to studying how ad auctions actually work.

Note -- We took a vote in class and decided to change the project plan assignment deadline until the end of finals.  So, the project plans are now due on 3/17.

Thursday, February 25, 2010

HW 6 is out

...post any questions you have here.

Lecture 14: Project proposal presentations

I really enjoyed hearing about all your project proposals and am excited that so many of you are planning to work on them next term in 145.

For those of you interested in 145 who don't have a group yet ... if you heard about a project that interested you yesterday, just talk to those students (or post a comment here) about possibly joining forces.  You should be sure to find your group soon because the project plan due at the end of the term is meant to be the concrete plan for approaching the project in the spring.

Monday, February 22, 2010

Lecture 12-13: Routing Games

During the last two lectures we've focused on "routing games" a.k.a. "congestion games" where sources must choose the route they (or their packets) take through a network to a destination.  We've gone through the same set of analyses we did for load balancing games -- We studied the existence and the efficiency of Nash equilibria.  We saw that Nash equilibria need not exist, but that in the case of linear latency functions they will always exist and, not only that, they will be quite efficient (i.e. small "price of anarchy").  But, unlike load balancing games, the "price of stability" is not 1, and so the optimal routes are no longer always an equilibrium.


In addition to the particular application of routing games, we also introduced the class of "potential games" which has a number of important properties that helped us in our analysis.  Further, we saw an example of the technique of "marginal cost pricing", which can be used to align the local incentives of players with the global goal.  These two concepts are important and useful far beyond routing games.

Next class, we'll leave routing games and move on to the study of Ad auctions.

Wednesday, February 17, 2010

Hw 5 is out

Homework 5 is out! Grab it from the course page. Note that the printout handed out in class missed a page!

Please post your queries/commentsregarding this homework in the comments.

Project proposal presentations Wed 2/24

After taking a look through the project proposals everyone turned in, I've decided that it will be fun to give you all a chance to hear what your classmates proposed.  This could be especially useful for people who are taking 145 next term since they might find something really exciting to pursue.


So, next Wednesday (2/24), we will have a project proposal presentation session. 

Here's how it will work: Each group will have 5 minutes.  They can use this as they please -- presenting 1 idea or multiple ideas.  You can use the blackboard, the projector, or just talk.  You can have 1 group member talk or multiple people talk.   It's completely up to you.  However, if you want to use the projector you must send me your files before noon on the 24th so that I can preload them onto my laptop (which is a PC -- pdf or ppt are the safe choices).

The quality of the presentation will be incorporated into the grade for the proposals.

Thursday, February 11, 2010

Lecture 11: Load balancing games

Today we looked at our first CSy games -- load balancing games.  These games are motivated by the fact that a data center / server farm is simply too big to optimally assign jobs, so instead we could consider just using a greedy heuristic of sending jobs to the best server they can go.  Well, this algorithm can be viewed as a game where the players are the jobs...and we spent the lecture starting the analysis of this game.  We've seen so far that Nash equilibria always exist and that the optimal assignment is always an equilibrium.  Next class we'll look at how bad equilibria can be, and then we'll move on to routing games.

My notes include everything for the next lecture too, so I won't post them yet. 

Also, remember that Monday is a holiday, so we won't have lecture again until Wednesday.

Monday, February 8, 2010

Lecture 10: Economics and networks

We started the day with a quick review of pagerank and all the other topics we covered in Module 1, and then we finally moved on to Module 2 --Economics and networks.

We spent the lecture providing a primer on (i) why game theory is important for understanding today's networks and (ii) providing an introduction to "baby" game theory, which will hopefully be enough background for us to cover the 3 topics in this module.  Wednesday, we move to our first case study -- how game theory provides insight for routing.

Note -- I messed up the example where there was no Nash Equilibria in class today...but it's correct in the notes that are online.  I'll go over it at the beginning of the next class.

Tuesday, February 2, 2010

HW4 is out!

This is a fun one (I think) ... so I hope you enjoy it. Grab it from the course web page.

Like I said in class, be sure that you start early because if you don't you'll be in at a big disadvantage for the "gaming google" problem.  And, if you're having trouble finding a group of 2-3 people for the "gaming google" problem, just post a comment here and coordinate with people that way.  I'd like everyone to have a partner for it, but if you really can't find one feel free to do it alone.  I'm really hoping that you come up with some creative ways to get ranked highly, so think outside of the box....and talk to me and the TAs if you're stuck.

Monday, February 1, 2010

Lectures 8-9: Search

We moved on from the network structure and network formation topics that we've been studying so far, to now discuss one example of how to exploit our understanding of network structure -- Search.  We spent most of today talking about the general architecture of a search engine (crawling, indexing, ranking, and display), but just got to the point where we were discussing the key new insight of google when it formed -- pagerank.

We ended by putting up a first take on pagerank and I left you with the question of coming up with an example webgraph that "breaks" our first cut at pagerank.  So, we'll pick up there on Wednesday.

Remember -- the next homework will go out soon and you'll be competing in groups of two for ranking on google for the term "rankmaniac 2010".  One key constraint that I forgot to mention in class about this though is your page (which you are aiming to have show up first on google/bing) cannot be on the caltech domain.  So, you have to grab a free blog/wiki/etc from somewhere else to host your page.

Location change for office hours

Apparently the department prefers that we limit access to the 2nd floor of Annenberg in the evenings, so we have to switch the location for office hours.  From now on, we'll hold office hours in the 1st floor conference room in Annenberg (the fish bowl next to the computer lab).  I hope to see you Tuesday night to discuss the project proposals!

Remember, you're required to come at least once in the first 5 weeks of office hours and we're closing in on that number...

Thursday, January 28, 2010

Lecture 7: Project ideas session

We had a very fun lecture from one of our TAs yesterday that hopefully primed you to think about some creative project proposals.  Virgil, the "internet man of mystery", walked through a bunch of ideas for projects that varied from mildy impolite to questionably legal.  If you weren't at the lecture, you should definitely check out his slides, which are now on the course web site.  And, if you want some other ideas check out his web page to see some of his past hacks... 

There are some more standard sources for project ideas listed on the course web site too.  And, if you're interested in doing a theory oriented project, check out some of the papers here, or talk to me or the TAs and we can point you to some interesting starting points.

Late policy

It seems like the term is getting busy for everyone, and there have been a number of requests for extensions on this homework.  So, here's the late policy we'll follow for HWs:

-10 pts if turned in by Friday at 6pm
-20 pts if turned in by Monday at 8am

You must turn in what you have by Monday to get credit unless there are special circumstances that you have talked to me about.

I hope this helps everyone!

Tuesday, January 26, 2010

Be sure to attend Wednesday's lecture!

You definitely don't want to miss Wednesday's lecture.  Virgil will be giving you tons of ideas for interesting projects from a wide range of web related areas...  We know that class will conflict with the career fair on Wed, but please try to go to the career fair early so that you can attend class at 1!

Monday, January 25, 2010

Lecture 6: Small worlds

Today in class we had our last lecture about network structure.  We talked about the combination of high clustering coefficient and small diameter -- so called "small world" graphs.  The main take-away is that small-world properties can result from a simple combination of highly correlated local structure combined with random independent long range connections.  The example we looked at is a lattice of local connections with added random long range connections. 

The second aspect of small world graphs that we focused on is that simple, myopic, distributed agents can actually find the short paths.  This turns out to be a much more delicate phenomenon to model.  But, we saw that it can be viewed as the result of adjusting the long range connections to be "distance-dependent".  However, this dependency on distance has to be very carefully chosen to ensure that myopic agents actually find the short paths.

That's it for network structure!  There's a lot more that we could have covered, but for next class we'll move on and discuss "search" ... which will end up making use of some features of network structure.

Monday office hours moved to Wednesday

Since we've moved the homework submission day to thursday, we'll move the Monday office hours with the TAs to Wednesday.

So from now on, we'll have TA office hours from 7-9 pm on Wednesdays. Adam will continue to hold his office hours 7-8 pm on Tuesdays.

Sunday, January 24, 2010

Project proposals

In case you haven't noticed, the project proposal assignment has now been posted to the course web page.  We'll talk more about this in class this week.   In particular, Wednesday's lecture is meant to help give you some ideas about possible projects. 

The proposals are due Feb 12, so there's plenty of time still, but it's definitely time to start thinking about them.  Please try to use office hours as a place to discuss your ideas with me and the TAs ... we can help guide you to resources and background, or give you some starting points if you're having trouble coming up with ideas.

Wednesday, January 20, 2010

HW 3 is out!

Grab it from the web page...and post your questions here.

Note:  You'll need to use Mathematica/Matlab/Maple for question 3, so if you're not comfortable with these, please be sure to start early and feel free to ask the TAs for help.  

Lecture 5: Heavy tails

Today we had what basically turned into a probability lecture about heavy-tailed distribution.  We spent the time talking about (i) some examples of heavy-tailed distributions and properties of them and (ii) what processes lead to heavy tails.  Then, we applied what we learned about where heavy-tails come from to the context of graphs to give a simple model (preferential attachment) that leads to heavy-tailed degree distributions.

During the second part, we talked about some very deep probability results (generalized central limit theorem and extreme value theory) where I gave only a peek of the results.  If you're interested in learning more about these, let me know and I'll point you to some references.

(The notes will go on the course webpage by the end of the day.)

Thursday, January 14, 2010

Lecture 4: A first attempt at modeling networks

We spent the lecture today finishing up our discussion of G(n,p) random graphs.  We saw that G(n,p) turns out to have many of the connectivity and diameter properties that we saw in the web graph and the other networks we've discussed in previous lectures.  Further, we saw that G(n,p) has beautiful "threshold" behavior for almost every property we care about.  Most of your next homework will involve looking more deeply into the properties of G(n,p) so, if you got lost during the lecture or missed it because you were working on the HW, be sure to look back at the notes and ask questions. (The lecture notes are not available on the course webpage.)

Also, I found out (to my surprise) that many of you have not used "linearity of expectation" much.  If this applies to you, please be sure that you follow how we used it in class and be sure to look at the HW1 solutions so that you can see how it was useful in problem 2.

Monday, January 11, 2010

HW 2 is out

Grab it from the web page if you didn't get a copy in class.

Post your questions as comments here...

Lecture 3: Beyond the web

Today we finished up our discussion of the web graph by looking at "clustering" and then we plowed through a discussion of how the 4 properties we saw in the web graph show up in a wide variety of other networks, and so are "universal" in some sense.

From here we're going to try to understand where these properties come from?  The idea is that since all these networks (which evolve or are built using very different mechanisms) have the same structural properties, there must be some simple and universal explanation as to where the properties come from.

To try to understand this, we started with the simplest possible idea of how a graph might form -- as a sequence of independent random connections being made.  This led us to the Erdos-Renyi random graph, which we started to discuss, and will continue to discuss during the next lecture. 

I've posted the notes for lecture 3 "Beyond the web" on the course website already, but I'll hold off on posting the notes for the Erdos-Renyi graph until Wednesday when we finish discussing them.

Another seminar this week

X-Seed, a company that funds seed projects, will be coming on Thursday Jan 14 @ 2:30 to give a talk on "commercializing ideas" ... evaluating whether an idea can be commercialized and the steps to do so.  The talk is in Ann 213 and is open to everyone.

Friday, January 8, 2010

Relevant upcoming seminars

As I mentioned in the first lecture, there are a lot of interesting topics related to crowd sourcing and content aggregation that rely on machine learning.  We won't go into the machine learning related aspects of these topics in this course, but Yasar's course (146ab) focuses a lot on these issues...and, there's a very interesting upcoming guest lecture on the topic that I'd highly recommend everyone attend!  It focuses on the "Netflix prize" and is given by a lead member of one of the winning teams.

----quoted from Yasar's email ------

Dr. Joseph Sill, a Caltech alumnus and a lead member of one of the two teams
that tied the top score in the Netflix competition, will give a two-part
seminar next week about the technical details and other insights into this
highly celebrated competition in machine learning.

The seminar will be classroom style with whiteboard level of detail, and
will be given as guest lectures in the Learning Systems course, but others
at Caltech are welcome.


When: Tuesday, January 12, and Thursday, January 14, 2:30-4:00 PM.
Where: 070 Moore

----quoted from Yasar's email ------ 
 

Wednesday, January 6, 2010

Lecture 2: The structure of the web

Well, I didn't quite finish what I had hoped to cover in lecture today, but we'll come back to the last section and look at "Clustering" at the start of the next lecture.  The homework does a little bit with clustering, but it's completely self-contained.  So, it'd be great if you finished that before the next lecture!  (Remember that HW2 will go out on Monday even though HW1 isn't due until Wednesday.)

The notes (including the part about clustering) are up now.  They are handwritten, so I hope they're legible.  Please let me know if you find any typos, and I'll correct them.

Tuesday, January 5, 2010

Lecture 1: What is this course about?

I was glad to see so many people at the lecture Monday, I hope to see most of you again Wednesday!

-- It seems like the room will be tight, but workable, so unless lot's more people show up Wed we'll stay put in 213 Ann

-- As per our discussion, my office hours will be 7-8pm on Tuesdays, but if there are lot's of people asking questions I'll probably be willing to stay past 8.

-- I forgot to ask if people want me to have office hours this week.  My guess is that there's not much of a point since we haven't started much yet.  But, if you'd like to come today, post a comment here so that I know there's interest and I'll stick around.  If there are no comments, then I'll start my office hours next week.

-- With regards to the size of the web, Virgil made a nice observation after class: there is a page for each natural number listing interesting properties of it (see wikipedia for example), which proves that the web is at least countably infinite. 

-- Don't forget to start the homework early!  The preferred way to ask questions is to comment on the blog post for HW1.  But, of course, if you have something that you don't want to post, please feel free to email the HW guru (for HW1 this is JK).

Update: The slides are now up

Sunday, January 3, 2010

Homework 1 is out!

Homework 1 has been posted on the course web site.

Please feel free to discuss the homework and ask us any queries in the comments!

Welcome to CS144!

And, welcome back to Caltech!

We will be using this blog to post course announcements and discuss the homeworks/lectures, so please check it regularly!