Friday, January 6, 2012

Homework 1

Homework 1 is now posted to the course website. It is due next Friday at 1:30pm. 

To turn in the HW, you can either drop a paper copy off at Adam's office (I will soon have a box outside my door for submissions), or by email to the homework guru (in this case Raga).

If you have any questions about the HW, or just want to point out typos, please post them as comments on this post and then the TAs/prof will respond ASAP. 

Don't forget to start early!

37 comments:

  1. For 4b, do we necessarily have to include two paths for each pair? (My first path is the shortest/only path I could find)

    ReplyDelete
  2. The same path can certainly serve as "first" and "shortest". I just want you to report both so that we can talk in lecture about how close the two turn out to be on average across the class.

    ReplyDelete
  3. On 1(b), I think I've constructed a graph that works, but I'm having trouble calculating diameter and distance exactly for an c. Is this required?

    ReplyDelete
  4. Also, where are the slides from the probability review posted?

    ReplyDelete
  5. @robert I guess you don't need to calculate the exact diameter, but if you don't, you do need to prove that it's at least 3 (or c).

    And, sorry, I didn't get the slides from the probability refresher posted yet. That'll likely happen either tomorrow or Monday.

    ReplyDelete
  6. Thanks, I'm more worried about the average distance though, because calculating it involves a summation of many values. (maybe (number of nodes)^2?) Without this I don't know if the diameter is 3 times the average or less.

    ReplyDelete
  7. @Robert: Your graph should be such that you can easily compute the average distance, or at-least an upper bound to it.

    ReplyDelete
  8. For the 4b: Where do we get data on which scientist who have collaborated?

    ReplyDelete
  9. For 4(b) there are several resources you can use - the simplest is just to look at the list of publications on the authors' websites. You can use Google Scholar or Microsoft Academic Search to search for publications. If you come across any such tools on the Web, you are welcome to use it, and say so on your submission.

    ReplyDelete
  10. Q: "in the hwdata1.txt given, in the edgedef>node1,node2 list, there is 0
    listed where as there is no node '0' which is in the node list. So, I
    am confused as to what this actually means. Can you please throw light
    upon this?"
    A: Please use the new "hw1data_updated.txt" data file from the course website (it was updated yesterday, but we forgot to add a comment here about it - sorry for the inconvenience).

    ReplyDelete
  11. Q: 3(a) "The procedure described gives us a file with names of friends. Is there a way to get it in the node number format as the sample text file given? So that it will be easier to analyze it."
    A: The sample text file is actually an anonymized version of the actual file generated by one of your TAs. No, the tool will not directly give you the node number format - if you think it could save a lot of time, you could write your own script that does that.

    ReplyDelete
  12. For 4a, can we list category pages (such as http://en.wikipedia.org/wiki/Category:California_Institute_of_Technology_faculty) as one of the links as well?

    ReplyDelete
  13. I'm confused about the definition of diameter. Is it the longest path between two nodes? Or is it the longest "shortest distance between two nodes" (directly substituting the definition of distance into the given definition of diameter)?

    Also, for the Facebook problem, how do we find the overall number of triangles for the entire graph? It's not equal to the number of triangles centered at the given person's node...is it?

    ReplyDelete
  14. I am having trouble getting the friend network app to work... It seems to get to the end of its calculation and then it just hangs indefinitely. Is anyone else having problems?

    ReplyDelete
  15. Sorry, it isn't hanging indefinitely - It seems to complete, but the results are cut off at the bottom because the app's canvas is too small...

    ReplyDelete
  16. Nevermind, I managed to access the part I couldn't see with firebug...

    ReplyDelete
  17. Q: I'm confused about the definition of diameter. Is it the longest path between two nodes? Or is it the longest "shortest distance between two nodes" (directly substituting the definition of distance into the given definition of diameter)?

    A: The diameter is the longest "shortest distance" between two nodes.

    Q: Also, for the Facebook problem, how do we find the overall number of triangles for the entire graph? It's not equal to the number of triangles centered at the given person's node...is it?

    A: No. You can also have triangles not involving the given person. For example, the given person might be friends with A, B, and C, where A, B, and C are also friends with one another.

    ReplyDelete
  18. Q:For 4a, can we list category pages (such as http://en.wikipedia.org/wiki/Category:California_Institute_of_Technology_faculty) as one of the links as well?

    A. Sure

    ReplyDelete
  19. Q: Also, for the Facebook problem, how do we find the overall number of triangles for the entire graph? It's not equal to the number of triangles centered at the given person's node...is it?

    A: Adding on to JK's comment, you don't need to find the overall number of triangles for the entire graph. You are only asked to find that in the ego network graph of one person.

    ReplyDelete
  20. A: Adding on to JK's comment, you don't need to find the overall number of triangles for the entire graph. You are only asked to find that in the ego network graph of one person.

    But that's Cl_i(G). We need to find Cl(G), too, right?And for that we need 3x the number of triangles in the overall graph..

    ReplyDelete
  21. If you read the question carefully, you'll notice that it asks you to find Cl_i(G) and Cl(G), where G is the ego network of user i - G is NOT the entire Facebook friends graph. To find Cl(G) for the entire graph, you'll need a lot more data that what is provided to you in the GUESS file.

    ReplyDelete
  22. I guess I didn't understand what you meant by "entire graph" - if by that you mean the ego network graph of user i, then you are right, you do need to find the number of triangles in the "entire graph", which can include triangles other than that involving user i - the GUESS file has enough data for that.

    ReplyDelete
  23. I don't understand how we can process so much data without writing a program for it, and it's a 5 point problem.

    ReplyDelete
  24. Q: I don't understand how we can process so much data without writing a program for it, and it's a 5 point problem.

    A: A quick program is certainly one easy way to solve it. But, you can also think a little bit more about the definitions and make things easy on yourself...

    ReplyDelete
  25. Okay, I take back my answer above. The easiest way is to write a quick script to process the file.

    ReplyDelete
  26. I'm having trouble finding any list of co-authors of Schrodingers. When I search google scholar, the first several pages return papers where he was the sole author.
    Schrodinger's wikipedia page refers to this web site, which has a complete list of his publications: http://www.zbp.univie.ac.at/schrodinger/ebibliographie/publications.htm
    but again, co-authors are not listed.
    Any tips here?

    ReplyDelete
  27. Now I'm confused again about 4a. So it is possible to do it in some efficient way without writing a program? We basically have two numbers to work with, then.
    Or is it not possible at all?

    ReplyDelete
  28. @Anonymous: No, I'm not aware of a way to compute Cl(G) without some kind of small script/program to process the GUESS file. But this should be straightforward, and not hard.
    @Robert: I tried Googling for some of the publications of Schrodinger from the website which has the list of his papers, and some results list co-authors. This is one way to go, but there might be easier ways to go about.

    ReplyDelete
  29. This comment has been removed by the author.

    ReplyDelete
  30. For the Amazon part of problem 4, I ended up on the Kindle Fire page, where Amazon decided to omit the "Customers Who Bought/Viewed This Item Also Bought/Viewed" section, in favor of a "Kindle Fire Accessories" page. Can I use that instead or should the page be viewed as a dead end?

    ReplyDelete
  31. Do we have to include our script for 3a? My concern is more of...will we be graded on the quality of the script (do we have to comment things)?

    ReplyDelete
  32. Sure, just for the Kindle pages, feel free to use the Accessories list if you want to.

    ReplyDelete
  33. Q: Do we have to include our script for 3a? My concern is more of...will we be graded on the quality of the script (do we have to comment things)?

    A: Yes, if you use a script, you should submit it. We will not be grading it on its "prettiness" though, just it's correctness.

    ReplyDelete
  34. Reminder of guidelines for submitting your homework:
    (1) Note on your homework specifically which problems were a collaborative effort and with whom.
    (2) For hard copy submissions, at the top of your homework, note down the date and (approximate) time at which you turn in your homework so that we can keep track of the tokens you use.
    (3) Late days that fall on weekends and holidays also cost you tokens! So if you don't have access to the submission box on weekends and holidays, to avoid losing more tokens than necessary, please scan your homework and email it to the Guru. You can turn in the hard copy on the next business day (add an appropriate note at the top of your homework).

    ReplyDelete
  35. (4) Also, we highly recommend that you take a photocopy of your homework (if submitting a hard copy) before submitting it.

    ReplyDelete
  36. Students, those of you who have submitted your homework (either a hard copy or through email) should get an ACK by email within 24 hours of your submission. If you don't, please contact us immediately.

    ReplyDelete