Archive for the ‘sticky’ Category
Dirty Driving
Friday, August 8th, 2008
Miller in Repo Man says something about driving that I have come to absolutely agree with. There is only one sane thing you can do while driving: run your own talk show! So while there’s no list of what you can do on a long commute/drive because of limits that reality places on the possible, here’s a list of some hard-learned ways you can set the driving part right:
- Lane changing in front of a vehicle: Pull out a bit in front of a vehicle before moving into their lane. You may be visible to them, but your turn indicators at the rear may be in their frontal blind spots (and your turn indicators at the front are not visible to them anyway).
- Right on Red (only after stop!) is a choice, not an obligation. Don’t risk it.
- Speed limits are (surprise, surprise) doable. On multi-lane highways, at least, you can stay at the limit tucked into the right lane. Setting cruise control at the limit gives you (i) fuel efficiency, (ii) constant speed, and (iii) no drifting over the limit. Going over the limit exposes you to risks that were weighed in coming up with that number (assuming it’s reasonable, and not a leftover from a repair job that finished a month ago), and to curbside conversations with conscientious cops. There are reasons for every limit: construction, a residential area close to the road, stray animals, blind driveways, accident history; most of them are not visible. Most motorists would slow down by themselves if they knew about the risk. Besides, speeding doesn’t save time unless you’re hurrying to an audience with a bloodthirsty martinet dictator, or you’re on a cross-country trip, in which case you should be focusing on getting there, not getting there early, lol.
- Honking at slow motorists, or even being mildly irritated or even bemused at slow motorists in the right lane (in right lane driving regions) is inappropriate, especially when there are multiple lanes. You can choose to pass; if you’re too chicken to pass, there’s a reason to honk at yourself.
- Turn on wipers and headlights in the rain — the first to see and the second to be seen.
- Park back-in in perpendicular parking spots. It saves a lot of aggravation while pulling out; and is a lot easier to achieve than it appears. Try it out once and you’ll never park front-in.
- Two-way left turn lanes (TWLTL; in right lane driving regions) are nice for making a left from a highway. But getting onto a highway by joining in the TWLTL is a risky maneuver if there is any approaching traffic at all. The approaching traffic has to move into the TWLTL as late as possible, and you have to make sure not to be there when it happens.
- Tailgaters: I’ve been thinking of making a tiny radar device that tells you if you’re tailgating by computing your speed relative to the vehicle you’re following and your distance from it. There’s a more sophisticated way to warn tailgaters: a bumper sticker I saw on a Humvee that says “If you can read this, I could slam on my brakes and then sue you!”
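The arithmetic such a tailgating gadget would need is small. Here is a rough sketch in Python, assuming the radar already hands you the gap and both speeds. The function name, the units, and the 2-second headway threshold (a common rule of thumb, not something from a spec) are all my assumptions.

```python
import math

def following_report(own_speed, lead_speed, gap, min_headway=2.0):
    """Summarise how closely we follow the vehicle ahead.

    own_speed, lead_speed: metres per second; gap: metres.
    min_headway: assumed threshold in seconds (the usual "2-second rule").
    """
    # Time it would take us to cover the current gap at our own speed.
    headway = gap / own_speed if own_speed > 0 else math.inf
    # Positive closing speed means we are gaining on the vehicle ahead.
    closing = own_speed - lead_speed
    # Time until contact if neither driver changes speed.
    time_to_contact = gap / closing if closing > 0 else math.inf
    return {
        "headway_s": headway,
        "closing_mps": closing,
        "time_to_contact_s": time_to_contact,
        "tailgating": headway < min_headway,
    }

# Hypothetical reading: we do 30 m/s, the car ahead does 27 m/s, 45 m away.
report = following_report(own_speed=30.0, lead_speed=27.0, gap=45.0)
print(report)
```

Headway uses only your own speed and the gap. The relative speed matters for how quickly the gap is shrinking, which is why the sketch reports both.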
Sheela Talkies
Sunday, June 22nd, 2008
Movie reminders
CineDelica
The title refers to the famous Rohtak cinema hall Sheela Talkies
The Zen of Cluster Counting
Sunday, May 25th, 2008
Problem: I’m using the k-means (or insert-clustering-gizmo-here) algorithm. How many clusters should I partition my data into?
Solution: Consider the scale of the clustering, which is naively the zoom setting at which the data is plotted. At a wide enough zoom, all the data is one cluster; at telephoto, each point is a cluster. The scale is chosen at the outset of the problem, and is determined not by the clustering algorithm but by what the clustering is meant to achieve.
Let’s characterize the correct number of clusters at a given scale.
- A big Tibshirani Gap: Across-group variance of n-grouped similarly-distributed random data is much higher than the across-group variance of n-grouped given data.
- How big is big? Try various values of n and pick the one with the largest gap.
- How do you generate similarly distributed random data in high dimensions?
- Low across-iteration variance in variance: If the number of clusters hits the sweet spot, the grouping will be stable across iterations; i.e. there is a global minimum of the variance, and it is attained again and again. For n-clustered data, the variance measure at each iteration will be stable.
- What’s the variance for multi-dimensional data? For a basic implementation: the trace of the covariance matrix.
- How many iterations? Thousands of them.
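A minimal sketch of the gap idea in plain Python follows; it is an illustration, not a tuned implementation. Three things here are my assumptions. First, “variance” is taken to be the pooled within-cluster sum of squares, which per cluster is the cluster size times the trace of its covariance matrix. Second, the “similarly distributed” random data is drawn uniformly from the bounding box of the data. Third, k-means is seeded with a farthest-point rule. The helper names and the toy data are hypothetical.

```python
import math
import random

def sqdist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def dispersion(points, labels, k):
    """Pooled within-cluster sum of squared distances to the cluster means."""
    total = 0.0
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        if members:
            mean = [sum(col) / len(members) for col in zip(*members)]
            total += sum(sqdist(p, mean) for p in members)
    return total

def kmeans(points, k, rng, max_iters=50):
    # Farthest-point seeding: random first centre, then the point farthest
    # from all centres chosen so far (an assumed, simple initialisation).
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sqdist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iters):
        new_labels = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def best_dispersion(points, k, rng, restarts=2):
    return min(dispersion(points, kmeans(points, k, rng), k) for _ in range(restarts))

def gap(points, k, rng, n_ref=5):
    """Mean log-dispersion of uniform reference data minus that of the data."""
    lo = [min(col) for col in zip(*points)]
    hi = [max(col) for col in zip(*points)]
    ref_logs = []
    for _ in range(n_ref):
        ref = [[rng.uniform(a, b) for a, b in zip(lo, hi)] for _ in points]
        ref_logs.append(math.log(best_dispersion(ref, k, rng)))
    return sum(ref_logs) / n_ref - math.log(best_dispersion(points, k, rng))

# Toy data: three well-separated 2-D blobs of 30 points each.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
data = [[rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
        for cx, cy in blob_centers for _ in range(30)]

gaps = {k: gap(data, k, rng) for k in range(1, 6)}
best_k = max(gaps, key=gaps.get)
print(gaps, best_k)
```

The bounding-box reference is the simplest answer to the high-dimensional question above, and it gets cruder as the dimension grows. Picking the largest gap follows the recipe in the list; the published gap statistic uses a more conservative rule with a standard-error correction.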
Pedagogy: Best Practices
Tuesday, April 8th, 2008
- Choose clear and spare expression, without repetition, and with pauses that let concepts sink into the students’ minds. This (besides mastery of the subject matter) lends the understated feel to lectures of masters like Naveen Garg and Uri Eden.
- Use slides only as exhibits to illustrate a point, not as the central tool for the lecture. Do not write more than two lines of text on a slide — it is supposed to anchor interest, not absorb all attention — and more text makes people read the words and gloss over the meaning.
- Ask questions about the basic concepts being used, not so much to test as to get students talking and linking the new material to what they already know. Making students answer even trivial questions that come up in the regular business of the lecture revives their interest.
- Prepare a clear lesson plan for every lecture. Going in with only understanding — however thorough — of the material but no lesson plan is a recipe for a shameful disaster.
- Prepare a clear plan for using the board (or learn to use it properly). Scribbling around, erasing useful equations, writing in gaps, overwriting, all drain away students’ attention. Besides the plan, practice using the board offline till you can use it coherently in class.
- Don’t gainsay yourself, or get ahead of yourself in lecturing. Saying “Oh, I said that, but I meant that only for sufficiently large n” is a good way to make students lose track of everything. A good way to avoid this is to have a clear lesson plan. Student attention is like a river, not like hypertext; it needs coherent flow.
- Avoid silly jokes (they have a talent for sounding funny when they creep in, but that’s just because you’ve been talking for more than an hour; everybody else will hate them, and you). Puns, visual jokes, and other word play are an annoyance to a mature audience, and a distraction for everybody including you, the lecturer.
- Leave personal baggage outside the class. The connection that you are facilitating is between students and the subject; not the connection between you and the students, or you and the subject.
- Do not use adjectives to describe the class, and never ever use a negative adjective for the lecture/subject or any part of it; and show your enthusiasm about the subject by your excellence in teaching it instead.
- Explain the intent of the lecture: is it to convey a thorough understanding, or is it to give a flavor? Leaving this out makes students feel out of place or dumb, when they assume they’re understanding less than they are supposed to be able to.
- Motivate every idea, and keep the motivation visible at all times. To focus the lecture, it’s helpful to remind students of the motivating idea by pointing it out on the board.
- Do not present a solution without specifying the problem. For example, “We’re going to do additive models” is not a valid beginning, and it doesn’t help if the next thing you discuss is how backfitting, EM or your favorite algorithm does a good job of computing the parameters. Try instead: “We got some data from ____, and it looks like this. To figure out why this could be the case, we model it as a ____. This is a tough problem, so we make simplifying assumptions ______ which lead us to additive models which we’ll talk about today.”
- Avoid references to new material that’s no longer on the board. Either keep it on the board, or point it out in a book, or don’t mention it. It’s not good enough to say “On the last slide we saw…” or “This Q is the cost function for the formula we just derived.” If they’re writing it all down, this might work, but it encourages students to tune out.
- Do not drop references to material you’re not discussing. It makes the referred material sound tougher and more daunting than it actually is.
R from 0 to “What seems to be the problem, Officer?”
Tuesday, April 8th, 2008
Links for learning R, a clone of the statistical processing language S from Bell Labs, ordered roughly along the abscissa of the learning curve.
- R Tutorial: Using R A quick introduction
- Statistical Computing with R: A tutorial An introduction with graphics — a big draw for R
- R Course A course, self-contained — maybe it should be at the top and be the only link here.
- R Tutorial A course-like tutorial to the basics of R
LaTeX/Python Weekly Planner
Monday, April 7th, 2008
Problem: The LaTeX planner is cool, but how do I make one for me without much hassle?
Briefly: Separate out the configurable part (photo, schedule etc.), and generate the remaining calendar file automagically (using a programming language instead of bash this time around).
Solution:
The pdf calendar: newcal.pdf
The tex source with dates (generated from lc.py script weekly): newcal.tex
The tex source with graphics and schedule (edited manually): personalizations.tex
The python source that generated the tex code for newcal.tex: lc.py
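For a flavor of the generated part, here is a small sketch of how a script could emit the dated rows for one week as LaTeX. This is not the contents of lc.py; the function names, the two-column tabular layout, and the ISO date format are assumptions made for illustration.

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def week_rows(any_day):
    """Return LaTeX tabular rows, Monday to Sunday, for the week containing any_day."""
    # date.weekday() is 0 for Monday, so this steps back to the week's Monday.
    monday = any_day - timedelta(days=any_day.weekday())
    rows = []
    for offset in range(7):
        day = monday + timedelta(days=offset)
        # Each row ends with the LaTeX row terminator (two backslashes).
        rows.append(f"{DAY_NAMES[offset]} & {day.isoformat()} \\\\")
    return rows

def week_table(any_day):
    """Wrap the rows in a minimal two-column tabular environment."""
    body = "\n".join(week_rows(any_day))
    return "\\begin{tabular}{ll}\n" + body + "\n\\end{tabular}\n"

# Hypothetical usage: the week containing 9 April 2008.
tex = week_table(date(2008, 4, 9))
print(tex)
```

Keeping the dated rows in a generated file and the photo and schedule in a hand-edited one means the weekly regeneration never touches the manual edits.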
C++ Quickguides
Thursday, March 20th, 2008
Python Resources for Scientists, Engineers, and Statisticians
Tuesday, March 18th, 2008
Ordered roughly by relevance for getting to know the kool tool:
- Mathematics in Python Pep talk about why (and how much) python is neat.
- Python Programming Learning to do science in Python in less than a day
- Python for scientific use. Part I: Data Visualization Introduction to using gplt with Python
- Python for scientific use, Part II: Data analysis Some more usage notes on gplt/Python
- SciPy and NumPy Main packages for science and numerical methods, to be used and contributed to.
- Package Scientific Some indispensable classes and utilities
- PLEAC Python “In this document, you’ll find an implementation of the Solutions of the Perl Cookbook in the Python language.” — Mighty useful.
- Dive into Python “Python from novice to pro” rapidly, but geared for programmers and painfully slow for scientific programming.
- Python Tutorial The basic tutorial.
Organized Sources of Knowledge Online
Tuesday, February 12th, 2008
- Julius O. Smith, Music Signal Processing Series
- David MacKay, Information Theory, Inference, and Learning Algorithms
- Oded Goldreich, Computational Complexity: A Conceptual Perspective
- Todd Will, Introduction to the Singular Value Decomposition
- Meta: George Cain, List of online textbooks
- Meta: Jean-Marc Gulliet, Free Online Textbooks, Lecture Notes, Tutorials, and Videos on Mathematics
- Andrew Moore, Statistical Data Mining Tutorials, Slides by Andrew Moore