My company has a buffet every Friday, and the lines grow to epic proportions when the food arrives. I've suspected for years that the “classic” buffet line system is a deeply flawed and inefficient method, and every time I'm stuck in line I become more convinced.
I started writing this blog in late 2012, partly because I felt like it would help me improve my English and my writing skills, partly because I kept having a lot of random ideas in my head and I wanted to write them down somewhere.
Turns out having a toddler isn't super compatible with reading. I used to read ~100 books/year as a teenager, but it has slowly deteriorated to maybe 20-30 books, at most. And I don't even finish all of them because life is too short!
Just for fun, I generated these graphs of the number of letters in the word for each number. I really spent about 10 minutes on this (ok…possibly also another 40 minutes tweaking the plots):
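The counting step behind graphs like these could look roughly like the sketch below. It assumes English number words and only covers 0 to 99; `number_to_words` and `letter_count` are made-up helper names for illustration, not the code used for the plots.

```python
# Hypothetical helpers: spell out 0..99 in English and count the letters.
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n):
    """Return the English word for an integer in the range 0..99."""
    if not 0 <= n < 100:
        raise ValueError("this sketch only handles 0..99")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else TENS[tens] + "-" + ONES[ones]


def letter_count(n):
    """Count alphabetic characters only, so the hyphen is not included."""
    return sum(ch.isalpha() for ch in number_to_words(n))


# One (number, letters) pair per number, ready to hand to a plotting library.
points = [(n, letter_count(n)) for n in range(100)]
```

Whether hyphens or spaces should count as "letters" is a judgment call; this sketch ignores them.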
Here's a dumb extremely accurate rule I'm postulating* for software engineering projects: *you need at least 3 examples before you solve the right problem*.
This is what I've noticed:
Don't factor out shared code between two classes.
I just bought Machine, Platform, Crowd: Harnessing Our Digital Future and discovered that it mentions my blog – in particular the post When machine learning matters.
Ok, I lied a little bit. I didn't discover it serendipitously.
There are about 765 million blog posts about the diversity “memo” that leaked out of Google a couple of weeks ago. I think the case for any biological difference is pretty weak, and it bothers me when people refer to an “interest gap” as anything other than caused by the environment.
Remember when everyone had a really ugly blog with a blogroll? Anyway, I just think the word is funny.
I follow a few hundred blogs using Feedly and Reeder and have read a few hundred thousand blog posts over the last 10 years.
I was reading yet another blog post titled “Why our team moved from <language X> to <language Y>” (I forgot which one) and I started wondering if you can generalize it a bit. Is it possible to generate an N × N contingency table of moving from language X to language Y?
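A minimal sketch of what building such a table might look like, assuming you already have a list of (from, to) pairs. The `moves` data below is invented purely for illustration, and `contingency_table` is a hypothetical helper name.

```python
from collections import Counter


def contingency_table(moves):
    """Build an N x N table where table[i][j] counts moves from
    languages[i] to languages[j]."""
    languages = sorted({lang for pair in moves for lang in pair})
    counts = Counter(moves)
    table = [[counts[(src, dst)] for dst in languages] for src in languages]
    return languages, table


# Made-up example data, not real measurements.
moves = [("Python", "Go"), ("Python", "Go"), ("Ruby", "Go"), ("Go", "Rust")]
languages, table = contingency_table(moves)

for src, row in zip(languages, table):
    print(f"{src:>8}", row)
```

Rows are the language people moved from and columns are the language they moved to, so the diagonal stays at zero unless someone "moves" to the same language.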
Here's a fun analysis that I did of the pitch (aka frequency) of various languages. Certain languages are simply pronounced with lower or higher pitch. Whether this is a feature of the language or more a cultural thing is a good question, but there are some substantial differences between languages.
Pareto efficiency is a useful concept I like to think about. It often comes up when you compare items on multiple dimensions. Say you want to buy a new TV. To simplify it let's assume you only care about two factors: price and quality.
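To make the TV example concrete, here is a small sketch that picks out the Pareto-efficient options, assuming lower price is better and higher quality is better. The TV names, prices and quality scores are made up, and `pareto_frontier` is just an illustrative helper.

```python
def pareto_frontier(items):
    """Return names of items not dominated by any other item.

    items: list of (name, price, quality) tuples.
    One item dominates another if it is at least as good on both
    dimensions and strictly better on at least one.
    """
    frontier = []
    for name, price, quality in items:
        dominated = any(
            p <= price and q >= quality and (p < price or q > quality)
            for _, p, q in items
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Hypothetical TVs: (name, price in dollars, quality score).
tvs = [("A", 500, 7), ("B", 700, 9), ("C", 600, 6), ("D", 900, 9)]
print(pareto_frontier(tvs))
```

Here C loses to A (cheaper and better) and D loses to B (same quality, cheaper), so only A and B are worth considering. This brute-force check is quadratic in the number of items, which is fine for a shopping list.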
Why does it suck to wait for things? In a previous post I analyzed a NYC subway dataset and found that at some point, quite early, it's worth just giving up.
This isn't a proof that the subway doesn't run on time – in fact it might actually prove that the subway runs really well.
I've been trying to learn Clojure. I keep telling people I meet that I really want to learn Clojure, but still every night I can't get myself to spend time with it. It's unclear if I really want to learn Clojure or just want to have learned Clojure?
One of my favorite business hobbies is to reduce some nasty decision down to its absolute core objective, decide the most basic strategy, and then add more and more modifications as you have to confront the complexity of reality (yes I have very lame hobbies thanks I know).
Apparently MTA (the company running the NYC subway) has a real-time API. My fascination for the subway takes autistic proportions and so obviously I had to analyze some of the data. The documentation is somewhat terrible, but here's some relevant code for how to use the API:
(This is not a very relevant/useful post for regular readers – feel free to skip. I thought I would share it so people can find it on Google.)
My blog blew up twice in a week earlier this year when I landed on Hacker News.
My blog post about fonts generated lots of traffic – it landed on Hacker News, took down my site while I was sleeping, and then obviously vanished from HN before I woke up. But it also got retweeted by a ton of people.
For some reason I decided one night I wanted to get a bunch of fonts. A lot of them. An hour later I had a bunch of scrapy scripts pulling down fonts and a few days later I had more than 50k fonts on my computer.
(Warning: super speculative, feel free to ignore)
As Yogi Berra said, “It's tough to make predictions, especially about the future”. Unfortunately predicting is hard, and unsurprisingly people look for the Magic Trick™ that can resolve all the uncertainty.
I was playing around with D3 last night and built a silly visualization of antipodes and how our intuitive understanding of the world sometimes doesn't make sense. Check out the visualization at bl.ocks.org!
Basically the idea is if you fly from Beijing to Buenos Aires then you can have a layover at any point of the Earth's surface and it won't make the trip longer.
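You can check the claim numerically with a rough sketch like the one below. It assumes a perfectly spherical Earth and uses the exact antipode of an approximate Beijing coordinate rather than the real location of Buenos Aires (which is only roughly antipodal). The helper names and the layover coordinates are arbitrary picks for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; the spherical Earth is an assumption


def antipode(lat, lon):
    """Return the point diametrically opposite (lat, lon), in degrees."""
    return -lat, (lon + 180.0 if lon <= 0 else lon - 180.0)


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


start = (39.9, 116.4)      # roughly Beijing
end = antipode(*start)     # exact antipode, somewhere in Argentina
direct = great_circle_km(start, end)

for layover in [(0.0, 0.0), (59.3, 18.1), (-33.9, 151.2)]:
    total = great_circle_km(start, layover) + great_circle_km(layover, end)
    print(layover, round(total - direct, 3))
```

Every layover gives the same total, half the Earth's circumference, because the distance from any point to A plus the distance from that point to A's antipode always adds up to half a great circle.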
I have spent some time lately with D3. It's a lot of fun to build interactive graphs. See for instance this demo (will provide a longer writeup soon).
D3 doesn't have support for 3D but you can do projections into 2D pretty easily.
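The math behind a projection like that is short. This is a generic sketch in Python rather than D3 code: it rotates a 3D point by two angles and then drops the depth coordinate (an orthographic projection). The function name and the `yaw`/`pitch` parameters are my own choices for illustration, not anything from D3's API.

```python
import math


def project_orthographic(point, yaw=0.0, pitch=0.0):
    """Rotate a 3D point and flatten it to 2D by discarding depth.

    yaw rotates around the vertical (y) axis, pitch around the
    horizontal (x) axis; both are in radians.
    """
    x, y, z = point
    # Rotate around the y axis.
    x, z = (x * math.cos(yaw) + z * math.sin(yaw),
            -x * math.sin(yaw) + z * math.cos(yaw))
    # Rotate around the x axis.
    y, z = (y * math.cos(pitch) - z * math.sin(pitch),
            y * math.sin(pitch) + z * math.cos(pitch))
    # Orthographic projection: keep x and y, ignore z.
    return x, y


# Corners of a unit cube, viewed from a slight angle.
cube = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
flat = [project_orthographic(p, yaw=0.5, pitch=0.3) for p in cube]
```

The resulting 2D points can be fed to any 2D drawing layer. Changing `yaw` on every frame gives the usual spinning-object effect.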
Saw this link on Hacker News the other day: The Highway Lane Next to Yours Isn’t Really Moving Any Faster
The article describes a phenomenon unique to traffic where cars spread out when they go fast and get more compact when they go slow.
I just pinged a few million random IP addresses from my apartment in NYC. Here's the result:
What's going on with Sweden? Too much torrenting? Ireland is likewise super slow, but not Northern Ireland. Eastern Ukraine is also super slow, maybe not surprising given current events.
Wow I guess it was more than a year ago that I tweeted this. Crazy how time flies by. Anyway, here's my rationale:
When I update one line of code I feel like I have to put in a long explanation about its side effects, why it's fully backwards compatible, and why it fixes some issue #xyz.