Aloha from Hawaii

January 17, 2011 4:35 pm

We just got back from a vacation in Maui, Hawaii, a 5-hour flight from San Francisco. We had a great time, especially since Hawaii felt a lot like Kerala, our home state in India. There were many similarities: the warm weather, the rain, the landscape, and the vegetation. They grow pineapples, coconuts, bananas, and so on. Taro (known as Chembu in Kerala) is also a staple food there. They have a saying that anyone who claims to like taro is either a Hawaiian or a liar.

Black-sand beach in Hana.

Hump-back Whale

Food at the Luau

Clouds over Haleakala volcano

OpenCV Performance and Threads

October 3, 2010 11:22 pm

If you use the OpenCV library, be aware that it spawns threads for image processing. I found this while investigating a performance issue in a web application I was working on. It turns out the default number of threads is equal to the number of CPU cores. On my dual quad-core box, it was spawning 8 threads per web server process, resulting in poor throughput while serving concurrent requests. This default behavior of OpenCV is probably targeted towards desktop applications, where it makes sense to use all the available CPU cores. The performance problem arose because even at under 5 requests per second there were 40 threads, all competing for the CPU, so the cost of context switching was significant. In any case, creating threads on the fly per request is not a good idea for a server-side application, and it’s not going to scale for high-traffic systems.

Explicitly setting the number of threads to 1 improved the throughput and latency of my application several-fold. Not bad for a one-line code change. Have a look at cv::setNumThreads() if you are using the C++ library and cvSetNumThreads() if you are using the Python wrapper.
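
As a rough sketch of the arithmetic and the one-line fix: the snippet below assumes the current OpenCV Python bindings (the cv2 module, which exposes cv2.setNumThreads() and is newer than the wrapper mentioned above). The helper names are made up for illustration, and the cv2 import is optional so the thread arithmetic runs either way.

```python
# Sketch only: helper names are illustrative, and cv2 is treated as optional.
try:
    import cv2  # current OpenCV Python bindings, assumed to be installed
except ImportError:
    cv2 = None


def total_threads(concurrent_requests, threads_per_request):
    """Number of threads competing for the CPU when every request fans out."""
    return concurrent_requests * threads_per_request


def limit_opencv_threads(num_threads=1):
    """Cap OpenCV's internal threading; returns False if cv2 is unavailable."""
    if cv2 is None:
        return False
    cv2.setNumThreads(num_threads)
    return True


# 5 concurrent requests on an 8-core box with the default setting.
print(total_threads(5, 8))  # 40
# The same load with OpenCV limited to one thread per request.
print(total_threads(5, 1))  # 5

# Call once when the worker process starts, before any image processing.
limit_opencv_threads(1)
```

Setting the limit once at process start-up, rather than per request, keeps the thread count predictable under load.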

Yahoo! Buzz Topic Pages Are Live!

July 29, 2010 1:37 am

Topic pages are live on the Yahoo! Buzz U.S. site. This is the project I’ve been working on for the last few months. The idea is to algorithmically generate topic pages for the buzzing topics of the day. The popular topics can be accessed from the “top topics” nav bar on the Buzz site. For a sample, click here to see the topic page about Chelsea Clinton. A screenshot:

Yahoo! Buzz Topics

Fun fact: We used these programming languages to build the back-end systems which power the site: Perl, PHP, Python, Java. Not to mention different storage and indexing systems, databases, servers, in-house and external frameworks, libraries etc.

This is the hard work of a lot of smart and dedicated people whom I have the privilege to work with. Please check it out and let me know what you think. Do you find it useful? Did you run into any bugs? Send me an email. Keep watching this space for updates. There is a lot more to come, I promise.

2 + 2 = 4

May 8, 2010 12:34 am

I’m nearing the end of a two-week vacation in India. The long flight and the free time gave me the opportunity to read a few books. A couple of them are worth mentioning:

High Fidelity is a movie I enjoyed thoroughly. I watched it several years ago and finally read the book last week. The movie closely follows the book, with some changes; for instance, the movie’s story takes place in Chicago rather than London. Like the movie, the book is humorous and authentic. It’s not often you read a book and laugh out loud on each page.



Then I read Orwell’s Nineteen Eighty-Four. The story is futuristic and takes place in 1984, when the world is divided mostly into 3 superpowers that are in a permanent state of war. The protagonist lives in a country called Oceania, which consists of the Americas, the British Isles and Australia. The government is totalitarian and controls every single aspect of its citizens’ lives. Even thinking unorthodox thoughts is punishable by torture and death. The government is working on a subset of the English language called Newspeak to make it impossible for people to think unorthodox thoughts. While reading it, I was reminded of the East German Stasi. My favorite quote from the book:

Freedom is the freedom to say that two plus two make four. If that is granted, all else follows.

It takes amazing foresight to write such a book in 1949. It’s bitingly sarcastic and haunting. Go read it if you haven’t already!

Biking to Work

July 28, 2008 9:30 pm

I recently bought a bike and started riding it to work. It usually takes me around 15 minutes to drive to work and just 20 minutes to ride the bike. I guess biking to work is a convenient way to get in shape without spending too much time.

Cycling Commute Map

gbookmark2delicious 2.1 is Out

July 5, 2008 4:41 am

A new version of gbookmark2delicious is out. All the credit goes to Yang Zhang who implemented the new features. Some of them include:

  • Incremental synchronization capability for continuous mirroring of Google Bookmarks onto delicious.
  • Updates to work with current Google Bookmarks and delicious interfaces/formats.
  • Handle throttling and persistent retries for delicious’ REST API.
  • More flexibility via beefed-up CLI frontend (more options, etc.)
  • Local cache of the remotely pulled data.
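
Throttling with persistent retries usually comes down to a loop like the one below. This is a generic sketch, not code from gbookmark2delicious: the Throttled exception, the request callable, and the delay values are all hypothetical stand-ins.

```python
import time


class Throttled(Exception):
    """Hypothetical signal that the remote API asked the client to slow down."""


def call_with_retries(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request() and retry on Throttled, doubling the wait each time."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except Throttled:
            if attempt == max_attempts:
                raise  # give up after the last allowed attempt
            sleep(delay)
            delay *= 2


# Demo with a fake request that is throttled twice before succeeding.
attempts = []
recorded_waits = []


def fake_request():
    attempts.append(1)
    if len(attempts) < 3:
        raise Throttled()
    return "saved"


result = call_with_retries(fake_request, sleep=recorded_waits.append)
print(result, recorded_waits)
```

Passing the sleep function in as a parameter keeps the demo instant; real use would leave it at time.sleep.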

YUI Compressor

May 31, 2008 7:41 pm

If your web application is JavaScript-intensive (these days most apps are), it is a good idea to minify the JavaScript and CSS code. A minification tool reduces the byte footprint of the code without changing its semantics. YUI Compressor is the best JavaScript minifier out there.

The product I work on at Yahoo! is pretty heavy on JavaScript, and YUI Compressor shrinks it considerably (by about 35% for the file shown below), so the savings are significant.

$ du -hs foo*js
140K foo.js
216K foo_src.js

As you may have noticed, I keep the original source in CVS with a file name like foo_src.js. During the build, YUI Compressor is run on foo_src.js to generate foo.js. The application includes the compressed version, foo.js.
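
A build step along these lines could be sketched as follows. The jar path is a placeholder and the helper names are made up for illustration; the only YUI Compressor option assumed is -o for the output file. The command is built but not executed here.

```python
from pathlib import Path

# Placeholder: point this at wherever the YUI Compressor jar actually lives.
YUI_JAR = "yuicompressor.jar"


def minified_name(src):
    """Map foo_src.js to foo.js, following the naming convention above."""
    path = Path(src)
    if not path.stem.endswith("_src"):
        raise ValueError(f"expected a *_src file, got {src}")
    return str(path.with_name(path.stem[: -len("_src")] + path.suffix))


def build_command(src, jar=YUI_JAR):
    """Return the argument list for one YUI Compressor run."""
    return ["java", "-jar", jar, src, "-o", minified_name(src)]


print(" ".join(build_command("foo_src.js")))
# java -jar yuicompressor.jar foo_src.js -o foo.js
```

A real build script would hand each command to subprocess.run() and fail the build if the compressor exits with an error.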

Snow in Atlanta?

January 16, 2008 9:25 pm

My friend Rohit, who lives in Atlanta, sent me this. When I lived in the South, all I ever saw was snowflakes falling for 5 minutes, just once a year.

Snow in Atlanta

Yahoo! India Maps Now With Driving Directions

October 25, 2007 8:29 pm

One of the great things about the U.S. is the standardized address system, which makes it so easy to locate any place. Almost every street has a name and each building has a number. If you want to get from Washington, D.C. to Miami Beach, it’s pretty straightforward.

Unfortunately, there is no such thing as address standardization in India, which makes it very hard to find directions. The Yahoo! India Maps team has just released driving directions for Indian roads. An excerpt from an internal mail:

In India, we drive a little differently. We drive on the left. Or the right, if the left is occupied. We also give directions a little differently. “Third right after the big banyan tree on your left” might leave many bemused, but is guaranteed to lead the lost, out here. Starting today, you can search for driving directions, Indian style anywhere in India on the Yahoo! India Maps site.

We’ll tell you directions with landmarks to watch out for while taking turns. We’ll also tell you how many turns come before the one you have to make. So, go and take this feature out for a test drive and tell us what you think. If you, like us, spend a good portion of our lives arguing with auto rickshaw drivers on the correct fare, then you’ll be pleased to see the auto fare calculation along with the driving directions.

Enough said. Time for you to take a serious look at making this trip. 115 steps on the map, but a journey of a lifetime.

Some slick things that you may notice:

  • Directions for all of India (well, wherever we have data) and wherever US Maps works.
  • Landmarks and Nth left based directions. This is how you go from the Yahoo Office in MG road, Bangalore to the one at EGL.
  • Auto rickshaw fares. (A three-wheeler cab that we use to commute.)
  • Directions in SMS/Text. (You’ve heard that a lot of people in India have mobiles, right?)
  • One search box and hence multi way points. (such as this one)
  • A lot of bugs.

My First Month With Yahoo!

October 22, 2007 1:33 am

Last month I accepted an offer from Yahoo! and moved to Silicon Valley. I had been living in Atlanta for the last 3 years and had made quite a few good friends there. So leaving was harder than I thought it would be, but I was really excited about the new job. This was an opportunity to join a company that is synonymous with the web and to work with some really smart people.

Yahoo! Logo

Anyway it has been almost a month since I started with Yahoo! and it has been a very interesting experience so far. I’ve never seen a company which runs completely on open-source software. Most of the development work here happens in C++, Perl and PHP on Linux and FreeBSD. Yahoo! contributes back to the community heavily and employs quite a few open-source hackers. (Rasmus Lerdorf from PHP, Jeremy Zawodny from MySQL, Doug Cutting from Lucene/Nutch/Hadoop, just to name a few.)

I was doing Java development before, so I was used to coding on Windows with Eclipse. Initially I thought of doing development on my Windows laptop using some IDE, but I later decided to work in Linux using Emacs. The project I work on is deployed on FreeBSD, so I develop on Linux and deploy to a FreeBSD virtual machine hosted on my Linux workstation using VMware. It’s a pretty cool setup and I’ve become a big fan of virtualization at the OS level.

Yahoo! Sunnyvale Office

The Yahoo! office in Sunnyvale, California, where I work. There are 7 buildings on the Sunnyvale campus. There are two more campuses in nearby Santa Clara and many satellite offices across the U.S. and around the world.

I’m trying to familiarize myself with the code base and understand the architecture of the whole system. It’s complex and the scale of deployment is challenging. I believe in learning things by doing hands-on work, so I’m working on a bunch of new features. The work environment is casual and the people here are very easy-going and helpful. I’m having a lot of fun.