2009-07-31

GitHub Contest


I am working on github contest with Daniel Haran. Unlike NetflixPrize, the github contest is a Top-K resys task. I think, it is another important task in recommender system.

Let's take movie recommender system for example. When we design a movie resys, we meet two problems:
1) Given a user, we should find which movies he/she will watch. That is finding a candidate movies set.
2) In the candidate set, we should find which movies this user will like after watching.

I think, the first task is top-k recommendation task (GitHub) and the second task is prediction task (NetflixPrize).

Solving two above tasks is the fundanmental of design good recommender system.

2009-07-28

Netflix Competitors Learn the Power of Teamwork

Published: July 27, 2009

A contest set up by Netflix, which offered a $1 million prize to anyone who could significantly improve its movie recommendation system, ended on Sunday with two teams in a virtual dead heat, and no winner to be declared until September.

But the contest, which began in October 2006, has already produced an impressive legacy. It has shaped careers, spawned at least one start-up company and inspired research papers. It has also changed conventional wisdom about the best way to build the automated systems that increasingly help people make online choices about movies, books, clothing, restaurants, news and other goods and services.

These so-called recommendation engines are computing models that predict what a person might enjoy based on statistical scoring of that person’s stated preferences, past consumption patterns and similar choices made by many others — all made possible by the ease of data collection and tracking on the Web.

“The Netflix prize contest will be looked at for years by people studying how to do predictive modeling,” said Chris Volinsky, a scientist at AT&T Research and a leader of one of the two highest-ranked teams in the competition.

The biggest lesson learned, according to members of the two top teams, was the power of collaboration. It was not a single insight, algorithm or concept that allowed both teams to surpass the goal Netflix, the movie rental company, set nearly three years ago: to improve the movie recommendations made by its internal software by at least 10 percent, as measured by predicted versus actual one-through-five-star ratings by customers.

Instead, they say, the formula for success was to bring together people with complementary skills and combine different methods of problem-solving. This became increasingly apparent as the contest evolved. Mr. Volinsky’s team, BellKor’s Pragmatic Chaos, was the longtime front-runner and the first to surpass the 10 percent hurdle. It is actually a seven-person collection of other teams, and its members are statisticians, machine learning experts and computer engineers from the United States, Austria, Canada and Israel.

When BellKor’s announced last month that it had passed the 10 percent threshold, it set off a 30-day race, under contest rules, for other teams to try to best it. That led to another round of team-merging by BellKor’s leading rivals, who assembled a global consortium of about 30 members, appropriately called the Ensemble.

Submissions came fast and furious in the last few weeks from BellKor’s and the Ensemble. Just minutes before the contest deadline on Sunday, the Ensemble’s latest entry edged ahead of BellKor’s on the public Web leader board — by one-hundredth of a percentage point.

“The contest was almost a race to agglomerate as many teams as possible,” said David Weiss, a Ph.D. candidate in computer science at the University of Pennsylvania and a member of the Ensemble. “The surprise was that the collaborative approach works so well, that trying all the algorithms, coding them up and putting them together far exceeded our expectations.”

The contestants evolved, it seems, along with the contest. When the Netflix competition began, Mr. Weiss was one of three seniors at Princeton University, including David Lin and Lester Mackey, who made up a team called Dinosaur Planet. Mr. Lin, a math major, went on to become a derivatives trader on Wall Street.

But Mr. Mackey is a Ph.D. candidate at the Statistical Artificial Intelligence Lab at the University of California, Berkeley. “My interests now have been influenced by working on the Netflix prize contest,” he said.

Software recommendation systems, Mr. Mackey said, will increasingly become common tools to help people find useful information and products amid the explosion of information and offerings competing for their attention on the Web. “A lot of these techniques will propagate across the Internet,” he predicted.

That is certainly the hope of Domonkos Tikk, a Hungarian computer scientist and a member of the Ensemble. Mr. Tikk, 39, and three younger colleagues started working on the contest shortly after it began, and in 2007 they teamed up with the Princeton group. “When we entered the Netflix competition, we had no experience in collaborative filtering,” Mr. Tikk said.

Yet based on what they learned, Mr. Tikk and his colleagues founded a start-up, Gravity, which is developing recommendation systems for commercial clients, including e-commerce Web sites and a European cellphone company.

Though the Ensemble team nudged ahead of BellKor’s on the public leader board, it is not necessarily the winner. BellKor’s, according to Mr. Volinsky, remains in first place, and Netflix contacted it on Sunday to say so.

And in an online forum, another member of the BellKor’s team, Yehuda Koren, a researcher for Yahoo in Israel, said his team had “a better test score than the Ensemble,” despite what the rival team submitted for the leader board.

So is BellKor’s the winner? Certainly not yet, according to a Netflix spokesman, Steve Swasey. “There is no winner,” he said.

A winner, Mr. Swasey said, will probably not be announced until sometime in September at an event hosted by Reed Hastings, Netflix’s chief executive. The movie rental company is not holding off for maximum public relations effect, Mr. Swasey said, but because the winner has not yet been determined.

The Web leader board, he explained, is based on what the teams submit. Next, Netflix’s in-house researchers and outside experts have to validate the teams’ submissions, poring over the submitted code, design documents and other materials. “This is really complex stuff,” Mr. Swasey said.

In Hungary, Mr. Tikk did not sound optimistic. “We didn’t get any notification from Netflix,” he said in a phone interview. “So I think the chances that we won are very slight. It was a nice try.”

2009-07-27

关于下一代推荐系统的一些看法

首先祝贺我们的队伍"The Ensemble"在leaderboard上取得第一名,最终谁赢得比赛还没有正式宣布,不过我们能在最后30天做出这样的成绩还是不错的,感谢队友们,不是他们的帮助,我永远也无法知道0.855x的结果是如何做出来的。

比赛结束了,我对现有的推荐系统做了一番思考。在中国,我一向最佩服的就是douban的推荐系统,因为他们的推荐系统设计是专业的,而不是只用了简单的方法。据说他们使用了业界的一些先进算法。我用豆瓣很久了,但是他的推荐系统还有问题,当然这个问题不是豆瓣的问题,而是推荐系统中很难解决的一些问题。

1.我们知道,单纯的collaborative filtering在实际系统中是不够的,我们需要利用内容信息,但是我们在使用内容时往往是简单的用来计算相似度。比如我们有书的作者,出版社,书名,标签信息。我们往往用这些信息来比较书的相似度,然后推荐相似的书给用户。但是我在研究中发现,用户对书的不同属性的依赖是不同的,有些用户比较信赖出版社,比如我买计算机书,只买几个著名出版社的,其他出版社的书我对他的质量不信任。也有些时候看作者,比如C++,一般只买大牛的书。但是,豆瓣的推荐系统并没有学习出我的这些喜好(应该说没有完全学习出来),他们只是学习出我喜欢C++的书,但没有学习出我对作者和出版社的要求。这一方面因为我没有提供太多的喜好数据,另一方面也是因为可能并没有进行对这些特征的学习。

2.用户评论和自然语言,我们在淘宝买东西的时候,经常喜欢看以前卖家的评论来决定我们的行为。所以下一代推荐系统设计中对论坛评论需要加以利用。当然这涉及自然语言理解中的情感分析,做到完全准确是困难的,但现有的技术足以利用评论来提高推荐的精度,不管提高多少,肯定是能提高的。

3.海量数据。我们考虑网页推荐,其实也就是个性化搜索。这个问题和电影推荐不同,在电影推荐系统中,电影数总是少于用户数的,但在个性化搜索中,用户数是远远小于网页数的。在这种情况下,我觉得聚类是最有效的,我们很难学习出用户对特定网页的喜好,计算量太大。但是我们还是可以先对网页聚类,然后学习出用户的不同类型网页的态度(这个类型可以是基于内容的,也可以是基于界面的,或者基于域名,总之聚类的方法很多)。而且对于用户也很多的系统,比如google,我们也可以对用户聚类,学习出特定类的用户对特定类网页的喜好。这在设计大型系统中,可以作为一个baseline。

所以,我认为,未来推荐系统需要解决的3个问题就是
1)如何结合内容特征
2)如何理解用户的自然语言
3)如何处理海量数据

2009-07-17

Ruby

最近在和一个团队设计一个推荐系统。他们向我推荐了Ruby。这个语言我以前从QQP那儿听到过,我感觉我已经学了太多的语言(C++,Java,Python,PHP)。所以当时就没有用Ruby。不过既然这个工程需要用,昨天就买了一本书看了看。

语法很快就学会了,同时用它实现了一下SVD++模型。速度看当然是不如C++,不过和python差不多,库很多,做很多其他事情比较容易。他们向我推荐了JRuby,说这个的速度已经可以和C++媲美了,但愿如此。

我觉得现在的程序员,应该精通一门语言,熟悉2-3种语言,了解5-6种语言的语法。最近新语言太多了。

P.S. Netflix Prize还有10天就结束了,得抓紧啊,希望还是有的,嘿嘿!最近我在研究用户聚类,感觉不错!

博客受难记

这个博客一会儿能访问,一会儿不能访问,中途我也试过国产的博客,但总觉得他们的界面不适合我,太像90后小孩子的节面了,我还是喜欢简洁的,能自己控制的。所以即使这个博客在中国不能访问,我也懒的换了,我相信总有一天是能够访问的。

而且一般看我的博客的人都是会翻墙的,所以也没有换的必要,嘿嘿。昨天我也学会了怎么翻,所以又可以更新了,很爽!

我觉得有些网站封也就算了,blogspot不应该都封掉,至少像我这种谈谈技术的,一点都不黄色也不反动,sigh...

2009-07-16

写个博客真不容易

现在能上的网是越来越少了,sigh!!