Keeping Busy

January 23, 2008 on 2:30 pm | by Mor | In General, News | 1 Comment

Hey there, readers - we’re still here. We’ve just been busy making all the things we worked on at the lab real in some way or another (and yes, Fire Eagle is coming soon). So we’re not blogging every few minutes like some other sites… but we did write a few interesting papers in the meantime (come see us at CHI 2008 and WWW 2008).

We’ll tell you more soon… stay tuned. And for more frequent entertainment, click Next!


Why do we write?

September 20, 2007 on 9:11 am | by Mor | In General | 3 Comments

Phew. The CHI 2008 paper deadline was yesterday, and our people were involved in a total of five different papers. That’s a lot of work, and a lot of time. Sometimes we need to remind ourselves why we do it.

Indeed, people inside and outside the Yahoo! organization often ask “Why do we publish academic papers?” On its face, it would seem that writing papers is not only a waste of corporate time and money, but may also expose techniques and valuable knowledge and insights to competitors. Indeed, other companies (that shall remain nameless) have generally not encouraged participation in academic discourse by their (sometimes) brilliant researchers. Luckily, in our Advanced Development Research lab (aka Y!RB), the approach has been a little different. While we are not under constant pressure to publish papers, it is certainly encouraged and expected that we do.

Asking questions like ‘why do we publish’ usually grows into a much larger and more philosophical question: “What is research?” Let’s not go there quite yet – more on that later.

To me, there are a number of reasons that make academic papers a worthwhile endeavor (warning: personal opinions follow; the list below and the words above do not represent the views of Yahoo!, Yahoo! Research, Yahoo! Advanced Development Division, or my dog Mingus).

1. Writing makes you organize your thoughts. It’s akin to authoring a presentation; one that you are required to submit in advance and that can be rejected. To write a research paper that will get accepted to a top conference, you need to crystallize your thoughts, express your hypothesis and claims clearly, and be able to show significant results. The process forces you to do a better job of understanding, situating, evaluating and defending your work and its different components. I cannot tell you how many times we have started writing a paper only to realize, based on the initial writing, that we are doing something wrong (or not well enough).

2. Publishing is the best way to get feedback that in turn can validate and improve your work. At the basic level, getting a paper into a major conference serves as external validation that your work is worthwhile. You can be working in the dark for years and keep patting yourself on the back - but convincing the reviewers of a major conference or journal that your work is important and interesting is a mark of success that can be trusted (let’s assume a perfect reviewing system for the moment - this is certainly not often the case). At a different level, feedback from reviewers, people who read your work, or (more often) from those who attended your conference presentation, can greatly improve your work. People are often keen, perhaps too keen (argh!), to give you ideas related to your presented work or how to do it better.

3. Publishing papers gives back to the community and facilitates and invigorates academic discourse. Other than warm fuzzies, this gives you a chance to make an impact that exceeds the boundaries of your organization. Of course, the contribution gets you - and your company - credit as good citizens in the academic community.

4. Publishing is an opportunity to steer and inform the research community about a direction in which you are invested. Simply by writing about a new research problem, there is a chance that other researchers will become interested and start looking at the same domain. Such a chance to transform and inspire other brilliant researchers requires a well-thought-out problem definition and some initial attempt at tackling it (i.e., a research paper).

5. [This is a weird one] In a large company, publishing a paper at a conference can be the first time that the relevant people in the company are exposed to your work. As absurd as this may sound, in a large organization, internal communication is as difficult as one may imagine. A conference brings people with the same interests together and they will find you instead of you having to find them. For example, our CHI 2007 papers had little exposure internally at Yahoo! before the conference. Many of our UED and UER people were exposed to this work at the conference, and a much wider internal discussion followed. This is not a Yahoo! thing. I have also heard stories from friends in other research labs who had their research “discovered” by product teams when it was presented at public conferences.

6. Publishing leads to recruiting from academia. Nothing tells exceptional students (and faculty) that Yahoo! is the best place for them like a brilliantly delivered presentation of deep and thoughtful ideas. And sometimes, even our presentations are enough to attract such interest. At least three researchers and interns in our lab are here mostly because they have seen a researcher speak at a conference or another venue about our work; countless other CVs were received.

7. Writing can be an outlet of creativity. We’re not all Dick Bulterman (see here for example), but at least we can pretend…

Anything I missed?


Flickr Fountain of Knowledge

July 31, 2007 on 10:00 am | by Mor | In General, Media in Context, Social Media | 1 Comment

What can we learn from Flickr? Well, for one, we have learned that there are a lot of people who like to take photographs and share them publicly. Who would have guessed! However, my question refers to a different type of knowledge: information about the world that is implicitly encoded in the activity on Flickr.

You do not need to go far to see a simple yet brilliant example of such knowledge: check out Flickr’s tag clusters (here are the clusters for love, jaguar, Taj Mahal, hack). Using tag co-occurrence on Flickr photos, Flickr’s clustering can break down a term into multiple semantics or meanings: Jaguar, for example, is the animal as well as the car and the guitar: the first co-occurs with the tags “zoo” and “cat”; the second meaning of “jaguar” appears with “car” and “auto”. Note that these meanings are not mined from any other resource: they represent some “knowledge” that is generated automatically from the implicit contributions of Flickr users uploading and tagging their photos.
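To make the co-occurrence idea concrete, here is a minimal sketch in Python. It is not Flickr's actual clustering algorithm (which is not described here); it simply links two tags when they appear together on a photo that also carries the target tag, and treats each connected group as one "meaning". The function name `cluster_cotags` and the toy photo data are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cluster_cotags(photos, target):
    """Group the tags that co-occur with `target` into rough 'senses'.

    Illustrative assumption: two co-tags belong to the same sense if they
    appear together on at least one photo that also carries `target`.
    """
    neighbors = defaultdict(set)
    for tags in photos:
        if target not in tags:
            continue
        cotags = sorted(set(tags) - {target})
        for tag in cotags:
            neighbors[tag]  # make sure isolated co-tags get an entry too
        for a, b in combinations(cotags, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)

    # The connected components of the co-tag graph are the clusters.
    clusters, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            tag = stack.pop()
            if tag in component:
                continue
            component.add(tag)
            stack.extend(neighbors[tag] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Toy data: each photo is just a set of tags.
photos = [
    {"jaguar", "zoo", "cat"},
    {"jaguar", "cat", "animal"},
    {"jaguar", "car", "auto"},
    {"jaguar", "auto", "classic"},
    {"dog", "park"},
]
print(cluster_cotags(photos, "jaguar"))
# [['animal', 'cat', 'zoo'], ['auto', 'car', 'classic']]
```

On real data a single stray photo tagged with both "zoo" and "car" would merge the two groups, so a practical version would probably need to weight or threshold the links rather than accept every co-occurrence.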

In other examples, Patrick Schmitz developed a different co-occurrence model that allowed him to generate subsumption data in Flickr tags (e.g. San Francisco is subsumed by California). The work at Yahoo! Research on TagLines and at our own lab on Tag Maps has shown that Flickr community activity generates descriptive labels for events and locations.
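For a feel of what a subsumption test over co-occurrence counts can look like, here is a rough sketch. It uses a generic conditional-probability rule of thumb and is not necessarily the model Schmitz used: a tag is treated as broader than another if it appears on nearly every photo that has the narrower tag, but not the other way around. The function name, the threshold of 0.8, and the sample photos are all assumptions for illustration.

```python
from collections import Counter
from itertools import permutations


def find_subsumptions(photos, threshold=0.8):
    """Return (broad, narrow) pairs where `broad` appears to subsume `narrow`.

    Illustrative rule: P(broad | narrow) >= threshold, while
    P(narrow | broad) < threshold.
    """
    tag_count = Counter()
    pair_count = Counter()
    for tags in photos:
        tags = set(tags)
        tag_count.update(tags)
        # Count ordered pairs so both directions can be tested.
        pair_count.update(permutations(sorted(tags), 2))

    result = []
    for (broad, narrow), both in pair_count.items():
        p_broad_given_narrow = both / tag_count[narrow]
        p_narrow_given_broad = both / tag_count[broad]
        if p_broad_given_narrow >= threshold and p_narrow_given_broad < threshold:
            result.append((broad, narrow))
    return sorted(result)


photos = [
    {"california", "sanfrancisco"},
    {"california", "sanfrancisco"},
    {"california", "losangeles"},
    {"california", "losangeles"},
]
print(find_subsumptions(photos))
# [('california', 'losangeles'), ('california', 'sanfrancisco')]
```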

Last week, in Amsterdam, as part of SIGIR 2007, we added yet another method of extracting knowledge from Flickr. The paper, “Towards Automatic Extraction of Event and Place Semantics from Flickr Tags”, by Tye Rattenbury, Nathan Good (two of our star interns) and myself*, begins to answer a simple question: given a tag that appears on Flickr (such as “dog”, “SIGIR 2007”, or “Yahoo! Research Berkeley”), can we automatically determine whether or not that tag refers to a specific place, and whether or not the tag refers to a specific event? As you may guess, SIGIR 2007 refers to an event, Yahoo! Research Berkeley is a place, and “dog” is neither a place nor an event.

Knowing if a tag is a place or event leads to better image search, but can also help us to better visualize the Flickr data; generate automatic event and place gazetteers; associate missing time/location metadata based on tags, and more.

I will not get into the details of how we propose to extract the place/event knowledge from Flickr; you can get these details in our paper (pdf). I will just mention that we are using the dataset of geotagged Flickr photos, and looking at the time and location distributions for each individual tag in the dataset. If the location or time distribution for a tag has a specific “structure” to it, we classify that tag as a place or event, accordingly.
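As a very loose illustration of what “structure” in a distribution might mean, the sketch below uses a much cruder test than the one in the paper: it bins a tag's photos by day and by a coarse latitude/longitude grid, and calls the tag event-like or place-like when most of its photos land in a single bin. Everything here (the function names, the one-day and 0.1-degree bins, the 0.6 threshold, the sample data) is an assumption for illustration, not the method from the paper.

```python
from collections import Counter


def peak_share(keys):
    """Fraction of observations that fall into the single busiest bin."""
    keys = list(keys)
    if not keys:
        return 0.0
    return max(Counter(keys).values()) / len(keys)


def classify_tag(timestamps, locations, day=86400, cell=0.1, threshold=0.6):
    """Guess whether a tag looks like an event and/or a place.

    timestamps: photo capture times in seconds
    locations:  (latitude, longitude) pairs in degrees
    """
    day_bins = (int(t // day) for t in timestamps)
    grid_bins = ((int(lat // cell), int(lon // cell)) for lat, lon in locations)
    return {
        "event": peak_share(day_bins) >= threshold,
        "place": peak_share(grid_bins) >= threshold,
    }


# A tag used mostly on one day, by photographers all over the world.
event_like = classify_tag(
    timestamps=[1000, 2000, 3000, 5000, 900000],
    locations=[(37.87, -122.27), (52.37, 4.89), (40.71, -74.0),
               (48.85, 2.35), (35.68, 139.69)],
)

# A tag used over many weeks, always around the same spot.
place_like = classify_tag(
    timestamps=[0, 86400 * 10, 86400 * 20, 86400 * 30],
    locations=[(37.871, -122.271), (37.872, -122.272),
               (37.873, -122.273), (37.874, -122.274)],
)

print(event_like)  # {'event': True, 'place': False}
print(place_like)  # {'event': False, 'place': True}
```

Fixed bins are the obvious weakness: a festival that spans a weekend, or a place that straddles a grid line, would be split across bins and could be missed, which is one reason to look at a tag's distribution at more than one scale.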

Below, you can follow the presentation slides I gave at SIGIR, or just jump directly to the paper to get the full story.

While the debate on the “Is the semantic web dead?” question continues, “emerging semantics” are alive and kicking. What other knowledge can be extracted from the Flickr dataset?

* “Towards” is a code word in research papers meaning “we didn’t take the research all the way quite yet but want to make the paper sound important nevertheless” - we try not to use it too much.


Creative Acts beyond Dissemination

June 22, 2007 on 5:21 pm | by ayman | In General, Media and Community | Leave Comment

Conceptual Art Models

Last week Ryan and I were in DC at Creativity and Cognition 2007 (C&C) where we ran a workshop on Supporting Creative Acts Beyond Dissemination.

In its 6th year, C&C addresses creativity in both theory and practice. The research aims to provide us all with a deeper understanding of the creative processes. The conference was amazing and we were excited to be there: see it in Jean-Baptiste’s Flickr set. All of the attendees have a similar goal: to enable more people to be creative more of the time.

In our workshop, we began to think about new forms of digital creativity, one with a blurred distinction between creator-centric and experience-centric creativity. In attendance, we had a diverse set of participants (artists, dancers, research directors, business consultants, and a few computer/information scientists). Bringing together this set of people to discuss a common theme was a phenomenal experience. Each of us presented and discussed our perspective on how creativity works across a variety of domains. It’s good to see how artists and engineers intersect and connect ideas (we built a concept map of the work to show some of these connections). We then reconsidered models that cleanly separate the two, and began to seek out new models in which the user of a creative work takes on a generative role, not just an interpretive or interactive one.


The Month of UC Berkeley (Talks) at Y!RB

June 6, 2007 on 6:17 pm | by Mor | In General | 1 Comment

Our lab has lots of reasons to love UC Berkeley. Sitting two blocks away from campus means that we can look there for inspiration, feedback, collaboration, and - above all - interns!

This month, we get some more UC Berkeley love with a series of talks from Berkeley professors in our weekly Brain Jam seminar: Ray Larson (June 8th), Marti Hearst (June 15th), Nancy Van House (June 22nd) and Maneesh Agrawala / Jeffrey Heer (June 29th, details TBD). Head to Upcoming for the details (here are all future events for Y!RB) or sign up for our mailing list to get weekly reminders for all our Brain Jam events. There are quite a few interesting speakers stopping by this summer!


Copyright © 2008 Yahoo! Inc. All rights reserved.