I just came back from ECCS 2010 in Lisbon and want to share some of the great talks with you while the videos are not yet available. The talks at the main conference were not as good as expected, but luckily there were fabulously organized satellite meetings. Especially good was the High Throughput Humanities workshop, excellently organized by Riley Crane, Gourab Ghoshal, Sune Lehmann, and Max Schich. They provided the audience with a great variety of interesting topics, well presented by enthusiastic speakers (which was quite a contrast to the main conference).
The talk I liked the most was the one by Sebastian Ahnert, titled "Mapping Flavour Space". He came up with an experiment to test the hypothesis that "pairs of foods which share chemical flavour compounds also taste well together". To do so, he analyzed thousands of recipes from all over the world and checked whether the ingredients combined in them tend to share chemical flavour compounds or not. The article is not yet out, but I'll keep you updated on this.
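To make the idea concrete, here is a minimal sketch of how one could score a recipe by the number of flavour compounds its ingredient pairs share. This is my own illustration, not the method from the talk: the function name `mean_shared_compounds`, the ingredient names and the compound labels are all made-up placeholders.

```python
from itertools import combinations


def mean_shared_compounds(recipe, compounds):
    """Average number of flavour compounds shared by each ingredient pair.

    `recipe` is a list of ingredient names; `compounds` maps each
    ingredient to the set of compounds it contains (hypothetical data).
    """
    pairs = list(combinations(recipe, 2))
    if not pairs:
        return 0.0
    shared = sum(len(compounds[a] & compounds[b]) for a, b in pairs)
    return shared / len(pairs)


# Placeholder data: ingredient and compound names are invented.
compounds = {
    "ingredient_a": {"c1", "c2", "c3"},
    "ingredient_b": {"c2", "c3"},
    "ingredient_c": {"c4"},
}

score_pair = mean_shared_compounds(["ingredient_a", "ingredient_b"], compounds)
score_triple = mean_shared_compounds(
    ["ingredient_a", "ingredient_b", "ingredient_c"], compounds
)
```

If the hypothesis holds, real recipes should on average score higher than random combinations of the same ingredients, so a proper test would compare such scores against a randomized baseline.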
Also extremely interesting was the talk by Alan Mislove, who has analyzed several large online social networks. One of the cool projects he showed is Twittermood: a dynamic map for which he first rated tweets from a given region at a given time according to their mood and then showed the active regions on a density-preserving map. It shows periodic patterns on different time scales, but also that the west coast is consistently happier than the east coast. California, here we come! Of course there are technical difficulties with automatic mood detection: the algorithm is not able to detect irony or slang. So, when a Californian says 'that's sick' it does not necessarily mean she is in a bad mood ;-)
Unfortunately, I missed the other talks before lunch since I had to give my own talk on one-mode projections of bipartite graphs in the workshop Science of Complex Networks. After lunch I enjoyed the talk by Sang Hoon Lee on "Googling social interactions", which had appeared shortly beforehand in PLoS ONE. In this work they simply test, for a group of n persons, how often Google can find them on the same page. They interpret the number (with some caution) as the degree to which these persons are interacting. They did a very interesting pre-election analysis of the changing social interactions between the most important members of the two parties. The approach is of course not without problems. First, Google does not give definite counts of the number of pages it found, and it does not make sure that all of the pages are sufficiently different. Second, people might be named in the same article or blog simply because they are doing the same thing (say, running for a seat), not because they did something together. Third, people might be named together because they actually hate each other. Lee et al. did a good job of describing these difficulties. Still, such an analysis might reveal new and totally unforeseen connections that can then be analyzed with better relational data. For me, it is an ideal method to find the 'needle in the haystack' with a big fork before analyzing whether the needle is made of steel or of gold. Thus, with the necessary caution, this new approach might be a very interesting first step for a large-scale network analysis.
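The counting idea is easy to sketch for pairs of persons. The snippet below is only an illustration under my own assumptions: instead of querying a search engine it uses a tiny hand-written list of "pages", and the function name `cooccurrence_weights` and all person names are hypothetical. It is not the code of Lee et al.

```python
from itertools import combinations


def cooccurrence_weights(pages, names):
    """For every pair of names, count the pages that mention both.

    The count is read as a (noisy) edge weight between the two persons.
    Matching is plain substring search, which is an assumption of this
    sketch and far cruder than what a search engine does.
    """
    weights = {}
    for a, b in combinations(sorted(names), 2):
        weights[(a, b)] = sum(1 for text in pages if a in text and b in text)
    return weights


# Toy stand-in for search results; the texts are invented.
pages = [
    "Alice and Bob presented a joint bill.",
    "Bob criticised Carol in the debate.",
    "Alice met Bob again on Tuesday.",
    "Carol gave a speech.",
]

weights = cooccurrence_weights(pages, ["Alice", "Bob", "Carol"])
for pair, count in weights.items():
    print(pair, count)
```

Note that the second toy page already shows the third problem from above: Bob and Carol get an edge although the page describes a conflict, not a collaboration.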
Sang Hoon is now with Petter Holme, who gave an interesting talk on "Networks of Internet mediated prostitution". Basically, the research was on a forum where men can rate female prostitutes in Brazil. As always, he was witty and self-ironic. I liked that he cited Elizabeth Pisani, a famous scientist working on HIV, who wrote a blog entry with the title "Swedes make sex boring, even in Brazil" ;-) I don't agree with her on this one: it is actually a difficult topic for a scientist, and I think they did a good job in presenting the data. Anyway, she also has a very interesting perspective on complex systems, so make sure you see her TED talk.
I also liked Alexander Mehler's talk about categorization inconsistencies in Wikipedia a lot. Of course, if a categorization were perfect, it would be a tree or maybe a DAG (directed acyclic graph). That is not what happens in Wikipedia; however, it is very nearly a DAG. It seems that the paper itself is not yet out, but check his homepage to find related papers.
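Checking whether a category system is a DAG is a standard exercise. Here is a minimal sketch using Kahn's algorithm (repeatedly removing categories that have no remaining parent); the function names and the toy category graphs are my own assumptions and have nothing to do with the actual Wikipedia data or the analysis from the talk.

```python
from collections import deque


def _indegrees(children):
    """Return a dict mapping every category to its number of parents."""
    nodes = set(children)
    for subs in children.values():
        nodes.update(subs)
    indegree = {n: 0 for n in nodes}
    for subs in children.values():
        for s in subs:
            indegree[s] += 1
    return indegree


def is_dag(children):
    """True if the category -> subcategories graph has no directed cycle."""
    indegree = _indegrees(children)
    queue = deque(n for n, d in indegree.items() if d == 0)
    removed = 0
    while queue:
        n = queue.popleft()
        removed += 1
        for s in children.get(n, []):
            indegree[s] -= 1
            if indegree[s] == 0:
                queue.append(s)
    # Categories on a cycle never reach indegree 0 and are never removed.
    return removed == len(indegree)


def max_parents(children):
    """Largest number of parents of any category (at most 1 in a tree)."""
    return max(_indegrees(children).values(), default=0)


# Toy graphs: "Biophysics" has two parents, so this is a DAG but not a tree.
clean = {
    "Science": ["Physics", "Biology"],
    "Physics": ["Biophysics"],
    "Biology": ["Biophysics"],
}
# Adding a link back to the root creates a cycle.
cyclic = dict(clean, Biophysics=["Science"])

print(is_dag(clean), max_parents(clean))
print(is_dag(cyclic))
```

For an "almost DAG" like the one described in the talk, the interesting follow-up question is how few links one has to remove to make `is_dag` return true, which is a much harder problem than the check itself.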
The talk by Leif Isaksen was quite funny. He and his group won a Google award for a project proposal on what one could do with Google Books. They proposed to scan through the books and identify places, if possible in the order in which they come up in the book. The information can then be used to visualize the action in a single book or series, or to find all books in which a certain place is named. This sounds far easier than it actually is. Of course, many cities have quite different names in different languages (and they might be referred to by any of them, even if the book is in English), the name of a city can change over time (Lisbon was once called Olissipo :-)), the name can be a common word in a text (as stated in the talk, there is a city called 'A' and one called 'B'), and finally, there can be different cities with the same name. So, it will be interesting to see how the project evolves.