April 27, 2005
MSN's capitalistic search engine bias for Microsoft-driven web sites: damned statistics and the truth.
Slashdot readers have brains, despite what I've heard otherwise. However, the human mind doesn't always work as well as we think it should: humans interpret their surroundings with a strong bias towards what we expect and what we want to hear.
On the one hand, it's a good thing. It's one of the things that makes us creative individuals as opposed to a big grey mass, where every intelligent person thinks the same about, say, a news story. Our bias towards the expected also helps us to perceive our world - for example, to hear a complicated continuous audio signal as speech.
Then again, it's not a great idea to confuse a biased opinion with truth. Let's look at the Slashdot crowd. The readers of one of the biggest geek websites on the planet have a very particular bias when it comes to software producers. Open Source people are good. Microsoft is bad. Apple is good, despite being somewhat closed-source.
Thankfully, we have the mathematical means to overcome our own tinted glasses that make us wine-goggle in the bars on Saturday nights. They're called statistics. A hated subject for those that believe in believing. I guess I'd simply say: Believing is fun, Knowing is fun-ner!
So, the other day, Slashdot posted a news story about Microsoft's own search engine (MSN search) tending to rank sites higher in its results if they are delivered by a Microsoft web server. That means, if you run a web site that uses Microsoft's (known-to-be inferior) web server IIS, it will turn up earlier than those run using the unix-based Apache server. So, the Microsoft search engine has a bias in favor of MS-driven sites. At least that's what the Slashdot story suggested.
![]()
(Diagrams courtesy of Ivor.it)
Looking at the original article, I noticed that it listed only some 20 search queries that were used to verify this disturbing hypothesis. It showed that the MS-engine shows a slightly higher proportion of IIS-run sites than Google does.
A good analysis should ask: is this pure chance, or is there a real correlation? My comment on Slashdot about the result: Is it significant? That is, can we exclude (to a reasonable certainty, that is, at a significance level of p<0.05) the possibility that the effect seen is due to chance or to some other criterion MSN uses?
Later on, the author of the article decided to give us more (better) data, and a fellow Slashdot reader ran a quick Chi-square significance test and showed that there is a correlation beyond reasonable doubt. (What I mean is that the independent variable "Search Engine" (=MSN or Google) and the dependent variable "IIS proportion" are related according to the results.)
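For the curious, here is roughly what such a Chi-square test looks like. This is a minimal sketch in Python, not the Slashdot reader's actual calculation: the counts below are made up purely for illustration, and the function name `chi_square_2x2` is my own. It uses the plain Pearson statistic for a 2x2 table (no continuity correction), which has one degree of freedom.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom,
    without continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Shortcut formula for the 2x2 Pearson statistic.
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# HYPOTHETICAL counts, invented for illustration only:
# rows = search engine, columns = (IIS-served results, other results).
msn_iis, msn_other = 60, 140
google_iis, google_other = 40, 160

chi2, p = chi_square_2x2(msn_iis, msn_other, google_iis, google_other)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

Note what the test does and doesn't tell you: a small p says the two proportions probably differ, nothing more. It says nothing about why they differ.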
So, I rest my case, it's a proven thing - Microsoft is evil, they manipulate their search results big-time, right?
Couldn't be further from the truth. Apart from the fact that the effect - that is the bias - might not be very strong, the correlation doesn't mean that M$ introduced this bias. Data need to be approached with more caution.
Maybe it's just that sites with that particular web server are of higher quality? Maybe there are so many junk sites out there run with Apache simply because it's cheaper to install Apache? (see also this comment.)
This is just a little episode, and I have the feeling we encounter situations like this one all the time when dealing with scientific data. The mind plays tricks on us and hides the truth. Then, our statistical analysis plays more tricks, suggesting maybe what the statistician wanted to see. Oh well, life's a bitch, everything is relative and there is no truth anyways. Over and out.
Posted by dr at 2:14 PM | Comments (0) | TrackBack
April 26, 2005
Mac: open source projects need your help
If you know how to program, and if you're familiar with Mac OS X technologies and one of the common implementation languages (Objective C, Java, C), check out these fine open source projects and consider contributing work! Despite their big names in the open source world, it seems to me that they need a lot of help from OS X programmers to make things work well. Current CVS snapshots of these projects are sometimes buggy, and often lacking in terms of their user interface. But traditionally, that's where the Mac shines!
- VideoLanClient (VLC) - shows movies (fix those shortcuts, finally!)
- MplayerOSX - another movie player (needs bug fixes!)
- TeXShop - great (La)TeX editor that needs some UI 'integration' (too much clutter right now) - language: Objective C
- GNU Emacs - work on Unicode and Mac user interface (fix those sliders, finally!) - language: C / (e)lisp. On OS X, you could help a lot by contributing to the Aquamacs distribution, which packages Emacs with a setup that works well on the Mac. Alternatively, work on XEmacs and Andrew Choi's port.
- Audacity - great audio editor, needs performance optimization - language: C++
So, if you have some time on your hands, do some programming! It's fun and very gratifying!
Posted by dr at 10:28 AM | Comments (0) | TrackBack
April 21, 2005
Even real academics publish at dubious conferences!
MIT shows us again what people in the community have known for a while: Conferences organized by some "professors" from certain Slavic or South American countries are bogus - they are held to cash in on $1500 conference fees from attendees. These spamferences are advertised with spam - academic conferences usually issue their "calls for papers" by e-mail, too, so it doesn't look too much like spam. After some grad students from CSAIL (the computer science & AI institute at MIT) staged a big prank to publicize the spamference issue, I looked around on the net and found: a lot of otherwise respectable academics list publications at those bogus conferences.
Let's backtrack first. What happened? CSAIL made a little Natural Language Generator that automatically composes papers: articles that look like scientific publications, but are just randomly generated (albeit syntactically correct) text. They managed to get a paper accepted at one of these spamferences, and they even want to give a randomly generated talk there!
From an NLG perspective, it's not rocket science. In fact, the generated text is neither coherent nor cohesive, that is, broadly speaking, the sentences don't really fit together. The context-free grammars they're proud of using are the most basic thing you could imagine using here. But the SCIgen project, led by Jeremy Stribling et al. from CSAIL, is making a good statement about those bogus conferences.
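To show just how basic the technique is, here is a toy sketch of random text generation from a context-free grammar, in Python. The grammar and vocabulary are invented for this example and have nothing to do with SCIgen's actual (much larger) rule set; the function names are mine, too.

```python
import random

# Toy grammar, invented for illustration; NOT SCIgen's actual grammar.
# Keys are non-terminals; each maps to a list of alternative productions.
GRAMMAR = {
    "S": [["NP", "VP", "."]],
    "NP": [["the", "ADJ", "N"], ["our", "N"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["in", "NP"]],
    "ADJ": [["scalable"], ["probabilistic"], ["novel"]],
    "N": [["framework"], ["algorithm"], ["methodology"]],
    "V": [["refutes"], ["synthesizes"], ["evaluates"]],
}

def expand(symbol, rng):
    """Recursively rewrite a symbol into a list of terminal words."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: emit as-is
    production = rng.choice(GRAMMAR[symbol])
    words = []
    for part in production:
        words.extend(expand(part, rng))
    return words

def sentence(rng):
    """Generate one sentence; the final token is the full stop."""
    words = expand("S", rng)
    return " ".join(words[:-1]).capitalize() + words[-1]

rng = random.Random(2005)  # fixed seed so runs are repeatable
for _ in range(3):
    print(sentence(rng))
```

Every sentence that comes out is grammatical, but nothing ties one sentence to the next. That is exactly the lack of coherence and cohesion mentioned above.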
If you search Google for Nagib Callaos, you'll find a whole bunch of conferences organized by this (alleged) professor from Venezuela, but only one publication, which is thankfully ripped apart by Mark Liberman over at the Language Log.
The SCIgen project's cause is worthwhile for another reason. Surprisingly, academics all over the world submit their hard work to such bogus conferences -- or, worse, they submit papers that didn't get accepted elsewhere. This Google search lists a whole bunch of people that proudly mention a publication at one of N. Callaos' conferences. Were those otherwise respectable academics tricked into publishing at spamferences? Were these workshops and conferences actually real? What's going on?
In other academic news: some guy is deep in brown stuff now. He stole a Berkeley biologist's laptop with exam documents, but unfortunately, there are other data on the machine. In short, Jasper Rine is mad. (Link to video, audio and transcriptions.)
Posted by dr at 9:00 AM | Comments (0) | TrackBack
April 17, 2005
Old school: pictures on film, taken in Barcelona
I've uploaded pretty pictures from Barcelona, Spain, taken during my recent trip. Having switched back to conventional film, I appreciate the moment when I get the pictures back from the lab. In my case, I'm not getting prints but just order development and the pics on a CD-ROM. "Surprise, surprise", just like in the old days, when I used to browse through the prints. Why did I switch back? I had a nice Canon EOS 300D digital SLR camera, after all... But... (see below!)
![]()
![]()
![]()
Like the pics? I have the whole series in my photo blog for you.
Read on if you'd like to know why I've gone back to film photography.
So I had a nice Canon EOS 300D (Digital Rebel) SLR camera, but I wasn't too happy with it. Even though it's one of the best digital SLRs among the affordable models (say, below EUR 1,500), it's not up to par with a mid-range conventional SLR that you can get for a fraction of the price. The viewfinder is tiny, and I hear that the updated 350D model has an even smaller one. What, if not for better control of the picture in the viewfinder, would I want to buy an SLR for?
Sure, I can use Canon AF lenses with it. But when I wanted to hook up that cheap old Sigma lens, the camera just gave me an "Error 99" and shut down. My once expensive, digitally controlled Metz Flash, strong enough to light up a cathedral, doesn't work at all with it.
So I shot a lot of pretty pictures with my 300D Rebel. And now, concentrating more on work things, I just watched the camera's value depreciate. It's losing maybe EUR 350 in value in a year -- do I save that much by not using film?
No! And I don't even have a super-great camera; just some respectable but amateur-level body. Conclusion: Camera goes to eBay, David bought a very cheap analog Canon EOS50E (nothing great, but good value) and in a year's time, I'll check the market again.
Posted by dr at 6:01 PM | Comments (0) | TrackBack
April 11, 2005
Send e-mail everywhere with Postfix - thanks guys!
Look what I received from Indonesia the other day...
It's an amazing tutorial from You Dave
http://www.david-reitter.com/software/osxpostfix.html
It works and very useful.
Thanx a lot, pal.
And I've been getting a lot of these over the past 12 months. I've written a tutorial that explains how to activate Postfix on a Mac, which allows you to send e-mail with your normal mail program from everywhere - even from a hotel or an airport wireless LAN, where you normally can't send e-mail because you can't get through to a heavily protected company e-mail server.
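Once Postfix is running on your own machine, it's not just your mail program that can use it; any script can hand messages to it as well. Here's a minimal Python sketch, assuming Postfix accepts mail on localhost at port 25 (the standard SMTP port; your setup may differ). The addresses are placeholders and the function names are made up for this example. This is not part of the tutorial itself.

```python
import smtplib
from email.message import EmailMessage

def build_message(sender, recipient, subject, body):
    """Assemble a plain-text e-mail message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

def send_via_local_postfix(msg, host="localhost", port=25):
    """Hand the message to the local mail server (assumed: Postfix
    listening on localhost:25), which takes care of delivery."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)

# Placeholder addresses for illustration only.
msg = build_message(
    "me@example.com",
    "friend@example.com",
    "Greetings from the airport",
    "Sent through my own Postfix!",
)
# Uncomment once Postfix is actually running locally:
# send_via_local_postfix(msg)
```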
Keep the comments flowing, guys - they are really motivating!
Posted by dr at 7:46 PM | Comments (0) | TrackBack
April 8, 2005
Ryanair: Don't go to Hahn at night, and other tips
Ryanair has revolutionized the European market for short-haul and mid-range flights. Just like a number of other budget airlines, they offer flights starting from around 20 EUR (one-way, incl. taxes) to European destinations. Avoiding the big airports and going to places that are an hour away from the real hotspots, they save on landing fees. Ryanair offers 'no-frills' service: no free scotch, no newspapers, and seats that sometimes don't recline.
I really like not having to pay for my neighbor's love of Scotch and Bloody Marys. Ryanair seems to be cheap. But what's the actual cost of taking Ryanair? Here's a bit of anecdotal evidence and a few tips for European travellers.
What we need to keep in mind: for the passenger, it's the door-to-door journey that counts. Getting to a lonesome airport in the countryside won't help me. I went to a couple of places recently with Ryanair. Here's what I learned.
Ryanair Flight Edinburgh to Dublin, February '05:
- B737, looks old and flies accordingly. Wow, what a hard landing!
- Brrr, nice cold and rainy. "Great Gangway", I smile at the flight attendant after I climb up the back steps, getting soaked.
- Apart from the usual and annoying security briefing and their even more stupid "I shall be dimming the cabin lights", they bother me with no less than three announcements about duty free shopping, pay-snacks and lottery tickets. I hate it!
Ergo: unpleasant flight, but cheap enough.
For my next flight, I get good and super-expensive noise-cancelling earphones for my iPod, so I don't have to listen to the flight attendants any more. (Yeah! Don Quixote wins again!)
Another bumpy Ryanair Flight to "Frankfurt" Hahn, March '05:
- Looks like it won't cost me more time than going 'British Airways': Ryanair operates a direct flight that I can afford, and BA's stopover in Heathrow would cost more time - even considering those 2 hours I waste getting to Glasgow Prestwick extra-early.
- Prestwick airport parking is dead expensive. I spend 16 UK pounds (about 25 dollars) for an extended weekend. Plus the gas to get there!
- After a pleasant flight on a new aircraft with friendly staff and no interruptions (thanks to my new earphones), arrival at Hahn (which is about as close to Frankfurt as it is to Cologne) at 11.30pm local time. Ryanair artificially extends its flight times, so they can factor in delays without having to admit that they're late. Well, I guess a lot of airlines do that. So I arrive early, but have to wait till 12.30 for my bus to leave. The 10.50 EUR journey will take another 70 minutes.
- Too bad. Some other (Ryanair) flight from Pisa is late, so the bus company decides to make us wait another 45 minutes. It'll be 2.45am before I find some sleep.
- The next day, my family whom I am visiting tells me that this happens all the time at Hahn when you take the last bus for the day. If there is any flight that's late, they'll wait. And because there are several flights at that time, the probability is high...!
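How high is "high"? A back-of-the-envelope sketch in Python: if each flight is late with probability p, and we assume (simplistically) that delays are independent of each other, then the bus waits whenever at least one of n flights is late, which happens with probability 1 - (1 - p)^n. The numbers below are hypothetical; I have no real delay statistics for Hahn.

```python
def prob_bus_waits(p_late, n_flights):
    """Probability that at least one of n_flights is late,
    assuming each is late independently with probability p_late."""
    return 1 - (1 - p_late) ** n_flights

# HYPOTHETICAL figures: 4 late-evening flights, each late 20% of the time.
print(f"{prob_bus_waits(0.2, 4):.0%} chance that the last bus waits")
```

Even with fairly punctual individual flights, the chance that all of them are on time shrinks quickly as the number of flights grows.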
Conclusion: Forget about Frankfurt Hahn. Don't go there. It's not really Frankfurt, and more importantly, you will not get decent sleep if you take the last bus!!
For my German friends: Don't fly to Hahn airport, at least not late at night! The bus waits for delayed planes and won't get you home until three.
March 05, Edinburgh to Barcelona (Girona) and back.
- Nice new Boeing, and I even manage to get the extra-legroom emergency exit seat!
- We're not really going to Barcelona, Spain, but to Girona on the Costa Brava. There's a bus service (about 10 EUR, one way) to Barcelona that takes 70 minutes.
- The way back is unpleasant. The bus drops you off an hour early (two hours before departure!), so for my 1:30pm flight, I have to catch the bus at 10.30am. That's early when you're on holidays!
- Girona is the nicest airport ever, though - and that for one reason. They have a terrace that lets you sit outside in the sun. No flowers, no trees, but blue Spanish sky and sunshine. Wonderful!
Posted by dr at 11:49 PM | Comments (2) | TrackBack
April 6, 2005
X-Plane: a great flight simulator needs quality assurance
It's not a game, it's a simulation: the X-Plane flight simulator. It lets you take the pilot's seat on a 747 flightdeck or get behind the controls of a (much more fun) Cessna 172. Unlike Microsoft's Flight Simulator, the veteran of flight simulation, X-Plane is meant to be highly realistic and allows aviation companies to test new or futuristic aircraft designs.
I've been practicing the basics - taking off, flying the traffic pattern (which is a rectangle in the air over airports) and making my landings with small planes. Navigation sort of works, too. That's fun and reasonably challenging if you're flying in weather - as long as the software does what you expect it to do. X-Plane is fun - its bugs are funny. Read on...

(Santa Monica municipal airport, final approach, runway 21 in a Piper Malibu. The real downtown Santa Monica has quite a few towers - but probably none right at the end of a runway, as these real-life pictures show.)
Version 8 is out - with more realistic scenery. X-Plane lets you fly everywhere, with the best scenery available for the United States. Unfortunately, good graphics demand much more than my two-year-old G4 Apple Powerbook can deliver, so for now, I'm flying in thick fog with reduced-quality scenery.
But that's not all that you should beware of. X-Plane is developed mainly by one guy in South Carolina, Austin Meyer. In between coding sessions, he likes to get in his Corvette and drive to the airport to take his Cirrus (one of the more expensive General Aviation aircraft) for a spin. He's quite the geek when it comes to development: more instruments, more realism. And just like in most nerds' software projects, quality assurance comes last. X-Plane works in general - but when you try to work the radio, it fails miserably.
Let's get a little technical to give you an example: Flying on an IFR (instrument) flight plan from KSFO (that's San Francisco Int'l) to KLAX (Los Angeles Int'l), Air Traffic Control just doesn't do its job. I'm getting up to some 80nm before KLAX, and the virtual ATC guys seem to think I'm happy up there at 33,000 feet. No way to check in with approach - but when I'm getting really close, I can request vectors to the ILS (for non-aviators: that's the approach to a place from which I can use the automated 'instrument landing system' that'll guide me down to the runway). Funny thing is, I'm told to drop down to 4,000 ft within seconds. I can hear those 500 passengers scream. The best is yet to come: "77 Alpha Uniform, cleared for the 07 left approach", and hey, what's next? They hand me off to a little Helipad tower. Did I mention I was flying a 747?
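To put numbers on why this is absurd, here's a rough Python sketch using the common pilots' rule of thumb of about 3 nautical miles of track per 1,000 ft of descent. The 450-knot groundspeed is my own assumption for illustration, and the function names are made up; this is armchair arithmetic, not flight planning advice.

```python
def descent_distance_nm(start_alt_ft, target_alt_ft, nm_per_1000ft=3.0):
    """Distance needed to descend, using the 3:1 rule of thumb."""
    return (start_alt_ft - target_alt_ft) / 1000 * nm_per_1000ft

def required_descent_rate_fpm(start_alt_ft, target_alt_ft,
                              distance_nm, groundspeed_kt):
    """Descent rate (feet per minute) needed to lose the altitude
    within the given distance at a constant groundspeed."""
    minutes = distance_nm / groundspeed_kt * 60
    return (start_alt_ft - target_alt_ft) / minutes

# Altitudes from the flight above; groundspeed is an ASSUMED 450 knots.
needed = descent_distance_nm(33000, 4000)
rate = required_descent_rate_fpm(33000, 4000, 80, 450)
print(f"Rule of thumb: start down about {needed:.0f} nm out")
print(f"From 80 nm out at 450 kt: about {rate:.0f} ft/min")
```

In other words, by the time I'm 80nm out and still at cruise altitude, a comfortable descent should already have begun.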
Someone with more experience with X-Plane tells me on an X-Plane.org forum: "ATC sucks, always has - if you want reality, use the real-world charts, plan your flight & fly the approach you want." Darn it - if I wanted reality, I would get to the next airport. I just want to learn a bit about radio comm's!
I had another problem - this time it's my fault. X-Plane 8.10 has added realism in the 747's autopilot. I didn't know how to use it right and couldn't land the jumbo any more. Again, the helpful folks at X-Plane.org tell me that this is a technical problem with the default 747 that comes with X-Plane, yes, you heard it, the aircraft that pops up right after you start the program for the first time. The guy at the forum doesn't lose his irony: "Exactly - the default planes are usually the ones that are the most outdated... don't be surprised if those planes are missing some vital systems as well (like anti-ice for a 747 )" Well, this time they weren't quite right about my problem - it's just a button that I missed.
Take a close look at the somewhat funny screenshots. Thank God the engineers at Boeing and Airbus take quality assurance seriously, maybe unlike Laminar Research, maker of X-Plane...

(And the million dollar question is... what's wrong in this picture?)
Posted by dr at 3:48 PM | Comments (2) | TrackBack
April 3, 2005
Aquamacs: An Emacs that finally feels more like an OS X application
Do you know Emacs? Yes, that geek-editor with the long complicated keyboard shortcuts, the strange ugly windows and the very useful colored text? The editor that can also manage your calendar, your e-mails and has a built-in automatic psychiatrist?
We've made Emacs more usable. It opens a new window for each file you load. It gives you a nice menu with the recent files. It uses all the standard shortcuts you're used to from other applications, and displays texts in a decent font by default.
As my friends would expect of me, this is meant for the prettiest operating system: Mac OS X.
The new Emacs is a distribution of a recent Emacs build from CVS; we call it the Aquamacs Emacs distribution.
It's a joint project with poet-programmer (!) Kevin Walzer. There is a more detailed description with download for users available. You can download the complete application there: ready to run! To install, just get it and move the automatically extracted Emacs application to, well, wherever - for example into your Applications folder!
What I did technically was theoretically pretty simple. I found, configured, installed, packaged a whole bunch of packages and code snippets from other people's .emacs (emacs configuration) files that contribute some functionality to make things really Mac-like. I did not modify the Emacs code itself - it's the Carbon Emacs compiled straight from CVS. A stand-alone package has been available for quite some time, but Kevin and I felt that it would be time to provide a complete binary build. So I would see this as a distribution of other people's and our code rather than our own invention.
At this point, I consider the quality of things pre-1.0, so I'm asking for people's feedback on the project. Does everything work as expected? If not, can you please figure out how to fix it and send me a patch? I'll include it in the next release.
If you would like to do it yourself, you can get the source code of the modifications right with the distribution (inside the Emacs.app package, in the site-lisp directory) and combine those with a CVS build. We're using one from 2005-03-29.
What's left to do? Lots. For starters, I think we'll need a good overhaul of the menu structure - stuff that doesn't work at this point (such as Edit/Text Properties) shouldn't be there, because it's confusing otherwise. The 'exit Emacs' function should move to the 'Emacs' menu, and preferences should go there, too, theoretically. Another of the most problematic areas in any OS X build of Emacs is the behavior of the sliders (scrollbars).
Why does this come as a separate distribution? In other words, why didn't Kevin and I contribute to the main distribution? Well, several people have attempted to get the folks on the emacs-devel mailing list to make the Mac port more Mac-like, and to modernize the user interface in general. They had very little success. The consensus seems to be that compatibility across operating systems and consistency with former versions of Emacs is deemed much more important than adhering to modern UI standards. That's alright. But given that Emacs came to life almost 30 years ago, and the UI as it stands today started when graphical user interfaces were in their infancy, it is clear that new users will find Emacs hard to learn and inconsistent with all other applications.
Kevin and I welcome feedback and contributions. Check out the Aquamacs site for a description of the project and download information.
(Revised: Text properties vs. Options / Faces)
Posted by dr at 10:48 AM | Comments (8) | TrackBack
