Archive for the ‘Security’ Category

A Disappointing First Look At WolframAlpha

Saturday, May 16th, 2009

After all the hype preceding the launch of the new WolframAlpha knowledge engine, I checked it out after it went live a couple of hours ago. Here is my personal evaluation of the new site. Having seen many search engines and technologies rise and fall over the last decade, I was curious to test the site as soon as possible. As I am not a scientist, I will try to provide a view that focuses on the normal internet user.

First Tests

As I had recently been trying to find out how many castles there are in Scotland, I typed “castles scotland” into the input field. WolframAlpha returned the not very helpful message “Wolfram|Alpha isn’t sure what to do with your input.”, so I changed the query to “number of castles in scotland”, which still did not return a single result. Maybe WolframAlpha just does not know that, so I tried “number of rivers in scotland”, which was interpreted as “is Rivers, Manitoba, Canada in Scotland, Connecticut, United States” and returned “Result: no”.

So I changed the query to “number of rivers in Germany”, which worked out fine - great! The first correct result was returned!

In the minutes that followed I went on to enter different queries and analyzed the results to come up with the following list of inconveniences and problems from my point of view.

Where Are All The Links?

Links are the fundamental elements that keep the web glued together, yet I didn’t get any result that allowed me to dive deeper into the information. Admittedly, clicking on some values opened an overlay that allowed me to copy them, but only occasionally did it contain links that I could click on directly. Links from one page to another are what sometimes keep me on Wikipedia for half an hour or even longer, moving from one article to the next although I was only looking for a simple answer to a problem. WolframAlpha doesn’t seem to offer that kind of linking most of the time.

Just searching for Inverness, a city in the Scottish Highlands near the famous Loch Ness, returns some useful information on the city. It also lists cities nearby. Good idea, but I’d like to click on any city name directly to perform a search for that city, e.g. Edinburgh. Instead I need to either click on the name of the city, find the link to the city and click on it, or copy the city name or type it manually into the input field and perform a new search.

Another example: searching for “Walt Disney” does not return a result on the person Walt Disney but on the company. I need to select “Use as a person instead”, which by the way is set in a very small font that I personally regard as bad usability. From the Walt Disney (person) result there is no link back to the Walt Disney Company he founded. Why not? Wikipedia has it.

Weird Incompleteness

WolframAlpha doesn’t return anything on “london underground”, which is the oldest subway system in the world. Yet “new york subway” returns a result. Likewise you can search for “longest subway system”, which returns the New York City subway system, but a search for “oldest subway system” (which would be the London Underground) returns nothing at all. This is all the more fascinating if you keep in mind that Stephen Wolfram was born in London.

Google’s top results for “oldest subway system” show that it’s the London Underground - and I don’t even need to click on the results to get that information as it is contained in the snippets Google provides.

If you enter “liberty island”, WolframAlpha doesn’t find anything, whereas Google does. Yet WolframAlpha knows the Brooklyn Bridge. Apparently it is more important to know the Brooklyn Bridge than to know the island where the Statue of Liberty is located.

There Often Is More Than One Answer To A Question!

Most of the time you are stuck with the result without any helpful links whatsoever. Searching for “toons” shows “Interpreting ‘toons’ as ‘towns’”, which seems very far-fetched to me. It also does not return a result at all but only lets you select a city. I would have expected to receive the result for “toon” instead, which is the correct singular of “toons”.

The word “simpsons” is interpreted as “sum formula” yet a search for “bugs bunny” returns information on a Warner Brothers movie entitled “Bugs Bunny’s 3rd Movie: 1001 Rabbit Tales (movie)” with data from the IMDB. I would rather have seen some historical info on Bugs Bunny as on Wikipedia.

Likewise a search for “james bond” returns no info on the fictional character but instead returns information on the movie “A View To A Kill” from 1985, which is a somewhat matching result but not really what I had expected. Why was exactly this movie selected? Luckily WolframAlpha also shows a list of all the other James Bond movies. But, again, something unexpected happens. If you select “Casino Royale” from the list, you won’t see any information on the movie from 2006 but instead on the 1967 film version of the book. There seems to be no ranking. And there is no way for you to find out that there is another movie from 2006. No link, no info. If you depend on this information alone, you’re doomed to fail. If you search for “Casino Royale” manually, info on the 1967 movie is shown, but you can select the movie from 2006 directly.

Searching Google for “Casino Royale” shows the IMDB entry for the 2006 movie as the first result, which is what I would have expected. The second result from Google shows the 1967 version. Great!

If you query WolframAlpha for “Wolfram” you’ll be shown info on Stephen Wolfram - the creator of WolframAlpha.

Yet if you search Google for the last names of its founders, “Brin” and “Page”, that kind of bias doesn’t exist there. To me, WolframAlpha’s result in this case is not an objective result.

Some results seem to be very blatant errors. Just searching for my surname “Kimmel” as a single word without any spaces is interpreted as the distance between “Kim, Sughd, Tajikistan” and “Mel, Veneto, Italy”. Ouch! However searching for “Jimmy Kimmel” returns information on the talk show host.

Searching for “Illuminati” returns no result (conspiracy theorists: here we go), yet searching for “Adam Weishaupt”, who founded the Order of the Illuminati, returns a result.

I could literally go on for hours, but you should just try it for yourself. Just don’t bother searching for “Mickey Mouse” or “Seinfeld”, as no results are returned for these terms currently.

Bottom line: unlike in mathematics, there is not always only one solution to a problem. Just imagine if Google only showed you the one result it thinks is the best match. You wouldn’t like that either, I suppose.

Diving Into The Scientific World

Although I am not a scientist I just tried some searches with some unexpected results as well.

From Google, and from the WolframAlpha searches above, I had concluded that queries are case-insensitive. That was a mistake. I searched for “h2o” all in lowercase to get info on the water molecule, yet WolframAlpha interpreted this as a degree value. No water here. Searching for “H2O” in uppercase works, though, and returns the expected result.

Second try: I entered “au”, which I believe to be the chemical symbol for gold (from Latin “aurum”), but this was interpreted as “astronomical unit”. Although there are many links at the top of the page, there is no link for “as a chemical element”. Searching for “Au” returns the expected result, however. To get the correct results you obviously need to know the correct capitalization of the term you are looking for.

I don’t think the normal web user knows that.
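To illustrate what I would have expected instead, here is a minimal PHP sketch of a lookup that ignores case but still offers every matching interpretation, with the exact-case match listed first. The tiny table and the function name find_interpretations are made up for this example; this is not how WolframAlpha works internally.

```php
<?php
// Hypothetical mini knowledge base: entries and labels are made up for illustration.
const INTERPRETATIONS = [
    'Au'  => 'gold (chemical element)',
    'au'  => 'astronomical unit',
    'H2O' => 'water (molecule)',
];

/**
 * Return every interpretation whose key matches the query when case is ignored.
 * An exact-case match, if there is one, is listed first.
 */
function find_interpretations(string $query): array
{
    $exact = [];
    $others = [];
    foreach (INTERPRETATIONS as $key => $label) {
        if ($key === $query) {
            $exact[] = $label;              // same spelling and same case
        } elseif (strcasecmp($key, $query) === 0) {
            $others[] = $label;             // same spelling, different case
        }
    }
    return array_merge($exact, $others);
}
```

With such a lookup a query for “au” would still put the astronomical unit first but could also offer gold as an alternative, and “h2o” would find water.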

Third try: searching for “fly genome” returns no result. Google shows the expected results with the Berkeley Drosophila Genome Project first. I then searched for “drosophila genome” on WolframAlpha but got no result either, just a reference to “Animals: drosophila”.

Some More Searches

Here are some searches I performed which really give great results and bad results, respectively:

Good:

Bad:

One-Size-Fits-All Approach Is Wrong - Case-Sensitivity Is A Problem

In my opinion WolframAlpha should not return only one result, or if it does, it should offer the user better disambiguation. The one-size-fits-all approach is wrong, and currently it quite often fails to return the correct result. Google, on the other hand, does not show just one result; it shows (most of the time, apparently) the results it believes the user has been searching for, but it does not decide for the user what he seems to have intended. Searching for “flytrap” on WolframAlpha returns the word definition “a trap for catching flies”, not even the botanical definition of the Venus flytrap or anything else. If you search Google for “flytrap”, the results contain completely different entries, allowing for manual disambiguation: there is a company named “Flytrap Technologies”, an eZine named “Flytrap”, a Wikipedia article on the Venus flytrap and much more. You can then refine your search and look for “venus flytrap” on Google to get more information on that.

Bottom line: If you know exactly what you are looking for and know the complete correct term and capitalization, you will get the results you are looking for most of the time. If you don’t, you’re quite often lost. WolframAlpha knows only one “John Smith”, while Wikipedia knows more than 80 people with that name.

Let’s hope WolframAlpha gets better for every one of us, not just for scientists - I’m sure they’ll love it. For now I’ll try WolframAlpha often but I’ll stick with Google and Wikipedia for most of the searches. What about you?

Anyway, it still contains a huge amount of knowledge and is definitely something to thank the creators for. Surely it will develop over time. Let’s hope it doesn’t go where WikiaSearch has gone before.


Website Performance Checklist (PDF)

Thursday, May 14th, 2009

On this blog I have previously posted an article series on how to achieve maximum website performance. If you wish to follow the steps described in those articles, I thought it would be helpful to have a checklist that you can print out, ticking a box for every optimization step you have checked and optimized.

I have refrained from using any colors so it’s purely black and white for your day-to-day use.

So I created this checklist and offer it here as a free PDF download. Just click on the following button to access the website performance checklist. I always appreciate your comments, feedback and suggestions.

[Download button: Website Performance Checklist (PDF)]

Should You Really Use Security Questions In Your User Account Management?

Friday, May 1st, 2009

I have just read the article about the Twitter employee account that was hacked recently. According to the hacker’s comments in a forum, he used nothing but social engineering to answer the security question(s) on the Yahoo mail account that person had been using.

This article is not about blaming Yahoo in particular. Security questions are a common problem on most websites that offer them for retrieving a lost password or creating a new one, so everything here applies to other websites as well. I’m just using Yahoo as an example because the hacker (I wouldn’t even call him a hacker) gained access to a Yahoo account.

Yahoo currently allows the new user to choose between the following questions on the signup page:

What is your father’s middle name?
What was the name of your first school?
Who was your childhood hero?
What is your favorite pastime?
What is your all-time favorite sports team?
What was your high school mascot?
What make was your first car or bike?
Where did you first meet your spouse?
What is your pet’s name?

Another version of the signup form actually asks for two security questions, if that is the one you happen to use when signing up.

Do you think it is safe to use these questions? If you had just got to know someone online in a chat or through a social network, would you be suspicious if you were asked about your favorite pastime? I bet you wouldn’t. And I think you wouldn’t even remember that you used that answer for a security question on an account you set up some years ago.

The main problem I see here is that if the account owner has not given an alternative email address on signup, anyone who answers the question correctly immediately gains access to the user account and can begin reading and writing emails from that account (or using whatever service relies on that kind of ‘protection’).

In fact I have always refrained from using security questions at all on my websites. Of course providing this kind of “online account rescue service” surely saves you from many support requests. And I also understand that the larger the company the more password requests you’ll be getting which may put an immense strain on your support. But is this really a safe and secure method for re-gaining access to accounts? In my opinion it is not unless it is coupled with other features like cellphone verification via SMS or similar methods. If the user has provided an alternative email account most services will send the password reset information to that address so that in my opinion is rather safe.
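The reset-by-email route mentioned above could look roughly like the following PHP sketch: a random, time-limited token is mailed to the alternative address, and only its hash is kept on the server. The function names, the record layout and the one-hour lifetime are my own assumptions for illustration; storing the record and sending the mail are left out.

```php
<?php
// Minimal sketch of a time-limited reset token. Storage and mail delivery are
// left out; function names and the one-hour lifetime are assumptions.

/** Create a random token for the email plus the record to keep server-side. */
function create_reset_token(int $now, int $lifetimeSeconds = 3600): array
{
    $token = bin2hex(random_bytes(32)); // 64 hex characters from a secure source
    return [
        'token'  => $token,                      // goes into the reset link
        'record' => [
            'hash'    => hash('sha256', $token), // store only the hash
            'expires' => $now + $lifetimeSeconds,
        ],
    ];
}

/** Check a token taken from a reset link against the stored record. */
function is_reset_token_valid(string $token, array $record, int $now): bool
{
    if ($now > $record['expires']) {
        return false; // the link is too old
    }
    // hash_equals compares in constant time to avoid timing leaks
    return hash_equals($record['hash'], hash('sha256', $token));
}
```

Unlike the answer to a security question, such a token cannot be found out by chatting with the account owner, and it stops working once it has expired.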

For most websites however I still wouldn’t use security questions at all.


“Tweet This” WordPress Plugin Phones Home

Sunday, April 19th, 2009

For security reasons I always examine the source code of a WordPress plugin before installing it, which is not a problem for me given my ten years of developing in PHP. However, I don’t expect the usual WordPress user to have this kind of knowledge; most of them don’t have any experience in PHP at all.

Thus, when I was reading the source code of the Tweet This plugin, I was shocked to see that on both activation and deactivation the plugin automatically transfers the following information to the developer’s website:

  • URL of your blog
  • Tweet This version
  • status (activated, deactivated)
  • number of posts in your blog
  • title of your blog
  • description of your blog
  • language of your blog
  • your email address
  • Tweet This plugin settings
  • WordPress version

This is actually the code snippet taken directly from the source code - no, I didn’t add the “Big brother” there, that’s how it’s written in the code:

// Big brother is watching.
function tt_phone_home($status) {
    global $current_site; global $wpdb; $wpv = get_bloginfo('version');
    $siteURL = $current_site->domain; $blogURL = get_bloginfo('url');
    $title = get_bloginfo('name'); $email = get_bloginfo('admin_email');
    $description = get_bloginfo('description');
    $lang = get_bloginfo('language');
    $posts = number_format($wpdb->get_var("SELECT COUNT(*)
    FROM $wpdb->posts WHERE post_status = 'publish'"));
    $settings = $wpdb->get_var("SELECT option_value
    FROM $wpdb->options WHERE option_name = 'tweet_this_settings'");
    $phone = tt_read_file('http://th8.us/ttph.php?s=' . $siteURL . '&b=' .
        $blogURL . '&v=1.3.9&u=' . $status . '&p=' . $posts . '&t=' .
        urlencode($title) . '&d=' . urlencode($description) . '&l=' .
        urlencode($lang) . '&e=' . urlencode($email) . '&w=' . $wpv .
        '&x=' . urlencode($settings));
}

So if you don’t want all this information to be transferred to the developer, then before activating the plugin simply add a “return;” right after the function definition, leaving the rest untouched.

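function tt_phone_home($status) {
  return; // added line: leave before any data is collected or sent
  // ... rest of the original function stays untouched ...
}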

That way the function returns right away without calling any URL at all. I’m not saying that the developer is doing anything harmful with that plugin, yet I don’t see any reason for transferring this information to his server.
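For readers who want a starting point for reviewing a plugin themselves, the following PHP sketch lists the lines in a plugin directory that mention a remote call or a hard-coded URL. The pattern list is my own assumption and far from complete (wp_remote_get and wp_remote_post belong to the WordPress HTTP API, the others are plain PHP functions), and the helper names find_remote_calls and scan_plugin_directory are made up. A match is only a hint for a closer manual look, not proof of wrongdoing.

```php
<?php
// Rough first pass over a plugin before activation. The pattern list is an
// assumption and incomplete; every hit still needs to be read by a human.

/** Return the lines of PHP source that mention a remote call or a hard-coded URL. */
function find_remote_calls(string $source): array
{
    $pattern = '/\b(wp_remote_get|wp_remote_post|curl_init|fsockopen|file_get_contents)\s*\(|https?:\/\//i';
    $hits = [];
    foreach (explode("\n", $source) as $index => $line) {
        if (preg_match($pattern, $line) === 1) {
            $hits[$index + 1] = trim($line); // key is the 1-based line number
        }
    }
    return $hits;
}

/** Scan every .php file below a plugin directory and collect the hits per file. */
function scan_plugin_directory(string $dir): array
{
    $report = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if (strtolower($file->getExtension()) !== 'php') {
            continue; // only PHP source files are of interest here
        }
        $hits = find_remote_calls((string) file_get_contents($file->getPathname()));
        if ($hits !== []) {
            $report[$file->getPathname()] = $hits;
        }
    }
    return $report;
}
```

The plugin’s own wrapper tt_read_file is not in the pattern list, but the hard-coded URL in the call would still be reported, which is usually enough to find such a function.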


Google Introduces New CAPTCHA

Sunday, April 19th, 2009

Sometimes on the web you just think “Why the heck didn’t I have that idea?”, and that’s exactly what happened when I read this article on CNET. Everybody should already be familiar with the CAPTCHA concept. I recently had to enter a CAPTCHA code three times because the first two were simply indecipherable. That problem actually occurred on a Google page.

About two years ago I developed my own custom CAPTCHA, in which you have to identify the male person in the image and enter the corresponding code under the face, with some random parameters added:

[Image: example of the custom CAPTCHA]

However, Google’s method is superior yet so simple because they only rotate an image. A human can easily tell how an image has been rotated, yet your usual automated spam bot cannot. Maybe I’ll change my CAPTCHAs to this method as well - it is even far easier to implement.

The full details of Google’s new method can be found in this PDF file.
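To show why a rotation CAPTCHA is so easy to implement, here is a minimal PHP sketch of the server-side check. It is not Google’s implementation. I am assuming that the image is shown rotated by a known angle, that the user turns it back by some angle, and that the answer passes if the image ends up close enough to upright. The function names and the 15 degree tolerance are made up.

```php
<?php
// Sketch of the server-side check for a rotate-the-image CAPTCHA. Not Google's
// implementation; names and the 15 degree tolerance are assumptions.

/** Smallest absolute difference between two angles in degrees (0 to 180). */
function angle_difference(float $a, float $b): float
{
    $diff = fmod(abs($a - $b), 360.0);
    return $diff > 180.0 ? 360.0 - $diff : $diff;
}

/**
 * The image was shown rotated by $shownRotation degrees and the user turned
 * it by $userRotation degrees. The answer passes if the total rotation is
 * close to 0, i.e. the image ends up upright within the tolerance.
 */
function is_upright(float $shownRotation, float $userRotation, float $tolerance = 15.0): bool
{
    return angle_difference($shownRotation + $userRotation, 0.0) <= $tolerance;
}
```

The tolerance matters because a human will rarely hit the exact angle; how wide it can be before guessing becomes too easy is something I have not worked out.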
