Posts Tagged ‘searchengine’

Bing’s Blatant Censorship In Germany

Sunday, June 7th, 2009

Microsoft’s new search (decision) engine decides too much in Germany: it applies very blatant and, in my opinion, stupid censorship for no obvious reason. Google doesn’t have these problems in Germany, so with Google you can actually search for new pantyhose. With Bing it’s impossible (this link only shows the message if accessed from Germany):

bingcensorship_1

The text says (in German):

The search pantyhose may return sexually explicit content.
To get results, change your search terms.

This censorship was already in use on MSN Live Search some time ago. So does Microsoft really think this kind of censorship makes sense and will some day lead to more use than Google in Germany?

The problem I see is that the restrictions seem to be based only on specific words, so they are very easy to circumvent. Sometimes it is even sufficient to add one additional word. The simplest solution is to change the country in the top right corner of the Bing website. Even if you are in Germany, just switching the country to e.g. the United States removes all restrictions.
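To illustrate why a filter based only on specific words is so easy to get around, here is a minimal sketch of an exact-match blocklist. This is purely a guess at the behavior described above, not Bing’s actual implementation: the function name and the list are hypothetical, and the two terms are simply the English translations used as examples in this post.

```python
# Hypothetical illustration only: NOT Bing's real filter.
# The blocked terms are the English examples mentioned in this post.
BLOCKED_QUERIES = {"pantyhose", "handcuffs"}


def is_blocked(query: str) -> bool:
    """Block a search only if the whole query exactly matches a listed term."""
    return query.strip().lower() in BLOCKED_QUERIES


if __name__ == "__main__":
    # The bare term is blocked ...
    print(is_blocked("pantyhose"))
    # ... but adding one more word slips past an exact-match check.
    print(is_blocked("pantyhose shop"))
```

A filter like this matches the whole query string, which would explain why adding a single word can be enough to get results.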

This kind of censorship also happens in other countries like India.

I have compiled a list here that shows some of the blocked words and their translations. Ambiguity of words does not seem to matter to Bing. Very explicit terms have been excluded, but viewer discretion is advised anyway:

bing-censored-search

It is amazing that even terms like “handcuffs” and “pantyhose” are blocked by Bing.

Doublethink? Advertisers Are Allowed To Use These Terms

Interestingly there do not seem to be any such restrictions for advertisers. Although ads are not shown when the search itself has been blocked by Bing, searching for related words shows ads containing words that you can’t even search for. There are even some more explicit terms in the ads than I have used in the sample above.

bing-censored-ads

Do advertisers know that these words have been blocked? If I were a website owner trying to sell pantyhose, I would want my ad to be shown when someone searches for the word.

Good For Microsoft?

I highly doubt that this kind of censorship will be beneficial for Microsoft in Germany given the current discussions on internet censorship in Germany. But it shows what may be used in the future, and not only on Bing.com.

If you want to protect your child from potentially harmful content on the internet (in a way that cannot be circumvented with two clicks, as with Bing currently) there are other ways, like talking with your child about it or, in the worst case, installing filter software on your PC. But if someone is trying to find websites on syringes (blocked in Germany) on Bing, please let them find results.

My personal conclusion: I do not like censorship, therefore I do not like Bing. I’ll stick with Google. It is censored as well, but not to such a degree, and the censorship there is based on single results, not on search queries.

How To Use The Bing Webmaster Tools To Get Info On Your Site

Monday, June 1st, 2009

bing.com, Microsoft’s new search engine, launched just today. Surely you have already checked out the search results and the positions of your favorite keywords. But have you also checked out all the tools Microsoft now offers webmasters to analyze their websites?

Verify Ownership Of Your Website

Just go to the Bing Webmaster Center and click on the “Add a site” button to add your website. In the form that is shown, enter the URL of the website you wish to add. Bing even allows you to provide an email address “to contact you if [they] encounter specific issues with your site”, which sounds very interesting because Google does not provide that feature. Only the coming weeks and months will show what comes of using that email feature.

bing-scr01

After submitting the form you have to add some verification code to your site (or your server). In contrast to Google, which only requires you to create an empty file with a specific name, Microsoft wants you to add an XML file with specific content to your server. You can also choose to add a META tag to your site, but I recommend using the XML file because it’s much simpler: you only need to upload it once to your server, whereas you’d have to add the META information to the homepage template.
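If you go the META tag route, it can be handy to check locally that the tag really ended up in your homepage HTML before asking Bing to verify. The following is only a rough sketch: the tag name `example-site-verification` and the code `ABC123` are placeholders I made up, so use whatever name and value the Webmaster Center actually shows you.

```python
from html.parser import HTMLParser


class MetaFinder(HTMLParser):
    """Collects the content of every <meta> tag with a given name."""

    def __init__(self, name: str):
        super().__init__()
        self.name = name.lower()
        self.contents = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == self.name:
            self.contents.append(attr.get("content"))


def has_verification_meta(html: str, name: str, code: str) -> bool:
    """Return True if the page has <meta name=NAME content=CODE>."""
    finder = MetaFinder(name)
    finder.feed(html)
    return code in finder.contents


# Placeholder homepage; the tag name and code are hypothetical.
SAMPLE = (
    '<html><head><meta name="example-site-verification" content="ABC123">'
    "</head><body></body></html>"
)

if __name__ == "__main__":
    print(has_verification_meta(SAMPLE, "example-site-verification", "ABC123"))
```

To test a live site you would fetch the homepage first and pass its HTML to `has_verification_meta`.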

bing-scr02

After you have added the META tag to your homepage or uploaded the XML file click on the “Return to list” button. You’ll see your website in the list. Just click on the domain name.

Bing will access your website immediately and check for either the META tag or the XML file. If you have done everything correctly you will be taken to the site summary page, which provides a wealth of information on your site as seen by Bing.

bing-scr03

Site Summary And Domain Score

The site summary shows you when your site was last crawled by the Bing crawler, the number of indexed pages, whether Bing has been blocked from accessing your site (if you have blocked it via the robots.txt file for example) and a domain score which is shown as five boxes. Microsoft writes here:

“Domain Score provides a measurement of how authoritative Bing views your domain to be, with five green boxes being the highest rating and five empty boxes being the lowest. This is based on many of the same factors Bing uses to determine static rank, but isn’t directly comparable.”

Luckily this blog has a domain score of 5/5 at the time of writing.

Bing also shows you the top 5 pages of your site.

Your Profile

When selecting “Profile” from the top navigation you can change the settings you have already seen when you added your site. You can also see the current verification method Bing is using to verify your site ownership.

Crawl Issues

This section shows you crawling issues that may have occurred on your site such as pages that Bing could not find (404 error) or pages blocked by the robots.txt file.

It also shows you a list of long dynamic URLs that Bing has flagged because it thinks they might lead the crawler into an infinite loop and may also produce duplicate content.

The Crawl Issues page also tells you whether the crawler found pages on your site which it believes to be infected with malware or using unsupported content types.

bing-scr05

Backlinks

The backlinks page shows you all of the backlinks Bing has found to your domain together with the page score, language and region of the page linking to your content. I really like the inclusion of the page score because it may be used to find “bad neighborhoods” linking to your site although Microsoft says that the score isn’t directly comparable.

The page will only show the first 20 backlinks but you can download the complete list as a CSV file to your system.
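Once you have the CSV file, a few lines of script are enough to pick out the low-scoring pages that link to you. This is just a sketch under assumptions: I have not documented the real column layout of Bing’s export here, so the column names `url` and `page_score` and the sample rows are invented for illustration and will need adjusting to the actual file.

```python
import csv
import io

# Invented sample data; the real export's columns may be named differently.
SAMPLE_CSV = """url,page_score,language,region
http://example.com/a,5,en,US
http://example.org/b,1,de,DE
http://example.net/c,0,en,IN
"""


def low_score_links(csv_text: str, max_score: int = 1) -> list:
    """Return the URLs of linking pages whose score is at most max_score."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["url"] for row in reader if int(row["page_score"]) <= max_score]


if __name__ == "__main__":
    for url in low_score_links(SAMPLE_CSV):
        print(url)
```

For a downloaded file, read its contents as text and pass them to `low_score_links`.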

bing-scr04

Outbound Links

This page will show you all of the links on your site Bing has found that are leading to other websites. Just like on the backlinks page it shows you the page score, language and region as well and even allows you to show your outbound links to malware sites – let’s hope you don’t have any on your site.

Just like before you can also download the complete list as a CSV file.

Interestingly all of the links on my page leading to Twitter (the source of which is a Twitter plugin for WordPress which shows the latest tweets on my blog) have a page score of 5/5. Does that mean that Bing sees Twitter as an authoritative site?

bing-scr06

Keywords

This page allows you to see “how your site performs in search results for searches using specific keywords” although I don’t quite understand the results. You can enter a keyword in the text field provided and it will show you the page on your site, the page score of that page and once again the language, region, last crawl date and whether the Bing crawler was prevented from accessing the page.

bing-scr07

It is interesting but I had expected to see SERP positions for the given keyword which would be a great feature. Entering “wolframalpha” shows a page score of 5 for my article on WolframAlpha yet when searching for “wolframalpha” on bing.com that page is not listed in the first 100 results.

More (Not So) Interesting Stuff

You can also add your sitemap directly by clicking on the Sitemaps tab.

The “Related Tools” section in the navigation on the left side lists some links that sound interesting at first, but in my opinion they are a bit disappointing. If you thought that clicking on the Robots.txt validator link would let you analyze the robots.txt file of your current site, you’re wrong. You can copy the contents of any robots.txt file there to check it for incompatibilities with the MSNBot, but that’s all. Slightly disappointing.
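If all you want to know is whether a given robots.txt would keep the crawler away from a certain URL, you can also check that locally with Python’s standard `urllib.robotparser` module. A minimal sketch follows; the robots.txt content and the example.com URLs are made up, and this only applies the generic robots.txt rules, not whatever MSNBot-specific checks Microsoft’s validator performs.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt: msnbot is kept out of /private/, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /private/

User-agent: *
Disallow:
"""


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(can_fetch(ROBOTS_TXT, "msnbot", "http://example.com/private/page.html"))
    print(can_fetch(ROBOTS_TXT, "msnbot", "http://example.com/index.html"))
```

The parser can also load a live file via `set_url()` and `read()` instead of pasted text.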

Likewise the HTTP Verifier and Keyword Research Tool links lead you directly to the default pages on the Microsoft website.

Bottom Line…

I recommend that you add your site(s) to the Bing Webmaster Center so that you can access the interesting statistics they provide – I’m sure many more tools will be provided in the future.

You should also check out the forum for many interesting discussions.

I’m amazed that Microsoft provides these features right from launch day.

We’ll see what else will be provided in the future.

Get Info On Domains With WolframAlpha

Saturday, May 16th, 2009

Did you know that WolframAlpha allows you to get info on specific websites and domains?

It’s quite easy as you just need to enter the domain name into the search field. Most of the data seems to have been retrieved from Alexa but it’s still a great idea. I still don’t understand the HTML element hierarchy graph but it looks interesting. However I don’t know what the tag information could be used for.

The following image shows the info WolframAlpha shows currently for this blog. And I think I need to say that the number of visitors is a bit far off ;-)

WolframAlpha Sascha Kimmel

A Disappointing First Look At WolframAlpha

Saturday, May 16th, 2009

After all the hype prior to the launch of the new WolframAlpha knowledge engine, I checked it out after it went live a couple of hours ago. Here is my personal evaluation of this new site. Having seen many search engines and technologies rise and fall within the last decade, I was curious to test the site as soon as possible. As I am not a scientist, I will try to provide a view that focuses on the normal internet user.

First Tests

As I had recently been trying to find out how many castles there are in Scotland I typed “castles scotland” into the input field. WolframAlpha returned the not very helpful message “Wolfram|Alpha isn’t sure what to do with your input.” so I changed the query to “number of castles in scotland” which still returned not a single result. Maybe WolframAlpha just does not know that so I tried “number of rivers in scotland” which was interpreted as “is Rivers, Manitoba, Canada in Scotland, Connecticut, United States” and returned “Result: no”.

So I changed the query to “number of rivers in Germany” which worked out fine – great! The first correct result was returned!

In the minutes that followed I went on to enter different queries and analyzed the results to come up with the following list of inconveniences and problems from my point of view.

Where Are All The Links?

The fundamental elements of the web that keep it all glued together are links, yet I didn’t get any result which allowed me to dive deeper into the information. Sure, after clicking on some values a layer appeared that allowed me to copy the values, but only sometimes did it contain links that I could click on directly. Links from one page to another are what sometimes keep me on Wikipedia for half an hour or even longer, traversing from one article to the next although I was only looking for a simple answer to a problem. WolframAlpha doesn’t seem to offer that kind of linking most of the time.

Just searching for Inverness, which is a city in the Scottish Highlands near the famous Loch Ness, returns some useful information on the city. It also lists cities nearby. Good idea, but I’d like to click on any city name directly to perform a search for that city, e.g. Edinburgh. What I need to do instead is either click on the name of the city, find the link to the city and click on it, or copy the city name (or enter it manually) into the input field and perform a new search.

Another example: searching for “Walt Disney” does not return a result on the person Walt Disney but on the company. I need to select “Use as a person instead”, which by the way is in a very small font, something I personally regard as bad usability. From the Walt Disney (person) result there is no link back to the Walt Disney Company he founded. Why not? Wikipedia has it.

Weird Incompleteness

WolframAlpha doesn’t return anything on “london underground”, which is the oldest subway system in the world. Yet “new york subway” returns a result. Likewise you can search for “longest subway system”, which returns the New York City subway system, but don’t search for “oldest subway system” (which would be the London Underground), which will return nothing at all. This is all the more fascinating if you keep in mind that Stephen Wolfram was born in London.

Google’s top results for “oldest subway system” show that it’s the London Underground – and I don’t even need to click on the results to get that information as it is contained in the snippets Google provides.

If you enter “liberty island” WolframAlpha doesn’t find anything, however Google does. Yet WolframAlpha knows the Brooklyn Bridge. Seems to be more important to know the Brooklyn Bridge than to know the island where the Statue Of Liberty is located.

There Often Is More Than One Answer To A Question!

Most of the time you are stuck with the result without any helpful links whatsoever. Searching for “toons” shows ‘Interpreting “toons” as “towns”’ which for me seems very far-fetched. It also does not return a result at all but allows you to select a city. I would have expected to receive the result for “toon” instead which is the correct singular for “toons”.

The word “simpsons” is interpreted as “sum formula” yet a search for “bugs bunny” returns information on a Warner Brothers movie entitled “Bugs Bunny’s 3rd Movie: 1001 Rabbit Tales (movie)” with data from the IMDB. I would rather have seen some historical info on Bugs Bunny as on Wikipedia.

Likewise a search for “james bond” returns no info on the fictional character but returns information on the movie “A View To A Kill” from 1985, which is a somewhat matching result but not really what I had expected. Why has exactly this movie been selected? Luckily WolframAlpha also shows a list of all of the other James Bond movies. But, again, something unexpected happens. If you select “Casino Royale” from the list you won’t see any information on the movie from 2006 but instead on the 1967 film version of the book. There seems to be no ranking. And there is no way for you to find out that there is another movie from 2006. No link, no info. If you only depend on this information you’re doomed to fail. If you search for “Casino Royale” manually, info on the 1967 movie is shown but you can select the movie from 2006 directly.

Using Google searching for “Casino Royale” shows the IMDB entry for the 2006 movie as the first result which is what I would have expected. The second result from Google shows the 1967 version. Great!

If you query WolframAlpha for “Wolfram” you’ll be shown info on Stephen Wolfram – the creator of WolframAlpha.

Yet if you search Google for the Google founders’ last names “Brin” and “Page”, that kind of bias doesn’t exist there. For me WolframAlpha’s result in this case is not an objective result.

Some results seem to be very blatant errors. Just searching for my surname “Kimmel” as a single word without any spaces is interpreted as the distance between “Kim, Sughd, Tajikistan” and “Mel, Veneto, Italy”. Ouch! However searching for “Jimmy Kimmel” returns information on the talk show host.

Searching for “Illuminati” returns no result (conspiracy theorists: here we go) yet searching for “Adam Weishaupt”, who founded the Order of the Illuminati, returns a result.

I could literally go on for hours, but you should just try it for yourself. Just forget searching for “Mickey Mouse” and “Seinfeld”, as no results are returned for these terms currently.

Bottom line: contrary to mathematics there is not always only one solution to a problem. Just imagine if Google only showed you the one result it thinks is the best match. You wouldn’t like that either, I suppose.

Diving Into The Scientific World

Although I am not a scientist I just tried some searches with some unexpected results as well.

I had assumed, not only from Google but also from the WolframAlpha searches performed above, that queries are case-insensitive. That was a mistake. I searched for “h2o” all in lowercase to get info on the water molecule. Yet WolframAlpha interpreted this as a degree value. No water here. Searching with H2O in uppercase works though, returning the expected result.

Second try: I entered “au” which is what I believe to be the chemical abbreviation for gold (from Latin “aurum”) but this has been interpreted as “astronomical unit”. Although there are many links at the top of the page there is no link for “as a chemical element”. Searching for “Au” returns the expected result however. To get the correct results you obviously need to know the correct capitalization of the word you are looking for.

I don’t think the normal web user knows that.
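What I would have expected instead is something like the following: look the query up without regard to capitalization and offer every known interpretation. This is only a toy sketch of that idea and says nothing about how WolframAlpha works internally; the table, its entries and the function name are my own invention, built from the two examples above.

```python
# Toy lookup table keyed by the lowercased query; entries are illustrative only.
INTERPRETATIONS = {
    "au": ["astronomical unit (au)", "gold, chemical element (Au)"],
    "h2o": ["water molecule (H2O)"],
}


def interpretations(query: str) -> list:
    """Return all known readings of a query, ignoring capitalization."""
    return INTERPRETATIONS.get(query.strip().lower(), [])


if __name__ == "__main__":
    # Both spellings lead to the same list, so the user can pick a reading.
    print(interpretations("au"))
    print(interpretations("Au"))
```

With a lookup like this the capitalization would only influence which reading is shown first, not whether the other one can be found at all.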

Third try: searching for “fly genome” returns no result. Google shows the expected results with the Berkeley Drosophila Genome Project first. I then searched for “drosophila genome” on WolframAlpha but got no result either, just a reference to “Animals: drosophila”.

Some More Searches

Here are some searches I performed which really give great results and bad results, respectively:

Good:

Bad:

One-Fits-All Approach Is Wrong – Case-Sensitivity Is A Problem

In my opinion WolframAlpha should not return only one result, or if it does, it should offer better disambiguation to the user. The one-fits-all approach is wrong, and quite often it still doesn’t return the correct result. Google on the other hand shows not just one result but (most of the time, apparently) the results it believes the user has been searching for, without deciding for the user what they intended. Searching for “flytrap” on WolframAlpha returns a word definition, “a trap for catching flies”, and not even the botanical definition of the “Venus Fly Trap” or anything else. If you search Google for “flytrap” the results contain completely different entries, allowing for manual disambiguation: there is a company named “Flytrap Technologies”, an eZine named “Flytrap”, a Wikipedia article on the Venus Flytrap and much more. You can then refine your search to “venus flytrap” on Google to get more information on that.

Bottom line: If you know exactly what you are looking for and know the complete correct term and capitalization you will most of the time get the results you are looking for. If you don’t you’re lost quite often. WolframAlpha knows only one “John Smith”, Wikipedia knows more than 80 people with that name.

Let’s hope WolframAlpha gets better for every one of us, not just for scientists – I’m sure they’ll love it. For now I’ll try WolframAlpha often but I’ll stick with Google and Wikipedia for most of the searches. What about you?

Anyway, it still contains a huge amount of knowledge and definitely is something to thank the creators for. Surely it will develop over time. Let’s hope it doesn’t go where WikiaSearch has gone before.
