Newspapers: Google does something for humanity (again)


The BBC reports on a press conference where Google announced that it was set to digitize ~100 historical newspapers from North America. Take a look at the search interface. Or read Google’s blog post.

And who thought that Google was just evil?


Google + Deep Web = True?


I’m not posting much at the moment (too much going on on the development front, more on this later), but I’ll make an exception for some really important news: for the first time, Google is actually indexing information that can only be retrieved by filling in a web form. Basically, Google is getting savvy to the deep web, which has hitherto not been accessible via the search giant.

There seem to be fewer and fewer reasons to doubt the supremacy of Google, except the following:

  • Google will only be indexing a selection of sites
  • They will only index a selection of the information on these sites
  • The native interfaces for these sites typically add value to your search (contextual cues, component information, semantic correlates, related terms, etc.) and present the information accordingly.

The development is nevertheless extremely positive, as Google will help find relevant sites, if not all the information contained within them. Nice work, G!

Learn Google: Seven quick tips


OK, Google is something we use every day, but here are seven things you ought to know:

  1. use + to force Google to include a particular word out of several, +cat +dog +raining
  2. use - to exclude a word, +cat -dog +raining
  3. use site: to search within a single website, infonatives
  4. use define: when you are after a definition, define:clarion
  5. use quotation marks if you are looking for a phrase, “My way by Frank Sinatra”
  6. use an asterisk if part of a phrase is “optional”, “My way by * Sinatra”
  7. use ~ to force Google to include synonyms, ~prate stoltenberg

Learn more tricks like these.
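
The operators in the list are just text, so it is easy to assemble them in a script. Here is a minimal sketch, assuming the operators behave as described in the list; the function names, the parameter names and the base URL default are my own choices for illustration, not anything Google publishes as an API.

```python
from urllib.parse import quote_plus


def build_query(require=(), exclude=(), synonyms=(), phrase=None, site=None):
    """Assemble a search string from the operators in the list.

    All parameter names are hypothetical; they simply mirror the tips.
    """
    parts = [f"+{word}" for word in require]       # tip 1: force inclusion
    parts += [f"-{word}" for word in exclude]      # tip 2: exclude a word
    parts += [f"~{word}" for word in synonyms]     # tip 7: include synonyms
    if phrase:
        parts.append(f'"{phrase}"')                # tips 5 and 6: exact phrase, * allowed
    if site:
        parts.append(f"site:{site}")               # tip 3: restrict to one site
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """URL-encode a query string and append it to an assumed base URL."""
    return f"{base}?q={quote_plus(query)}"


if __name__ == "__main__":
    query = build_query(require=["cat", "raining"], exclude=["dog"])
    print(query)
    print(search_url(query))
```

Building the string first and encoding it last keeps the operators readable while you experiment, and the encoding step takes care of the quotation marks and colons.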

Mashed-up Google


For those who are interested in web search, I recommend a visit to searchmash, a search engine from Google. The search engine isn’t new, but I have “rediscovered” it.

Professor bans Google, Wikipedia


I learnt about this from Phil Bradley’s excellent blog. According to the English local newspaper “The Argus”, a professor at the University of Brighton (Tara Brabazon) banned her students from using Google and Wikipedia on the grounds that these did not represent proper scholarship.

While I can understand that students using sources blindly is rather annoying, I reckon that this kind of ban represents a rather gross misunderstanding of what scholarship in relation to valid sources is.

I would think that it is rather widely accepted that it is the responsibility of each individual doing research to a) find and b) evaluate a source. It is surely up to the individual to evaluate and argue why content is relevant, rather than pander to the whims of their lecturers.

The content of Wikipedia (or any encyclopedia) can hardly be expected to represent anything other than a “scratching of the surface”, but it does provide a low-level introduction to a topic. A quick peek at the Wikipedia article on Tara Brabazon (you can find it on Google, I found it at 2008-01-14:20:05 UTC) informs us that she “[c]urrently gained widespread notoriety for banning Google and Wikipedia from her courses.” Wait for the book version of that before citing, though.

Don’t be evil


Not a lot of people are aware of Google’s corporate motto, “don’t be evil”, but it seems that a lot of librarians actually think that Google is doing the exact opposite. In fact, a while ago I suggested dropping all of our bibliographical databases in favour of Google Scholar*. The response was of course underwhelming, but a few people were really dead against the idea, and not for a good reason (like it not being any good) but for ideological reasons.

In this case, I think that ideology is a dangerous thing because it allows us to think of Google and its ilk as “the enemy”, which they’re clearly not: they’re just a bunch of folk who manage to make information appear accessible to the vast majority of computer-literate people. The thing that bites is that they’re doing this better than the so-called information professionals (read: librarians like us). I know that a lot of people have said this same thing before, but we need to learn from Google, figure out what they’re doing well, emulate it and improve on it (I’ve written about a presentation at ILI2007 that made the point that Google doesn’t do everything well, so there is room for improvement).

One of the very interesting things that I’ve often come across is the perception that keeping track of what a user does in order to present them with a) relevant hits and b) relevant advertising is particularly insidious. I probably think this myself on occasion. But what’s the real problem? Improving the results and tailoring them to a particular audience: isn’t this what we should be doing? I think we should.

Now, you don’t need to be a genius to recognize the value of Bayesian learning in information filtering. A simple implementation of Bayesian learning in your OPAC, tied to a cookie and user profile (and perhaps IP address!), might make a world of difference to the user experience for a lot of people. Imagine being presented with hits on the right topic even if you entered a hopelessly bland keyword. We’re not in it for the money, we’re just here to help, so this doesn’t seem quite so insidious, particularly if you offer users the opportunity to change their profile in a meaningful way.
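
To make the idea a little more concrete, here is a minimal sketch of a naive Bayes re-ranker. It assumes each catalogue record carries a single topic label and that the only thing we remember about a user is which topics they have clicked on. The class name, the record format and the toy catalogue are all invented for illustration; a real OPAC would need proper storage for the profile and a way for users to inspect and reset it.

```python
import math
from collections import Counter, defaultdict


class ProfileRanker:
    """Rank hits by P(topic | user) * P(query words | topic), naive Bayes style."""

    def __init__(self, records, smoothing=1.0):
        # records: list of (title, topic) pairs (a hypothetical, simplified catalogue)
        self.records = list(records)
        self.smoothing = smoothing
        self.word_counts = defaultdict(Counter)  # topic -> word frequencies in titles
        self.vocab = set()
        for title, topic in self.records:
            words = title.lower().split()
            self.word_counts[topic].update(words)
            self.vocab.update(words)
        self.topics = sorted(self.word_counts)
        self.clicks = Counter()  # the user profile: topic -> number of clicks

    def record_click(self, topic):
        """Update the profile when the user opens a record on this topic."""
        self.clicks[topic] += 1

    def log_prior(self, topic):
        # Smoothed share of the user's clicks that fell on this topic.
        total = sum(self.clicks.values()) + self.smoothing * len(self.topics)
        return math.log((self.clicks[topic] + self.smoothing) / total)

    def log_likelihood(self, word, topic):
        # Smoothed frequency of the word among titles on this topic.
        counts = self.word_counts[topic]
        total = sum(counts.values()) + self.smoothing * len(self.vocab)
        return math.log((counts[word] + self.smoothing) / total)

    def rank(self, query):
        """Return matching records, most probable topic for this user first."""
        words = query.lower().split()
        scores = {
            topic: self.log_prior(topic)
            + sum(self.log_likelihood(word, topic) for word in words)
            for topic in self.topics
        }
        hits = [
            (title, topic)
            for title, topic in self.records
            if set(words) & set(title.lower().split())
        ]
        return sorted(hits, key=lambda hit: scores[hit[1]], reverse=True)


if __name__ == "__main__":
    ranker = ProfileRanker([
        ("Java programming basics", "computing"),
        ("Java island travel guide", "travel"),
        ("Coffee from Java", "food"),
    ])
    for _ in range(3):
        ranker.record_click("travel")  # this user keeps opening travel books
    for title, topic in ranker.rank("java"):
        print(topic, "-", title)
```

The bland keyword “java” matches all three records, but once the profile has a few travel clicks in it, the travel guide is the one that floats to the top. Resetting the profile is just a matter of clearing the click counts, which is the sort of control users should be offered.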

It’s not evil, it’s sensible. Be sensible.

* Probably not a very good idea, but when you consider a. the state of play in bibliographic databases, b. the innovation that takes place at Google and c. the ability of most students to use plain old Google instead of a proper source, it maybe doesn’t seem so daft.