May 15, 2006

"Search, Search and More Search" - Google Update

.: Today, Information Today reported on the Google Press Day Webcast, during which four new Google products were announced. Introduction from the article by Barbara Quint, titled "Whither Google? Report on Google's Press Day Webcast":

May 15, 2006 — From all over the country and, indeed, all over the world, reporters came to Mountain View, Calif., last Wednesday to hear what Google executives had to say about where the company was headed and what new treats it had planned for the world of Web users. Even reporters who did not come to California for Google Press Day participated through a 5-hour Webcast with the option to ask questions via e-mail. (By the way, if you’d like to experience what the Webcast watchers did, just grab your popcorn and aim your Windows Media Player or Real Player at Many of the reporters asking questions during the meeting wanted to hear about the imminent war of the titans between Google and Microsoft. They were disappointed. Google executives want the company to address all the new and unaddressed problems in the world of search, not “do over again what’s already been done.” Search, search, and more search is Google’s goal. Four new products, all search-oriented, were announced: Google Trends, Google Gadgets, Google Co-op, and Google Notebook.
Quint singles out Google Co-op as the best of the bunch. From the introduction to her article:
May 15, 2006 — Of the four new product announcements made last week at Google Press Day, Google Co-op looks to have the greatest potential impact. At first glance, it would seem that Google has now entered the social bookmarking arena, along with services like, Furl, Spurl, Shadows, Scuttle, Yahoo! MyWeb 2.0, Ma.gnolia, etc. All of these services, and many others, offer ways for users to share and find collections of linked material built around and by communities of user interests. Regardless of the present quality of Google Co-op—and some users with whom I spoke consider it anemic at this point—the entry of giant Google into this arena, along with Yahoo!, could mark a sea change in the importance and growth of such tools. The product manager for Google Co-op, Shashi Seth, said that, as Google Co-op grows, lessons drawn from its content and usage are expected to lead to improvements in search quality in the main service.

March 13, 2006

Knowledgespeak Updates

.: One of the consequences of working two jobs at the same time is that my inbox is growing faster than I can hit the delete button. Here are a few recent items from the Knowledgespeak news archive, which I have been meaning to post for some time:

  • BioMed Central unveils new online open access journal - "Open access publisher BioMed Central, UK, has announced the launch of Biology Direct, a new online open access journal with a new peer review system. Led by Editors-in-Chief David J Lipman, Director of the National Center for Biotechnology Information (NCBI); Eugene V Koonin, Senior Investigator at NCBI; and Laura Landweber, Associate Professor at Princeton University, the journal seeks to provide authors and readers with a unique system of peer review.

    The journal will cover original research articles, hypotheses and reviews, and is available online at The journal includes publications in the fields of Systems Biology, Computational Biology and Evolutionary Biology, to be soon followed by an Immunology section..."

June 9, 2005

SLA Session - Future of Search Engines (Google View)

:: I attended a session at SLA in Toronto called The Future of Search Engines. One of the speakers was Cathy Gordon, Director of Business Development for Google. After stating that her opinions were her own, and didn't reflect those of Google, she began with a mini infomercial (standard fare, and expected) about the infamous search engine, noting that users can now search in >100 languages, it is the #1 search engine in 17 of 20 countries (didn't say which 17 countries, or the names of the other 3), and that Google powers 70% of all Internet searches. Google doesn't create its own content, tries to knock down spam results, and does not accept payment for inclusion. Users may submit web pages to Google for indexing, and currently it indexes >8 billion pages, and >12,000 news sources.

She mentioned "Premium Content" (apparently a new or forthcoming feature in which Google will somehow get behind subscription firewalls to for-fee dbs), and noted that book search results from Google Print are not integrated into Google web searches, as the pages from the books are not ranked.

In discussing Google Scholar, she said it was created by an engineer looking for scholarly content. Some dbs and full-text scholarly journals are being indexed, along with theses, dissertations, books, technical reports, and other material. When asked about a source list of what is indexed in Google Scholar, she said she didn't know exactly why this still isn't being offered by Google. GS uses link resolvers to allow for access to articles on an IP-authenticated machine (where the institution subscribes to the publication in which the article is found).

As for the topic of the session itself, Gordon believes that users will continue to demand more control, while the search engine will become more personal and sophisticated but must remain simple to use. The depth, breadth and type of searchable content will continue to expand, and the challenge is to present this variety of information in a coherent and cohesive manner. Geographic and language barriers will continue to decrease, and desktop tethering will be eliminated - i.e., a desktop computer will no longer be needed to search as searching becomes ubiquitous. To this end, Google offers the option of personalizing your Google homepage, along with My Search History.

She summarized as follows: searching remains a primitive function, users must be at the centre of improvements, the increasing amounts of searchable information will require innovative solutions to manage its complexity, and searching must be accessible at any time, from any location, using any device.

December 14, 2004

Harvard Libraries and Google Announce Pilot Digitization Project

:: The rest of the title reads: "...with Potential Benefits to Scholars Worldwide." I hope the use of "Scholars" doesn't upset ACS. ;-) Harvard is the first in what could become a series of major libraries to collaborate with Google:

Harvard University is embarking on a collaboration with Google that could harness Google's search technology to provide to both the Harvard community and the larger public a revolutionary new information location tool to find materials available in libraries. In the coming months, Google will collaborate with Harvard's libraries on a pilot project to digitize a substantial number of the 15 million volumes held in the University's extensive library system. Google will provide online access to the full text of those works that are in the public domain. In related agreements, Google will launch similar projects with Oxford, Stanford, the University of Michigan, and the New York Public Library. An FAQ detailing the Harvard pilot program with Google is available at and on the Harvard home page.

The Harvard pilot will provide the information and experience on which the University can base a decision to launch a large-scale digitization program. Any such decision will reflect the fact that Harvard's library holdings are among the University's core assets, that the magnitude of those holdings is unique among university libraries anywhere in the world, and that the stewardship of these holdings is of paramount importance. If the pilot is deemed successful, Harvard will explore a long-term program with Google through which the vast majority of the University's library books would be digitized and included in Google's searchable database. Google will bear the direct costs of digitization in the pilot project.

Larry Page, co-founder of Google, noted that "we dreamed of making the incredible breadth of information that librarians so lovingly organize to be searchable online":
"Even before we started Google, we dreamed of making the incredible breadth of information that librarians so lovingly organize to be searchable online. Today we're pleased to announce this program with these prestigious libraries to digitize their collections so that every Google user can search them instantly," said Larry Page, Google co-founder and president of products.

Page continued, "Our work with libraries further enhances the existing Google Print program which enables users to find matches within the full text of books, while publishers and authors monetize that information. Google's mission is to organize the world's information, and we're excited to be working with libraries to make even more of it available to Google users worldwide."

Now begins another round in the ongoing series of speculations and musings about Google and its place in the world of libraries and reference and collection work and information retrieval. Wait a minute - Google is swallowing our online catalogues! What happens next? Will these collections also appear in Google Scholar? (Will Google Scholar change its name to Google Educated, or Google Learned?)

If Google starts adding digitized collections the size of Harvard's, what are the copyright implications? How might the option to access the full text of millions of titles via Google affect book sales?

November 18, 2004

Google Scholar (Beta)

:: In case you haven't heard, Google has released its latest product, Google Scholar:

Google Scholar enables you to search specifically for scholarly literature, including peer-reviewed papers, theses, books, preprints, abstracts and technical reports from all broad areas of research. Use Google Scholar to find articles from a wide variety of academic publishers, professional societies, preprint repositories and universities, as well as scholarly articles available across the web.

Just as with Google Web Search, Google Scholar orders your search results by how relevant they are to your query, so the most useful references should appear at the top of the page. This relevance ranking takes into account the full text of each article as well as the article's author, the publication in which the article appeared and how often it has been cited in scholarly literature. Google Scholar also automatically analyzes and extracts citations and presents them as separate results, even if the documents they refer to are not online. This means your search results may include citations of older works and seminal articles that appear only in books or other offline publications.

Analysis and response have been swift, from Shirl Kennedy and Gary Price at Resource Shelf and Danny Sullivan at SearchEngineWatch. The release comes only three days after OCLC and Yahoo! announced their free toolbar, which allows searches of OCLC WorldCat as well as Yahoo! Search's web search engine.

October 13, 2004

All of OCLC's WorldCat Heading Toward the Open Web

:: Some months ago, I wrote about Yahoo! and Google's inclusion of OCLC WorldCat records. In 2003, Barbara Quint wrote of the OCLC test project, where approximately 2 million of the 53 million+ records on OCLC WorldCat were made available via Google. Quint follows up with a report that the pilot project has been a success, and as a result, OCLC will open its entire collection of 53.3 million items for "harvesting" by Google and Yahoo!:

Excited by the "resounding success" of the Open WorldCat pilot program, the management of OCLC, the world's largest library vendor, has decided to open the entire collection of 53.3 million items connected to 928.6 million library holdings for "harvesting" by Google and Yahoo! Search. A letter from Jay Jordan, president and CEO of OCLC, went out to members on Oct. 8. Currently, the Open WorldCat subset database contains about 2 million records, all items held by 100 or more academic, public, or school libraries, some 12,000 libraries all told. The new upgraded Open WorldCat program will automatically include all OCLC libraries contributing ownership information (holdings) to WorldCat, unless the library asks to have its holdings excluded. In January 2005, Open WorldCat will officially graduate from a pilot program to a permanent "ongoing program"; however, the database will be open for "harvesting" to Google and Yahoo! Search as early as late November 2004.