Friday, February 19, 2010

VALA 2010 - user generated content

The final plenary session for Wednesday was a panel discussion entitled Top Trends which primarily dealt with user generated content - comments, tags, search terms. Below are notes of some of the discussion.

Increasingly users have the opportunity to add comments to content on the web. It was suggested that comments could be treated as letters to the editor. In some cases the comments can also result in a dialogue among users - users responding to other users' comments - a forum without having a forum. Often useful, previously unknown information can be provided by users, particularly in regard to images, and can enhance the information provided. The user information can provide opportunities for further research by the institution. An example is the images that Powerhouse Museum has placed on Flickr - often useful information has been provided by the users.

Tagging records is available in many databases but generally this has not been taken up by users; there is not a culture of using tags in library databases. Institutions tend not to moderate the tags unless unacceptable words are added. In some cases users can delete tags of other users.

At the National Library user data is not moderated - user data is another layer from the original data. It is proposed to encourage annotation of catalogue records by users in the future.

OCLC WorldCat, a site that allows users to search for books in libraries throughout the world, is planning to separate LC headings and sub-headings into facets instead of one long term with subdivisions. For example, geographic subdivisions would be treated as a separate facet. Tags can be used to modernise LC terms, and synonym lists will be used. These can be blended in tag clouds. It was suggested that for some detailed subjects user tagging may be the only way to provide adequate search terms.
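
As a rough illustration of what separating one long heading into facets could look like, the sketch below breaks a subdivided heading into its parts. It assumes the subdivisions are joined by a double hyphen, which is how they are conventionally displayed; the function name and the sample heading are made up for illustration, and this is not a description of how WorldCat does it.

```python
def split_heading(heading: str) -> list[str]:
    """Split a subdivided subject heading into its separate parts."""
    # Assumption: subdivisions are joined by a double hyphen ("--").
    # A single hyphen, as in a date range, is left alone.
    return [part.strip() for part in heading.split("--") if part.strip()]


if __name__ == "__main__":
    # Hypothetical heading, used only to show the effect of the split.
    for part in split_heading("World War, 1914-1918--Campaigns--Turkey"):
        print(part)
```

Deciding which part is geographic, chronological or topical would need more than a split, for example the coding in the original catalogue record.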

It was suggested that new classification schemes may be needed for crowdsourcing. However, another member of the audience said that, as the Dewey Decimal Classification (DDC) was available in 30 languages and could therefore be considered a universal scheme, the stem numbers without geographical and other additions might be used.

Micro tagging will also be useful for helping people locate information.

The terms people enter when searching can also be tracked, and this can be useful for suggesting additional search terms for an item, as the public uses its own language.
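
One way this kind of tracking might work is sketched below: each time a user searches and then opens an item, the search term is counted against that item, and terms used often enough become suggestions. All names here (`record_search`, `suggested_terms`, the `minimum_uses` threshold) are hypothetical and not taken from any system mentioned at the conference.

```python
from collections import Counter, defaultdict

# Maps an item identifier to a count of the search terms that led users to it.
terms_by_item: dict[str, Counter] = defaultdict(Counter)


def record_search(query: str, selected_item: str) -> None:
    """Log that a user who searched for `query` went on to open `selected_item`."""
    terms_by_item[selected_item][query.strip().lower()] += 1


def suggested_terms(item: str, minimum_uses: int = 2) -> list[str]:
    """Return terms used at least `minimum_uses` times to reach `item`, most used first."""
    counts = terms_by_item.get(item, Counter())
    return [term for term, n in counts.most_common() if n >= minimum_uses]
```

The threshold is there so that a single odd search does not end up attached to a record; the right value would depend on how heavily the catalogue is used.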

The discussion moved on to how to get people to use tags and comments. Although the opportunity to apply tags and comments may not generally be used in library catalogues, users are doing this in resources such as LibraryThing. People who make comments have an interest in the topic.

It is necessary to provide useful tools that are easy for people to use and then make them available where the users are. An example of well used user content is the ability to correct the OCR for Australian Newspapers.

Linkages, trails and themes were also mentioned. Picture Australia and Music Australia currently utilise trails. It is planned that user created themes will become available in Trove.

The Internet currently has content and tools for people to use. It is now necessary to bring together the range of collections. Generally people do not care where the information comes from - they just want the information.

The final summation reflected on the 'power of communities that want to be involved. We have the tools. Go for it.'

Sunday, February 14, 2010

VALA 2010 - Wednesday afternoon

Many of the collections in the National Library and the Australian War Memorial are now accessible to the public via the Internet.

Developing Trove, the policy and technical challenges
Trove, a discovery service for the public, links metadata from a variety of digital collections enabling items to be discovered in one search. Launched by the National Library as Prototype in 2009 and later rebranded as Trove, the site is constantly evolving.

The National Library already had a number of digital online collections including Libraries Australia, Picture Australia, Australian Newspapers, Music Australia, Pandora and the library catalogue. Trove has been designed to provide access to all these and other digital collections in one search. Most of the stand alone services will be phased out.

After conducting a search, eight collection views are provided - Books, Journals, magazines, articles..., Pictures and photos, Australian Newspapers (1803-1954), Music, sound and video, Archived websites (1996-), Diaries, letters, archives..., About people and organisations.

A feature of the site is user interaction. This is especially seen in the digitised newspaper section where users can correct the OCR. Users can also merge or split records, where appropriate, in Libraries Australia. Future developments will include RSS feeds, enhanced sorting, API, journal articles, e-resources from partner vendors and texts of guides to collections.

Digital preservation: the problems and issues involved in publishing private records online: lessons learnt from the web publishing of the notebooks and diaries of C.E.W. Bean

The Australian War Memorial began digitising collections ten years ago. Charles Edwin Woodrow Bean was Australia's official war correspondent during the First World War and later wrote the official history of Australia's involvement in the war based on his diaries and the unit commanders' diaries.

The digitised war diaries of C.E.W. Bean were made available online on the Australian War Memorial website on Remembrance Day 2009. The digitisation of the notebooks and diaries commenced in 2003 and was completed in 2004. There were 286 volumes and the information in them has been made available to the public as PDFs.

Copyright was a major issue as the status of the material was confusing. The Commonwealth had copyright of the material created by Bean when employed as an historian by the Commonwealth Government. The diaries, however, were private papers gifted to the Australian War Memorial in 1942 with the understanding that they could be made available to the public after Bean's death. Bean later changed this condition to 30 years after his death (he died in 1968). The collection was reopened in 1981 and Bean's family gave permission for publication. The correspondence and ephemera among the papers created additional copyright issues. Section 200AB of the Copyright Act was used to allow publication of these items. The website provides a detailed copyright statement regarding the publication of the records.

The Australian War Memorial has also published online the Official history of Australia in the War of 1914-1918.

Also available online are the digitised copies of the First World War diaries of the unit commanders. A demonstration of this part of the site was provided during the afternoon tea break. The original index to the diaries will also be made available.

A database relating to indigenous servicemen will shortly be online. The Australian War Memorial site has a growing number of image and biographical databases and federated searching will soon be introduced for searching the material.

Later in the afternoon Stephanie Orlic from The Louvre spoke about a project entered into with a Japanese company to use technology to explore more fully items in a collection including 3D representations of items, enlargement of sections and commentary in a variety of languages. Cycle 1 of The Louvre - DNP Museum Lab ran from 2006 to 2009. Cycle 2 will commence in April.

Saturday, February 13, 2010

VALA 2010 - Tuesday afternoon (b)

The afternoon sessions continued with information about another site using federated search, followed by an introduction to the use of the semantic web and linked data.

All aboard ParlInfo: the journey towards integrated access to bibliographic and full text information from the Parliament of Australia
ParlInfo has been developed to provide federal government information to members of parliament, those working in government and also the general public. It was designed to incorporate changes in user expectations and utilise new developments in information technology including social networking. The site uses federated searching plus Google-like functionality. Search options include Basic Search, Advanced Search and Guided (canned) Search. The user can also browse collections and then narrow the search using provided options. Collections include Bills, Hansard, Notice papers, House of Representatives votes and procedures, Senate journals, Biographies of current members, Library catalogue and Parliament of Australia website. The site recognises internal and external users, and external users need to log on, especially if they want to use web 2.0 features including RSS and alerts. The site is still a work in progress. Future developments will be the digitisation of all Hansards and old Bills. The Parliament of Australia website is also being updated.
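
Federated searching, in general terms, means sending one query to several separate collections and bringing the results back together. The sketch below shows the idea only; it is not how ParlInfo is built. The collections are represented by toy functions over fixed lists of titles, and every name in it (`search_all`, `make_searcher`, `Searcher`) is made up for illustration.

```python
from typing import Callable

# Each collection is represented by a function that takes a query and
# returns a list of matching titles. These are stand-ins, not real services.
Searcher = Callable[[str], list[str]]


def search_all(query: str, collections: dict[str, Searcher]) -> dict[str, list[str]]:
    """Run one query against every collection and group the results by collection."""
    results = {}
    for name, search in collections.items():
        try:
            results[name] = search(query)
        except Exception:
            # One unavailable collection should not spoil the whole search.
            results[name] = []
    return results


def make_searcher(titles: list[str]) -> Searcher:
    """Build a toy searcher over a fixed list of titles (case-insensitive match)."""
    return lambda query: [t for t in titles if query.lower() in t.lower()]
```

Grouping results by collection, rather than merging them into one ranked list, avoids having to compare relevance scores that come from different systems.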

Next up? the linked content economy
The highlight of the afternoon was the plenary session presented by Thomas Tague from Thomson Reuters, USA. The session provided a preview of the next development in the use of human knowledge systems. Tom Tague described the first stage of the web as collecting content and putting it online, with the second stage placing the emphasis on social networking. This has resulted in a mixture of information, often making it harder to discover what we are looking for. Semantic metatagging will help users locate the information required from the mass of information available online.

Wikipedia has an article describing the Semantic Web and another on Linked Data, two of the concepts described in the talk.

Thomson Reuters purchased software for generating semantic metadata and in January 2008 made it available as open source software, OpenCalais. The Calais Viewer is available for potential users of the software to see how it works. A document is submitted and the software automatically generates tags relating to events, people and facts, linking the information in the document to information on other sites on the web.

An example of OpenCalais being used to prepare and present auto-generated tags is a search for telescopes on the Powerhouse Museum website. Other projects using Calais include Media Cloud and DocumentCloud. The metadata generated via Calais is kept by Thomson Reuters for further projects.

The use of auto-generated tags and linked data greatly expands online research opportunities; historical and genealogical research were two fields mentioned where this could be used. Discussion also centred around using OpenCalais with WorldCat. The talk provided an introduction to the possibilities of future development of the web - Web 3.0.

Friday, February 12, 2010

VALA 2010 - Tuesday afternoon (a)

The VALA 2010 conference with the theme, Connections, Content, Conversations, produced an assortment of papers about IT related developments in libraries and information organisations and advancements in electronic resources. The sessions provided an overview of some of the uses of information technology and resources currently in use plus possibilities for the future. Although this was predominantly a conference for librarians, the speakers, especially the keynote speakers, were from a range of organisations and it was stressed at a number of the sessions that information technology crossed institutional boundaries. A summary of sessions I attended on the Tuesday and Wednesday afternoons of the conference is provided in this and subsequent posts. The first two papers investigated the use of e-books and a catalogue using federated searching in academic libraries but the results are also of interest to a wider audience.

Ebook usage at Curtin University Library: patterns, projections and strategy
E-books have been around for a while now and an analysis of the use of e-books at Curtin University provided information on the uptake of online material in an academic environment. Curtin University started purchasing e-books in 2003. Although the e-book collection has expanded, particularly since 2007, the electronic book format is secondary to the print version. E-journals, however, have become a dominant format for providing access to periodical literature.

Two major e-book collections described were electronic copies of textbooks and other items in the reserve collection where there would be high usage and a collection of research material in e-book format for ongoing use. Graphs from statistics of usage of the collections were provided along with a description of selecting e-book materials. Generally titles selected directly by the library staff experienced higher usage than items collected in a subject specific or other form of group purchase. The graphs showed that usage of the student e-books varied from one semester to another though overall usage did increase significantly in 2009. The numbers of students in classes and lecturers recommending the e-book titles were suggested factors affecting these figures. [Another possible factor for the variance in usage patterns could be that different subjects are offered in each semester]. It was concluded that more analysis is required.

Beyond the grave: where to with Gen (wh)Y?
The glossary on the University of Western Sydney Library website describes their Library Search Box as enabling 'a simultaneous search of a selection of Library databases, a number of web resources and the Library catalogue with additional refining and discovery features.' The talk demonstrated and discussed the new catalogue utilising federated searching. Users have been encouraged to provide feedback and this has provided information on the acceptance and usage of the new features. Requests were made to also retain the 'classic' catalogue so a link to that version will be available until Easter.

Using the Library Search Box provides not just a list of possible items relating to the search term but also a tag cloud which students appear to like and the ability to refine the search using options relating to library format, subject, date and geographic region. The ability for users to tag items is also provided but not widely used so far. Images of book jackets are provided for most entries and icons show type of material. Linked tags in the record help in searching and tables of contents provide additional information about the book contents. There is also the ability to export the details of the record to EndNote or RefWorks. The search box for the catalogue is prominent on the library home page but as 80% of students do not use the library home page a library search box is located on all pages of the website. Although the new catalogue has additional features and access to a wider range of resources in one search, statistics show probably only a 5% increase in usage over the 'classic' catalogue. Usage statistics have also shown that some students tried using the library search box for searching for other library information such as hours of opening.
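
For readers curious about how a tag cloud decides how big to draw each tag, a common approach is to scale the font size with how often the tag occurs. The sketch below is a generic illustration under that assumption, not the method used by the Library Search Box; the function name and the default sizes are made up.

```python
def tag_sizes(counts: dict[str, int], smallest: int = 10, largest: int = 24) -> dict[str, int]:
    """Map each tag's usage count to a font size between `smallest` and `largest`."""
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    if low == high:
        # Every tag is equally common, so give them all the same middle size.
        return {tag: (smallest + largest) // 2 for tag in counts}
    # Spread the sizes evenly between the least and most used tags.
    scale = (largest - smallest) / (high - low)
    return {tag: round(smallest + (n - low) * scale) for tag, n in counts.items()}
```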