Planet Code4Lib

MarcEdit 6 Update / Terry Reese

Happy Thanksgiving to those celebrating.  Rather than overindulging in food, my family and I spent our day relaxing and enjoying some downtime together.  After everyone went to bed, I had a little free time and decided to wrap up the update I've been working on.  This update includes the following changes:

  • Language File changes.
  • Export/Delete Selected Records: UI changes
  • Biblinker — updated the tool to provide support for linking to FAST headings when available in the record
    • Updated the fields processed (targeted to ignore uncontrolled or local items)
  • Z39.50 Client — Fixed a bug where, when a single search was run against multiple selected databases and the number of results exceeded the data limit, blank data would be returned.
  • RDA Helper Bug Fix — Fixed an error where, under certain conditions, bracketed data would be incorrectly parsed.
  • Miscellaneous UI changes to support language changes

 

The language file changes represent a change in how internationalization of the interface works.  Master language files are now hosted on GitHub, with new files added on update.  The language files are automatically generated, so they are not as good as if they were done by an individual – though some individuals are looking at the files and providing updates.  My hope is that this process of automated language generation, coupled with human intervention, will significantly help non-English speakers.  But I guess time will tell.

The download can be found by using the automated update tool in MarcEdit, or by downloading the update from: http://marcedit.reeset.net/downloads/

Code4LibBC Day 1: Lightning Talks Part 2 / Cynthia Ng

Code4LibBC mornings are all about lightning talks. Here's Part 2. AtoM's XML-to-XSLT conversion feature for creating user-friendly PDF finding aids (Dan Gillean): What is AtoM? Access to Memory; migrating SFU Archives and Special Collections to Archivematica and AtoM; researchers browse whole descriptions into PDF finding aids; transform EAD output into PDF; EAD + XSLT + XSL-FO / … Continue reading Code4LibBC Day 1: Lightning Talks Part 2

Code4LibBC Day 1: Lightning Talks Part 1 / Cynthia Ng

The first morning of Code4LibBC is all about lightning talks. Here's Part 1. Adapt, integrate, collaborate: Applying lessons from Battlestar Galactica to academic libraries (Gordon Coleman): a job talk, part of the interview process; imagine a future where challenges in acquisitions and serials are solved with one system; Acquisition Systems in 2024: Discovery/ERM/KB/LR/proxy fusion initiator; file … Continue reading Code4LibBC Day 1: Lightning Talks Part 1

Analysing library data flows for efficient innovation / Lukas Koster

In my work at the Library of the University of Amsterdam I am currently taking a step forward by actually taking a step back from a number of forefront activities in discovery, linked open data and integrated research information towards a more hidden, but also more fundamental enterprise in the area of data infrastructure and information architecture. All for a good cause, for in the end a good data infrastructure is essential for delivering high quality services in discovery, linked open data and integrated research information.
In my role as library systems coordinator I have become more and more frustrated with the huge amounts of time and effort spent on moving data from one system to another and shoehorning one record format into the next, merely to deliver the necessary everyday services of the university library. Not only is it impossible to invest this time and effort productively in innovative developments, but this fragmented system and data infrastructure is also completely unsuitable for fundamental innovation. Moreover, information provided by current end user services is fragmented as well. Systems are holding data hostage. I have mentioned this problem before in a SWIB presentation. The issue was also recently touched upon in an OCLC Hanging Together blog post: "Synchronizing metadata among different databases".

Fragmented data (SWIB12)

In order to avoid confusion in advance: when using the term “data” here, I am explicitly not referring to research data or any other specific type of data. I am using the term in a general sense, including what is known in the library world as “metadata”. In fact this is in line with the usage of the term “data” in information analysis and system design practice, where data modelling is one of the main activities. Research datasets as such are to be treated as content types like books, articles, audio and people.

It is my firm opinion that libraries have to focus on making their data infrastructure more efficient if they want to keep up with the ever changing needs of their audience and invest in sustainable service development. For a more detailed analysis of this opinion see my post “(Discover AND deliver) OR else – The future of the academic library as a data services hub”. There are a number of different options to tackle this challenge, such as starting completely from scratch, which would require huge investments in resources for a long time, or implementing some kind of additional intermediary data warehouse layer while leaving the current data source systems and workflows in place. But for all options to be feasible and realistic, a thorough analysis of a library’s current information infrastructure is required. This is exactly what the new Dataflow Inventory project is about.

The project is being carried out within the context of the short term Action Plans of the Digital Services Division of the Library of the University of Amsterdam, and specifically the “Development and improvement of information architecture and dataflows” program. The goal of the project is to describe the nature and content of all internal and external datastores and dataflows between internal and external systems in terms of object types (such as books, articles, datasets, etc.) and data formats, thereby identifying overlap, redundancy and bottlenecks that stand in the way of efficient data and service management. We will be looking at dataflows in both front and back end services for all main areas of the University Library: bibliographic, heritage and research information. Results will be a logical map of the library data landscape and recommendations for possible follow up improvements. Ideally it will be the first step in the Cleaning-Reconciling-Enriching-Publishing data chain as described by Seth van Hooland and Ruben Verborgh in their book “Linked Data for Libraries, Archives and Museums”.

The first phase of this project is to decide how to describe and record the information infrastructure in such a form that the data map can be presented to various audiences in a number of ways, and at the same time can be reused in other contexts in the long run, for instance for designing new services. For this we need a methodology and a tool.

At the university library we do not have any thorough experience with describing an information infrastructure on an enterprise level, so in this case we had to start with a clean slate. I am not at all sure that we came up with the right approach in the end. I hope this post will trigger some useful feedback from institutions with relevant experience.

Since the initial and primary goal of this project is to describe the existing infrastructure rather than a desired new situation, the first methodological area to investigate appears to be Enterprise Architecture (interesting to see that Wikipedia states "This article appears to contain a large number of buzzwords"). Because it is always better to learn from other people's experiences than to reinvent all four wheels, we went looking for similar projects in the library, archive and museum universe. This proved to be rather problematic. There was only one project we could find that addresses a similar objective, and I happened to know one of the project team members. The Belgian "Digital library system's architecture study" (English language report here) was carried out for the Flemish Public Library network Bibnet, by Rosemie Callewaert among others. Rosemie was kind enough to talk to me and explain the project's objectives, approach, methods and tools. For me, two outcomes of this talk stand out: the main methodology used in the project is Archimate, an Enterprise Architecture methodology, and the approach is the reverse of ours: it starts from the functional perspective, whereas we start from an overview of the actual implemented infrastructure. This last point meant we were still looking at a predominantly clean slate.
Archimate also turned out to be the method of choice of the University of Amsterdam central enterprise architecture group, which we also contacted. It became clear that in order to use Archimate efficiently, it is necessary to spend a considerable amount of time on mastering the methodology. We looked for some accessible introductory information to get started. However, the official Open Group Archimate website is not as accessible as desired, in more than one way. We managed to find some documentation anyway, for instance the direct link to the Archimate specification and the free document "Archimate made practical". After studying this material we found that Archimate is a comprehensive methodology for describing business, application and technical infrastructure components, but we also came to the conclusion that for our current short-term project presentation goals we needed something that could be implemented fairly soon. We will keep Archimate in mind for the intermediate future. If anybody is interested, there is a good free open source modelling tool available, Archi. Other Enterprise Architecture methodologies like Business Process Modelling focus more on workflows than on existing data infrastructures. Turning to system design methods like UML (Unified Modelling Language), we see similar drawbacks.

An obvious alternative technique to consider is Dataflow Diagramming (DFD) (what's in a name?), part of the Structured Design and Structured Analysis methodology, which I had used in previous jobs as a systems designer and developer. Although DFDs are normally used for describing functional requirements on a conceptual level, with some tweaking they can also be used for describing actual system and data infrastructures, similar to the Archimate Application and Infrastructure layers. The advantage of the DFD technique is that it is quite simple. Four elements are used to describe the flow of information (dataflows) between external entities, processes and datastores. The content of dataflows and datastores can be specified in more detail using a data dictionary. The resulting diagrams are relatively easy to comprehend. We decided to start by using DFDs in the project. All we had left to do was find a good and not too expensive tool for it.

Basic DFD structure

There are basically two types of tools for describing business processes and infrastructures: drawing tools, focused on creating diagrams, and repository-based modelling tools, focused on reusing the described elements. The best known drawing tool must be Microsoft Visio, because it is part of the widely used Office suite. There are a number of other commercial and free tools, among which the free Google Drive extension Draw.io. Although most drawing tools cover a wide range of methods and techniques, they don't usually support reuse of elements with consistent characteristics in other diagrams. Also, diagrams are just drawings; they can't be used for generating data definition scripts or basic software modules, for reverse engineering, or for flexible reporting. Repository-based tools can do all these things. Reuse, reporting, generation, reverse engineering, and import and export features are exactly the features we need. We also wanted a tool that supports a number of other methods and techniques for use in other areas of modelling, design and development. There are some interesting free or open source tools, like OpenModelSphere (which supports UML, ERD data modelling and DFD), and a range of commercial tools. To cut a long story short, we selected the commercial design and management tool Visual-Paradigm because it supports a large number of methodologies with an extensive feature set, in a number of editions, for reasonable fees. An additional advantage is the online shared teamwork repository.

After acquiring the tool we had to configure it the way we wanted to use it. We decided to try and align the available DFD model elements to the Archimate elements so it would in time be possible to move to Archimate if that would prove to be a better method for future goals. Archimate has Business Service and Business Process elements on the conceptual business level, and Application Component (a “system”), Application Function (a “module”) and Application Service (a “function”) elements on the implementation level.

Basic Archimate Structure

In our project we will mainly focus on the application layer, but with relations to the business layer. Fortunately, the DFD method supports a hierarchical process structure by means of the decomposition mechanism, so the two hierarchical structures Business Service – Business Process and Application Component – Application Function – Application Service can be modeled using DFD. There is an additional direct logical link between a Business Process and the Application Service that implements it. By adding the “stereotypes” feature from the UML toolset to the DFD method in Visual Paradigm, we can effectively distinguish between the five process types (for instance by colour and attributes) in the DFD.

Archimate DFD alignment

So in our case, a DFD process with a “system” stereotype represents a top level Business Service (“Catalogue”, “Discover”, etc.) and a “process” process within “Cataloguing” represents an activity like “Describe item”, “Remove item”, etc. On the application level a “system” DFD process (Application Component) represents an actual system, like Aleph or Primo, a “module” (Application Function) a subsystem like Aleph CAT or Primo Harvesting, and a “function” (Application Service) an actual software function like “Create item record”.
A DFD datastore is used to describe the physical permanent and temporary files or databases used for storing data. In Archimate terms this would probably correspond with a type of “Artifact” in the Technical Infrastructure layer, but that might be subject for interpretation.
Finally an actual dataflow describes the data elements that are transferred between external entities and processes, between processes, and between processes and datastores, in both directions. In DFD, the data elements are defined in the data dictionary in the form of terms in a specific syntax that also supports optionality, selection and iteration, for instance:

  • book = title + (subtitle) + {author} + publisher + date
  • author = name + birthdate + (death date)

etc.
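To give a flavour of how such terms could be recorded for reuse and reporting, here is a minimal, hypothetical Python sketch; it is not part of the project's actual tooling, and the class and field names are illustrative only:

    # Hypothetical sketch: recording DFD data dictionary terms in code.
    # "(...)" in the DFD notation maps to optional=True, "{...}" to repeating=True.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Element:
        name: str
        optional: bool = False   # e.g. "(subtitle)"
        repeating: bool = False  # e.g. "{author}"

    @dataclass
    class Term:
        name: str
        elements: List[Element] = field(default_factory=list)

    book = Term("book", [
        Element("title"),
        Element("subtitle", optional=True),
        Element("author", repeating=True),
        Element("publisher"),
        Element("date"),
    ])

    author = Term("author", [
        Element("name"),
        Element("birthdate"),
        Element("death date", optional=True),
    ])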
In Archimate there is a difference in flows between the Business and Application layers. In the Business layer a flow can be specified by a Business Object, which indicates the object types that we want to describe, like "book", "person", "dataset", "holding", etc. The Business Object is realised as one or more Data Objects in the Application Layer, thereby describing actual data records representing the objects transferred between Application Services and Artifacts. In DFD there is no difference between a business flow and a dataflow. In our project we particularly want to describe business objects in dataflows and datastores to be able to identify overlap and redundancies. But besides that we are also interested in differences in data structure used for similar business objects. So we do have to distinguish between business and data objects in the DFD model. In Visual-Paradigm this can be done in a number of ways. It is possible to add elements from other methodologies to a DFD, with links between dataflows or datastores and the added external elements. Data structures like this can also be described in Entity Relationship Diagrams, UML Class Diagrams or even RDF Ontologies.
We haven’t decided on this issue yet. For the time being we will employ the Visual Paradigm Glossary tool to implement business and data object specifications using Data Dictionary terms. A specific business object (“book”) will be linked to a number of different dataflows and datastores, but the actual data objects for that one business object can be different, both in content and in format, depending on the individual dataflows and datastores. For instance a “book” Business Object can be represented in one datastore as an extensive MARC record, and in another as a simple Dublin Core record.
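To make that distinction concrete, here is a hypothetical illustration (simplified, not real records from our systems) of one "book" Business Object realised as two different Data Objects in two datastores:

    # Hypothetical illustration: the same "book" business object realised as
    # two different data objects (simplified, not actual catalogue records).
    book_as_marc = {  # extensive MARC-style record (tag: subfields)
        "100": {"a": "van Hooland, Seth"},
        "245": {"a": "Linked Data for Libraries, Archives and Museums",
                "c": "Seth van Hooland and Ruben Verborgh"},
        "700": {"a": "Verborgh, Ruben"},
    }

    book_as_dc = {  # simple Dublin Core record
        "title": "Linked Data for Libraries, Archives and Museums",
        "creator": ["Seth van Hooland", "Ruben Verborgh"],
        "type": "Text",
    }

The business object is the same "book"; the data objects differ in both content and format, which is exactly the kind of overlap and difference the data map should expose.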

Example bibliographic dataflows

Having determined the method, tool and configuration, the next step is to start gathering information about all relevant systems, datastores and dataflows and describing this in Visual Paradigm. This will be done by drawing on our own internal Digital Services Division expertise, reviewing applicable documentation, and, most importantly, interviewing internal and external domain experts and stakeholders.
Hopefully the resulting data map will provide enough insight to lead to real efficiency improvements and genuinely innovative services.

Does Google think Your Library is Mobile Friendly? / LibUX

If your users are anything like mine, then

  • no one has your website bookmarked on their home-screen
  • your url is kind of a pain to tap-out

and consequently inquiries about business hours and location start not on your homepage but in a search bar. As of last Tuesday (November 18th), searchers on a mobile device are given a heads-up that this or that website is "mobile friendly." Since we know how picky mobile users are (spoiler: very), we need to assume that more often than not users will avoid search results if a website isn't tailored for their screen. A mobile-friendly result looks like this:

mobile-friendly

The criteria from the announcement are that the website

  • Avoids software that is not common on mobile devices, like Flash
  • Uses text that is readable without zooming
  • Sizes content to the screen so users don’t have to scroll horizontally or zoom
  • Places links far enough apart so that the correct one can be easily tapped

and we should be grateful that this is low-hanging fruit. The implication that a website is not mobile friendly will certainly ward off clickthroughs, which for public libraries especially may have larger repercussions.
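Before reaching for Google's own test (more on that below), you can get a rough first read on a couple of these criteria with a quick script. This is only a heuristic sketch, assuming the requests and BeautifulSoup libraries are available, and it is in no way equivalent to Google's actual evaluation:

    # Rough heuristic only: checks for a viewport meta tag and for Flash embeds.
    # Not Google's mobile-friendly test; just a quick first pass.
    import requests
    from bs4 import BeautifulSoup

    def quick_mobile_check(url):
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "html.parser")

        has_viewport = soup.find("meta", attrs={"name": "viewport"}) is not None
        uses_flash = any(
            tag.get("type") == "application/x-shockwave-flash"
            or str(tag.get("data", "")).endswith(".swf")
            or str(tag.get("src", "")).endswith(".swf")
            for tag in soup.find_all(["object", "embed"])
        )
        return {"viewport meta tag": has_viewport, "flash content": uses_flash}

    print(quick_mobile_check("http://example.org"))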

Your website has just 2 seconds to load at your patron’s point of need before a certain percentage will give up. This may literally affect your foot traffic. Rather than chance the library being closed, your patron might just change plans. Mobile Users Are Demanding

You can test if your site meets Googlebot’s standards. Here’s how the little guy sees the New York Public Library:

nypl-mobile-friendly

Cue opportunistic tangent about pop-ups

On an unrelated note, the NYPL is probably missing out on more donations than they get through that pop-up. People hate pop-ups, viscerally.

Users not only dislike pop-ups, they transfer their dislike to the advertisers behind the ad and to the website that exposed them to it. In a survey of 18,808 users, more than 50% reported that a pop-up ad affected their opinion of the advertiser very negatively and nearly 40% reported that it affected their opinion of the website very negatively. The Most Hated Advertising Techniques

And, in these circumstances, the advertiser is the library itself. ( O_o )

At least Googlebot thinks they’re mobile friendly.

The post Does Google think Your Library is Mobile Friendly? appeared first on LibUX.

10 Historic Thanksgiving Celebrations / DPLA

Happy Thanksgiving, DPLA friends! No matter how you choose to celebrate, we hope it’s a good one. In celebration, a selection of the best Thanksgiving day photos from the DPLA, showing how American families have celebrated for more than 100 years.

Children gather round their mother, cooking the Thanksgiving turkey, 1951.
Thanksgiving at the Home for the Friendless, New York.
Georgia Tech football players on the bench at the Thanksgiving Day game, 1945.
Salvation Army Thanksgiving, 1916.
U.S. troops eat Thanksgiving dinner, 1981.
The Golden Age Group’s Thanksgiving dinner, 1955.
FDR carving the Thanksgiving turkey, 1933.
Cooks, with the turkeys they prepared, for a “military Thanksgiving,” 1918.
“The Car-Driver’s Thanksgiving,” 1877. 
Claremont California residents dressed as pilgrims for Thanksgiving, 1958.

hosted pbx denver / FOSS4Lib Upcoming Events

Date: 
Thursday, November 27, 2014 (All day)

Last updated November 27, 2014. Created by fredwhite on November 27, 2014.

Get a VoIP connection for business or residential use. For more information about phone systems and VoIP connections, visit here.

“Managing Monsters”? Academics and Assessment / HangingTogether

Warner managing monsters

Recently in the London Review of Books Marina Warner explained why she quit her post at the University of Essex. I found it a shocking essay. Warner was pushed out because she is chairing the Booker Prize committee this year, in addition to delivering guest lectures at Oxford. (If those lectures are anything like Managing Monsters (1994), they will probably change the world.) Warner’s work – as a creative writer, scholar, public intellectual – does not count in the mechanics of assessment, which includes both publishing and teaching.

Warner opens her LRB essay with the library at Essex as the emblem of the university: "New brutalism! Rarely seen any so pure." I don't want to make light of the beautifully written article, which traces changes over time in the illustrious and radical reputation of the University of Essex since it was founded in the 60s. Originally Warner had enthusiastic support, which later waned when a new vice-chancellor muttered, "These REF stars – they don't earn their keep."

Warner’s is just the latest high-profile critique about interference in research by funders and university administrators.  The funniest I’ve read is a “modest proposal” memo mandating university-wide use of research assessment tools that have acronyms such as Stupid, Crap, Mess, Waste, Pablum, and Screwed.

I have been following researchers’ opinions about management of information about research ever since John MacColl synthesized assessment regimes in five countries. This past spring John sent me an opinion piece from the Times Higher in which the author, a REF coordinator himself, despairs about the damage done by years of assessment to women’s academic careers, to morale, to creativity, and to education and research. During my visits to the worlds of digital scholarship, I invariably hear of the failure of assessment regimes for the humanities, the digital humanities, digital scholarship, and e-research.

I figure it is high time I post another excerpt from my synthesis of user studies about managing research information. I prepared most of this post a year ago, when I was pondering the fraught politics (and ethics) of libraries’ contributions to research information management systems (RIMs). (Lorcan recently parsed RIM services.)

So here goes:

Alignment with the mission of one’s institution is not a black-and-white exercise. I believe that research libraries must think carefully about how they choose to ally themselves with their own researchers, academic administrations, and national funding agencies. If we are calibrating our library services – for new knowledge and higher education – to rankings and league tables, I certainly hope that we are reading the journals that publish those rankings, especially articles written by the same academics we want to support.

An editorial blog post for the Chronicle of Higher Education is titled, provocatively, “A Machiavellian Guide to Destroying Public Universities in 12 Easy Steps.” The fifth step is assessment regimes:

(5) Put into place various “oversight instruments,” such as quality-assessment exercises, “outcome matrices,” or auditing mechanisms, to assure “transparency” and “accountability” to “stakeholders.” You might try using research-assessment exercises such as those in Britain or Australia, or cheaper and cruder measures like Texas A&M’s, by simply publishing a cost/benefit analysis of faculty members.

This reminded me of a similar cri de coeur a few years ago in the New York Review of Books. In “The Grim Threat to British Universities,” Simon Head warned about applying a (US) business-style “bureaucratic control” – performance indicators, metrics, and measurement of outputs, etc. – to scholarship, especially science. Researchers often feel that administrators have no idea what research entails, and often for a good reason. For example, Warner’s executive dean for the humanities is a “young lawyer specialising in housing.”

A consistent theme in user studies with researchers is the sizeable gulf between what they use and desire and the kinds of support services that libraries and universities offer.[1] A typical case study in the life sciences, for example, concludes that there is a “significant gap” between researchers’ use of information and the strategies of funders and policy-makers.[2] In particular, researchers consider libraries unlikely to play a desirable role supporting research. [3]

Our own RIN and OCLC Research studies interviewing researchers reveal that libraries offering to manage research information seems “orthogonal, and at worst irrelevant,” to the needs of researchers.[4] One of the trends that stands out is oversight: researchers require autonomy, so procedures mandated in a top-down fashion are mostly perceived as intrusive and unnecessary.

Librarians and administrators need to respect networks of trust between researchers. In particular, researchers may resist advice from the Research Office or any other internal agency removed from the colleagues they work with.[5]

Researchers feel that their job is to do research. They begrudge any time spent on activities that serve administrative purposes.[6] A heavy-handed approach to participation in research information management is unpopular and can back-fire.[7] In some cases, mandates and requirements – such as national assessment regimes – become disincentives for researchers to improve methodologies or share their research.[8]

On occasion researchers have pushed back against such regimes. For example, in 2011, Australian scholars successfully quashed a journal-ranking system used for assessment. The academics objected that such a flawed "blunt instrument" for evaluating individuals ranked journals by crude criteria rather than by professional respect. [9]

Warner – like many humanists I have met – calls for a remedy that research libraries could provide. “By the end of 2013, all the evidence had been gathered, and the inventory of our publications fought over, recast and finally sent off to be assessed by panels of peers… A scholar whose works are left out of the tally is marked for assisted dying.” Librarians can  improve information about those “works left out,” or get the attributions right.

But assisted dying? Yikes. At our June meeting in Amsterdam on Supporting Change/Changing Support, Paul Wouters gave a thoughtful warning of the “seduction” of measurements, such as the trendy quantified self. Wouters gave citation analysis as an example of a measure that is necessarily backward-looking and disadvantages some domains. “You can’t see everything in publications.” Wouters pointed out that assessment is a bit “close to the skin” for academics, and that libraries might not want to “torment their researchers,” inadvertently making an honest mistake that could influence or harm careers.

Just because we can, we might consider whether we should, and when, and how. The politics of choosing to participate in expertise profiling and research assessment regimes potentially have consequences for research libraries that are trying to win the trust of their faculty members.

References beyond embedded links:

[1] pp. 4, 70 in Sheridan Brown and Alma Swan (i.e. Key Perspectives). 2007. Researchers’ use of academic libraries and their services. London: RIN (Research Information Network)/CURL (Consortium of Research Libraries). http://www.rin.ac.uk/our-work/using-and-accessing-information-resources/researchers-use-academic-libraries-and-their-serv

[2] pp. 5-6 in Robin Williams and Graham Pryor. 2009. Patterns of information use and exchange: case studies of researchers in the life sciences. London: RIN and the British Library. http://www.rin.ac.uk/our-work/using-and-accessing-information-resources/patterns-information-use-and-exchange-case-studie

[3] Brown and Swan 2007, p. 4.

[4] p. 6 in John MacColl and Michael Jubb. 2011. Supporting research: environments, administration and libraries. Dublin, Ohio: OCLC Research and London: Research Information Network (RIN). http://www.oclc.org/research/publications/library/2011/2011-10.pdf

[5] p. 10 in Research Information Network (RIN). 2010. Research support services in selected UK universities. London: RIN. http://www.rin.ac.uk/system/files/attachments/Research_Support_Services_in_UK_Universities_report_for_screen.pdf

[6] MacColl and Jubb, 2011, p. 3-4.

[7] p. 12-13 in Martin Feijen. 2011. What researchers want: A literature study of researchers’ requirements with respect to storage and access to research data. Utrecht: SURFfoundation. http://www.surf.nl/nl/publicaties/Documents/What_researchers_want.pdf. P. 56 in Elizabeth Jordan, Andrew Hunter, Becky Seale, Andrew Thomas and Ruth Levitt. 2011. Information handling in collaborative research: an exploration of five case studies. London: RIN and the BL. http://www.rin.ac.uk/our-work/using-and-accessing-information-resources/collaborative-research-case-studies. MacColl and Jubb 2011, p.6.

[8] p. 53 in Robin Williams and Graham Pryor. 2009. Patterns of information use and exchange: case studies of researchers in the life sciences. London: RIN and the British Library. http://www.rin.ac.uk/our-work/using-and-accessing-information-resources/patterns-information-use-and-exchange-case-studie

[9] Jennifer Howard. 2011 (June 1). “Journal-ranking system gets dumped after scholars complain.” Chronicle of higher education. http://chronicle.com/article/Journal-Ranking-System-Gets/127737/

 

About Jennifer Schaffner

Jennifer Schaffner is a Program Officer with the OCLC Research Library Partnership. She works with the rare books, manuscripts and archives communities. She joined RLG/OCLC Research in August of 2007.

Order Up: 10 Thanksgiving Menu Inspirations / DPLA

With Thanksgiving just a day away, the heat’s turned up for the perfect kitchen creation. Whether you’re the one cooking the turkey, or are just in charge of expertly arranging the table napkins, creating the perfect Thanksgiving meal is a big responsibility. Take some cues from these Thanksgiving dinner menus from hotels and restaurants across the country, from The New York Public Library.

Gramercy Park Hotel, NY, 1955.
Metropole Hotel, Fargo, ND, 1898.
The New Yorker at Terrace Restaurant, NY, 1930.
Briggs House, Chicago, IL, 1899.
Normandie Café, Detroit, MI, 1905.
Hotel De Dijon, France, 1881.
M.F. Lyons Dining Rooms, NY, 1906.
L’Aiglon, NY, 1947.
Hotel Roanoke, Roanoke, VA, 1899.
The Waldorf Astoria, NY, 1961.

Collecting and Preserving Digital Art: Interview with Richard Rinehart and Jon Ippolito / Library of Congress: The Signal

Jon Ippolito, Professor of New Media at the University of Maine

As artists have embraced a range of new media and forms in the last century, the work of collecting, conserving and exhibiting these works has become increasingly complex and challenging. In this space, Richard Rinehart and Jon Ippolito have been working to develop and understand approaches to ensure long-term access to digital works. In this installment of our Insights Interview Series I discuss Richard and Jon's new book, "Re-collection: Art, New Media, and Social Memory," which offers an articulation of their variable media approach to thinking about works of art. I am excited to take this opportunity to explore the issues the book raises about digital art in particular, and its perspective on digital preservation and social memory more broadly.

Trevor: The book takes a rather broad view of “new media”; everything from works made of rubber, to CDs, art installations made of branches, arrangements of lighting, commercial video games and hacked variations of video games. For those unfamiliar with your work more broadly, could you tell us a bit about your perspective on how these hang together as new media? Further, given that the focus of our audience is digital preservation, could you give us a bit of context for what value thinking about various forms of non-digital variable new media art offer us for understanding digital works?

Richard Rinehart, Director of the Samek Art Museum at Bucknell University.

Richard: Our book does focus on the more precise and readily-understood definition of new media art as artworks that rely on digital electronic computation as essential and inextricable. The way we frame it is that these works are at the center of our discussion, but we also discuss works that exist at the periphery of this definition. For instance, many digital artworks are hybrid digital/physical works (e.g., robotic works) and so the discussion cannot be entirely contained in the bitstream.

We also discuss other non-traditional art forms–performance art, installation art–that are not as new as “new media” but are also not that old in the history of museum collecting. It is important to put digital art preservation in an historical context, but also some of the preservation challenges presented by these works are shared with and provide precedents for digital art. These precedents allow us to tap into previous solutions or at least a history of discussion around them that could inform or aid in preserving digital art. And, vice versa, solutions for preserving digital art may aid in preserving these other forms (not least of which is shifting museum practices). Lastly, we bring non-digital (but still non-traditional) art forms into the discussion because some of the preservation issues are technological and media-based (in which case digital is distinct) but some issues are also artistic and theoretical, and these issues are not necessarily limited to digital works.

Jon: Yeah, we felt digital preservation needed a broader lens. The recorded culture of the 20th century–celluloid, vinyl LPs, slides–is a historical anomaly that’s a misleading precedent for preserving digital artifacts. Computer scientist Jeff Rothenberg argues that even JPEGs and PDF documents are best thought of as applications that must be “run” to be accessed and shared. We should be looking at paradigms that are more contingent than static files if we want to forecast the needs of 21st-century heritage.

Casting a wider net can also help preservationists jettison our culture’s implicit metaphor of stony durability in favor of one of fluid adaptability. Think of a human record that has endured and most of us picture a chiseled slab of granite in the British Museum–even though oral histories in the Amazon and elsewhere have endured far longer. Indeed, Dragan Espenschied has pointed out cases in which clay tablets have survived longer than stone because of their adaptability: they were baked as is into new buildings, while the original carvings on stones were chiseled off to accommodate new inscriptions. So Richard and I believe digital preservationists can learn from media that thrive by reinterpretation and reuse.

Trevor: The book presents technology, institutions and law as three sources of problems for the conservation of variable media art and potentially as three sources of possible solutions. Briefly, what do you see as the most significant challenges and opportunities in these three areas? Further, are there any other areas you considered incorporating but ended up leaving out?

Jon: From technology, the biggest threat is how the feverish marketing of our techno-utopia masks the industry's planned obsolescence. We can combat this by assigning every file on our hard drives and gadget on our shelves a presumptive lifespan, and leaving room in our budgets to replace them once their expiration date has passed.

From institutions, the biggest threat is that their fear of losing authenticity gets in the way of harnessing less controllable forms of cultural perseverance such as proliferative preservation. Instead of concentrating on the end products of culture, they should be nurturing the communities where it is birthed and finds meaning.

From the law, the threat is DRM, the DMCA, and other mechanisms that cut access to copyrighted works–for unlike analog artifacts, bits must be accessed frequently and openly to survive. Lawyers and rights holders should be looking beyond the simplistic dichotomy of copyright lockdown versus “information wants to be free” and toward models in which information requires care, as is the case for sacred knowledge in many indigenous cultures.

Other areas? Any in which innovative strategies of social memory are dismissed because of the desire to control–either out of greed (“we can make a buck off this!”) or fear (“culture will evaporate without priests to guard it!”).

Trevor: One of the central concepts early in the book is “social memory,” in fact, the term makes its way into the title of the book. Given its centrality, could you briefly explain the concept and discuss some of how this framework for thinking about the past changes or upsets other theoretical perspectives on history and memory that underpin work in preservation and conservation?

Richard: Social memory is the long-term memory of societies. It’s how civilizations persist from year to year or century to century. It’s one of the core functions of museums and libraries and the purpose of preservation. It might alternately be called “cultural heritage,” patrimony, etc. But the specific concept of social memory is useful for the purpose of our book because there is a body of literature around it and because it positions this function as an active social dynamic rather than a passive state (cultural heritage, for instance, sounds pretty frozen). It was important to understand social memory as a series of actions that take place in the real world every day as that then helps us to make museum and preservation practices tangible and tractable.

The reason to bring up social memory in the first place is to gain a bit of distance on the problem of preserving digital art. Digital preservation is so urgent that most discussions (perhaps rightfully) leap right to technical issues and problem-solving. But, in order to effect the necessary large-scale and long-term changes in, say, museum practices, standards and policies, we need to understand the larger context and historic assumptions behind current practices. Museums (and every cultural heritage institution) are not just stubborn; they do things a certain way for a reason. To convince them to change, we cannot just point at ad-hoc cases and technical problematics; we have to tie it to their core mission: social memory. The other reason to frame it this way is that new media really are challenging the functions of social memory, not just in museums but across the board, and here is one level at which we can relate and share solutions.

These are some of the ways in which social memory allows us to approach preservation differently in the book, but here's another, more specific one. We propose that social memory takes two forms: formal/canonical/institutional memory and informal/folkloric/personal memory (and every shade in between). We then suggest how the preservation of digital art may be aided by BOTH social memory functions.

Trevor: Many of the examples in the book focus on boundary-breaking installation art, like Flavin’s work with lighting, and conceptual art, like Nam June Paik’s work with televisions and signals, or Cory Arcangel’s interventions on Nintendo cartridges. Given that these works push the boundaries of their mediums, or focus in depth on some of the technical and physical properties of their mediums do you feel like lessons learned from them apply directly to seemingly more standardized and conventional works in new media? For instance, mass produced game cartridges or Flash animations and videos? To what extent are lessons learned about works largely intended to be exhibited art in galleries and museums applicable to more everyday mass-produced and consumed works?

Richard: That's a very interesting question, and it speaks to our premise that preserving digital art is but one form of social memory and that lessons learned therein may benefit other areas. I often feel that preserving digital art is useful for other preservation efforts because it provides an extreme case. Artists (and the art world) ensure that their media creations are about as complex as you'll likely find; not necessarily technically (although some are technically complex, and there are other complexities introduced in their non-standard use of technologies) but because what artists do is to complicate the work at every level–conceptually, phenomenologically, socially, technically; they think very specifically about the relationship between media and meaning and then they manifest those ideas in the digital object.

I fully understand that preserving artworks does not mean trying to capture or preserve the meaning of those objects (an impossible task) but these considerations must come into play when preserving art even at a material level; especially in fungible digital media. So, for just one example, preserving digital artworks will tell us a lot about HCI considerations that attend preserving other types of interactive digital objects.

Jon: Working in digital preservation also means being a bit of a futurist, especially in an age when the procession from medium to medium is so rapid and inexorable. And precisely because they play with the technical possibilities of media, today’s artists are often society’s earliest adopters. My 2006 book with Joline Blais, “At the Edge of Art,” is full of examples, whether how Google Earth came from Art+Com, Wikileaks from Antoni Muntadas, or gestural interfaces from Ben Fry and Casey Reas. Whether your metaphor for art is antennae (Ezra Pound) or antibodies (Blais), if you pay attention to artists you’ll get a sneak peek over the horizon.

Trevor: Richard suggests that variability, not fixity, is the defining feature of digital media, and beyond this that conservators should move away from "outdated notions of fixity." Given the importance of the concept of fixity in digital preservation circles, could you unpack this a bit for us? While digital objects do indeed execute and perform, the fact that I can run a fixity check and confirm that this copy of a digital object is identical to what it was before seems to be an incredibly powerful and useful component of ensuring long-term access. Given that, based on the nature of digital objects, we can actually ensure fixity in a way we never could with analog artifacts, this idea of distancing ourselves from fixity seemed strange.

Richard: You hit the nail on the head with that last sentence; and we’re hitting a little bit of a semantic wall here as well–fixity as used in computer science and certain digital preservation circles does not quite have the same meaning as when used in lay text or in the context of traditional object-based museum preservation. I was using fixity in the latter sense (as the first book on this topic, we wrote for a lay audience and across professional fields as much as possible.) Your last thought compares the uses of “fixity” as checks between analog media (electronic, reproducible; film, tape, or vinyl) compared to digital media, but in the book I was comparing fixity as applied to a different class of analog objects (physical; marble, bronze, paint) compared to digital objects.

If we step back from the professional jargon for a moment, I would characterize the traditional museological preservation approach for oil paintings and bronze sculptures as one based on fixity. The kind of digital authentication that you are talking about is more like the scientific concept of repeatability; a concept based on consistency and reproduction–the opposite of fixity! I think the approach we outline in the book is in opposition to fixity of the marble-bust variety (as inappropriate for digital media) but very much in line with fixity as digital authentication (as one tool for guiding and balancing a certain level of change with a certain level of integrity). Jon may disagree here–in fact we built these dynamics of agreement/disagreement into our book too.

Jon: I’d like to be as open-minded as Richard. But I can’t, because I pull my hair out every time I hear another minion of cultural heritage fixated on fixity. Sure, it’s nifty that each digital file has a unique cryptographic signature we can confirm after each migration. The best thing about checksums is that they are straightforward, and many preservation tools (and even some operating systems) already incorporate such checks by default. But this seems to me a tiny sliver of a far bigger digital preservation problem, and to blow it out of proportion is to perpetuate the myth that mathematical replication is cultural preservation.
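For readers outside digital preservation circles, the checksum comparison Jon mentions boils down to comparing cryptographic digests before and after a copy or migration. Here is a minimal sketch using Python's standard hashlib; the file paths are hypothetical:

    # Minimal fixity check: identical SHA-256 digests mean bit-level identity.
    # As the discussion above notes, this says nothing about the experience.
    import hashlib

    def sha256_of(path, chunk_size=1 << 20):
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    if sha256_of("original/film.mov") == sha256_of("migrated/film.mov"):
        print("Fixity confirmed: the copies are bit-identical.")
    else:
        print("Checksum mismatch: the bits changed somewhere.")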

Two files with different passages of 1s and 0s automatically have different checksums but may still offer the same experience; for example, two copies of a digitized film may differ by a few frames but look identical to the human eye. The point of digitizing a Stanley Kubrick film isn’t to create a new mathematical artifact with its own unchanging properties, but to capture for future generations the experience us old timers had of watching his cinematic genius in celluloid. As a custodian of culture, my job isn’t to ensure my DVD of A Clockwork Orange is faithful to some technician’s choices when digitizing the film; it’s to ensure it’s faithful to Kubrick’s choices as a filmmaker.

Furthermore, there's no guarantee that born-digital files with impeccable checksums will bear any relationship to the experience of an actual user. Engineer and preservationist Bruno Bachimont gives the example of an archivist who sets a Web spider loose on a website, only to have the website's owners update it in the middle of the crawling process. (This happens more often than you might think.) Monthly checksums will give the archivist confidence that she's archived that website, but in fact her WARC files do not correspond to any digital artifact that has ever existed in the real world. Her chimera is a perversion caused by the capturing process–like those smartphone panoramas of a dinner where the same waiter appears at both ends of the table.

As in nearly all storage-based solutions, fixity does little to help capture context.  We can run checksums on the Riverside “King Lear” till the cows come home, and it still won’t tell us that boys played women’s parts, or that Elizabethan actors spoke with rounded vowels that sound more like a contemporary American accent than the King’s English, or how each generation of performers has drawn on the previous for inspiration. Even on a manuscript level, a checksum will only validate one of many variations of a text that was in reality constantly mutating and evolving.

The context for software is a bit more cut-and-dried, and the professionals I know who use emulators like to have checksums to go with their disk images. But checksums don’t help us decide what resolution or pace they should run at, or what to do with past traces of previous interactions, or what other contemporaneous software currently taken for granted will need to be stored or emulated for a work to run in the future.

Finally, even emulation will only capture part of the behaviors necessary to reconstruct digital creations in the networked age, which can depend on custom interfaces, environmental data or networks. You can’t just go around checksumming wearable hardware or GPS receivers or Twitter networks; the software will have to mutate to accommodate future versions of those environments.

So for a curator to run regular tests on a movie’s fixity is like a zookeeper running regular tests on a tiger’s DNA. Just because the DNA tests the same doesn’t guarantee the tiger is healthy, and if you want the species to persist in the long term, you have to accept that the DNA of individuals is certainly going to change.

We need a more balanced approach. You want to fix a butterfly? Pin it to a wall. If you want to preserve a butterfly, support an ecosystem where it can live and evolve.

Trevor: The process of getting our ideas out on the page can often play a role in pushing them in new directions. Are there any things that you brought into working on the book that changed in the process of putting it together?

Richard: A book is certainly slow media; purposefully so. I think the main change I noticed was the ability to put our ideas around preservation practice into a larger context of institutional history and social memory functions. Our previous expressions in journal articles or conference presentation simply did not allow us time to do that and, as stated earlier, I feel that both are important in the full consideration of preservation.

Jon: When Richard first approached me about writing this book, I thought, well it’s gonna be pretty tedious because it seemed we would be writing mostly about our own projects. At the time I was only aware of a single emulation testbed in a museum, one software package for documenting opinions on future states of works, and no more conferences and cross-institutional initiatives on variable media preservation than I could count on one hand.

Fortunately, it took us long enough to get around to writing the book (I’ll take the blame for that) that we were able to discover and incorporate like-minded efforts cropping up across the institutional spectrum, from DOCAM and ZKM to Preserving Virtual Worlds and JSMESS. Even just learning how many art museums now incorporate something as straightforward as an artist’s questionnaire into their acquisition process! That was gratifying and led me to think we are all riding the crest of a wave that might bear the digital flotsam of today’s culture into the future.

Trevor: The book covers a lot of ground, focusing on a range of issues and offering myriad suggestions for how various stakeholders could play a role in ensuring access to variable media works into the future. In all of that, is there one message or issue in the work that you think is the most critical or central?

Richard: After expanding our ideas in a book; it’s difficult to come back to tweet format, but I’ll try…

Change will happen. Don’t resist it; use it, guide it. Let art breathe; it will tell you what it needs.

Jon: And don’t save documents in Microsoft Word.

Congratulations to the Panton Fellows 2013-2014 / Open Knowledge Foundation

Samuel Moore, Rosie Graves and Peter Kraker are the 2013-2014 Open Knowledge Panton Fellows – tasked with experimenting with, exploring and promoting open practices through their research over the last twelve months. They have just posted their final reports, so we'd like to heartily congratulate them on an excellent job and summarise their highlights for the Open Knowledge community.

Over the last two years the Panton Fellowships have supported five early career researchers to further the aims of the Panton Principles for Open Data in Science alongside their day-to-day research. The provision of additional funding goes some way towards this aim, but a key benefit of the programme is boosting the visibility of the Fellows' work within the open community and introducing them to like-minded researchers and others within the Open Knowledge network.

On stage at the Open Science Panel Vienna (Photo by FWF/APA-Fotoservice/Thomas Preiss)

Peter Kraker (full report) is a postdoctoral researcher at the Know-Centre in Graz and focused his fellowship work on two facets: open and transparent altmetrics and the promotion of open science in Austria and beyond. During his Fellowship Peter released the open source visualization Head Start, which gives scholars an overview of a research field based on relational information derived from altmetrics. Head Start continues to grow in functionality, has been incorporated into Open Knowledge Labs and is soon to be made available on a dedicated website funded by the fellowship.

Peter’s ultimate goal is to have an environment where everybody can create their own maps based on open knowledge and share them with the world. You are encouraged to contribute! In addition Peter has been highly active promoting open science, open access, altmetrics and reproducibility in Austria and beyond through events, presentations and prolific blogging, resulting in some great discussions generated on social media. He has also produced a German summary of open science activities every month and is currently involved in kick-starting a German-speaking open science group through the Austrian and German Open Knowledge local groups.

Rosie with an air quality monitor

Rosie Graves (full report) is a postdoctoral researcher at the University of Leicester and used her fellowship to develop an air quality sensing project in a primary school. This wasn't always an easy ride: the sensor was successfully installed and an enthusiastic set of schoolchildren were on board, but a technical issue meant that data collection was cut short, so Rosie plans to resume in the New Year. Further collaborations on crowdsourcing and school involvement in atmospheric science were even more successful, including a pilot rain gauge measurement project and development of a cheap, open source air quality sensor which is sure to be of interest to other scientists around the Open Knowledge network and beyond. Rosie has enjoyed her Panton Fellowship year and was grateful for the support to pursue outreach and educational work:

“This fellowship has been a great opportunity for me to kick start a citizen science project … It also allowed me to attend conferences to discuss open data in air quality which received positive feedback from many colleagues.”

Samuel Moore (full report) is a doctoral researcher in the Centre for e-Research at King’s College London and successfully commissioned, crowdfunded and (nearly) published an open access book on open research data during his Panton Year: Issues in Open Research Data. The book is still in production but publication is due during November and we encourage everyone to take a look. This was a step towards addressing Sam’s assessment of the nascent state of open data in the humanities:

“The crucial thing now is to continue to reach out to the average researcher, highlighting the benefits that open data offers and ensuring that there is a stock of accessible resources offering practical advice to researchers on how to share their data.”

Another initiative Sam launched during the fellowship was the forthcoming Journal of Open Humanities Data with Ubiquity Press, which aims to incentivise data sharing through publication credit, in turn making data citable through the usual academic paper citation practices. Ultimately the journal will help researchers share their data, recommending repositories and best practices in the field, and will also help them track the impact of their data through citations and altmetrics.

We believe it is vital to provide early career researchers with support to try new open approaches to scholarship and hope other organisations will take similar concrete steps to demonstrate the benefits and challenges of open science through positive action.

Finally, we’d like to thank the Computer and Communications Industry Association (CCIA) for their generosity in funding the 2013-14 Panton Fellowships.

This blog post is a cross-post from the Open Science blog; see the original here.

Sufia 4.2.0 released / Hydra Project

We are pleased to announce the release of Sufia 4.2.0.

This release of Sufia includes the ability to cache usage statistics in the application database, an accessibility fix, and a number of bug fixes. Thanks to Carolyn Cole, Michael Tribone, Adam Wead, Justin Coyne, and Mike Giarlo for their work on this release.

View the upgrade notes and a complete changelog on the release page: https://github.com/projecthydra/sufia/releases/tag/v4.2.0

Who Uses Library Mobile Websites? / LibUX

Almost every American owns a cell phone. More than half use a smartphone and sleep with it next to the bed. How many do you think visit their library website on their phone, and what do they do there? Heads up: this one’s totally America-centric.

Who uses library mobile websites?

Almost one in five (18%) Americans ages 16-29 have used a mobile device to visit a public library’s website or access library resources in the past 12 months, compared with 12% of those ages 30 and older. (Younger Americans’ Library Habits and Expectations, 2013)

If that seems anticlimactic, consider that just about every adult in the U.S. owns a cell phone, and almost every millennial in the country is using a smartphone. This is the demographic using library mobile websites, more than half of whom already have a library card.

In 2012, the Pew Internet and American Life Project found that library website users were often young, not poor, educated, and quite possibly moms or dads.

Those who are most likely to have visited library websites are parents of minors, women, those with college educations, those under age 50, and people living in households earning $75,000 or more.

This correlates with the demographics of smartphone owners for 2014.

What do they want?

This 2013 Pew report makes the point that while digital natives still really like print materials and the library as a physical space, a non-trivial number of them said that libraries should definitely move most library services online. Future-of-the-library blather is often painted in black and white, but it is naive to think physical (or even traditional) services are going away any time soon. Rather, there is already demand for complementary or analogous online services.

Literally. When asked, 45% of Americans ages 16-29 wanted “apps that would let them locate library materials within the library.” They also wanted a library-branded Redbox (44%), and an “app to access library services” (42%) – by app I am sure they mean a mobile-first, responsive website. That’s what we mean here at #libux.

For more on this non-controversy, listen to our chat with Brian Pichman about web vs native.

Eons ago (2012), the non-mobile-specific breakdown of library web activities looked like this:

  • 82% searched the catalog
  • 72% looked for hours, location, directions, etc.
  • 62% put items on hold
  • 51% renewed them
  • 48% were interested in events and programs – especially old people
  • 44% did research
  • 30% sought readers’ advisory (book reviews or recommendations)
  • 30% paid fines (yikes)
  • 27% signed-up for library programs and events
  • 6% reserved a room

Still, young Americans are way more invested in libraries coordinating more closely with schools, offering literacy programs, and being more comfortable (chart). They want libraries to continue to be present in the community, do good, and have hipster decor – coffee helps.

Webbification is broadly expected, but it isn’t exactly a kudos subject. Offering comparable online services is necessary, like it is necessary that MS Word lets you save work. A library that doesn’t offer complementary or analogous online services isn’t buggy so much as it is just incomplete.

Take this away

The emphasis on the library as a physical space shouldn’t be shocking. The opportunity for the library as a hyper-locale specifically reflecting its community’s temperament isn’t one to overlook, especially for as long as libraries tally success by circulation numbers and foot traffic. The whole library-without-walls cliche that went hand-in-hand with all that Web 2.0 stuff tried to show off the library as it could be in the cloud, but “the library as physical space” isn’t the same as “the library as disconnected space.” The tangibility of the library is a feature to be exploited both for atmosphere and web services. “Getting lost in the stacks” can and should be relegated to just something people say rather than something that actually happens.

The main reason for library web traffic has been and continues to be to find content (82%) and how to get it (72%).

Bullet points

  • Mobile first: The library catalog, as well as basic information about the library, must be optimized for mobile
  • Streamline transactions: placing and removing holds, checking out, paying fines. There is a lot of opportunity here. Basic optimization of the OPAC and cart can go a long way, but you can even enable self checkout, library card registration using something like Facebook login, or payment through Apple Pay.
  • Be online: [duh] Offer every basic service available in person online
  • Improve in-house wayfinding through the web: think Google Indoor Maps
  • Exploit smartphone native services to anticipate context: location, as well as time-of-day, weather, etc., can be used to personalize service or contextually guess at the question the patron needs answered. “It’s 7 a.m. and cold outside, have a coffee on us.” – or even a simple “Yep. We’re open” on the front page. (See the sketch after this list.)
  • Market the good the library provides to the community to win support (or donations)
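
To make the context bullet concrete, here is a minimal, hypothetical Python sketch; the HOURS table and banner_message function are invented for illustration and are not part of any library’s actual site:

from datetime import datetime, time

# Invented example hours keyed by weekday (0 = Monday); a real site would
# pull these from the ILS or an hours feed rather than hard-coding them.
HOURS = {day: (time(9, 0), time(21, 0)) for day in range(5)}  # Mon-Fri
HOURS[5] = (time(10, 0), time(18, 0))                         # Saturday
HOURS[6] = None                                               # closed Sunday

def banner_message(now=None):
    """Return a short, context-aware message for the library home page."""
    now = now or datetime.now()
    todays_hours = HOURS.get(now.weekday())
    if todays_hours is None:
        return "We're closed today, but the catalog never sleeps."
    opens, closes = todays_hours
    if opens <= now.time() <= closes:
        return "Yep. We're open until %s." % closes.strftime("%I:%M %p").lstrip("0")
    return "We open at %s." % opens.strftime("%I:%M %p").lstrip("0")

print(banner_message())

Location or weather lookups slot in the same way; the point is that the guess happens before the patron has to ask.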

The post Who Uses Library Mobile Websites? appeared first on LibUX.

Sufia - 4.2.0 / FOSS4Lib Recent Releases

Package: Sufia
Release Date: Tuesday, November 25, 2014

The 4.2.0 release of Sufia includes the ability to cache usage statistics in the application database, an accessibility fix, and a number of bug fixes.

Bookmarks for November 25, 2014 / Nicole Engard

Today I found the following resources and bookmarked them:

  • PressForward
    A free and open-source software project launched in 2011, PressForward enables teams of researchers to aggregate, filter, and disseminate relevant scholarship using the popular WordPress web publishing platform. Just about anything available on the open web is fair game: traditional journal articles, conference papers, white papers, reports, scholarly blogs, and digital projects.

Digest powered by RSS Digest

The post Bookmarks for November 25, 2014 appeared first on What I Learned Today....

CopyTalk: Free Copyright Webinar / District Dispatch

Join us for CopyTalk, our copyright webinar, on December 4 at 2pm Eastern Time. This installment is entitled “Introducing the Statement of Best Practices in Fair Use of Collections Containing Orphan Works for Libraries, Archives, and Other Memory Institutions”.

Peter Jaszi (American University, Washington College of Law) and David Hansen (UC Berkeley and UNC Chapel Hill) will introduce the “Statement of Best Practices in Fair Use of Collections Containing Orphan Works for Libraries, Archives, and Other Memory Institutions.” This Statement, the most recent community-developed best practices in fair use, is the result of intense discussion group meetings with over 150 librarians, archivists, and other memory institution professionals from around the United States to document and express their ideas about how to apply fair use to collections that contain orphan works, especially as memory institutions seek to digitize those collections and make them available online. The Statement outlines the fair use rationale for use of collections containing orphan works by memory institutions and identifies best practices for making assertions of fair use in preservation and access to those collections.

There is no need to pre-register! Just show up on December 4 at 2pm Eastern time. http://ala.adobeconnect.com/copyright/

The post CopyTalk: Free Copyright Webinar appeared first on District Dispatch.

From the Book Patrol: A Parade of Thanksgiving Goodness / DPLA

Did you know that over 2,400 items related to Thanksgiving reside at the DPLA? From Thanksgiving menus from hotels and restaurants across this great land to Thanksgiving postcards to images of the fortunate and less fortunate taking part in Thanksgiving day festivities.

Here’s just a taste of Thanksgiving at the Digital Public Library of America.

Enjoy and have a Happy Thanksgiving!

  • Thanksgiving Day, Raphael Tuck & Sons, 1907
  • Macy’s Thanksgiving Day Parade, 1932. Photograph by Alexander Alland
  • Japanese Internment Camp – Gila River Relocation Center, Rivers, Arizona. One of the floats in the Thanksgiving Day Harvest Festival, 11/26/1942
  • Annual Presentation of Thanksgiving Turkey, 11/16/1967. Then President Lyndon Baines Johnson presiding
  • A man with an axe in the midst of a flock of turkeys. Greenville, North Carolina, 1965
  • Woman carries Thanksgiving turkey at Thresher & Kelley Market, Faneuil Hall in Boston, 1952. Photograph by Leslie Jones
  • Thanksgiving Dinner Menu. Hotel Scenley, Pittsburgh, PA, 1900
  • More than 100 wounded African American soldiers, sailors, marines and Coast Guardsmen were feted by The Equestriennes, a group of Government Girls, at an annual Thanksgiving dinner at Lucy D. Slowe Hall, Washington, D.C. Photograph by Helen Levitt, 1944
  • Volunteers of America Thanksgiving, 22 November 1956. Thanksgiving dinner line in front of Los Angeles Street Post door

Have questions about WIOA? / District Dispatch

To follow up on the October 27th webinar “$2.2 Billion Reasons to Pay Attention to WIOA,” the American Library Association (ALA) today releases a list of resources and tools that provide more information about the Workforce Innovation and Opportunity Act (WIOA). WIOA allows public libraries to be considered additional One-Stop partners, prohibits federal supervision or control over selection of library resources, and authorizes adult education and literacy activities provided by public libraries as an allowable statewide employment and training activity.

Subscribe to the District Dispatch, ALA’s policy blog, to be alerted when additional WIOA information becomes available.

The post Have questions about WIOA? appeared first on District Dispatch.

Advanced DSpace Training / FOSS4Lib Upcoming Events

Date: Tuesday, March 17, 2015 - 08:00 to Thursday, March 19, 2015 - 17:00

In-person, 3-day Advanced DSpace Course in Austin March 17-19, 2015. The total cost of the course is being underwritten with generous support from the Texas Digital Library and DuraSpace. As a result, the registration fee for the course for DuraSpace Members is only $250 and $500 for Non-Members (meals and lodging not included). Seating will be limited to 20 participants.

For more details, see http://duraspace.org/articles/2382

Dutch vs. Elsevier / David Rosenthal

The discussions between libraries and major publishers about subscriptions have only rarely been actual negotiations. In almost all cases the libraries have been unwilling to walk away and the publishers have known this. This may be starting to change; Dutch libraries have walked away from the table with Elsevier. Below the fold, the details.

VSNU, the association representing the 14 Dutch research universities, negotiates on their behalf with journal publishers. Earlier this month they announced that their current negotiations with Elsevier are at an impasse, on the issues of costs and the Dutch government’s Open Access mandate:
Negotiations between the Dutch universities and publishing company Elsevier on subscription fees and Open Access have ground to a halt. In line with the policy pursued by the Ministry of Education, Culture and Science, the universities want academic publications to be freely accessible. To that end, agreements will have to be made with the publishers. The proposal presented by Elsevier last week totally fails to address this inevitable change.
In their detailed explanation for scientists (PDF), VSNU elaborates:
During several round[s] of talks, no offer was made which would have led to a real, and much-needed, transition to open access. Moreover, Elsevier has failed to deliver an offer that would have kept the rising costs of library subscriptions at an acceptable level. ... In the meantime, universities will prepare for the possible consequences of an expiration of journal subscriptions. In case this happens researchers will still be able to publish in Elsevier journals. They will also have access to back issues of these journals. New issues of Elsevier journals as of 1-1-2015 will not be accessible anymore.
I assume that this means that post-cancellation access will be provided by Elsevier directly, rather than by an archiving service. The government and the Dutch research funder have expressed support for VSNU’s position.

This stand by the Dutch is commendable; the outcome will be very interesting. In a related development, if my marginal French is not misleading me, a new law in Germany allows authors of publicly funded research to make their accepted manuscripts freely available 1 year after initial publication. Both stand in direct contrast to the French "negotiation" with Elsevier:
France may not have any money left for its universities but it does have money for academic publishers.
While university presidents learn that their funding is to be reduced by EUR 400 million, the Ministry of Research has decided, under great secrecy, to pay EUR 172 million to the world leader in scientific publishing, Elsevier.

Top Technologies Webinar – Dec. 2, 2014 / LITA

Don’t miss the Top Technologies Every Librarian Needs to Know Webinar with Presenters: Brigitte Bell, Steven Bowers, Terry Cottrell, Elliot Polak and Ken Varnum

Offered: December 2, 2014
1:00 pm – 2:00 pm Central Time

See the full course description with registration information here.
or
Register Now Online, page arranged by session date (login required)

We’re all awash in technological innovation. It can be a challenge to know what new tools are likely to have staying power — and what that might mean for libraries. The recently published Top Technologies Every Librarian Needs to Know highlights a selected set of technologies that are just starting to emerge and describes how libraries might adapt them in the next few years.

In this webinar, join the authors of three chapters from the book as they talk about their technologies and what they mean for libraries.

Hands-Free Augmented Reality: Impacting the Library Future
Presenters: Brigitte Bell & Terry Cottrell

Based on the recent surge of interest in head-mounted augmented reality devices such as the 3D gaming console Oculus Rift and Google’s Glass project, it seems reasonable to expect that the implementation of hands-free augmented reality technology will become common practice in libraries within the next 3-5 years.

The Future of Cloud-Based Library Systems
Presenters: Elliot Polak & Steven Bowers

In libraries, cloud computing technology can reduce the costs and human capital associated with maintaining a 24/7 Integrated Library System while facilitating an up-time that is costly to attain in-house. Cloud-based Integrated Library Systems can leverage a shared system environment, allowing libraries to share metadata records and other system resources while maintaining independent local information, reducing redundant workflows and yielding efficiencies for cataloging/metadata and acquisitions departments.

Library Discovery: From Ponds to Streams
Presenter: Ken Varnum

Rather than exploring focused ponds of specialized databases, researchers now swim in oceans of information. What is needed is neither ponds (too small in our interdisciplinary world) nor oceans (too broad and deep for most needs), but streams — dynamic, context-aware subsets of the whole, tailored to the researcher’s short- or long-term interests.

Webinar Fees are:

LITA Member: $39
Non-Member: $99
Group: $190

Register Online now to join us for what is sure to be an excellent and informative webinar.

Code for Africa & Open Knowledge Launch Open Government Fellowship Pilot Programme: Apply Today / Open Knowledge Foundation

Open Knowledge and Code for Africa launch pilot Open Government Fellowship Programme. Apply to become a fellow today. This blog announcement is available in French here and Portuguese here.


Open Knowledge and Code for Africa are pleased to announce the launch of our pilot Open Government Fellowship programme. The six-month programme seeks to empower the next generation of leaders in the field of open government.


We are looking for candidates that fit the following profile:

  • Currently engaged in the open government and/or related communities. We are looking to support individuals already actively participating in the open government community
  • Understands the role of civil society and citizen based organisations in bringing about positive change through advocacy and campaigning
  • Understands the role and importance of monitoring government commitments on open data as well as on other open government policy related issues
  • Has facilitation skills and enjoys community-building (both online and offline).
  • Is eager to learn from and be connected with an international community of open government experts, advocates and campaigners
  • Currently living and working in Africa. Due to limited resources and our desire to develop a focused and impactful pilot programme, we are limiting applications to those currently living and working in Africa. We hope to expand the programme to the rest of the world starting in 2015.

The primary objective of the Open Government Fellowship programme is to identify, train and support the next generation of open government advocates and community builders. As you will see in the selection criteria, the most heavily weighted item is current engagement in the open government movement at the local, national and/or international level. Selected candidates will be part of a six-month fellowship pilot programme where we expect you to work with us for an average of six days a month, including attending online and offline trainings, organising events, and being an active member of the Open Knowledge and Code for Africa communities.

Fellows will be expected to produce tangible outcomes during their fellowship, but what these outcomes are will be up to the fellows to determine. In the application, we ask fellows to describe their vision for their fellowship or, to put it another way, to lay out what they would like to accomplish. We could imagine fellows working with a specific government department or agency to make a key dataset available, used and useful by the community, or organising a series of events addressing a specific topic or challenge citizens are currently facing. We do not wish to be prescriptive; there are countless possibilities for outcomes for the fellowship, but successful candidates will demonstrate a vision that has clear, tangible outcomes.

To support fellows in achieving these outcomes, all fellows will receive a stipend of $1,000 per month in addition to a project grant of $3,000 to spend over the course of your fellowship. Finally, a travel stipend is available for each fellow for national and/or international travel related to furthering the objective of their fellowship.

There are up to 3 fellowship positions open for the February to July 2015 pilot programme. Due to resourcing, we will only be accepting fellowship applications from individuals living and working in Africa. Furthermore, in order to ensure that we are able to provide fellows with strong local support during the pilot phase, we are targeting applicants from the following countries where Code for Africa and/or Open Knowledge already have existing networks: Angola, Burkina Faso, Cameroon, Ghana, Kenya, Morocco, Mozambique, Mauritius, Namibia, Nigeria, Rwanda, South Africa, Senegal, Tunisia, Tanzania, and Uganda. We are hoping to roll out the programme in other regions in autumn 2015. If you are interested in the fellowship but not currently located in one of the target countries, please get in touch.

Do you have questions? See more about the Fellowship Programme here and have a look at this Frequently Asked Questions (FAQ) page. If this doesn’t answer your question, email us at Katelyn[dot]Rogers[at]okfn.org

Not sure if you fit the profile? Drop us a line!

Convinced? Apply now to become an Open Government fellow. If you would prefer to submit your application in French or Portuguese, translations of the application form are available in French here and in Portuguese here.

The application will be open until the 15th of December 2014 and the programme will start in February 2015. We are looking forward to hearing from you!

Educators Rejoice! This Week’s Featured Content from the PeerLibrary Collections / PeerLibrary

PeerLibrary’s groups and collections functionality is especially suited towards educators running classes that involve reading and discussing various academic publications. This week we would like to highlight one such collection, created for a graduate level computer science class taught by Professor John Kubiatowicz at UC Berkeley. The course, Advanced Topics in Computer Systems, requires weekly readings which are handily stored on the PeerLibrary platform for students to read, discuss, and collaborate outside of the typical classroom setting. Articles within the collection come from a variety of sources, such as the publicly available “Key Range Locking Strategies” and the closed access “ARIES: A Transaction Recovery Method”. Even closed access articles, which hide the article from unauthorized users, allow users to view the comments and annotations!

“Gates Foundation to require immediate free access for journal articles” / Jonathan Rochkind

http://news.sciencemag.org/funding/2014/11/gates-foundation-require-immediate-free-access-journal-articles

Gates Foundation to require immediate free access for journal articles

By Jocelyn Kaiser 21 November 2014 1:30 pm

Breaking new ground for the open-access movement, the Bill & Melinda Gates Foundation, a major funder of global health research, plans to require that the researchers it funds publish only in immediate open-access journals.

The policy doesn’t kick in until January 2017; until then, grantees can publish in subscription-based journals as long as their paper is freely available within 12 months. But after that, the journal must be open access, meaning papers are free for anyone to read immediately upon publication. Articles must also be published with a license that allows anyone to freely reuse and distribute the material. And the underlying data must be freely available.

 

Is this going to work? Will researchers be able to comply with these requirements without harm to their careers?  Does the Gates Foundation fund enough research that new open access venues will open up to publish this research (and if so how will their operation be funded?), or do sufficient venues already exist? Will Gates Foundation grants include funding for “gold” open access fees?

I am interested to find out. I hope this article is accurate about what they’re doing, and am glad they are doing it if so.

The Gates Foundation’s own announcement appears to be here, and their policy, which doesn’t answer very many questions but does seem to be bold and without wiggle-room, is here.

I note that the policy mentions “including any underlying data sets.” Do they really mean to be saying that underlying data sets used for all publications “funded, in whole or in part, by the foundation” must be published? I hope so. Requiring “underlying data sets” to be available at all is in some ways just as big as, or bigger than, requiring them to be available open access.


Filed under: General

BitCurator Users Forum / FOSS4Lib Upcoming Events

Date: Friday, January 9, 2015 - 08:00 to 17:00

Join BitCurator users from around the globe for a hands-on day focused on current use and future development of the BitCurator digital software environment. Hosted by the BitCurator Consortium (BCC), this event will be grounded in the practical, boots-on-the-ground experiences of digital archivists and curators. Come wrestle with current challenges—engage in disc image format debates, investigate emerging BitCurator integrations and workflows, and discuss the “now what” of handling your digital forensics outputs.

What languages do public library collections speak? / HangingTogether

Slate recently published a series of maps illustrating the languages other than English spoken in each of the fifty US states. In nearly every state, the most commonly spoken non-English language was Spanish. But when Spanish is excluded as well as English, a much more diverse – and sometimes surprising – landscape of languages is revealed, including Tagalog in California, Vietnamese in Oklahoma, and Portuguese in Massachusetts.

Public library collections often reflect the attributes and interests of the communities in which they are embedded. So we might expect that public library collections in a given state will include relatively high quantities of materials published in the languages most commonly spoken by residents of the state. We can put this hypothesis to the test by examining data from WorldCat, the world’s largest bibliographic database.

WorldCat contains bibliographic data on more than 300 million titles held by thousands of libraries worldwide. For our purposes, we can filter WorldCat down to the materials held by US public libraries, which can then be divided into fifty “buckets” representing the materials held by public libraries in each state. By examining the contents of each bucket, we can determine the most common language other than English found within the collections of public libraries in each state:

MAP 1: Most common language other than English found in public library collections, by state

As with the Slate findings regarding spoken languages, we find that in nearly every state, the most common non-English language in public library collections is Spanish. There are exceptions: French is the most common non-English language in public library collections in Massachusetts, Maine, Rhode Island, and Vermont, while German prevails in Ohio. The results for Maine and Vermont complement Slate’s finding that French is the most commonly spoken non-English language in those states – probably a consequence of Maine and Vermont’s shared borders with French-speaking Canada. The prominence of German-language materials in Ohio public libraries correlates with the fact that Ohio’s largest ancestry group is German, accounting for more than a quarter of the state’s population.

Following Slate’s example, we can look for more diverse language patterns by identifying the most common language other than English and Spanish in each state’s public library collections:

MAP 2: Most common language other than English and Spanish found in public library collections, by state

Excluding both English- and Spanish-language materials reveals a more diverse distribution of languages across the states. But only a bit more diverse: French now predominates, representing the most common language other than English and Spanish in public library collections in 32 of the 50 states. Moreover, we find only limited correlation with Slate’s findings regarding spoken languages. In some states, the most common non-English, non-Spanish spoken language does match the most common non-English, non-Spanish language in public library collections – for example, Polish in Illinois, Chinese in New York, and German in Wisconsin. But only about a quarter of the states (12) match in this way; the majority do not. Why is this so? Perhaps materials published in certain languages have low availability in the US, are costly to acquire, or both. Maybe other priorities drive collecting activity in non-English materials – for example, a need to collect materials in languages that are commonly taught in primary, secondary, and post-secondary education, such as French, Spanish, or German.
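
For the curious, the determination behind Maps 1 and 2 boils down to a ranked count with a couple of exclusions. A rough Python sketch, using invented sample tallies rather than actual WorldCat data:

from collections import Counter

# Invented sample data: per-state tallies of public library holdings by
# MARC language code, standing in for the state-by-state WorldCat "buckets".
holdings = {
    "MA": Counter({"eng": 900000, "fre": 42000, "spa": 39000, "por": 11000}),
    "OH": Counter({"eng": 880000, "ger": 30000, "spa": 28000, "fre": 9000, "por": 500}),
}

def top_language(counts, exclude=("eng",)):
    """Most common language in a state's holdings, skipping excluded codes."""
    for lang, _ in counts.most_common():
        if lang not in exclude:
            return lang
    return None

for state, counts in sorted(holdings.items()):
    print(state, top_language(counts))                  # Map 1: non-English
    print(state, top_language(counts, ("eng", "spa")))  # Map 2: non-English, non-Spanish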

Or perhaps a ranking of languages by simple counts of materials is not the right metric. Another way to assess if a state’s public libraries tailor their collections to the languages commonly spoken by state residents is to compare collections across states. If a language is commonly spoken among residents of a particular state, we might expect that public libraries in that state will collect more materials in that language compared to other states, even if the sum total of that collecting activity is not sufficient to rank the language among the state’s most commonly collected languages (for reasons such as those mentioned above). And indeed, for a handful of states, this metric works well: for example, the most commonly spoken language in Florida after English and Spanish is French Creole, which ranks as the 38th most common language collected by public libraries in the state. But Florida ranks first among all states in the total number of French Creole-language materials held by public libraries.

But here we run into another problem: the great disparity in size, population, and ultimately, number of public libraries, across the states. While a state’s public libraries may collect heavily in a particular language relative to other languages, this may not be enough to earn a high national ranking in terms of the raw number of materials collected in that language. A large, populous state, by sheer weight of numbers, may eclipse a small state’s collecting activity in a particular language, even if the large state’s holdings in the language are proportionately less compared to the smaller state. For example, California – the largest state in the US by population – ranks first in total public library holdings of Tagalog-language materials; Tagalog is California’s most commonly spoken language after English and Spanish. But surveying the languages appearing in Map 2 (that is, those that are the most commonly spoken language other than English and Spanish in at least one state), it turns out that California also ranks first in total public library holdings for Arabic, Chinese, Dakota, French, Italian, Korean, Portuguese, Russian, and Vietnamese.

To control for this “large state problem”, we can abandon absolute totals as a benchmark, and instead compare the ranking of a particular language in the collections of a state’s public libraries to the average ranking for that language across all states (more specifically, those states that have public library holdings in that language). We would expect that states with a significant population speaking the language in question would have a state-wide ranking for that language that exceeds the national average. For example, Vietnamese is the most commonly spoken language in Texas other than English and Spanish. Vietnamese ranks fourth (by total number of materials) among all languages appearing in Texas public library collections; the average ranking for Vietnamese across all states that have collected materials in that language is thirteen. As we noted above, California has the most Vietnamese-language materials in its public library collections, but Vietnamese ranks only eighth in that state.
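
The ranking comparison is equally simple to express. Continuing the invented holdings data from the sketch above (again, not real WorldCat numbers), a state "collects heavily" in a language when its in-state rank beats the national average rank:

def language_rank(counts, language):
    """1-based rank of a language within one state's holdings, by count."""
    ordered = [lang for lang, _ in counts.most_common()]
    return ordered.index(language) + 1 if language in ordered else None

def national_average_rank(holdings, language):
    """Mean rank of a language across the states that hold it at all."""
    ranks = [language_rank(counts, language) for counts in holdings.values()]
    ranks = [r for r in ranks if r is not None]
    return sum(ranks) / float(len(ranks)) if ranks else None

# A lower rank number is better, so "collects heavily" means the state's
# rank is smaller than the average rank across all holding states.
state, language = "MA", "por"
print(language_rank(holdings[state], language) < national_average_rank(holdings, language))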

Map 3 shows the comparison of the state-wide ranking with the national average for the most commonly spoken language other than English and Spanish in each state:

MAP 3: Comparison of state-wide ranking with national average for most commonly spoken language other than English and Spanish

Now it appears we have stronger evidence that public libraries tend to collect heavily in languages commonly spoken by state residents. In thirty-eight states (colored green), the state-wide ranking of the most commonly spoken language other than English and Spanish in public library collections exceeds – often substantially – the average ranking for that language across all states. For example, the most commonly spoken non-English, non-Spanish language in Alaska – Yupik – is only the 10th most common language found in the collections of Alaska’s public libraries. However, this ranking is well above the national average for Yupik (182nd). In other words, Yupik is considerably more prominent in the materials held by Alaskan public libraries than in the nation at large – in the same way that Yupik is relatively more common as a spoken language in Alaska than elsewhere.

As Map 3 shows, six states (colored orange) exhibit a ranking equal to the national average; in all of these cases the language in question is French or German, languages that tend to be highly collected everywhere (the average ranking for French is four, and for German, five). Five states (colored red) exhibit a ranking that is below the national average; in four of the five cases, the state ranking is only one notch below the national average.

The high correlation between languages commonly spoken in a state, and the languages commonly found within that state’s public library collections suggests that public libraries are not homogenous, but in many ways reflect the characteristics and interests of local communities. It also highlights the important service public libraries provide in facilitating information access to community members who may not speak or read English fluently. Finally, public libraries’ collecting activity across a wide range of non-English language materials suggests the importance of these collections in the context of the broader system-wide library resource. Some non-English language materials in public library collections – perhaps the French Creole-language materials in Florida’s public libraries, or the Yupik-language materials in Alaska’s public libraries – could be rare and potentially valuable items that are not readily available in other parts of the country.

Visit your local public library … you may find some unexpected languages on the shelf.

Acknowledgement: Thanks to OCLC Research colleague JD Shipengrover for creating the maps.

Note on data: Data used in this analysis represent public library collections as they are cataloged in WorldCat. Data is current as of July 2013. Reported results may be impacted by WorldCat’s coverage of public libraries in a particular state.

 

About Brian Lavoie

Brian Lavoie is a Research Scientist in OCLC Research. Brian's research interests include collective collections, the system-wide organization of library resources, and digital preservation.

Multi-Entity Models.... Baker, Coyle, Petiya / Karen Coyle

Multi-Entity Models of Resource Description in the Semantic Web: A comparison of FRBR, RDA, and BIBFRAME
by Tom Baker, Karen Coyle, Sean Petiya
Published in: Library Hi Tech, v. 32, n. 4, 2014 pp 562-582 DOI:10.1108/LHT-08-2014-0081
Open Access Preprint

The above article was just published in Library Hi Tech. However, because the article is a bit dense, as journal articles tend to be, here is a short description of the topic covered, plus a chance to reply to the article.

We now have a number of multi-level views of bibliographic data. There is the traditional "unit card" view, reflected in MARC, that treats all bibliographic data as a single unit. There is the FRBR four-level model that describes a single "real" item, and three levels of abstraction: manifestation, expression, and work. This is also the view taken by RDA, although employing a different set of properties to define instances of the FRBR classes. Then there is the BIBFRAME model, which has two bibliographic levels, work and instance, with the physical item as an annotation on the instance.

In support of these views we have three RDF-based vocabularies:

FRBRer (using OWL)
RDA (using RDFS)
BIBFRAME (using RDFS)

The vocabularies use varying degrees of specification. FRBRer is the most detailed and strict, using OWL to define cardinality, domains and ranges, and disjointness between classes and between properties. There are, however, no sub-classes or sub-properties. BIBFRAME properties are all defined in terms of domains (classes), and there are some sub-class and sub-property relationships. RDA has a single set of classes that are derived from the FRBR entities, and each property has the domain of a single class. RDA also has a parallel vocabulary that defines no class relationships; thus, no properties in that vocabulary result in a class entailment. [1]

As I talked about in the previous blog post on classes, the meaning of classes in RDF is often misunderstood, and that is just the beginning of the confusion that surrounds these new technologies. Recently, Bernard Vatant, who is a creator of the Linked Open Vocabularies site that does a statistical analysis of the existing linked open data vocabularies and how they relate to each other, said this on the LOV Google+ group:
"...it seems that many vocabularies in LOV are either built or used (or both) as constraint and validation vocabularies in closed worlds. Which means often in radical contradiction with their declared semantics."
What Vatant is saying here is that many vocabularies that he observes use RDF in the "wrong way." One of the common "wrong ways" is to interpret the axioms that you can define in RDFS or OWL the same way you would interpret them in, say, XSD, or in a relational database design. In fact, the action of the OWL rules (originally called "constraints," which seems to have contributed to the confusion, now called "axioms") can be entirely counter-intuitive to anyone whose view of data is not formed by something called "description logic (DL)."

A simple demonstration of this, which we use in the article, is the OWL axiom for "maximum cardinality." In a non-DL programming world, you often state that a certain element in your data is limited to the number of times it can be used, such as saying that in a MARC record you can have only one 100 (main author) field. The maximum cardinality of that field is therefore "1". In your non-DL environment, a data creation application will not let you create more than one 100 field; if an application receiving data encounters a record with more than one 100 field, it will signal an error.

The semantic web, in its DL mode, draws an entirely different conclusion. The semantic web has two key principles: open world, and non-unique name. Open world means that whatever the state of the data on the web today, it may be incomplete; there can be unknowns. Therefore, you may say that you MUST have a title for every book, but if a look at your data reveals a book without a title, then your book still has a title, it is just an unknown title. That's pretty startling, but what about that 100 field? You've said that there can only be one, so what happens if there are 2 or 3 or more of them for a book? That's no problem, says OWL: the rule is that there is only one, but the non-unique name rule says that for any "thing" there can be more than one name for it. So when an OWL program [2] encounters multiple author 100 fields, it concludes that these are all different names for the same one thing, as defined by the combination of the non-unique name assumption and the maximum cardinality rule: "There can only be one, so these three must really be different names for that one." It's a bit like Alice in Wonderland, but there's science behind it.
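
A tiny experiment makes this concrete. The sketch below is a minimal illustration, assuming the rdflib and owlrl Python libraries and invented ex: resources: it declares a maximum cardinality of 1 on a mainAuthor property, asserts two authors for one book, and asks an OWL RL reasoner what follows. Instead of a validation error, we get an identity inference (OWL RL rule cls-maxc2):

import rdflib
import owlrl
from rdflib.namespace import OWL

data = """
@prefix ex:   <http://example.org/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

ex:Book a owl:Class ;
    rdfs:subClassOf [ a owl:Restriction ;
        owl:onProperty ex:mainAuthor ;
        owl:maxCardinality "1"^^xsd:nonNegativeInteger ] .

ex:b1 a ex:Book ;
    ex:mainAuthor ex:authorA, ex:authorB .
"""

g = rdflib.Graph()
g.parse(data=data, format="turtle")

# Apply the OWL RL rules to the graph; no error is raised.
owlrl.DeductiveClosure(owlrl.OWLRL_Semantics).expand(g)

for s, o in g.subject_objects(OWL.sameAs):
    if s != o:  # skip trivial reflexive sameAs triples a reasoner may add
        print(s, "sameAs", o)  # authorA and authorB are inferred to be the same thing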

What you have in your database today is a closed world, where you define what is right and wrong; where you can enforce the rule that required elements absolutely HAVE TO be there; where the forbidden is not allowed to happen. The semantic web standards are designed for the open world of the web where no one has that kind of control. Think of it this way: what if you put a document onto the open web for anyone to read, but wanted to prevent anyone from linking to it? You can't. The links that others create are beyond your control. The semantic web was developed around the idea of a web (aka a giant graph) of data. You can put your data up there or not, but once it's there it is subject to the open functionality of the web. And the standards of RDFS and OWL, which are the current standards that one uses to define semantic web data, are designed specifically for that rather chaotic information ecosystem, where, as the third main principle of the semantic web states, "anyone can say anything about anything."

I have a lot of thoughts about this conflict between the open world of the semantic web and the needs for closed world controls over data; in particular whether it really makes sense to use the same technology for both, since there is such a strong incompatibility in underlying logic of these two premises. As Vatant implies, many people creating RDF data are doing so with their minds firmly set in closed world rules, such that the actual result of applying the axioms of OWL and RDF on this data on the open web will not yield the expected closed world results.

This is what Baker, Petiya and I address in our paper, as we create examples from FRBRer, RDA in RDF, and BIBFRAME. Some of the results there will probably surprise you. If you doubt our conclusions, visit the site http://lod-lam.slis.kent.edu/wemi-rdf/ that gives more information about the tests, the data and the test results.

[1] "Entailment" means that the property does not carry with it any "classness" that would thus indicate that the resource is an instance of that class.

[2] Programs that interpret the OWL axioms are called "reasoners". There are a number of different reasoner programs available that you can call from your software, such as Pellet, Hermit, and others built into software packages like TopBraid.

A Ferguson Twitter Archive / Ed Summers

If you are interested in an update about where/how to get the data after reading this see here.

Much has been written about the significance of Twitter as the recent events in Ferguson echoed round the Web, the country, and the world. I happened to be at the Society of American Archivists meeting 5 days after Michael Brown was killed. During our panel discussion someone asked about the role that archivists should play in documenting the event.

There was wide agreement that Ferguson was a painful reminder of the type of event that archivists working to “interrogate the role of power, ethics, and regulation in information systems” should be documenting. But what to do? Unfortunately we didn’t have time to really discuss exactly how this agreement translated into action.

Fortunately the very next day the Archive-It service run by the Internet Archive announced that they were collecting seed URLs for a Web archive related to Ferguson. It was only then, after also having finally read Zeynep Tufekci‘s terrific Medium post, that I slapped myself on the forehead … of course, we should try to archive the tweets. Ideally there would be a “we” but the reality was it was just “me”. Still, it seemed worth seeing how much I could get done.

twarc

I had some previous experience archiving tweets related to Aaron Swartz using Twitter’s search API. (Full disclosure: I also worked on the Twitter archiving project at the Library of Congress, but did not use any of that code or data then, or now.) I wrote a small Python command line program named twarc (a portmanteau for Twitter Archive), to help manage the archiving.

You give twarc a search query term, and it will plod through the search results, in reverse chronological order (the order that they are returned in), while handling quota limits, and writing out line-oriented-json, where each line is a complete tweet. It worked quite well to collect 630,000 tweets mentioning “aaronsw”, but I was starting late out of the gate, 6 days after the events in Ferguson began. One downside to twarc is it is completely dependent on Twitter’s search API, which only returns results for the past week or so. You can search back further in Twitter’s Web app, but that seems to be a privileged client. I can’t seem to convince the API to keep going back in time past a week or so.

So time was of the essence. I started up twarc searching for all tweets that mention ferguson, but quickly realized that the volume of tweets and the order of the search results meant that I wouldn’t be able to retrieve the earliest tweets. So I tried to guesstimate a Twitter ID far enough back in time to use with twarc’s --max_id parameter to limit the initial query to tweets before that point in time. Doing this I was able to get back to 2014-08-10 22:44:43 — most of August 9th and 10th had slipped out of the window. I used a similar technique of guessing an ID further in the future in combination with the --since_id parameter to start collecting from where that snapshot left off. This resulted in a bit of a fragmented record, which is visualized (sort of) in the original post.
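
The guesstimating works because Twitter’s Snowflake IDs embed a millisecond timestamp in their high-order bits. A rough sketch of the arithmetic, assuming the standard Snowflake layout (22 low bits of worker and sequence data below the timestamp, counted from the Twitter epoch of 1288834974657 ms):

from datetime import datetime, timezone

TWITTER_EPOCH_MS = 1288834974657  # start of the Snowflake clock (2010-11-04)

def guess_tweet_id(dt):
    """Approximate the smallest Snowflake ID minted at UTC datetime `dt`."""
    ms = int(dt.timestamp() * 1000) - TWITTER_EPOCH_MS
    return ms << 22  # timestamp sits above 22 bits of worker/sequence data

# e.g. a --max_id cutoff near the start of August 10, 2014 (UTC):
print(guess_tweet_id(datetime(2014, 8, 10, tzinfo=timezone.utc)))

Feeding the result to --max_id restricts a search to tweets created before that moment, give or take clock skew.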

In the end I collected 13,480,000 tweets (63G of JSON) between August 10th and August 27th. There were some gaps because of mismanagement of twarc, and the data just moving too fast for me to recover from them: most of August 13th is missing, as well as part of August 22nd. I’ll know better next time how to manage this higher volume collection.

Apart from the data, a nice side effect of this work is that I fixed a socket timeout error in twarc that I hadn’t noticed before. I also refactored it a bit so I could use it programmatically like a library instead of only as a command line tool. This allowed me to write a program to archive the tweets, incrementing the max_id and since_id values automatically. The longer continuous crawls near the end are the result of using twarc more as a library from another program.
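
The incrementing amounts to a high-water-mark loop. Here is a schematic sketch of the pattern; the search() argument is a hypothetical stand-in for twarc’s programmatic interface (not its actual API), assumed to yield complete tweet dicts newest-first:

import json
import time

def archive(search, query, outfile, since_id=None, pause=60):
    """Poll repeatedly, writing line-oriented JSON and advancing since_id.

    `search(query, since_id=...)` is a hypothetical stand-in for a
    library-style twarc search generator, yielding tweets newest-first."""
    while True:
        newest = since_id
        for tweet in search(query, since_id=since_id):
            tweet_id = int(tweet["id_str"])
            if newest is None or tweet_id > newest:
                newest = tweet_id              # remember the high-water mark
            outfile.write(json.dumps(tweet) + "\n")
        since_id = newest                      # next pass fetches only newer tweets
        time.sleep(pause)                      # be polite between passes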

Bag of Tweets

To try to arrange/package the data a bit I decided to order all the tweets by tweet id, and split them up into gzipped files of 1 million tweets each. Sorting 13 million tweets was pretty easy using leveldb. I first loaded all 16 million tweets into the db, using the tweet id as the key, and the JSON string as the value.

import json
import fileinput
import leveldb  # py-leveldb bindings

# open (or create) the LevelDB database on disk
db = leveldb.LevelDB('./tweets.db')

# read line-oriented JSON from stdin/files and store each tweet keyed by
# its id, so LevelDB keeps the records sorted by key
for line in fileinput.input():
    tweet = json.loads(line)
    db.Put(tweet['id_str'], line)

This took almost 2 hours on a medium ec2 instance. Then I walked the leveldb index, writing out the JSON as I went, which took 35 minutes:

import leveldb  # py-leveldb bindings

db = leveldb.LevelDB('./tweets.db')

# walk the full key range in sorted (tweet id) order and write each stored
# JSON line back out (Python 2 print; the trailing comma avoids doubling
# the newline already stored in the value)
for k, v in db.RangeIter(None, include_value=True):
    print v,

After splitting them up into 1-million-line files with split and gzipping them, I put them in a Bag and uploaded it to s3 (8.5G).
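
If you are wondering what the Bag step involves, the bagit-python library does the heavy lifting. A minimal sketch (the directory name is made up):

import bagit

# Turn the directory of gzipped, million-tweet chunks into a BagIt bag:
# the files are moved under data/ and checksums plus bag-info.txt are written.
bag = bagit.make_bag("ferguson-tweets")
print(bag.is_valid())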

I am planning on trying to extract URLs from the tweets to try to come up with a list of seed URLs for the Archive-It crawl. If you have ideas of how to use it definitely get in touch. I haven’t decided yet if/where to host the data publicly. If you have ideas please get in touch about that too!

Top Tech Trends: Call For Panelists / LITA

What technology are you watching on the horizon? Have you seen brilliant ideas that need exposing? Do you really like sharing with your LITA colleagues?

The LITA Top Tech Trends Committee is trying a new process this year and issuing a Call for Panelists. Answer the short questionnaire by 12/10 to be considered. Fresh faces and diverse panelists are especially encouraged to respond. Past presentations can be viewed at http://www.ala.org/lita/ttt.

Here’s the link:
https://docs.google.com/forms/d/1JH6qJItEAtQS_ChCcFKpS9xqPsFEUz52wQxwieBMC9w/viewform

If you have additional questions check with Emily Morton-Owens, Chair of the Top Tech Trends committee: emily.morton.owens@gmail.com

Opportunity knocks: Take the HHI 2014 National Collections Care Survey / District Dispatch

Help preserve our shared heritage, increase funding for conservation, and strengthen collections care by completing the Heritage Health Information (HHI) 2014 National Collections Care Survey. The HHI 2014 is a national survey on the condition of collections held by archives, libraries, historical societies, museums, scientific research collections, and archaeological repositories. It is the only comprehensive survey to collect data on the condition and preservation needs of our nation’s collections.

The deadline for the Heritage Health Information 2014: A National Collections Care Survey is December 19, 2014. In October, Heritage Health Information sent invitations to the directors of over 14,000 collecting institutions across the country to participate in the survey. These invitations included personalized login information, which may be entered at hhi2014.com.

Questions about the survey may be directed to hhi2014survey [at] heritagepreservation [dot] org or 202-233-0824.

The post Opportunity knocks: Take the HHI 2014 National Collections Care Survey appeared first on District Dispatch.

Archive webinar available: Giving legal advice to patrons / District Dispatch

Reference librarian assisting readers. Photo by the Library of Congress.

An archive of the free webinar “Lib2Gov.org: Connecting Patrons with Legal Information” is now available. Hosted jointly by the American Library Association (ALA) and iPAC, the webinar was designed to help library reference staff build confidence in responding to legal inquiries.

The session offers information on laws, legal resources and legal reference practices. Participants will learn how to handle a law reference interview, including where to draw the line between information and advice, key legal vocabulary and citation formats. During the webinar, leaders offer tips on how to assess and choose legal resources for patrons.

Speaker:

Catherine McGuire is the head of Reference and Outreach at the Maryland State Law Library. McGuire currently plans and presents educational programs to Judiciary staff, local attorneys, public library staff and members of the public on subjects related to legal research and reference. She serves as Vice Chair of the Conference of Maryland Court Law Library Directors and the co-chair of the Education Committee of the Legal Information Services to the Public Special Interest Section (LISP-SIS) of the American Association of Law Libraries (AALL).

Watch the webinar

The post Archive webinar available: Giving legal advice to patrons appeared first on District Dispatch.

Islandora Show and Tell: Fundación Juan March / Islandora

A couple of weeks ago we kicked off Islandora Show and Tell by looking at a newly launched site: Barnard Digital Collection. This week, we're going to take a look at a long-standing Islandora site that has been one of our standard answers when someone asks "What's a great Islandora site?" - Fundación Juan March, which will, to our great fortune, be the host of the next European Islandora Camp, set for May 27 - 29, 2015.

It was a foregone conclusion that once we launched this series, we would be featuring FJM sooner rather than later, but it happens that we're visiting them just as they have launched a new collection: La saga Fernández-Shaw y el teatro lírico, containing three archives of a family of Spanish playwrights. This collection is also a great example of why we love this site: innovative browsing tools such as a timeline viewer, carefully curated collections spanning a wide variety of object types living side-by-side (the Knowledge Portal approach really makes this work), and seamless multi-language support.

FJM was also highlighted by D-LIB Magazine this month, as their Featured Digital Collection, a well-deserved honour that explores their collections and past projects in greater depth.

But are there cats? There are. Of course when running my standard generic Islandora repo search term, it helps to acknowledge that this is a collection of Spanish cultural works and go looking for gatos, which leads to Venta de los gatos (Sale of Cats), Orientaçao dos gatos (Orientation of Cats), Todos los gatos son pardos (All Cats are Grey). 

Curious about the code behind this repo? FJM has been kind enough to share the details of a number of their initial collections on GitHub. Since they take the approach of using .NET for the web interface instead of using Drupal, the FJM .Net Library may also prove useful to anyone exploring alternate front-ends for their own collections.

Our Show and tell interview was completed by Luis Martínez Uribe, who will be joining us at Islandora Camp in Madrid as an instructor in the Admin Track in May 2015.


What is the primary purpose of your repository? Who is the intended audience?

We have always said that more than a technical system, the FJM digital repository tries to bring in a new working culture. Since the Islandora deployment, the repository has been instrumental in transforming the way in which data is generated and looked after across the organization. Thus the main purpose behind our repository philosophy is to take an active approach to ensure that our organizational data is managed using appropriate standards, made available via knowledge portals and preserved for future access.

The contents are highly heterogeneous, with materials from the departments of Art, Music, Conferences, a Library of Spanish Music and Theatre, as well as various outputs from scientific centres and scholarships. Therefore the audience ranges from the general public interested in particular art exhibitions, concerts or lectures to the highly specialised researchers in fields such as theatre, sociology or biology.

Why did you choose Islandora?

Back in 2010 the FJM was looking for a robust and flexible repository framework to manage an increasing volume of interrelated digital materials. With preservation in mind, the other most important aspect was the capacity to create complex models to accommodate relations between diverse types of content from multiple sources such as databases, the library catalogue, etc. Islandora provided the flexibility of Fedora plus easy customization powered by Drupal. Furthermore, discoverygarden could kick start us with their services and having Mark Leggott leading the project provided us with the confidence that our library needs and setting would be well understood.

Which modules or solution packs are most important to your repository?

In our latest collections we mostly use Drupal for prototyping. For this reason modules such as the Islandora Solr Client, the PDF Solution Pack or the Book Module are rather useful components to help us test and correct our collections once ingested and before the web layer is deployed.

What feature of your repository are you most proud of?

We like to be able to present the information through easy-to-grasp visualizations and have used timelines and maps in the past. In addition to this, we have started exploring the use of recommendation systems that, once an object is selected, suggest other materials of interest. This has been used in production in “All our art catalogues since 1973”.

Who built/developed/designed your repository (i.e, who was on the team?)

Driven by the FJM Library, Islandora was initially set up at FJM with help from discoverygarden, and the first four collections (CLAMOR, CEACS IR, Archive of Joaquín Turina, Archive of Antonia Mercé) were developed in the first year.

After that, the Library and IT Services undertook the development of a small and simple collection of essays to then move into a more complex product like the Personal Library of Cortazar that required more advanced work from web programmers and designers.

In the last year, we have developed a .NET library that allows us to interact with the Islandora components such as Fedora, Solr or RISearch. Since then we have undertaken more complex interdepartmental ventures like the collection “All our art catalogues since 1973”, where Library, IT and the web team have worked with colleagues in other departments such as digitisation, art and design.

In addition to this we have also kept working on Library collections with help from IT like Sim Sala Bim Library of Illusionism or our latest collection “La Saga de los Fernández Shaw” which merges three different archives with information managed in Archivist Toolkit.

Do you have plans to expand your site in the future?

The knowledge portals developed using Islandora have been well received both internally and externally with many visitors. We plan to expand the collections with many more materials as well as using the repository to host the authority index and the thesaurus collections for the FJM. This will continue our work to ensure that the FJM digital materials are managed, connected and preserved.

What is your favourite object in your collection to show off?

This is a hard one, but if we have to choose a favourite object it would probably be a resource like The Avant-Garde Applied (1890-1950) art catalogue. The catalogue is presented with photos of the spine and back cover, along with other editions and related catalogues, in a responsive web design with a multi-device progressive-loading viewer.


Our thanks to Luis and to FJM for agreeing to this feature. To learn more about their approach to Islandora, you can query the source by attending Islandora Camp EU2.

5 Tech Tools to be Thankful For / LITA

In honor of Thanksgiving, I’d like to give thanks for 5 tech tools that make life as a librarian much easier.

Google Drive
On any given day I work on at least 6 different computers and tablets. That means I need instant access to my documents wherever I go and without cloud storage I’d be lost. While there are plenty of other free file hosting services, I like Drive the most because it offers 15GB of free storage and it’s incredibly easy to use. When I’m working with patrons who already have a Gmail account, setting up Drive is just a click away.

Libib
I dabbled in Goodreads for a bit, but I must say, Libib has won me over. Libib lets you catalog your personal library and share your favorite media with others. While it doesn’t handle images quite as well as Goodreads, I much prefer Libib’s sleek and modern interface. Instead of cataloging books that I own, I’m currently using Libib to create a list of my favorite children’s books to recommend to patrons.

Hopscotch
Hopscotch is my favorite iOS app right now. With Hopscotch, you can learn the fundamentals of coding through play. The app is marketed towards kids, but I think the bubbly characters and lighthearted nature appeal to adults too. I’m using Hopscotch in an upcoming adult program at the library to show that coding can be quirky and fun. If you want to use Hopscotch at your library, check out their resources for teachers. They’ve got fantastic ready-made lesson plans for the taking.

Adobe Illustrator
My love affair with Photoshop started many years ago, but as I’ve gotten older, Illustrator and I have become a much better match. I use Illustrator to create flyers, posters, and templates for computer class handouts. The best thing about Illustrator is that it’s designed for working with vector graphics. That means I can easily translate a design for a 6-inch bookmark into a 6-foot poster without losing image quality.

Twitter
Twitter is hands-down my social network of choice. My account is purely for library-related stuff and I know I can count on Twitter to pick me up and get me inspired when I’m running out of steam. Thanks to all the libraries and librarians who keep me going!

What tech tools are you thankful for? Please share in the comments!

The mob that feeds / DPLA

When Boston Public Library first designed its statewide digitization service plan as an LSTA-funded grant project in 2010, we offered free imaging to any institution that agreed to make their digitized collections available through the Digital Commonwealth repository and portal system. We hoped and suggested that money not spent by our partners on scanning might then be invested in the other side of any good digital object – descriptive metadata. We envisioned a resurgence of special collections cataloging in libraries, archives, and historical societies across Massachusetts.

After a couple of years, reality set in. Most of our partners did not have the resources to generate good descriptive records structured well enough to fit into our MODS application profile without major oversight and intervention on our part. What we did find, however, were some very dedicated and knowledgeable local historians, librarians, and archivists who maintained a variety of documentation that could be best described as “pre-metadata.” Their local landscapes included inventories, spreadsheets, caption files, finding aids, catalog cards, sleeve inscriptions, dusty three-ring binders – the rich soil from which good metadata grows.


We understood it was now our job to cultivate and harvest metadata from these local sources. And thus the “Metadata Mob” was born. It is a fun and creative type of mob — less roughneck and more spontaneous dance routine. Except, instead of wildly cavorting to Do-Re-Mi in train stations, we cut-and-paste, we transcribe, we script, we spell check, we authorize, we regularize, we refine, we edit, and we enhance. It is a highly customized, hands-on process that differs slightly (or significantly) from collection to collection, institution to institution.

In many ways, the work Boston Public Library does has come to resemble the locally-sourced food movement in that we focus on how each community understands and represents their collections in their own unique way. Free-range metadata, so to speak, that we unearth after plowing through the annals of our partners.


Randall Harrow, 1870-1900. Boston Public Library via Digital Commonwealth.

We don’t impose our structures or processes on anyone beyond offering advice on some standard information science principles – the three major “food groups” of metadata as it were – well defined schema, authority control, and content standard compliance. We encourage our partners to maintain their local practices.

We then carefully nurture their information into healthy, juicy, and delicious metadata records that we can ingest into the Digital Commonwealth repository. We have all encountered online resources with weak and frail frames — malnourished with a few inconsistently used Dublin Core fields and factory-farmed values imported blindly from collection records or poorly conceived legacy projects. Our mob members eschew this technique. They are craftsmen, artisans, information viticulturists. If digital library systems are nourished by the metadata they ingest, then ours will be kept vigorous and healthy with the rich diet they have produced.

 

Thanks to SEMAP for use of their logo in the header image. Check out SEMAP’s very informative website at semaponline.org. Buy Fresh, Buy Local! Photo credit: Lori De Santis.


All written content on this blog is made available under a Creative Commons Attribution 4.0 International License. All images found on this blog are available under the specific license(s) attributed to them, unless otherwise noted.

Tutorial: Set Up Your Own DSpace Development Environment / DuraSpace News

From Bram Luyten, @mire

With the DSpace 5 release coming up, we wanted to make it easier for aspiring developers to get up and running with DSpace development. In our experience, starting off on the right foot with a proven set of tools and practices can reduce someone’s learning curve and help in quickly getting to initial results. IntelliJ IDEA 13, the integrated development environment from JetBrains, can make a developer’s life a lot easier thanks to a truckload of features that are not included in your run-of-the-mill text editor.

Online Bookstores to Face Stringent Privacy Law in New Jersey / Eric Hellman

Before you read this post, be aware that this web page is sharing your usage with Google, Facebook, StatCounter.com, unglue.it and Harlequin.com. Google because this is Blogger, Facebook because there’s a “Like” button, StatCounter because I use it to measure usage, and Harlequin because I embedded the cover for Rebecca Avery’s Maid to Crave directly from Harlequin’s website. Harlequin’s web server has been sent the address of this page along with your IP address as part of the HTTP transaction that fetches the image, which, to be clear, is not a picture of me.

I'm pretty sure that having read the first paragraph, you're now able to give informed consent if I try to sell you a book (see unglue.it embed -->) and constitute myself as a book service for the purposes of a New Jersey "Reader Privacy Act", currently awaiting Governor Christie's signature. (Update Nov 22: Gov. Christie has conditionally vetoed the bill.) That act would make it unlawful to share information about your book use (borrowing, downloading, buying, reading, etc.) with a third party, in the absence of a court order to do so. That's good for your reading privacy, but a real problem for almost anyone running a commercial "book service".

Let's use Maid to Crave as an example. When you click on the link, your browser first sends a request to Harlequin.com. Using the instructions in the returned HTML, it then sends requests to a bunch of web servers to build the web page, complete with images, reviews and buy links. Here's the list of hosts contacted as my browser builds that page:

  • www.harlequin.com
  • stats.harlequin.com
  • seal.verisign.com (A security company)
  • www.goodreads.com  (The review comes from GoodReads. They're owned by Amazon.)
  • seal.websecurity.norton.com (Another security company)
  • www.google-analytics.com
  • www.googletagservices.com
  • stats.g.doubleclick.net (Doubleclick is an advertising network owned by Google)
  • partner.googleadservices.com
  • tpc.googlesyndication.com
  • cdn.gigya.com (Gigya’s Consumer Identity Management platform helps businesses identify consumers across any device, achieve a single customer view by collecting and consolidating profile and activity data, and tap into first-party data to reach customers with more personalized marketing messaging.)
  • cdn1.gigya.com
  • cdn2.gigya.com
  • cdn3.gigya.com
  • comments.us1.gigya.com
  • gscounters.us1.gigya.com
  • www.facebook.com (I'm told this is a social network)
  • connect.facebook.net
  • static.ak.facebook.com
  • s-static.ak.facebook.com
  • fbstatic-a.akamaihd.net (Akamai is here helping to distribute facebook content)
  • platform.twitter.com (yet another social network)
  • syndication.twitter.com
  • cdn.api.twitter.com
  • edge.quantserve.com (QuantCast is an "audience research and behavioural advertising company")

All of these servers are given my IP address and the URL of the Harlequin page that I'm viewing. All of these companies except Verisign, Norton and Akamai also set tracking cookies that enable them to connect my browsing of the Harlequin site with my activity all over the web. The Guardian has a nice overview of these companies that track your use of the web. Most of them exist to better target ads at you. So don't be surprised if, once you've visited Harlequin, Amazon tries to sell you romance novels.
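To make that mechanism concrete, here is a minimal sketch (Python standard library only) of how one might list the hosts of the resources a page statically embeds, which is the same mechanism that hands each of those hosts your IP address and the page URL. It only sees resources present in the fetched HTML; trackers injected later by JavaScript, as many ad and analytics tags are, will not show up, and the Harlequin URL is just an illustrative starting point.

from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class ResourceHosts(HTMLParser):
    """Collect the hostnames of statically embedded images, scripts, iframes and stylesheets."""
    def __init__(self, base_url):
        super().__init__()
        self.base = base_url
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag in ("img", "script", "iframe", "link"):
            for name, value in attrs:
                if name in ("src", "href") and value:
                    host = urlparse(urljoin(self.base, value)).hostname
                    if host:
                        self.hosts.add(host)

page_url = "https://www.harlequin.com/"          # illustrative starting point
first_party = urlparse(page_url).hostname
html = urlopen(page_url).read().decode("utf-8", errors="replace")

parser = ResourceHosts(page_url)
parser.feed(html)

for host in sorted(parser.hosts):
    if host != first_party:
        # each of these hosts receives your IP address and a Referer pointing at the page
        print("third-party resource host:", host)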

Certainly Harlequin qualifies as a commercial book service under the New Jersey law. And certainly Harlequin is giving personal information (IP addresses are personal information under the law) to a bunch of private entities without a court order. And most certainly it is doing so without informed consent. So its website is doing things that will be unlawful under the New Jersey law.

But it's not alone. Almost any online bookseller uses services like those used by Harlequin. Even Amazon, which is pretty much self-contained, has to send your personal information to Ingram to fulfill many of the book orders sent to it. Under the New Jersey law, it appears that Amazon will need to get your informed consent to have Ingram send you a book. And really, do I care? Does this improve my reading privacy?

The companies that can ignore this law are Apple, Target, Walmart and the like. Book services are exempt if they derive less than 2% of their US consumer revenue from books. So yay Apple.

Other internet book services will likely respond to the law with pop-up legal notices like those you occasionally see on sites trying to comply with European privacy laws. "This site uses cookies to improve your browsing experience. OK?" They constitute privacy theater, a stupid legal show that doesn't improve user privacy one iota.

Lord knows we need some basic rules about privacy of our reading behavior. But I think the New Jersey law does a lousy job of dealing with the realities of today's internet. I wonder if we'll ever start a real discussion about what and when things should be private on the web.

Emergency! Governor Christie Could Turn NJ Library Websites Into Law-Breakers / Eric Hellman

Nate Hoffelder over at The Digital Reader highlighted the passage of a new "Reader Privacy Act" passed by the New Jersey State Legislature. If signed by Governor Chris Christie it would take effect immediately. It was sponsored by my state senator, Nia Gill.

In light of my writing about privacy on library websites, this poorly drafted bill, though well intentioned, would turn my library's website into a law-breaker, subject to a $500 civil fine for every user. (It would also require us to make some minor changes at Unglue.it.)
  1. It defines "personal information" as "(1) any information that identifies, relates to, describes, or is associated with a particular user's use of a book service; (2) a unique identifier or Internet Protocol address, when that identifier or address is used to identify, relate to, describe, or be associated with a particular user, as related to the user’s use of a book service, or book, in whole or in partial form; (3) any information that relates to, or is capable of being associated with, a particular book service user’s access to a book service."
  2. “Provider” means any commercial entity offering a book service to the public.
  3. A provider shall only disclose the personal information of a book service user [...] to a person or private entity pursuant to a court order in a pending action brought by [...] by the person or private entity.
  4. Any book service user aggrieved by a violation of this act may recover, in a civil action, $500 per violation and the costs of the action together with reasonable attorneys’ fees.
My library, Montclair Public Library, uses a web catalog run by Polaris, a division of Innovative Interfaces, a private entity, for BCCLS, a consortium serving northern New Jersey. Whenever I browse a catalog entry in this catalog, a cookie is set by AddThis (and probably other companies) identifying me and the web page I'm looking at. In other words, personal information as defined by the act is sent to a private entity, without a court order.

And so every user of the catalog could sue Innovative for $500 each, plus legal fees.

The only out is "if the user has given his or her informed consent to the specific disclosure for the specific purpose." Having a terms of use and a privacy policy is usually not sufficient to achieve "informed consent".

Existing library privacy laws in NJ have reasonable exceptions for "proper operations of the library". This law does not have a similar exemption.

I urge Governor Christie to veto the bill and send it back to the legislature for improvements that take account of the realities of library websites and make it easier for internet bookstores and libraries to operate legally in the Garden State.

You can contact Gov. Christie's office using this form.

Update: Just talked to one of Nia Gill's staff; they're looking into it. Also updated to include the 2nd set of amendments.

Update 2: A close reading of the California law on which the NJ statute was based reveals that poor wording in section 4 is the source of the problem. In the California law, it's clear that it pertains only to the situation where a private entity is seeking discovery in a legal action, not when the private entity is somehow involved in providing the service.

Where the NJ law reads
A provider shall only disclose the personal information of a book service user to a government entity, other than a law enforcement entity, or to a person or private entity pursuant to a court order in a pending action brought by the government entity or by the person or private entity.  
it's meant to read
In a pending action brought by the government entity other than a law enforcement entity, or by a person or by a private entity, a provider shall only disclose the personal information of a book service user to such entity or person pursuant to a court order.
Update 3 Nov 22: Governor Christie has conditionally vetoed the bill.

NJ Gov. Christie Vetoes Reader Privacy Act, Asks for Stronger, Narrower Law / Eric Hellman

According to New Jersey Governor Chris Christie's conditional veto statement, "Citizens of this State should be permitted to read what they choose without unnecessary government intrusion." It's hard to argue with that! Personally, I think we should also be permitted to read what we choose without corporate surveillance.

As previously reported in The Digital Reader, the bill passed in September by wide margins in both houses of the New Jersey State Legislature and would have codified the right to read ebooks without letting the government and everybody else know about it.

I wrote about some problems I saw with the bill. Based on a California law focused on law enforcement, the proposed NJ law added civil penalties on booksellers who disclosed the personal information of users without a court order. As I understood it, the bill could have prevented online booksellers from participating in ad networks (they all do!).

Governor Christie's veto statement pointed out more problems. The proposed law didn't explicitly prevent the government from asking for personal reading data; it just made it against the law for a bookseller to comply. So, for example, a local sheriff could still ask Amazon for a list of people in his town reading an incriminating book. If Amazon answered, somehow the reader would have to:
  1. find out that Amazon had provided the information
  2. sue Amazon for $500.
Another problem identified by Christie was that the proposed law imposed privacy burdens on booksellers stronger than those on libraries. Under another law, library records in New Jersey are subject to subpoena, but bookseller records wouldn't be. That's just bizarre.

In New Jersey, a governor can issue a "Conditional Veto". In doing so, the governor outlines changes in a bill that would allow it to become law. Christie's revisions to the Reader Privacy Act make the following changes:
  1. The civil penalties are stripped out of the bill. This allows Gov. Christie to position himself and NJ as "business-friendly".
  2. A requirement is added preventing the government from asking for reader information without a court order or subpoena. Christie gets to be on the side of liberty. Yay!
  3. It's made clear that the law applies only to government snooping, and not to promiscuous data sharing with ad networks. Christie avoids the ire of rich ad network moguls.
  4. Child porn is carved out of the definition of "books". Being tough on child pornography is one of those politically courageous positions that all politicians love.
The resulting bill, which was quickly reintroduced in the State Assembly, is stronger but narrower. It wouldn't apply in situations like the recent Adobe Digital Editions privacy breach, but it should be more effective at stopping "unnecessary government intrusion". I expect it will quickly pass the Legislature and be signed into law. A law that properly addresses the surveillance of ebook reading by private companies will be much more complicated and difficult to achieve.

I'm not a fan of his by any means, but Chris Christie's version of the Reader Privacy Act is a solid step in the right direction and would be an excellent model for other states. We could use a law like it on the national level as well.

(Guest posted at The Digital Reader)

Crossing the country / Galen Charlton

As some of you already know, Marlene and I are moving from Seattle to Atlanta in December. We’ve moved many (too many?) times before, so we’ve got most of the logistics down pat. Movers: hired! New house: rented! Mail forwarding: set up! Physical books: still too dang many!

We could do it in our sleep! (And the scary thing is, perhaps we have in the past.)

One thing that is different this time is that we’ll be driving across the country, visiting friends along the way.  3,650 miles, one car, two drivers, one Keurig, two suitcases, two sets of electronic paraphernalia, and three cats.

Cross-country route

Who wants to lay odds on how many miles it will take each day for the cats to lose their voices?

Fortunately Sophia is already testing the cats’ accommodations:

Sophie investigating the crate

I will miss the friends we made in Seattle, the summer weather, the great restaurants, being able to walk down to the water, and decent public transportation. I will also miss the drives up to Vancouver for conferences with a great bunch of librarians; I’m looking forward to attending Code4Lib BC next week, but I’m sorry that our personal tradition of American Thanksgiving in British Columbia is coming to an end.

As far as Atlanta is concerned, I am looking forward to being back in MPOW’s office, having better access to a variety of good barbecue, the winter weather, and living in an area with less de facto segregation.

It’s been a good two years in the Pacific Northwest, but much to my surprise, I’ve found that the prospect of moving back to Atlanta feels a bit like a homecoming. So, onward!

Bookmarks for November 21, 2014 / Nicole Engard

Today I found the following resources and bookmarked them.

Digest powered by RSS Digest

The post Bookmarks for November 21, 2014 appeared first on What I Learned Today....

Free webinar: The latest on Ebola / District Dispatch

Photo by Phil Moyer

As the Ebola outbreak continues, the public must sort through all of the information being disseminated via the news media and social media. In this rapidly evolving environment, librarians are providing valuable services to their communities as they assist their users in finding credible information sources on Ebola, as well as other infectious diseases.

On Friday, December 12, 2014, library leaders from the U.S. National Library of Medicine will host the free webinar “Ebola and Other Infectious Diseases: The Latest Information from the National Library of Medicine.” As a follow-up to the webinar they presented in October, librarians from the U.S. National Library of Medicine will be discussing how to provide effective services in this environment, as well as providing an update on information sources that can be of assistance to librarians.

Speakers

  • Siobhan Champ-Blackwell is a librarian with the U.S. National Library of Medicine Disaster Information Management Research Center. Champ-Blackwell selects material to be added to the NLM disaster medicine grey literature database and is responsible for the Center’s social media efforts. Champ-Blackwell has over 10 years of experience in providing training on NLM products and resources.
  • Elizabeth Norton is a librarian with the U.S. National Library of Medicine Disaster Information Management Research Center where she has been working to improve online access to disaster health information for the disaster medicine and public health workforce. Norton has presented on this topic at national and international association meetings and has provided training on disaster health information resources to first responders, educators, and librarians working with the disaster response and public health preparedness communities.

Date: December 12, 2014
Time: 2:00 PM–3:00 PM Eastern
Register for the free event

If you cannot attend this live session, a recorded archive will be available to view at your convenience. To view past webinars also done in collaboration with iPAC, please visit Lib2Gov.org.

The post Free webinar: The latest on Ebola appeared first on District Dispatch.

Library as Digital Consultancy / M. Ryan Hess

As faculty and students delve into digital scholarly works, they are tripping over the kinds of challenges that libraries specialize in overcoming, such as questions regarding digital project planning, improving discovery or using quality metadata. Indeed, nobody is better suited at helping scholars with their decisions regarding how to organize and deliver their digital works than librarians.

At my institution, we have not marketed our expertise in any meaningful way (yet), but we receive regular requests for help from faculty and campus organizations who are struggling with publishing digital scholarship. For example, a few years ago a team of librarians at my library helped researchers from the National University of Ireland, Galway migrate and restructure their online collection of annotations from the Vatican Archive to a more stable home on Omeka.net. Our expertise in metadata standards, OAI harvesting, digital collection platforms and digital project planning turned out to be invaluable in saving their dying collection and giving it a stable, long-term home. You can read more in my Saved by the Cloud post.

These kinds of requests have continued since. In recognition of this growing need, we are poised to launch a digital consultancy service on our campus.

Digital Project Planning

A core component of our jobs is planning digital projects. Over the past year, in fact, we’ve developed a standard project planning template that we apply to each digital project that comes our way. This has done wonders at keeping us all up to date on what stage each project is in and who is up next in terms of the workflow.

Researchers are often experts at planning out their papers, but they don’t normally have much experience with planning a digital project. For example, because metadata and preservation are things that normally don’t come up for them, they overlook planning around these aspects. And more generally, I’ve found that just having a template to work with can help them understand how the experts do digital projects and give them a sense of the issues they need to consider when planning their own projects, whether that’s building an online exhibit or organizing their selected works in ways that deliver the biggest bang for the buck.

We intend to begin formally offering project planning help to faculty very soon.

Platform Selection

It’s also our job to keep abreast of the various technologies available for distributing digital content, whether that is harvesting protocols, web content management systems, new plugins for WordPress or digital humanities exhibit platforms. Sometimes researchers know about some of these, but in my experience, their first choice is not necessarily the best for what they want to do.

It is fairly common for me to meet with campus partners that have an existing collection online, but which has been published in a platform that is ill-suited for what they are trying to accomplish. Currently, we have many departments moving old content based in SQL databases to plain HTML pages with no database behind them whatsoever. When I show them some of the other options, such as our Digital Commons-based institutional repository or Omeka.net, they often state they had no idea that such options existed and are very excited to work with us.

Metadata

I think people in general are becoming more aware of metadata, but there are still lots of technical considerations that your typical researcher may not be aware of. At our library, we have helped out with all aspects of metadata. We have helped researchers clean up their data to conform to authorized terms and standard vocabularies. We have explained Dublin Core. We have helped re-encode their data so that diacritics display online. We have done crosswalking and harvesting. It’s a deep area of knowledge and one that few people outside of libraries know at a suitably deep level.
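As a concrete illustration of the crosswalking work described above, here is a minimal, hypothetical sketch in Python that maps a partner’s spreadsheet columns onto simple Dublin Core elements and swaps local keywords for authorized terms. The column names, the mapping and the tiny vocabulary are all invented for illustration; a real project would follow the collection’s metadata application profile and authority files.

import csv
import io

LOCAL_TO_DC = {          # local column -> Dublin Core element (assumed mapping)
    "Title": "dc:title",
    "Photographer": "dc:creator",
    "Year": "dc:date",
    "Subject keywords": "dc:subject",
}

AUTHORIZED_SUBJECTS = {   # toy controlled vocabulary, invented for the example
    "r.r. stations": "Railroad stations",
    "railroad station": "Railroad stations",
}

def crosswalk(row):
    """Map one local spreadsheet row to a dict of simple Dublin Core elements."""
    record = {}
    for local_field, dc_element in LOCAL_TO_DC.items():
        value = (row.get(local_field) or "").strip()
        if not value:
            continue
        if dc_element == "dc:subject":
            # split multi-valued cells and map to authorized terms where known
            terms = [t.strip() for t in value.split(";") if t.strip()]
            value = [AUTHORIZED_SUBJECTS.get(t.lower(), t) for t in terms]
        record[dc_element] = value
    return record

sample = io.StringIO(
    "Title,Photographer,Year,Subject keywords\n"
    "Main Street depot,José Rivera,1912,r.r. stations; winter\n"
)
for row in csv.DictReader(sample):
    print(crosswalk(row))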

One recommendation I would share with any budding metadata consultant is that you really need to be the Carl Sagan of metadata. This is pretty technical stuff and most people don’t need all the details. Stick to discussing the final outcome rather than the technical details, and your help will be far better understood and appreciated. For example, I once presented to a room of researchers on all the technical fixes we made to a database to enhance and standardize the metadata, but this went over terribly. People later came up to me and joked that whatever it was we did, they were sure it was important, and thanked us for being there. I guess that was a good outcome since they acknowledged our contribution. But it would have been better had they understood the practical benefits for the collection and for users of that content.

SEO

Search Engine Optimization is not hard, but it is likely that few people outside of the online marketing and web design world know what it is. I often find people can understand it very quickly if you simply define it as “helping Google understand your content so it can help people find you.” Simple SEO tricks like defining and then using keywords in your headers will do wonders for your collection’s visibility in the major search engines. But you can go deep with this stuff too, so I like to gauge my audience’s appetite and provide only as much detail as I think they can digest.

Discovery

It’s a sad statement on the state of libraries, but the real discovery game is in the major search engines…not in our siloed, boutique search interfaces. Most people begin their searches (whether academic or not) in Google, and this is really bad news for our digital collections since, by and large, library collections sit in the deep web, beyond the reach of the search robots.

I recently tried a search in Google.com for the title of a digital image in one of our collections and found it. Yeah! Then I tried the same search in Google Images. No dice.

More librarians are coming to terms with this discovery problem now and we need to share this with digital scholars as they begin considering their own online collections so that they don’t make the mistakes libraries made (and continue to make…sigh) with our own collections.

We had one department at my institution that was sitting on a print journal that they were considering putting online. Behind this was a desire to bring the publication back to life, since they had been told by one researcher in Europe that she thought the journal had been discontinued years ago. Unfortunately, it was still being published; it just wasn’t being indexed in Google. We offered our repository as an excellent place to host it, especially because it would increase their visibility worldwide. They opted instead for a very small, non-profit online publisher whose content we demonstrated was not surfacing in Google or Google Scholar. Well, you can lead a horse to water…

Still, I think this kind of understanding of the discovery universe does resonate with many. Going back to our somewhat invisible digital images, we will be pushing many to social media like Flickr with the expectation that this will boost visibility in the image search engines (and social networks) and drive more traffic to our digital collections.

Usability

This one is a tough one because people often come with pre-conceived notions of how they want their content organized or the site designed. For this reason, sometimes usability advice does not go over well. But for those instances when our experiences with user studies and information architecture can influence a digital scholarship project, it’s time well spent. In fact, I often hear people remark that they “never thought of it that way” and they’re willing to try some of the expert advice that we have to share.

Such advice includes things like:

  • Best practices for writing for the web
  • Principles of information architecture
  • Responsive design
  • Accessibility support
  • User Experience design

Marketing

It’s fitting to end on marketing. This is usually the final step in any digital project and one that often gets dropped. And yet, why do all the work of creating a digital collection only to let it go unnoticed? As digital project experts, librarians are familiar with the various channels available to promote collections and build followers, from social networking sites to blogs and the like.

With our own digital projects, we discuss marketing at the very beginning so we are sure all the hooks, timing and planning considerations are understood by everyone. In fact, marketing strategy will impact some of the features of your exhibit, your choice of keywords used to help SEO, the ultimate deadlines that you set for completion and the staffing time you know you’ll need post launch to keep the buzz buzzing.

Most importantly, though, marketing plans can greatly influence the decision for which platform to use. For example, one of the benefits of Omeka.net (rather than self-hosted Omeka) is that any collection hosted with them becomes part of a network of other digital collections, boosting the potential for serendipitous discovery. I often urge faculty to opt for our Digital Commons repository over, say, their personal website, because anything they place in DC gets aggregated into the larger DC universe and has built-in marketing tools like email subscriptions and RSS feeds.

The bottom line here is that marketing is an area where librarians can shine. Online marketing of digital collections really pulls together all of the other forms of expertise that we can offer (our understanding of metadata, web technology and social networks) to fulfill the aim of every digital project: to reach other people and teach them something.


Steve Hetzler's "Touch Rate" Metric / David Rosenthal

Steve Hetzler of IBM gave a talk at the recent Storage Valley Supper Club on a new, scale-free metric for evaluating storage performance that he calls "Touch Rate". He defines this as the proportion of the store's total content that can be accessed per unit time. This leads to some very illuminating graphs that I discuss below the fold.

Steve's basic graph is a log-log plot with performance increasing up and to the right. Response time for accessing an object (think latency) decreases to the right on the X-axis, and the touch rate, the proportion of the total capacity that can be accessed by random reads in a year (think bandwidth), increases on the Y-axis. For example, a touch rate of 100/yr means that random reads could access the entire contents 100 times a year. He divides the graph into regions suited to different applications, with minimum requirements for response time and touch rate. So, for example, transaction processing requires response times below 10ms and touch rates above 100 (the average object is accessed about once every 3 days).

The touch rate depends on the size of the objects being accessed. If you take a specific storage medium, you can use its specifications to draw a curve on the graph as the size varies. Here Steve uses "capacity disk" (i.e. commodity 3.5" SATA drives) to show the typical curve, which varies from being bandwidth-limited (for large objects, the horizontal side on the left) to being response-limited (for small objects, the vertical side on the right).
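A rough sketch of how such a curve can be computed, using assumed round-number specs (a 4 TB drive, 150 MB/s sustained transfer, 15 ms average random-access time) rather than any particular datasheet: the touch rate is the data randomly readable in a year divided by the capacity, and it comes out response-limited for small objects and bandwidth-limited for large ones.

SECONDS_PER_YEAR = 365 * 24 * 3600

CAPACITY_BYTES = 4e12        # assumed 4 TB drive
BANDWIDTH_BPS = 150e6        # assumed 150 MB/s sustained transfer
RESPONSE_S = 0.015           # assumed 15 ms seek + rotational latency

def touch_rate(object_bytes):
    """Fraction of the drive's contents touchable per year with random reads."""
    time_per_access = RESPONSE_S + object_bytes / BANDWIDTH_BPS
    accesses_per_year = SECONDS_PER_YEAR / time_per_access
    return accesses_per_year * object_bytes / CAPACITY_BYTES

# small objects are limited by response time, large ones by bandwidth
for size in (4e3, 1e6, 100e6, 10e9):    # 4 KB ... 10 GB objects
    print(f"{size/1e6:>10.3f} MB objects -> touch rate {touch_rate(size):10.2f}/yr")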

As an example of the use of these graphs, Steve analyzed the idea of MAID (Massive Array of Idle Drives). He used HGST MegaScale DC 4000.B SATA drives, and assumed that at any time 10% of them would be spun-up and the rest would be in standby. With random accesses to data objects, 9 out of 10 of them will encounter a 15sec spin-up delay, which sets the response time limit. Fully powering-down the drives as Facebook's cold storage does would save more power but increase the spin-up time to 20s. The system provides only (actually somewhat less than) 10% of the bandwidth per unit content, which sets the touch rate limit.

Then Steve looked at the fine print of the drive specifications. He found two significant restrictions:
  • The drives have a life-time limit of 50K start/stop cycles.
  • For reasons that are totally opaque, the drives are limited to a total transfer of 180TB/yr.
Applying these gives this modified graph. The 180TB/yr limit is the horizontal line, reducing the touch rate for large objects. If the drives have a 4-year life, we would need 8M start/stop cycles to achieve a 15sec response time. But we only have 50K. To stay within this limit, the response time has to increase by a factor of 8M/50K, or 160, which is the vertical line. So in fact a traditional MAID system is effective only in the region below the horizontal line and left of the vertical line, much smaller than expected.
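The arithmetic behind those two limits is worth spelling out. This small sketch uses the figures quoted above (a 4-year life, 50K start/stop cycles, the 180TB/yr transfer cap) plus an assumed 4TB capacity per drive; the exact numbers are illustrative.

SECONDS_PER_YEAR = 365 * 24 * 3600
DRIVE_LIFE_YEARS = 4
CAPACITY_BYTES = 4e12        # assumed 4 TB capacity per drive

# 1. The 180 TB/yr transfer cap puts a ceiling on the touch rate,
#    whatever the object size (the horizontal line).
touch_rate_ceiling = 180e12 / CAPACITY_BYTES
print(f"touch-rate ceiling from 180 TB/yr: {touch_rate_ceiling:.0f}/yr")

# 2. The 50K start/stop budget: delivering a 15 s spin-up response to random
#    requests over the drive's life would need far more cycles than allowed,
#    so the response time must stretch by roughly that ratio (the vertical line).
cycles_for_15s = DRIVE_LIFE_YEARS * SECONDS_PER_YEAR / 15   # ~8.4M; the post rounds to 8M
slowdown = cycles_for_15s / 50_000                          # ~168; the post rounds to 160
print(f"cycles needed for 15 s response: {cycles_for_15s:.2e}")
print(f"response-time floor: ~{15 * slowdown / 60:.0f} minutes")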

This analysis suggests that traditional MAID is not significantly better than tapes in a robot. Here, for example, Steve examines configurations varying from one tape drive for 1600 LTO6 tapes, or 4PB per drive, to a quite unrealistically expensive 1 drive per 10 tapes, or 60TB per drive. Tape drives have a 120K lifetime load/unload cycle limit, and the tapes can withstand at most 260 full-file passes, so tape has a similar pair of horizontal and vertical lines.

The reason that Facebook's disk-based cold storage doesn't suffer from the same limits as traditional MAID is that it isn't doing random I/O. Facebook's system schedules I/Os so that it uses the full bandwidth of the disk array, raising the touch rate limit to that of the drives, and reducing the number of start-stop cycles. Admittedly, the response time for a random data object is now a worst-case 7 times the time for which a group of drives is active, but this is not a critical parameter for Facebook's application.

Steve's metric seems to be a major contribution to the analysis of storage systems.

Townhall, not Shopping Mall! Community, making, and the future of the Internet / Jenny Rose Halperin

I presented a version of this talk at the 2014 Futurebook Conference in London, England. They also kindly featured me in the program. Thank you to The Bookseller for a wonderful conference filled with innovation and intelligent people!

A few days ago, I was in the Bodleian Library at Oxford University, often considered the most beautiful library in the world. My enthusiastic guide told the following story:

After the Reformation (when all the books in Oxford were burned), Sir Thomas Bodley decided to create a place where people could go and access all the world’s information at their fingertips, for free.

“What does that sound like?” she asked. “…the Internet?”

While this is a lovely conceit, the part of the story that resonated with me for this talk is the other big change that Bodley made, which was to work with publishers, who were largely a monopoly at that point, to fill his library for free by turning the library into a copyright library. While this seemed antithetical to the ways that publishers worked, in giving a copy of their very expensive books away, they left an indelible and permanent mark on the face of human knowledge. It was not only preservation, but self-preservation.

Bodley was what people nowadays would probably call “an innovator” and maybe even in the parlance of my field, a “community manager.”

By thinking outside of the scheme of how publishing works, he joined together with a group of skeptics and created one of the greatest knowledge repositories in the world, one that still exists 700 years later. This speaks to a few issues:

Sharing economies, community, and publishing should and do go hand in hand and have since the birth of libraries. By stepping outside of traditional models, you are creating a world filled with limitless knowledge and crafting it in new and unexpected ways.

The bound manuscript is one of the most enduring technologies. This story remains relevant because books are still books and people are still reading them.

As the same time, things are definitely changing. For the most part, books and manuscripts were pretty much identifiable as books and manuscripts for the past 1000 years.

But what if I were to give Google Maps to a 16th Century Map Maker? Or what if I were to show Joseph Pulitzer Medium? Or what if I were to hand Gutenberg a Kindle? Or Project Gutenberg for that matter? What if I were to explain to Thomas Bodley how I shared the new Lena Dunham book with a friend by sending her the file instead of actually handing her the physical book? What if I were to try to explain Lena Dunham?

These innovations have all taken place within the last twenty years, and I would argue that we haven’t even scratched the surface in terms of the innovations that are to come.

We need to accept that the future of the printed word may vary from words on paper to an ereader or computer in 500 years, but I want to emphasize that in the 500 years to come, it will more likely vary from the ereader to a giant question mark.

International literacy rates have risen rapidly over the past 100 years and companies are scrambling to be the first to reach what they call “developing markets” in terms of connectivity. In the vein of Mark Surman’s talk at the Mozilla Festival this year, I will instead call these economies post-colonial economies.

Because we (as people of the book) are fundamentally idealists who believe that the printed word can change lives, we need to be engaged with rethinking the printed word in a way that recognizes power structures and does not settle for the limited choices that the corporate Internet provides (think Facebook vs WhatsApp). This is not a panacea to fix the world’s ills.

In the Atlantic last year, Phil Nichols wrote an excellent piece that paralleled Web literacy and early 20th century literacy movements. The dualities between “connected” and “non-connected,” he writes, impose the same kinds of binaries and blind cure-all for social ills that the “literacy” movement imposed in the early 20th century. In equating “connectedness” with opportunity, we are “hiding an ideology that is rooted in social control.”

Surman, who is director of the Mozilla Foundation, claims that the Web, which had so much potential to become a free and open virtual meeting place for communities, has started to resemble a shopping mall. While I can go there and meet with my friends, it’s still controlled by cameras that are watching my every move and its sole motive is to get me to buy things.

85 percent of North America is connected to the Internet and 40 percent of the world is connected. Connectivity has grown by 676% over the past 13 years. Studies show that literacy and connectivity go hand in hand.

How do you envision a fully connected world? How do you envision a fully literate world? How can we empower a new generation of connected communities to become learners rather than consumers?

I’m not one of these technology nuts who’s going to argue that books are going to somehow leave their containers and become networked floating apparatuses, and I’m not going to argue that the ereader is a significantly different vessel than the physical book.

I’m also not going to argue that we’re going to have a world of people who are only Web literate and not reading books in twenty years. To make any kind of future prediction would be a false prophecy, elitist, and perhaps dangerous.

Although I don’t know what the printed word will look like in the next 500 years,

I want to take a moment to think outside the book,

to think outside traditional publishing models, and to embrace the instantaneousness, randomness, and spontaneity of the Internet as it could be, not as it is now.

One way I want you to embrace the wonderful wide Web is to try to at least partially decouple your social media followers from your community.

Twitter and other forms of social media are certainly a delightful and fun way for communities to communicate and get involved, but your viral campaign, if you have it, is not your community.

True communities of practice are groups of people who come together to think beyond traditional models and innovate within a domain. For a touchstone, a community of practice is something like the Penguin Labs internal innovation center that Tom Weldon spoke about this morning and not like Penguin’s 600,000 followers on Twitter. How can we bring people together to allow for innovation, communication, and creation?

The Internet provides new and unlimited opportunities for community and innovation, but we have to start managing communities and embracing the people we touch as makers rather than simply followers or consumers.

The maker economy is here— participatory content creation has become the norm rather than the exception. You have the potential to reach and mobilize 2.1 billion people and let them tell you what they want, but you have to identify leaders and early adopters and you have to empower them.

How do you recognize the people who create content for you? I don’t mean authors, but instead the ambassadors who want to get involved and stay involved with your brand.

I want to ask you, in the spirit of innovation from the edges

What is your next platform for radical participation? How are you enabling your community to bring you to the next level? How can you differentiate your brand and make every single person you touch psyched to read your content, together? How can you create a community of practice?

Community is conversation. Your users are not your community.

Ask yourself the question Rachel Fershleiser asked when building a community on Tumblr: Are you reaching out to the people who want to hear from you and encouraging them or are you just letting your community be unplanned and organic?

There comes a point where unplanned, organic growth reaches its limit. Know when you reach it.

Target, plan, be upbeat, and encourage people to talk to one another without your help and stretch the creativity of your work to the upper limit.

Does this model look different from when you started working in publishing? Good.

As the story of the Bodleian Library illustrated, sometimes a totally crazy idea can be the beginning of an enduring institution.

To repeat, the book is one of the most durable technologies and publishing is one of the most durable industries in history. Its durability has been put to the test more than once, and it will surely be put to the test again. Think of your current concerns as a minor stumbling block in a history filled with success, a history that has documented and shaped the world.

Don’t be afraid of the person who calls you up and says, “I have this crazy idea that may just change the way you work…” While the industry may shift, the printed word will always prevail.

Publishing has been around in some shape or form for 1000 years. Here’s hoping that it’s around for another 1000 more.

ALA Washington Office copyright event “too good to be true” / District Dispatch

(Left to right) ALA Washington Office Executive Director Emily Sheketoff, Jonathan Band, Brandon Butler and Mary Rasenberger.

On Tuesday, November 18th, the American Library Association (ALA) held a panel discussion on recent judicial interpretations of the doctrine of fair use. The discussion, entitled “Too Good to be True: Are the Courts Revolutionizing Fair Use for Education, Research and Libraries?” is the first in a series of information policy discussions to help us chart the way forward as the ongoing digital revolution fundamentally changes the way we access, process and disseminate information. This event took place at Arent Fox, a major Washington, D.C. law firm that generously provided the facility for our use.

These events are part of the ALA Office for Information Technology Policy’s broader Policy Revolution! initiative—an ongoing effort to establish and maintain a national public policy agenda that will amplify the voice of the library community in the policymaking process and position libraries to best serve their patrons in the years ahead.

Tuesday’s event convened three copyright experts to discuss and debate recent developments in digital fair use. The experts—ALA legislative counsel Jonathan Band; American University practitioner-in-residence Brandon Butler; and Authors Guild executive director Mary Rasenberger—engaged in a lively discussion that highlighted some points of agreement and disagreement between librarians and authors.

The library community is a strong proponent of fair use, a flexible copyright exception that enables use of copyrighted works without prior authorization from the rights holder. Whether a particular use is fair is determined by weighing four statutory factors. A number of court decisions issued over the last three years, such as Authors Guild v. HathiTrust, have affirmed uses of copyrighted works by libraries as fair, including the mass digitization of books held by some research libraries.

Band and Butler disagreed with Rasenberger on several points concerning recent judicial fair use interpretations. Band and Butler described judicial rulings on fair use in disputes like the Google Books case and the HathiTrust case as on-point, and rejected arguments that the reproductions of content at issue in these cases could result in economic injury to authors. Rasenberger, on the other hand, argued that repositories like HathiTrust and Google Books can in fact lead to negative market impacts for authors, and therefore do not represent a fair use.

Rasenberger believes that licensing arrangements should be made between authors and members of the library, academic and research communities who want to reproduce the content to which they hold rights. She takes specific issue with judicial interpretations of market harm that require authors to demonstrate proof of a loss of profits, suggesting that such harm can be established by showing that future injury is likely to befall an author as a result of the reproduction of his or her work.

Despite their differences of opinion, the panelists provided those in attendance at Tuesday’s event with some meaningful food for thought, and offered a thorough overview of the ongoing judicial debates over fair use. We were pleased that the Washington Internet Daily published an article “Georgia State Case Highlights Fair Use Disagreement Among Copyright Experts,” on November 20, 2014, about our session. ALA continues to fight for public access to information as these debates play out.

Stay tuned for the next event, planned for early 2015!


The post ALA Washington Office copyright event “too good to be true” appeared first on District Dispatch.