Planet Code4Lib

Big federal funding increases for libraries / District Dispatch

The following message is from ALA President Jim Neal:

We are thrilled that Congress has passed an FY2018 omnibus spending bill today that includes significant federal funding increases for our nation’s libraries!

One year ago, the White House proposed eliminating the Institute of Museum and Library Services (IMLS) and slashing millions of dollars in federal funding for libraries. Twelve months of tireless advocacy later, ALA advocates have helped libraries:

  • win $9 million more for IMLS than it had in FY 2017, including $5.7 million for the Library Services and Technology Act.
  • restore $27 million for the Innovative Approaches to Literacy program.
  • provide $350 million for the Public Service Loan Forgiveness program.

Congress also appropriated an unexpected $700 million for Title IV education programs, which opens doors to new funding for school libraries.
On top of the good news about funding for libraries, Congress added a policy provision that has been on our advocacy agenda for years: Congressional Research Service (CRS) reports will now be published online by the Library of Congress, ensuring for the first time permanent public access to valuable government information.

The path through the FY2018 appropriations process has been long. One lesson from this budget cycle is that when libraries speak, decision-makers listen. At critical points in the process last year, ALA members from every U.S. congressional district responded to our calls to action. As a result, a record number of representatives and senators signed our FY 2018 “dear appropriator” letters last spring. As the House and Senate Appropriations Committees worked on their respective bills last summer, ALA members made more targeted phone calls and visits and leveraged their local media to tell their library stories. Our advocacy earned bipartisan support in both chambers of Congress.

The persistence of library advocates has paid off for every single community in our nation, from big cities to small towns. This is a time to honor the power of ALA’s advocacy.

This is also a time to strengthen our resolve. The FY2018 budget passage represents a major win for libraries – a win that needs to fuel even more aggressive efforts to advocate for federal library funding in FY2019.

To protect federal library funding, we need to keep reminding Congress that libraries bring leaders and experts together to solve difficult problems, that we deliver opportunities, from academic success to work-readiness. We need to invite elected leaders into our libraries to see what we do for their constituents with a small investment of federal dollars. And we need to engage our library users and other community leaders in this important work.

ALA’s Washington Office will continue to provide the expertise, strategy and resources that have helped make our advocacy so effective. To get involved, visit

The post Big federal funding increases for libraries appeared first on District Dispatch.

Evergreen 3.0.5 and 2.12.11 released / Evergreen ILS

The Evergreen community is pleased to announce two maintenance releases of Evergreen, 3.0.5 and 2.12.11.

Evergreen 3.0.5 has the following changes improving on Evergreen 3.0.4:

  • The MARC Editor in the Web staff client now wraps long fields.
  • The MARC Editor no longer allows catalogers to enter newline characters into MARC subfields.
  • Fixes an issue that prevented serials items from being deleted or modified.
  • The Web staff client Check In screen no longer reloads the whole page multiple times each time an item is scanned.
  • Fixes an issue that displayed the oldest rather than the newest transit in the Web staff client Item Status page.
  • Fixes an issue that prevented the reports module from being displayed in the Web client.
  • Fixes an issue in the Web staff client reports module that caused syntax errors in reports that use virtual fields and joins.
  • Fixes an issue that prevented several dropdown menus in the Web staff client from activating.
  • Fixes an issue that created duplicate copy data when copies or volumes with parts were transferred.
  • Fixes the Trim List feature in the Web staff client Check In screen.
  • The Item Status grid now displays the Circulation Modifier.
  • Restores missing data from the Profile column in Place Hold patron search results.
  • Fixes an issue with the http -> https redirect on Apache 2.4.
  • Fixes a color contrast accessibility issue in the Web staff client and adds underlining to links in grid cells for added accessibility.
  • Adds automated regression and unit tests for the Web staff client reports module.
  • Adds a process for spell-checking the official documentation.
  • Adds a script that simplifies the release process related to translations.
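Among the fixes above is one for the http -> https redirect on Apache 2.4. For context only (this is not the Evergreen patch itself, and the server name is a placeholder), a typical Apache 2.4 redirect of the kind involved looks like the following sketch:

```apache
# Minimal example of an HTTP -> HTTPS redirect under Apache 2.4.
# "example.org" is a placeholder, not an Evergreen-specific value.
<VirtualHost *:80>
    ServerName example.org
    # Permanently redirect all plain-HTTP traffic to the HTTPS site
    Redirect permanent / https://example.org/
</VirtualHost>
```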

Evergreen 2.12.11 has the following changes improving on Evergreen 2.12.10:

  • The MARC Editor in the Web staff client now wraps long fields.
  • The MARC Editor no longer allows catalogers to enter newline characters into MARC subfields.
  • Fixes an issue in the Web staff client reports module that caused syntax errors in reports that use virtual fields and joins.
  • Fixes an issue that created duplicate copy data when copies or volumes with parts were transferred.
  • Fixes an issue with the http -> https redirect on Apache 2.4.
  • Fixes a color contrast accessibility issue in the Web staff client and adds underlining to links in grids for added accessibility.
  • Adds automated regression and unit tests for the Web staff client reports module.

With the release of Evergreen 2.12.11, the 2.12.x series is no longer under active regular maintenance and has reached end of life for general bugfixes. New releases of 2.12.x will be made only when fixes for security issues are available.

Please visit the Evergreen downloads page to download the upgraded software and to read full release notes. Many thanks to everyone who contributed to the releases!

Revisiting Mills Kelly’s “Lying About the Past” 10 Years Later / Dan Cohen

If timing is everything, history professor Mills Kelly didn’t have such great timing for his infamous course “Lying About the Past.” Taught at George Mason University for the first time in 2008, and then again in 2012—both, notably, election years, although now seemingly from a distant era of democracy—the course stirred enormous controversy and then was never taught again in the face of institutional and external objections. Some of those objections understandably remain, but “Lying About the Past” now seems incredibly prescient and relevant.

Unlike other history courses, “Lying About the Past” did not focus on truths about the past, but on historical hoaxes. As a historian of Eastern Europe, Kelly knew a thing or two about how governments and other organizations can shape public opinion through the careful crafting of false, but quite believable, information. Also a digital historian, Kelly understood how modern tools like Photoshop could give even a college student the ability to create historical fakes, and then to disseminate those fakes widely online.

In 2008, students in the course collaborated on a fabricated pirate, Edward Owens, who supposedly roamed the high (or low) seas of the Chesapeake Bay in the 1870s. (In a bit of genius marketing, they called him “The Last American Pirate.”) In 2012, the class made a previously unknown New York City serial killer materialize out of “recently found” newspaper articles and other documents.

It was less the intellectual focus of the course, which was really about the nature of historical truth and the importance of careful research, than the dissemination of the hoaxes themselves that got Kelly and his classes in trouble. In perhaps an impolitic move, the students ended up adding and modifying articles on Wikipedia, and as YouTube recently discovered, you don’t mess with Wikipedia. Although much of the course was dedicated to the ethics of historical fakes, for many who looked at “Lying About the Past,” the public activities of the students crossed an ethical line.

But as we have learned over the last two years, the mechanisms of dissemination are just as important as the fake information being disseminated. A decade ago, Kelly’s students were exploring what became the dark arts of Russian trolls, putting their hoaxes on Twitter and Reddit and seeing the reactive behaviors of gullible forums. They learned a great deal about the circulation of information, especially when bits of fake history and forged documents align with political and cultural communities.

As Yoni Appelbaum, a fellow historian, assessed the outcome of “Lying About the Past” more generously than the pundits who piled on once the course circulated on cable TV:

If there’s a simple lesson in all of this, it’s that hoaxes tend to thrive in communities which exhibit high levels of trust. But on the Internet, where identities are malleable and uncertain, we all might be well advised to err on the side of skepticism.

History unfortunately shows that erring on the side of skepticism has not exactly been a widespread human trait. Indeed, “Lying About the Past” showed the opposite: that those who know just enough history to make plausible, but false, variations in its record, and then know how to push those fakes to the right circles, have the chance to alter history itself.

Maybe it’s a good time to teach some version of “Lying About the Past” again.

The Curious Case of Balanced/Equilateral Higher Education Institutions and Potential Implications for Academic Libraries / HangingTogether

Our Institutional Directions Model of US Universities and Colleges, which we have developed as part of The University Futures, Library Futures project, offers a novel system of accounting for the main educational directions of 1,506 public and nonprofit US higher education institutions.

Simply put, our model calculates the extent to which a university’s educational activity is focused on each of the following: (1) doctoral-level research, (2) liberal education in the arts and sciences, and (3) career-oriented education and professional training. For each institution within our project population, our model calculates the percentage of each of these three educational directions; the three always add up to one hundred percent. For example, an institution with a score of 33% Research, 33% Liberal Education, and 33% Career would have its educational directions roughly equally divided. For more information about the model’s underpinnings, please refer to our data set and scoring formula.

Because our research is focused on institutional differentiation and identifying distinctive institutional types, we have generally looked at cohorts defined by strong, shared directional emphasis. For example, institutions with a strong directional emphasis on Research, or Career-directed learning. However, it is equally interesting to consider the case of institutions with relatively balanced educational activity in each of the directions we are examining.

To this end, we extracted a subset of institutions for which educational activity on Research, Liberal Education, and Career is each between 28% and 38% (i.e., within five percentage points of 33%). On a radar graph, these institutions would look like a set of nearly equilateral triangles. Interestingly, there are relatively few of these: just nine institutions in our population of 1,506.

To investigate how the bounds of this range affect the count of institutions, we varied both the width of the band and the directions to which it applies. The results are presented in the table below:

Research       Liberal Education   Career         Count of Institutions   Percent of UFLF population
>32% to <34%   >32% to <34%        >32% to <34%   0                       0
>28% to <38%   >28% to <38%        >28% to <38%   9                       0.6%
>28% to <38%   >28% to <38%        1% to 100%     11                      0.7%
1% to 100%     >28% to <38%        >28% to <38%   9                       0.6%
>28% to <38%   1% to 100%          >28% to <38%   16                      1.0%
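The cohort extraction described above can be sketched in a few lines of code. The institution names and scores below are invented for illustration (the real analysis uses the UFLF data set of 1,506 institutions), and `is_within` mirrors the 28%–38% band:

```python
def is_within(share, lower=0.28, upper=0.38):
    """True if a direction's share of educational activity falls inside the band."""
    return lower < share < upper

# Each institution maps to (research, liberal_education, career) shares
# that sum to 1.0. These values are hypothetical.
institutions = {
    "Hypothetical U": (0.33, 0.34, 0.33),
    "Research-Heavy U": (0.70, 0.20, 0.10),
    "Career College": (0.05, 0.25, 0.70),
}

# "Equilateral" cohort: all three directions between 28% and 38%.
balanced = [
    name for name, (r, le, c) in institutions.items()
    if is_within(r) and is_within(le) and is_within(c)
]
```

Relaxing the band on one direction (replacing one `is_within` call with `True`) reproduces the other rows of the table above.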

The nine institutions that fall between 28% and 38% in all three categories are identical to the nine that fall within that range on just the Liberal Education and Career directions.

Cumulatively, then, a total of 36 institutions (or 2.3% of the project population) exhibit a relatively “equilateral” distribution. This is clearly a very small share of the overall population. While a majority of the population exhibits strong directionality toward one, or at most two, of the educational directions we are examining, it is nonetheless interesting to consider the significance of an institutional identity that is more evenly distributed across Research, Liberal Education and Career-directed programs, for example:

  • What does the small number of institutions (in our project population) that exhibit this pattern tell us about what it takes to succeed in a competitive higher education marketplace?
  • Is it by initial design that these institutions (or some of them) offer a balanced, generalist type of education?
  • If not by initial design, have they intentionally moved in this direction (toward the center) or were they crowded out of another more highly differentiated space?
  • What do academic library services look like in these institutions? Are they notably different from the library service bundle in institutions with a “spiky” or asymmetrical distribution?
  • Is it any easier (or more challenging) to support academic library services in an institution with evenly distributed educational directions vs. an institution with a stronger emphasis in one or more direction(s)?

We invite our readers to share their thoughts in the comments section below or by email ( or

We thank our colleague Brian Lavoie for reviewing an earlier version of this post and providing helpful comments.

Twitter / pinboard

this is all of #code4lib working on @bot4lib circa 2012.

Jobs in Information Technology: March 21, 2018 / LITA

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

Pratt Institute, Library Media Resources Coordinator, Brooklyn, NY

Barry University, Director, Library Services, Miami Shores, FL

Ruth Lilly Medical Library, Indiana University School of Medicine, Emerging Technologies Librarian, Indianapolis, IN

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

March 2018 ITAL Issue Published / LITA

The March 2018 issue (volume 37, number 1) of Information Technology and Libraries (ITAL) has been published and may be read at:

This issue marks the journal’s 50th anniversary. The table of contents and brief abstracts of reviewed articles are below.

Ken Varnum

“Academic Libraries on Social Media: Finding the Students and the Information They Want”
Heather Howard, Sarah Huber, Lisa Carter, and Elizabeth Moore

Although most libraries today participate in some form of social media, few take the time to learn how they might use this medium more effectively to meet the needs and interests of their users. This study by Purdue University Libraries offers an instructive example of how to apply user research to the development of an effective social media strategy. This article will be of interest to librarians looking to gain a better understanding of the social media habits of college students or improve communication with their users.

“Accessible, Dynamic Web Content Using Instagram”
Jaci Wilkinson

Using social media to reach a library’s communities has traditionally meant engaging patrons on Twitter and Facebook. In this article, the author discusses the development of an interface that pushes content from archives and special collections to Instagram. The article is especially interesting because it addresses several issues: working with the API, designing for accessibility, and taking advantage of evolving social media trends.

“Trope or Trap? Role-Playing Narratives and Length in Instructional Video”
Amanda S. Clossen

This article, detailing the results of a large-scale survey, provides a solid and useful addition to the literature on how best to create instructional videos. A must read for instructional-video-creating librarians!

“Identifying Emerging Relationships in Healthcare Domain Journals via Citation Network Analysis”
Kuo-Chung Chu, Hsin-Ke Lu, and Wen-I Liu

Ever wonder how the articles in a particular research domain connect to one another, or how those connections might evolve over time? Eager to help researchers quickly and visually identify key articles within a particular research domain? Incorporating data mining techniques for co-citation analysis, Chu, Lu, and Liu present an automated web-based citation analysis system that can do both.

“Digitization of Textual Documents Using PDF/A”
Yan Han and Xueheng Wan

This article provides a technical yet practical explanation of the value of using the open PDF/A file format for the long-term preservation of digital content, and will appeal to staff in any type of library responsible for determining preferred file formats for future discovery and access.

Editorial Content

Submit Your Ideas
for contributions to ITAL to Ken Varnum, editor, with your proposal. Current formats are generally:

  • Articles – original research or comprehensive and in-depth analyses, in the 3000-5000 word range.
  • Communications – brief research reports, technical findings, and case studies, in the 1000-3000 word range.

Questions or Comments?

For all other questions or comments related to LITA publications, contact LITA at (312) 280-4268 or Mark Beatty,

The powers of library consortia 3: Scaling influence and capacity for impact and efficiency / Lorcan Dempsey

  • The powers of library consortia 1: How consortia scale capacity, learning, innovation and influence
  • The powers of library consortia 2: Soft power and purposeful mobilization: scaling learning and innovation
  • The powers of library consortia 3: Scaling influence and capacity
  • The powers of library consortia 4: Scoping, sourcing and scaling

Scaling capacity (building shared infrastructure and … Continue reading The powers of library consortia 3: Scaling influence and capacity for impact and efficiency

The post The powers of library consortia 3: Scaling influence and capacity for impact and efficiency appeared first on Lorcan Dempsey's Weblog.

What’s in a Name? More Than Meets the Eye, or IPEDS: On the Religious Identity of US Colleges and Universities / HangingTogether

In our development of the University Futures, Library Futures working model, we entertained the thought of incorporating a dimension that would distinguish between religious and secular US colleges and universities, and allow us to compare and contrast their respective library services.

We were interested in finding out if colleges and universities that have religious affiliations have a distinctly different library services portfolio compared to independent or unaffiliated, namely secular, colleges and universities. For example, we hypothesized that the scope of academic library collections would differ between religious and secular institutions.

To test the waters, we first wanted to determine if an explicit religious affiliation is significant to colleges’ and universities’ institutional identities. We did an informal content analysis of university mission statements for a sample of 100 institutions in our target population. The results of this qualitative investigation were striking: consider, for example, the mission statement of Ouachita Baptist University in Arkadelphia, AR:

Ouachita Baptist University seeks to foster a love of God and a love of learning by creating for students and other constituents dynamic growth opportunities both on campus and throughout the world. With foresight and faithfulness, Ouachita makes a difference.

The institution’s name, of course, includes an affiliation with the Baptist church, and the ideas expressed in this statement are strongly tied to an approach to higher education, which is expressed in terms of fostering “a love of God” and enlists “faithfulness” to make a difference.

Now, consider the following mission statement from Drexel University in Philadelphia, PA:

Drexel University fulfills our founder’s vision of preparing each new generation of students for productive professional and civic lives while also focusing our collective expertise on solving society’s greatest problems. Drexel is an academically comprehensive and globally engaged urban research university, dedicated to advancing knowledge and society and to providing every student with a valuable, rigorous, experiential, technology-infused education, enriched by the nation’s premier co-operative education program.

There is no mention of religious affiliation or mission, nor any explicit reference to faith. Instead, there is a focus on the practical benefits of higher education: personal professional advancement and the shared societal benefits of civic engagement. Drexel’s mission is “advancing knowledge and society”. If not avowedly secular, the university does not declare a faith-based purpose.

False Positive – Type 1 Error
Consistent with our decision to derive our model’s indicators from the national IPEDS data source, we next extracted the religious affiliation variable for our population of 1,500 institutions. Roughly 40 percent of our project population reported a religious affiliation in their 2015 IPEDS survey response. We planned to introduce a binary variable that would let us differentiate religious and secular institutions, so that we could investigate whether distinctive patterns of library service emerged in either category. Quickly, however, we encountered a problem: certain institutions that are, by common knowledge, secular, and whose mission statements demonstrate a strong liberal education approach, nonetheless reported a religious affiliation in their IPEDS survey data. Duke University is one example; its historic relationship with the Methodist Church is not currently reflected in the university’s governance. We examined a few such cases and learned that a historic religious affiliation is sometimes retained as a formality, and therefore reported to IPEDS, while the institution is essentially secular. This ‘false positive’ type of error gave us pause. We knew that an institution like Duke University was unlikely to be religious and therefore thought to check it, but our knowledge of the 1,500 institutions in our project population is far from complete, and we realized that many such ‘false positive’ cases could easily slip under our radar.


God Is in the Details
Furthermore, the opposite ‘error’ became evident when we recently had the pleasure of talking with Dr. Ray Granade, Professor of History and Director of Library Services at Ouachita Baptist University. We consulted Dr. Granade about our hesitation over how to approach the formal versus actual religious affiliation of colleges and universities, and found ourselves in the privileged proverbial front-row seats of an expert historian’s class (okay, we were on the phone with him, but still) on some of the changes and developments in the Southern Baptist Convention and its affiliated higher education institutions and, very possibly, by extension, in many other religious institutions. In a nutshell, it became apparent to us that differences between the various Christian denominations (e.g., Catholic, Orthodox, Lutheran, Anglican, or Baptist), as well as among the various streams within any single denomination (e.g., more conservative versus more progressive streams within a single church with which institutions are affiliated), can make a world of difference in the approach that an institution – and potentially its library – may take to services such as collection scope.

The critical point for our University Futures, Library Futures project is that these very different institutional characteristics – identities, really – would not be discernable through IPEDS institutional survey data, nor would they necessarily lend themselves to being detectable through institutional mission statements.

We thus realized that using a binary religious/secular variable across our entire project population of 1,506 institutions was impractical, and that our initial intent to explore differences in library services between religious and secular settings could not be realized without additional qualitative research.

Our takeaway is that college and university names, mission statements, and IPEDS-reported religious affiliations encompass much that is conveyed explicitly, much that is conveyed implicitly, and much that is being left unsaid but carries a significant influence on higher education institutions’ identities and, potentially, their library services. 

Looking ahead, we have learned that there is some ground for our hypothesis that library services (specifically collection scope, but perhaps others too) exhibit distinct differences that tie back to the parent institution’s religious or secular identity. With this in mind, we are considering future research on this topic, possibly on a subset of our UFLF project population – stay tuned!

We invite our readers to let us know their thoughts on this topic – does a religious affiliation of the parent institution have an impact on a library’s collection and other services? Please leave us a comment below, or reach out via email ( or

We thank our colleague Rebecca Bryant for reviewing an earlier version of this post and providing helpful comments.

User-Centered Provisioning of Interlibrary Loan: a Framework / In the Library, With the Lead Pipe

In Brief:

Interlibrary loan (ILL) has grown from a niche service limited to a few privileged scholars to a ubiquitous, expected service. Yet workflows still assume specialness. Users’ needs should come first, and that means redesigning ILL into a unified, linear, user-centered process. ILL is not just a request form; rather, we need improved mechanisms for users to track, manage, and communicate about their requests. This article explores how ILL developed, problems with the current ILL ecosystem, and changes that can make ILL centered on users’ needs and processes rather than on backend library systems.

By Kurt Munson


Interlibrary loan (ILL) provides library users with a critical tool to acquire resources they need for their information consumption and evaluation activities, whether research, teaching, learning, or something else. The 129% increase in ILL volume between 1991 and 2015 in the Association of Research Libraries (ARL) statistics clearly shows that ILL has grown from a niche service to an expected one (ARL, 2016). Yet our library processes for providing this service have not kept pace with technological development. Thus, the provision of ILL is less effective than it could be because it is predicated upon library processes and systems rather than on meeting users’ needs as effectively as possible. This article explores the development of ILL as a service, suggests areas in need of improvement, provides a framework for redesigning this service in a user-centered way, and finally outlines efforts to create such a user-centered ILL to meet those needs.

Interlibrary loan holds a unique place within the suite of services libraries provide. ILL is entirely user initiated and driven by demonstrated user need. It provides a mechanism for users to acquire materials they have discovered and determined to be worthy of additional investigation but for which a local copy is not available. ILL expands the resources available to users to everything that can be delivered, not just the contents of the local collection.

The modern research library offers a range of services under the ‘Resource Sharing’ umbrella, including consortial sharing of returnables, interlibrary loan of returnables and non-returnables, and local document delivery operations. The ILL process discussed in this article is restricted to ILL as a brokered process whereby a library requests and arranges the loan of a physical item for use by an affiliated user. ILL practitioners refer to this process as traditional ILL of returnables, as the item will be returned to the owning library. Scans or reproductions of articles or portions of a work provided from a local collection or by another library fall outside this article’s scope because the workflows for sourcing and providing those items are quite different. This article primarily concentrates on ILL between academic libraries, though its recommendations are generalizable to public, medical, and other libraries.

Historical Development

ILL has a long history as a library service but for most of that history, it was a niche service provided to only a select group of library users, most often faculty members and perhaps graduate students. ILL was difficult, time consuming, and required a great deal of staff effort. Simply identifying an owning library was a challenge before the introduction of shared computerized catalogs. Citations needed careful verification to ensure accuracy, particularly for items created prior to the introduction of the International Standard Book Number (ISBN) system in 1968. Identifying holdings and ownership represented huge challenges. While tools like the Pre-1956 Union Catalog existed, these were out of date as soon as they were printed. Requests were made via mailed paper request forms. The library that owned the item would likely know nothing of the requesting library so the trusted relationships we take for granted had not yet developed. A library might send an item or it might not. An owning library might respond in the negative or it might not. It was at best an arduous process analogous to weaving cloth and sewing garments by hand rather than purchasing ready-made off the rack clothing.

The creation of the OCLC cooperative in 1967, specifically its shared index of items, provided the opportunity to vastly improve ILL processes and workflows. The OCLC database, eventually known as WorldCat, contained a single record for each work, to which libraries could attach holdings indicating that they owned a copy. It was now possible to identify ownership easily. Moreover, this identification could be done in one place, with simultaneous citation verification. OCLC introduced the first of its interlibrary loan subsystems in 1979 (Goldner, Birch, 2012, p. 5) because there were by then enough item and holdings records in the shared OCLC index to support ILL processing. Over time, OCLC introduced additional ancillary ILL services for library staff. For example, in the ILL policies directory a library can provide contact and address information and explain what it will and will not lend, with any associated costs for these services. The OCLC ILL Fee Management (IFM) system provides automated billing as part of the transaction process. ILL became markedly easier to do, or at least portions of the process did.

The development of WorldCat and other union catalogs made the process of identifying owning libraries and placing requests much easier but these were closed systems with limited functionality. These systems did one thing: placed a request. Yet, ILL is a multi-part process consisting of many disparate steps that library staff perform. Files of request forms require maintenance. Users need to be contacted when items arrive or need to be returned. Circulating necessitates tracking over time. Physical items require packing and shipping. Invoices require payment.

For the library user, ILL is just one of many tools for acquiring materials, and the user’s interest is in accessing the materials, not in how the library chooses to source the requested item. Users once filled out a paper form, which staff keyed into the requesting system. Then the user patiently waited until they received a phone call or postcard alerting them that the item had arrived. To be sure, verification and ordering had become easier, but the process still involved many handoffs between different systems with minimal communication.

Easier ordering allowed ILL request volumes to increase markedly (Goldner & Birch, 2012, p. 5). ILL management systems were developed to automate the management and tracking of requests over their lifespan, in addition to handling communication with users and circulating the items. ILLiad is the most common ILL management system used today in academic libraries. Both owning libraries and requesting libraries came to rely upon these systems to manage requests over their lifespan. Request databases replaced file folders. Data could be pushed from one system into another. Routine tasks, such as sending overdue notices, could be automated. ILL had become a standard, mainstream, expected service rather than a niche one.

Improved staff processing was not the only driver of increased volume. OpenURL and other outgrowths of user-facing databases, along with the ubiquity of the internet, made discovery easier (Musser & Coopey, 2016, p. 646). The easy transfer of metadata via OpenURL increased request volume because users could request items by pressing a button instead of filling out a paper form. The request went into the request database for staff processing. Nonetheless, the improvements ILL management systems provided remained rooted in ILL’s traditional union catalog-based requesting workflows. They focused on easing the library staff processes that provide items rather than on user workflows or needs. Issues with this approach and these workflows are explored below.
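The metadata handoff OpenURL enables can be sketched in a few lines: a citation from a user-facing database is serialized into the standard Z39.88-2004 key/value pairs and passed to a link resolver, so the user never re-keys anything. This is a minimal illustration only; the resolver hostname and citation values are made up.

```python
from urllib.parse import urlencode

# Citation metadata as it might arrive from a user-facing database;
# the values and the resolver hostname below are illustrative.
citation = {
    "title": "Resource Sharing in a Cloud Computing Age",
    "author": "Goldner, Mike",
    "issn": "0264-1615",
    "date": "2012",
}

# Standard OpenURL 1.0 key/value pairs (Z39.88-2004 KEV format).
params = {
    "ctx_ver": "Z39.88-2004",
    "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    "rft.atitle": citation["title"],
    "": citation["author"],
    "rft.issn": citation["issn"],
    "rft.date": citation["date"],
}
openurl = "https://resolver.example.edu/openurl?" + urlencode(params)
```

The "button" the user presses is just a link carrying this query string; the request form it populates needs no manual transcription.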

Problems with Our Current Approach

A number of issues limit the usability of ILL service, which in turn limits its effectiveness for both users and library staff. To be sure, ILL services are valued by users and play an integral part in the suite of services libraries provide to source materials, but the service can be improved by reconceptualizing the process through which it is provided. Libraries can rethink how the individual parts of the process, be they software or workflow, are put together. Areas for reconceptualization fall into five broad categories, discussed below.

First, existing systems are based on identifying libraries that own a requested item. But for the purposes of ILL, ownership is only the first step in the process. An on-shelf, loanable copy must be located, because only items that fit these criteria can fill the user’s need. WorldCat can tell us who owns an item, but what we need is a library that can lend the item. Owning libraries, or lenders as ILL practitioners call them, still need to search their local catalog to determine whether the item is on shelf and loanable. This is a time-consuming, antiquated manual workflow that fails to take advantage of tools such as Z39.50 for automated catalog lookup. Workflows have not kept up with technological advancements.
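The check such a workflow leaves to manual effort is easy to automate in principle. Assuming holdings data retrieved via Z39.50 (or any catalog API) can be reduced to simple status and policy fields — the field names here are illustrative, not a real holdings schema — finding an on-shelf loanable copy is a trivial filter:

```python
def find_loanable_copy(holdings):
    """Return the first holding that is both on shelf and loanable.

    Each holding is modeled as a plain dict; the "status" and
    "policy" field names are assumptions for this sketch.
    """
    for holding in holdings:
        on_shelf = holding.get("status") == "available"
        loanable = holding.get("policy") == "circulating"
        if on_shelf and loanable:
            return holding
    return None

holdings = [
    {"library": "Library A", "status": "checked_out", "policy": "circulating"},
    {"library": "Library B", "status": "available", "policy": "non-circulating"},
    {"library": "Library C", "status": "available", "policy": "circulating"},
]

copy_found = find_loanable_copy(holdings)
```

Run against live catalog lookups instead of a hard-coded list, logic like this would turn "who owns it?" into "who can lend it right now?" without staff intervention.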

Consortial borrowing systems such as Relais D2D or VDX, where a group of libraries share a discovery layer that displays availability, mitigate the issue described above, but these systems have a serious shortcoming of their own: they force users to execute the same search in multiple discovery layers to find an available copy. Users, having identified an item, cannot simply submit a request and have the library source it for them. Rather, libraries expect users to navigate across disparate interfaces, each with its own request process, to request an item. Thus discovery and delivery become a fractured process as libraries push the work of finding a loanable copy onto their users.

Second, identifying owning libraries remains tied to the searching of union catalogs because metadata is not recycled efficiently. A user searches their local library’s discovery tool, finds that an item they want is checked out, and fills out an ILL request form populated with metadata from that discovery tool. Library staff, or preferably automated systems, then re-execute a similar search with that same metadata against a larger database to identify potential lending libraries, and the request is ported into a different system. Since the metadata populating the local discovery tool likely came from WorldCat in the first place, and will be used to search against WorldCat again, that metadata should be trusted rather than treated as a citation needing verification by library staff. This is again an antiquated workflow rooted in past practices.
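Recycling that trusted metadata could look like the following sketch, which reuses a local discovery record verbatim to build an SRU (Search/Retrieve via URL) query against a larger union catalog, with no re-keying or manual re-verification. The endpoint hostname and record field names are assumptions for illustration:

```python
from urllib.parse import urlencode

# Metadata recycled as-is from the local discovery record
# (illustrative values).
record = {"title": "Buddenbrooks", "author": "Mann, Thomas"}

# Build a CQL query reusing the trusted metadata directly.
cql = f'dc.title="{record["title"]}" and dc.creator="{record["author"]}"'

# Standard SRU 1.2 request parameters.
params = {
    "version": "1.2",
    "operation": "searchRetrieve",
    "query": cql,
    "maximumRecords": "5",
}
sru_url = "https://unioncatalog.example.org/sru?" + urlencode(params)
```

The point is not the particular protocol but the flow: the same metadata object moves from discovery to the lender search untouched.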

Third, ILL is very much predicated on the terms imposed by the owning library. While the OCLC policies directory provides library staff with information about terms of use for borrowed items, the lack of consistent agreed-upon standards for loan periods between libraries creates a situation ripe for confusion on the part of users. Again, this harks back to an era where ILL was rare, difficult, and unique rather than the current situation where ILL is a standard service. Too much emphasis is placed on unique locally defined rules rather than on setting broadly agreed-upon standards or considering users’ needs for materials.

Fourth, the process uses siloed systems with weak integrations and poor interoperability. Discovery happens in one system. Requests are managed in a separate ILL management system, which ties to an external ordering system for sourcing items. When the item arrives at the borrowing library, these respective systems must be updated, and then the item must be handled as a circulation, likely in yet another system separate from the one that manages the user’s loans of locally owned materials. Yes, the systems can communicate with each other, but the process is staff-intensive and lacking in automation. Crosswalks, bridges, and information exchange protocols are not employed fully or efficiently.

Finally and most importantly, ILL services are predicated on library processes and library tools rather than user processes and needs. Users must learn and jump between disparate systems, often with jarring handoffs, to acquire materials. Depending on how the library sources the item, the user must find the system where the library has chosen to process that request. Communication is scant. It comes from different systems and mostly consists of silence until a pickup notification is sent. This confusing process is followed by inconsistent rules of use based on the lending library’s terms. Usability studies have demonstrated how this confuses users (Foran, 2015, p. 6). Presented with multiple, often contradictory delivery options and unclear explanations of the differences between them, users tend to place requests in each system in the hope that one will work. Not only is this poor customer service, but the duplicated work also increases staff workloads and costs for the library. Why? Because libraries define ILL success as having acquired a copy for the user. The user’s needs—required turnaround time, format, how long they will need the item, even its relative importance to their intended use—are secondary, when considered at all. Libraries need to gain a better understanding of how ILL fits into users’ activities and how they can more effectively support those activities. ILL needs to be borrower-centered, not lender-centered.

In many ways the issues outlined above are the natural outcome of a service’s gradual evolution within a fairly stable ecosystem. The foundational systems that undergird the service were able to absorb the increased request volume, and processes simply continued without redesign or rethinking. Yet the environment in which the service exists is evolving rapidly, and the time has come for a radical rethinking of the technology that supports the service, its workflows, and its metrics for success.

Recommendations for Developing an Alternative Framework

At the International ILLiad Conference in March 2016, Katie Birch of OCLC announced that OCLC intended to “move ILLiad to the cloud”. Far more than any other change in ILL processing or systems, including the introduction of WorldShare ILL, this announcement shook the foundations of academic library ILL in the United States. We were presented with the opportunity to reimagine how we provide ILL services. We began to ask, “what should the ILL workflows be?” How could we make them more user-centered rather than continuing the historic workflows mandated by vendor-supplied platforms? Concurrently, and partially in response to this announcement, the Big Ten Academic Alliance (BTAA), previously known as the Committee on Institutional Cooperation (CIC), embarked on a project to explore, redefine, document, and share a user-centered discovery-to-delivery process. The project’s goal was to describe an easy-to-understand user experience that shielded users from the disparate library staff systems and provided a more linear discovery-to-delivery process. Usability studies confirmed library staff members’ impression that the process was confusing and disjointed for users (Big, 2016, pp. 19-22; Big, 2017b, pp. 19-21). Cooperatively with the Ivy Plus Libraries and the Greater Western Library Alliance (GWLA), we defined base requirements and system functionalities for a new user-centered vision of ILL. A one-page summary document entitled “Next Generation Discovery to Delivery: A Vision” was released in February of 2017. Staff from BTAA libraries, including the author of this article, wrote two reports entitled “A Vision for Next Generation Resource Delivery” and “Next Generation Resource Delivery: Management System and UX Functional Requirements”. These works, in part, inform the three broad recommendations outlined below: user process, technological, and cultural.

To start, the library tools that support users’ processes must be based upon their workflows rather than the processes library systems staff use to manage that work. Where in the past a user interface was tacked onto a library staff system, this should no longer be the case. Users deserve a simple universal request mechanism, a “get it” button (Foran, 2015, p. 5) that connects to a smart fulfillment system (Big, 2017b, p. 9). Requests should display in a single dashboard-like interface that allows users to manage all their library interactions in one place (Big, 2017b, p. 9). No longer should users be expected to hunt across disparate library system interfaces to locate their request for a specific item. Achieving this requires that we, library staff, rethink how we present library systems to users. Since the primary local discovery layer is the user’s main entry point into the library and the place where they manage their library interactions, it needs to be the place where we display all request information. Thus, vendors who provide discovery layer tools must make them open and capable of incorporating data from external sources so we can give users a unified display. Users should be shielded from the systems libraries use to perform the work of fulfilling requests. Users need items; which library staff process is invoked is immaterial to them. Getting the item is paramount. This notion must inform how libraries design, combine, and present their backroom systems to our customers.

Second, what matters is delivery of an available, on-shelf, loanable copy to the user who needs it and made the effort to ask for it, not identifying owning libraries. ILL loans are simply more complicated circulations. Discovery tools should be decoupled from delivery options, as the two do not need to be interconnected. The metadata from discovery is all that is needed to initiate delivery. Requests should be managed via a lightweight system specifically designed around efficient and timely fulfillment of the user’s request, with user satisfaction serving as the primary metric of success. The BTAA reports named this new idea the “Resource Delivery Management System” (RDMS) (Big, 2017b, p. 12). Working from a list of potential partner libraries defined and maintained in the RDMS, a simple Z39.50 search using the recycled metadata should identify a potential lending partner; when a loanable copy is found, a request should be placed via NCIP, with routing and courier tracking/shipping information included in the RDMS’s request record. Circulations of ILL items should occur in the local Library Services Platform (LSP) so users can manage all loans, regardless of how they are sourced, in one place.
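A minimal sketch of the NCIP step might look like the following. The element names follow the general shape of an NCIP RequestItem message but are illustrative only; a production RDMS would emit messages that validate against the actual NCIP schema:

```python
import xml.etree.ElementTree as ET

def build_request_item(user_id: str, item_id: str) -> str:
    """Build a minimal NCIP-style RequestItem message.

    Element names follow the general pattern of NCIP's RequestItem
    service; they are a sketch, not schema-validated NCIP.
    """
    msg = ET.Element("NCIPMessage")
    req = ET.SubElement(msg, "RequestItem")
    user = ET.SubElement(req, "UserId")
    ET.SubElement(user, "UserIdentifierValue").text = user_id
    item = ET.SubElement(req, "ItemId")
    ET.SubElement(item, "ItemIdentifierValue").text = item_id
    ET.SubElement(req, "RequestType").text = "Loan"
    return ET.tostring(msg, encoding="unicode")

# A hypothetical patron and barcode, as the RDMS would supply them.
ncip_xml = build_request_item("patron-123", "39015012345678")
```

The RDMS would POST a message like this to the chosen partner once the Z39.50 availability check succeeds, then record the response alongside routing and tracking data.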

The ideas above, in many ways, represent a somewhat radical break from past processes or practices. They decouple sourcing of materials from a shared index. Instead, they are based on library-defined partnerships and the identification of a loanable copy at a partner. Moreover, this approach promotes interoperability across different systems as the request is not tied to any legacy or monolithic system. Multiple micro-systems each play a part to complete a multistep process. Finally, it limits the functional scope of the RDMS to just the management of delivery, avoiding the current problem of (often subpar) duplication of functionality across systems. While no such system as described above exists, potential development is under exploration by vendors.

The ideas outlined above further move us from the current siloed systems to one where integrations are central and the best, most appropriate system manages or provides the required information (Big, 2017a, p. 1). Thus, the local LSP handles all aspects of notification, circulation, and fines or blocks. Viewing this as a process consisting of many parts also allows us to incorporate previously excluded information, such as shipping status derived from the UPS or FedEx APIs. Additional communications to users about the status of their request should be included too. Companies provide these updates on orders and shipping as a matter of course, so libraries can as well; users reasonably expect them. Authoritative sources, rather than poorly duplicated ones, should be called upon to provide information as needed: local address information, for example, should be sourced from the campus identity management system. This system consists of many parts communicating with each other via protocols and APIs as needed. Binding the parts together, with each assigned a specific task, provides a new framework for the workaday provisioning of ILL services.
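Folding carrier tracking data into user communication could be as simple as the sketch below. The payload fields mimic the general shape of carrier tracking API responses, but the schema shown is hypothetical, not the actual UPS or FedEx API:

```python
def status_message(payload: dict) -> str:
    """Turn a carrier tracking response into a user-facing update.

    The "status" and "estimated_delivery" fields are assumptions
    standing in for whatever the real carrier API returns.
    """
    status = payload.get("status", "in transit")
    eta = payload.get("estimated_delivery")
    if status == "delivered":
        return "Your item has arrived at the library and is being processed."
    if eta:
        return f"Your item is {status}; expected by {eta}."
    return f"Your item is {status}."

msg = status_message({"status": "out for delivery",
                      "estimated_delivery": "2018-04-02"})
```

The notification itself would then be sent by the LSP, the authoritative system for user communication, rather than duplicated in the RDMS.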

Technology is easy to change. Culture is more difficult, particularly entrenched library policies. These policies’ efficacy at guiding user behavior and promoting shared stewardship of materials is almost never tested. Yet users and library staff are equally engaged in the management of loaned items. Libraries need to embrace the early slogan of the Rethinking Resource Sharing Initiative, “throw down your policies and embrace your collections,” and manage this sharing efficiently in a data-driven way.

It is important to remember that users need materials to complete their work. The use of materials is predicated upon the user’s need, associated timeline, and perceived value of the item. As the Big Ten Academic Alliance has stressed, “All that matters is format, time to delivery, loan period, and costs to the patron, if any” (Big, 2016, p. 9). These items have value to users, who put effort into acquiring them. ILL is entirely user-driven, unlike many other library processes. Arbitrary loan periods set by an owning library can, and in fact do, conflict with users’ needs (Foran, 2015, p. 4). Libraries can resolve these conflicts easily by moving to standardized loan periods for ILL. Standards should replace the boutique exceptionality encouraged by the OCLC policies directory.

Stated differently, the emphasis needs to shift from lender-imposed restrictions to borrowing libraries’ ability to communicate standard policies. For example, the BTAA shared twelve-week loan period, complemented by the equivalent Northwestern University local loan period and coupled with user blocks and assessment of replacement-cost fines after thirty days, provides a consistent user experience that, in turn, encourages the timely return of items. Only 29 of 29,137 total ILL loans were lost by Northwestern University users in 2016. This example demonstrates how consistent policies promote compliance. Why? Because they are easy to understand, and failure to comply with communicated expectations has direct consequences, specifically the loss of library privileges. Further, research done by the Ivy Plus Libraries demonstrates that almost all items are returned to the owning library after the user has completed their use: only 70 items of roughly 750,000 over three years were truly lost by patrons or never returned. This data clearly demonstrates the need to rethink policies across libraries and to reconsider shared assumptions. In other words, the emphasis needs to be on understanding user behavior, based on users’ needs, and on developing effective ways to shape that behavior toward agreed-upon, reasonable outcomes.

Libraries must also shift from their historic lender-centric ILL systems to ones where an ILL user receives an item and national standards provide a consistent, easy-to-understand experience. This would promote an environment where borrowing libraries can more effectively manage their users. Appropriate, effective tools, tested against data, are needed. Ineffective tools, like overdue notices emailed from the lending library to the borrowing library’s ILL staff, need to be discarded; these will never affect user behavior. Making the process easier for users to understand in terms of policy is critical. The introduction of standardized loan periods, replacement costs, and the like across libraries would simplify the management of ILL for both users and library staff. It would also greatly assist in achieving compliance and reducing (often pointless) staff work.

Rather than starting with the question of which library system can perform a specific job, we need to rethink this process and backfill the appropriate system, library or other, from the starting point: the initial discovery and request by the user. The BTAA phrased this as smart fulfillment. Smart fulfillment is a linear path for users to follow where effective automated handoffs between library systems source and manage requests from or in the most appropriate place.


ILL has grown from a niche service into an expected, standard one, with volume growing 129% between 1991 and 2015 in ARL libraries (ARL, 2016). Yet workflows and system integrations have not evolved as much as they should have in response to this growth. A confluence of announcements and work to redefine processes now presents libraries with a unique opportunity to rethink ILL, transition from legacy practices, and unify the fractured discovery-to-delivery process we present to our users. If we integrate library systems, and the systems that support them, differently and effectively leverage each system’s strengths, we can create an easy-to-use service that meets demonstrated user needs. We can provide a service that delivers smart fulfillment of requests and improves both the user and staff experience. This should be our goal.

The author wishes to extend his deepest thanks to Heidi Nance, Director of Resource Sharing Initiatives for the Ivy Plus Libraries, for her willingness to review this article, apply her deep knowledge of ILL while doing so, and for the thoughtful comments and suggestions. Thank you, Heidi.


Association of Research Libraries. (2016). ARL Statistics 2014-15. Association of Research Libraries. Retrieved from

Big Ten Academic Alliance. (2016). A Vision for Next Generation Resource Delivery. Retrieved from

Big Ten Academic Alliance. (2017a). Next Generation Discovery to Delivery System: a Vision. Retrieved from

Big Ten Academic Alliance. (2017b). Next Generation Resource Delivery: Management System and UX Functional Requirements. Retrieved from–functional-requirements.pdf

Foran, K. (2015). “New Zealand Library Patron Expectations of an Interloan Service.” New Zealand Library & Information Management Journal. 55(3), 3-9.

Goldner, M., & Birch, K. (2012). “Resource Sharing in a Cloud Computing Age.” Interlending & Document Supply, 40(1), 4-11.

Musser, L., & Coopey, B. (2016). “Impact of a Discovery System on Interlibrary Loan.” College & Research Libraries. 77(5), 643-653.

Stapel, J. (2016). “Interlibrary Loan and Document Supply in the Netherlands.” Interlending & Document Supply. 44(3), 104-107.

Back to the Blog / Dan Cohen

One of the most-read pieces I’ve written here remains my entreaty “Professors Start Your Blogs,” which is now 12 years old but might as well have been written in the Victorian age. It’s quaint. In 2006, many academics viewed blogs through the lens of LiveJournal and other teen-oriented, oversharing diary sites, and it seemed silly to put more serious words into that space. Of course, as I wrote that blog post encouraging blogging for more grown-up reasons, Facebook and Twitter were ramping up, and all of that teen expression would quickly move to social media.

Then the grown-ups went there, too. It was fun for a while. I met many people through Twitter who became and remain important collaborators and friends. But the salad days of “blog to reflect, tweet to connect” are gone. Long gone. Over the last year, especially, it has seemed much more like “blog to write, tweet to fight.” Moreover, the way that our writing and personal data has been used by social media companies has become more obviously problematic—not that it wasn’t problematic to begin with.

Which is why it’s once again a good time to blog, especially on one’s own domain. I’ve had this little domain of mine for 20 years, and have been writing on it for nearly 15 years. But like so many others, the pace of my blogging has slowed down considerably, from one post a week or more in 2005 to one post a month or less in 2017.

The reasons for this slowdown are many. If I am to cut myself some slack, I’ve taken on increasingly busy professional roles that have given me less time to write at length. I’ve always tried to write substantively on my blog, with posts often going over a thousand words. When I started blogging, I committed to that model of writing here—creating pieces that were more like short essays than informal quick takes.

Unfortunately this high bar made it more attractive to put quick thoughts on Twitter, and amassing a large following there over the last decade (this month marks my ten-year anniversary on Twitter) only made social media more attractive. My story is not uncommon; indeed, it is common, as my RSS reader’s weekly article count will attest.

* * *

There has been a recent movement to “re-decentralize” the web, returning our activities to sites like this one. I am unsurprisingly sympathetic to this as an idealist, and this post is my commitment to renew that ideal. I plan to write more here from now on. However, I’m also a pragmatist, and I feel the re-decentralizers have underestimated what they are up against, which is partially about technology but mostly about human nature.

I’ve already mentioned the relative ease and speed of expressing oneself on centralized services. People are chronically stretched, and building and maintaining a site, and writing at greater length than one or two sentences, seems like real work. When I started this site, I didn’t have two kids and two dogs and a rather busy administrative job. Overestimating the time regular people have to futz with technology was the downfall of desktop Linux, and it is a key reason many people use Facebook as their main outlet for expression rather than a personal site.

The technology for self-hosting has undoubtedly gotten much better. When I first added a blog to this domain, I wrote my own blogging software, which sounds impressive but was just some hacked-together PHP and a MySQL database. This site now runs smoothly on WordPress, and there are many great services for hosting a WordPress site, like Reclaim Hosting. It’s much easier to set up and maintain these sites, and there are even decent mobile apps from which to post, roughly equivalent to what Twitter and Facebook provide. Platforms like WordPress also come with RSS built in, one of the critical open standards at the heart of any successful version of the open web in an age of social media. Alas, at this point most people have invested a great deal in their online presence on closed services, and inertia holds them in place.
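Part of what makes RSS such a durable open standard is how little there is to it: consuming a feed takes a few lines in any language. This sketch parses a minimal RSS 2.0 document of the kind WordPress generates automatically; the feed contents here are made up:

```python
import xml.etree.ElementTree as ET

# A minimal, made-up RSS 2.0 feed; WordPress emits this structure
# (plus more metadata) at /feed/ with no configuration.
feed = """<rss version="2.0"><channel>
<title>Example Blog</title>
<item><title>Back to the Blog</title><link>https://example.org/back-to-the-blog</link></item>
<item><title>Professors, Start Your Blogs</title><link>https://example.org/start-your-blogs</link></item>
</channel></rss>"""

channel = ET.fromstring(feed).find("channel")
posts = [(item.findtext("title"), item.findtext("link"))
         for item in channel.findall("item")]
```

Any feed reader, aggregator, or homemade script can follow a site this way, with no platform in the middle.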

It is psychological gravity, not technical inertia, however, that is the greater force against the open web. Human beings are social animals and centralized social media like Twitter and Facebook provide a powerful sense of ambient humanity—the feeling that “others are here”—that is often missing when one writes on one’s own site. Facebook has a whole team of Ph.D.s in social psychology finding ways to increase that feeling of ambient humanity and thus increase your usage of their service.

When I left Facebook eight years ago, it showed me five photos of my friends, some with their newborn babies, and asked if I was really sure. It is unclear to me if the re-decentralizers are willing to be, or even should be, as ruthless as this. It’s easier to work on interoperable technology than social psychology, and yet it is on the latter battlefield that the war for the open web will likely be won or lost.

* * *

Meanwhile, thinking globally but acting locally is the little bit that we can personally do. Teaching young people how to set up sites and maintain their own identities is one good way to increase and reinforce the open web. And for those of us who are no longer young, writing more under our own banner may model a better way for those who are to come.

2018 LITA Library Technology Forum Call for Proposals / LITA

Submit your Proposals for the:

2018 LITA Library Technology Forum

Minneapolis, MN
November 8-10, 2018

There are two ways of spreading light: to be the candle or the mirror that reflects it.
Edith Wharton

The Library and Information Technology Association seeks proposals for the 21st Annual LITA Library Technology Forum in Minneapolis, Minnesota from November 8-10, 2018.

Building & Leading

Our theme for the 2018 LITA Library Technology Forum is: Building & Leading.
What are you most passionate about in librarianship and information technology? This conference is your chance to share how your passions are building the future, and to lead others by illumination and inspiration. Along those lines, we are inviting you to rethink your take on presentations and programming.

To inspire your creativity, we have three options from which you may choose:

  • Traditional: Solo or panel presentations on a topic, fixed length, may be streamed/recorded.
  • Hands-on: Leading the audience through a creative or generative process, attendees should feel like they did something when they leave, variable lengths of 1-3 hours possible.
  • Discussion-based: Involving shorter presentation lengths, with extensive time for break-out discussions with audience.

We have rooms that are dedicated to each type, with seating and AV setups appropriate to each style of presentation. The goal will be to give attendees a variety of styles, timeframes, and topics to choose from at every turn.
Submission Deadline: Tuesday, May 1, 2018.

Proposal Details

Proposals may cover projects, plans, ideas, or recent discoveries. We accept proposals on any aspect of library and information technology. The committee particularly invites submissions from first time presenters, library school students, and individuals from diverse backgrounds.

We deliberately seek and strongly encourage submissions from underrepresented groups, such as women, people of color, the LGBTQA+ community, and people with disabilities. We also strongly encourage submissions from public, school, and special libraries.

For a longer document of LITA’s commitment to diversity, please see LITA’s Statement on Diversity, and all attendees are expected to read, understand, and follow the LITA Statement of Appropriate Conduct.

Vendors wishing to submit a proposal should partner with a library representative who is testing/using the product.

Presenters will submit final presentation slides and/or electronic content (video, audio, etc.) to be made available online following the event. Presenters are expected to register and participate in the Forum as attendees; a discounted registration rate will be offered.

Submit Program Proposal

If you have any questions, contact Jason Griffey, Forum Program Committee Chair, at griffey AT

More information about LITA is available from the LITA website, Facebook, and Twitter.

Holtzbrinck has attacked Project Gutenberg in a new front in the War of Copyright Maximization / Eric Hellman

As if copyright law could be more metaphysical than it already is, German publishing behemoth Holtzbrinck wants German copyright law to apply around the world, or at least in the part of the world attached to the Internet. Holtzbrinck's empire includes Big 5 book publisher Macmillan and a majority interest in academic publisher Springer-Nature.

S. Fischer Verlag, Holtzbrinck's German publishing unit, publishes books by Heinrich Mann, Thomas Mann, and Alfred Döblin. Because they died in 1950, 1955, and 1957, respectively, their published works remain under German copyright until 2021, 2026, and 2028; German copyright lasts for 70 years after the author's death, as in most of Europe. In the United States, however, works by these authors published before 1923 have been in the public domain for over 40 years.

Project Gutenberg is the United States-based non-profit publisher of over 50,000 public domain ebooks, including 19 versions of the 18 works published in Europe by S. Fischer Verlag. Because Project Gutenberg distributes its ebooks over the internet, people living in Germany can download the ebooks in question, infringing on the German copyrights. This is similar to the situation of folks in the United States who download US-copyrighted works like "The Great Gatsby" from Project Gutenberg Australia (not formally connected to Project Gutenberg), which relies on the work's public domain status in Australia.

The first shot in S. Fischer Verlag's (and thus Holtzbrinck's) copyright maximization battle was fired in a German court at the end of 2015. Holtzbrinck demanded that Project Gutenberg prevent Germans from downloading the 19 ebooks, that it turn over records of such downloading, and that it pay damages and legal fees. Despite Holtzbrinck's expansive claims of "exclusive, comprehensive, and territorially unlimited rights of use in the entire literary works of the authors Thomas Mann, Heinrich Mann, and Alfred Döblin", the venue was apparently friendly, and in February of this year the court ruled completely in favor of Holtzbrinck, including damages of €100,000, with an additional €250,000 penalty for non-compliance. Failing payment, Project Gutenberg's executive director, Greg Newby, would be ordered imprisoned for up to six months! You can read Project Gutenberg's summary with links to the judgment of the German court.

The German court's ruling, if it survives appeal, is a death sentence for Project Gutenberg, which has insufficient assets to pay €10,000, let alone €100,000. It's the copyright-law analog of the fatwa issued by Ayatollah Khomeini against Salman Rushdie. Oh the irony! Holtzbrinck was the publisher of The Satanic Verses.

But it's worse than that. Let's suppose that Holtzbrinck succeeds in getting Project Gutenberg to block direct access to the 19 ebooks from German internet addresses. Where does it stop? Must Project Gutenberg enforce the injunction on sites that mirror it? (The 19 ebooks are available in Germany via several mirrors: in maybe Monserrat, at the UK's University of Kent, and at the Universidade do Minho.) Mirror sites are possible because they're bare bones: they just run rsync and a webserver, and are ill-equipped to make sophisticated copyright determinations. Links to the mirror sites are provided by Penn's Online Books page. Will the German courts try to remove the links from Penn's site? Penn certainly has more presence in Germany than does Project Gutenberg. And what about archives like the Internet Archive? Yes, the 19 ebooks are available via the Wayback Machine.

Anyone anywhere can run rsync and create their own Project Gutenberg mirror. I know this because I am not a disinterested party. I run the Free Ebook Foundation, whose GITenberg program uses an rsync mirror to put Project Gutenberg texts (including the Holtzbrinck 19) on Github to enable community archiving and programmatic reuse. We have no way to get Github to block users from Germany. Suppose Holtzbrinck tries to get Github to remove our repos, on the theory that Github has many German customers? Even that wouldn't work. Because Github users commonly clone and fork repos, there could be many, many forks of the Holtzbrinck 19 that would remain even if ours disappeared. The Foundation's Free-Programming-Books repo has been forked to 26,000 places! It gets worse. There's an EU proposal that would require sites like Github to install "upload filters" to enforce copyright. Such a rule would introduce nuclear weapons into the global copyright maximization war. Github has objected.
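Standing up such a mirror really is that simple. A minimal sketch in Python of the usual rsync invocation (the rsync module name and destination path below are assumptions for illustration; consult Project Gutenberg's mirroring how-to for current servers before actually syncing):

```python
import subprocess


def build_mirror_command(source="aleph.gutenberg.org::gutenberg",
                         dest="/srv/gutenberg-mirror"):
    """Assemble the rsync argument list for a full collection mirror."""
    return [
        "rsync",
        "--archive",  # preserve permissions, timestamps, symlinks
        "--verbose",
        "--delete",   # remove local files that disappeared upstream
        source,
        dest,
    ]


cmd = build_mirror_command()
print(" ".join(cmd))
# To actually sync (network- and disk-intensive), uncomment:
# subprocess.run(cmd, check=True)
```

Note there is no copyright logic anywhere in this loop: the mirror copies whatever the upstream server offers, which is exactly why mirrors are ill-equipped to make per-jurisdiction determinations.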

Suppose Project Gutenberg loses its appeal of the German decision. Will Holtzbrinck ask friendly courts to wreak copyright terror on the rest of the world? Will US based organizations need to put technological shackles on otherwise free public domain ebooks? Where would the madness stop?

Holtzbrinck's actions have to be seen, not as a Germany vs. America fight, but as part of a global war by copyright owners to maximize copyrights everywhere. Who would benefit if websites around the world had to apply the longest copyright terms, no matter what country? Take a guess! Yep, it's huge multinational corporations like Holtzbrinck, Disney, Elsevier, News Corp, and Bertelsmann that stand to benefit from maximization of copyright terms. Because if Germany can stifle Project Gutenberg with German copyright law, publishers can use American copyright law to reimpose copyright abroad on works like The Great Gatsby and lengthen the effective copyrights for works such as The Lord of the Rings and the Chronicles of Narnia.

I think Holtzbrinck's legal actions are destructive and should have consequences. With substantial businesses like Macmillan in the US, Holtzbrinck is accountable to US law. The possibility that German readers might take advantage of the US availability of texts to evade German laws must be balanced against the rights of Americans to fully enjoy the public domain that belongs to us. The value of any lost sales in Germany is likely to be dwarfed by the public benefit value of Project Gutenberg availability, not to mention the prohibitive costs that would be incurred by US organizations attempting to satisfy the copyright whims of foreigners. And of course, the same goes for foreign readers and the copyright whims of Americans.

Perhaps there could be some sort of free-culture class action against Holtzbrinck on behalf of those who benefit from the availability of public domain works. I'm not a lawyer, so I have no idea if this is possible. Or perhaps folks who object to Holtzbrinck's strong-arm tactics should think twice about buying Holtzbrinck books or publishing with Holtzbrinck's subsidiaries. One thing that we can do today is support Project Gutenberg's legal efforts with a donation. (I did. So should you.)

Disclaimer: The opinions expressed here are my personal opinions and do not necessarily represent policies of the Free Ebook Foundation.

  1. Works published after 1923 by authors who died before 1948 can be in the public domain in Europe but still under copyright in the US.  Fitzgerald's The Great Gatsby is one example.
  2. Many works published before 1978 in the last 25 years of an author's life will be in the public domain sooner in Europe than in the US. For example, C. S. Lewis' The Last Battle is copyrighted in the US until 2051, in Europe until 2034. Tolkien's The Return of the King is similarly copyrighted in the US until 2051, in Europe until 2044.
  3. Works published before 1923 by authors who died after 1948 are now in the US public domain but can still be copyrighted in Europe. Agatha Christie's first Hercule Poirot novel, The Mysterious Affair at Styles, is perhaps the best known example of this situation, and is available (for readers in the US!) at Project Gutenberg.
  4. A major victory in the War of Copyright Maximization was the Copyright Term Extension Act of 1998.
  5. As an example of the many indirect ways Project Gutenberg texts can be downloaded, consider Heinrich Mann's Der Untertan. Penn's Online Books Page has many links. The Wayback Machine has a copy. It's free on Amazon (US). HathiTrust has two copies; the same copies are available from Google Books, which won't let you download them from Germany.
  6. Thanks go to VM (Vicky) Brasseur for help verifying the availability or blockage of Project Gutenberg and its mirrors in Germany. She used PIA VPN Service to travel virtually to Germany.
  7. The 19 ebooks are copied on Github as part of GITenberg. If you are subject to US copyright law, I encourage you to clone them! In other jurisdictions, doing so may be illegal.
  8. The geofencing software, while ineffective, is not in itself extremely expensive. However, integrating geofencing gets prohibitively expensive when you consider the number of access points, jurisdictions, and copyright determinations that would need to be made for an organization like Project Gutenberg.
  9. (added March 19) Coverage elsewhere:

Webinar: delivering digital literacy programs through ConnectHomeUSA / District Dispatch

In 2015, ALA joined an emerging initiative, spearheaded by ConnectHomeUSA (formerly ConnectHome Nation), to deliver tailored, on-site digital literacy programming and resources to public housing residents. As part of the initiative, local public libraries from 27 cities and one tribal nation provided tools and training to help residents maximize broadband access to advance job skills, complete homework assignments, pursue online learning, and protect the privacy and security of their personal information as they expand their online lives.

ConnectHomeUSA connects community leaders, local governments, nonprofit organizations, and private industry to provide free or low-cost broadband access, devices, and digital literacy training. The goal is to extend affordable access to low-income families, ensuring that high-speed internet follows America’s children from their classrooms back to their homes.

Today, ConnectHomeUSA is in its second iteration, and ALA has reaffirmed its commitment as a national partner. Together we are seeking libraries to join the effort in the 27 new communities ConnectHomeUSA is working in this year. This Friday, ALA and ConnectHomeUSA will host a webinar for those who would like to learn more about helping residents in participating communities get connected at home and making public housing a platform for change.

Please register for the ConnectHomeUSA webinar on Friday, March 23 at 3 p.m. to learn more. After registering, you will receive a confirmation email containing information about joining the webinar.

While anyone is welcome to join the webinar to learn more, we are specifically looking for libraries in the following areas:

  1. Akron, OH
  2. Brownsville, TX
  3. Charlotte, NC
  4. Choctaw Tribe of AL
  5. Detroit, MI
  6. Edinburg, TX
  7. New Haven, CT
  8. Goldsboro, NC
  9. Greensboro, NC
  10. North Little Rock, AR
  11. Louisville, KY
  12. Lumbee Tribe of NC
  13. Pasco County, FL
  14. Phoenix, AZ
  15. Pittsburgh, PA
  16. Ponca Tribe of NE
  17. Portland, OR
  18. Prichard, AL
  19. Renton, WA
  20. Rhode Island
  21. Salt Lake City, UT
  22. San Joaquin County, CA
  23. Sanford, NC
  24. Las Vegas, NV
  25. Westmoreland County, PA
  26. Wilson, NC
  27. Winnebago County, IL

ALA is proud to be a partner in realizing a shared vision to empower more people to thrive online through ConnectHomeUSA. Register here to join us on Friday, March 23 at 3 p.m. to learn more.

The post Webinar: delivering digital literacy programs through ConnectHomeUSA appeared first on District Dispatch.

Use Head-N-Tail Analysis to Increase Engagement / Lucidworks

One of the most exciting new features in Fusion is Head-N-Tail Analysis. Strangely enough, this has nothing to do with shampoo or horses; it is a way to look at a large set of queries and identify:

  • Head – The queries that generate most of your traffic and conversions
  • Tail – The queries that generate very few or no clicks
  • Torso – Everything else in between
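Fusion's actual analysis is more sophisticated than this, but the three buckets can be sketched as a simple split over aggregated click counts. A minimal illustration (the thresholds and query data are invented for this example):

```python
def bucket_queries(query_clicks, head_share=0.5, tail_max_clicks=1):
    """Split queries into head/torso/tail buckets by click volume.

    head:  the most-clicked queries accounting for `head_share` of all clicks
    tail:  queries with at most `tail_max_clicks` clicks
    torso: everything in between
    """
    total = sum(query_clicks.values())
    head, torso, tail = [], [], []
    running = 0
    # Walk queries from most- to least-clicked.
    for query, clicks in sorted(query_clicks.items(), key=lambda kv: -kv[1]):
        if clicks <= tail_max_clicks:
            tail.append(query)
        elif running < head_share * total:
            head.append(query)
            running += clicks
        else:
            torso.append(query)
    return head, torso, tail


clicks = {"ipad": 100, "laptop": 40, "usb cable": 20, "blue trees": 1}
head, torso, tail = bucket_queries(clicks)
print(head, torso, tail)
# → ['ipad'] ['laptop', 'usb cable'] ['blue trees']
```

Even this toy version makes the point: the head is tiny and dominant, the tail is long and nearly clickless, and each bucket calls for a different optimization strategy.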

When to Use Head-N-Tail Analysis

Why you’d want to know which queries don’t result in clicks should be obvious. If a user searched on “blue trees” but didn’t find those little car air fresheners they were looking to purchase, it’s a missed opportunity. Maybe an incentive could encourage the user to convert to a purchase or a click.

Frequent queries are also an opportunity for higher clickthroughs. Popular queries could be optimized with a particular promotion or featured on the homepage or in an email campaign. Remember, most users who go to any website bounce. The more you cut that bounce rate, the better you serve your users, whether that means higher productivity in enterprise search or more sales in e-commerce.

Whether you’re in digital commerce or you’re developing an enterprise search app for a corporate intranet, Head-n-Tail reveals the reasons users leave your site without finding what they were searching for. Either it should have been on the front page or it should have been in the search results. Whether the user misspelled something or they should have been more descriptive isn’t the issue. The issues is that your search needs to anticipate the user’s needs.

Optimizing the Head, the Tail, and Everything Else

For the tail queries, Fusion doesn’t just tell you “this isn’t good” but suggests ways that you can rewrite the query. In some cases it is just adding “slop” or flexibility to the keyword or phrase. In some cases it is a spelling issue and maybe you want to add a synonym. Head-n-Tail analysis will tell you some of those right off the bat.

For the head queries, this may be as simple as adding the top n items to your front page. It also might be a good hint that you should use a recommender to personalize the front page for a user. There are other tools in Fusion that may also be useful in this case. You could just offer a redirect when someone types "contact information" in the search bar. Or you could enable signal boosting so that more relevant results automatically bubble up to the top.

How to Get Started

We’ve got an in-depth technical paper Fusion Head-Tail Analysis Reveals Why Users Leave available to guide you along with an upcoming webinar: Fusion 4 Head-n-Tail Analysis, with Lucidworks VP of Research, Chao Han.

Additional Resources:

The post Use Head-N-Tail Analysis to Increase Engagement appeared first on Lucidworks.

All About T-Shirts 2018 Edition / Evergreen ILS

We on the Evergreen Outreach Committee have had a bunch of great questions from community members in the last few weeks. While we will continue to answer questions by email, Facebook, and so on, I thought it might be good to gather all of the information together in one place for easy reference.

First of all, we currently have one t-shirt available as of March 2018: the IRC t-shirt that went on sale last year, featuring the quote from IRC, "I'm not a cataloger but I know enough MARC to be fun at parties." This shirt is black with green text and the Evergreen logo with our website on it. We have this shirt in sizes from women's small to men's 2X.

As of the 2018 conference we will have two more shirts, bringing the total to three designs, but the availability will be a little different. One will be another perennial design. It will be similar to last year's but with the new quote voted upon by the community: "You never know what trouble you'll get into reading the documentation." That one, along with the first, will be available to purchase at the conference in limited quantities. However, after feedback from last year, sizes will be expanded to 3X for men and women. The print run will be fairly small, so don't drag your feet on picking one up!

There will also be a conference t-shirt for the first time, featuring the 2018 conference logo. These are only guaranteed via pre-order. We will be ordering them in batches. If we don't get the exact number needed for a batch we may order a few more to qualify for a higher discount. If we do, we will sell the extras at the conference, but we will not be able to guarantee a number or sizes available. T-shirts that are pre-ordered will be delivered at the conference to attendees. If you're not attending the conference you are welcome to have an attendee pick yours up for you; just contact me (Rogan Hamby) or another outreach committee member to make arrangements. We don't currently have any plans to ship shirts out individually.

We’ve had pre-orders available for a little while but had questions about sizes.  Initially, we were not able to offer 3X or larger sizes but now we can. The ordering page has been updated to reflect both being able to order those larger sizes now and upgrading if you’ve already ordered a smaller shirt.  If you do this you will only be charged for the price difference between your previous order and the upgrade.

Order here:


These make great summer reading shirts for staff if your library allows it!

#LITAchat – Writing for LITA Guides / LITA

Interested in writing for LITA Guides, and about the publishing experience?

The LITA Guide Series books from Rowman and Littlefield publishers contain practical, up to date, how-to information. Proposals can be submitted to the Acquisitions editor using this link.

Join LITA members and colleagues on

Friday, March 30, 1:00-2:00pm EST

on Twitter to discuss and ask questions about writing for publication and the LITA Guide publishing process.

To participate, launch your favorite Twitter mobile app or web browser, search for the #LITAchat hashtag, and select "Latest" to follow along and reply to questions asked by the moderator or other participants. When replying to discussion or asking questions, add or incorporate the hashtag #LITAchat.

See you there!

Anxiety, performative allyship, and stepping away from social media / Meredith Farkas

Trying to move towards mindfulness

Friends, this is my life.

I’ve lived with anxiety for so long that it took me forever to realize that I had a problem. I knew I had issues with social anxiety, but I thought everyone ruminated the way I did and had intrusive and frustrating thoughts that kept them up at night. I hate the amount of time I spend obsessing about inconsequential things and how I let my anxiety ruin my day-to-day life. I’ll be trying to enjoy time with my family and my head will be somewhere entirely different. Little things that most people wouldn’t think about twice will lodge in my brain for days or weeks. A big part of why I started blogging was because it was a way to process my thoughts and get them out of my head. When you grow up being told terrible things about yourself all the time, you come to believe them, and to believe that everyone holds these views about you. It’s made me tremendously self-conscious and full of negative self-talk that’s hard to quell.

For a long time, I thought it was something I couldn’t change, but I’m beginning to recognize that I have more power over my thoughts than I’ve given myself credit for. And I want to take steps to change my thought patterns and work towards becoming more mindful. A big part of that requires changing my relationship with social media.

Social media is something that I'm recognizing is terrible for my anxiety. I don't think social media is inherently bad, but I think the ways that I'm using it are not doing me any favors. It's notoriously difficult on Twitter to communicate effectively and misunderstandings are rife. I will fully admit that I am not perfect. I make mistakes, I often miss sarcasm in people's responses, and I'm a slow thinker. Recently, a white female librarian I don't know attacked me on Twitter and basically portrayed me as a "nice white woman" who doesn't recognize her privilege and understands nothing about cultural, social, and institutional forces that constrain people's lives and choices. I found myself trying to think of ways to better explain myself to her so she (and others) did not have an inaccurate picture of me, but then I recognized that in her performance of "wokeness," she needed a "Becky" and nothing I was going to write was going to change her perception of me. What I'd done was argue against someone's suggestion that for half of white women in librarianship, their work is a "hobby job" (because they have access to wealth through their family or spouse). I said that calling it a "hobby job" was not accurate or helpful, and that the whole notion of librarianship being a "hobby job" of wives is, in good part, responsible for historically low wages in our field. Somehow, I ended up being characterized as someone who is ignorant of their privilege and doesn't care about marginalized people in the profession.

I’m still puzzled how arguing about the idea of “hobby jobs” meant that I was denying that white women, by and large, have more social capital (and wealth) than people of color.  One does not necessarily follow the other. Even the most wealthy women in any career do not deserve to have their work characterized as a “hobby job” by others. But, as someone who suffers from anxiety, I obsessed over the exchange for days — what I could have done differently, what people must think of me, etc. The thoughts were intrusive. I wondered if I had done things to make other people feel this way in the past and felt ashamed of doing that. It also made me think about other times when I felt I had to justify myself about things online. In many cases, it was times when the person and I really did not differ significantly in our view of that issue (patron privacy, systems of privilege, etc.) but the person I was chatting back and forth with seemed intent on proving that they were “better or more” than me on whatever issue it was than actually creating understanding. Yes, you’re the best ally. You’re the best privacy advocate. You’re so much more ___ than I am. It often felt more like a performance than anything designed to help, educate, or create change.

I’m intrigued by the idea of recognizing privilege as an end in itself. I definitely understand that people’s lack of recognition of their privilege keeps them from understanding how others are oppressed (and their opportunities constrained) by people and institutions, but that feels like saying that people’s lack of awareness of being an alcoholic keeps them from recognizing they have a problem that needs changing. Yes, recognizing you’re an alcoholic and that you need to change is vital, but it’s only the first in twelve steps. If you only recognized you’re an alcoholic and kept on drinking or didn’t make amends for the hurt you caused, you haven’t really done anything.

There’s nothing that made me more aware of my privilege than working with children as a social worker. Longtime readers know that I was a child and family therapist in South Florida prior to coming to librarianship. I worked with families who qualified for psychotherapy through Medicaid, which usually meant that the kids were experiencing some pretty serious psychological and behavioral issues (since Medicaid is stingy with funding to actually prevent more serious issues). The vast majority of families I worked with were of Haitian and Puerto Rican descent and some were 1st generation immigrants. With every child I worked with, you could see how legacies of abuse, poverty, and institutional/educational racism conspired to limit their future. I love that saying “some people are born on third base and go through life thinking they hit a triple.” I started life at third base with only a little league catcher between me and home plate. The children I worked with didn’t even start out in the ball park, and they had a massive chain-link fence to scale, a stick instead of a bat, and a full roster of major league all-stars between them and home plate. No therapy I could provide could materially change those odds for them and I found the most valuable work I did as a social worker involved advocating for kids with their teachers, their schools (in IEP meetings especially), and their psychiatrists (who mostly wanted to overmedicate the kids into a stupor). I fully believe that most of the kids I worked with were perfectly capable of achieving everything I have in life had they been born into different circumstances.

But how did my recognizing my privilege help those kids? If I’d apologized to them for how much it sucks that they don’t have the opportunities I did at their age, would that make them feel better? Of course not. And talking about my own privilege — or the areas of my life in which I do and don’t have privilege — just centers the conversation around me. Recognizing my privilege was only valuable in informing how I worked with those kids and how I advocated for them. It was only valuable in how I used that understanding. As an inexperienced 23-year-old social worker, standing up to an experienced psychiatrist at my job who wanted to overmedicate a young client of mine was a risk worth taking, since it was clear to me that the behaviors the child was exhibiting would not have been treated in this way had he been an affluent white child. It’s a hell of a lot easier to recognize privilege than to work to your own disadvantage to dismantle things that create privilege or to take any risks to really be an ally to individuals or groups. And what risk are you really taking clapping back at someone on social media?

Social media often feels to me like a performance space and I’ve become more and more aware of performative allyship. When I read Tanya D.’s Medium article about performative allyship a while back, I found myself both nodding along and feeling badly for the times when I know I’ve done this shit myself —

IMHO it serves as ally performance to show you're a good white person, that you're "woke". It also refocuses the issue back on you. … Miss me when you wanna get angry cause I won't manage your white guilt. That's what performative allyship comes down to. Demonstration of white guilt in hopes of praise for being more aware than other whites. Spoiler there's no prize for being the most woke white person on Twitter, or FB or for slapping down other white people.

I’m not saying that all advocacy on social media is performative allyship, but I think we all could benefit from looking at why we post the things we do to social media and whether we are actually helping or just centering the conversation on ourselves (and our wokeness/goodness). Because if it’s about moving toward greater social justice, it definitely should not be about white people. Centering things on us is just replicating the whole of American history.

Of course social media has a role in social justice work. Giving marginalized people a space to share their stories and amplifying those stories is important. Raising awareness is important. Fighting trolls and getting them banned is important. Standing up for marginalized people who are being attacked online is important, but I also hope we’re all doing the same IRL. Social media can be a great tool for organizing protests and other actions that lead to real change — Shaun King’s work is a great example of that. And, of course, it can be a great place to get ideas, share good ideas, and keep up with friends. But if you’re railing about social justice issues on social media and are not taking any steps in your regular life to do good, it’s worth examining your motives, as suggested in this article from Affinity:

What you need to know is that impact matters more than intent. Waving your hands in front of people of color’s faces and proclaiming that you’re “so woke” is more for you than it is for us, if we’re being honest. Do less. Or rather, do more. If you want to be seen as a real ally, try doing something of substance for us instead of just calling other white girls Becky and “feeling bad” when your grandpa says that people of color are subhuman over dinner.

The article goes on to suggest lots of really easy ways people can contribute in real life. They are truly the least we can do.

I’ve decided to step away from social media for a while to see how I feel when I’m not so connected to it and focused on it. I want to refocus that time I spend on social media on my family and on mindfulness work, because I’m so tired of feeling anxious all the time. I may still use social media a bit when a travel (I have two trips coming up in April), but I also may not. I don’t feel the need to make some big all-or-nothing pronouncement because that’s not healthy behavior either and also feels like a performance of some sort (like people who brag that they don’t own a television).

I think I’ll probably engage more in social media in the future, though in different ways, and I want to be more mindful of how I and others are using it. If I use social media to share something, I will question whether I’m doing for a constructive reason or because of how I want to be seen by the world. If I share information about something, it’ll be to get people involved, not to show them that I’m involved (since studies show that there is value in peripheral participation — aka slacktivism). I’m just over the competition to be the most… whatever.

Saying all this doesn’t mean that I’m more woke or aware or good or whatever. It means that I recognize I can be better and can do better. And that my participation in social media is not good for me. Your mileage may vary. You may see things totally differently and that’s fine; I would never ever assert that the way I perceive the world is necessarily the way the world is. I’d like to believe that we can have different views and still respect each other so long as those views don’t deny rights, safety, or dignity to others. You do you.

Twitter / pinboard

This is fabulous news for the cultural heritage open source world. Big ups to @code4lib and @CLIRDLF! #code4lib

Developing Good Privacy Policies, a free LITA webinar / LITA

Kicking off Privacy in Libraries, a LITA webinar series, is the free webinar:

Developing Good Privacy Policies 
Wednesday, March 28, 2018, Noon – 1:30 pm Central Time
Presenter: Sarah Houghton

Use this link to reserve your spot (required)
Thank you to the Library Leadership & Management Association (LLAMA) for co-sponsoring.

Get the details on the LITA Privacy in Libraries series web page.

Writing policies can be exciting when it's something you care about, and librarians care about privacy. This webinar will help you write (or revise) a privacy policy, covering critical issues like privacy law, professional ethics surrounding privacy, how you handle personally identifiable information, what sections should be covered in a solid privacy policy, and how to make it all comprehensible to a layperson. We will also discuss how to ensure that the privacy policy is understood and adhered to by all library stakeholders, from library staff to supervisors and governance bodies.

Sarah Houghton worked for a decade in library technology and for the past seven years has worked as the Director at the San Rafael Public Library. She focuses her "off work time work" on library ethics, privacy, surveillance, censorship, and intellectual freedom.

View details and Reserve your spot here.

Questions or Comments?

For all other questions or comments related to the webinars, contact LITA at (312) 280-4268 or Mark Beatty,

Six Years of Tracking MARC Usage / HangingTogether

One of the visualizations of MARC usage on the site. Image by OCLC, licensed under CC BY 2.0.

We have been tracking the use of the MARC standard, as evidenced in WorldCat records (now well over 400 million!), for six years. Not only have we reported on how often each tag and its constituent subfields have been used (even ones that shouldn't exist), but for selected subfields we have also reported on their contents.

If errors are detected, such as subfields being used that are not defined in the MARC standard, I will sometimes run a report so that our WorldCat Quality Control staff can make the needed corrections. Something similar happened recently, when I discovered that a field that had been deprecated and had been slowly going away had achieved new life and was coming back. It turned out that one institution hadn’t received the memo, so we cleaned up the mess and asked them to desist.

But error detection isn’t the only reason to do this work. If we are to make a smooth transition to a world of linked data, we need to know what we have to work with. These reports are meant to expose exactly what we have to work with, in the aggregate, so we can make informed decisions.

It’s also possible to track the growth in the use of the newly-defined RDA fields using these reports, such as the 33Xs. For example, there were less than 200,000 records that had a 336 in 2013, but by five years later there were over 266 million. In contrast, use of the 258 Philatelic Issue Data field went from 8 appearances in 2013 to 19 in 2018. Not exactly meteoric.

Since it isn’t practical, or meaningful, for us to produce these reports for every subfield, we focus on the ones that seem most useful to report on. But if there is a subfield that you want to know more about, just request it.

Information is power.

Digitization of Text Documents Using PDF/A / Information Technology and Libraries

The purpose of this article is to demonstrate a practical use case of the PDF/A file format for digitization of textual documents, following the recommendation to use PDF/A as a preferred digitization file format. The authors show how to convert and combine all the TIFFs for a document, with their associated metadata, into a single PDF/A-2b file. Using open source software with real-life examples, the authors show readers how to convert TIFF images, extract associated metadata and ICC profiles, and validate against the newly released PDF/A validator. The generated PDF/A file is a self-contained and self-described container which accommodates all the data from digitization of textual materials, including page-level metadata and/or ICC profiles. Through theoretical analysis and empirical examples, the authors argue that the PDF/A file format has many advantages over the traditionally preferred TIFF / JPEG2000 file formats for digitization of textual documents.

Identifying Emerging Relationship in Healthcare Domain Journals via Citation Network Analysis / Information Technology and Libraries

Online e-journal databases enable scholars to search the literature in a research domain, or to cross-search an interdisciplinary field. The key literature can thereby be efficiently mapped out. This study builds a Web-based citation analysis system consisting of four modules: (1) literature search; (2) statistics; (3) article analysis; and (4) co-citation analysis. The system focuses on the PubMed Central dataset and facilitates specific keyword searches in each research domain in terms of authors, journals, and core issues. In addition, we use data mining techniques for co-citation analysis. The results could assist researchers in developing an in-depth understanding of the research domain. An automated system for co-citation analysis promises to facilitate understanding of the changing trends that affect the journal structure of research domains. The proposed system has the potential to become a value-added database for the healthcare domain, which will benefit researchers.

Letter from the Editor (March 2018) / Information Technology and Libraries

This issue marks 50 years of Information Technology and Libraries. The scope and ever-accelerating pace of technological change over the five decades since Journal of Library Automation was launched in 1968 mirrors what the world at large has experienced. From “automating” existing services and functions a half century ago, libraries are now using technology to rethink, recreate, and reinvent services — often in areas that simply were in the realm of science fiction. 

President's Message / Information Technology and Libraries

March 2018 message from LITA President Andromeda Yelton.

Accessible, Dynamic Web Content Using Instagram / Information Technology and Libraries

This is a case study in dynamic content creation using Instagram’s API. An embedded feed of the Mansfield Library Archives and Special Collections’ most recent Instagram posts was created for their website’s home page. The process to harness Instagram’s API highlighted competing interests: web services’ desire to most efficiently manage content, Archives staff’s investment in the latest social media trends, and everyone’s institutional commitment to accessibility. 

Academic Libraries on Social Media: Finding the Students and the Information They Want / Information Technology and Libraries

Librarians from Purdue University wanted to determine which social media platforms students use, which platforms they would like the library to use, and what content they would like to see from the library on each of these platforms. We conducted a survey at four of the nine campus libraries to determine student social media habits and preferences. Results show that students currently use Facebook, YouTube, and Snapchat more than other social media types; however, students responded that they would like to see the library on Facebook, Instagram, and Twitter. Students wanted nearly all types of content from the libraries on Facebook, Twitter, and Instagram, but they did not want to receive business news or content related to library resources on Snapchat. YouTube was seen as a resource for library service information. We intend to use this information to develop improved communication channels, a clear libraries social media presence, and a cohesive message from all campus libraries.

Trope or Trap? Role-Playing Narratives and Length in Instructional Video / Information Technology and Libraries

This article discusses the results of a survey of over thirteen hundred respondents. The survey was designed to establish the preferences of viewers of instructional how-to videos, asking whether video length and the presence of a role-playing narrative enhance or detract from the viewer experience.

Étudier / Ed Summers

TL;DR - étudier is a command line utility for saving an article’s network of citing literature from Google Scholar as a GEXF file for viewing in Gephi, and as a (currently very bare bones) D3 network visualization for viewing in a web browser.

Periodically I discover a new piece of research I really like, and I want to know who else is citing it to see how it fits into the larger research landscape. Ever since Eugene Garfield dreamed up the citation index, we’ve been able to trace citation links forwards, to see who is citing an article. Of course a side effect of this has been the contentious field of scientometrics, in which the impact of research is purportedly measured. Fifty years later Google Scholar offers this same functionality, by scraping scholarly publication metadata from publisher and repository websites, and building a database that lets you see what research is citing something with the click of a link.

It’s kind of poignant that Google provides this functionality, because their own PageRank algorithm for ranking web search results was directly influenced by Garfield, who died last year.

Got Data?

I recently went looking around for techniques to collect this network of close relationships around an article from Google Scholar. Ideally Scholar would have an API that provided structured metadata for research publications. But Google does not make an API available, probably because of the privileged access it receives from publishers like Elsevier, who have a vested interest in there not being a Google Scholar API.

I briefly looked at the Web of Science API, which has an endpoint for looking up citing articles. However, it appears that access to the API is not guaranteed even if you are lucky enough to belong to an institution that pays for access to Web of Science. Furthermore, API access is tiered (lite or premium), and tracing the links between articles requires premium access. Lastly, the endpoint uses SOAP, which is a little bit painful to use. I can understand why full access might be limited to business partners, but I was really just interested in looking at the immediate network of relationships around an article, not in getting some global picture of citations.

Then I ran across Jimmy Tidey’s excellent Scraping Google Scholar to write your PhD literature chapter, which describes his tool Bibnet, which does exactly what I wanted: it collects the data around a particular publication from Google Scholar. I tried running it locally, but couldn’t quite seem to get it to work. It involves running a local web application in combination with a Chrome extension. Tidey also made the application available as a hosted service, but I kind of got cold feet after seeing the invalid SSL certificate. Also, the application seemed to do a lot more than I wanted: it has a backend database where citations are stored. I just wanted to get the network of citations out and move on to visualizing them and doing the reading.


So, somewhat predictably, I decided to write my own tool. étudier is a command line utility written in Python that uses Selenium and requests-html to automatically drive a browser to collect a citation graph around a particular Scholar citation. The resulting network is written out as a Gephi file and a D3 visualization using networkx. The D3 visualization could use some work, so if you add style to it please submit a pull request.
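Stripped of the scraping, the output step boils down to turning (citing, cited) pairs into a graph file. étudier itself relies on networkx for this; the stdlib-only function below is just a sketch of the minimal GEXF shape it writes, not étudier’s actual code (the example titles are borrowed from later in this post):

```python
import xml.etree.ElementTree as ET

def to_gexf(edges):
    """Serialize (citing, cited) title pairs as a minimal directed GEXF graph."""
    gexf = ET.Element("gexf", xmlns="http://www.gexf.net/1.2draft", version="1.2")
    graph = ET.SubElement(gexf, "graph", defaultedgetype="directed")
    nodes_el = ET.SubElement(graph, "nodes")
    edges_el = ET.SubElement(graph, "edges")
    ids = {}
    # Assign each distinct title a numeric node id, in order of appearance.
    for pair in edges:
        for title in pair:
            if title not in ids:
                ids[title] = str(len(ids))
                ET.SubElement(nodes_el, "node", id=ids[title], label=title)
    # One directed edge per citation link.
    for i, (citing, cited) in enumerate(edges):
        ET.SubElement(edges_el, "edge", id=str(i),
                      source=ids[citing], target=ids[cited])
    return ET.tostring(gexf, encoding="unicode")

print(to_gexf([("Situated Learning", "Theory in Anthropology since the Sixties")]))
```

A file produced this way opens in Gephi just like étudier’s output, though the real tool also attaches per-node metadata.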

To use it, you first need to find a citation you are interested in on Google Scholar and click on its Cited By link. For example here is the Cited By URL for Sherry Ortner’s influential Theory in Anthropology since the Sixties:,21

With this URL in hand you can run ',21'

This will collect the ten citations on the page, and then examine each one to see what cites them. If you want, you can collect more than the first page using the --pages option: --pages 2 ',21'

And you can also collect the research that cites the research that cites your article by using the --depth option: --depth 2 ',21'

It’s unlikely you’ll want to use a --depth greater than 2, because even examining only the first page of results at each step builds a network of up to 1,000 citations, which starts to get difficult to visualize.
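The growth is easy to see with some back-of-the-envelope arithmetic. Assuming, as a simplification of étudier’s actual crawling behavior, that every publication yields a full page of ten citing articles, the counts grow by a factor of ten per level, so the deepest level of a --depth 2 crawl alone can contribute 10³ = 1,000 articles:

```python
def max_publications(depth, per_page=10, pages=1):
    """Upper bound on publications touched: a full page of citing articles
    for the seed, then again for every publication found at each level."""
    breadth = per_page * pages
    return sum(breadth ** level for level in range(1, depth + 2))

print(max_publications(depth=1))  # 110
print(max_publications(depth=2))  # 1110
```

Collecting more pages per result (--pages) multiplies the branching factor, so the bound blows up even faster.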

If you are wondering why it uses a non-headless browser, it’s because Google is quite protective of this data and routinely will ask you to solve a CAPTCHA (identifying street signs, cars, etc. in photos). étudier will let you complete these tasks when they occur and then continue on its way collecting data.


The network is written out as an HTML file that uses D3 to visualize the data. Here you can see a network generated for the Ortner article I mentioned above.

I’m hovering over an article that is at the center of a cluster of citations so I can see its title: Situated Learning: Legitimate peripheral participation by Lave and Wenger. Clicking on the nodes should open a new page and bring your browser to what Google Scholar thinks is the webpage for the publication.


Hopefully the D3 visualization can be improved a bit, but given that the resulting networks can be different it might be difficult to find a one-size-fits-all solution. So the data is also saved as a GEXF file that can be loaded into Gephi, where it can be massaged. Here is a visualization I made of 693 publications that I collected for the Ortner article with --depth 2.

When I opened the Gephi file it looked like a giant hairball, which is not unusual. Describing how to use Gephi is a bit beyond the scope of this post, but there are lots of videos on YouTube for working with Gephi. I bookmarked a few that I found particularly useful.

In the example above I filtered the network to only include articles that had 10 or more citations (in-degree >= 10). I then ran community detection, colored the nodes based on their community, and applied a Force Atlas layout to arrange the nodes. Finally I made the node size relative to the number of inbound citations. With a little finagling you can hover over the nodes to see what they are.
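That in-degree filter is simple enough to reproduce outside Gephi if you have the edge list. A sketch (this is not how Gephi implements it, just the same idea):

```python
from collections import Counter

def filter_by_indegree(edges, minimum=10):
    """Keep nodes cited at least `minimum` times, given (citing, cited) pairs."""
    indegree = Counter(cited for _, cited in edges)
    return {node for node, count in indegree.items() if count >= minimum}

# A hypothetical hub cited by 12 papers survives; a node cited once does not.
edges = [(f"paper-{i}", "hub") for i in range(12)] + [("a", "b")]
print(filter_by_indegree(edges))  # {'hub'}
```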

Search Results

You can also visualize generic search results. So, for example, here is a search for cscw memory to find papers from CSCW that mention the word memory. I collected the first three pages of results and generated this graph of citations, which lets me see the clusters of results: --pages 3 ''

This particular graph was constructed in a similar way to the Gephi graph above except that the nodes are weighted by the total number of times they are cited in the literature. In theory these little snapshots can help guide reading when investigating a new domain.

The Metadata

If you peek in the output.html you’ll see the JSON metadata, which is also present in the GEXF file. Here’s an example:

You can use this metadata in the visualization, as I did above when I wanted to vary the size of the node in Gephi using the number of times a publication was cited: cited_by.

So, if you are looking for some help doing a literature review and have a chance to try out étudier, I’d be interested to hear how it works for you. As with many scraping applications it is quite brittle, and if Google changes its HTML markup it’s liable to break. So please file an issue in GitHub if you notice that has happened.

Long-awaited FDLP Modernization Act would strengthen public access to government information / District Dispatch

The bipartisan FDLP Modernization Act of 2018 (H.R. 5305) was introduced on March 15 following months of effort by the Committee on House Administration. The bill would modernize the Federal Depository Library Program and related programs that provide public access to government information. The bill is sponsored by Committee Chairman Gregg Harper (R-MS), Ranking Member Bob Brady (D-PA), and Committee members Rodney Davis (R-IL), Barbara Comstock (R-VA), Mark Walker (R-NC), Barry Loudermilk (R-GA), Zoe Lofgren (D-CA) and Jamie Raskin (D-MD).

The FDLP Modernization Act was developed with input from the library community following a series of public hearings in the Committee on House Administration in 2017, which included testimony from librarians. Chairman Harper commented at the time that the hearings were an opportunity “to see how we can make something that we like, better.” The bill follows that approach and incorporates many of the recommendations ALA sent to the committee.

While earlier, unintroduced drafts of the legislation dealt with a wide range of topics related to the Government Publishing Office (GPO) and government printing, the FDLP Modernization Act focuses solely on the FDLP and the Superintendent of Documents’ programs that provide public access to government information. Specifically, the bill would:

  • Modernize the FDLP to provide greater flexibility, facilitate collaboration, streamline program requirements, and allow more libraries to participate
  • Improve public access to electronic government information by clarifying that digital information belongs in the program, guaranteeing free access to GPO’s online repository, modernizing the repository and online services, and authorizing the Superintendent to digitize historical publications
  • Strengthen preservation of government information by clarifying the Superintendent’s responsibility for preservation, establishing partnerships with libraries to preserve publications, and directing GPO to preserve publications in its online repository
  • Increase transparency and oversight to encourage the program to continue to evolve to meet the future needs of libraries and the public

ALA welcomes the legislation and sent a letter of support, along with the American Association of Law Libraries (AALL) and the Association of Research Libraries (ARL), following the bill’s introduction. Prior to this process, it had been 20 years since Congress last held hearings or introduced legislation regarding the FDLP. After such a long wait, we are pleased to see bipartisan consensus that the time has come to modernize this program.

ALA is grateful for the committee’s work to strengthen this vital program. We appreciate the leadership of Reps. Harper, Brady, Davis, Comstock, Walker, Loudermilk, Lofgren and Raskin in introducing the FDLP Modernization Act. This commonsense legislation will better support libraries in providing effective and long-term access to government information. We encourage the Committee on House Administration to promptly advance the legislation.

The post Long-awaited FDLP Modernization Act would strengthen public access to government information appeared first on District Dispatch.

Research Graph VIVO Cloud Pilot / DuraSpace News

Attention: faculty information, grants, and repository managers!

You’re invited to participate in a project aimed at collecting, augmenting, and making discoverable patterns of faculty collaboration and research outputs through a pilot led by Research Graph and VIVO, and managed by Duraspace as a not-for-profit venture.

The proposed project will:

Marrakesh Treaty Closer to Reality with Senate bill introduced today / District Dispatch

This post originally appeared in American Libraries’ blog The Scoop.

On March 15, the Marrakesh Treaty Implementation Act (S. 2559) was introduced in the US Senate, nudging the United States further toward adoption of the Marrakesh Treaty. The international copyright treaty provides a copyright exception—the first ever in an international treaty—for libraries as authorized entities to make copies of entire articles and books accessible for people with print disabilities and distribute those copies across borders.

The Marrakesh Treaty Implementation Act (S. 2559) was introduced on Thursday, March 15, by the bipartisan leadership of the Senate Judiciary and Foreign Relations Committees, Chairmen Grassley and Corker and Ranking Members Feinstein and Menendez, along with Senators Harris, Hatch and Leahy.

If the Marrakesh Treaty Implementation Act is passed and signed by the president, the bill will greatly increase access for English speakers with print disabilities, especially in developing countries, where less than 1% of all published print content is accessible. The US will benefit as well by being able to obtain foreign-language content, especially for Spanish speakers with print disabilities.

The American Library Association (ALA) first became involved in advocating for an exception for the print-disabled back in 2008, when ALA’s Washington Office collaborated with the Library Copyright Alliance (LCA), which includes the Association of College and Research Libraries (ACRL) and the Association of Research Libraries (ARL), to apply for and be granted official nongovernmental status to attend and make position statements at World Intellectual Property Organization meetings. For the next five years, LCA provided input to the US delegation and partnered on advocacy efforts with the International Federation of Library Associations and Institutions, World Blind Union, National Federation of the Blind (NFB), and other associations for people with print disabilities. When the treaty was adopted in 2013 at an international diplomatic conference in Marrakesh, it was hailed as the “Miracle in Marrakesh.”

Signing the treaty was only the first step. Because ratification and adoption of the treaty in the US requires modest changes to US copyright law, copyright policy stakeholders, including the LCA, NFB, and the Association of American Publishers (AAP), met to reach consensus on legislative language, extending the process another five years.

Now that the Marrakesh Treaty Implementation Act has been introduced, the next step toward full adoption of the treaty involves advocating to ensure swift passage of the legislation. Senators need to hear from constituents that the treaty is a priority. Contact your senator and show your support for the Marrakesh Treaty Implementation Act at the ALA Action Center.

The post Marrakesh Treaty Closer to Reality with Senate bill introduced today appeared first on District Dispatch.

Data models age like parents / Jakob Voss

Denny Vrandečić, employed as an ontologist at Google, noticed that all six of the linked data applications he linked to 8 years ago (IWB, Tabulator, Disko, Marbles, rdfbrowser2, and Zitgist) have disappeared or changed their calling syntax. This reminded me of a proverb about software and data:

software ages like fish, data ages like wine.

The original form of this saying seems to come from James Governor (@monkchips), who in 2007 derived it from an earlier phrase:

Hardware is like fish, operating systems are like wine.

The analogy of fishy applications and delightful data has been repeated and explained and criticized several times. I fully agree with the part about software rot but I doubt that data actually ages like wine (I’d prefer Whisky anyway). A more accurate simile may be „data ages like things you put into your crowded cellar and then forget about“.

Thinking a lot about data I found that data is less interesting than the structures and rules that shape and restrict data: data models, ontologies, schemas, forms etc. How do they age compared with software and data? I soon realized:

data models age like parents.

First they guide you, give good advice, and support you as best they can. But at some point data begin to rebel against their models. Sooner or later parents become uncool, disconnected from current trends, outdated or even embarrassing. Eventually you have to accept their quaint peculiarities and live your own life. That’s how standards proliferate. Both ontologies and parents ultimately become weaker and need support. And in the end you have to let them go, sadly looking back.

(The analogy could be extended further; for instance, data models might be frustrated when confronted by how actual data compares to their ideals. But that’s another story.)

Evergreen Project Represents at ALA Midwinter / Evergreen ILS

The Evergreen project highlighted the web client and the modern workflows it supports during the recent American Library Association (ALA) Midwinter 2018 meeting.

Elizabeth Thomsen of NOBLE facilitated the session.  Shae Tetterton of Equinox Open Library Initiative highlighted new features of the web client while Debbie Luchenbill of MOBIUS talked about plans for the upcoming Evergreen conference in St. Charles, Missouri.

The Evergreen Outreach Committee plans these programs during the ALA midwinter meetings and annual conferences to increase Evergreen’s visibility in the larger library community and to connect with existing users who we may not see at the Evergreen conference or through other community channels.

The Outreach Committee started this effort at ALA Annual 2015 in San Francisco, when the community, using donations from Evergreen sites, sponsored an exhibit booth at the conference. Although the community has not sponsored a booth since then, the Outreach Committee has continued to schedule programs as a way to showcase the software and community to potential and existing users.

Plans are already underway for our upcoming program at ALA Annual in New Orleans, called Tech gurus optional, which will focus on the many different ways libraries can host and support an open-source system.

Research data management librarian job posting at York University / William Denton

Research data management librarian wanted at York University Libraries in Toronto, where I work.

York University Libraries (YUL) seeks a dynamic and innovative individual with strong leadership potential to advance York University Libraries’ research data management portfolio in support of the research community across campus.

The successful candidate will be a member of the new Research and Open Scholarship division and will report to the Director for Open Scholarship. The incumbent will lead the development of a research data management program on campus and will coordinate ongoing support in this area within a team-based environment. The incumbent will work collegially with departmental members to advance the wider responsibilities of the Open Scholarship Department.

I’m not on the search committee and am happy to answer any questions I can from anyone interested in applying, by email or even by phone. They’re looking for someone who knows RDM and also can handle chemistry and other physical sciences.

Pay at York is good. I’d guess someone five years out of library school would get over $90,000 CAD. We have good benefits, time for (and expectations about) research, and of course the subway comes right onto campus now. After six years the person will be up for continuing appointment (what we call tenure) and then get a sabbatical year. On the bad side, of course, CUPE 3903 is on strike right now.

YUL is going through a restructuring and this position will be in a new department. Anyone taking the job should ask serious questions about how the department will work and what support they will have in the role, but all in all I think the new structure will work pretty well and there is a lot of promise ahead.

Non-Canadians are welcome to apply. The way it works, any qualified Canadian trumps any non-Canadian, even if the non-Canadian is actually better, but don’t let that stop you from applying. The bigger the pool the better for us, of course, but with specialized knowledge like this, you never know what will happen or how many Canadians will apply.

Anyone whose career has taken unusual turns or had to take some time out (for parental or caregiver leave or something else) should mention that in the cover letter, and the search committee will consider it.

ARKs in the Open: Project Update #3, Roadmap and Resources and Value Statement / DuraSpace News

Add your voice to the new ARKs in the Open Roadmap and Resources and Value Statement. These new project artifacts describe the product and community resources required to secure a sustainable future for ARKs, along with our value proposition.

The roadmap is designed to welcome people to the ARKs in the Open project, tell them how to get involved in the project, and present a list of short-term, long-term, and aspirational priorities.

Privacy in Libraries, a LITA webinar series / LITA

This spring, LITA’s Patron Privacy Interest Group will offer a webinar/training series on privacy issues for libraries! Targeted at “advanced newcomers,” this series is for you if you’re interested in privacy issues; have ever wondered how to protect the data you use; and read through the Patron Privacy Checklists and wondered what else was out there. Led by member-experts of the Patron Privacy Interest Group, this series will tackle privacy policies and staff training, strategies to manage the lifecycle of personally identifiable information (PII), tips for adopting encrypted technologies, defending the privacy of library patrons, and communicating about and advocating for privacy to library stakeholders.

And the first webinar is free! The remaining four can be attended individually, or you can get a package deal for all four.

Get the details on the LITA Privacy in Libraries series web page and register here.

Here’s the line up:

  • Developing Good Privacy Policies, March 28, 2018

To register for “Developing Good Privacy Policies”, use this link to reserve your spot (required). Thank you to the Library Leadership & Management Association (LLAMA) for co-sponsoring.

  • Wrangling Library Patron Data, April 11, 2018
  • Adopting Encryption Technologies, April 25, 2018
  • Analytics and Assessment: Privacy vs. Surveillance, May 9, 2018
  • Take Back Research Privacy, May 23, 2018

Here are the Presenters:


  • Sarah Houghton
  • Becky Yoose
  • Matt Beckstrom
  • T.J. Lamanna
  • Eric Hellman
  • Sam Kome

View details and Register here.

Questions or Comments?

For all other questions or comments related to the webinars, contact LITA at (312) 280-4268 or Mark Beatty,

Jobs in Information Technology: March 14, 2018 / LITA

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

City of Virginia Beach, Technology Services Manager #24897, Virginia Beach, VA

University of Virginia School of Law, Web Services Librarian, Charlottesville, VA

Folger Shakespeare Library, Associate Librarian for Collection Description and Imaging, Washington, DC

Kansas City Public Library, ILS System Administrator, Kansas City, MO

Harvard Library, Head of Digital Preservation, Cambridge, MA

Moreno Valley College, Technical Services & Digital Asset Librarian / Assistant Professor (Extended), Moreno Valley, CA

University of Denver, Residency Librarian, Denver, CO

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

OpenID Connect / Alf Eaton, Alf

OpenID Connect (OIDC) is a specification built on top of OAuth 2.0 that allows applications to authenticate users via third-party Identity Providers.

Identity Providers can be certified if their service conforms to the OpenID Connect specification.

Authentication Flow

There are two types of authentication flow: "code", where communication with the Identity Provider happens via a server, and "implicit", where the client (e.g. a single-page application running in a web browser) communicates directly with the Identity Provider. Only the "implicit" flow is discussed here.

At the start of the implicit authentication flow, the client sends the user to the Identity Provider’s authorization endpoint. If it’s not already known, that endpoint can be found via a discovery document, and there’s even a way to discover the discovery document via WebFinger, but that part of the process isn’t essential here.

The request includes some query parameters:

const params = new URLSearchParams()
params.set('scope', 'openid') // use the OIDC protocol
params.set('client_id', CLIENT_ID) // the client identifier
params.set('redirect_uri', REDIRECT_URI) // redirect to this URL 
params.set('response_type', 'id_token') // return the id_token in a URL fragment
params.set('nonce', randomString()) // a random string to be included in the id_token
params.set('prompt', 'none') // don't prompt unless necessary

location.href = AUTHORIZATION_ENDPOINT + '?' + params.toString()

The Identity Provider must verify that this client_id and redirect_uri pair has previously been registered, so the id_token will only be sent to the application which the user has approved.

If the user declines, they're sent back to the redirect_uri with a URL fragment containing an error.

If the user approves, they're sent back to the redirect_uri with a URL fragment containing an id_token, which the client then stores. The client must check that the nonce value in the id_token matches the nonce value that was sent in the original request, so that an id_token can't be forced upon the user unless it was requested.
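In a browser this happens in JavaScript, but the logic is language-independent. A minimal sketch in Python of extracting the id_token from the redirect’s URL fragment and checking the nonce (the URL shape and function name are illustrative; signature verification is a separate step):

```python
import base64
import json
from urllib.parse import parse_qs, urlsplit

def token_from_redirect(url, expected_nonce):
    """Extract the id_token from a redirect URL's fragment and check its nonce."""
    fragment = parse_qs(urlsplit(url).fragment)
    if "error" in fragment:  # the user declined, or the request failed
        raise ValueError(fragment["error"][0])
    id_token = fragment["id_token"][0]
    # Decode the payload (the middle of the three dot-separated parts),
    # restoring the base64url padding that JWTs strip off.
    payload_b64 = id_token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    if payload.get("nonce") != expected_nonce:
        raise ValueError("nonce mismatch: this token was not requested by us")
    return id_token
```

Only after the nonce (and later the signature and claims) check out should the token be stored and attached to API requests.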

The client then includes the id_token in the HTTP headers of requests to an API that requires authentication (the Relying Party), as Authorization: Bearer ${id_token}.

JSON Web Token

The id_token is a JSON Web Token (JWT), which is signed but not encrypted, so you can read the contents.

The id_token has 3 parts, separated by a . character:

  • The first part of the token contains the base64url-encoded header, which describes the algorithm and key used to create the signature for the payload.

  • The second part of the id_token contains the base64url-encoded payload.

  • The third part of the id_token contains the signature, which confirms that the payload of the id_token hasn't been modified.

    The "RS256" signature algorithm is asymmetric: a private key is used to sign the token, and a public key is used to verify the signature. The public key can be retrieved from a JSON Web Key Set (JWKS) endpoint, whose URL can be found via the same discovery document as the authorization endpoint. The key set contains a list of keys; to validate the signature, take the key whose kid value matches the kid value in the header of the id_token. Once a key has been fetched, it should be cached locally, as it won’t change (when a new key is used for signing the id_token it will have a different kid).
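You can see this three-part structure by taking a token apart yourself. A stdlib-only sketch, for inspection only; real code should verify the signature with a vetted JWT library before trusting anything it reads:

```python
import base64
import json

def b64url_decode(part):
    """base64url decoding with the padding that JWTs strip off restored."""
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def inspect_jwt(id_token):
    """Read (but do NOT trust) a JWT's header and payload; signature left as bytes."""
    header_b64, payload_b64, signature_b64 = id_token.split(".")
    return (json.loads(b64url_decode(header_b64)),
            json.loads(b64url_decode(payload_b64)),
            b64url_decode(signature_b64))
```

For an RS256 token, the header returned by inspect_jwt is where you find the kid used to pick the right JWKS key.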

The payload

The payload contains several important values, which the server must verify:

  • exp: the expiry date of this token, after which the token is no longer valid.
  • aud: the "audience" for this token, which must match the client_id used to request it (this prevents a site that receives an id_token from a user using it to impersonate the user on another site).
  • iss: the "issuer" of this token, which must be an expected value (the Identity Provider’s issuer URL).

The unique identifier of the user within the issuer's domain (i.e. their account ID) will be included in the id_token as the sub value.

If the client has asked for other information to be included, e.g. by adding email or profile to the scope parameter in the original request, those values will also be included in the id_token.
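Putting those checks together, a minimal server-side validation sketch might look like the following. The claim names come from the spec (aud may be a single value or a list), but the policy shown is deliberately bare; a production Relying Party would use a maintained OIDC library, and would verify the signature before reading any of these claims:

```python
import time

def validate_claims(payload, client_id, issuer):
    """Reject tokens that are expired, for another client, or from the wrong issuer."""
    if payload["exp"] < time.time():
        raise ValueError("token expired")
    aud = payload["aud"]
    audiences = aud if isinstance(aud, list) else [aud]
    if client_id not in audiences:
        raise ValueError("token issued for a different client")
    if payload["iss"] != issuer:
        raise ValueError("unexpected issuer")
    return payload["sub"]  # the user's stable account ID with this issuer
```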


When the token expires, the client can get a new token using "silent authentication": open the same authorization endpoint as before, but in a hidden iframe. If the user's approval is still valid (their session with the Identity Provider will be stored in a cookie which the authorization endpoint in the iframe, being on the Identity Provider's domain, can read), the iframe will redirect back to the redirect_uri for the new id_token to be read from the URL fragment and stored as before.

Further reading

LITA, LLAMA, ALCTS collaboration FAQ: Post #1 / LITA

On February 23, I posted for discussion a proposal on a closer formal relationship between LITA, LLAMA, and ALCTS. That included an anonymous feedback form where you can ask questions, express feelings, et cetera. I will be collating and answering these questions every few weeks here on LITAblog (so please keep asking!).

Thus far there is just one question, but I’m sure it speaks for many of you: “Will my dues go up if these three divisions combine?” And the answer is: we’re not totally sure yet; it depends, but up and down are both possibilities; and either way you’d get more value from your membership.

We recognize that finances are an important issue for most of our members, and any combined effort has to be financially sustainable for both the divisions and our members, or it can’t go forward. We (the leadership of the three divisions) are in the process of constituting a finance workgroup to look closely at the numbers and advise us on plans. Until that group reports, we will not have exact numbers for you. But here’s what we know right now:

  • LLAMA personal member dues are $50, LITA’s are $60, and ALCTS’s are $75.
  • …unless you’re a student, in which case your dues are $15 (ALCTS, LLAMA) or $25 (LITA).
  • LITA also has a non-salaried member tier; ALCTS has several additional tiers.
  • Some people are paying dues to multiple divisions: almost 800 are in both LITA and LLAMA, almost 600 are in both LITA and ALCTS, and at least one is in all three. (Me, as of my membership renewal last week.)

Some back-of-the-envelope calculations suggest that we can have a viable path forward with dues in the current ballpark, but what that means for you personally depends on which dues you are paying. It’s entirely possible that some members will see no change, some will see an increase, and some will see a decrease.

Regardless of how the numbers work out for you, I’d like to leave you with an idea wisely suggested by LLAMA President-Elect Lynn Hoffman. There are lots of people who would like to be in more than one of these three divisions, but who find it too expensive. If you’re one of them, and I came to you and said “you can have a whole second division for just $10 or $20 more than you’re paying right now”…would you jump at that chance?

My Google docs folder for my files on the merger question is labeled Stronger Together, because the consensus of the LITA Board after Midwinter was that there are so many ways we can benefit from each other's strengths. LITA members, you already know how much technological skill, creativity, and hard work are in this division. We run high-profile events like Top Tech Trends and Happy Hour; we publish really useful books; we're the only one of the three divisions to run a face-to-face conference.

But we’ve heard for years that you want more of a leadership training pathway, and more ways to stay involved with your LITA home as you move into management; LLAMA opens up all kinds of natural opportunities. They have an agile divisional structure with their communities of practice and an outstanding set of leadership competencies.

And anyone involved with library technology knows that we live and die by metadata, but we aren’t always experts in it; joining forces with ALCTS creates a natural home for people no matter where they are (or where they’re going) on the technology/metadata continuum. ALCTS also runs far more online education than LITA and runs a virtual conference.

If these three divisions united – no matter whether your dues went up or down – you’d be getting a lot more value for your membership dollars. We wouldn’t even be considering the question if we weren’t certain that was the case, and we won’t advance to formal changes unless we have a financial plan we can believe in.

The Open Data Charter’s Measurement Guide is now open for consultation! / Open Knowledge Foundation

This blogpost is co-authored by  Ana Brandusescu  and Danny Lämmerhirt, co-chairs of the Measurement and Accountability Working Group of the Open Data Charter.

The Measurement and Accountability Working Group (MAWG) is launching the public consultation phase for the draft Open Data Charter Measurement* Guide!


Measurement tools are often described in technical language. The Guide explains how the Open Data Charter principles can be measured. It provides a comprehensive overview of existing open data measurement tools and their indicators, which assess the state of open government data at a national level. Many of the indicators analysed are relevant for local and regional governments, too. This post explains what the Measurement Guide covers, the purpose of the public consultation, and how you can participate!

What can I find in the Measurement Guide?

  • An executive summary for people who want to quickly understand what measurement tools exist and for what principles.
  • An analysis of measuring the Charter principles, which includes a comparison of the indicators that are currently used to measure each Charter principle and its accompanying commitments. It reveals how the measurement tools — Open Data Barometer, Global Open Data Index, Open Data Inventory, OECD’s OURdata Index, European Open Data Maturity Assessment — address the Charter commitments. For each principle, case studies of how Charter adopters have put commitments into practice are also highlighted.
  • Comprehensive indicator tables show available indicators against each Charter commitment. This table is especially helpful when used to compare how different indices approach the same commitment, and where gaps exist.
  • A methodology section that details how the Working Group mapped existing measurement indices against Charter commitments.
  • A recommended list of resources for anyone who wants to read more about measurement and policy.

We want you — to give us your feedback!

The public consultation is a dialogue between measurement researchers and everyone who is working with measurements — including government, civil society, and researchers. If you consider yourself part of one (or more) of these groups, we would appreciate your feedback on the guide. Please bear the questions below in mind as you review the Guide:

  • Is the Measurement Guide clear and understandable?
  • Government: Which indicators are most useful to assess your work on open data and why?
  • Civil society: In what ways do you find existing indicators useful to hold your government to account?
  • Researchers: Do you know of measurements and assessments that are well suited to understanding the Charter commitments?

How does the public consultation process work?

The public consultation phase will be open for two weeks — from 12 to 26 March — and includes:

  1. Public feedback, where we gather comments on the Measurement Guide and the indicator tables document.
  2. Public (and private) responses from MAWG members throughout the consultation phase.

How can I give feedback to the public consultation?

  1. You can leave comments directly in the Measurement Guide, as well as the indicator tables.
  2. If you want to send a private message to the group chairs, drop Ana and Danny an email, or send us a tweet at @anabmap and @danlammerhirt.
  3. Share your feedback with the community using the hashtag #OpenDataMetrics.

We will incorporate your feedback into the Measurement Guide during the public consultation period. We plan to publish a final version of the Guide by the end of April 2018.

Please note that we will not incorporate new indicators or comments on the Charter principles themselves. If you have comments about improving the Charter principles, we encourage you to participate in the updating process of the Charter principles.

*Since the last time we wrote a blog post, we have changed the name to more accurately represent the document, from Assessment Guide to Measurement Guide.

Are you using DSpace, Fedora or VIVO? / DuraSpace News

As the number of DSpace and Fedora repositories and VIVO installations grows (in 2018 there are more than 2,900 registered worldwide), the desire to learn more about them also increases: where they are, how they are accessed online, what type of content they hold, their software versions, specializations, and so on. Answers to these and other questions can lead to greater connections among institutions and repository instances with similar goals and interests.

Reflections on Code4Lib 2018 / ACRL TechConnect

A few members of Tech Connect attended the recent Code4Lib 2018 conference in Washington, DC. If you missed it, the full livestream of the conference is on the Code4Lib YouTube channel. We wanted to highlight some of our favorite talks and tie them into the work we're doing.

Also, it’s worth pointing to the Code4Lib community’s Statement in Support of opening keynote speaker Chris Bourg. Chris offered some hard truths in her speech that angry men on the internet, predictably, were unhappy about, and it’s a great model that the conference organizers and attendees promptly stood in support of her.


One of my favorite talks at Code4lib this year was Amy Wickner’s talk, “Web Archiving and You / Web Archiving and Us.” (Video, slides) I felt this talk really captured some of the essence of what I love most about Code4lib, this being my 4th conference in the past 5 years. (And I believe this was Amy’s first!). This talk was about a technical topic relevant to collecting libraries and handled in a way that acknowledges and prioritizes the essential personal component of any technical endeavor. This is what I found so wonderful about Amy’s talk and this is what I find so refreshing about Code4lib as an inherently technical conference with intentionality behind the human aspects of it.

Web archiving is of broad interest but can seem overwhelming to begin to tackle. I mean, the internet is just so big. Amy brought forth a proposal for how a person or institution can begin thinking about a web archiving project, focusing first on the significance of appraisal. Wickner, citing Terry Cook, spoke of the “care and feeding of archives” and of thinking about appraisal as storytelling. I think this is a great way to make a big internet seem smaller: understanding the importance of care in appraisal while acknowledging that for web archiving, it is an essential practice. Representation in web archives is more likely to be consciously chosen in the appraisal of web materials than it has been in other formats historically.

This statement resonated with me: “Much of the power that archivists wield are in how we describe or create metadata that tells a story of a collection and its subjects.”

And also: For web archives, “the narrative of how they are built is closely tied to the stories they tell and how they represent the world.”

Wickner went on to discuss how web archives are and will be used, and who they will be used by, giving some examples but emphasizing there are many more, and noting that we must learn to “critically read as much as learn to critically build” web archives, while acknowledging that web archives exist both within and outside of institutions. For personal archiving, it can be as simple as replacing links in documents with Wayback Machine or Webrecorder links.

Another topic I enjoyed in this talk was the celebration of precarious web content through community storytelling on Twitter with the hashtags #VinesWithoutVines and #GifHistory, two brief but joyous moments.


The part of this year’s Code4Lib conference that I found most interesting was the talks and the discussion at a breakout session related to machine learning and deep learning. Machine learning is a subfield of artificial intelligence, and deep learning is a kind of machine learning that utilizes hidden layers between the input layer and the output layer in order to refine and produce the algorithm that best represents the result in the output. Once such an algorithm is produced from the data in the training set, it can be applied to a new set of data to predict results. Deep learning has been making waves in many fields, such as Go playing, autonomous driving, and radiology, to name a few. There were a few different talks on this topic, ranging from reference chat sentiment analysis to feature detection (such as railroads) in map data using a convolutional neural network model.
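The "hidden layer" idea can be illustrated with a toy forward pass. This is purely illustrative, with made-up weights; in a real network the weights are learned from training data rather than hand-picked.

```python
import math


def sigmoid(z):
    """Standard logistic activation, squashing any value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """One input layer -> one hidden layer -> one output value.

    x        : list of input features
    w_hidden : one weight vector per hidden unit
    w_out    : one weight per hidden unit, for the output
    """
    # The hidden layer re-represents the inputs before the output is computed.
    hidden = [sigmoid(sum(w_i * x_i for w_i, x_i in zip(w, x)))
              for w in w_hidden]
    return sigmoid(sum(w_o * h for w_o, h in zip(w_out, hidden)))
```

Deep learning stacks many such hidden layers, and training adjusts the weights so the output layer best matches the desired results.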

“Deep Learning for Libraries,” presented by Lauren Di Monte and Nilesh Patil from the University of Rochester, was the most practical of those talks, as it started with a specific problem to solve and resulted in action to address it. Di Monte and Patil showed how they applied deep learning techniques to a problem in their library’s space assessment: finding out how many people visit the library to use its space and services, and how many are simply passing through to reach another building or the campus bus stop adjacent to the library. Not knowing this made it difficult for the library to decide on the appropriate staffing level or the hours that best serve users’ needs. It also prevented the library from demonstrating its reach and impact with data and advocating for needed resources or budget to decision-makers on campus. The goal of their project was to develop automated and scalable methods for conducting space assessment, along with reporting tools that support decision-making for operations, service design, and service delivery.

For this project, they chose an area bounded by four smart control access gates on the first floor. They obtained the log files (with data at the sensor level, minute by minute) from the eight bi-directional sensors on those gates. They analyzed the data in order to create a recurrent neural network model, trained the algorithm using this model so that it could predict future incoming and outgoing traffic in that area, and presented those findings visually in a data dashboard application. For data preparation, processing, and modeling, they used Python; the tools included Seaborn, Matplotlib, Pandas, NumPy, SciPy, TensorFlow, and Keras. They picked a recurrent neural network with stochastic gradient descent optimization, which is less complex than a time series model. For data visualization, they used Tableau. The project code is available at the library’s GitHub repo.
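A data-preparation step like the one they describe (minute-level logs from bi-directional gate sensors rolled up into traffic counts suitable for a model) might look roughly like this. The record format here is invented for illustration; their actual log schema is not given in the talk summary.

```python
from collections import defaultdict

# Hypothetical minute-level sensor records:
# (timestamp, gate_id, direction, count)
logs = [
    ("2018-03-01 09:00", "gate1", "in", 4),
    ("2018-03-01 09:00", "gate1", "out", 1),
    ("2018-03-01 09:01", "gate2", "in", 2),
]


def hourly_traffic(records):
    """Sum incoming/outgoing counts per hour across all gates."""
    totals = defaultdict(lambda: {"in": 0, "out": 0})
    for ts, _gate, direction, count in records:
        hour = ts[:13]  # "YYYY-MM-DD HH"
        totals[hour][direction] += count
    return dict(totals)
```

Sequences of hourly in/out totals like these are the kind of time-ordered input a recurrent neural network can be trained on to forecast future traffic.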

Their project led the library to install six more gates in order to get a better overview of library space usage. As a side benefit, the library was also able to pinpoint the times when the gates malfunctioned and communicate the issue to the gate vendor. Di Monte and Patil plan to hand the project over to the library’s assessment team for ongoing monitoring and, as a next step, to look for ways to map the library’s traffic flow across multiple buildings.

Overall, there was a lot of interest in machine learning, deep learning, and artificial intelligence at the Code4Lib conference this year. The breakout session I led on these topics produced a lively discussion of a variety of tools, of current and future projects at many different libraries, and of the impact of rapidly developing AI technologies on society. The session also generated the #ai-dl-ml channel in the Code4Lib Slack space. The growing interest in these areas is also shown in the newly formed Machine and Deep Learning Research Interest Group of the Library and Information Technology Association. I hope to see more talks and discussion on these topics at future Code4Lib and other library technology conferences.


One of the talks which struck me the most this year was Matthew Reidsma’s Auditing Algorithms. He used examples of search suggestions in the Summon discovery layer to show biased and inaccurate results:

In 2015 my colleague Jeffrey Daniels showed me the Summon search results for his go-to search: “Stress in the workplace.” Jeff likes this search because ‘stress’ is a common engineering term as well as one common to psychology and the social sciences. The search demonstrates how well a system handles word proximities, and in this regard, Summon did well. There are no apparent results for evaluating bridge design. But Summon’s Topic Explorer, the right-hand sidebar that provides contextual information about the topic you are searching for, had an issue. It suggested that Jeff’s search for “stress in the workplace” was really a search about women in the workforce. Implying that stress at work was caused, perhaps, by women.

This sort of work is not, for me, novel or groundbreaking. Rather, it was so important to hear because of its relation to similar issues I’ve been reading about since library school. From the bias present in Library of Congress subject headings where “Homosexuality” used to be filed under “Sexual deviance”, to Safiya Noble’s work on the algorithmic bias of major search engines like Google where her queries for the term “black girls” yielded pornographic results; our systems are not neutral but reify the existing power relations of our society. They reflect the dominant, oppressive forces that constructed them. I contrast LC subject headings and Google search suggestions intentionally; this problem is as old as the organization of information itself. Whether we use hierarchical, browsable classifications developed by experts or estimated proximities generated by an AI with massive amounts of user data at its disposal, there will be oppressive misrepresentations if we don’t work to prevent them.

Reidsma’s work engaged with algorithmic bias in a way that I found relatable since I manage a discovery layer. The talk made me want to immediately implement his recording script in our instance so I can start looking for and reporting problematic results. It also touched on some of what dispirits me in library work lately: our reliance on vendors and their proprietary black boxes. We’ve had a number of issues lately related to full-text linking that are confusing for end users and make me feel powerless. I submit support ticket after support ticket only to be told there’s no timeline for the fix.

On a happier note, there were many other talks at Code4Lib that I enjoyed and admired: Chris Bourg gave a rousing opening keynote featuring a rallying cry against mansplaining; Andreas Orphanides, who keynoted last year’s conference, gave yet another great talk on design and systems theory full of illuminating examples; Jason Thomale’s introduction to Pycallnumber wowed me and gave me a new tool I immediately planned to use; Becky Yoose navigated the tricky balance between using data to improve services and upholding our duty to protect patron privacy. I fear I’ve not mentioned many more excellent talks but I don’t want to ramble any further. Suffice to say, I always find Code4Lib worthwhile and this year was no exception.

DSpace Launches Developer Show and Tell Meetings / DuraSpace News

During a recent meeting of DSpace Developers an idea was shared by Terry Brady of Georgetown University, to host community meetings that discuss what DSpace Developers are up to; a developer showcase that allows developers the opportunity to share their tools, processes and recent work with everyone in the DSpace community.

Twitter / pinboard

RT @achdotorg: We too co-sign the #code4lib Community Statement in Support of @mchris4duke. We continue to admire and honor our col…