Planet Code4Lib

‘More than a collection of books’ / District Dispatch

How many times have you heard that phrase? A visit to the Central Library of the Brooklyn Public Library system proved that statement undoubtedly true. I work for ALA, I am a librarian, I worked for a library, I go to my branch library every week, so I know something about libraries. But this visit to this library was the kind of experience that makes you want to point and jump and exclaim, “See what is going on here!”

Dozens of baby strollers lined up in a library corridor

Photo credit: Jim Neal

Indeed, much more than a collection of books. And more than shiny digital things. And more than an information commons. All are welcome here because the library is our community.

Last week I traveled with President Julie Todaro and President-elect Jim Neal, along with the co-chairs of the Digital Content Working Group (DCWG) Erika Linke and Carolyn Anthony, as part of ALA’s DCWG contingent that meets periodically with publishers in New York City. ALA started these meetings several years ago, initially to convince publishers to sell e-books to public libraries. Since that time, public libraries have been able to buy e-books, but there is still talk with publishers about business models, contract terms and prices, as well as common interests and potential collaborations. But first we visited Central Library.

On arrival, several FBI agents were assembled outside of the building, which gave us momentary pause until we learned that they were cast members of the television series Homeland, one of the many film crews that use the Central Library as a scenic backdrop. Look at the front of the building. Someone said “it’s like walking into a church.” Very cool if you can call this your library home.

Brooklyn is making a difference in its community. Heck, in 2016 it won the Institute of Museum and Library Services’ National Medal, the nation’s highest honor for museums and libraries. Brooklyn was recognized in part for its Outreach Services Department, which provides numerous social services, including a creative aging program, an oral history project for veterans, free legal services in multiple languages and a reading program called TeleStory. (TeleStory connects families to loved ones in prison via video conference so they can read books together. Who knew a public library did that?!) All people are welcome here.

Door of Brooklyn Public Library

Photo credit: Jim Neal

The Adult Learning Center, operating in six Brooklyn Library branches under the direction of Kerwin Pilgrim, provides adult literacy training for new immigrants, digital literacy, citizenship classes, job search assistance and an extensive arts program that takes students to cultural centers where some see their first live dance performance!

During our tour of the library, we caught the tail end of story time, and I have never seen so many baby carriages lined up in the hallway. Another crowd scene when we were leaving the building – a huge line of people waiting to apply for their passports. Yes, the library has a passport office. Brooklyn also sponsors the municipal ID program called IDNYC. All New Yorkers are eligible for a free New York City Identification Card, regardless of citizenship status. Already, 10 percent of the city’s population has an IDNYC Card! This proof of identity card provides one with access to city services. With the ID, people can apply for a bank or credit union account, get health insurance from the NY State Health Insurance Marketplace and more.

I know this is just the tip of the iceberg of all the work that Brooklyn Public does. The library—more than a collection of books—is certainly home to Brooklyn’s 2.5 million residents. And all people are welcome here.

The post ‘More than a collection of books’ appeared first on District Dispatch.

Solr LocalParams and dereferencing / Brown University Library Digital Technologies Projects

A few months ago, at the Blacklight Summit, I learned that Blacklight defines certain settings in solrconfig.xml to serve as shortcuts for a group of fields with different boost values. For example, in our Blacklight installation we have a setting for author_qf that references four specific author fields with different boost values.

<str name="author_qf">
  ...
</str>

In this case author_qf is a shortcut that we use when issuing searches by author. By referencing author_qf in our request to Solr we don’t have to list all four author fields (author_unstem_search, author_addl_unstem_search, author_t, and author_addl_t) and their boost values; Solr is smart enough to use those four fields when it notices author_qf in the query. You can see the exact definition of this field in our GitHub repository.

Although the Blacklight project talks about this feature in their documentation page, and our Blacklight instance takes advantage of it via the Blacklight Advanced Search plugin, I had never quite understood how this works internally in Solr.


Turns out Blacklight takes advantage of a feature in Solr called LocalParams. This feature allows us to customize individual values for a parameter on each request:

LocalParams stands for local parameters: they provide a way to “localize” information about a specific argument that is being sent to Solr. In other words, LocalParams provide a way to add meta-data to certain argument types such as query strings.

The syntax for LocalParams is p={! k=v } where p is the parameter to localize, k is the setting to customize, and v is the value for that setting. For example, the following

q={! qf=author}jane

uses LocalParams to customize the q parameter of a search. In this case it forces the query field qf parameter to use the author field when it searches for “jane”.


When using LocalParams you can also use dereferencing to tell the parser to use an already defined value as the value for a local parameter. For example, the following shows how to use the already defined author_qf setting as the value for qf in the LocalParams. Notice how the value is prefixed with a dollar sign to indicate dereferencing:

q={! qf=$author_qf}jane

When Solr sees $author_qf it replaces it with the four author fields that we defined for it and uses those fields as the qf parameter.

You can see how Solr handles dereferencing if you pass debugQuery=true to your Solr query and inspect the debug.parsedquery value in the response. The previous query would return something along the lines of:

    author_t:jane^20.0 |
    author_addl_t:jane |
    author_addl_unstem_search:jane^50.0 |
    ...

Notice how Solr dereferenced (i.e. expanded) author_qf to the four author fields that we have configured in our solrconfig.xml with the corresponding boost values.
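To reproduce this yourself, the dereferenced query and the debugQuery flag can be assembled the same way (again, the endpoint is hypothetical); fetching the resulting URL and reading debug.parsedquery out of the JSON response shows the expansion:

```ruby
require "uri"

# Hypothetical Solr endpoint; adjust for your install.
solr_select = ""

params = {
  q:          "{!qf=$author_qf}jane",  # $author_qf is dereferenced server-side
  defType:    "edismax",               # dereferencing needs the eDisMax parser
  debugQuery: "true",
  wt:         "json"
}
url = "#{solr_select}?#{URI.encode_www_form(params)}"
# GET this URL, then inspect response["debug"]["parsedquery"]
```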

It’s worth noting that dereferencing only works if you use the eDisMax parser in Solr.

There are several advantages to using this Solr feature. One is that your queries are a bit shorter, since you pass an alias (author_qf) rather than all four fields and their boost values, which makes the query easier to read. Another is that you can change the definition of author_qf on the server (say, to include a new author field in your Solr index) and client applications will automatically pick up the new definition when they reference author_qf.

Announcing the 2017 International Open Data Day Mini-Grant Winners! / Open Knowledge Foundation

This blog was co-written by Franka Vaughan and Mor Rubinstein, OKI Network team.

This is the third year of the Open Knowledge International Open Data Day mini-grants scheme, our best one yet! Building on last year’s lessons from the scheme, and in the spirit of Open Data Day, we are trying to make the scheme more transparent. We are aspiring to email every mini-grant applicant a response email with feedback about their application. This blog is the first in a series where we look at who has received grants, how much has been given, and also our criteria for deciding who to fund (more about that, next week!).

Our selection process took more time than expected, for the best of reasons: the UK Foreign & Commonwealth Office joined the scheme last week and is funding eight more events! Adding that to the support we got from SPARC, the Open Contracting Program of Hivos, Article 19 and our grant from the Hewlett Foundation, a total of $16,530 worth of mini-grants is being distributed. This is $4,030 more than what we committed to initially.

The grants are divided into six categories: Open Research, Open Data for Human Rights, Open Data for Environment, Open Data Newbies, Open Contracting and FCO special grantees. Although not a planned category, we decided to be flexible and accommodate places that will be hosting an Open Data Day event for the first time. We call it the Newbie category. These events didn’t necessarily fit our criteria, but showed potential or are hosting Open Data Day events for the first time. Two of these events will get special assistance from our Open Data for Development Africa Lead, David Opoku.

So without further ado, here are the grantees:  

  1. Open Knowledge Nepal’s ODD event will focus on “Open Access Research for Students” to highlight the conditions of Open Access and Open Research in Nepal, showcasing the research opportunities and the moving direction of research trends.  Amount: $350
  2. Open Switch Africa will organise a workshop to encourage open data practises in academic and public institutions, teach attendees how to create / utilize open data sheets and repositories and also build up an open data community in Nigeria.  Amount: $400
  3. The Electrochemical Society’s ODD event in the USA will focus on informing the general public about their mission to Free the Science and make scientific research available to everyone, and also share their plans to launch their open access research repository through Research4Life in March. Amount: $400
  4. Wiki Education Brazil aims to create and build structures to publish Brazilian academic research on Wikipedia and WikiData. They will organise a hackathon and edit-a-thon in partnership with and wikidata communities with support from Wikimedia Foundation research team to create a pilot event, similar to Amount: $400
  5. Kyambogo University, Uganda will organise a presentation on how open data and the library promote open access. They will host an exhibition on open access resources and organise a library tour to acquaint participants with the available open access resources in the University’s library. Amount: $400
  6. Kirstie Whitaker will organise a brainhack to empower early career researchers at Cambridge University, England, on how to access open neuroimaging datasets already in existence for new studies and add their own data for others to use in the future. Amount: $375
  7. The University of Kashmir’s ODD event in India will target scholars, researchers and the teaching community and introduce them to Open Data Lab projects that are available through open Knowledge Labs and open research repositories through  Amount: $400
  8. The Research Computing Centre of the University of Chicago will organise a hackathon that will introduce participants to public data available on different portals on the internet. Amount: $300
  9. Technarium hackerspace  in Lithuania will organise a science cafe to open up the conversation in an otherwise conservative Lithuanian scientist population about the benefits of open data, and ways to share the science that they do. Amount: $400
  10. UNU-MERIT / BITSS-YAOUNDE in Cameroon will organise hands-on practical training courses on Github, OSF, STATA dynamic documents, R Markdown, advocacy campaigns etc., targeting 100 people. Amount: $400
  11. Open Sudan will organise a high level conference to discuss the current state of research data sharing in Sudan, highlight the global movement and its successes, shed light on what could be implemented on the local level that is learned from the global movement and most importantly create a venue for collaboration. Amount: $400

  1. Dag Medya’s ODD event will increase awareness on deceased workers in Turkey by structuring and compiling raw data in a tabular format and opening it to the public for the benefit of open data lovers and data enthusiasts. Amount: $300
  2. Election Resource Centre Zimbabwe will organise a training to build the capacity of project champions who will use data to tell human rights stories, analysis, visualisation, reporting, stimulating citizen engagement and campaigns. Amount: $350
  3. PoliGNU ODD event in Brazil will be a discussion on women’s participation in the development of public policies and will be guided by open data collection and visualizations. Amount: $390
  4. ICT4Dev Research Center will organise a press conference to launch their new website, which highlights their open data work, and a panel discussion about the relationship between human rights and open data in Morocco. Amount: $300
  5.  will train and engage Citizen Helpdesk volunteers from four earthquake-hit districts in Nepal (Kavre, Sindhpalchowke, Nuwakot and Dhading) who are working as interlocutors, problem solvers and advocates on migration-related problems, to codify citizen feedback using qualitative data from the ground and amplify it using open data tools. Amount: $300
  6. Abriendo Datos Costa Rica will gather people interested in human rights activism and accountability, and teach them open data concepts and the context of open data day, and check for openness or otherwise of the available human rights data. Amount: $300

  1. SpaceClubFUTA will use OpenStreetMap, TeachOSM tasking manager, Remote Sensing and GIS tools to map garbage sites in Akure, Nigeria and track their exact locations, and the size and type of garbage. The data collected will be handed over to the agency in charge of clean up to help them organise the necessary logistics.  Amount: $300
  2. Open Data Durban will initiate a project about the impacts of open data in society through the engagement of the network of labs and open data school clubs (wrangling data through an IoT weather station) in Durban, South Africa. Amount: $310
  3. Data for Sustainable Development’s ODD event will focus on using available information to create a thematic visualization map showing how data can be used in the health sector to track the spread of infectious diseases, monitor demand, or use demographic factors to look for opportunities in opening new health facilities. Amount: $300
  4. SubidiosClaros / Datos Concepción will create an Interactive Map of Floods on the Argentine and Uruguayan Coasts of the Uruguay River using 2000-2015 data. This will serve as an outline for implementing warning systems in situations of water emergency. Amount: $400
  5. Outbox Hub Uganda will teach participants how to tell stories using open data on air quality from various sources and their own open data project. Amount: $300
  6. Lakehub will use data to highlight the effects of climate change, deforestation on Lake Victoria, Kenya. Amount: $300
  7. will create the basis for a generic data model to analyze Air Quality in the city of Medellin for the last five years. This initial “scaffolding” will serve as the go-to basis to engage more city stakeholders while putting in evidence for the need for more open data sets in Colombia. Amount: $300
  8. Beog Neere will develop action plan to open up Extractives’ environmental impact data and develop data skills for key stakeholders – government and civil society. Amount: $300

  1. East-West Management Institute’s Open Development Initiative (EWMI-ODI) in Laos will build an open data community in Laos, and promote and localise the Open Data Handbook. Amount: $300
  2. Mukono District NGO Forum will use OpenCon resource depositories and make a presentation on Open Data, Open Access, and Open Data for Environment.  Amount: $350
  3. The Law Society of the Catholic University of Malawi will advocate for sexual reproductive health rights by going to secondary schools and disseminate information to young women on their rights and how they can report once they have been victimized. Amount: $350

  1. LabHacker will take their Hacker Bus to a small city near São Paulo and run a hack day/workshop there and create a physical tool to visualize the city budget which will be made available for the local citizens. They will document the process and share it online so other can copy and modify it. Amount: $400
  2. Anti-Corruption Coalition Uganda will organize a meetup of 40 people identified from civil society, media,  government and general public and educate them on the importance of open data in improving public service delivery. Amount: $400
  3. Youth Association for Development will hold a discussion on the current government [Pakistan] policies about open data. The discussions will cover open budgets, open contracting, open bidding, open procurement, open tendering, open spending, cooking budgets, the Panama Papers, Municipal Money, etc. Amount: $400
  4. DRCongo Open Data initiative will organise a conference to raise awareness on the role of open data and mobile technologies to enhance  transparency and promoting accountability in the management of revenues from extractive industries in DR Congo. Amount: $400
  5. Daystar University in Kenya will organise a seminar to raise awareness among student journalists about using public data to cover public officers’ use of taxpayer money. Amount: $380
  6. Centre for Geoinformation Science, University of Pretoria in South Africa will develop a web-based application that uses gamification to encourage the local community (school learners specifically)  to engage with open data on public funds and spending. Amount: $345
  7. Socialtic will host a data expedition, workshops, panel and lightning talks, and an open data BBQ to encourage groups like NGOs and journalists to use data in their work. Amount: $350
  8. OpenDataPy and Girolabs will show  civil society organizations public contract data of Paraguay and also show visualizations and apps made with that data. Their goal is to use all the data available and generate a debate on how this information can help achieve transparency. Amount: $400
  9. Code for Ghana will bring together data enthusiasts, developers, CSOs and journalists to work on analysing and visualising the previous government’s expenditure to bring out insights that would educate the general public. Amount: $400
  10. Benin Bloggers’ Association will raise awareness of the need for Benin to have an effective access to information law that oblige elected officials and public officials to publish their assets and revenues. Amount: $400

  1. Red Ciudadana will organize a presentation on open data and human rights in Guatemala. They aim to show the importance of opening data linked to the Sustainable Development Goals and human rights and the impact it has on people’s quality of life. Amount: $400
  2. School of Data – Latvia is organizing a hackathon and inviting journalists, programmers, data analysts, activists and the general public interested in data-driven opportunities. Their aim is to create real projects that draw out data-based arguments and help solve issues that are important for society. Amount: $280
  3. Code for South Africa (Code4SA)’s event will introduce participants to what open data is, why it is valuable and how it is relevant in their lives. They are choosing to *not* work directly with raw data, but rather using an interface on top of census and IEC data to create a more inclusive event. Amount: $400
  4. Code for Romania will use the “Vote Monitoring” App to build a user-friendly repository of open data on election fraud in Romania and Moldova. Amount: $400
  5. Albanian Institute of Science – AIS will organize a workshop on Open Contracting & Red Flag Index and present some of their instruments and databases, with the purpose of encouraging the use of facts in journalistic investigations or citizens’ advocacy. Amount: $400
  6. TransGov Ghana will clean data on public expenditure on development projects [2015 to 2016] and show how they are distributed in the Greater Accra Metropolis (data from Accra Metropolitan Assembly) to meet open data standards and deploy on Ghana Open Data Initiative (GODI) platform. Amount: $400


For those who were not successful on this occasion, we will be providing further feedback and would encourage you to try again next time the scheme is available. We look forward to seeing, sharing and participating in your successful events. We invite you all to register your event on the ODD website.

Wishing you all a happy and productive Open Data Day! Follow #OpenDataDay on Twitter for more!

rubyland infrastructure, and a modest sponsorship from honeybadger / Jonathan Rochkind

Rubyland is my hobby project, a ruby RSS/atom feed aggregator.

Previously it was run on entirely free heroku resources — free dyno, free postgres (limited to 10K rows, which dashes my dreams of a searchable archive, oh well). The only thing I had to pay for was the domain. Rubyland doesn’t take many resources because it is mostly relatively ‘static’ and cacheable content, so it could get by fine on one dyno. (I’m caching whole pages with Rails “fragment” caching and an in-process memory-based store, not quite how Rails fragment caching was intended to be used, but it works out pretty well for this simple use case, with no additional resources required).
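Concretely, that setup amounts to Rails’ built-in memory store plus the cache view helper. A sketch of the relevant configuration (not rubyland’s actual config; the size value is illustrative):

```ruby
# config/environments/production.rb
# In-process memory store: no memcached/redis add-on needed, at the cost
# of the cache being per-process and lost on restart, which is fine when
# pages are cheap to regenerate.
config.cache_store = :memory_store, { size: 32.megabytes }
config.action_controller.perform_caching = true
```

In the view, wrapping the whole page body in a cache block then serves the rendered fragment from memory until it expires or is invalidated.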

But the heroku free dyno doesn’t allow SSL on a custom hostname.  It’s actually pretty amazing what one can accomplish with ‘free tier’ resources from various cloud providers these days.  (I also use a free tier mailgun account for an MX server to receive emails, and SMTP server for sending admin notifications from the app. And free DNS from cloudflare).  Yeah, for the limited resources rubyland needs, a very cheap DigitalOcean droplet would also work — but just as I’m not willing to spend much money on this hobby project, I’m also not willing to spend any more ‘sysadmin’ type time than I need — I like programming and UX design and enjoy doing it in my spare ‘hobby’ time, but sysadmin’ing is more like a necessary evil to me. Heroku works so well and does so much for you.

With a very kind sponsorship gift of $20/month for 6 months from Honeybadger, I used the money to upgrade to a heroku hobby-dev dyno, which does allow SSL on custom hostnames. So now rubyland is available over https, with cert acquisition and renewal fully automated by the letsencrypt-rails-heroku gem, which makes it incredibly painless: just set a few heroku config variables and you’re pretty much done.

I still haven’t redirected all http to https, and am not sure what to do about https on rubyland. For one, if I don’t continue to get sponsorship donations, I might not continue the heroku paid dyno, and then wouldn’t have custom domain SSL available. Also, even with SSL, since the feed often includes embedded <img> tags with their original src, you still get browser mixed-content warnings (which browsers may eventually escalate to a security error page?). So I'm not sure about the ultimate disposition of SSL on rubyland, but for now it’s available on both http and https — so at least I can do secure admin or other logins if I want (haven’t implemented that yet, but an admin interface for approving feed suggestions is on my agenda).


I hadn’t looked at Honeybadger before myself. I have used bugsnag on client projects before, and been quite happy with it. Honeybadger looks like basically a bugsnag competitor — its main feature set is about capturing errors from your Rails (or other, including non-ruby platform) apps, and presenting them well for your response, with grouping, notifications, status disposition, etc.

I’ve set up honeybadger integration on rubyland, to check it out. (Note: “Honeybadger is free for non-commercial open-source projects”, which is pretty awesome, thanks honeybadger!) Honeybadger’s feature set and user/developer experience are looking really good. It’s got much more favorable pricing than bugsnag for many projects — pricing is just per-app, not per-event-logged or per-seat. It’s got a pretty similar feature set to bugsnag; in some areas I like how honeybadger does things a lot better than bugsnag, in others I’m not sure.

(I’ve been thinking for a while about wanting to forward all Rails.logger error-level log lines to my error monitoring service, even though they aren’t fatal exceptions/500s. I think this would be quite do-able with honeybadger, might try to rig it up at some point. I like the idea of being able to put error-level logging in my code rather than monitoring-service-specific logic, and have it just work with whatever monitoring service is configured).

So I’d encourage folks to check out honeybadger — yeah, my attention was caught by their (modest, but welcome and appreciated! $20/month) sponsorship, but I’m not being paid to write this specifically, all they asked for in return for sponsorship was a mention on the about page.

Honeybadger also includes some limited uptime monitoring.   The other important piece of monitoring, in my opinion, is request- or page-load time monitoring, with reports and notifications on median and 90th/95th percentile. I’m not sure if honeybadger includes that in any way. (for non-heroku deploys, disk space, RAM, and CPU usage monitoring is also key. RAM and CPU can still be useful with heroku, but less vital in my experience).

Is there even a service that will work well for Rails apps that combines error, uptime, and request time monitoring, with a great developer experience, at a reasonable price? It’s a bit surprising to me that there are so many services that do just one or two of these, and few that combine all of them in one package.  Anyone had any good experiences?

For my library-sector readers, I think this is one area where most library web infrastructure is not yet operating at professional standards. In this decade, a professional website means you have monitoring and notification to tell you about errors and outages without needing to wait for users to report ’em, so you can get ’em fixed as soon as possible. Few library services are operated this way, and it’s time to get up to speed. While you can run your own monitoring and notification services on your own hardware, in my experience few open source packages are up to the quality of current commercial cloud offerings — and when you run your own monitoring/notification, you run the risk of losing notice of problems because of misconfiguration of some kind (it’s happened to me!), or a local infrastructure event that takes out both your app and your monitoring/notification (that too!). A cloud commercial offering makes a lot of sense. While there are many “reasonably” priced options these days, they are admittedly still not ‘cheap’ for a library budget (or lack thereof) — but it’s a price worth paying; it’s what it means to run web sites, apps, and services professionally.


Top 5 myths about National Library Legislative Day / District Dispatch

Originally published by American Libraries in Cognotes during ALA Midwinter 2017.

The list of core library values is a proud one, and a long one. For the past 42 years, library supporters from all over the country have gathered in Washington, D.C. in May with one goal in mind – to advance libraries’ core values and communicate the importance of libraries to Members of Congress. They’ve told their stories, shared data and highlighted pressing legislation impacting their libraries and their patrons.

Attendees holding photobooth signs during NLLD

Photo Credit: Adam Mason Photography

This year, Congressional action may well threaten principles and practices that librarians hold dear as never before. That makes it more important than ever that National Library Legislative Day 2017 be the best attended ever. So, let’s tackle a few of the common misconceptions about National Library Legislative Day that often keep people from coming to D.C. to share their own stories:

  1. Only librarians can attend.
    This event is open to the public and anyone who loves libraries – students, business owners, stay-at-home moms, just plain library enthusiasts – has a story to tell. Those firsthand stories are critical to conveying to members of Congress and their staffs just how important libraries are to their constituents.
  2. Only policy and legislative experts should attend.
    While some attendees have been following library legislative issues for many years, many are first time advocates. We provide a full day of training to ensure that participants have the most up-to-date information and can go into their meetings on Capitol Hill fully prepared to answer questions and convey key talking points.
  3. I’m not allowed to lobby.
    The IRS has developed guidelines so that nonprofit groups and private citizens can advocate legally. Even if you are a government appointee, there are ways you can advocate on issues important to libraries and help educate elected officials about the important work libraries do.
    Still concerned? The National Council of Nonprofits has resources to help you.
  4. My voice won’t make a difference.
    From confirming the new Librarian of Congress in 2016 to limiting mass surveillance under the USA FREEDOM Act in 2015 to securing billions in federal support for library programs over many decades, your voice combined with other dedicated library advocates’ has time and again defended the rights of the people we serve and moved our elected officials to take positive action. This can’t be done without you!
  5. I can’t participate if I don’t go to D.C.
    Although having advocates in D.C. to personally visit every Congressional office is hugely beneficial – and is itself a powerful testimony to librarians’ commitment to their communities – you can participate from home. During Virtual Library Legislative Day you can help effectively double the impact of National Library Legislative Day by calling, emailing or tweeting Members of Congress using the same talking points carried by onsite NLLD participants.

Legislative threats to core library values are all too real this year. Don’t let myths prevent you from standing up for them on May 1-2, 2017. Whether you’ve been advocating for 3 months or 30 years, there’s a place for you in your National Library Legislative Day state delegation, either in person or online.

For more information, and to register for National Library Legislative Day, please visit

The post Top 5 myths about National Library Legislative Day appeared first on District Dispatch.

Poynder on the Open Access mess / David Rosenthal

Do not be put off by the fact that it is 36 pages long. Richard Poynder's Copyright: the immoveable barrier that open access advocates underestimated is a must-read. Every one of the 36 pages is full of insight.

Briefly, Poynder is arguing that the mis-match of resources, expertise and motivation makes it futile to depend on a transaction between an author and a publisher to provide useful open access to scientific articles. As I have argued before, Poynder concludes that the only way out is for Universities to act:
As it happens, the much-lauded Harvard open access policy contains the seeds for such a development. This includes wording along the lines of: “each faculty member grants to the school a nonexclusive copyright for all of his/her scholarly articles.” A rational next step would be for schools to appropriate faculty copyright all together. This would be a way of preventing publishers from doing so, and it would have the added benefit of avoiding the legal uncertainty some see in the Harvard policies. Importantly, it would be a top-down diktat rather than a bottom-up approach. Since currently researchers can request a no-questions-asked opt-out, and publishers have learned that they can bully researchers into requesting that opt-out, the objective of the Harvard OA policies is in any case subverted.
Note the word "faculty" above. Poynder does not examine the fact that very few papers have authors who are all faculty; most authors are students, post-docs or staff. The copyright in a joint work is held by the authors jointly, or, if some are employees working for hire, jointly by the faculty authors and the institution. I doubt very much that the copyright transfer agreements in these cases are actually valid, because they have been signed only by the primary author (most frequently not a faculty member), and/or have been signed by a worker-for-hire who does not in fact own the copyright.

Look Back, Move Forward: network neutrality / District Dispatch

Look Back Move Forward: net neutrality. “Whereas, America’s libraries, collect, create, and disseminate essential information to the public over the Internet, and serve as critical resources for individuals to access, create, and distribute content;”

Background image is from the ALA Archives.

With news about network neutrality in everyone’s feeds recently, let’s TBT to 2014 at the Annual Conference in Las Vegas, Nevada, where the ALA Council passed a resolution “Reaffirming Support for National Open Internet Policies and Network Neutrality.” And in 2006—over a decade ago!—our first resolution “Affirming Network Neutrality” was approved.

You can read both resolutions from 2006 and 2014 in ALA’s Institutional Repository. While you are here, be sure to sign up for the Washington Office’s legislative action center for more news and opportunities to act as the issue evolves.

2006 Resolution Affirming Network Neutrality

2014 Resolution Reaffirming Support for National Open Internet Policies and “Network Neutrality”

• Resolution endorsed by ALA Council on June 28, 2006. Council Document 20.12.
• Resolution adopted by ALA Council on July 1, 2014, in Las Vegas, Nevada. Council Document 20.7.

The post Look Back, Move Forward: network neutrality appeared first on District Dispatch.

WordPress could be libraries’ best bet against losing their independence to vendors / LibUX

Stephen Francouer: Interesting play by EBSCO. I’m going to guess that it’s optimized to work with EDS and other EBSCO products. “When It Comes To Improving Your Library Website, Not All Web Platforms Are Created Equal”

Stephen’s linking to an article where Ebsco announces Stacks:

Stacks is the only web platform created by library professionals for library professionals. Stacks understands the challenges librarians face when it comes to the library website and has built a web platform and native mobile apps that let you get back to doing what you do best: curating excellent content for your users. Learn more about Stacks and the New Library Experience.

I haven’t had any hands-on opportunity with Stacks, so I can’t comment on the product – it might be good. My contention, however, is that it is probably worse for libraries if it’s good.

Ebsco is not the first in this space. I think, probably, Springshare has the leg up – so far. Ebsco won’t be the last in this space, either. I know of two vendors who are poised to announce their product.

The opportunity for library-specific content management systems is huge, though. Open source is still an incredibly steep hill for libraries: installing, maintaining, and customizing a superior platform like WordPress requires too much involvement. (I say this without any first-hand experience with Stacks, but I can’t believe Ebsco will break free of the vendor-wide pattern.) So, because library websites fail to convert and library professionals lack the expertise to solve that problem themselves, the market is ripe for the picking.

This is part of a trend I’ve warned about in my last few posts, the last podcast (called “Your front end is doomed”), and so on all the way back to my once optimistic observation of the Library as Interface: libraries are losing control of their most important asset – the gate.

Libraries are so concerned with being help-desk-level professionals that they are ignoring the in-house opportunity for design and development expertise and are unable to comprehend the role that expertise plays in libraries’ independence.

I titled this post “WordPress could be libraries’ best bet against losing their independence to vendors” because WordPress, more so than Drupal, is the easiest platform through which to learn to develop custom solutions. It has more developers, cheap conferences worldwide, and ubiquitous meetups; it powers more sites than any other platform on the internet; and it is easy-ish to use out of the box yet capable of scaling to complex needs.

These in-house skills are crucial to libraries’ ability to say “no” over the long term.

Measuring the openness of government data in southern Africa: the experience of a GODI contributor / Open Knowledge Foundation

The Global Open Data Index (GODI) is one of our core projects at Open Knowledge International. The index measures and benchmarks the openness of government data around the world. As we complete the review phase of the audit of government data, we are soliciting feedback on the submission process. Tricia Govindasamy shares her experience submitting to #GODI16.

Open Data Durban (ODD), a civic tech lab based in Durban, South Africa, received the opportunity from Open Knowledge International (OKI) to contribute to the Global Open Data Index (GODI) 2016 for eight southern African countries. OKI defines GODI as “an annual effort to measure the state of open government data around the world.” With a fast-approaching deadline, I was eager to take up the challenge of measuring the openness of specified datasets as made available by the governments of South Africa, Botswana, Namibia, Malawi, Zambia, Zimbabwe, Mozambique and Lesotho.

This intense data wrangling consisted of finding the state of open government data for the following datasets: National Maps, National Laws, Government Budget, Government Spending, National Statistics, Administrative Boundaries, Procurement, Pollutant Emissions, Election Results, Weather Forecast, Water Quality, Locations, Draft Legislation, Company Register and Land Ownership. A quick calculation: 15 datasets multiplied by 8 individual countries results in 120 surveys! As you can imagine, this repetitive task took hours of Google searches into the late hours of the night (the best and most productive time for data wrangling, I reckon), completely disrupting my sleep pattern. Nonetheless, I got the task done. Here are some of the findings.
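That back-of-the-envelope count can be sketched as a quick script; the dataset and country lists below come straight from the article, with one GODI survey per (dataset, country) pair:

```python
# One GODI survey covers one dataset for one country.
datasets = [
    "National Maps", "National Laws", "Government Budget", "Government Spending",
    "National Statistics", "Administrative Boundaries", "Procurement",
    "Pollutant Emissions", "Election Results", "Weather Forecast",
    "Water Quality", "Locations", "Draft Legislation", "Company Register",
    "Land Ownership",
]
countries = [
    "South Africa", "Botswana", "Namibia", "Malawi",
    "Zambia", "Zimbabwe", "Mozambique", "Lesotho",
]

# Every dataset/country combination is a separate survey to fill in.
surveys = [(dataset, country) for dataset in datasets for country in countries]
print(len(surveys))  # 15 datasets x 8 countries = 120 surveys
```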

Part of the survey for Pollutant Emissions in South Africa


The African Development Bank developed Open Data Portals for most of the 8 countries. At first sight, these portals are quite impressive, with data visualisations and graphics; however, they are poorly organised and rarely updated. For most countries, the environmental departments are lagging, as there are barely any records on Pollutant Emissions or Water Quality. Datasets on Weather Forecast and Land Ownership are only available for half of the countries. In some cases, sections of the datasets were not available. For example, while both South Africa and Malawi had data on land parcel boundaries, there was no data on property value or tenure type.

It was quite shocking to note that the Company Register, an important dataset that can help monitor fraud as it relates to trade and industry, was unavailable for all the countries with the exception of Lesotho.

The National Laws dataset was found for all countries with the exception of Mozambique, whereas Draft Legislation data was not available for Mozambique, Namibia and Botswana. I believe the availability of data on National Laws for almost all the countries can in part be attributed to the African Legal Information Institute, which has contributed to making legislation open and has created websites for South Africa, Lesotho, Malawi and Zambia. Also, while Government Budget and Expenditure data are available, important detailed information, such as transactions, is lacking for most countries.

On a more positive note, Election Results compiled by independent electoral commissions were the easiest data to find and were generally up to date for all countries except Mozambique, for which I found no results.

It is important to note that none of the datasets for any of the 8 countries are openly licensed or in the public domain, underscoring the need for more education on the importance of open licensing.


OKI has a forum in which Network members from around the world discuss projects and ask and resolve questions. I must admit, I took full advantage of this since I am a new member of the community with my training wheels still on. The biggest challenge I faced during this process was searching for Mozambique’s government data. I had to resort to using Google Translate to find relevant data sources since all the data are published in Portuguese, Mozambique’s official language.

Due to the language barrier, I felt certain things were lost in translation, thus not providing a fair depiction of the survey. Luckily, OKI members from Brazil will be reviewing my submission to verify the data sources.

Tricia Govindasamy submitting to GODI on behalf of 8 countries in southern Africa.

Being South African and having prior knowledge of available government data made the process much easier when I submitted for South Africa; I already knew where to find the data sources, even though many of them did not show up in simple Google searches. I do not have experience with government data from the other 7 countries, so I relied solely on Google searches, which may not have surfaced every available data source in their first few pages of results.

Where I felt my efforts provided the least insight into the Index was in situations where I found no datasets. If no datasets are found, the survey asks you to “provide the reason that the data are not collected by the government”. I did not have any evidence to sufficiently substantiate an answer, and contacting government departments in a variety of countries to get one was simply not practical at the time.

I would like to thank OKI for giving Open Data Durban the opportunity to contribute to GODI. It was a fulfilling experience, as it is a volunteer-based programme for people around the world. It is always great to know that the open data community extends beyond just Durban or South Africa: it is an international community that is always collaborating on projects with the joint objective of advocating for open data.

Listen: Your Front End is Doomed (33:10) / LibUX

Metric alum Emily King @emilykingatcsn swings by to chat with me about conversational UI and “interface aggregation” – front ends other than yours letting users connect with your service without ever actually having to visit your app. We cover a lot: API-first, considering the role of tone in voice user interfaces, and — of course — predicting doom.

You can also download the MP3 or subscribe to Metric: A UX Podcast on Overcast, Stitcher, iTunes, YouTube, SoundCloud, Google Music, or just plug our feed straight into your podcatcher of choice.

Jobs in Information Technology: February 22, 2017 / LITA

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

Yale University, Sterling Memorial Library, Workflow Analyst/Programmer, New Haven, CT

Penn State University Libraries, Nursing and Allied Health Liaison Librarian, University Park, PA

St. Lawrence University, Science Librarian, Canton, NY

Louisiana State University, Department Head/Chairman, Baton Rouge, LA

Louisiana State University, Associate Dean for Special Collections, Baton Rouge, LA

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

Evergreen 2.12 beta is released / Evergreen ILS

The Evergreen community is pleased to announce the beta release of Evergreen 2.12 and the beta release of OpenSRF 2.5. The releases are available for download and testing from the Evergreen downloads page and from the OpenSRF downloads page. Testers must upgrade to OpenSRF 2.5 to test Evergreen 2.12.

This release includes the implementation of acquisitions and booking in the new web staff client in addition to many web client bug fixes for circulation, cataloging, administration and reports. We strongly encourage libraries to start using the web client on a trial basis in production. All functionality is available for testing with the exception of serials and offline circulation.

Other notable new features and enhancements for 2.12 include:

  • OverDrive and OneClickdigital integration. When configured, patrons will be able to see ebook availability in search results and on the record summary page. They will also see ebook checkouts and holds in My Account.
  • Improvements to metarecords that include:
    • improvements to the bibliographic fingerprint to prevent the system from grouping different parts of a work together and to better distinguish between the title and author in the fingerprint;
    • the ability to limit the “Group Formats & Editions” search by format or other limiters;
    • improvements to the retrieval of e-resources in a “Group Formats & Editions” search;
    • and the ability to jump to other formats and editions of a work directly from the record summary page.
  • The removal of advanced search limiters from the basic search box, with a new widget added to the sidebar where users can see and remove those limiters.
  • A change to topic, geographic and temporal subject browse indexes that will display the entire heading as a unit rather than displaying individual subject terms separately.
  • Support for right-to-left languages, such as Arabic, in the public catalog. Arabic has also become a new officially-supported language in Evergreen.
  • A new hold targeting service supporting new targeting options and runtime optimizations to speed up targeting.
  • In the web staff client, the ability to apply merge profiles in the record bucket merge and Z39.50 interfaces.
  • The ability to display copy alerts when recording in-house use.
  • The ability to ignore punctuation, such as hyphens and apostrophes, when performing patron searches.
  • Support for recognition of client time zones, particularly useful for consortia spanning time zones.

With release 2.12, minimum requirements for Evergreen have increased to PostgreSQL 9.3 and OpenSRF 2.5.

For more information about what will be available in the release, check out the draft release notes.

Many thanks to all of the developers, testers, documenters, translators, funders and other contributors who helped make this release happen.

Michele Kimpton to Lead Business Development Strategy at DPLA / DPLA

The Digital Public Library of America is pleased to announce that Michele Kimpton will be joining its staff as Director of Business Development and Senior Strategist beginning March 1, 2017.

In this critical role, Michele will be responsible for developing and implementing business strategies to increase the impact and reach of DPLA. This will include building key strategic partnerships, creating new services and exploring new opportunities, expanding private and public funding, and developing community support models, both financial and in-kind. Together these important activities will support DPLA’s present and future.

“We are truly fortunate to have someone of Michele’s deep experience, tremendous ability, and stellar reputation join DPLA at this time,” said Dan Cohen, DPLA’s Executive Director. “Along with the rest of the DPLA staff, I look forward to working with Michele to strengthen and expand our community and mission.”

Prior to joining DPLA, Michele Kimpton worked as Chief Strategist for LYRASIS and CEO of DuraSpace, where she developed several new cloud-based managed services for the digital library community and developed new sustainability and governance models for multiple open source projects. Kimpton is a founding member of both the National Digital Stewardship Alliance (NDSA) and the International Internet Preservation Consortium (IIPC). In 2013, Kimpton was named a Digital Preservation Pioneer by the NDIIPP program at the Library of Congress. She holds an MBA from Santa Clara University and a Bachelor of Science in Mechanical Engineering from Lehigh University. She can now be reached at michele dot kimpton at dp dot la.

Welcome, Michele!

Assembling the Whole: An Interview with Librarian|Artist Oliver Baez Bendorf / Library of Congress: The Signal


Oliver Baez Bendorf is a poet, cartoonist, librarian, teaching artist and activist. He holds an MFA in Poetry and an MLIS from the University of Wisconsin-Madison, and is the author of the book of poems The Spectral Wilderness (Kent State University Press, 2015) and of an essay on activism in the forthcoming Poet-Librarians in the Library of Babel (Library Juice Press, Spring 2017).

We commissioned Oliver to create a poster series, inspired by our Collections as Data summit in September 2016, that represents key themes of what it means to serve and use library collections computationally.

We caught up with Oliver to ask about his process and why he decided on this physical approach to represent a digital event.

You call yourself a Queer poet, cartoonist, teacher, librarian. Why librarian?

I started working part-time on campus at a library at UW Madison when I was finishing my MFA in poetry. I was working with the humanities librarian (I had a great mentor in Susan Barribeau) and I worked closely in Special Collections, and it blew my mind that librarians got to spend time with these amazing items and could help make them accessible to other people. It deepened my understanding of the role of poetry in an information landscape: how it circulates, how it reaches new readers. So I decided to pursue an MLIS (also at UW Madison) and continue to follow my interests.

When I was in library school I did a semester-long practicum at The Bubbler, which was Madison Public Library’s then brand new kind-of makerspace…a lot more than a makerspace…super arts focused (instead of more fabrication focused) and that opened things up for me. People need access to creativity as much as they need access to information, and libraries are really well-positioned to facilitate that. What’s most exciting for me is where those overlap: learning as a creative experience at or through a library and vice versa. The library program at Wisconsin was a good fit. I did an art project for almost every assignment I could, and they let me do that (thank you!), so my experience learning about the field was also a process of integrating hands-on creativity into learning. The intersection of the super-material hands-on and the digital is really exciting to me.

This project that I did after the Pulse Orlando shooting was to collect handmade posters and make them available online for people to download, print and hang up. People made them on their own or in groups: signs of protest, love, resistance, many different angles and sentiments, all hand-made. People sent me photos of the flyers hanging up on their campus, in their office, etc. I love things that you can tell a human made, but I also love the way that digital collections let those be in more places, almost like a chain letter or something like that, the intimacy of passing it on and letting it circulate farther than it might have otherwise.

Which is the same concept as the posters you made for our Collections as Data event.

Yes, totally. I love it when people print something out again. I know it’s bad for the trees, I know a lot of emails say “Do you really need to print this out?” But yes, I love it when something becomes paper again.

We were pleasantly shocked when we saw your poster drafts and they were so physically worked with the collaging. It was a surprising representation of a digital conference.

Thanks. There was a lot to synthesize from the conference and collage as a thinking process was a really effective method for me. I moved things around so much before even gluing anything down and shuffling things around was super helpful in a kinesthetic way. I love collage for that. I have a big Nike shoe box where I keep scraps of paper. Eve Sedgwick has written about this kind of texxture, where she uses two Xs in it to signify materials that carry a history with them. My MLIS advisor Jonathan Senchyne changed the way I think about how paper relates to information.  So collage seemed fitting to me, how all of these scraps hold other meanings and histories that they bring to this new context.



Can you walk us through the process for making “The Whole”?

I was really struck by that question [originally posed by Ricky Punzalan] “what does it mean to assemble the whole?” and knew that I wanted to do something with that. I was also thinking a lot about patterns: how to convey something that’s machine readable…data points. But when you zoom in, each might be something a human is really drawn to, actually luscious and vivid, each data point expanded into a whole story. So I think of these different scraps of paper and watercolors as data points that are all connected but have the capacity to be these luscious stories on their own, and that assembling them together is part of the work and mission of people at the symposium. The lives inside these collections and how to approach them, both as individual stories that people can play with and learn about and also what they mean when taken together, and how to give access to those stories and angles.


So I just started playing with these scraps and moving them around. The lines are the least interesting aesthetic part of this [poster], but that’s what connects them. The points without the lines here would just be scattered on the page but there is a way to connect all of them, different ways to connect all of them that haven’t even been drawn yet. I think of the web of it as intentionally unfinished, as a way to represent an invitation for more work to be done connecting points.

The pieces look so placed, it makes me want to pick them up and move them around, lift them off the poster and place them in a different cluster. It picks up on that theme of inviting engagement with our collections, that computation allows you to act on them.

So since we’re on this theme of patterns, let’s go to the “Calling All Storytellers” poster.

When I started it, I was thinking about invitations to artists and writers to interact with and act on data and collections at the library. But by the time I finished it, it had expanded from that — anyone who downloads the data and interacts with it is telling a story about it or trying to find a story to tell with it. It could be an artist or writer but it could also be a researcher or anyone who has some interest or some story that they want to tell with the data.


With this one too, I was thinking about things that might look alike but are not exactly alike and how each one of those data points again is a way in. Those questions that I got interested in – “Is the pattern the story or is the story where the pattern breaks?” – I think of it almost like a prompt to someone who might want to interact with data. I was excited to think about what kinds of questions can be asked about data, about a collection, about an archive, and is the story where the pattern breaks, thinking about what isn’t there. Then I got this phrase please report to your nearest library stuck in my head and I kinda kept hearing it in my head as over a PA system. I want people to feel paged to their nearest library — maybe particularly artists and writers, but also anyone with curiosity. Paged to their library with questions like this as an invitation and also a kind of civic participation.

One of my takeaways coming out of Collections as Data was this idea of access. Not necessarily people having to go to the library but also there’s an excitement that feels like a convening at the library. Even if you can access the data from home, there’s still something about going to the library, reporting to the library, showing up to the library and maybe that doesn’t necessarily have to mean at the physical library, although I’m good with it if it does. But at least the spirit of it, of showing up.

The last one, let’s call it “The Fish” is a crowd favorite. What was your process here?


I was thinking about collections with a natural-history bent. A lot of the threads that Thomas [Padilla] and Marisa [Parham] brought up in their talks – what is in collections and why? – and the invitation to interact, toward the lives there are traces of in collections and toward people’s lives now that collections can be in service of. The fish asking this question “What are you going to do with those?” was on my mind after the symposium; collections not just for collections’ sake but how to remember and foreground the human or animal element, or more generally speaking the life element. To remember also, when thinking about data, the luscious or historical or beautiful human complex animal lives behind and inside of collections and influenced by collections…and how the work of collections as data can be in service of that. That ties into Ricky’s [Punzalan] talk about reunification of items. How can what is in these collections be put to the service of those who need it most? There are lives, bodies, sentient beings that are in the collections and could be influenced by work done with them.

The buffalo kept getting cut off and the buffalo had to be there, so I kept shuffling things around with this one,  trying to get everything visible.

I like their physicality and how approachable they are.

Thanks. Yeah, the feather is actually sewn on. I had to hold the top of the scanner down to keep the light out, cause otherwise a little bit of light was sneaking in on the edge right there. One of my MLIS professors (hi Dorothea Salo!) called me a “materiality wonk” and I embrace it. I really like an approach and aesthetic unmistakably made by hand and I love especially bringing that to digital library contexts because there are so many conversations in the digital library world right now about how to manage computational advances with keeping humans at the center. I like the unexpected, super-handmade aesthetic that deals with digital library topics. That’s something I was doing a lot of with illustration and visual work when I was at DLF, so it was fun to dive more into that in this conceptual way with collages.

One of my big influences in this intersection is my teacher from [the University of ]Wisconsin, Lynda Barry, who is fond of saying the human hand is the original digital device. We also talk about the fingers as “digits” and I do think, in so many ways, that handmade and hands-on work is very “digital” in this way. I loved being able to play in that space with these posters.

What are five sources of inspiration for you right now? 

Small Science Collective Zine Library: from a group of scientists, artists, teachers, and students who believe in zines in science education. This is their collection of fact-based zines in science-y categories (creatures, insects, ecology, evolution, space and physics, etc.), available for free download.

Aspen groves: looks like many separate trees; actually one massive organism! With a giant root system underground.

United in Anger: A History of ACT UP: inspiring documentary with archival footage and oral histories of the AIDS Coalition to Unleash Power activist movement.

“All of Us or None” Archive Project: ever-expanding online collection of social justice posters maintained by the Oakland Museum of California – currently over 24,500 items strong.

Emily Dickinson Archive Lexicon: over 9,000 words used in Dickinson’s collected poems, with definitions from her 1844 Webster dictionary (check out “accessible” and “library” and “pattern”)

Oliver’s posters are now available for download on the Collections as Data event page.

Sparking Curiosity – Librarians’ Role in Encouraging Exploration / In the Library, With the Lead Pipe

In Brief

Students often struggle to approach research in an open-minded, exploratory way and instead rely on safe topics and strategies. Traditional research assignments often emphasize and reward information-seeking behaviors that are highly prescribed and grounded in disciplinary practices new college students don’t yet have the skills to navigate. Librarians understand that the barriers to research are multidimensional and usually involve affective, cognitive, and technical concerns. In this article we discuss how a deeper understanding of curiosity can inspire instructional strategies and classroom-based activities that provide learners with a new view of the research process. We share strategies we have implemented at Oregon State University, and we propose that working with teaching faculty and instructors to advocate for different approaches to helping students solve information problems is a crucial role for librarians to embrace.

By Hannah Gascho Rempel and Anne-Marie Deitering


Every librarian who has helped a student develop an academic argument knows about those topics. Every first-year composition or speech teacher knows them too. Some instructors ban them outright; others are more subtle in their disapproval. While some instructors ban topics because they are particularly polarizing or controversial, many of our colleagues who teach writing tell us they were driven to create a ban list for a different reason. They ban topics that are overdone, and that they don’t want to see again. When they see topics like “body image and the media” or “concussions in the NFL,” based on their past experiences with papers on these topics, they have learned that the final paper will rarely be provocative, innovative, or even interesting.

If instructors don’t use ban lists, they see these same topics dozens of times a term. So why do students continue to gravitate to these topics? And what can we as librarians do about it? It is easy to look at the fifteenth marijuana legalization paper and think that students aren’t trying, aren’t engaged, or just don’t care about the course. And in some cases, those things may be true. Students have a lot going on and sometimes a research assignment won’t be at the top of their list of priorities. But in this essay, we are going to examine this question from another perspective and discuss ways that providing space for curious exploration can reframe the research paper assignment.

Librarians and the Exploratory Research Process

The importance of open-minded exploration in the research process is evident from a basic review of seminal research on information literacy. Carol Kuhlthau’s highly-cited model of information seeking describes an exploratory, learning-focused research process.1 At Oregon State (OSU), we enjoy a well-established, creative and productive partnership with rhetoric and composition faculty. This partnership dates back almost twenty years and has focused on working with students in a first-year composition course. Our partnership and the curricular choices for this course were deeply informed by Kuhlthau’s interpretation of the academic research process. Over the years, this course has evolved, but our goals remain consistent: to introduce students to an exploratory, open-minded, inquiry-driven research process.2

As librarians, we bring this focus on research as a complex learning process to our work with faculty, and we consider that process in all of its dimensions: affective, cognitive, and technical. Here again, Kuhlthau is influential in shaping our thinking. She highlights the fact that information-seeking is an inherently uncertain process, and she considers the emotional impact that this uncertainty can create for the learner (Kuhlthau 1993). To understand how and why our students do what they do with research assignments, we need to consider the range of influences on their behaviors.

Our current thinking about students’ information seeking behaviors went in a new direction based on an assessment project we conducted in OSU’s first-year composition courses. In 2013, we started reviewing a stack of ninety student essays to answer a fairly typical yes-or-no question: can first-year composition students accurately identify their sources in order to apply the appropriate citation style? Answering that question quickly became much less important to us, because the papers we were reviewing pointed to a much more complicated question: how does the topic a student selects affect their willingness or ability to engage in an exploratory research process? Or, to approach this question from a different direction: can we teach an exploratory research process if we start after our students have already selected their topics?

The first-year composition curriculum at that time featured three major writing assignments used in all sections. The second of these, a metacognitive narrative in which the students explicitly described their research process and the sources found, was the paper used for the assessment project. We already knew from our conversations with the graduate teaching assistants and faculty who taught first-year composition that this metacognitive assignment was difficult to teach, difficult to assess, and difficult for students to grasp. What we did not know, until we read these papers, was how boring and lifeless most of the papers were.

We want to pause here to emphasize that we are not criticizing the effort, attention, or ability that students brought to these papers. Papers that were carefully crafted and well thought-out were just as dull as those that were incomplete, rushed, or messy. The composition instructors confirmed to us that these metacognitive papers were uniquely (and almost universally) difficult to read. We knew that many students had never before been asked to produce a paper that included this level of self-reflection about the research process, and that assignment design certainly accounted for some of the problems we observed. However, we did not think that the assignment itself explained all the issues we observed. We struggled with the question of why these papers were so lifeless, until Hannah had an epiphany: while every one of these papers described a research process, almost none of the students described learning anything new from their research. The processes described were almost completely devoid of curiosity.

Even though students were told to construct a narrative that described their thinking throughout the entire research process, the majority of papers started with a description of their first search in library databases. Most students skipped topic selection entirely. A few started with a vague sentiment: “ever since I was a child I have loved the oceans.” And a few were brutally honest: “I chose this topic because I wrote my senior project on it last year.” The entire sample was dominated by overused topics familiar to any librarian or composition instructor: body image and the media; videogames and violence; marijuana legalization; or lowering the legal drinking age.

Research from Project Information Literacy helps us understand why students gravitate toward familiar, overused topics rather than exploring new topics. These old standbys make students feel safe as they navigate the inherent uncertainty of the research process. In their 2010 paper, Alison Head and Michael Eisenberg show that most students (85%) identify “getting started” as their biggest challenge in research writing. In their qualitative analysis, Head and Eisenberg identify a metaphor that sheds some light on how students feel about topic selection: gambling. To students, committing to a research topic is like rolling the dice. When students choose an unfamiliar topic, they don’t know what they will find and they do not know if they can ultimately meet their instructor’s expectations. Even worse, they must invest weeks and weeks of work into a project that may or may not pay out in the form of a good grade (Head and Eisenberg 2010).

In this context, it is not surprising that students prefer topics they have used before, or that they know many other students have successfully used before. These topics represent safe choices. They know these topics will “work,” because they have worked in the past. Students may not know exactly what they are being asked to do in their first “college-level research paper,” but with these topics, they know they are giving themselves a reasonable chance at success.

However, the same qualities that make these topics feel safe for students make them problematic for instructors and librarians. We want students to start thinking about research as a learning process and as an opportunity to explore new things. We know that all students will not have this experience with every research assignment they complete. Students have to juggle many competing demands on their time, and they will not connect with every assignment they have. As a result, instructors and librarians must create conditions where students feel motivated, capable, and safe enough to explore and learn in the research process. Because of our reflections on the student paper assessment project, we realized that as instruction librarians we needed to enter the process earlier, at the topic selection stage, and that we needed to think more intentionally about how to create an environment that encourages curiosity.

Curiosity and Exploration

Curiosity has been defined as “the drive-state for information” (Kidd and Hayden 2015: 450). In this sense, curiosity is a part of any academic research process, since all students will, at some point, need to find information they do not have. For example, finding a quotation to support a claim one has already made would require curiosity. In the context of our work as librarians with the first-year composition course, however, curiosity meant more than this. We were trying to introduce research as an opportunity to learn new things, to explore new perspectives, and to synthesize new ideas into an original argument. If students clung to topics they already knew a lot about, it seemed unlikely that they could experience the research process in this new way. However, to test this assumption we needed to find out more about our students and more about curiosity.

To learn more about our students, we designed a small qualitative study that allowed us to track five students’ experiences through the first-year composition course. We gave students a curiosity self-assessment test (discussed in more depth later in this article), interviewed each student twice during the term, and also analyzed each student’s graded work. In the interviews we asked students both implicitly and explicitly how curiosity impacted their behaviors. This project confirmed what we had learned in our assessment of previous student research papers and through reading the literature on students’ information seeking behaviors: when it comes to research assignments, even curious students will avoid topics they do not know anything about, and they will do so to avoid risking failure.

Curiosity and Cognition

To learn about curiosity, we turned to the research literature. A small body of work by cognitive psychologists attempting to define—and develop instruments to identify—different types of curiosity gave us a framework for understanding what sparks curiosity in the first place (Collins et al. 2004; Litman and Jimerson 2004; Loewenstein 1994; Zuckerman and Link 1968). However, these researchers are interested in identifying those aspects of curiosity that are context dependent and those that are inherent personality traits. Our focus as instruction librarians was different; we wanted to know if there are aspects of curiosity that can enhance how we design and teach research assignments. The cognitive psychologists identified several different types of curiosity, and we selected three types we believe have particular value in the composition classroom based on our direct experiences with students: epistemic, perceptual, and interpersonal curiosity.

Epistemic curiosity is the drive for knowledge and the desire to seek information to enjoy the feeling of knowing things (Litman et al. 2005). Epistemic curiosity pushes people to figure out how things work, and it can be concrete (e.g., sparked by a desire to solve a puzzle, or take apart a machine) or abstract (e.g., sparked by a desire to understand theory or abstract concepts). Before starting our own exploration of curiosity, we would probably have defined “curiosity” in the classroom setting solely as epistemic. We assumed that curiosity sparks people to ask questions when they encounter new information—in the classroom and in the world—and that students who are both curious and engaged can always find an interesting research topic in those questions. However, we were not seeing that behavior play out in many of our students’ research papers, and instructors confirmed that many students who were engaged and interested in learning were not always driven by this type of curiosity.

Perceptual curiosity is sparked by the drive to experience the world through the senses—to actually touch, hear, and smell things (Collins et al. 2004). The desire to try new flavors or to touch interesting textures may be easy to relate to, but this kind of sensory experience rarely comes up in the traditional classroom. Traditional research assignments require students to transfer their learning out of the classroom, but the classroom experience remains focused on the facts, figures, ideas, and theories found in texts and does not extend to embodied or physical experiences. This disconnect is unfortunate because the first encounter with something new is usually through the senses. Perceptual curiosity, as a concept, immediately pushed us to think more expansively about how curiosity could connect to an academic research process. Learning activities that ask students to engage their senses while in the lab or outside on a field trip could help spark students to seek different types of information to make sense of what they were hearing, smelling, or touching.

Curiosity sparked by the desire to know more about other people is interpersonal curiosity (Litman and Pezzo 2007). Interpersonal curiosity has an element of snooping or spying in some situations. But this curiosity type also includes behaviors driven by empathy, or an interest in other people’s emotional states, and can be used to reduce uncertainty about how others are feeling or what they are doing. There are classroom assignments where students learn to conduct interviews or observations that can connect to this type of curiosity. Similarly, learning activities that connect students to human experience—like guest speakers, interviews, documentaries, panel discussions, or TED talks—may be inspiring for students who are motivated to understand how other people connect to their topic at an emotional level.

Conceptualizing the cognitive aspects of curiosity in a more multi-faceted way prompted us to think about additional factors related to curiosity, and how we might tie these different approaches to being curious to the research process.

Curiosity and Affect

In an effort to generate interest in the research process, many instructors tell students to choose a topic they are “passionate” about, which can make the prospect of engaging for the first time with academic writing even more stressful. Many students hear “passion” and think of controversial topics where their minds are very firmly made up. On a cognitive level, it can be challenging for students to learn new things about these topics. But on an affective level, students who already see research projects as a time-intensive “gamble” can feel that this well-meaning directive has raised the stakes even higher. Now they have to come up with a convincing argument—convincing to an audience they don’t really know much about—that expresses a point of view about which they feel deeply.

The research on curiosity further emphasizes the importance of the affective domain (Litman and Silvia 2006). When learners are anxious, worried, or concerned that they cannot complete a task, they are less likely to make room for curiosity. The uncertainty inherent in choosing an unfamiliar topic can be too much to bear. In the context of a traditional research assignment, a student’s choice to play it safe, and avoid the gamble of an unfamiliar topic, is eminently sensible. Years of experience with school have taught students that they will not be evaluated on their willingness to take risks, but on their ability to meet predetermined expectations. The risks inherent in taking a curiosity-driven approach to research may seem too great to overcome.

Opportunities for Applying Curiosity: Experiences from OSU

To overcome these barriers, students must be convinced that the risks are worth it—and librarians cannot effect this change by themselves, particularly within the confines of a one-hour library instruction session. No matter how engaging or compelling librarians are, in our role as guest lecturers, we cannot expect to convince students to take risks that might threaten their ability to meet their (grading) instructors’ expectations. No matter how empathetic and approachable librarians are, we cannot expect students to trust relative strangers to help them navigate the anxiety and uncertainty that is inherent in a curiosity-driven research process. To build an environment for curiosity in the first-year composition classroom, librarians have to work collaboratively with the faculty designing the curriculum, the GTAs teaching the sections, and the students doing the work.

At OSU, we have worked with our partners in the first-year composition program to try out a variety of approaches for creating spaces for curiosity in the classroom. Some of these approaches include changing the language used to discuss the research process, recognizing the role affect plays on students’ research behaviors, building in multiple opportunities and rewards for broad exploration of ideas and sources, and providing prompts for students to reflect on how curiosity influences the way they think about research. These activities are discussed in more depth below so that other librarians might consider adopting the approaches that work well in their context.

Adopting the Language of Curiosity and Exploration

At OSU, we built on our longstanding partnership with first-year composition faculty to incorporate curiosity into faculty development and GTA trainings. A key focus of these trainings is on the importance of language. Wendy Holliday and Jim Rogers (2013) demonstrated that the language we use to discuss the research process matters. Using discourse analysis in the first-year composition classroom, they found that students are more likely to engage in an exploratory research process when their instructors emphasize “learning about” a topic instead of “finding sources.” Building on this research, we suggest that first-year composition instructors should also encourage students to choose topics they are “curious” about instead of “passionate” about.

In addition, every year we share our findings about curiosity and learning with the first-year composition GTAs in a required training session that takes place before fall term. We discuss the importance of the affective domain in research instruction and give new GTAs a chance to experience research from their students’ perspectives. Throughout the term, we help instructors actively reflect on the affective and cognitive challenges students face, and we provide repeated opportunities to practice using terms like “curiosity,” “exploration,” and “learning” to describe the research process.

Encouraging Early Exploration of Different Sources

We also developed a variety of activities librarians could use to engage students with curiosity in the one-hour library instruction environment. To create opportunities for students to engage with their own curiosity, and to practice relying on their curiosity strengths in research, we used our library sessions to encourage browsing for a wider range of topics. But we also encouraged students to browse outside the journal literature, using sources that would help them place scholarly sources in a meaningful context.

Our students, like many first-year composition students, were required to use several source types in their final essays, including peer-reviewed journal articles. And our students, like many first-year composition students, frequently struggled with this requirement. Research articles, written for experts, were an unfamiliar genre and most of our students did not have the skills or experience to break them down and identify the pieces most likely to be useful in a first-year composition paper. The vocabulary and concepts were often dense and could not be easily digested in one or two (or five or six) readings. Even when an individual article was intellectually accessible, most first-year composition students did not know enough about the surrounding discourse to contextualize or evaluate the article as an expert would.

If these were the only barriers we had to navigate—if all first-year composition students selected topics well-represented in the scholarly literature—we could develop strategies to deal with each one. We found that there was another, more deeply entrenched, barrier that was harder to overcome: because most of our first-year composition students did not know what they could expect to find in the scholarly literature, they could not devise queries or research questions that could be usefully explored in that literature. As a result, students sometimes chose topics that were not discussed in peer-reviewed journals. No matter how well we taught students to find, read, use, and cite scholarly articles, if their argument was about the lack of parking on the OSU campus, they were going to find the scholarly article requirement frustrating and impossible to navigate. Sometimes, their topic choice meant they could not see the value in the sources they did find. If they were looking for the kind of general overview they were used to in encyclopedias and textbooks, students were unimpressed with the narrow, focused information reported in peer-reviewed articles.

To encourage students to follow their curiosity and set them up for success in their searches, we developed different ways for students to browse the scholarly literature before choosing paper topics. We developed activities that required students to browse press releases using the OSU research channel or aggregators like Science Daily or EurekAlerts. To create a visual framework students could use to contextualize that research, we developed a Google Map of our campus that combined information about researchers with snippets from press releases about their new discoveries (see Figure 1).


Figure 1. Google Map of OSU with researcher information points, by Rempel and Deitering.


We required students to identify a topic that sparked their curiosity in these sources, knowing they could build on that initial spark to explore effectively in the scholarly literature. In addition, the press release or news story usually provided some context to help students understand the significance of a study because these sources were written for general audiences. Analyzing a press release or news article about a research study helped students understand what they could expect to find in different source types more organically than arbitrary requirements did, and it also helped them to understand the different types of original research scholars do. After students learned about a study in these general sources, we explored alternate ways to search the library’s web-scale discovery tool by teaching them to search for the researcher, not the study. Learning more about the author(s)’ background provided another layer of context that the students could use to make sense of their article.

As we revised and refined these activities, and observed students completing them, we identified some best practices for encouraging curiosity-driven research in the library instruction one-shot environment:

  • Encourage students to choose topics they know little or nothing about by creating or finding relevant browsing environments for them to explore.
  • Frame the goals for the session around curiosity and exploration instead of finding sources.
  • Create circumstances where students can be successful navigating assignment requirements by narrowing the scope of those browsing environments to focus on topics discussed in the scholarly journals they are required to use.
  • Expose students to resources that help them put their sources in context.
  • And, most importantly, keep the stakes of the in-class activities low.

We did not hide the fact that some students might find a research paper topic through these activities, but we made it clear that they would not be required to write their paper on the topic they used in the session. Choosing a focus for their activities in the library session was a commitment of an hour, not a term. When students saw for themselves that a topic was viable, however, these temporary topics sometimes became more attractive. We also lowered the stakes for individual students by having them work in pairs or small groups to analyze unfamiliar sources and materials whenever possible.

Encouraging Self Reflection on Curiosity

Opportunities to play with curiosity in the classroom are important, but it soon became clear that if we wanted students to transfer what they learned in the first-year composition classroom to other situations, they needed additional opportunities for reflection and metacognition. They needed to analyze and understand their own curiosity, think about its role in the academic research process, and figure out how curiosity might shape their thinking and learning more generally, beyond the first-year composition classroom. Without metacognition, it is difficult to transfer learning from one domain to another, because for many students, this act of self-reflection does not come naturally (Bowler 2010).

One activity we developed to make the mental framework of curiosity types visible is a Curiosity Self-Assessment. Most of the work cognitive psychologists have done with the curiosity types is descriptive; we saw an opportunity to use the instruments developed by these researchers to help students reflect on their own curiosity preferences. We were guided by an important assumption as we worked with these instruments: that all humans are curious, at least at some level (Kidd and Hayden 2015). As a result, the Curiosity Self-Assessment we constructed does not ask “Are you curious?” and it does not promise to answer the question, “How curious am I?” Instead, it asks, “How are you curious?” That guiding question works well for us in the research assignment context. Even if some students feel sparks of curiosity less strongly than others, they all have research assignments to navigate. And they can all benefit from understanding different ways that other people can feel curious about some ideas or concepts.

We created the Curiosity Self-Assessment survey by integrating ten questions each from three different curiosity instruments: epistemic, perceptual, and interpersonal. The Curiosity Self-Assessment has 30 Likert-style questions and takes about 15 minutes to complete. See Table 1 to get a sense of the types of questions that make up the self-assessment. Find the full 30-question Self-Assessment here:

Table 1. Sample items from the Curiosity Self-Assessment. Each item is answered on a four-point scale: Almost Never, Sometimes, Often, or Almost Always.

Epistemic Curiosity
  • When I learn something new, I like to find out more.
  • I find it fascinating to learn new information.
  • I enjoy exploring new ideas.

Perceptual Curiosity
  • I enjoy trying different foods.
  • When I see new fabric, I want to touch and feel it.
  • I like to discover new places to go.

Interpersonal Curiosity
  • I wonder what other people’s interests are.
  • I like going into houses to see how people live.
  • I figure out what others are feeling by looking at them.
Printable version here.

This instrument has been a useful tool to introduce conversations about curiosity with both students and faculty, and we will discuss those applications in the rest of this section. First, however, we need to talk about its limits. We specifically developed the tool to use as a self-assessment exercise that would then be paired with classroom activities guiding individuals to reflect on the different ways in which they are curious. The Likert scale used in the survey doesn’t reflect any inherent or objective value and as a result should not be used to give a curiosity “grade.” For more on the specifics of scoring this Self-Assessment, see this blog post:
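As an illustration only, the kind of reflective tally described above could be sketched in a few lines of code. This is a hypothetical sketch, not the authors' published scoring method: the 1–4 coding of the Likert labels and the grouping of responses by curiosity type are assumptions, and the output is meant to show a relative pattern for self-reflection, never a curiosity "grade."

```python
# Hypothetical sketch of tallying Curiosity Self-Assessment responses.
# The 1-4 Likert coding and per-type grouping are illustrative
# assumptions, not the instrument's official scoring.

LIKERT = {"Almost Never": 1, "Sometimes": 2, "Often": 3, "Almost Always": 4}

def tally(responses):
    """responses: dict mapping curiosity type -> list of Likert labels.
    Returns the mean score per type, for self-reflection only."""
    return {
        ctype: sum(LIKERT[answer] for answer in answers) / len(answers)
        for ctype, answers in responses.items()
    }

# Example responses for a hypothetical student (three items per type).
sample = {
    "epistemic": ["Often", "Almost Always", "Often"],
    "perceptual": ["Sometimes", "Almost Never", "Sometimes"],
    "interpersonal": ["Often", "Sometimes", "Often"],
}
scores = tally(sample)
# The relative pattern across types (here, epistemic strongest) is the
# point of reflection -- not any absolute number or cutoff.
```

Consistent with the caution above, a sketch like this supports the guiding question "How are you curious?" only when paired with discussion; the numbers have no inherent value on their own.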

The Curiosity Self-Assessment can be used in the classroom environment to get students thinking about the role of curiosity in the research process and about the wide range of topics they might explore. While it is important to explain the nature and purpose of the assessment, this can be done quickly before assigning it as homework. In our experience, students find that the types are logical and easy to understand with just a little background explanation. In addition, it is unlikely that anyone will be upset or confused by what they find. Discovering that you are “perceptually curious” (or “interpersonal” or “epistemic”) doesn’t feel as loaded as labels like “introvert” or “extrovert” might. And as is the case with all self-assessments, the process of taking this survey is itself an opportunity for learners to start thinking differently about curiosity. Metacognitively reflecting on their learning behaviors promotes the ability to develop new behaviors and adapt what they have learned to new contexts.

Challenges and Limitations

In the years since we started working to embed curiosity in the first-year composition course, the curriculum for that course has undergone a significant change. Instead of writing a traditional argument paper, with required source types all students must use, students now complete a scaffolded, multi-stage rhetorical analysis using sources appropriate for their specific rhetorical situation (see this video for an overview of the current approach). This shift has provided many opportunities to embed curiosity throughout the research process. Students start by choosing a rhetorical artifact to analyze instead of “choosing a topic,” which provides an opportunity to build in activities that incorporate all of the curiosity types. Additionally, the new rhetorical analysis paper is unfamiliar to many students, which makes it more difficult for them to use familiar topics, sources, and habits as they write it.

However, there are significant challenges that will likely persist as long as we try to embed curiosity and exploration in the classroom context. The first of these comes from the fact that most students are not driven by an intrinsic or personal desire to learn simply because they have been assigned a research project. We are hoping to activate their curiosity and therefore spark that desire to learn, but we must recognize that when a student has a required paper to write, the initial motivation to go out and do research will almost always be external, imposed upon them by the first-year composition instructor. Many students keep this external focus throughout and are motivated more by their desire to do well in class and to meet their teacher’s expectations than by an intrinsic need to learn the content (Senko and Miles 2008). To encourage curiosity in this context, we need to work even more closely with the faculty and GTAs who teach first-year composition to develop ways to reward students for taking intellectual risks and engaging in exploratory research.

The need for more structured, positive feedback to encourage exploratory behaviors relates to the second challenge: a curiosity-driven, exploratory research process cannot be taught as a standalone part of the first-year composition curriculum. Students need to see curiosity modeled for them over and over. They need to hear the research process described in terms of learning and exploration at every stage. To make this change, we must give the people who are in the classroom every day the tools, vocabulary, and conceptual understanding they need to do this work. At OSU, we now devote more (and more) of our time to teaching the teachers. At this point, we do not teach one-shot library sessions in first-year composition, but are instead embedded in the required seminar that all new first-year composition instructors must take.


As librarians, we are usually asked to work with students to help them find sources. However, we also know that this part of the research process is not where many of our students struggle the most. Think back to your own experience as a student. Was there a research paper or project that was particularly meaningful? Can you remember what you learned writing that paper? Chances are good that at some point in that research process you were curious, you learned something new, and you created new meaning or new knowledge for yourself. Instructors and librarians want students to have this type of experience, but it can be very challenging in a required course like first-year composition, and may be impossible when students have already committed to going through the motions with a tired, overused topic. As librarians, we need to advocate for our students to get the help they need from the very start. By shifting the discourse to focus on curiosity within the classroom, providing activities grounded in low-stakes exploration, and encouraging self-reflective behaviors focused on curiosity, we can provide opportunities for more students to create their own new knowledge.


The authors would like to thank Lori Townsend (external peer reviewer) and Amy Koester (internal peer reviewer) for reviewing drafts of this article. We recognize that reviewing takes a lot of time, and we are thankful for their feedback. The authors would also like to thank our Publishing Editor, Sofia Leung. Her efficiency and responsiveness were highly appreciated. Finally, we would especially like to thank the many faculty and graduate students from the OSU School of Writing Literature and Film who are always willing to consider new ideas and new approaches to our shared work. We would especially like to acknowledge Tim Jensen, Sara Jameson, Chad Iwertz, and all of the Composition Assistants who have contributed to the WR 121 curriculum.


Association of College and Research Libraries. 2016. “Framework for Information Literacy for Higher Education.”

Bowler, Leanne. 2010. “The Self-Regulation of Curiosity and Interest during the Information Search Process of Adolescent Students.” Journal of the American Society for Information Science & Technology 61 (7): 1332–44.

Collins, Robert P, Jordan A Litman, and Charles D Spielberger. 2004. “The Measurement of Perceptual Curiosity.” Personality and Individual Differences 36 (5): 1127–41.

Deitering, Anne-Marie, and Sara Jameson. 2008. “Step by Step through the Scholarly Conversation: A Collaborative Library/Writing Faculty Project to Embed Information Literacy and Promote Critical Thinking in First Year Composition at Oregon State University.” College & Undergraduate Libraries 15 (1–2): 57–79.

Head, Alison J., and Michael B. Eisenberg. 2010. “Truth Be Told: How College Students Evaluate and Use Information in the Digital Age.” Available at SSRN 2281485.

Holliday, Wendy, and Jim Rogers. 2013. “Talking about Information Literacy: The Mediating Role of Discourse in a College Writing Classroom.” portal: Libraries and the Academy 13 (3): 257–271.

Kidd, Celeste, and Benjamin Y. Hayden. 2015. “The Psychology and Neuroscience of Curiosity.” Neuron 88 (3): 449–60.

Kuhlthau, Carol C. 1991. “Inside the Search Process: Information Seeking from the User’s Perspective.” Journal of the American Society for Information Science 42 (5): 361–371.

Kuhlthau, Carol C. 1993. “A Principle of Uncertainty for Information Seeking.” The Journal of Documentation 49 (4): 339–55.

Litman, Jordan A., Tiffany L. Hutchins, and Ryan K. Russon. 2005. “Epistemic Curiosity, Feeling‐of‐knowing, and Exploratory Behaviour.” Cognition & Emotion 19 (4): 559–82.

Litman, Jordan A., and Tiffany L. Jimerson. 2004. “The Measurement of Curiosity as a Feeling of Deprivation.” Journal of Personality Assessment 82 (2): 147–57.

Litman, Jordan A., and Mark V. Pezzo. 2007. “Dimensionality of Interpersonal Curiosity.” Personality and Individual Differences 43 (6): 1448–1459.

Litman, Jordan A., and Paul J. Silvia. 2006. “The Latent Structure of Trait Curiosity: Evidence for Interest and Deprivation Curiosity Dimensions.” Journal of Personality Assessment 86 (3): 318–28.

Loewenstein, George. 1994. “The Psychology of Curiosity: A Review and Reinterpretation.” Psychological Bulletin 116 (1): 75–98.

McMillen, Paula S., Bryan Miyagishima, and Laurel S. Maughan. 2002. “Lessons Learned about Developing and Coordinating an Instruction Program with Freshman Composition.” Reference Services Review 30 (4): 288–299.

Senko, Corwin, and Kenneth M. Miles. 2008. “Pursuing Their Own Learning Agenda: How Mastery-Oriented Students Jeopardize Their Class Performance.” Contemporary Educational Psychology 33 (4): 561–83.

Zuckerman, Marvin, and Kathryn Link. 1968. “Construct Validity for the Sensation-Seeking Scale.” Journal of Consulting and Clinical Psychology 32 (4): 420–426.

  1. Kuhlthau’s 1991 article in JASIST, “Inside the Search Process: Information Seeking from the User’s Perspective,” has been cited 609 times in Web of Science, most recently this month, and almost 2,300 times in Google Scholar.
  2. To trace the history of this collaboration, see McMillen, Miyagishima, and Maughan 2002; and Deitering and Jameson 2008.

Watch: UX Quackery with Tim Broadwater (58:21) / LibUX

  1. Description
  2. Transcript (Coming soon)


Debra Kolah, Benjamin MacLeod, Stephen Francoeur, Jennifer DeJonghe, Stephanie Van Ness, Edward Lim, Sally Vermaaten, Brian Holda, Emily Hayes, Thomas Habing, Alyssa Hanson, David Drexler, Rebecca Blakiston, Meghan Frazer, Suzanna Conrad, James Day, J Boyer, Galen Charlton, Junior Tidal, Judith Cobb, Dave McRobbie.

Thank you! Our event could never have happened without these folks.


Novare Library Services provides our webinar space and records and archives our video. They specialize in IT solutions for libraries and small businesses. In addition to LibUX Community Webinars, they’re behind a bunch of other events. Follow Diana Silveira (@dee987); she is awesome :).

In the field of User Experience there are dishonest practices, misnomers, claims of special knowledge, and required skills that are touted as ‘common knowledge’ or ‘best practices’. These biases or ‘trappings’ are often encountered when working with various clients, project managers, product managers, designers, and even developers.

This webinar will focus on the prevailing quackery that exists in the field of UX when working with various stakeholders, performing research, or conducting user testing … and what UX professionals can do to weed out the quacks.


Tim Broadwater (@tim_broadwater) is an expert UI designer and certified UX developer who has worked for Fortune 500 companies, grant-based education initiatives, higher education institutions, and nonprofit organizations. For the past ten years he has lived in the greater Pittsburgh area. He’s an avid foodie and convention junkie who enjoys social gaming and taking music and theatre road trips.

Tim’s written some really good posts here, and he’s also been interviewed on our podcast, Metric.

CA / Ed Summers

Conversation Analysis (CA) is the study of interaction through fine-grained analysis of spoken conversations. The researcher moves between detailed examination of individual cases (specific segments of transcriptions) and a general view of a set of related cases. By collecting multiple cases the researcher can achieve context independence. Audio recordings are important in CA because a great deal of attention is paid not just to the words that are spoken, but also to their timing and intonation.

CA is about actions and practices. Practices are distinctive, turn-based, and notable for the consequences that they have. The way that prior turns at talk are understood is key to CA; this is known as the next-turn proof procedure. Establishing patterns of behavior is important, but deviant cases, where a pattern is broken, are extremely important because they provide insight into the normative structures that the participants are engaged with.

Generally speaking, CA draws on regularities and co-occurrences in talk. Some examples of these regularities outlined by Paltridge (2012) are:

  • opening conversations: how conversations are initiated or started
  • closing conversations: how conversations end
  • turn taking: the ways in which participants signal the end of their turn
  • adjacency pairs: regularities in two successive speakers that form expected sequences of behavior
  • stage of conversation: examining the way different adjacency pairs can behave at different points in the conversation
  • preference organization: examining the ways in which adjacency pairs can be examined using the preferred and dispreferred response
  • feedback: how speakers show that they are listening and understanding with their words
  • repair: how speakers correct themselves or others in their speech
  • discourse markers: items in spoken discourse which act as signposts of discourse coherence: oh, now y’know

The focus on spoken interaction and the transcript is not meant to make a philosophical claim about the world; it is simply a method for analysis, an argument for evidence-based analysis.

Often CA starts as undirected research, or immersion in data, where the analyst notices interesting outcomes in the text and tries to identify what conversational practices are involved. Or the analyst may notice features of the talk that invite a closer look at what outcomes they might be associated with. This initial noticing can lead to identifying the same pattern in other texts. Data collection can take a while, years in some cases. An analytical unit (turn, sequence, etc.) can be part of multiple related collections.

In another reading this week, Sidnell & Stivers (2012) use an analysis of Oh to show how it is used to indicate not surprise, but a transition from unknowing to knowing. While I understand that Sidnell is introducing CA and how it is performed, I seem to have missed the connection between the detailed understanding of Oh and a research question. How does mapping the terrain (to use Sidnell’s metaphor) of the use of Oh in this particular discourse help answer a research question? Or is it meant simply to describe a particular phenomenon? What is the community of practice that is being understood here?

Schegloff, Jefferson, & Sacks (1977) tie the idea of self and other from sociology to turn-taking systems, or conversation. They make the point that self-correction is preferred over other-correction, and want to understand how that preference is deployed in conversation. I almost wrote why it is used, but I’m not sure at the outset if that is in fact their goal. Why seems like more of a psychological question.

I kind of like the way they start out by stating their base assumptions about self-correction in plain language. It adds clarity. They are intentionally scoping the idea of repair to include the already established notion of correction. So it will include situations where there isn’t necessarily an error that is being replaced. They use examples to illustrate this, which is very helpful.

It struck me while doing the readings this week that the transcription notation in CA offers greater detail but also has the effect of making the cases really stand out from the regular text of the article.

Repair initiation and repair outcome are useful tools for looking at the repair sequence. They basically show that the preference for self-correction can be explained by looking at how repair initiation operates to give the self an opportunity to correct, rather than having the other do the correction.

The number of examples in Schegloff et al. (1977), and the categories they were taken to represent, was a bit mind-numbing and difficult to keep in my head at one time. I felt like a diagram would have been useful. Perhaps this complexity is because they were carving out a new area of research, and the options for publishing diagrams at the time may have been limited?

Wilkinson & Weatherall (2011) take a deep dive into a particular type of repair that Schegloff et al. (1977) cataloged: insertion repair. They look at 500 examples of insertion repair, to see how it is used to intensify or specify talk. It’s interesting to note that the study drew on multiple datasets of spoken English from England, the United States and New Zealand.

Insertion repair can be used to differentiate between possible referents, which is either explicitly stated or inferred. If there isn’t another referent involved, then the insertion is being used to clarify the type of referent it is. Their choice of examples is easier for me to understand, even with all the detail they offer. I also like how they show insertion repair in use, for example in this segment:

    01  DR:   Cause I mean they even: (0.8) they’ve got
    02        street gangs rou:nd in Cuba Ma:ll no:w.
    03  IV:   Ye:s.
    04  DR:   for [    cry] in’ out lou[:d. ]
    05  IV:      [(◦Mm◦)]             [Yea]:h.
    06  DR:   It’s- (0.5) (pretty) ◦ba:[d.◦]
    07  IV:                            [O:h] it’s getting ba:d.
    08        It’s really- They [’ve gotta do something.]
    09  DR:                     [ I mean you can’t even ] wa:lk
    10        (0.2) thr(h)ough Cuba Ma:ll now [huh]=
    11  IV:                                   [No:]
    12  DR:   =all those people ma:n.
    13        (.)
    14  IV:   Yea:h.
    15  DR:   They steal your bloody sh: Doc Mar:tens shoe:s
    16        an:[:::wh::]whatever you’ve got o:n it’s-
    17  IV:      [ Ri:ght.  ]
    18  IV:   Mmm.
    19        (0.5)
    20  DR:  pretty ba:d but [- ]
    21  IV:                  [Ri]:ght.
    22       (0.8)
    23  DR:  They won’t be stea(h)ling the co(h)rolla
    24       anyw(hh)ay s(h)o: huh [huh huh] huh huh
    25  IV:                        [ Yea:h. ]

The authors connect the dots between the insertion of a type of shoes (Doc Martens) and the subsequent mention of the Toyota Corolla to draw attention to the way in which brands are used to qualify the type of thieving that’s going on. Examples like this get me to see how CA could actually be useful when trying to analyze, understand, and contextualize a particular type of speech, scenario, or environment. It takes it out of the realm of exploring the universe of language, which is interesting, but also feels like a bit of a navel-gaze. The analysis needs to be put to use in some way: a paper that just finds some new feature of language use doesn’t in itself seem that interesting to me, though I grant that it is interesting to other people.

For my own research I think CA could be a useful tool. I recently conducted 30 interviews with archivists of web content to see how they ascribe value to content and enact appraisal in web archives. I have audio recordings and transcripts of these interviews, and have spent some time coding the transcripts and my own field notes. I can see that CA could be a useful way to identify patterns in the conversations that could help me uncover some buried meanings, or insights into what is going on. For example, I could examine the way that collection development policies were talked about.

Even though these interviews were unstructured ethnographic interviews, they aren’t really natural conversations about appraisal. I worry that my participation in all of them as the interviewer would distort the analysis, though perhaps that could be accounted for. Also, I have 30 hours of transcriptions, which is a lot of material to sift through. How can I narrow down to individual cases in that content? I have definitely been immersed in the data while coding, but I haven’t examined it from a conversational perspective. I guess that is something that makes CA appealing: triangulating on the interviews with a different method to see what I can learn from them.


Paltridge, B. (2012). Discourse analysis: An introduction. Bloomsbury Publishing.

Schegloff, E. A., Jefferson, G., & Sacks, H. (1977). The preference for self-correction in the organization of repair in conversation. Language, 361–382.

Sidnell, J., & Stivers, T. (2012). The handbook of conversation analysis. John Wiley & Sons.

Wilkinson, S., & Weatherall, A. (2011). Insertion repair. Research on Language and Social Interaction, 44(1), 65–91.

INTRODUCING Fedora 4 Ansible / DuraSpace News

From Yinlin Chen, Software Engineer, Digital Library Development, University Libraries, Virginia Tech

Only a week left to sign up for the joint LITA and ACRL supercomputing webinar / LITA

What’s so super about supercomputing? A very basic introduction to high performance computing

Presenters: Jamene Brooks-Kieffer and Mark J. Laufersweiler
Tuesday February 28, 2017
2:00 pm – 3:30 pm Central Time

Register online (page arranged by session date; login required)

This 90-minute webinar provides a bare-bones introduction to high-performance computing (HPC), or supercomputing. The program is a unique attempt to connect the academic library to introductory information about HPC. Librarians who are learning about researchers’ data-intensive work will want to familiarize themselves with the computing environment often used to conduct that work. Bibliometric analysis, quantitative statistical analysis, and geographic data visualization are just a few examples of computationally intensive work underway in humanities, social science, and science fields.

Covered topics will include:

  • Why librarians should care about HPC
  • HPC terminology and working environment
  • Examples of problems appropriate for HPC
  • HPC resources at institutions and nationwide
  • Low-cost entry-level programs for learning distributed computing

Details here and Registration here

Jamene Brooks-Kieffer brings a background in electronic resources to her work as Data Services Librarian at the University of Kansas.

Dr. Mark Laufersweiler has, since the Fall of 2013, served as the Research Data Specialist for the University of Oklahoma Libraries.


Look here for current and past LITA continuing education offerings

Questions or Comments?

contact LITA at (312) 280-4268 or Mark Beatty,
contact ACRL at (312) 280-2522 or Margot Conahan,

lita and acrl logos combined

Papers AND passwords, please… / District Dispatch

The Department of Homeland Security is increasingly demanding without cause that non-citizens attempting to lawfully enter the U.S. provide border officials with their electronic devices and the passwords to their private social media accounts. Today, ALA is pleased to join 50 other national public interest organizations – and nearly 90 academic security, technology and legal experts in the US and abroad – in a statement condemning these activities and the policy underlying them.

Hand holding a smart phone displaying the lockscreen

Source: shutterstock

Linked below, the statement calls the policy (first articulated by DHS Secretary John Kelly at a February 7 congressional hearing) a “direct assault on fundamental human rights.” It goes on to warn that the practice also will violate the privacy of millions of U.S. citizens and persons in their social networks and will encourage the governments of other nations to retaliate against Americans in kind.

For the statement’s signatories, the literal bottom line is: “The first rule of online security is simple: do not share your passwords. No government agency should undermine security, privacy, and other rights with a blanket policy of demanding passwords from individuals.”

Click here to read the full statement.

Additional Resources:

Tech, advocacy groups slam DHS call to demand foreign travelers’ passwords
By Ali Breland, Feb. 21, 2017

Electronic Media Searches at Border Crossings Raise Worry
By The Associated Press, Feb. 18, 2017

What Are Your Rights if Border Agents Want to Search Your Phone?
By Daniel Victor, Feb. 14, 2017

‘Give Us Your Passwords’
The Atlantic, by Kaveh Waddell, Feb. 10, 2017

The post Papers AND passwords, please… appeared first on District Dispatch.

DPLAfest 2017 Program Now Available / DPLA

The DPLAfest 2017 program of sessions, workshops, lightning talks, and more is now available!

Taking place at Chicago Public Library’s Harold Washington Library Center on April 20 and 21, DPLAfest 2017 will bring together librarians, archivists, and museum professionals, developers and technologists, publishers and authors, educators, and many others to celebrate DPLA and its community of creative professionals.

We received an excellent array of submissions in response to this year’s call for proposals and are excited to officially unveil the dynamic program that we have lined up for you. Look for opportunities to engage with topics such as social justice and digital collections; public engagement; library technology and interoperability; metadata best practices; ebooks; and using digital collections in education and curation projects.

DPLAfest 2017 presenters represent institutions across the country, and as far away as Europe, but also include folks from some of our host city’s premier cultural and educational institutions, including the Art Institute of Chicago, the Field Museum, and Chicago State University. We are also grateful for the support and collaboration of DPLAfest hosting partners Chicago Public Library, the Black Metropolis Research Consortium, Chicago Collections, and the Reaching Across Illinois Library System (RAILS).

View the DPLAfest 2017 program and register to reserve your spot today.

Changes to copyright liability calculus counterproductive / District Dispatch

ALA, as part of the Library Copyright Alliance (LCA), submitted a second round of comments in the Copyright Office’s study on the effectiveness of the notice-and-takedown provisions of Section 512. In its comments, LCA argues that the effectiveness of federal policies to improve access to information and enhance education (such as the National Broadband Plan adopted by the FCC in 2010, ConnectED and the expansion of the E-rate program) would have been seriously compromised without Section 512. Accordingly, LCA again opposes changes to Section 512 that are not required by the DMCA and that could upset the balance the statute attempts to strike between the protection of copyrighted information and its necessary free flow and access over the internet.

blue globe with highlighted connectors

Photo credit: Pixabay

Last year the U.S. Copyright Office initiated separate inquiries into several aspects of copyright law relevant to libraries, their users and the public in general. One such important proceeding asked for comment on the part of the Digital Millennium Copyright Act (DMCA) that provides internet service providers (ISPs) and others with a “safe harbor” from secondary copyright liability if they comply with a process that’s become known as “notice and takedown.”

Specifically, Section 512 protects online service providers from liability for the infringing actions of others who use online networks. Libraries are included in this safe harbor because they offer broadband and open access computing to the public. Because of the safe harbor, libraries have been able to provide broadband services to millions of people without the fear of being sued for onerous damages because of infringing user activity.

The Copyright Office has not yet announced a timeline for publication of its findings or recommendations regarding Section 512.


The post Changes to copyright liability calculus counterproductive appeared first on District Dispatch.

Using to collaborate on Open Data Day and to showcase work after the event / Open Knowledge Foundation

March 4th is Open Data Day! Open Data Day is an annual celebration of open data all over the world. For the fifth time in history, groups from around the world will create local events on the day, where they will use open data in their communities. Here is a look at how groups can use the platform to identify data sources and collaborate with open data users.

Although the team will only be present at our local Open Data Day in Austin, Texas, everyone at is proud to support the groups that will participate in events all over the world. The platform will make it easier to collaborate on your data projects, connect with the community, and preserve your work for others to build upon after Open Data Day.

For those of you that don’t know us yet, this is central to our vision as a B Corp and Public Benefit Corporation. By setting up in this way, we commit to considering our impact on stakeholders – not only on shareholders – and allow ourselves to publicly report on progress towards our mission in the same way companies report on finances. Our mission is to:

  1. build the most meaningful, collaborative and abundant data resource in the world in order to maximize data’s societal problem-solving utility,
  2. advocate publicly for improving the adoption, usability, and proliferation of open data and linked data, and
  3. serve as an accessible historical repository of the world’s data.

When I reached out to OKI about supporting the event, they suggested that I write some tips on how groups could benefit most from using the platform on Open Data Day and I prepared this short list:  

  • Data discovery and organization: before Open Data Day, search the platform and identify other data sources that are relevant to a project you hope to work on during the event. Create a dataset that includes hypotheses, questions, or goals for your project as well as data and related documentation
  • Explore and query data: as soon as you find a data file, understand its shape and descriptive statistics to determine if the data has the right characteristics for your project as well as query the file directly on using SQL
  • Use the API: interact with data via R Studio or Python programs using the API or link a Google Sheet to a dataset (if you prefer working locally in a spreadsheet you can do that too)
  • Communicate effectively: as you work on your project, use discussion threads in the project’s dataset as well as annotate data within the platform so group members have maximum context
  • Showcase your work: include data, notebooks, analysis, and visualizations in a single workspace to preserve what was achieved and let the community build on it without unnecessarily repeating the data prep and analysis completed during the event

If you’d like to see some relevant examples on I would suggest looking at this dataset from the Anti-Defamation League, this analysis of Cancer Clinical Trials, and this Data for Democracy project around Drug Spending.

I’d love to see your projects on so tag @len in a discussion on your dataset or invite me to be a read-only contributor. If you have questions, email and you’ll get the attention of our whole team as your feedback goes right into our company Slack.

Hopefully helps your group be more productive on Open Data Day and also sustain momentum from the event afterwards.

Creating an OAI-PMH Feed From Your Website / ACRL TechConnect

Libraries that use a flexible content management system such as Drupal or WordPress for their library website and/or resource discovery face a challenge in ensuring that their data is accessible to the rest of the library world. Whether making metadata usable by other libraries or portals such as DPLA, or harvesting content into a discovery layer, libraries need to take some additional steps to make this happen. While there are a number of ways to accomplish this, the most straightforward is to create an OAI-PMH feed. OAI-PMH stands for Open Archives Initiative Protocol for Metadata Harvesting, and is a well-supported and widely understood protocol in many metadata management systems. There’s a tutorial available covering the details you might want to know, and the Open Archives Initiative has detailed documentation.

Content management tools designed specifically for library and archives use, such as LibGuides and Omeka, have a built-in OAI-PMH feed, and generally all you need to do is find the base URL and plug it in. (For instance, here is what a LibGuides OAI feed looks like.) In this post I’ll look at what options are available for Drupal and WordPress to create the feed and become a data provider.
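Whatever system generates it, an OAI-PMH data provider is just a base URL that answers a handful of query-string “verbs” (Identify, ListRecords, GetRecord, and so on). A minimal sketch in Python of how harvest requests are formed; the base URL here is hypothetical, so substitute your own feed’s address:

```python
from urllib.parse import urlencode

# Hypothetical base URL -- substitute the base URL of your own feed.
BASE = "https://library.example.edu/oai"

def oai_request(verb, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    return BASE + "?" + urlencode({"verb": verb, **params})

# All six OAI-PMH verbs are issued the same way, for example:
print(oai_request("Identify"))
print(oai_request("ListRecords", metadataPrefix="oai_dc"))
```

A harvester points at the base URL, issues ListRecords, and pages through large result sets using the resumption tokens the feed returns.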


This section is short, since there aren’t many options. If you use WordPress for your library website you will have to experiment, as nothing is well supported. Lincoln University in New Zealand has created a script that converts a WordPress RSS feed to a minimal OAI feed. This requires editing a PHP file to include your RSS feed URL and uploading it to a server. I admit that I have been unsuccessful at testing this, but Lincoln University has a working example, and uses it to harvest their WordPress library website into Primo.
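The core of any such RSS-to-OAI conversion is a small element-renaming map from RSS onto Dublin Core. This is not Lincoln University’s PHP script; it is a rough Python sketch, and the tag mapping and example item are illustrative only:

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

def rss_item_to_dc(item):
    """Map common RSS <item> elements onto their Dublin Core equivalents."""
    mapping = {"title": "title", "link": "identifier",
               "description": "description", "pubDate": "date"}
    record = ET.Element("record")
    for rss_tag, dc_tag in mapping.items():
        el = item.find(rss_tag)
        if el is not None and el.text:
            ET.SubElement(record, "{%s}%s" % (DC_NS, dc_tag)).text = el.text
    return record

# A hypothetical RSS item, as it might appear in a WordPress feed.
item = ET.fromstring(
    "<item><title>New databases for spring</title>"
    "<link>https://library.example.edu/news/1</link></item>")
print(ET.tostring(rss_item_to_dc(item), encoding="unicode"))
```

The real work in a production script is wrapping each converted record in the OAI-PMH response envelope and answering the protocol verbs, but the metadata crosswalk itself stays this simple.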


If you use Drupal, you will first need to install a module called Views OAI-PMH, which creates a Drupal view formatted as an OAI-PMH data provider feed. Those familiar with Drupal know that the Views module can present content in a variety of ways; for instance, you can include certain fields from certain content types in a list or chart, reusing content rather than recreating it. This is no different, except that the formatting is an OAI-PMH-compliant XML structure. Rather than placing the view in a Drupal page or block, you create a separate page. This page becomes your base URL to provide to others or reuse in whatever way you need.

The Views OAI-PMH module isn’t the most obvious module to set up, so here are the basic steps you need to follow. First, enable and set permissions as usual. You will also want to refresh your caches (I had trouble until I did this). You’ll discover that unlike other modules the documentation and configuration is not in the interface, but in the README file, so you will need to open that out of the module directory to get the configuration instructions.

To create your OAI-PMH view you have two choices. You can add it to a view that is already created, or create a new one. The module will create an example view called Biblio OAI-PMH (based on an earlier Biblio module used for creating bibliographic metadata). You can just edit this to create your OAI feed. Alternatively, if you have a view that already exists with all the data you want to include, you can add an OAI-PMH display as an additional display. You’ll have to create a path for your view that will make it accessible via a URL.

The details screen for the OAI-PMH display.

The Views OAI-PMH module only supports Dublin Core at this time. If you are using Drupal for bibliographic metadata of some kind, mapping the fields is a fairly straightforward process. However, choosing the Dublin Core mappings for data that is not bibliographic by nature requires some creativity and thought about where the data will end up. When I was setting this up I was trying to harvest most of the library website into our discovery layer, so I knew how the discovery layer parsed OAI DC and could choose fields accordingly.

After adding fields to the view (just as you normally would in creating a view), open the settings for the OAI-PMH display to choose the Dublin Core element name for each content field.

You can then map each element to the appropriate Dublin Core field. The example from my site includes some general metadata that appears on all content (such as Title), and some that only appears in specific content types. For instance, Collection Description only appears on digital collection content types. I did not choose to include the body content for any page on the site, since most of those pages contain a lot of scripts or other code that wasn’t useful to harvest into the discovery layer. Explanatory content such as the description of a digital collection or a database was more useful to display in the discovery layer, and exists only in special fields for those content types on my Drupal site, so we could pull those out and display those.

In the end, I have a feed that looks like this. Regular pages end up with very basic metadata in the feed:

<oai_dc:dc xsi:schemaLocation="">
<dc:identifier></dc:identifier>
<dc:creator>Loyola University Libraries</dc:creator>
</oai_dc:dc>

Whereas databases get more information pulled in. Note that there are two identifiers, one for the database URL and one for the database description link. We will make both available, but may choose to use only one in the discovery layer and hide the other.

<oai_dc:dc xsi:schemaLocation="">
<dc:title>Annual Bibliography of English Language and Literature</dc:title>
<dc:subject>Modern Languages</dc:subject>
<dc:creator>Loyola University Libraries</dc:creator>
</oai_dc:dc>
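Consumers of the feed then read those Dublin Core elements back out. As a sketch, here is how a record like the one above might be parsed with Python’s standard library (namespace declarations are filled in here, since the fragment above omits them):

```python
import xml.etree.ElementTree as ET

# The database record above, with namespace declarations filled in.
record_xml = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Annual Bibliography of English Language and Literature</dc:title>
  <dc:subject>Modern Languages</dc:subject>
  <dc:creator>Loyola University Libraries</dc:creator>
</oai_dc:dc>"""

root = ET.fromstring(record_xml)
# Dublin Core elements are repeatable (e.g. two dc:identifier values),
# so collect each element name into a list.
fields = {}
for el in root:
    name = el.tag.split("}")[-1]  # strip the namespace prefix
    fields.setdefault(name, []).append(el.text)

print(fields["title"])  # ['Annual Bibliography of English Language and Literature']
```

A discovery layer does essentially this for every record in the feed, which is why choosing sensible Dublin Core mappings up front matters so much.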

When someone does a search in the discovery layer for something on the library website, the result shows the page right in the interface. We are still doing usability tests on this right now, but expect to move it into production soon.


I’ve just touched on two content management systems, but there are many more out there. Do you create OAI-PMH feeds of your data? What do you do with them? Share your examples in the comments.

Are we officially post app? / LibUX

Probably. Natasha Lomas (@riptari) reported only a few hours ago that Gartner predicts “Facebook [is] on course to be the WeChat of the West,” meaning that through third-party integrations Facebook will become, and has started to become, the sole app through which users connect with other services.

In an upcoming podcast with Trey Gordner (@darthgordner) and Stephen Bateman (@IAmBateman) from Koios, I asked them about a near future in which companies’ standalone front ends are subsumed by mashups like IFTTT or other “super interfaces” that rely on APIs and metadata to aggregate those companies’ services. There has already been activity in this space for years, with services like Kayak, and Gartner research director Jessica Ekholm describes appetite for these experiences as increasing.

“It’s a move away from using native applications to something else,” says Gartner research director Jessica Ekholm, discussing the survey in an interview with TechCrunch. “Consumers are getting less interested in using applications; there are far too many applications. Some of the surveys that we’re doing we see that there’s a stabilization in terms of app usage. How many apps they’re downloading, how much time they’re spending finding new applications — it’s just that people are getting a little bit disinterested in that. People are spending more time with the apps that they’ve already got.”

Other than shopping apps, this seems to be playing out across categories, which show declining usage. This is what they mean by “post app.” As opportunity declines for prospective native app developers, it is in their commercial interest to make their products and services interoperable with the apps folks already use, like Facebook and Alexa.

The library vendor Koios certainly sees an opportunity here: they make Libre. Appetite for aggregation specifically in the library space could be high. Library front ends and search experiences leave much to be desired, and as these organizations are slow to improve or incapable of coming up to par, Koios and the like can offer something moderately more convenient and succeed. (See: how Libre was designed to be a “Netflix” for libraries.)

Now that voice user interfaces with virtual assistants (Alexa, Siri) are capable, the fuse is lit; but it is AI in general, surfaced through various conversational UIs — whether messenger apps or VUIs — that makes this trend viable. If there is a winner, it will be the one that aggregates the most, has the thinnest perceptible interface (see: the library interface), and gets it right.

Facebook is well-positioned here not just because they are the biggest player (because they command the highest engagement), but because they know the most about you.

Tracking What’s Next At CES / LITA

It’s no news that technology is a part of daily life, or that it’s an ever-increasing part of library life. One reliable way to keep ahead of what might be walking in the door tomorrow is monitoring consumer trade shows, the largest of which is CES (formerly the Consumer Electronics Show, now just the acronym).

For 50 years, CES has showcased hot new gadgets that eventually had culture-changing effects: the Portable Executive Telephone (1968), the VCR (1970), the Commodore 64 (1982), Nintendo (1985), DVDs (1996), Plasma TVs (2001), and the first big wave of smart home technologies (2014). Some of these had direct impacts on libraries, and others only in how our patrons live their lives. As the time from a technology’s introduction to its adoption gets shorter, staying on top of what’s happening at CES is becoming more important.

A few years ago, librarians began attending South By Southwest (SXSW) to know what hot new thing was coming – now, some are shifting to attending CES as their major annual conference…and not any library-focused events. Michael Sauers, Director of Technology for Do Space in Omaha (NE), explains: “As someone whose job it is to provide the latest and greatest technology to a community, CES is the perfect place to spend three days discovering not only what’s new, but what’s next when it comes to tech. It’s an exhausting experience, but it gives me plenty of ideas to keep me busy for the rest of the year.” Check out the slidedeck Michael made of his 2016 CES experience, or Instagram for what he saw in 2017.

Pancake printer #ces2017 (a post shared by Michael Sauers, @michael.sauers, on Instagram)

Attending CES might not be in your budget, but watching the videos (choose CSTV as a category) and reading the session descriptions on the website provides plenty of food for thought.

How do you evaluate what you see in terms of its impact on libraries?

Need – What library needs might this bit of technology fulfill? From user engagement to kids’ programming, something on the show floor might add to the library’s services, programs, or support.

Future Planning – What isn’t a need right now, but might be next year? Or at the start of your next five-year plan? What technology isn’t ready to fulfill a need today, but whose next version might arrive at the right price or in the right format?

Staying Aware – What technology may never fulfill a library need, but might become a part of users’ daily lives? (Think smartphones.) You might need to know about these things to prepare for community questions, such as when the FAA started regulating drones over a certain size. People who use or want to talk about a particular technology might need space to meet and geek out. Finally, it might just be awesome for you to have a customer interaction where you mention this neat new thing you saw at CES and thought the patron might be interested in.

As with most technology, you don’t necessarily need to know everything about it for that knowledge to be useful. A little bit can go a long way when you’re predicting the future.

For more conferences to stay on top of what’s next, check out:

Industry Conventions

Developer & Release Events

Europe in the age of Tr… Transparency / Open Knowledge Foundation

For the past few years, the USA has been an example of how governments can manage open government initiatives, and open data in particular, by introducing positions like federal chief information officer and chief data officer. Datasets opened on a massive scale in standardised formats laid the ground for startups and citizen apps to flourish. Now, when referring to the example of the US, it is common to add ‘under Obama’s administration’ with a sigh. Initiatives to halt data collection put the narrative on many sensitive issues, such as climate change, women’s rights and racial inequality, under threat. Now, more than ever, the EU should take a global lead with its open data initiatives.

One of these initiatives just took place last week: developers of civic apps from all over Europe went on a Transparency Tour of Brussels. Participants were the winners of the app competition that was held at TransparencyCamp EU in Amsterdam last June. In the run-up to the final event, 30 teams submitted their apps online, while another 40 teams were formed in a series of diplohacks that Dutch embassies organised in eight countries. If you just asked yourself ‘What is a diplohack?’, let me explain.

ConsiliumVote team pitching their app at TCampEU, by EU2016NL

Diplohacks are hackathons where developers meet diplomats – with initial suspicion on both sides. Gradually, both sides come to understand how they can benefit from the cooperation. As much as the word ‘diplohack’ itself brings two worlds together, the event was foremost an ice breaker between the communities. According to the participant survey, direct interaction is what both sides enjoyed most. Diplohacks helped teams find and understand the data, and also enabled data providers to see points for improvement, like a better interface or adding relevant data fields to their datasets.

Experience the diplohack atmosphere by watching this short video:

All winners of the app competition were invited last week to the transparency tour of the EU institutions. The winning teams were an app that makes use of bike data; Harta Banilor Publici (Public Spending Map) from Romania; and ConsiliumVote, a visualisation tool for votes in the Council of the EU. Developers were shown the EU institutions from the inside, but the most exciting part was a meeting with the EU open data steering committee.

Winners of the app competition at the Council of EU, by Open Knowledge Belgium

Yet again, it proved how important it is to meet face to face and discuss things. Diplomats encouraged coders to use their data more. Tony Agotha, a member of the cabinet of First Vice-President Frans Timmermans, reminded and praised coders for the social relevance of their work. Developers, in turn, provided feedback with both specific comments like making the search on the Financial Transparency website possible across years; and general ideas such as making the platform of the European data portal open sourced so that regional and municipal portals can build on it.

‘Open data is not a favour, it’s a right’, said one of the developers. To use this right, we need more meetings between publishers and re-users; we need community growth; we need communication of data and, ultimately, more data. TransparencyCamp Europe and last week’s events in Brussels were good first steps. However, both EU officials and European citizens using data should keep the dialogue going if we want the EU to seize the opportunity to lead on open data. Your comments and ideas are welcome. Join the discussion here.



MarcEdit Mac Update / Terry Reese

It seems like I’ve been making a lot of progress wrapping up some of the last major features missing from the Mac version of MarcEdit.  The previous update introduced support for custom/user defined fonts and font sizes which I hope went a long way towards solving accessibility issues.  Today’s update brings plugin support to MarcEdit Mac.  This version integrates the plugin manager and provides a new set of templates for interacting with the program.  Additionally, I’ve migrated one of the Windows plugins (Internet Archive to HathiTrust Packager) to the new framework.  Once the program is updated, you’ll have access to the current plugins.  I have 3 that I’d like to migrate, and will likely be doing some work over the next few weeks to make that happen.

Interested in seeing what the plugin support looks like? See:

You can download the update from the downloads page or via the automatic updating tool in the program.

Questions?  Let me know.


Open Knowledge International receives $1.5 million from Omidyar Network / Open Knowledge Foundation

We’ve recently received funding from Omidyar Network, which will allow us to further our commitment to civil society organisations!

Open Knowledge International has received a two-year grant amounting to $1.5 million from Omidyar Network to support the development and implementation of our new civil society-focused strategy. Running until the end of December 2018, this grant reflects Omidyar Network’s confidence in our shared vision to progress openness in society and we are looking forward to using the funds to strengthen the next phase of our work.

With over a decade’s experience opening up information, we will be turning our attention and efforts to focus on realising the potential of data for society. The unrestricted nature of the funding will help us to build on the successes of our past, work with new partners and implement effective systems to constructively address the challenges before us.

2017 certainly presents new challenges to the open data community. Increased access to information simply is not enough to confront a shrinking civic space, the stretched capacities of NGOs, and countless social and environmental issues. Open Knowledge International is looking to work with new partners on these areas to use open data as an effective tool to address society’s most pressing issues. Omidyar Network’s support will allow us to work in more strategic ways, to develop relationships with new partners and to embed our commitment to civil society across the organisation.

Pavel Richter, Open Knowledge International’s CEO, underlines the impact that this funding will have on the organisation’s continued success: “Given the expertise Open Knowledge International has amassed over the years, we are eager to employ our efforts to ensure open data makes a real and positive impact in the world. Omidyar Network’s support for the next two years will allow us to be much more strategic and effective with how we work.”

Of course implementing our strategic vision will take time. Long-term funding relationships like the one we have with Omidyar Network play an instrumental role in boosting Open Knowledge International’s capacity as they provide the space to stabilise and grow. For the past six years, Omidyar Network has been an active supporter of Open Knowledge International, and this has allowed us to cultivate and refine the strong vision we have today. More recently Omidyar Network has provided valuable expertise for our operational groundwork, helping to instil a suitable structure for us to thrive. Furthermore, our shared vision of the transformative impact of openness has allowed us to scale our community and grow our network of committed change-makers and activists around the world.

“We are proud to continue our support for Open Knowledge International, which plays a critical role in the open data ecosystem,” stated Martin Tisné, Investment Partner at Omidyar Network. “Open Knowledge International has nurtured several key developments in the field, including the Open Definition, CKAN and the School of Data, and we look forward to working with Open Knowledge International as it rolls out its new civil society-focused strategy.”

As we continue to chart our direction, Open Knowledge International’s work will focus on three areas to unlock the potential value of open data for civil society organisations: we will demonstrate the value of open data for the work of these organisations, we will provide organisations with the tools and skills to effectively use open data, and we will work to make government information systems more responsive to the needs of civil society. Omidyar Network’s funding ensures Open Knowledge International has the capacity to address these three areas. We are grateful for the support and we welcome our new strategic focus to empower civil society organisations to use open data to improve people’s lives.

Further information:

Open Knowledge International

Open Knowledge International is a global non-profit organisation focused on realising open data’s value to society by helping civil society groups access and use data to take action on social problems. Open Knowledge International does this in three ways: 1) we show the value of open data for the work of civil society organisations; 2) we provide organisations with the tools and skills to effectively use open data; and 3) we make government information systems responsive to civil society.

Omidyar Network 

Omidyar Network is a philanthropic investment firm dedicated to harnessing the power of markets to create opportunity for people to improve their lives. Established in 2004 by eBay founder Pierre Omidyar and his wife Pam, the organization invests in and helps scale innovative organizations to catalyze economic and social change. Omidyar Network has committed more than $1 billion to for-profit companies and nonprofit organizations that foster economic advancement and encourage individual participation across multiple initiatives, including Education, Emerging Tech, Financial Inclusion, Governance & Citizen Engagement, and Property Rights.

To learn more, visit the Omidyar Network website and follow @omidyarnetwork on Twitter.


Excel is threatening the quality of research data — Data Packages are here to help / Open Knowledge Foundation

This week the Frictionless Data team at Open Knowledge International will be speaking at the International Digital Curation Conference #idcc17 on making research data quality visible. Dan Fowler looks at why the popular file format Excel is problematic for research and what steps can be taken to ensure data quality is maintained throughout the research process.

Our Frictionless Data project aims to make sharing and using data as easy and frictionless as possible by improving how data is packaged. The project is designed to support the tools and file formats researchers use in their everyday work, including basic CSV files and popular data analysis programming languages and frameworks like R and Python Pandas.  However, Microsoft Excel, both the application and the file format, remains very popular for data analysis in scientific research.

It is easy to see why Excel retains its stranglehold: over the years, an array of convenience features for visualizing, validating, and modeling data have been developed and adopted across a variety of uses.  Simple features, like the ability to group related tables together, are a major advantage of the Excel format over, for example, single-table formats like CSV.  However, Excel has a well-documented history of silently corrupting data in unexpected ways, which leads some, like data scientist Jenny Bryan, to compile lists of “Scary Excel Stories” advising researchers to choose alternative formats, or at least, treat data stored in Excel warily.

“Excel has a well-documented history of silently corrupting data in unexpected ways…”

With data validation and long-term preservation in mind, we’ve created Data Packages which provide researchers an alternative format to Excel by building on simpler, well understood text-based file formats like CSV and JSON and adding advanced features.  Added features include providing a framework for linking multiple tables together; setting column types, constraints, and relations between columns; and adding high-level metadata like licensing information.  Transporting research data with open, granular metadata in this format, paired with tools like Good Tables for validation, can be a safer and more transparent option than Excel.

Why does open, granular metadata matter?

With our “Tabular” Data Packages, we focus on packaging data that naturally exists in “tables”—for example, CSV files—a clear area of importance to researchers illustrated by guidelines issued by the Wellcome Trust’s publishing platform Wellcome Open Research. The guidelines mandate:

Spreadsheets should be submitted in CSV or TAB format; EXCEPT if the spreadsheet contains variable labels, code labels, or defined missing values, as these should be submitted in SAV, SAS or POR format, with the variable defined in English.

Guidelines like these typically mandate that researchers submit data in non-proprietary formats; SPSS, SAS, and other proprietary data formats are accepted because they provide important contextual metadata that hasn’t been supported by a standard, non-proprietary format. The Data Package specifications—in particular, our Table Schema specification—provide a method of assigning functional “schemas” for tabular data.  This information includes the expected type of each value in a column (“string”, “number”, “date”, etc.), constraints on the value (“this string can only be at most 10 characters long”), and the expected format of the data (“this field should only contain strings that look like email addresses”). The Table Schema can also specify relations between tables, strings that indicate “missing” values, and formatting information.
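As a rough sketch of what a Table Schema makes possible, here is a toy row validator in Python. The field names and constraints are invented for illustration, and a real project would use an actual Table Schema implementation rather than this hand-rolled check:

```python
# A minimal, invented Table Schema for a two-column gene expression table.
schema = {
    "fields": [
        {"name": "gene", "type": "string",
         "constraints": {"maxLength": 10}},
        {"name": "expression", "type": "number"},
    ],
    "missingValues": [""],
}

def check_row(row, schema):
    """Return a list of problems found when validating one row of strings."""
    problems = []
    for field, value in zip(schema["fields"], row):
        if value in schema["missingValues"]:
            continue  # declared missing value, not an error
        if field["type"] == "number":
            try:
                float(value)
            except ValueError:
                problems.append(f"{field['name']}: {value!r} is not a number")
        elif field["type"] == "string":
            max_len = field.get("constraints", {}).get("maxLength")
            if max_len and len(value) > max_len:
                problems.append(f"{field['name']}: longer than {max_len}")
    return problems

print(check_row(["DEC1", "0.93"], schema))  # -> []
print(check_row(["DEC1", "n/a"], schema))   # -> one problem reported
```

Tools like Good Tables do this kind of checking, and much more, against the real specification.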

This information can prevent incorrect processing of data at the loading step.  In the absence of these table declarations, even simple datasets can be imported incorrectly in data analysis programs given the heuristic (and sometimes, in Excel’s case, byzantine) nature of automatic type inference.  In one example of such an issue, Zeeberg et al. and later Ziemann, Eren and El-Osta describe a phenomenon where gene expression data was silently corrupted by Microsoft Excel:

A default date conversion feature in Excel (Microsoft Corp., Redmond, WA) was altering gene names that it considered to look like dates. For example, the tumor suppressor DEC1 [Deleted in Esophageal Cancer 1] [3] was being converted to ’1-DEC.’ [16]

These errors didn’t stop at the initial publication.  As these Excel files are uploaded to other databases, these errors could propagate through data repositories, as happened in the now-replaced “LocusLink” database. At a time when data sharing and reproducible research are gaining traction, the last thing researchers need is file formats leading to errors.

Much like Boxed Water, Packaged Data is better because it is easier to move.

Zeeberg’s team described various technical workarounds to avoid Excel problems, including using Excel’s text import wizard to manually set column types every time the file is opened.  However, the researchers acknowledge that this requires constant vigilance to prevent further errors, attention that could be spent elsewhere.   Rather, a simple, open, and ubiquitous method to unambiguously declare types in column data—columns containing gene names (e.g. “DEC1”) are strings not dates and “RIKEN identifiers” (e.g. “2310009E13”) are strings not floating point numbers—paired with an Excel plugin that reads this information may be able to eliminate the manual steps outlined above.
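The failure mode, and the fix, can be sketched in a few lines of Python. The “inference” below is a deliberate toy simplification of Excel’s behaviour, not its actual algorithm; the point is that an explicit “string” type declaration bypasses inference entirely:

```python
from datetime import datetime

def spreadsheet_style(value):
    """Toy stand-in for spreadsheet auto-conversion: anything that
    parses as a month-abbreviation-plus-day gets rewritten as a date."""
    try:
        d = datetime.strptime(value, "%b%d")  # e.g. "DEC1" parses as Dec 1
        return f"{d.day}-{d.strftime('%b').upper()}"
    except ValueError:
        return value  # everything else passes through untouched

def typed_load(value, column_type):
    """With an explicit schema type, no inference happens at all."""
    if column_type == "string":
        return value
    return spreadsheet_style(value)

print(spreadsheet_style("DEC1"))      # -> 1-DEC  (silently corrupted)
print(typed_load("DEC1", "string"))   # -> DEC1   (safe)
```

A gene name like SOX2 survives even the naive path, which is exactly what makes this class of corruption so easy to miss: only the date-like values are mangled.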

Granular Metadata Standards Allow for New Tools & Integrations

By publishing this granular metadata alongside the data, both users and software programs can use it to import the data into Excel correctly and automatically, and the same benefit accrues when similar integrations are created for other data analysis software packages, like R and Python.  Further, these specifications (and specifications like them) allow for the development of whole new classes of tools to manipulate data without the overhead of Excel, while still supporting data validation and metadata creation.

For instance, the Open Data Institute has created Comma Chameleon, a desktop CSV editor.  You can see a talk about Comma Chameleon on our Labs blog.  Similarly, Andreas Billman created SmartCSV.fx to solve the issue of broken CSV files provided by clients.  While initially this project depended on an ad hoc schema for data, the developer has since adopted our Table Schema specification.

Other approaches that bring spreadsheets together with Data Packages include Metatab which aims to provide a useful standard, modeled on the Data Package, of storing metadata within spreadsheets.  To solve the general case of reading Data Packages into Excel, Nimble Learn has developed an interface for loading Data Packages through Excel’s Power Query add-in.

For examples of other ways in which Excel mangles good data, it is worth reading through Quartz’s Bad Data guide and checking over your data.  Also, see our Frictionless Data Tools and Integrations page for a list of integrations created so far.   Finally, we’re always looking to hear more user stories for making it easier to work with data in whatever application you are using.

This post was adapted from a paper we will be presenting at the International Digital Curation Conference (IDCC), where our Jo Barratt will be presenting our work to date on Making Research Data Quality Visible.

Truly progressive WebVR apps are available offline! / Dan Scott

I've been dabbling with the A-Frame framework for creating WebVR experiences for the past couple of months, ever since Patrick Trottier gave a lightning talk at the GDG Sudbury DevFest in November and a hands-on session with AFrame in January. The @AFrameVR Twitter feed regularly highlights cool new WebVR apps, and one that caught my attention was ForestVR - a peaceful forest scene with birds tweeting in the distance. "How nice would it be", I thought, "if I could just escape into that little scene wherever I am, without worrying about connectivity or how long it would take to download?"

Then I realized that WebVR apps are a great use case for Progressive Web App (PWA) techniques that allow web apps to be as fast, reliable, and engaging as native Android apps. With the source code for ForestVR at my disposal, I set out to add offline support. And it turned out to be surprisingly easy to make this work on Android in both the Firefox and Chrome browsers.

If you just want to see the required changes for this specific example, you can find the relevant two commits at the tip of my branch. The live demo is at

ForestVR with "Add to Home Screen" menu on Firefox for Android 51.0.3

ForestVR with "Add" prompt on Chrome for Android 57

In the following sections I've written an overview of the steps you have to take to turn your web app into a PWA:

Describe your app with a Web App Manifest

ForestVR already had a working Web App Manifest (Mozilla docs / Google docs), a simple JSON file that defines metadata about your web app such as the app name and icon to use when it is added to your home screen, the URL to launch, the splash screen to show when it is loading, and other elements that enable it to integrate with the Android environment.

The web app manifest for ForestVR is named manifest.json and contains the following code:

  "name": "Forest VR",
  "icons": [
      "src": "./assets/images/icons/android-chrome-144x144.png",
      "sizes": "144x144",
      "type": "image/png"
  "theme_color": "#ffffff",
  "background_color": "#ffffff",
  "start_url": "./index.html",
  "display": "standalone",
  "orientation": "landscape"

You associate the manifest with your web app through a simple <link> element in the <head> of your HTML:

<link rel="manifest" href="manifest.json">

Create a service worker to handle offline requests

A service worker is a special chunk of JavaScript that runs independently from a given web page, and can perform special tasks such as intercepting and changing browser fetch requests, sending notifications, and synchronizing data in the background (Google docs / Mozilla docs). While implementing the required networking code for offline support would be painstaking, bug-prone work, Google has fortunately made the sw-precache node module available to support generating a service worker from a simple configuration file and any static files in your deployment directory.

The configuration I added to the existing gulp build system gulpfile uses runtime caching for assets that are hosted at a different hostname or, in the case of the background soundtrack, is not essential for the experience at launch and can thus be loaded and cached after the main experience has been prepared. The staticFileGlobs list, on the other hand, defines all of the assets that must be cached before the app can launch.

swConfig = {
  runtimeCaching: [{
    urlPattern: /^https:\/\/cdn\.rawgit\.com\//,
    handler: 'cacheFirst'
  }, {
    urlPattern: /^https:\/\/aframe\.io\//,
    handler: 'cacheFirst'
  }, {
    urlPattern: /\/assets\/sounds\//,
    handler: 'cacheFirst'
  }],
  staticFileGlobs: [
    // essential local assets (HTML, bundle.js, bundle.css, models, textures)
  ]
};

I defined the configuration inside a new writeServiceWorkerFile() function so that I could add it as a build task to the gulpfile:

function writeServiceWorkerFile(callback) {
  swConfig = {...};
  swPrecache.write('service-worker.js', swConfig, callback);
}

In that gulp task, I declared the 'scripts' and 'styles' tasks as prerequisites for generating the service worker, as those tasks generate the bundle.js and bundle.css files. If the files are not present in the build directory when sw-precache runs, then it will simply ignore their corresponding entry in the configuration, and they will not be available for offline use.

gulp.task('generate-service-worker', ['scripts', 'styles'], function(callback) {
  writeServiceWorkerFile(callback);
});

I added the generate-service-worker task to the deploy task so that the service worker will be generated every time we build the app:


Register the service worker

Just like the Web App Manifest, you need to register your service worker, but it’s a little more complex. I chose Google’s boilerplate service worker registration script because it contains self-documenting comments and hooks for adding more interactivity, and added it in a <script> element in the <head> of the HTML page.

Host your app with HTTPS

PWAs (specifically service workers) require the web app to be hosted on an HTTPS-enabled site, due to the potential for mischief if a service worker were replaced through a man-in-the-middle attack, which would be trivial on a non-secure site. Fortunately, my personal VPS already runs HTTPS thanks to free TLS certificates generated by Let's Encrypt.

Check for success with Lighthouse

Google has made Lighthouse, their PWA auditing tool, available as both a command-line oriented node module and a Chrome extension for grading the quality of your efforts. It runs a separate instance of Chrome to check for offline support, responsiveness, and many other required and optional attributes and generates succinct reports with helpful links for more information on any less-than-stellar results you might receive.

Check for success with your mobile web browser

Once you have satisfied Lighthouse's minimum requirements, load the URL in Firefox or Chrome on Android and try adding it to your home screen.

  • In Firefox, you will find the Add to Home Screen option in the browser menu under the Page entry.
  • In Chrome, the Add button (Chrome 57) or Add to Home Screen button (Chrome 56) will appear at the bottom of the page when you have visited it a few times over a span of five minutes or more; a corresponding entry may also appear in your browser menu.

Put your phone in airplane mode and launch the app from your shiny new home screen button. If everything has gone well, it should launch and run successfully even though you have no network connection at all!


As a relative newbie to node projects, I spent most of my time figuring out how to integrate the sw-precache build steps nicely into the existing gulp build, and making the app relocatable on different hosts and paths for testing purposes. The service worker itself was straightforward. While I used ForestVR as my proof of concept, the process should be similar for turning any other WebVR app into a Progressive WebVR App. I look forward to seeing broader adoption of this approach for a better WebVR experience on mobile!

As an aside for my friends in the library world, I plan to apply the same principles to making the My Account portion of the Evergreen library catalogue a PWA in time for the 2017 Evergreen International Conference. Here's hoping more library software creators are thinking about improving their mobile experience as well...

Today, I learned about the Accessibility Tree / LibUX

Today, I learned about the “accessibility tree.”

I am not sure who to attribute this diagram to, but I borrowed it from Marcy Sutton.

The accessibility tree and the DOM tree are parallel structures. Roughly speaking the accessibility tree is a subset of the DOM tree. It includes the user interface objects of the user agent and the objects of the document. Accessible objects are created in the accessibility tree for every DOM element that should be exposed to an assistive technology, either because it may fire an accessibility event or because it has a property, relationship or feature which needs to be exposed. Generally if something can be trimmed out it will be, for reasons of performance and simplicity. For example, a <span> with just a style change and no semantics may not get its own accessible object, but the style change will be exposed by other means. W3C Core Accessibility Mappings 1.1

Basically, when a page renders in the browser, there is the Document Object Model (DOM) that is the underlying structure of the page that the browser interfaces with. It informs the browser that such-and-such is the title, what markup to render, and so on. It’s hierarchically structured kind of like a tree. There’s a root and a bunch of branches.

At the same time, there is an accessibility tree that is created. Browsers make them to give assistive technology something to latch on to.

When we use ARIA attributes, we are in part giving instructions to the browser about how to render that accessibility tree.

There’s a catch: not all browsers create accessibility trees in the same way; not all screen readers interpret accessibility trees in the same way; not all screen readers even refer to the accessibility tree, but they scrape the DOM directly — some do both.

The Space Age: Library as Location / LITA

On the surface, a conversation about the physical spaces within libraries might not seem relevant to technology in libraries, but there’s a trend I’ve noticed — not only in my own library, but in other libraries I’ve visited in recent months: user-supplied tech in library landscapes.

Over the course of the last decade, we’ve seen a steady rise in the use of portable personal computing devices. In their Evolution of Technology survey results, Pew Research Center reports that 51% of Americans own a tablet, and 77% own smartphones. Library patrons seem to be doing less browsing and more computing, and user-supplied technology has become ubiquitous — smartphones, and tablets, and notebooks, oh my! Part of the reason for this BYO tech surge may be explained by a triangulation of high demand for the library’s public computer stations, decreased cost of personal devices, and the rise of telecommuting and freelance gig-work in the tech sector. Whatever the reasons, it seems that a significant ratio of patrons are coming to the library to use the wi-fi and the workspace.

I recently collected data for a space-use analysis at my library, and found that patrons who used our library for computing with personal devices outnumbered browsers, readers, and public computer users 3:1. During the space-use survey, I noted that whenever our library classrooms are not being used for a class, they’re peopled with multiple users who “camp” there, working for 2 – 4 hours at a time. The ways in which these more recently constructed rooms differ from the space design in the rest of the 107-year-old building offer a way into thinking about future improvements. Below are a few considerations that may support independent computer users and e-commuters in the library space.

Ergonomic Conditions

Furnish work spaces with chairs designed to provide lumbar support and encourage good posture, as well as tables that match the chairs in terms of height ratio to prevent wrist- and shoulder-strain.

Adequate Power

A place to plug in at each surface allows users to continue working for long periods. It’s important to consider not only the number of outlets, but their position: cords stretched across spaces between tables and walls could result in browsers tripping, or knocking laptops off a table.

Reliable Wireless Signal

It goes without saying that telecommuters need the tele– to do their commuting. Fast, reliable wi-fi is a must-have.

Concentration-Inducing Environment

If possible, a library’s spaces should be well-defined, with areas for users to meet and talk, and areas of quiet where users can focus on their work without interruption. Sound isn’t the only environmental consideration. A building that’s too hot or too cold can be distracting. High-traffic areas — such as spaces near doors, teens’ and children’s areas, or service desks — aren’t the best locations for study tables.

Relaxed Rules

This is a complex issue; it’s not easy to strike a balance. For instance, libraries need to protect community resources — especially the expensive electronic ones like wiring — from spills; but we don’t want our patrons to dehydrate themselves while working in the library! At our library, we compromise and allow beverages, as long as those beverages have a closed lid: travel mugs, yes; to-go cups (which have holes that can’t be sealed), no.

As library buildings evolve to accommodate digital natives and those whose workplaces have no walls, it’s important to keep in mind the needs of these library users and remix existing spaces to be useful for all of our patrons, whether they’re visiting for business or for pleasure.


Do you have more ideas to create useful space for patrons who bring their own tech to the library? Any issues you’ve encountered? How have you met those challenges?


2018 Evergreen International Conference – Host Site Selected / Evergreen ILS

The 2018 Evergreen Conference Site Selection Committee has chosen the host and venue for the 2018 conference. The MOBIUS consortium will be our 2018 conference host, and St. Charles, Missouri will be the 2018 location. Conference dates are to be determined.

Congratulations, MOBIUS!  

LITA Personas Task Force / LITA

Coming soon to the LITA blog: the results of the LITA Personas Task Force. The initial report contains a number of useful persona types and was submitted to the LITA Board at the ALA Midwinter 2017 conference. Look for reports on the process and each of the persona types here on the LITA blog starting in March 2017.

As a preview, go behind the scenes with this short podcast presented as part of the LibUX Podcast series, on the free tools the Task Force used to do their work.

Metric: A UX Podcast
Metric is a #libux podcast about #design and #userExperience. Designers, developers, librarians, and other folks join @schoeyfield and @godaisies to talk shop.

The work of the LITA Personas Task Force

In this podcast, Amanda L. Goodman (@godaisies) gives you a peek into the work of the LITA Personas Task Force, which is charged with defining and developing personas to be used in growing membership in the Library and Information Technology Association.

The ten members of the task force came from academic, public, corporate, and special libraries located in different time zones. Given those challenges, the Task Force had to use collaborative tools that were easy for everyone to use. Task Force member Amanda L. Goodman originally presented this podcast on LibUX’s Metric podcast.

How could a global public database help to tackle corporate tax avoidance? / Open Knowledge Foundation

A new research report published today looks at the current state and future prospects of a global public database of corporate accounts.

Shipyard of the Dutch East India Company in Amsterdam, 1750. Wikipedia.

The multinational corporation has become one of the most powerful and influential forms of economic organisation in the modern world. Emerging at the bleeding edge of colonial expansion in the seventeenth century, entities such as the Dutch and British East India Companies required novel kinds of legal, political, economic and administrative work to hold their sprawling networks of people, objects, resources, activities and information together across borders. Today it is estimated that over two thirds of the world’s hundred biggest economic entities are corporations rather than countries.

Our lives are permeated by and entangled with the activities and fruits of these multinationals. We are surrounded by their products, technologies, platforms, apps, logos, retailers, advertisements, publications, packaging, supply chains, infrastructures, furnishings and fashions. In many countries they have assumed the task of supplying societies with water, food, heat, clothing, transport, electricity, connectivity, information, entertainment and sociality.

We carry their trackers and technologies in our pockets and on our screens. They provide us not only with luxuries and frivolities, but the means to get by and to flourish as human beings in the contemporary world. They guide us through our lives, both figuratively and literally. The rise of new technologies means that corporations may often have more data about us than states do – and more data than we have about ourselves. But what do we know about them? What are these multinational entities – and where are they? What do they bring together? What role do they play in our economies and societies? Are their tax contributions commensurate with their profits and activities? Where should we look to inform legal, economic and policy measures to shape their activities for the benefit of society, not just shareholders?

At the moment these questions are surprisingly difficult to answer – at least in part due to a lack of publicly available information. We are currently on the brink of a number of important policy decisions (e.g. at the EU and in the UK) which will have a lasting effect on what we are able to know and how we are able to respond to these mysterious multinational giants.

Image from report on IKEA’s tax planning strategies. Greens/EFA Group in European Parliament.

A wave of high-profile public controversies, mobilisations and interventions around the tax affairs of multinationals followed in the wake of the 2007-2008 financial crisis. Tax justice and anti-austerity activists have occupied high street stores in order to protest multinational tax avoidance. A group of local traders in Wales sought to move their town offshore, in order to publicise and critique legal and accountancy practices used by multinationals. One artist issued fake certificates of incorporation for Cayman Island companies to highlight the social costs of tax avoidance. Corporate tax avoidance came to epitomise an economic globalisation that lacked corresponding democratic societal controls.

This public concern after the crisis prompted a succession of projects from various transnational groups and institutions. The then-G8 and G20 committed to reducing the “misalignment” between the activities and profits of multinationals. The G20 tasked the OECD with launching an initiative dedicated to tackling tax “Base Erosion and Profit Shifting” (BEPS). The OECD BEPS project surfaced different ways of understanding and accounting for multinational companies – including questions such as what they are, where they are, how to calculate where they should pay money, and by whom they should be governed.

For example, many industry associations, companies, institutions and audit firms advocated sticking to the “arms length principle” which would treat multinationals as a group of effectively independent legal entities. On the other hand, civil society groups and researchers called for “unitary taxation”, which would treat multinationals as a single entity with operations in multiple countries. The consultation also raised questions about the governance of transnational tax policy, with some groups arguing that responsibility should shift from the OECD to the United Nations  to ensure that all countries have a say – especially those in the Global South.

Exhibition of Paolo Cirio’s “Loophole for All” in Basel, 2015. Paolo Cirio.

While many civil society actors highlighted the shortcomings and limitations of the OECD BEPS process, they acknowledged that it did succeed in obtaining global institutional recognition for a proposal which had been central to the “tax justice” agenda for the previous decade: “Country by Country Reporting” (CBCR), which would require multinationals to produce comprehensive, global reports on their economic activities and tax contributions, broken down by country. But there was one major drawback: it was suggested that this information should be shared between tax authorities, rather than being made public. Since the release of the OECD BEPS final reports in 2015, a loose-knit network of campaigners has been busy working to make this data public.

Today we are publishing a new research report looking at the current state and future prospects of a global database on the economic activities and tax contributions of multinationals – including who might use it and how, what it could and should contain, the extent to which one could already start building such a database using publicly available sources, and next steps for policy, advocacy and technical work. It also highlights what is involved in the making of data about multinationals, including the social and political processes of classification and standardisation that this data depends on.

New report on why we need a public database on the tax contributions and economic activities of multinational companies

The report reviews several public sources of CBCR data – including from legislation introduced in the wake of the financial crisis. Under the Trump administration, the US is currently in the process of repealing and dismantling key parts of the Dodd-Frank Wall Street Reform and Consumer Protection Act, including Section 1504 on transparency in the extractive industry, which Oxfam recently described as the “brutal loss of 10 years of work”. Some of the best available public CBCR data is generated as a result of the European Capital Requirements Directive IV (CRD IV), which gives us an unprecedented (albeit often imperfect) series of snapshots of multinational financial institutions with operations in Europe. Rapporteurs at the European Parliament just published an encouraging draft in support of making country-by-country reporting data public.

While the longer term dream for many is a global public database housed at the United Nations, until this is realised civil society groups may build their own. As well as being used as an informational resource in itself, such a database could be seen as a form of “data activism” to change what public institutions count – taking a cue from citizen and civil society data projects that take the measure of issues people care about, from migrant deaths to police killings, literacy rates, water access or fracking pollution.

A civil society database could play another important role: it could be a means to facilitate the assembly and coordination of different actors who share an interest in the economic activities of multinationals. It would thus be not only a source of information, but also a mechanism for organisation – allowing journalists, researchers, civil society organisations and others to collaborate around the collection, verification, analysis and interpretation of this data. In parallel to ongoing campaigns for public data, a civil society database could thus be viewed as a kind of democratic experiment opening up space for public engagement, deliberation and imagination around how the global economy is organised, and how it might be organised differently.

In the face of an onslaught of nationalist challenges to political and economic world-making projects of the previous century – not least through the “neoliberal protectionism” of the Trump administration – supporting the development of transnational democratic publics with an interest in understanding and responding to some of the world’s biggest economic actors is surely an urgent task.

Launched in 2016 with a grant from Omidyar Network and the FTC, and coordinated by TJN and OKI, Open Data for Tax Justice is a project to create a global network of people and organisations using open data to improve advocacy, journalism and public policy around tax justice. More details about the project and its members can be found on the project website.

This piece is cross-posted at OpenDemocracy.

Security releases: OpenSRF 2.4.2 and 2.5.0-alpha2, Evergreen 2.10.10, and Evergreen 2.11.3 / Evergreen ILS

OpenSRF 2.4.2 and 2.5.0-alpha2, Evergreen 2.10.10, and Evergreen 2.11.3 are now available. These are security releases; the Evergreen and OpenSRF developers strongly urge users to upgrade as soon as possible.

The security issue fixed in OpenSRF has to do with how OpenSRF constructs keys for use by memcached; under certain circumstances, attackers could exploit the issue to perform denial-of-service and authentication-bypass attacks against Evergreen systems. Users of OpenSRF 2.4.1 and earlier should upgrade to OpenSRF 2.4.2 right away, while testers of OpenSRF 2.5.0-alpha should upgrade to 2.5.0-alpha2.

If you are currently using OpenSRF 2.4.0 or later, you can update an Evergreen system as follows:

  • Download OpenSRF 2.4.2 and follow its installation instructions up to and including the make install and chown -R opensrf:opensrf /<PREFIX> steps.
  • Restart Evergreen services using osrf_control.
  • Restart Apache.

If you are running a version of OpenSRF older than 2.4.0, you will also need to perform the make and make install steps in Evergreen prior to restarting services.
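Concretely, the steps above might look like the following on a typical system. This is a hedged sketch only: the install prefix, configure flags, and Apache service name are common defaults rather than instructions from the release announcement, so follow the official installation documentation for your site.

```shell
# Illustrative upgrade for sites already on OpenSRF 2.4.0/2.4.1.
# Paths and service names are assumptions; adjust for your installation.
tar xzf opensrf-2.4.2.tar.gz
cd opensrf-2.4.2
./configure --prefix=/openils --sysconfdir=/openils/conf
make
sudo make install
sudo chown -R opensrf:opensrf /openils

# Restart Evergreen services, then Apache.
osrf_control --localhost --restart-all
sudo systemctl restart apache2
```

Sites on OpenSRF older than 2.4.0 would additionally re-run make and make install in their Evergreen source tree before restarting services, as noted above.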

Please visit the OpenSRF download page to retrieve the latest releases and consult the release notes.

The security issue fixed in Evergreen 2.10.10 and 2.11.3 affects users of the Stripe credit card payment processor and entails the possibility of attackers gaining access to your Stripe credentials. Users of Evergreen 2.10.x and 2.11.x can simply upgrade as normal, but if you are running Evergreen 2.9.x or earlier, or if you cannot perform a full upgrade right away, you can apply the fix by running the following two SQL statements in your Evergreen database:

UPDATE config.org_unit_setting_type
    SET view_perm = (SELECT id FROM permission.perm_list
        WHERE code = 'VIEW_CREDIT_CARD_PROCESSING' LIMIT 1)
    WHERE name LIKE 'credit.processor.stripe%' AND view_perm IS NULL;

UPDATE config.org_unit_setting_type
    SET update_perm = (SELECT id FROM permission.perm_list
        WHERE code = 'ADMIN_CREDIT_CARD_PROCESSING' LIMIT 1)
    WHERE name LIKE 'credit.processor.stripe%' AND update_perm IS NULL;

In addition, Evergreen 2.10.10 has the following fixes since 2.10.9:

  • A fix to correctly apply floating group settings when performing no-op checkins.
  • A fix to the HTML coding of the temporary lists page.
  • A fix for a problem where certain kinds of requests for information about the organizational unit hierarchy could consume all available open-ils.cstore backends.
  • A fix to allow staff to use the place another hold link without running into a user interface loop.
  • A fix to the Edit Due Date form in the web staff client.
  • A fix to sort billing types and non-barcoded item types in alphabetical order in the web staff client.
  • A fix to the return to grouped search results link in the public catalog.
  • A fix to allow pre-cat checkouts in the web staff client without requiring a circulation modifier.
  • Other typo and documentation fixes.

Evergreen 2.11.3 has the following additional fixes since 2.11.2:

  • A fix to correctly apply floating group settings when performing no-op checkins.
  • An improvement to the speed of looking up patrons by their username; this is particularly important for large databases.
  • A fix to properly display the contents of temporary lists (My List) in the public catalog, as well as a fix of the HTML coding of that page.
  • A fix to the Spanish translation of the public catalog that could cause catalog searches to fail.
  • A fix for a problem where certain kinds of requests for information about the organizational unit hierarchy could consume all available open-ils.cstore backends.
  • A fix to allow staff to use the place another hold link without running into a user interface loop.
  • A fix to the Edit Due Date form in the web staff client.
  • A fix to the definition of the stock Full Overlay merge profile.
  • A fix to sort billing types in alphabetical order in the web staff client.
  • A fix to the display of the popularity score in the public catalog.
  • A fix to the return to grouped search results link in the public catalog.
  • A fix to allow pre-cat checkouts in the web staff client without requiring a circulation modifier.
  • A fix to how Action/Trigger event definitions with nullable grouping fields handle null values.
  • Other typo and documentation fixes.

Please visit the Evergreen download page to retrieve the latest releases and consult the release notes.

New amicus briefs on old copyright cases / District Dispatch

The American Library Association (ALA), as a member of the Library Copyright Alliance (LCA), joined amicus briefs on Monday in support of two landmark copyright cases on appeal.

iPad e-book demo on computer desk

Photo credit: Anita Hart, flickr

The first (pdf) is the Georgia State University (GSU) case—yes, that one—arguing that GSU’s e-reserves service is a fair use. The initial complaint was brought back in 2008 by three academic publishers and has been bankrolled by the Copyright Clearance Center and the Association of American Publishers ever since.
Appeals and multiple requests for injunction from the publishers have kept this case alive for eight years. (The long history of the ins and outs of these proceedings can be found here, and the briefs filed by the Library Copyright Alliance (LCA) can be found here.) Most recently, in March 2016, a federal appeals court ruled in GSU’s favor and many thought that would be the end of the story. The publishers appealed again, however, demanding in part that the court conduct a complicated market effect analysis and reverse its earlier ruling.

While not parties to the case, LCA and co-author the Electronic Frontier Foundation (EFF) make three principal points in their “friend of the court” (or “amicus”) brief:

  • First, they note that GSU’s e-reserve service is a fair use of copyrighted material purchased by its library, underscoring that the service was modeled on a broad consensus of best practices among academic libraries.
  • Second, and more technically, the brief explains why the district court, in weighing the “nature of the use” factor of fair use, should have considered that the faculty and researchers who wrote most of the works involved intended them to be disseminated broadly.
  • Third, and finally, the brief addresses the fourth factor of the statutory fair use test: the effect of the material’s use on the market for the copyrighted work.

Libraries and EFF note that the content loaned by GSU through its e-reserve service is produced by faculty compensated with state funds. Accordingly, they contend, “A ruling against fair use in this case will create a net loss to the public by suppressing educational uses, diverting scarce resources away from valuable educational investments, or both. This loss will not be balanced by any new incentive for creative activity.”

digital audio icon

Photo credit: Pixabay

The second amicus brief just filed by ALA and its LCA allies, another defense of fair use, was prepared and filed in conjunction with the Internet Archive on behalf of ReDigi in its ongoing litigation with Capitol Records. ReDigi is an online business that provides a cloud storage service capable of identifying lawfully acquired music files. Through ReDigi, the owner of the music file can electronically distribute it to another person. When they do, however, the ReDigi service is built to automatically and reliably delete the sender’s original copy. ReDigi originally maintained that this “one copy, one user” model and its service should have been considered legal under the “first sale doctrine” in U.S. copyright law. That’s the statutory provision which allows libraries to lend copies that they’ve lawfully acquired or any individual to, for example, buy a book or DVD and then resell or give it away. Written long before materials became digital, however, that part of the Copyright Act refers only to tangible (rather than electronic) materials. The Court thus originally rejected ReDigi’s first sale doctrine defense.

In their new amicus brief on ReDigi’s appeal, LCA revives and refines an argument it first made back in 2000, when ReDigi’s automatic delete-on-transfer technology did not exist: namely, that digital first sale would foster more innovative library services and, for that and other reasons, should be viewed as a fair use that is appropriate in some circumstances.

With the boundaries of fair use or first sale unlikely to be productively changed in Congress, ALA and its library and other partners will continue to participate in potentially watershed judicial proceedings like these.

The post New amicus briefs on old copyright cases appeared first on District Dispatch.

Look Back, Move Forward: librarians combating misinformation / District Dispatch

Librarians across the field have always been dedicated to combating misinformation. TBT to 1987, when the ALA Council passed the “Resolution on Misinformation to Citizens” on July 1 in San Francisco, California. (The resolution is also accessible via the American Library Association Institutional Repository here.)

Resolution on Misinformation to Citizens, passed on July 1, 1987, in San Francisco, California.

Resolution on Misinformation to Citizens, passed on July 1, 1987, in San Francisco, California.

In response to the recent dialogue on fake news and news literacy, the ALA Intellectual Freedom Committee crafted the “Resolution on Access to Accurate Information,” adopted by Council on January 24.

Librarians have always helped people sort reliable sources from unreliable ones. Here are a few resources to explore:

  • IFLA’s post on “Alternative Facts and Fake News – Verifiability in the Information Society”
  • Indiana University East Campus Library’s LibGuide, “Fake News: Resources”
  • Drexel University Libraries’ LibGuide, “Fake News: Source Evaluation”
  • Harvard Library’s LibGuide, “Fake News, Misinformation, and Propaganda”
  • ALA Office for Intellectual Freedom’s “Intellectual Freedom News,” a free biweekly compilation of news related to (among other things!) privacy, internet filtering and censorship.
  • This Texas Standard article on the “CRAAP” (Currency, Relevance, Authority, Accuracy & Purpose) test.

If you are working on or have encountered notable “fake news” LibGuides, please post links in the comments below!

The post Look Back, Move Forward: librarians combating misinformation appeared first on District Dispatch.

Upcoming Evergreen and OpenSRF security releases / Evergreen ILS

Later today we will be releasing security updates for Evergreen and OpenSRF. We recommend that Evergreen users be prepared to install them as soon as possible.

The Evergreen security issue only affects users of a certain credit card payment processor, and the fix can be implemented by running two SQL statements; a full upgrade is not required.

The OpenSRF security issue is more serious and can be used by attackers to perform a denial of service attack and potentially bypass standard authentication.  Consequently, we recommend that users upgrade to OpenSRF 2.4.2 as soon as it is released.

If you are currently using OpenSRF 2.4.0 or OpenSRF 2.4.1, the upgrade will consist of the following steps:

  • downloading and compiling OpenSRF 2.4.2
  • running the ‘make install’ step
  • restarting Evergreen services

If you are currently running a version of OpenSRF that is older than 2.4.0, we strongly recommend upgrading to 2.4.2; note that it will also be necessary to recompile Evergreen.

There will also be a second beta release of OpenSRF 2.5 that will include the security fix.

Postel's Law again / David Rosenthal

Eight years ago I wrote:
In RFC 793 (1981) the late, great Jon Postel laid down one of the basic design principles of the Internet, Postel's Law or the Robustness Principle:
"Be conservative in what you do; be liberal in what you accept from others."
It's important not to lose sight of the fact that digital preservation is on the "accept" side of Postel's Law.
Recently, discussion on a mailing list I'm on focused on the downsides of Postel's Law. Below the fold, I try to explain why most of these downsides don't apply to the "accept" side, which is the side that matters for digital preservation.

Two years after my post, Eric Allman wrote The Robustness Principle Reconsidered, setting out the reasons why Postel's Law isn't an unqualified boon. He writes that Postel's goal was interoperability:
The intent of the Robustness Principle was to maximize interoperability between network service implementations, particularly in the face of ambiguous or incomplete specifications. If every implementation of some service that generates some piece of protocol did so using the most conservative interpretation of the specification and every implementation that accepted that piece of protocol interpreted it using the most generous interpretation, then the chance that the two services would be able to talk with each other would be maximized.
In recent years, however, that principle has been challenged. This isn't because implementers have gotten more stupid, but rather because the world has become more hostile. Two general problem areas are impacted by the Robustness Principle: orderly interoperability and security.
Allman argues, based on his experience with SMTP and Kirk McKusick's with NFS, that interoperability arises in one of two ways, the "rough consensus and running code" that characterized NFS (and TCP), or from detailed specifications:
the specification may be ambiguous: two engineers build implementations that meet the spec, but those implementations still won't talk to each other. The spec may in fact be unambiguous but worded in a way that some people misinterpret. ... The specification may not have taken certain situations (e.g., hardware failures) into account, which can result in cases where making an implementation work in the real world actually requires violating the spec. ... the specification may make implicit assumptions about the environment (e.g., maximum size of network packets supported by the hardware or how a related protocol works), and those assumptions may be incorrect or the environment may change. Finally, and very commonly, some implementers may find a need to enhance the protocol to add new functionality that isn't defined by the spec.
His arguments here are very similar to those I made in Are format specifications important for preservation?:
I'm someone with actual experience of implementing a renderer for a format from its specification. Based on this, I'm sure that no matter how careful or voluminous the specification is, there will always be things that are missing or obscure. There is no possibility of specifying formats as complex as Microsoft Office's so comprehensively that a clean-room implementation will be perfect. Indeed, there are always minor incompatibilities (sometimes called enhancements, and sometimes called bugs) between different versions of the same product.
The "rough consensus and running code" approach isn't perfect either. As Allman relates, it takes a lot of work to achieve useful interoperability:
The original InterOp conference was intended to allow vendors with NFS (Network File System) implementations to test interoperability and ultimately demonstrate publicly that they could interoperate. The first 11 days were limited to a small number of engineers so they could get together in one room and actually make their stuff work together. When they walked into the room, the vendors worked mostly against only their own systems and possibly Sun's (since as the original developer of NFS, Sun had the reference implementation at the time). Long nights were devoted to battles over ambiguities in the specification. At the end of those 11 days the doors were thrown open to customers, at which point most (but not all) of the systems worked against every other system.
The primary reason is that even finding all the corner cases is difficult, and so is deciding for each whether the sender needs to be more conservative or the receiver needs to be more liberal.

The security downside of Postel's Law is even more fundamental. The law requires the receiver to accept, and do something sensible with, malformed input. Doing something sensible will almost certainly provide an attacker with the opportunity to make the receiver do something bad.

An example is in encrypted protocols such as SSL. They typically provide for the initiator to negotiate with the receiver the specifics of the encryption to be used. Liberal receivers can be negotiated down to the use of an obsolete algorithm, vitiating the security of the conversation. Allman writes:
Everything, even services that you may think you control, is suspect. It's not just user input that needs to be checked—attackers can potentially include arbitrary data in DNS (Domain Name System) results, database query results, HTTP reply codes, you name it. Everyone knows to check for buffer overflows, but checking incoming data goes far beyond that.
Security appears to demand receivers be extremely conservative, but that would kill off interoperability; Allman argues that a balance between these conflicting goals is needed.
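The tension Allman describes can be sketched in miniature. Below is a hypothetical toy protocol of my own (the "key=value;" format and the function names are purely illustrative, not from any source discussed here): a liberal receiver salvages what it can from malformed input, maximizing interoperability, while a conservative receiver rejects any damage outright, the security-minded stance.

```python
# Hypothetical sketch: two receivers for a toy "key=value;key=value" protocol.
# The liberal receiver embodies Postel's "accept" side; the conservative
# receiver trades interoperability for strictness.

def parse_liberal(message: str) -> dict:
    """Keep every well-formed pair, silently skipping damaged fields."""
    result = {}
    for field in message.split(";"):
        if "=" in field:
            key, _, value = field.partition("=")
            if key.strip():
                result[key.strip()] = value.strip()
    return result

def parse_conservative(message: str) -> dict:
    """Raise on the first malformed field instead of guessing."""
    result = {}
    for field in message.split(";"):
        key, sep, value = field.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed field: {field!r}")
        result[key.strip()] = value.strip()
    return result

msg = "title=Report; author=Lee; oops; date=2017"
print(parse_liberal(msg))  # the "oops" field is silently dropped
```

The liberal parser returns three usable fields from the damaged message; the conservative parser refuses the whole thing. Neither behaviour is unconditionally right, which is exactly the balance at issue.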

Ingest and dissemination in digital preservation are more restricted cases of both interoperability and security. As regards interoperability:
  • Ingest is concerned with interoperability between the archive and the real world. As digital archivists we may be unhappy that, for example, one of the consequences of Postel's Law is that in the real world almost none of the HTML conforms to the standard. But our mission requires that we observe Postel's Law and not act on this unhappiness. It would be counter-productive to go to websites and say "if you want to be archived you need to clean up your HTML".
  • Dissemination is concerned with interoperability between the archive and an eventual reader's tools. Traditionally, format migration has been the answer to this problem, whether preemptive or on-access. More recently, emulation-based strategies such as Ilya Kreymer's avoid the problem of maintaining interoperability through time by reconstructing a contemporaneous environment.
As regards security:
  • Ingest. In the good old days, when Web archives simply parsed the content they ingested to find the links, the risk to their ingest infrastructure was minimal. But now that the Web has evolved from inter-linked static documents into a programming environment, the risk to the ingest infrastructure from executing the content is significant. Precautions are needed, such as sandboxing the ingest systems.
  • Dissemination. Many archives attempt to protect future readers by virus-scanning on ingest. But, as I argued in Scary Monsters Under The Bed, this is likely to be both ineffective and counter-productive. As digital archivists we may not like the fact that the real world contains malware, but our mission requires that we not deprive future scholars of the ability to study it. Optional malware removal on access is a suitable way to mitigate the risk to scholars not interested in malware (cf. the Internet Archive's Malware Museum).
Thus, security considerations for digital preservation systems should not focus on being conservative by rejecting content for suspected malware, but instead focus on taking reasonable precautions so that content can be accepted despite the possibility that some might be malicious.
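A minimal illustration of the parse-don't-execute approach to ingest (my own sketch, not any particular crawler's code): treat ingested HTML purely as data, extracting its links without ever running the scripts it contains. Python's lenient html.parser is itself a pleasantly Postel-ish receiver:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect href/src attributes by parsing markup only --
    the ingested content is treated as data, never executed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)

# Even a page carrying a malicious script is safe to ingest this way:
# the script's URL is recorded, but the script is never evaluated.
page = '<a href="https://example.org/a"><script src="evil.js"></script></a>'
extractor = LinkExtractor()
extractor.feed(page)
# extractor.links -> ['https://example.org/a', 'evil.js']
```

The content (including any malware) is preserved intact for future scholars; only the archive's own handling of it stays inert.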

Mapping open data governance models: Who makes decisions about government data and how? / Open Knowledge Foundation

Different countries have different models to govern and administer their open data activities. Ana Brandusescu, Danny Lämmerhirt and Stefaan Verhulst call for a systematic and comparative investigation of the different governance models for open data policy and publication.

The Challenge

An important value proposition behind open data involves increased transparency and accountability of governance. Yet little is known about how open data itself is governed. Who decides, and how? How accountable are data holders to both the demand side and policy makers? How do data producers and other actors assure the quality of government data? Who, if anyone, are the data stewards within government tasked with making its data open?

Getting a better understanding of open data governance is not only important from an accountability point of view. With better insight into the diversity of decision-making models and structures across countries, the implementation of common open data principles, such as those advocated by the International Open Data Charter, can be accelerated.

In what follows, we seek to develop the initial contours of a research agenda on open data governance models. We start from the premise that different countries have different models to govern and administer their activities – in short, different ‘governance models’. Some countries are more devolved in their decision making, while others organize “public administration” activities more centrally. These governance models clearly shape how open data is governed – producing a broad patchwork of open data governance arrangements across the world and making it difficult to identify who the open data decision makers, gatekeepers or stewards are within a given country.

For example, if one wants to accelerate the opening up of education data across borders, in some countries this may fall under the authority of sub-national government (such as states, provinces, territories or even cities), while in other countries education is governed by central government or implemented through public-private partnership arrangements. Similarly, transportation or water data may be privatised in some countries, while in others they may be the responsibility of municipal or regional government. Responsibilities are therefore often distributed across administrative levels and agencies, affecting how (open) government data is produced and published.

Why does this research matter? Why now?

A systematic and comparative investigation of the different governance models for open data policy and publication has been missing to date. To steer the open data movement toward its next phase of maturity, there is an urgent need to understand these governance models and their role in open data policy and implementation.

For instance, the International Open Data Charter states that government data should be “open by default” across entire nations. But the variety of governance systems makes it hard to understand the different levers that could be used to enable nationwide publication of open government data by default. Who effectively holds the power to decide what gets published and what does not? By identifying the strengths and weaknesses of governance models, the global open data community (along with the Open Data Charter) and governments can work together to identify the most effective ways to implement open data strategies and to understand what works and what doesn’t.

In the next few months we will seek to increase our comparative understanding of the mechanisms of decision making as they relate to open data within and across government, and to map the relationships between data holders, decision makers, data producers, data quality assurance actors, data users, and gatekeepers or intermediaries. This may provide insights into how to improve the open data ecosystem by learning from others.

Additionally, our findings may identify the “levers” within governance models that can be used to publish government data more openly. And finally, more transparency about who is accountable for open data decisions could allow for a more informed dialogue with other stakeholders on the performance of open government data publication.

We are interested in how different governance models affect open data policies and practices – including the implementations of global principles and commitments. We want to map the open data governance process and ecosystem by identifying the following key stakeholders, their roles and responsibilities in the administration of open data, and seeking how they are connected:

  • Decision makers – Who leads/asserts decision authority on open data in meetings, procedures, conduct, debate, voting and other issues?
  • Data holders – Which organizations / government bodies manage and administer data?
  • Data producers – Which organizations / government bodies produce what kind of public sector information?
  • Data quality assurance actors – Who are the actors ensuring that produced data adhere to certain quality standards, and does this role conflict with publication as open data?
  • Data gatekeepers/stewards – Who controls open data publication?

We plan to research the governance approaches to the following types of data:

  • Health: mortality and survival rates, levels of vaccination, levels of access to health care, waiting times for medical treatment, spend per admission
  • Education: test scores for pupils in national examinations, school attendance rates, teacher attendance rates
  • National Statistics: population, GDP, unemployment
  • Transportation: times and stops of public transport services – buses, trains
  • Trade: import and export of specific commodities, balance of trade data against other countries
  • Company registers: list of registered companies in the country, shareholder and beneficial ownership information, lobbying register(s) with information on companies, associations representatives at parliamentary bodies
  • Legislation: national legal code, bills, transcripts of debates, finances of parties

Output of research

We will use different methods to get rapid insights. These include interviews with stakeholders such as government officials, as well as with open government initiatives from various sectors (e.g. public health services, public education, trade). Interviewees may be open data experts, as well as policymakers or open data champions within government.

The types of questions we will seek to answer, beyond the broad topic of “who is doing what”, include:

  • Who holds power to assert authority over open data publication? What roles do different actors within government play to design policies and to implement them?
  • What forms of governance models can be derived from these roles and responsibilities? Can we see a common pattern of how decision-making power is distributed? How do these governance models differ?
  • What are the criteria to evaluate the “performance” of the observed governance models? How do they, for instance, influence open data policy and implementation?

Call for contributions

We invite all interested in this topic to contribute their ideas and to participate in the design and execution of one or more case studies. Have you done research on this? If so, we would also like to hear from you!

Contact one or all of the authors at:

Ana Brandusescu:

Danny Lämmerhirt:

Stefaan Verhulst:

Benchmarks and Heuristics Reports / LibUX

The user experience audit is the core deliverable from the UX bandwagon if you don’t code or draw. It has real measurable value, but it also represents the lowest barrier to entry for aspirants. Code and visual design work have baked-in quality indicators: good code works, and you just know good design when you see it, in the same way Justice Stewart was able to gauge obscenity.

 I shall not today attempt further to define the kinds of material I understand to be embraced within that shorthand description [“hard-core pornography”], and perhaps I could never succeed in intelligibly doing so. But I know it when I see it, and the motion picture involved in this case is not that.

Audits, though, aren’t so privileged. Look for audit templates or how-to’s — we even have one here on LibUX — and you’ll find the practice is inconsistent across the discipline.

In part, they suffer from the same flaw inherent to user experience design in general in that nobody can quite agree on just what user experience audits do.

It’s an ambiguity that extends across the table.

As a term, the “user experience audit” fails to describe its value to client stakeholders. There is no clear return in paying for an “audit” beyond the promise that red flags will turn up under scrutiny. And precisely because the value of performing an audit requires explanation, winning the opportunity now relies on the art of the pitch rather than the expertise of the service you provide.

It boils down to a semantic problem.

That’s all preamble for this: this weekend, my partnership with a library association came to an end – capped by the delivery of a benchmarks and heuristics report, which was a service I was able to up-sell in addition to my original scope of involvement. I don’t think I could have sold a “user experience audit.”

Instead, I offered to report on the accessibility, performance, and usability of their service in order to establish benchmarks on which to build moving forward. This creates an objective-ish reference that they or future consultants can use in future decision-making. Incremental improvements in any of these areas have an all-ships-rising-with-the-tide effect, but with this report — I say — we will be able to identify which opportunities offer the most bang for the buck.

So, okay. It’s semantics. But this little bit of wordsmithing makes an important improvement: “benchmarks and heuristics” actually describes the content of the audit. This makes it easier to convince stakeholders it’s no report card – but a decision-making tool that empowers the organization.

My template

I use a simple template. I tweak, add, and remove sections depending on the scope of the project, but I think the arrangement holds up. There is a short cover letter followed by an overview summarizing the whole shebang. I make it conversational, and try to answer the question stakeholders paid me to answer: how do we stand, and what should we do next? The rest of the report is evidence to support my advice.

Benchmarks are quantitative scores, informed by data or programmatic audits, that show the organization where it stands in relation to law, best practice, or the competition. You can run objective-ish numbers from user research as long as they adhere to some system — like net promoter scores or system usability scales — but in my experience the report is best organized from the inarguable down to old-fashioned gut feelings.
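The System Usability Scale is a good example of why these numbers feel objective-ish: it reduces ten 1–5 Likert responses to a single 0–100 benchmark with a fixed formula, so the score is reproducible even though the judgments behind it are soft. A quick sketch in Python:

```python
def sus_score(responses):
    """Compute a System Usability Scale score (0-100) from ten
    1-5 Likert responses, item 1 first.

    Odd-numbered items contribute (score - 1); even-numbered items
    contribute (5 - score); the summed contributions are scaled by 2.5.
    """
    assert len(responses) == 10, "SUS requires exactly ten responses"
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5

# A completely neutral respondent (all 3s) lands at exactly 50.
print(sus_score([3] * 10))  # 50.0
```

Averaged across respondents, that single figure makes a stable benchmark to measure later redesigns against.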

Programmatic audits border on the “inarguable” here. You’re either Section 508 compliant or you’re not. These are validation scans for accessibility, performance, or security, which can — when there’s something wrong — identify the greatest opportunities for improvement. I attach the full results of each audit in the appendix and explain my method. Then, I devote the white space to describing the findings like you would over coffee.
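To show just how mechanical these scans are (a toy sketch of mine, not a substitute for a real validator), here is a check for one accessibility basic — images published without alt text:

```python
from html.parser import HTMLParser

class AltTextAudit(HTMLParser):
    """A toy programmatic accessibility check: count <img> tags
    that lack an alt attribute entirely."""

    def __init__(self):
        super().__init__()
        self.missing_alt = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img" and "alt" not in dict(attrs):
            self.missing_alt += 1

audit = AltTextAudit()
audit.feed('<img src="a.png" alt="chart"><img src="b.png">')
print(audit.missing_alt)  # 1
```

Either the count is zero or it isn’t — which is what makes this kind of finding so easy to defend to stakeholders.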

Anticipate and work the answers to these questions into your writeup:

  • Is this going to cost me [the stakeholder] money, business, credibility, or otherwise hurt me sometime down the road if I don’t fix it?
  • What kind of involvement, cost, consideration, and time does it take to address?
  • What would you [the expert] recommend if you had your druthers?
  • What is the least I could do or spend to assuage the worst of it?

I follow benchmarks with liberally linked-up heuristics and other research findings as they veer further into opinion, and the more opinionated each section becomes, the more I put into their presentation: embed gifs or link out to unlisted YouTube videos of the site in action, use screenshots, pop-in analytics charts or snippets from external spreadsheets like their content audit — or even audio clips from a user chatting about a feature.

Wait, audio? I’m not really carrying podcast equipment everywhere. Sometimes, I’ll put the website or prototype up and ask five to ten people to perform a short task, then I’ll use the video or audio to — let’s say — prove a point about the navigation.

The more qualitative data I can use to support a best practice or opinion, the better I feel. I don’t actually believe that folks who reach out to me for this kind of stuff are looking for excuses to pshaw my work, but I’m a little insecure about it.

Anyway, your mileage may vary, but I thought I’d show you the basic benchmarks and heuristics report template I fork and start with each time. It might help if you don’t know where to start.


TIMTOWTDI / Galen Charlton

The Internet Archive had this to say earlier today:

This was in response to the MacArthur Foundation announcing that the IA is a semifinalist for a US $100 million grant; they propose to digitize 4 million books and make them freely available.

Well and good, if they can pull it off — though I would love to see the detailed proposal — and the assurance that this whole endeavor is not tied to the fortunes of a single entity, no matter how large.

But for now, I want to focus on the rather big bus that the IA is throwing “physical libraries” under. On the one hand, their statement is true: access to libraries is neither completely universal nor completely equitable. Academic libraries are, for obvious reasons, focused on the needs of their host schools; the independent researcher or simply the citizen who wishes to be better informed will always be a second-class user. Public libraries are neither evenly distributed nor evenly funded. Both public and academic libraries struggle with increasing demands on their budgets, particularly with respect to digital collections. Despite the best efforts of librarians, underserved populations abound.

Increasing access to digital books will help — no question about it.

But it won’t fundamentally solve the problem of universal and equitable service. What use is the Open Library to somebody who has no computer, no decent smartphone, an inadequate data plan, or uncertain knowledge of how to use the technology? (Of course, a lot of physical libraries offer technology training.)

I will answer the IA’s overreach into technical messianism with another bit of technical lore: TIMTOWTDI.

There Is More Than One Way To Do It.

I program in Perl, and I happen to like TIMTOWTDI—but as a principle guiding the design of programming languages, it’s a matter of taste and debate: sometimes there can be too many options.

However, I think TIMTOWTDI can be applied as a rule of thumb in increasing social justice:

There Is More Than One Way To Do It… and we need to try all of them.

Local communities have local needs. Place matters. Physical libraries matter—both in themselves and as a way of reinforcing technological efforts.

Technology is not universally available. It is not available equitably. The Internet can route around certain kinds of damage… but big, centralized projects are still vulnerable. Libraries can help mitigate some of those risks.

I hope the Internet Archive realizes that they are better off working with libraries — and not just acting as a bestower of technological solutions that may help, but will not by themselves solve the problem of universal, equitable access to information and entertainment.