Planet Code4Lib

Retracted item notifications with Retraction Watch integration / Zotero

Zotero can now help you avoid relying on retracted publications in your research by automatically checking your database and documents for works that have been retracted. We’re providing this service in partnership with Retraction Watch, which maintains the largest database of retractions available, and we’re proud to help sustain their important work.

How It Works

Retracted publications are flagged in the items list, and if you click on one you’ll see a warning at the top of the item pane with details on the retraction and links to additional information.

If you try to cite a retracted item using the word processor plugin, Zotero will warn you and confirm that you still want to cite it. If you’ve already added a citation to a document and the cited work is later retracted, Zotero will warn you the next time you update the document’s citations, even if the item no longer exists in your Zotero library or was added by a co-author.

Currently, this feature is limited to items with a DOI or PMID (entered in the DOI field or in Extra as “DOI:”, “PMID:”, or “PubMed ID:”), which covers about 3/4 of Retraction Watch data, but we’re hoping to support items without identifiers as best as possible in a future update.

Designed for Privacy

The full retraction data is stored on Zotero servers, but we’ve designed this feature in a way that allows the Zotero client to check for retracted items without sharing the contents of your library. You don’t need to use Zotero syncing or upload a list of items to benefit from this feature.

For each item in your library, Zotero calculates a non-unique identifier that could map to hundreds or thousands of publications, and then compares those to a list of similar partial identifiers of retracted publications that it retrieves from Zotero servers. For each potential match, it requests the full details of all possible retractions, and then checks for local items matching any of those full identifiers and flags any that it finds. The Zotero servers have no way of knowing whether you have the retracted work in your library or one of hundreds or thousands of others. (A similar approach is used by some tools to check for compromised passwords without sharing the passwords they’re checking with the server.) And, as with our other services, we’re not logging the contents of even these anonymized lookups.
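The lookup described above can be sketched roughly as follows. This is a minimal illustration, not Zotero's actual implementation: the choice of SHA-1, the five-character prefix length, and the `FakeServer` class are all assumptions made for the example.

```python
import hashlib

PREFIX_LEN = 5  # hypothetical length; short enough that many works share a prefix


def full_id(doi: str) -> str:
    """Full identifier for a work (assumed here to be a SHA-1 of the DOI)."""
    return hashlib.sha1(doi.strip().lower().encode("utf-8")).hexdigest()


def partial_id(doi: str) -> str:
    """Non-unique identifier: only the first few characters of the full one."""
    return full_id(doi)[:PREFIX_LEN]


class FakeServer:
    """Stand-in for the retraction service; it only ever sees prefixes."""

    def __init__(self, retracted_dois):
        self._by_prefix = {}
        for doi in retracted_dois:
            self._by_prefix.setdefault(partial_id(doi), []).append(full_id(doi))

    def prefixes(self):
        """The list of partial identifiers of retracted publications."""
        return set(self._by_prefix)

    def details(self, prefix):
        """Full identifiers of every retraction sharing this prefix."""
        return list(self._by_prefix.get(prefix, []))


def find_retracted(library_dois, server):
    """Flag local items whose full identifier matches a retraction."""
    known_prefixes = server.prefixes()
    flagged = []
    for doi in library_dois:
        prefix = partial_id(doi)
        if prefix not in known_prefixes:
            continue  # no potential match, so nothing is requested
        # The final comparison happens locally, never on the server.
        if full_id(doi) in server.details(prefix):
            flagged.append(doi)
    return flagged


server = FakeServer(["10.1000/retracted.1"])
flagged = find_retracted(["10.1000/ok.1", "10.1000/RETRACTED.1"], server)
```

In this sketch the server receives only a prefix, which many unrelated works could share, and the match against the full identifier is made on the client.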

This feature is available today in Zotero 5.0.67.

Michael Nelson's CNI Keynote: Part 1 / David Rosenthal

Michael Nelson and his group at Old Dominion University have made major contributions to Web archiving. Among them are a series of fascinating papers on the problems of replaying archived Web content. I've blogged about several of them, most recently in All Your Tweets Are Belong To Kannada and The 47 Links Mystery. Nelson's Spring CNI keynote Web Archives at the Nexus of Good Fakes and Flawed Originals (Nelson starts at 05:53 in the video, slides) understandably focuses on recounting much of this important research. I'm a big fan of this work, and there is much to agree with in the rest of the talk.

But I have a number of issues with the big picture Nelson paints. Part of the reason for the gap in posting recently was that I started on a draft that discussed both the big picture issues and a whole lot of minor nits, and I ran into the sand. So I finally put that draft aside and started this one. I tried to restrict myself to the big picture, but despite that it is still too long for a single post. Follow me below the fold for the first part of a lengthy disquisition.


Taken as a whole, Nelson seems to be asking:
Can we take what we see from Web archives at face value?
Nelson is correct that the answer is "No", but there was no need to recount all the detailed problems in his talk to arrive at this conclusion. All he needed to say was:
Web archive interfaces such as the Wayback Machine are Web sites like any other. Nothing that you see on the Web can be taken at face value.
As we have seen recently, an information environment allowing every reader to see different, individually targeted content allows malign actors to mount powerful, sophisticated disinformation campaigns. Nelson is correct to warn that Web archives will become targets of, and tools for, these disinformation campaigns.

I'm not saying that the problems Nelson illustrates aren't significant, and worthy of attention. But the way they are presented seems misleading and somewhat counter-productive. Nelson sees it as a vulnerability that people believe the Wayback Machine is a reliable source for the Web's history. He is right that malign actors can exploit this vulnerability. But people believe what they see on the live Web, and malign actors exploit this too. The reason is that most people's experience of both the live Web and the Wayback Machine is that they are reasonably reliable for everyday use.

The structure I would have used for this talk would have been to ask these questions:
  • Is the Wayback Machine more or less trustworthy as to the past than the live web? Answer: more.
  • How much more trustworthy? Answer: significantly but not completely.
  • How can we make the Wayback Machine, and Web archives generally, more trustworthy? Answer: make them more transparent.

What Is This Talk About?

My first big issue with Nelson's talk is that, unless you pay very close attention, you will think that it is about "web archives" and their problems. But it is actually almost entirely about the problems of replaying archived Web content. The basic problem is that the process of replaying content from Web archives frequently displays pages that never actually existed. Although generally each individual component (Memento) of the replayed page existed on the Web at some time in the past, no-one could ever have seen a page assembled from those Mementos.

Nelson hardly touches on the problems of ingest that force archives to assemble replayed pages from non-contemporaneous sets of Mementos. His discussion of the threats to the collected Mementos between ingest and dissemination is superficial. The difference between these issues and the problems of replay that Nelson describes in depth is that inadequate collection and damage during storage are irreversible whereas replay can be improved through time (Slide 18). Nelson even mentions a specific recent case of fixed replay problems, "zombies" (Slide 30). If more resources could be applied, more of them could be fixed more quickly.

Many of my problems with Nelson's talk would have gone away if, instead of saying "web archives" he had said "web archive replay" or, in most cases "Wayback Machine style replay".

Winston Smith Is In The House!

Winston Smith in "1984" was "a clerk for the Ministry of Truth, where his job is to rewrite historical documents so that they match the current party line". Each stage of digital preservation is vulnerable to attack, but only one to "Winston Smith" attacks:
  • Ingest is vulnerable to two kinds of attack, in which Mementos unrepresentative of the target Web site end up in the archive's holdings at the time of collection:
    • A malign target Web site can detect that the request is coming from an archive and respond with content that doesn't match what a human would have seen. Of course, a simple robots.txt file can prevent archiving completely.
    • A malign "man-in-the-middle" can interpose between the target Web site and the archive, substituting content in the response. This can happen even if the archive's crawler uses HTTPS, via DNS tampering and certificate spoofing.
  • Preservation is vulnerable to a number of "Winston Smith" attacks, in which the preserved contents are modified or destroyed after being ingested but before dissemination is requested. In 2005 we set out a comprehensive list of such attacks in Requirements for Digital Preservation Systems: A Bottom-Up Approach, and ten years later applied them to the CLOCKSS Archive in CLOCKSS: Threats and Mitigations.
  • Dissemination is vulnerable to a number of disinformation attacks, in which the Mementos disseminated do not match those stored Mementos responsive to the user's request. Nelson uses the case of Joy Reid's blog (Slide 33) to emphasize that "Winston Smith" attacks aren't necessary for successful disinformation campaigns. All that is needed is to sow Fear, Uncertainty and Doubt (FUD) as to the veracity of Web archives (Slide 41). In the case of Joy Reid's blog, all that was needed to do this was to misinterpret the output of the Wayback Machine; an attack wasn't necessary.
Nelson then goes on to successfully cast this FUD on Web archives by mentioning the risks of, among others:
  • Insider attacks (Slide 51, Slide 64), which can in principle perform "Winston Smith" rewriting attacks. But because of the way preserved content is stored in WARC files with hashes, this is tricky to do undetectably. An easier insider attack is to tamper with the indexes that feed the replay pipeline so they point to spurious added content instead.
  • Using Javascript to tamper with the replay UI to disguise the source of fake content (Slide 59). Although Nelson shows a proof-of-concept (Slide 61), this attack is highly specific to the replay technology. Using an alternate replay technology, for example, it would be an embarrassing failure.
  • Deepfakes (Slide 56). This is a misdirection on Nelson's part, because deepfakes are an attack on the trustworthiness of the Web, not specifically on Web archives. It is true that they could be used as the content for attacks on Web archives, but there is nothing that Web archives can do to specifically address attacks with deepfake content as opposed to other forms of manipulated content. In Slide 58 Nelson emphasizes that it isn't the archive's job to detect or suppress fake content from the live Web.
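The contrast between the two insider attacks in the first bullet can be shown with a toy model. This is only a sketch under stated assumptions: the records and index are plain dictionaries rather than real WARC or index files, the digest format merely imitates the "sha1:" plus Base32 style commonly seen in WARC headers, and it assumes the stored digests are audited independently of the payloads.

```python
import base64
import hashlib


def payload_digest(payload: bytes) -> str:
    """Digest in the 'sha1:<base32>' style (format assumed for illustration)."""
    raw = hashlib.sha1(payload).digest()
    return "sha1:" + base64.b32encode(raw).decode("ascii")


def make_record(payload: bytes) -> dict:
    """A toy stored record: the payload plus the digest taken at ingest."""
    return {"payload": payload, "digest": payload_digest(payload)}


def record_is_intact(record: dict) -> bool:
    """Recompute the digest and compare it with the stored one."""
    return payload_digest(record["payload"]) == record["digest"]


# Toy archive: a record store plus a replay index mapping URL to record id.
records = {"r1": make_record(b"original page")}
index = {"http://example.com/": "r1"}

# Attack 1: rewrite stored content in place; the stale digest exposes it.
rewritten = dict(records["r1"], payload=b"rewritten page")

# Attack 2: add a new, internally consistent record and repoint the index.
records["r2"] = make_record(b"spurious page")
tampered_index = {**index, "http://example.com/": "r2"}
replayed = records[tampered_index["http://example.com/"]]
```

Here `record_is_intact(rewritten)` is false, while `record_is_intact(replayed)` is true: a per-record hash check says nothing about whether the index points at the right record.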
In Slide 62, Nelson points to a group of resources from two years ago including Jack Cushman and Ilya Kreymer's Thinking like a hacker: Security Considerations for High-Fidelity Web Archives and my post about it. He says "fixing this, preventing web archives being an attack vector, is going to be a great deal of work". It isn't "going to be", it is an ongoing effort. At least some of the attacks identified two years ago have already been addressed; for example the Wayback Machine uses the Content-Security-Policy header. Clearly, like all Web technologies, vulnerabilities in Web archiving technologies will emerge through time and need to be addressed.

Slide 69 provides an example of one of the two most credible ways Web archives can be attacked, by their own governments. This is a threat about which we have been writing for at least five years in the context of academic journals. Nelson underplays the threat, because it is far more serious than simply disinformation.

Governments, for example the Harper administration in Canada and the Trump administration, have been censoring scientific reports and the data on which they depend wholesale. Web archives have been cooperating in emergency efforts to collect and safeguard this information, primarily by archiving it in other jurisdictions. But both the US with the CLOUD Act and the EU claim extraterritorial jurisdiction over data in servers. A recent example was documented in Chris Butler's post Official EU Agencies Falsely Report More Than 550 URLs as Terrorist Content on the Internet Archive's blog:
In the past week, the Internet Archive has received a series of email notices from French Internet Referral Unit (French IRU) falsely identifying hundreds of URLs on as “terrorist propaganda”. At least one of these mistaken URLs was also identified as terrorist content in a separate take down notice sent under the authority of the French government’s L’Office Central de Lutte contre la Criminalité liée aux Technologies de l’Information et de la Communication (OCLCTIC).
It would be bad enough if the mistaken URLs in these examples were for a set of relatively obscure items on our site, but the French IRU’s lists include some of the most visited pages on and materials that obviously have high scholarly and research value.
The alleged terrorist content included archives of the Grateful Dead's music, CSPAN, Rick Prelinger's industrial movies and scientific preprints from

The other most credible way Web archives can be attacked is, as the Joy Reid case illustrates, by the target Web site abusing copyright or robots.txt. I wrote a detailed post about this last December entitled Selective Amnesia. The TL;DR is that if sites don't want to be archived they can easily prevent it, and if they subsequently regret it they can easily render the preserved content inaccessible via robots.txt, or via a DMCA takedown. And if someone doesn't want someone else's preserved content accessed, they have many legal avenues to pursue, starting with a false claim of copyright:
The fundamental problem here is that, lacking both a registry of copyright ownership, and any effective penalty for false claims of ownership, archives have to accept all but the most blatantly false claims, making it all too easy for their contents to be censored.

I haven't even mentioned the "right to be forgotten", the GDPR, the Australian effort to remove any judicial oversight from takedowns, or the EU's Article 13 effort to impose content filtering. All of these enable much greater abuse of takedown mechanisms.

To Be Continued

The rest of the post is still in draft. It will probably consist of two more parts:

Part 2:

  • A House Built On Sand
  • Diversity Is A Double-Edged Sword
  • What Is This "Page" Of Which You Speak?
  • The essence of a web archive is ...
  • A Voight-Kampff Test?

Part 3:

  • What Can Be Done?
    • Source Transparency
    • Temporal Transparency
    • Fidelity Transparency 1
    • Fidelity Transparency 2
  • Conclusion

New Islandora 7.x Committer: Marcus Barnes / Islandora

In recognition of his many contributions to Islandora and the Islandora community, the Islandora 7.x Committers have asked Marcus Barnes from the University of Toronto Scarborough to join their ranks and we are very pleased to announce that he has accepted.
Marcus is a longstanding member of the Islandora community, a dedicated member of multiple release teams, an active participant in our Committers Calls and reviewer of open pull requests, and a great help to new community members. Marcus also actively maintains one of the most popular contributed Solution Packs, for managing Oral Histories in Islandora.
Further details of the rights and responsibilities of being an Islandora committer can be found here:
Please join me in congratulating Marcus! We are very fortunate to have contributors like him standing behind Islandora.

Building a Nordic Anti-Corruption Data Ecosystem / Open Knowledge Foundation

Open Knowledge Sweden (OKSE) jointly with Transparency International Latvia and Transparency International Lithuania continues to promote usage of open data for combating corruption in the Baltic and Nordic countries.


Stockholm, 10 June 2019 – On May 15, 2019 Open Knowledge Sweden (OKSE) jointly with Transparency International Latvia and Transparency International Lithuania started the activities for a new project aimed at empowering Nordic and Baltic stakeholders to help disclose anti-corruption-related datasets.

The work is funded by the Nordic Council of Ministers office in Latvia within a project “Building a Nordic Anti-Corruption Data Ecosystem”. The three implementing partners aim to build constructive relationships with national officials and promote the usage of open data for anti-corruption purposes. The following activities will run until autumn 2019:

  • Explorative online surveys to map demand for anti-corruption-related data in 7 Nordic and Baltic countries (Latvia, Lithuania, Estonia, Sweden, Denmark, Finland, Norway);
  • Identification of a basic inventory of anti-corruption-related data systems (i.e. those related to individuals and organizations, public resources, laws and regulations) which could be employed for further anti-corruption action at the national and regional level;
  • Workshop with anti-corruption and data-oriented NGOs from the region to develop a shared advocacy strategy for the release of public sector datasets which can be useful to fight political corruption – namely those related to lobbying, MPs’ interest and asset disclosure, political financing, public procurement and beneficial ownership.

Whereas in a previous project the partner organisations looked at the supply-side of anti-corruption data, this project will focus on the demand-side and the emerging impact of Open Government Data (OGD) policies in Nordic and Baltic countries. The project also aims to strengthen cooperation among NGOs on common anti-corruption-related priority areas.

More information

Guest Webinar Recap: Elevate Your Digital Commerce Platform with AI-powered Search / Lucidworks

Our RealDecoy and Lucidworks joint webinar yesterday covered a topic that I care about deeply: how to elevate digital commerce platforms with AI-powered search. In fact, I co-founded RealDecoy in 2000 with the goal of improving e-commerce search experiences, at a time when most of those experiences were rudimentary and AI was a scarce and poorly understood option (by today’s standards).

On the webinar, I had a conversation with Justin Sears from Lucidworks, and we discussed some of these questions:

  • What are some important differences between the search needs on B2C sites, versus B2B sites?
  • What is the difference between plain old “personalization” and “hyper-personalization”?
  • How does artificial intelligence create uniquely personal e-commerce experiences?
  • What are the cost-benefit trade-offs between investing one dollar in replatforming e-commerce technology versus investing that same dollar enhancing the search experience?

You can watch the entire webinar recording here, or access the slides here. For those who care about this topic and couldn’t join the webinar, I wanted to post about three key takeaways on what the market needs for hyper-personalized e-commerce search and how Lucidworks Fusion helps meet those needs. We hope these insights help e-commerce leaders think about how to take their next step on their e-commerce journey (whatever their stage of maturity).

Remember These Three Things

First, search can be far more than a little text box. If you watch the full recording or look at the slides, you can see a few examples that we shared, showing how modern e-commerce search is far more sophisticated than what we might see when we use Google or Bing.

Today’s online buyers simply expect to find all the information they want on a company’s website. To think beyond the search box you must think of search in terms of product discovery experience. The best product discovery experience drives higher conversion rates, order sizes and loyalty by presenting the right results to users without them having to ask.

Secondly, e-commerce personalization is a journey, not a final destination.

Even companies doing many of the right things, such as returning relevant search results, offering seasonal promotions, and boosting and burying results (none of which are examples of personalization, by the way), are just scratching the surface when it comes to personalization. In fact, 30% of companies are not collecting meaningful metrics around their website search experience.

AI and Machine Learning search platforms like Lucidworks Fusion give your site the ability to scale personalization to understand and predict your users’ intent.

Finally, search is about efficiency (shoppers finding what they need quickly), but it’s also about experience (the most important driver of customer retention).

Site search and personalization play a key role in creating inspired buying experiences. Now’s the time to leverage predictive analytics and user behavior to engage customers with experiences that speak to their individual preferences.

Winning in digital commerce isn’t just about acquiring new customers: It’s also about investing in ways to grow your sales with the ones you already have. Hyper-personalized search experiences are how the leaders in digital commerce are doing both.

Lucidworks and RealDecoy are already planning future webinars to drill down into two more specific topics:

  • Improving loyalty and conversion rates in B2B: This webinar will discuss the fundamental difference between B2C shoppers and B2B buyers and how companies targeting B2B buyers can leverage best practices to identify high-quality content and improve the ease of transaction.
  • How to enrich content data from structured and unstructured sources: This webinar will explore new ways to personalize online and offline customer experiences, syndicate high-value content across an ecosystem and drive industry-leading online conversion optimization.

Links to additional information:

Richard Isaac is CEO of RealDecoy Consulting, a Lucidworks partner for implementation of digital commerce solutions for AI-powered search, based on Lucidworks Fusion. Isaac wrote this post to recap the Lucidworks webinar that he co-hosted on June 12, 2019, with Lucidworks VP of Product Marketing, Justin Sears.


Invitation to hack the Distant Reader / Eric Lease Morgan

We invite you to write a cool hack enabling students & scholars to “read” an arbitrarily large corpus of textual materials.


A website called The Distant Reader takes an arbitrary number of files or links to files as input. [1] The Reader then amasses the files locally, transforms them into plain text files, and performs quite a bit of natural language processing against them. [2] The result, in the form of a file system, is a set of operating system independent indexes which point to individual files from the input. [3] Put another way, each input file is indexed in a number of ways, and is therefore accessible by any one or combination of the following attributes:

  • any named entity (name of person, place, date, time, money amount, etc)
  • any part of speech (noun, verb, adjective, etc.)
  • email address
  • free text word
  • readability score
  • size of file
  • statistically significant keyword
  • textual summary
  • URL

All of the things listed above are saved as plain text files, but they have also been reduced to an SQLite database (./etc/reader.db), which is also distributed with the file system.
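A reasonable first step with any SQLite file is to ask it which tables it contains. The sketch below does that with Python's built-in sqlite3 module. It runs against an in-memory stand-in whose table names (bib, ent, pos) are borrowed from the descriptions in this post; the single id column is a placeholder, not the Reader's real schema.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all tables in an SQLite database, sorted."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


# In-memory stand-in for ./etc/reader.db; the column is a placeholder.
connection = sqlite3.connect(":memory:")
for table in ("bib", "ent", "pos"):
    connection.execute(f"CREATE TABLE {table} (id TEXT)")

tables = list_tables(connection)
```

Against a real study carrel, the same function could be called on `sqlite3.connect("./etc/reader.db")` to see the actual tables before writing any queries.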

The Challenge

Your mission, if you choose to accept it, is to write a cool hack against the Distant Reader’s output. By doing so, you will be enabling people to increase their comprehension of the given files. Here is a list of possible hacks:

  • create a timeline – The database includes a named entities table (ent). Each entity is denoted by a type, and one of those types is “PERSON”. Find all named entities of type PERSON, programmatically look them up in Wikidata, extract the entity’s birth & death dates, and plot the result on a timeline. As an added bonus, update the database with the dates. Alternatively, and possibly more simply, find all entities of type DATE (or TIME), and plot those values on a timeline.
  • create a map – Like the timeline hack, find all entities denoting places (GRE or LOC), look up their geographic coordinates in Wikidata, and plot them on a map. As an added bonus, update the database with the coordinates.
  • order documents based on similarity – “Find more like this one” is an age-old information retrieval use case. Given a reference document (a document denoted as particularly relevant), create a list of documents from the input which are similar to the reference document. For example, create a vector denoting the characteristics of the reference document. [4] Then create vectors for each document in the collection. Finally, use something like the Cosine Similarity algorithm to determine which documents are most similar (or different). [5] The reference document may be from either inside or outside the Reader’s file system, for example, the Bible or Shakespeare’s Hamlet.
  • write a Javascript interface to the database – The Distant Reader’s database (./etc/reader.db) is manifested as a single SQLite file. There exists a Javascript library enabling one to read & write to SQLite databases. [6] Sans a Web server, write sets of HTML pages enabling a person to query the database. Example queries might include: find all documents where Plato is a keyword, find all sentences where Plato is a named entity, find all questions, etc. The output of such queries can be HTML pages, but almost as importantly, they can be CSV files so people can do further analysis. As an added bonus, enable a person to update the database so things like authors, titles, dates, genres, or notes can be assigned to items in the bib table.
  • list what is being bought or sold – Use the entities table (ent) to identify all the money amounts (type equals “MONEY”) and the sentences in which they appear. Extract all of those sentences, analyze each sentence, and output the things being tendered. You will probably have to join the id and sentence id in the ent table with the id and sentence id in the pos table to implement this hack. As an added bonus, calculate how much things would cost in today’s dollars or any other currency.
  • normalize metadata – The values in the named entities table (ent) are often repeated in various forms. For example, a value may be Plato, plato, or PLATO. Use something like the Levenshtein distance algorithm to normalize each value into something more consistent. [7]
  • prioritize metadata – Just because a word is frequent does not mean it is significant. A given document may mention Plato many times, but if Plato is mentioned in each and every document, then the word is akin to noise. Prioritize given named entities, specifically names, through the use of something like TFIDF. Calculate a TFIDF score for a given word, and if the score is above a given threshold, then update the database accordingly. [8]
  • extract sentences matching a given grammar – Each & every word, punctuation mark, and part of speech of each & every document is enumerated and stored in the pos table of the database. Consequently it is rather easy to find all questions in the database and extract them. (Find all sentence ids where punctuation equals “?”. Find all words (tokens) with the same id and sentence id. Output all tokens sorted by token id.) Similarly, it is possible to find all sentences where a noun precedes a verb which precedes another noun. Or, find all sentences where a noun precedes a verb which is followed by the word “no” or “not” which precedes another noun. Such queries find sentences in the form of “cat goes home” or “dog is not cat”. Such are assertive sentences. A cool hack would be to identify sentences of any given grammar such as adjective-noun or entity-verb where the verb is some form of the lemma to be (is, was, are, were, etc.), as in “Plato is” or “Plato was”. The adjective-noun pattern is of particular interest, especially given a particular noun. Find all sentences matching the pattern adjective-king to learn how the king was described.
  • create a Mad Lib – This one is off the wall. Identify (random) items of interest from the database. Write a template in the form of a story. Fill in the template with the items of interest. Done. The “better” story would be one that is complete with “significant” words from the database; the better story would be one that relates truths from the underlying content. For example, identify the two most significant nouns. Identify a small handful of the most significant verbs. Output simple sentences in the form of noun-verb-noun.
  • implement one of two search engines – The Distant Reader’s output includes a schema file (./etc/schema.xml) defining the structure of a possible Solr index. The output also includes an indexer (./bin/ as well as a command-line interface (./bin/ to search the index. Install Solr. Create an index with the given schema. Index the content. Write a graphical front-end to the index complete with faceted search functionality. Allow search results to be displayed and saved in a tabular form for further analysis. The Reader’s output also includes a semantic index (./etc/reader.vec) a la word2vec, as well as a command-line interface (./bin/ for querying the semantic index. Write a graphical interface for querying the semantic index.
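To give a flavor of the “extract sentences” hack, here is a minimal sketch of the question-finding recipe (find sentences containing “?”, then collect their tokens in order). It builds a tiny in-memory pos table because the real column names are not given in this post; the names used here (id, sid, tid, token, pos) are assumptions and will likely need adjusting against the actual ./etc/reader.db.

```python
import sqlite3


def extract_questions(connection):
    """Rebuild every sentence that contains a '?' token, in token order."""
    sql = """
        SELECT p.id, p.sid, p.token
        FROM pos AS p
        JOIN (SELECT DISTINCT id, sid FROM pos WHERE token = '?') AS q
          ON p.id = q.id AND p.sid = q.sid
        ORDER BY p.id, p.sid, p.tid
    """
    sentences = {}
    for doc_id, sentence_id, token in connection.execute(sql):
        sentences.setdefault((doc_id, sentence_id), []).append(token)
    return [" ".join(tokens) for tokens in sentences.values()]


# Toy pos table; the schema is hypothetical, not the Reader's real one.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE pos (id TEXT, sid INTEGER, tid INTEGER, token TEXT, pos TEXT)"
)
rows = [
    ("plato", 1, 1, "What", "WP"),
    ("plato", 1, 2, "is", "VBZ"),
    ("plato", 1, 3, "justice", "NN"),
    ("plato", 1, 4, "?", "."),
    ("plato", 2, 1, "Socrates", "NNP"),
    ("plato", 2, 2, "replied", "VBD"),
    ("plato", 2, 3, ".", "."),
]
connection.executemany("INSERT INTO pos VALUES (?, ?, ?, ?, ?)", rows)

questions = extract_questions(connection)
```

The same join-on-id-and-sentence-id shape should carry over to the other pattern-matching ideas, such as filtering on the part-of-speech column instead of the token.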

Sample data

In order for you to do your good work, you will need some Distant Reader output. Here are pointers to some such stuff:

Helpful hint

With the exception of only a few files (./etc/reader.db, ./etc/reader.vec, and ./cache/*), all of the files in the Distant Reader’s output are plain text files. More specifically, they are either unstructured data files or delimited files. Despite any file’s extension, the vast majority of the files can be read with your favorite text editor, spreadsheet, or database application. To read the database file (./etc/reader.db), you will need an SQLite application. The files in the adr, bib, ent, pos, urls, or wrd directories are all tab delimited files. A program called OpenRefine is a WONDERFUL tool for reading and analyzing tab delimited files. [9] In fact, a whole lot can be learned through the skillful use of OpenRefine against the tab delimited files.
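For those who would rather script than click, the tab delimited files can also be read with a few lines of Python. The following is a small sketch using only the standard library; the sample text stands in for a file from the ent directory, and its header names are hypothetical.

```python
import csv
import io


def read_tab_delimited(handle):
    """Return the rows of a tab-delimited file as a list of dictionaries."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Stand-in for a file from the ent directory; header names are hypothetical.
sample = io.StringIO(
    "id\tsid\tentity\ttype\n"
    "plato\t1\tSocrates\tPERSON\n"
    "plato\t2\tAthens\tLOC\n"
)

rows = read_tab_delimited(sample)
people = [row["entity"] for row in rows if row["type"] == "PERSON"]
```

With a real file, pass `open(path, newline="", encoding="utf-8")` in place of the `io.StringIO` object.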


[1] The home page of the Distant Reader is

[2] All of the code doing this processing is available on GitHub. See

[3] This file system is affectionately known as a “study carrel”.

[4] An easy-to-use library for creating such vectors is a part of the Scikit Learn suite of software. See

[5] The algorithm is described at, and a SciKit Learn module is available at

[6] The library is called sql.js and it is available at

[7] The Levenshtein distance is described here —, and various libraries doing the good work are outlined at

[8] Yet another SciKit Learn module may be of use here —

[9] OpenRefine eats delimited files for lunch. See

PyPI packages / Brown University Library Digital Technologies Projects

Recently, we published two Python packages to PyPI: bdrxml and bdrcmodels. No one else is using those packages, as far as I know, and it takes some effort to put them up there, but there are benefits from publishing them.

Putting a package on PyPI makes it easier for other code we package up to depend on bdrxml. For our indexing package, we can switch from this:

‘bdrxml @’,

to this:


in, which is simpler. This also lets us use Python’s package version checking so that we do not pin bdrxml to just one version, which is helpful when we embed the indexing package in another project that may use a different version of bdrxml.
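To make the two styles concrete, here is a rough sketch of what each kind of dependency string can look like. The URL and the version bounds are hypothetical placeholders, not the values we actually use, and the helper function exists only for illustration.

```python
# Hypothetical dependency declarations; the URL and bounds are placeholders.
DIRECT_REFERENCE = "bdrxml @ https://example.org/bdrxml-1.0.tar.gz"
VERSION_RANGE = "bdrxml>=1.0,<2.0"


def is_direct_reference(requirement: str) -> bool:
    """True when the requirement uses the 'name @ url' direct-reference form."""
    name, separator, url = requirement.partition("@")
    return bool(separator) and "://" in url


checks = {
    DIRECT_REFERENCE: is_direct_reference(DIRECT_REFERENCE),
    VERSION_RANGE: is_direct_reference(VERSION_RANGE),
}
```

A direct reference ties the dependency to one specific archive, whereas a version range lets the installer pick any published release that satisfies the bounds.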

Publishing these first two packages also gave us experience, which will help if we publish more packages to PyPI.

New Hampshire Public Library Services for Survivors of Domestic and Sexual Violence / In the Library, With the Lead Pipe

In Brief

Domestic violence and sexual assault survivors experience unique information needs that can be answered through formal avenues such as a crisis center or police/court proceedings, but many survivors do not take a formal route to recovery. This survey seeks to identify what services and policies guide New Hampshire public libraries in providing services to survivors to assist them in navigating the experience and recovery from domestic or sexual violence.

by Miranda Dube

New Hampshire, photo credit Wanda Dube


Ongoing and widespread discussion within our culture is shedding light on the issue of sexual assault and domestic violence. Sexual assault and domestic violence do not discriminate and can impact anyone of any race, gender, ability, age, or sexual orientation. In the state of New Hampshire, roughly one in four women and one in twenty men have been sexually assaulted (NHCADSV, n.d.), which means there is the potential for almost 200,000 sexual assault survivors in the state, making up 15% of the population (United States Census Bureau, n.d.). Concrete statistics on domestic violence in New Hampshire are difficult to find, but would undoubtedly increase the number of survivors of violence. According to the New Hampshire Coalition Against Domestic and Sexual Violence, 13,505 adults and children who experienced domestic or sexual violence in 2016 received formal services from one of the thirteen Member Programs in the state (NHCADSV, n.d.). There is a large difference between how many people the crisis centers serve in a year and the total number of survivors in our state, but it is an important discrepancy to discuss in order to understand the role public libraries could play in providing services and information to survivors of domestic and sexual violence1 throughout the recovery process.

It is likely that the people seeking formal services from the crisis centers are either currently experiencing or have recently experienced violence, making the desire for crisis center support such as court advocates, hospital accompaniment, relocation, support services, and safety planning a current need. Survivors who are not currently experiencing violence still need supportive services, but their needs may not align with the services offered by formal organizations. Dr. Judith Herman’s (1992) book Trauma and Recovery outlines three stages that survivors of trauma experience, beginning with safety and stabilization, followed by remembrance and mourning, and concluding with reconnection and integration. Not every survivor will go through formal channels, such as the police or crisis center, to assist them in the recovery process, so what options are left for survivors?

Research Purpose and Questions

This research uses qualitative and quantitative data from New Hampshire public libraries to identify what services are offered to survivors of domestic and sexual violence in order to theorize on potential barriers and improvements that could be made. While the study focuses on New Hampshire public libraries, the barriers and services discussed henceforth could be applied to any library implementing similar programming, policies, and attitudes. This study seeks to answer:

  • How do public libraries in the state of New Hampshire provide information for survivors of domestic and sexual violence?
  • What are the potential barriers to information seeking at public libraries for survivors of domestic and sexual violence?

It is difficult to answer these questions without survivor input, for they are the experts on what they may want or need from the library. The original study design consisted of an additional survey for survivors of domestic and sexual violence who reside in the state of New Hampshire, with the hopes that information from both groups could be compared against one another. However, the response rate was extremely low (n=3) and that part of the study was closed. In place of first-hand knowledge of what survivors want or need from their library, information on survivors’ process of recovery and barriers to library access for a myriad of marginalized groups has been used to hypothesize on potential problems and solutions.

Research Design

A survey consisting of 34 questions was sent to 203 public libraries in New Hampshire. While 231 public libraries were identified, 28 of them lacked a website, email address, phone number, or a combination of these communication methods, creating a barrier to delivering the electronic survey. Where possible, emails were sent to the director of the library, but if no email was provided for the director, the library’s general email was utilized. Of the 203 emails sent, 29 were returned, for a response rate of 14%.

Results and Discussion

The survey asked respondents to report information in six categories: collection development, staff training/awareness, library policy, safety concerns, library programming, and assistance in the five information seeking stages of survivors. The importance and implications of each section are discussed with the data.

Collection Development

Participants were asked to provide information about their collections, and all data collected used the Dewey Decimal system as a reference point. During survey distribution, four libraries responded that they would be unable to participate due to not having an automated system, and one library responded that they did not use Dewey Decimal and the process of translating their collection system to Dewey would put undue stress on the librarian’s time. While it is unfortunate that the survey design prevented these five libraries from participating, it was imperative to have consistency in the reporting of information and to maintain a low level of impact on survey responders.

Participants were asked to report information in the broad categories of the call numbers 364 and 362, which house the majority of materials on sexual assault and domestic violence respectively. It is important to note that other subjects are held within these call numbers as well, so the reported numbers are not in direct relation to domestic and sexual violence materials, but they do provide a baseline for how domestic and sexual violence materials are used in the library.

The participating libraries housed a range of 0-569 books in the call number 364, with an average of 86 materials. These materials have an average publication date range of 1984-2017, and an average circulation range of 0-22 was reported (average circulation statistics were calculated based on how many times the material circulated for the length of time the library owned the item, not for a specified date range of circulations). When averaging out the reported data, the average publication date of materials in call number 364 is 2006, and the materials circulate an average of 6.5 times.

Call number 362 housed a range of 0-536 books, with an average of 49 materials. These materials have an average publication date range of 2000-2015, and an average circulation range of 0-29 was reported. When averaging out the reported data, the average publication date of materials in call number 362 is 2006, and the materials circulate an average of 6.37 times.

Without analyzing each library’s holdings on the subjects, it is impossible to hypothesize on why some materials circulate more than others, if certain publication dates are preferable, or if the collection should house more materials on the subject. However, utilizing CREW: A Weeding Manual for Modern Libraries (2012) we can identify what criteria should go into maintaining these collections. According to the manual, materials housed in the call number range of 360-369 should be weeded based on age and popularity, and care should be taken to watch for “social welfare topics that are changing rapidly” (Larson, 2012, p. 67).

The conversation around sexual assault evolved rapidly through social networks in 2017 when actress Alyssa Milano sent a tweet that would soon go viral, even though the phrase “me too” had been coined in 2006 by the grassroots efforts of Tarana Burke (Alyssa_Milano, 2017; Me Too, 2018). With the heightened awareness on sexual assault, new books such as Chessy Prout’s memoir I Have the Right To were published, yet the average publication date for the respondents is 2006, long before many of the conversations we are having today took place. Similarly, terminology surrounding domestic violence has shifted from “battered woman syndrome,” which was largely popular in the 1990s, to more inclusive language such as “intimate partner violence.” This change in terminology is another suggested criterion for weeding listed in the CREW manual (Larson, 2012, p. 67). Developing collections that house appropriate and culturally relevant materials on these subjects may not only help circulation statistics, but may also assist survivors in locating applicable and timely material that is free from bias and judgement, and full of accurate resources.

Participants also reported the content of the physical books held in their collections. The categories included materials on empowerment (20%), information about abuse geared towards children/teens (16%), overview materials on domestic and sexual violence (11%), personal memoirs/biographies of survivors (27%), legal resources (24%), and fiction (1%). With a quarter of holdings on domestic violence and sexual assault relating to legal resources, it is even more imperative that the materials be current in order for survivors to make life-altering decisions with the best information.

In addition to physical books, participants reported the following materials as available in their libraries, listed in order of most to least reported: audiobooks, eBooks, DVDs, posters/pamphlets, and online journal articles. Nine libraries reported that they do not offer any other sources of information on domestic and sexual violence besides physical books. Further research with survivors would be necessary to determine if alternative methods of information delivery would be preferable.

Staff Training/Awareness

Participants were asked to rate statements regarding how often upper management conducts training with their staff that specifically discusses domestic and sexual violence on a scale from “strongly agree” to “strongly disagree.” When asked to rate “Upper management conducts training with newly hired staff and volunteers specific to domestic and sexual violence,” 69% (n=20) of participants responded “disagree” or “strongly disagree,” 3% (n=1) responded “agree,” and 27% (n=8) responded with “neutral.” Additionally, participants were asked to rate the following statement: “Upper management conducts training with staff and volunteers that specifically discusses domestic and sexual violence survivors yearly,” which was aimed at discovering other professional development training that may occur at the library post-employment. 65% (n=19) of participants responded “disagree” or “strongly disagree,” 3% (n=1) responded “agree,” and 31% (n=9) responded with “neutral.” In order to provide services to special populations such as survivors, there must be some component of training provided by upper management, which would also require upper management to be aware of the unique needs of the population. This will be further explored at the end of the paper.

The researchers also sought to identify librarians’ knowledge of services in their area, hopeful that even without training, staff may know how to locate the resources individually. Participants were asked to report if all, most, some, or none of their staff members were aware of where the local crisis center is that serves their area. 55% (n=16) of libraries felt that less than half their staff knew this information. When asked how many staff members they felt could locate information about where the local crisis center was, 27% (n=8) felt that less than half of their staff could not locate this information, and 44% (n=13) felt that all their staff members could locate this information if needed. While it is great that staff could locate the information if necessary, Evans and Feder (2016) point out that informal disclosures to informal networks, such as a library staff member, only result in formal support if the person receiving the disclosure has prior knowledge or experience with domestic violence (p. 62). Therefore, it is necessary for all library staff to know this information prior to a disclosure or help-seeking question in order to be most successful in assisting the patron with locating formal resources.

Another important area of training in relation to domestic and sexual violence survivors is restraining orders. When asked if staff are trained on how to handle restraining orders, 76% (n=22) of participants responded “no,” 10% (n=3) did not respond, 3% (n=1) have met and talked amongst themselves, 3% (n=1) work directly with law enforcement, 6% (n=2) are unsure, and 3% (n=1) reported that restraining orders are not applicable at their library. Training on how to handle restraining orders, as well as on interacting with an abuser, is imperative to survivor and library staff safety, as fleeing an abusive relationship results in a 500 times more likely chance of increased violence, including homicide (Mitchell, 2017).

Library Policy

As mentioned previously, some survivors of violence will seek formal services, such as safe shelter to live in, which may be at a homeless or crisis center, both of which present unique barriers to library card access. While not all survivors of domestic and sexual violence live in shelter, this study sought to identify barriers survivors may face, which includes barriers caused by living in shelter. Participants were asked to report a variety of library policies that relate to obtaining access to library materials for those residing in shelter. Proof of residence was required by 79% (n=23) of libraries, 7% (n=2) did not provide an answer, 3% (n=1) have no requirement, and 10% (n=3) provided answers that were uncategorizable. Even though survivors may be living in a shelter similar to those experiencing homelessness, crisis centers are unique in not providing their address, since disclosing it would create safety issues for not only the person seeking the library card but everyone who lives or will live in the building. The researchers asked participants if there were any special rules or regulations that apply to survivors living in shelter who wish to obtain a library card. 24% (n=7) report not having a town shelter, 20% (n=6) require a letter from the shelter, 14% (n=4) require an ID with a town address, 10% (n=3) require verbally informing staff, 10% (n=3) reported no special rules or regulations, 17% (n=5) provided no answer or had no idea about special requirements, and 3% (n=1) would try to confirm residency by phone. While there is no ideal way for libraries to confirm residency for survivors living in shelter, as almost all would require a public disclosure, requiring an ID with a town address would require survivors to wait a longer period of time to obtain the card, and confirming residency by phone is only possible if the survivor gives consent, as residents’ names are never given for any reason (Anonymous, personal communication, December 28, 2018).

Safety Concerns

Participants were asked to report how sensitive information is monitored and kept safe within the library in regards to domestic and sexual violence survivors. When asked whether or not “staff is trained to never reveal the details of a patron’s account, especially to a spouse, as they could be a potential abuser,” 90% (n=26) of participants said yes, they were trained to follow the law regarding privacy, 3% (n=1) responded no, and 6% (n=2) did not answer. An area of concern is the additional responses from libraries that answered “yes” to this question. Additional information provided by the participants showcased ideologies that are not only of concern to survivors but the general public. Two participants expanded their answer of “yes” by adding the following statements: “Never is a strong word, there are occasions when it is appropriate to reveal a book title to a family member but generally speaking we respect privacy,” and “we abide by the law to the best of our ability.” In our profession, patron privacy is of the utmost importance and is one of the pillars of the public library and people’s freedom to read. The fact that some libraries do not feel compelled to uphold this, or to only uphold the law to the best of their ability (it is unclear where that line is drawn), creates a significant safety issue not only for survivors but for anyone utilizing those libraries. Only one participant responded with specific information about privacy training and intra-family privacy concerns, which include domestic and sexual violence. It is possible that if upper-management of libraries develop an understanding of the unique power and manipulation tactics used by abusers, it could shed light on and inform library training on patron privacy to reinforce how necessary this practice is.

An additional area of concern is sharing patron information amongst staff. While many libraries require survivors living in shelter to disclose their status, only 10% of the respondents train staff to not share this information with other staff members. Sharing this information with all staff members may provide minimal benefit to the patron, such as if an abuser were to show up at the library; however, this benefit is outweighed by the safety risk of a staff member sharing this information publicly and retraumatizing the victim by disclosing their story without their permission. Some potential solutions may be to develop a reporting plan in individual libraries that provides strict guidelines about whom information can be shared with, and what to do when shelter residents obtain a library card. Additionally, protocol must be developed for how letters from shelters providing proof of address are stored, if they are stored at all. Ideally, library staff would confirm residence through the letter and return the letter to the patron so there is no concern with the library holding such important and potentially deadly information.

Library Programming/Services

In addition to the questions on collection development, safety concerns, library card policies, and staff training, participants responded to questions regarding programming and services that relate to domestic and sexual violence survivors. One such service is a meeting space for a survivor to meet with a crisis center advocate. By creating space in a library for these meetings to take place, libraries allow for a safe and neutral place for survivors to meet with advocates who may not have otherwise had the opportunity due to safety issues with going to a crisis center office. Over half (55%, or n=16) of the participants stated they do provide meeting space for this purpose, 28% (n=8) could provide the service if needed, and 17% (n=5) do not or could not offer this service due to space constraints. One thing that is unclear from this survey question is whether or not these libraries actively share that these rooms are available with their local crisis center, or if the space is just generally available. There is a possibility for community partnership if libraries share their available space opportunities with their local crisis center, and an opportunity for more active learning about domestic and sexual violence.

Participants were also asked about passive and active programming offered at their library. The majority of programming offered at New Hampshire libraries is passive, consisting of book displays, posters with crisis center information, and fact sheets about domestic and sexual violence. In comparison, only five participants reported active programming which consisted of a workshop on personal protection, a support group, a presentation from a local crisis center, and two partnerships with a local crisis center, although it is unclear what active programming came out of the latter.

This research does not seek to tell survivors what they do or do not need from their library in terms of services, so it is unwise to recommend types of programming that should be offered without input from survivors themselves. That said, individual libraries could begin to branch out their programming options for survivors and do internal program evaluations to determine what offerings appear to be fulfilling a community need. Doing so avoids the lengthy and challenging process of collecting survey feedback from survivors and, if done well, could substantially benefit the survivors in the library’s area.

Information Seeking Stages of Survivors

In 2009, Lynn Westbrook published “Crisis Information Concerns: Information Needs of Domestic Violence Survivors.” In this study, Westbrook identified six main stages of information seeking that domestic violence survivors may experience. The six stages are as follows: initial consideration of a life change; during shelter and/or criminal justice engagement; post-shelter/post-police planning; legal concerns in making a life change; immigration-related information needs; and lastly, overlapping information needs from the previous five stages, which may occur at the same time (Westbrook, 2009, p. 104-109). Participants in the study were asked to rank all but the last stage (overlapping needs) on a scale of “1-do not provide information” to “5-provide above average information” on this information need. The breakdown for each information seeking stage is below:

Table 1. Initial consideration of a life change

  Rating                                  Responses
  1 – do not provide information          14% (n=4)
  2                                       31% (n=9)
  3                                       45% (n=13)
  4                                       10% (n=3)
  5 – provide above average information   0%

Table 2. During shelter and/or criminal justice engagement

  Rating                                  Responses
  1 – do not provide information          14% (n=4)
  2                                       41% (n=12)
  3                                       38% (n=11)
  4                                       6% (n=2)
  5 – provide above average information   0%

Table 3. Post-shelter, post-police planning

  Rating                                  Responses
  1 – do not provide information          38% (n=11)
  2                                       31% (n=9)
  3                                       27% (n=8)
  4                                       10% (n=3)
  5 – provide above average information   0%

Table 4. Legal concerns in making a life change

  Rating                                  Responses
  1 – do not provide information          21% (n=6)
  2                                       27% (n=8)
  3                                       34% (n=10)
  4                                       17% (n=5)
  5 – provide above average information   0%

Table 5. Immigration-related information needs

  Rating                                  Responses
  1 – do not provide information          34% (n=10)
  2                                       28% (n=11)
  3                                       24% (n=7)
  4                                       10% (n=3)
  5 – provide above average information   0%

While Westbrook’s information seeking stages of survivors of domestic violence are not a formal evaluative tool, applying them in this context allows for a starting point in seeing what, if any, gaps exist in a library collection. From the responses of the New Hampshire participants it is clear that immigration-related information needs must be curated, and both post-police/post-shelter planning and during shelter and/or criminal justice engagement could use review and additions. Even though New Hampshire is 93% white, 6% of the state’s residents are immigrants, and Welcoming New Hampshire, a community building and bridging initiative for refugees, has office locations in four of the state’s cities (United States Census Bureau, n.d.; American Immigration Council, 2017; Welcoming New Hampshire, 2018). The cultural differences and language barriers faced by immigrants and refugees create extreme challenges for those facing domestic and sexual violence and their ability to navigate formal networks such as the police and court systems. Furthermore, disclosing or seeking help provides challenges based on English proficiency, the formal support networks’ language options, knowledge of policies, laws, resources, and larger cultural structures such as institutional racism (Lockhart & Danis, 2010, p. 161).

Lastly, participants were asked to use the space provided to inform the researchers of any other information they thought was pertinent to the study. Four comments were made explaining their community and the positive services they provided, five comments were made that express concern with the study format/questions, three comments were made that support the ideology that this is a problem for another organization to solve, and two comments were made informing the researchers that this is not a problem in their town. This means that at least 4% of New Hampshire public libraries are confident in sharing their lack of desire to better services for survivors. These comments are a significant area for concern since the respondents of the survey were mainly library directors. If the overarching attitude of those in the highest position of power is to dismiss and discredit survivors or the research being done to improve services to them, it will continue to shape how those libraries provide services and directly impact survivors in their area.

Areas for Future Research

As stated previously, researching library services to survivors without survivor input will have extremely limited impact. A nationwide survey for survivors of domestic and sexual violence specific to library services would shed light on where and how libraries can progress from where we are today. Conducting a nationwide survey could also allow for analysis of different geographic regions’ services to survivors, may allow for identification of areas where services to survivors are more successful than others, and could provide a roadmap for improvements. Additionally, this type of roadmap could be obtained by identifying libraries within the United States that offer explicit services to survivors of domestic and sexual violence and analyzing their challenges, successes, and community feedback. Furthermore, research into training, safety issues, collection development, and institutional bias in regards to survivors could provide more concrete answers to the questions that have arisen from the current research.


Conclusion

While only a survey of one state, this research has allowed for a beginning exploration of what it may mean for libraries to provide great services to survivors of domestic and sexual violence. Those in charge of developing library collections should focus on increasing the appropriateness of materials relating to domestic and sexual violence, as well as expanding collections beyond physical books to include eBooks, DVDs, and more. Training staff on domestic and sexual violence survivors’ needs can assist in bettering the library, including having the ability to refer informal disclosures to formal networks and increase library safety for everyone involved. While most survivors living in shelter are able to obtain a local library card, libraries should look for a non-disclosure method to obtain a card, increasing patrons’ and crisis centers’ safety.

The largest barrier to success in offering services to survivors lies within the implicit assumptions held by library workers, which trickle down through the organization, creating an environment of barriers and lack of support. Although a difficult topic to discuss, domestic and sexual violence is one area librarians cannot afford to ignore, as the problem may never go away, and ignorance will only cause our patrons who have survived such violence more discomfort.


Acknowledgements

My deepest gratitude goes out to Christina Mendez, external reviewer, Bethany Messersmith, internal reviewer, and Amy Koester, publishing editor, for their labor and time. What started as a graduate school independent study is finally being published, and I would not have been able to do that without them. I also want to thank Dr. Melissa Villa-Nicholas for supervising my independent study, and Carrie, Karina, and Emily for reviewing early drafts of this work. I thank you all for reminding me every step of the way how necessary and important this research is.


References

Adamovich, S. G. (Ed.). (1989). The road taken: The New Hampshire Library Association 1889-1989. West Kennebunk, ME: Phoenix Publishing.

Alyssa_Milano. (2017, October 15). If you’ve been sexually harassed or assaulted write ‘me too’ as a reply to this tweet. [Twitter post]. Retrieved from

American Immigration Council. (2017, October 13). Immigrants in New Hampshire. Retrieved from

Evans, M. A., & Feder, G. S. (2016). Help-seeking amongst women survivors of domestic violence: A qualitative study of pathways towards formal and informal support. Health Expectations, 19(1), 62-73. doi: 10.1111/hex.12330

Herman, J. (1992). Trauma and recovery: The aftermath of violence - from domestic abuse to political terror. New York, NY: Basic Books.

Larson, J. (2012). CREW: A weeding manual for modern libraries. Available from

Lockhart, L. L., & Danis, F. S. (2010). Domestic violence: Intersectionality and culturally competent practice. New York, NY: Columbia University Press.

Me Too. (2018). About. Retrieved from

Mitchell, J. (2017, January 28). Most dangerous time for battered women? When they leave. Clarion-Ledger. Retrieved from

NHCADSV. (n.d.). Statistics & Research. Retrieved from

Public Libraries. (2019). New Hampshire Public Libraries. Retrieved from 2018

United States Census Bureau. (n.d.). QuickFacts New Hampshire. Retrieved from

Welcoming New Hampshire. (2018). Welcoming New Hampshire: Weaving cultures, building communities. Retrieved from

  1. Throughout this article I have used the term “domestic and sexual violence survivor” as an umbrella term for any person who has experienced domestic violence, sexual assault, stalking (including cyber), and intimate partner violence (IPV). My deepest apologies go out to anyone who identifies with other terminology and feels unrepresented. I promise I see you, and your experience and voice matter.

New Digital Collections: Completed July 2018 - June 2019 / Library Tech Talk (U of Michigan)

Section 3 of the Mushi scroll

Over the past fiscal year (July 2018 - June 2019) the Digital Content & Collections (DCC) department has collaborated with stakeholders within libraries, museums, and more, across campus and beyond, to create the following new digital collections, adding to the full list of nearly 300 digital collections found online. Thank you to all of our stakeholders involved in each collection, the Library Copyright Office for their role in every new digital collection, and the many individuals within Library Information Technology who also assisted in the creation of these collections!

SAVE THE DATE for the LYRASIS Member Summit Oct. 1 – 2 Chicago, Illinois / DuraSpace News

We are thrilled to be planning our 4th annual Member Summit. This year’s in-person event will be at the Big 10 Conference Center in Chicago. Our theme this year is Local to Global. This will be our first meeting after the LYRASIS + DuraSpace merger, and we will be welcoming DuraSpace Members and staff. The summit will include high level discussions for your senior and executive staff as well as the Leaders Circle pre-conference event.

The LYRASIS Member Summit is a members-only meeting for senior strategists and staff who are looking for new ways to collaborate and serve their users, researchers and communities. It is designed to help our members and their institutions seize both sustainable and disruptive innovations to deliver new, better, and tried and tested approaches for increasing and sustaining access to knowledge. The Member Summit creates a unique space for learning, networking, collaboration and designing action on the most important issues facing archives, libraries, museums and galleries today. As a LYRASIS member, you are invited to this annual conference at no charge.

Registration for both our Summit and hotel block will be open in the next few weeks! We hope to see you there!


Bootstrap 3 to 4: Changes in how font size, line-height, and spacing is done. Or “what happened to $line-height-computed.” / Jonathan Rochkind

Bootstrap 4 (I am writing this in the age of 4.3.0) changes some significant things about how it handles font-size, line-height, and spacer variables in SASS.

In particular, it changes font-size calculations from px units to rem units, with some implications for line-heights as handled in bootstrap, and it changes how whitespace is calculated so that it is expressed in terms of font-size.

I have a custom stylesheet built on top of Bootstrap 3, and am migrating it to Bootstrap 4, and I was getting confused about what’s going on. And googling, some things are written about “Bootstrap 4” that are really about a Bootstrap 4 alpha, and in some cases things changed majorly before the final.

So I decided to just figure it out looking at the code and what docs I could find, and write it up as a learning exercise for myself, perhaps useful to others.

Bootstrap 3

In Bootstrap 3, the variable $font-size-base is the basic default font size. It defaults to 14px, and is expected to be expressed in pixel units.

CSS line-height is given to the browser as a unit-less number. MDN says “Desktop browsers (including Firefox) use a default value of roughly 1.2, depending on the element’s font-family.” Bootstrap sets the CSS line-height to a larger than ‘typical’ browser default value, having decided that is better typography at least for the default Bootstrap fonts.

In Bootstrap 3, the unit-less $line-height-base variable defaults to the unusual value of 1.428571429. This is to make it equivalent to a nice round value of “20px” for a $font-size-base of 14px, when the unit-less line-height is multiplied by the font-size-base. And there is a $line-height-computed value that’s defined as exactly that product by default; it’s defined in terms of $line-height-base. So $line-height-base is a unit-less value you can supply to the CSS line-height property (which _scaffolding does on body), and $line-height-computed is a value in pixels that should be the same size, just converted to pixels.
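To make the arithmetic concrete, here is a small Python sketch of that calculation. It is not Bootstrap code: the function name is made up for illustration, and it assumes (as I read the Bootstrap 3 variables file) that the pixel value is the product rounded down to a whole pixel.

```python
import math

# Bootstrap 3 defaults as described above.
FONT_SIZE_BASE_PX = 14
LINE_HEIGHT_BASE = 1.428571429  # unit-less, roughly 20 / 14


def line_height_computed(font_size_px, line_height):
    """Pixel line height: font size times the unit-less line-height,
    rounded down (assumption: mirrors a Sass floor() on the product)."""
    return math.floor(font_size_px * line_height)


print(line_height_computed(FONT_SIZE_BASE_PX, LINE_HEIGHT_BASE))
```

With the defaults this prints 20, the "nice round value" the odd-looking 1.428571429 exists to produce.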


As a whitespace measure, in bootstrap 3

Bootstrap wants to make everything scale depending on font-size, so tries to define various paddings and margins based on your selected line height in pixels.

For instance, alerts, breadcrumbs, and tables all have a margin-bottom of $line-height-computed (default 20px, with the default 14px font size and default unit-less line-height). h1, h2, and h3 all have a margin-top of $line-height-computed.

h1, h2, and h3 all have a margin-bottom of $line-height-computed/2 (half a line height in pixels; 10px by default). And ($line-height-computed / 2) is both margin-bottom and margin-top for a p tag.

You can redefine the size of your font or line-height in variables, but bootstrap 3 tries to express lots of whitespace values in terms of “the height of a line on the page in pixels” (or half of one) — which is line-height-computed, which is by default 20px.

On the other hand, other kinds of whitespace are expressed in hard-coded values, unrelated to the font-size, and only some of them can be changed through Bootstrap variables.  They often use the specific fixed values 30px and 15px.

$grid-gutter-width is set to 30px.  So is $jumbotron-padding. You can change these variables yourself, but they don’t automatically change “responsively” if you change the base font-size in $font-size-base. They aren’t expressed in terms of font-size.

A .list-group has a margin-bottom set to 20px, and a .list-group-item has a padding of 10px 15px, and there’s no way to change either of these with a bootstrap variable, they are truly hard-coded into the SCSS. (You could of course try to override them with additional CSS).

So some white-space in Bootstrap 3 does not scale proportionately when you change $font-size-base and/or $line-height-base.

Bootstrap 4

In Bootstrap 4, the fundamental starting font-size variable is still $font-size-base, but it’s now defined in terms of rem, and defaults to 1rem.

You can’t set $font-size-base to a value in px units, without bootstrap’s sass complaining as it tries to do things with it that are dimensionally incompatible with px. You can change it to something other than 1rem, but bootstrap 4 wants $font-size-base in rem units.

1rem means “same as the font-size value on the html element.”  Most browsers (at least most desktop browsers?) default to 16px, so it will usually by default mean 16px. But this isn’t required, and some browsers may choose other defaults.

Some users may set their browser default to something other than 16px, perhaps because they want ‘large print’. (Although you can also set default ‘zoom level’ instead in a browser; what a browser offers and how it affects rendering can differ between browsers.) This is, I think, the main justification for Bootstrap changing to rem: accessibility improvements from respecting the browser’s default font size.

The Bootstrap docs don’t say much to explain the change, but I did find this:

No base font-size is declared on the <html>, but 16px is assumed (the browser default). font-size: 1rem is applied on the <body> for easy responsive type-scaling via media queries while respecting user preferences and ensuring a more accessible approach.

Perhaps for these reasons of accessibility, Bootstrap itself does not define a font-size on the html element, it just takes the browser default. But in your custom stylesheet, you could insist html { font-size: 16px } to get consistent 1rem=16px regardless of browser (and possibly with accessibility concerns — although you can find a lot of people debating this if you google, and I haven’t found much that goes into detail and is actually informed by user-testing or communication with relevant communities/experts).  If you don’t do this, your bootstrap default font-size will usually be 16px, but may depend on browser, although the big ones seem to default to 16px.

(So note, Bootstrap 3 defaulted to 14px base-font-size, Bootstrap 4 defaults to what will usually be 16px). 

Likewise, when they say “responsive type-scaling via media queries”, I guess they mean that based on media queries, you could set font-size on html to something like 1.8, meaning “1.8 times as large as ordinary browser default font-size.”  Bootstrap itself doesn’t seem to supply any examples of this, but I think it’s what it’s meant to support. (You wouldn’t want to set the font-size in px based on a media-query, if you believe respecting default browser font-size is good for accessibility).
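A minimal sketch of what such media-query type scaling might look like in a custom stylesheet. The breakpoint width and the percentages are made-up values, not anything Bootstrap ships. Percentages are used here so the result stays relative to whatever default the user's browser has.

```scss
// Start from the browser default, whatever it is (usually 16px).
html {
  font-size: 100%;
}

// Hypothetical: scale all rem-based sizes up a bit on wide screens.
@media (min-width: 1200px) {
  html {
    font-size: 112.5%; // 1.125 times the browser default; 18px if that default is 16px
  }
}
```

Because Bootstrap 4 expresses font sizes and most spacing in rem, everything defined that way should scale along with this one declaration.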

Line-height in Bootstrap 4

The variable line-height-base is still in Bootstrap 4, and defaults to 1.5.  So in the same ballpark as Bootstrap 3’s 1.428571429, although slightly larger — Bootstrap is no longer worried about making it a round number in pixels when multiplied against a pixel-unit font-size-base.  line-height-base is still set as default line-height for body, now in _reboot.scss (_scaffolding.scss no longer exists).

$line-height-computed, which in Bootstrap 3 was “height in pixel units”, no longer exists in Bootstrap 4. In part because at CSS-writing/compile time, we can’t be sure what it will be in pixels, because it’s up to the browser’s default size.

If we assume a browser default size of 16px, the “computed” line-height is now 24px (16px times 1.5), which is still a nice round number after all.

But by doing everything in terms of rem, it can also change based on media query of course. So if the point of Bootstrap 3 line-height-computed was often to use for whitespace and other page-size calculations based on base font-size, if we want to let base-font-size fluctuate based on a media query, we can’t know the value in terms of pixels at CSS writing time.

Bootstrap docs say:

For easier scaling across device sizes, block elements should use rems for margins.

Font-size dependent whitespace in Bootstrap 4

In Bootstrap 3, line-height-computed (20px for 14px base font; one line height) was often used for a margin-bottom.

In Bootstrap 4, we have a new variable $spacer that is often used. For instance, table now uses $spacer as margin bottom.  And spacer defaults to… 1rem. (Just like font-size-base, but it’s not defined in terms of it; if you want them to match and you change one, you’d have to change the other to match).

alert and breadcrumbs both have their own new variables for margin-bottom, which also both default to: 1rem. Again not in terms of font-size-base, just happen to default to the same thing.

So one notable thing is that Bootstrap 4, relative to base font size, is putting less whitespace in margin-bottom on these elements. In Bootstrap 3, they got the line-height as margin (roughly 1.43 times the font size, 20px for a 14px font-size). In Bootstrap 4, they get 1rem which is the same as the default font-size, so in pixels that’s 16px for the default 16px font-size. Not sure why Bootstrap 4 decided to slightly reduce the separator whitespace here.

All h1-h6 have a margin-bottom of $headings-margin-bottom, which defaults to half a $spacer, so .5rem with the default $spacer of 1rem. (Bootstrap 3 gave h1-h2 ‘double’ margin-bottom).

p uses $paragraph-margin-bottom, now in _reboot.scss. Which defaults to, you guessed it, 1rem.  (note that paragraph spacing in bootstrap 3 was ($line-height-computed / 2), half of a lot of other block element spacing. Now it’s 1rem, same as the rest).

grid-gutter-width is still in pixels, and still 30px, it is not responsive to font size.

list-groups look like they use padding rather than margin now, and it is defined in terms of rem: .75rem in the vertical direction.

So a bunch of white-space separator values that used to be ‘size of line-height’ are now the (smaller) ‘size of font’ (and now expressed in rems).

If you wanted to make them bigger, with the same relation to font/line-height they had in bootstrap 3, you might set them to 1rem * $line-height-base or, to respond properly to any reset of font-size-base, to $font-size-base * $line-height-base. You’d have a whole bunch of variables to reset this way, as every component uses its own variable, and they aren’t defined in terms of each other.
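A sketch of what resetting those variables might look like, placed before Bootstrap 4 is imported. $line-height-rem is a made-up helper name, not a Bootstrap variable, and the import path is an assumption that depends on how Bootstrap is installed. The list of overridden variables covers only the components mentioned above, not every one that exists.

```scss
// Restated here so the arithmetic works before Bootstrap's own defaults load.
$font-size-base:   1rem;
$line-height-base: 1.5;

// One line of text, in rem (1.5rem with the values above).
// Hypothetical helper variable, not part of Bootstrap.
$line-height-rem: $font-size-base * $line-height-base;

// Each component has its own variable, so each is reset separately.
$spacer:                   $line-height-rem;
$paragraph-margin-bottom:  $line-height-rem;
$alert-margin-bottom:      $line-height-rem;
$breadcrumb-margin-bottom: $line-height-rem;

@import "bootstrap/scss/bootstrap"; // path depends on your setup
```

Note that changing $spacer also changes everything Bootstrap derives from it, such as the heading margins and the spacing utility classes, which may or may not be what you want.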

The only thing in Bootstrap 4 that still uses $font-size-base * $line-height-base (actual line height expressed in units, in this case rem units) seems to be in custom_forms, for custom checkbox/radio button styling.

For your own stuff? $spacer and associated multiples

$spacer is probably a good variable to use where before you might have used $line-height-computed, for “standard vertical whitespace used most other places” — but beware it’s now equal to font-size-base, not (the larger) line-height-base.

There are additional spacing utilities, to let you get standard spaces of various sizes as margin or padding, whose values are by default defined as multiples of $spacer. I don’t believe these $spacer values are used internally to bootstrap though, even if the comments suggest they will be. Internally, bootstrap sometimes manually does things like $spacer / 2, ignoring your settings for $spacers.
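A sketch of how those multiples can be customized, assuming the Bootstrap 4 behavior of merging a user-supplied $spacers map into its defaults (0 through 5). The extra key 6 and its value are made-up examples, and the import path depends on your setup.

```scss
$spacer: 1rem;

// Entries here are merged with Bootstrap's default $spacers map.
// A new key adds a size; an existing key (0-5) would override that default.
$spacers: (
  6: ($spacer * 4.5) // hypothetical extra size, used by classes like .mt-6 or .p-6
);

@import "bootstrap/scss/bootstrap"; // path depends on your setup
```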

If you need to do arithmetic with something expressed in rem (like $spacer), and a value expressed in pixels… you can let the browser do it with calc. An expression like calc($spacer - 15px), actually delivered to the browser as calc(1rem - 15px), should work in any recent browser.
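A small sketch of the calc approach. The class name is hypothetical. The #{} interpolation is there so that Sass writes the variable's value into the calc() expression rather than trying to subtract px from rem itself at compile time.

```scss
$spacer: 1rem; // Bootstrap 4 default

// .example-offset is a made-up class name.
.example-offset {
  // Compiles to: margin-top: calc(1rem - 15px);
  margin-top: calc(#{$spacer} - 15px);
}
```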

One more weird thing: Responsive font-sizes?

While off by default, Bootstrap gives you an option to enable “responsive font sizes”, which change themselves based on the viewport size. Not totally sure of the implications of this on whitespace defined in terms of font-size (will that end up responsive too?), it’s enough to make the head spin.

When Innovation Fails: The Swan Song of Gibson Guitars / Lucidworks

Frequently, when it comes to digital transformation, the focus is on which technologies or analytics strategies businesses need to adopt. There’s sometimes an assumption that if companies are committed to innovation, success is just around the corner.

That’s just not true. Let’s take Gibson Guitars as an example.

The Innovation Failure of Gibson

Gibson Guitars was beloved by some of rock’s greatest: Mark Knopfler from Dire Straits, Billy Gibbons from ZZ Top, Martin Barre of Jethro Tull, and Duane Allman of the Allman Brothers are just a few of the famous who play them.

Yet in 2018, at the ripe age of 116, Gibson filed for bankruptcy.

Despite fame and favor, Gibson couldn’t be saved from declining sales — nor from the missteps of former Chairman and CEO Henry Juszkiewicz, who was determined to keep a healthy customer base as guitar enthusiasts aged.

For example, in 2014, Gibson bought Philips’ audio division for $135 million, a move that was not well regarded. As the Los Angeles Times said, “Juszkiewicz traded a slice of Gibson’s soul in an attempt to become more than just the maker of the world’s most beloved guitar. He bought pieces of consumer electronics companies to relaunch Gibson Guitars as Gibson Brands Inc., a ‘music lifestyle’ company. It didn’t work out as planned.”

Juszkiewicz launched several new products — which most of Gibson’s customer base regarded as either unnecessary or of low quality.

“Gibson’s 2015 product line came in for particularly heavy flak. Intended to encourage new players to the brand, Gibson USA’s 2015 electrics sported an automatic-tuning system, wider necks and a brass nut – non-traditional features that were intended to make playing easier,” wrote

This site goes on to blame nostalgia for that backlash, but others say the technology had a serious impact on the quality of the guitars.

The most reviled feature on Internet forums was the G-Force automatic tuning system, which was included on almost all Gibson guitars sold between 2015 and 2017.

“Guitarists seem to be incensed that Gibson would forsake its traditional designs and force unexpected changes down the public’s collective throat,” wrote Bob Cianci in Guitar Noise.

“Juszkiewicz tried diversification,” said the LA Times, but sources seem to agree that diversification was in a direction customers didn’t want.

Innovation Does Not Equal Digital Transformation

Juszkiewicz’s move could be called a failed digital transformation, which is about adding digital capability to your business, so that you can respond to changes in the marketplace. But Gibson’s move was more about adding electronics, not about true transformation.

In fact, Gibson seemed to be making decisions in a vacuum. “There is no way of knowing if their sales systems could communicate with each other,” said Vivek Sriram, guitar enthusiast and CMO of Lucidworks. “But if they were, it was clear management wasn’t listening. Just rolling up sales numbers and exploring social media mentions would have told them they were on the wrong track.”

Fender Goes With a Digital-First Play

Gibson’s rival, at least since 1946, is another famed American-based guitar maker, Fender Musical Instruments Corp. In 2017, Fender launched the online subscription service Fender Play. The Verge describes it as a “guitar lessons web platform and iOS app aimed at getting beginners hooked with visual, bite-sized tutorials.”

Baby boomers are, as the Washington Post said, “looking to shed, not add to, their collections.” Andy Mooney, Fender’s CEO, told the Washington Post that the key to Fender’s continued success “is to get more beginners to stick with an instrument they often abandon within a year.”

“The key to Play is search,” says Sriram. “Students get hooked when they find the songs they want to play.” This online introduction to playing — and hearing recognizable tunes — encourages the budding player to continue. Instead of abandoning the instrument, explains Sriram, the player becomes proficient, then wants a better and more expensive guitar.  

From Physical Product to Digital

Fender Play moves Fender from a physical product company to a digital-inclusive one. This digital-born product is a true digital transformation story: it allows Fender to meet marketplace needs so that the company can continue to thrive.

But that digital effort also includes adapting to a world of online music consumption. “In Fender’s case we think we are just doing a really good job of being a contemporary provider of product and a contemporary marketer of product through predominantly social media and digital channels,” Mooney said to Forbes.

It is also important to note that Fender made some savvy business moves that have increased sales after a dip in the middle of the decade. According to Reuters, Fender “overhauled instrument details like the feel of the frets under a player’s fingers and the electronics that reproduce the guitar’s sound but skipped digital add-ons.”

The company also moved some of its production to Mexico to reduce production costs and keep prices lower for consumers.

And Mooney has also focused marketing efforts on women “after company research revealed nearly half of first-time guitar buyers are women.”

All of this reinforces the idea that digital transformation will fail if companies don’t keep their customers foremost in their minds. Decisions pertaining to consumer preferences must be reinforced with data-driven analytics.

Digital transformation will always be more about culture than technology. Obviously, technology will (and should) play a huge role in any company’s transformation strategy, but companies must always keep the end goal of the transformation in mind: namely, to better serve their existing and potential customers.

The tale of these two guitar makers illustrates that digital transformation is part of the solution — but being agile enough to give customers what they want should be the goal.

Evelyn L. Kent is a content analytics consultant who specializes in building semantic models. She has 20 years of experience creating, producing and analyzing content for organizations such as USA Today, Tribune and McClatchy.

The post When Innovation Fails: The Swan Song of Gibson Guitars appeared first on Lucidworks.

What happened to $grid-float-breakpoint in Bootstrap 4. And screen size breakpoint shift from 3 -> 4. / Jonathan Rochkind

I have an app that customizes Bootstrap 3 stylesheets, by re-using Bootstrap variables and mixins.

My app used the Bootstrap 3 $grid-float-breakpoint and $grid-float-breakpoint-max variables in @media queries, to have ‘complex’ layout ‘collapse’ to something compact and small on a small screen.

This variable isn’t available in bootstrap 4 anymore.  This post is about Bootstrap 4.3.0, and probably applies to Bootstrap 4.0.0 final too. But googling to try to figure out changes between Bootstrap 3 and 4, I find a lot of things written for one of the Bootstrap 4 alphas, sometimes just calling it “Bootstrap 4” — and in some cases things changed pretty substantially between alphas and final. So it’s confusing, although I’m not sure if this is one of those cases. I don’t think people writing “what’s changed in Bootstrap 4” blogs about an alpha release were expecting as many changes as there were before final.

Quick answer

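If in Bootstrap 3 you were doing:

// Bootstrap 3
@media (max-width: $grid-float-breakpoint-max) {
  // CSS rules
}

Then in Bootstrap 4, you want to use this mixin instead:

// Bootstrap 4
@include media-breakpoint-down(sm) {
  // CSS rules
}

If in Bootstrap 3 you were doing:

// Bootstrap 3
@media (min-width: $grid-float-breakpoint) {
  // CSS rules
}

Then in Bootstrap 4, you want to do:

// Bootstrap 4
@include media-breakpoint-up(md) {
  // CSS rules
}

(The two tier names differ because the mixins are inclusive: media-breakpoint-down(sm) means “sm and smaller”, which with the default breakpoints is everything below 768px, and media-breakpoint-up(md) means “md and larger”, 768px and up.)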

If you were doing anything else in Bootstrap 3 with media queries and $grid-float-breakpoint, like doing (min-width: $grid-float-breakpoint-max) or (max-width: $grid-float-breakpoint), or doing any + 1 or - 1 yourself, you probably didn’t mean to be doing that, were doing the wrong thing, and meant to be doing one of these things.

One advantage of the new mixin style is that it makes it a little bit more clear what you are doing: how to apply a style to “just when it’s collapsed” vs “just when it’s not collapsed”.

What’s going on

Bootstrap 3

In Bootstrap 3, there is a variable $grid-float-breakpoint, documented in comments as “Point at which the navbar becomes uncollapsed.”  It is by default set to equal the Bootstrap 3 variable $screen-sm-min, so we have an uncollapsed navbar at “sm” screen size and above, and a collapsed navbar at smaller than ‘sm’ screen size.  screen-sm-min in Bootstrap 3 defaults to 768px.

For convenience, there was also a $grid-float-breakpoint-max, documented as “Point at which the navbar begins collapsing”, which is a bit confusing to my programmer brain: it’s more accurate to say it’s the largest size at which the navbar is still collapsed. (I would say the navbar becomes uncollapsed at $grid-float-breakpoint, one higher than $grid-float-breakpoint-max).

$grid-float-breakpoint-max is defined as ($grid-float-breakpoint - 1) to make that so. So, yeah, $grid-float-breakpoint-max is confusingly one pixel less than $grid-float-breakpoint, which makes the two kind of easy to mix up.
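The definitions and their intended use can be sketched as follows. The variable values are the Bootstrap 3 defaults from its _variables.scss; the .my-widget class and its rules are hypothetical, standing in for whatever a custom stylesheet does at each size.

```scss
// Bootstrap 3 defaults
$screen-sm:                 768px !default;
$screen-sm-min:             $screen-sm !default;
$grid-float-breakpoint:     $screen-sm-min !default;               // 768px
$grid-float-breakpoint-max: ($grid-float-breakpoint - 1) !default; // 767px

// Hypothetical custom styles keyed to the navbar collapse point.
@media (max-width: $grid-float-breakpoint-max) {
  .my-widget { display: block; } // 767px and narrower: navbar is collapsed
}

@media (min-width: $grid-float-breakpoint) {
  .my-widget { display: flex; }  // 768px and wider: navbar is expanded
}
```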

While documented as applying to the navbar, it was also used in default Bootstrap 3 styles in at least one other place, dropdown.scss,  where I don’t totally understand what it’s doing, but is somehow changing alignment to something suitable for ‘small screen’ at the same place navbars break — smaller than ‘screen-sm’.

If you wanted to change the ‘breakdown’ point of navbars, dropdowns, and anything else you may have re-used this variable for, you could just reset the $grid-float-breakpoint variable, and it would then be unrelated to $screen-sm size. Or you could reset $screen-sm size. In either case, the change is global to all navbars, dropdowns, etc.

Bootstrap 4

In Bootstrap 4, instead of just one breakpoint for navbar collapsing, hard-coded at the screen-sm boundary, you can choose to have your navbar break at any of bootstrap’s screen size boundaries, using classes ‘.navbar-expand-sm’, ‘.navbar-expand-lg’, etc. You can now choose different breakpoints for different navbars using the same stylesheet, so long as they correspond to one of the bootstrap defined breakpoints.

‘.navbar-expand-sm’ means “be expanded at size ‘sm’ and above, collapsed below that.”

If you don’t put any ‘.navbar-expand-*’ class on your navbar, it will always be collapsed, always have the ‘hamburger’ button, no matter how large the screen size.

And instead of all dropdowns breaking at the same point as all navbars at $grid-float-breakpoint, there are similar differently-sized responsive classes for dropdowns.  (I still don’t entirely understand how dropdowns change at their breakpoint, have to experiment).

In support of bootstrap’s own code creating all these breakpoints for navbars and dropdowns, there is a new set of breakpoint utility mixins.  These also handily make explicit in their names “do you want this size and smaller” or “do you want this size and larger”, to try to avoid the easy “off by one” errors using Bootstrap 3 variables, where a variable name sometimes left it confusing whether it was the high-end of (eg) md or the low-end of md.

You can also use these utility mixins yourself of course!  breakpoint-min(md) will be the lowest value in pixels that is still “md” size. breakpoint-min(xs) will return sass null value (which often converts to an empty string), because “xs” goes all the way to 0.

breakpoint-max(md) will return a value with px units, that is the largest pixel value that’s within “md” size. breakpoint-max(xl) will return null/””, because “xl” has no max value, it goes all the way up to infinity.

Or you can use the mixins that generate the actual media queries you want, like media-breakpoint-up(sm) (size “sm” and up), or media-breakpoint-down(md) (size ‘md’ and down). Or even the handy media-breakpoint-between(sm, lg) (small to large, inclusive; does not include xs or xl.)
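A sketch of the media-query mixins in use, with the queries they should generate under Bootstrap 4's default breakpoints (sm 576px, md 768px, lg 992px, xl 1200px) noted in comments. The .sidebar class is hypothetical and the import paths are an assumption that depends on how Bootstrap is installed.

```scss
@import "bootstrap/scss/functions";
@import "bootstrap/scss/variables";
@import "bootstrap/scss/mixins";

// .sidebar is a made-up class for illustration.
.sidebar {
  // "md and larger": min-width: 768px
  @include media-breakpoint-up(md) {
    float: left;
  }

  // "sm and smaller": max-width: 767.98px
  @include media-breakpoint-down(sm) {
    float: none;
  }

  // "sm through lg": min-width: 576px and max-width: 1199.98px
  @include media-breakpoint-between(sm, lg) {
    padding: 1rem;
  }
}
```

The fractional maximums come from Bootstrap subtracting .02px from the next tier's minimum, so an "up" query and the "down" query for the tier just below it never both match.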

Some Bootstrap 4 components still have breakpoints hard-coded to a certain responsive size, rather than the flexible array of responsive breakpoint classes. For instance a card has a collapse breakpoint at the bottom of ‘sm’ size, and there’s no built-in way to choose a different collapse breakpoint.  Note how the Bootstrap source uses the media-breakpoint-up utility to style the ‘card’ collapse breakpoint.

Bootstrap 4 responsive sizes shift by one from Bootstrap 3!

To make things more confusing, ‘sm’ in bootstrap 3 is actually ‘md’ in bootstrap 4.

  • Added a new sm grid tier below 768px for more granular control. We now have xs, sm, md, lg, and xl. This also means every tier has been bumped up one level (so .col-md-6 in v3 is now .col-lg-6 in v4)

In Bootstrap 3, ‘sm’ began at 768px. In Bootstrap 4, it’s md that by default begins at 768px. And there’s a new ‘sm’ inserted below 768 — in Bootstrap 4 sm by default begins at 576px. 

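So that’s why, to get the equivalent of Bootstrap 3 (max-width: $grid-float-breakpoint-max), where $grid-float-breakpoint was defined based on “screen-sm-min” in Bootstrap 3 (smaller than ‘sm’), in Bootstrap 4 we need to target everything smaller than md instead. Since the mixin names are inclusive, that is media-breakpoint-down(sm), and the matching “not collapsed” query is media-breakpoint-up(md).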

Customizing breakpoints in Bootstrap 4

The responsive size breakpoints in bootstrap 4 are defined in a SASS ‘map’ variable called grid-breakpoints. You can change these breakpoints, taking some care to mutate the ‘map’ without removing default values, if that is your goal.

If you change them there, you will change all the relevant breakpoints, including the grid utility classes like col-lg-2, as well as the collapse points for responsive classes for navbars and dropdowns. If you change the sm breakpoint, you’ll change the collapse breakpoint for card for instance too.

There’s no way to only change the navbar/dropdown collapse breakpoint, as you could in Bootstrap 3 with $grid-float-breakpoint. On the other hand, you can at least hypothetically (I haven’t tried it or seen it documented) add additional breakpoints if you want, maybe you want something in between md and large, called, uh, I don’t know what you’d call it that wouldn’t be confusing. But in theory all the responsive utilities should work with it, the various built-in *-md-* etc classes should now be joined by classes for your new one (since the built-in ones are generated dynamically), etc. I don’t know if this is really a good idea.
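A minimal sketch of changing a breakpoint, assuming the map is defined in your own stylesheet before Bootstrap 4 is imported so that it replaces the default. The 1024px value is a made-up example and the import path depends on your setup. The whole map is restated here, in ascending order, rather than merged, so that no default tier is dropped and the order stays the one Bootstrap expects.

```scss
// Bootstrap 4 defaults, with one hypothetical change.
$grid-breakpoints: (
  xs: 0,
  sm: 576px,
  md: 768px,
  lg: 1024px, // changed from the default 992px
  xl: 1200px
);

@import "bootstrap/scss/bootstrap"; // path depends on your setup
```

An additional tier would presumably be added the same way, as a new key placed at the right position in the map, but as noted above that is untested here.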

Creating accessible digital exhibits – a conversation / HangingTogether

Photo by Jan Tinneberg on Unsplash

Recently I hosted a virtual conversation, open to staff at all OCLC Research Library Partnership institutions about creating interactive digital exhibits that are accessible and inclusive.  “Accessible and inclusive” in this context means that Web sites, tools, and technologies are designed and developed so that people with disabilities and learning differences can use them.  Over the course of an hour-long discussion, we compared notes, shared strategies and successes, pooled our uncertainties, and brainstormed about what’s needed in the way of tools, documentation, and collaborative infrastructure.  Participants included librarians from Brandeis University, Emory University, New York Public Library, Penn State University, Temple University, and the University of Minnesota, along with two of my OCLC Research colleagues, Merrilee Proffitt and Chela Weber. 

Web site accessibility is not an area where OCLC RLP staff have expertise, but we do have the ability to reach out across the OCLC RLP membership and bring together staff at affiliated institutions to address issues and challenges of common interest.  Several RLP members have indicated to us that creating accessible interactive digital exhibits is indeed one of those shared challenges. 

Here are our top takeaways from the group conversation.

Web accessibility is an emerging challenge requiring specialized skills and knowledge not always readily available to library staff members or to others within the institution.

Representing distinctive and unique collections online is a top priority for all the libraries represented on the call.  The threat of legal action because of accessibility issues raises the stakes and the level of urgency.  But as one conversation participant put it, “The biggest deterrent to creating [accessible] online exhibits is not having the expertise and the resources.”  One library makes knowledge more widely available by offering office hours with accessibility experts.  Another has embedded an assistive technology specialist within the library IT unit.  Others advocate for leveraging campus expertise outside the library in order to ensure accessibility: “[Our university has] an accessibility office that’s really helpful…We have developers in that area as well.  I think we have a lot of the expertise here, but it’s just pulling it all together and figuring out what technologies we’re going to be using.”

Web accessibility is being addressed by individual institutions, consortia, and professional organizations, but information about these efforts is scattered and not easy to find

Institutions represented on the call are working largely on their own to address the challenges around creating accessible digital exhibits and issues of accessibility in general.  Approaches vary, in method and in scope.  One library is looking into hiring Web developers trained in accessibility.  Another has formed a committee to look at all aspects of creating online content, asking, “How can we provide that really curated experience while also making it accessible, providing an equal experience for all of our users?”  Another participant’s library changed its Web site workflow so that all materials are reviewed for accessibility: “Now we have one main content person who is a content strategist… We just work with people to edit the content on the Web pages, where they are the subject matter experts, but we’re doing the content editing, to make sure it’s accessible.”  And one university is considering a much broader approach to accessibility, namely restructuring “the way they think about how they’re using resources so that they are universally designed from the beginning.” 

There is no obvious place to come together and compare notes on such efforts. One participant joined a Canadian Web accessibility guidelines working group after not finding something closer to home.  Others pointed to listservs and interest groups that have sprung up under various organizations, such as the American Library Association’s uniaccess listserv and Educause’s IT Accessibility Community Group.  The Web Accessibility Initiative was named as a key resource and potential collaborator.  Some suggested the Digital Library Federation, the Digital Public Library of America, and the Museum Computer Network as potential players and rallying points for accessible digital exhibit efforts. 

Many of the systems in use in our libraries do not adequately support accessibility and inclusion

This idea was universally acknowledged during the session.  One participant said, “There has to be a cultural shift around expectations.  I think if you put pressure on…vendors up front, that would make a huge difference in terms of the accessibility.”  One library represented on the call has included language in their RFPs stating that any software to be purchased must be vetted by the accessibility office and IT: “Any exception would come with an alternate access plan…and something in writing from the vendor saying that accessibility is on their roadmap.” 

There is a persistent myth that you sacrifice cool and engaging web design if you embrace accessibility

Several call participants pushed back passionately against that myth: “That really is determined by how experienced your developers are in accessible design.  Are your developers accessibility certified, and do they have years of experience with that?  When we hire people in IT, I don’t think we always ask that that be a well-developed skill.”

There are lightweight approaches to creating compelling curated online experiences that may offer alternatives to developing full-blown exhibitions

Inspired by work that the Digital Public Library of America has done with primary source sets, the Minnesota Digital Library has created a set of “curated collections of materials where there’s kind of an introductory blurb that gives people some context and some understanding of the topic… It’s just a Drupal page, it’s not difficult to create…This has been a good way for us to get into digital storytelling without getting into the complexity and the time involved in creating an exhibit.”  Call participants were impressed with the results, with one exclaiming, “What I like about this is that it’s truly digital – the user gets to curate their own direction.” 

Accessibility goes well beyond Web sites – we need to consider everything that is viewed on a computer, including communications sent through email marketing platforms

Of course, accessible online exhibits represent only a tiny fraction of the accessibility and inclusion challenges faced by libraries and their parent institutions.  As one participant noted, “I think everyone should think of accessibility as a possible systemic issue.”

Heightening awareness and providing training on the basics of accessibility would go a long way toward addressing the overall challenge: “I think that in addition to training for developers we also need broader training for creators…I think that building greater awareness throughout the institution will be really key.  That is also going to help with people understanding the need for resource allocation and time for planning for a project.”  Some are already providing basic instruction at their institutions, to those who avail themselves of it: “I teach a workshop on designing accessible and inclusive digital projects…and staff do attend.  But, with the inconsistency in knowledge, it can become very challenging…Training staff is a preventative so that, as they move forward, they cease to do things a certain way…Fixing that would go some way toward fixing some of these other problems as well.”

Bottom line: the more general understanding there is about accessibility, what it means, where the work is being done, and where to go for resources, the better off libraries are

This conversation among OCLC Research Library Partnership members, and our reporting out of the highlights, is a step along that path.

The post Creating accessible digital exhibits – a conversation appeared first on Hanging Together.

Islandoracon Full Schedule Now Available / Islandora

The complete five-day schedule for Islandoracon (October 7-11, 2019) is now available! Please join us and our 43 speakers and workshop leaders for a week of the best in Islandora and an opportunity to get together and share.

Registration is open and the Early Bird discount is on until July 1st.

The schedule at-a-glance:

  • Monday, October 7th: Pre-Conference and Half-Day Workshops
  • Tuesday, October 8th: Main Conference Sessions
  • Wednesday, October 9th: Main Conference Sessions
  • Thursday, October 10th: Main Conference Workshops
  • Friday, October 11th: Islandora 8 Use-a-Thon (Hackfest)


View the Islandoracon schedule & directory.

Glitter, glitter, everywhere / Casey Bisson

Near the entrance, metal shelves taller than a man were laden with over one thousand jumbo jars of glitter samples arranged by formulation, color, and size: emerald hearts, pewter diamonds, and what appeared to be samples of the night sky collected from over the Atlantic Ocean. There were neon sparkles so pink you have only seen them in dreams, and rainbow hues that were simultaneously lilac and mint and all the colors of a fire.

OpenSRF 3.0.3 and OpenSRF 3.1.1 released / Evergreen ILS

We are pleased to announce bugfix releases (3.0.3 and 3.1.1) of OpenSRF, a message routing network that offers scalability and failover support for individual services and entire servers with minimal development and deployment overhead.

OpenSRF 3.0.3 and 3.1.1 include a performance improvement; all users of OpenSRF 3.0.x and 3.1.x are encouraged to upgrade as soon as possible.

The following bugs are fixed in OpenSRF 3.0.3 and 3.1.1:

  • LP#1824181 and LP#1824184: Improve the performance of certain logging statements. OpenSRF application code written in Perl can now pass a subroutine reference to a logging statement instead of a string. This means that complicated expressions that generate the text of a log message are not run unless actually needed at the current logging level. For example, a logging statement of

    $logger->debug('message')

    can now alternatively be represented as

    $logger->debug(sub { return 'message' })

    OpenSRF now uses this mechanism for a debug logging statement in method_lookup(). This has the effect of reducing the time to run some methods in Evergreen by 90%.
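To make the deferred-evaluation idea concrete, here is a minimal sketch in Python rather than Perl. The `LazyLogger` class and `expensive_dump` function are hypothetical stand-ins, not part of OpenSRF; the point is only that a callable passed as the message is never invoked when the level is disabled.

```python
DEBUG, INFO = 10, 20


class LazyLogger:
    """Hypothetical logger that accepts a string or a zero-argument callable."""

    def __init__(self, level):
        self.level = level
        self.lines = []

    def debug(self, message):
        if self.level > DEBUG:
            return  # debug is disabled, so a callable message is never run
        text = message() if callable(message) else message
        self.lines.append("DEBUG " + text)


calls = 0


def expensive_dump():
    """Stand-in for a costly expression, e.g. serializing a large structure."""
    global calls
    calls += 1
    return "big structure"


quiet = LazyLogger(level=INFO)
quiet.debug(expensive_dump)    # skipped: expensive_dump is not called

verbose = LazyLogger(level=DEBUG)
verbose.debug(expensive_dump)  # called exactly once
```

Passing the function itself (not its result) is what saves the work: with a plain string argument, the expensive expression would be evaluated before the logger ever checked its level.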

OpenSRF 3.1.1 also marks the end of formal installation support for Ubuntu 14.04 Trusty Tahr, which is no longer supported by Ubuntu.

To download OpenSRF, please visit the downloads page.

We would also like to thank the following people who contributed to the releases:

  • Galen Charlton
  • John Merriam
  • Ben Shum
  • Jason Stephenson

On the Response to My Atlantic Essay on the Decline in the Use of Print Books in Universities / Dan Cohen

I was not expecting—but was gratified to see—an enormous response to my latest piece in The Atlantic, “The Books of College Libraries Are Turning Into Wallpaper,” on the seemingly inexorable decline in the circulation of print books on campus. I’m not sure that I’ve ever written anything that has generated as much feedback, commentary, and hand-wringing. I’ve gotten dozens of emails and hundreds of social media messages, and The Atlantic posted (and I responded in turn to) some passionate letters to the editor. Going viral was certainly not my intent: I simply wanted to lay out an important and under-discussed trend in the use of print books in the libraries of colleges and universities, and to outline why I thought it was happening. I also wanted to approach the issue both as the dean of a library and as a historian whose own research practices have changed over time.

I think the piece generated such a large response because it exposed a significant transition in the way that research, learning, and scholarship happen, and what that might imply for the status of books and the nature of libraries. These topics often touch a raw nerve, especially at a time when popular works extol libraries (I believe correctly) as essential civic infrastructure.

But those works focus mostly on public libraries, and this essay focused entirely on research libraries. People are thankfully still going to and extensively using libraries, both research and public (there were over a billion visits to public libraries in the U.S. last year), but they are doing so in increasingly diversified ways.

The key lines of my essay were these:

“The decline in the use of print books at universities relates to the kinds of books we read for scholarly pursuits rather than pure pleasure…A positive way of looking at these changes is that we are witnessing a Great Sorting within the [research] library, a matching of different kinds of scholarly uses with the right media, formats, and locations.”

Although I highlighted statistics from Yale and the University of Virginia (which, alas, was probably not very kind to my friends at those institutions, although I also used stats from my own library at Northeastern University), the trend I identified seems to be very widespread. Although I only mentioned specific U.S. research libraries, my investigations showed that the same decline in the use of print collections is happening globally, albeit not necessarily universally. In most of the libraries I examined, or from data that was sent to me by colleagues at scores of universities, the circulation of print books within research libraries is declining at about 5-10% per year per student (or FTE).

For example, in the U.K. and Ireland, over the three years between the 2013-14 school year and the 2016-17 school year, the circulation of print books per student declined by 27%, according to the Society of College, National and University Libraries (SCONUL), which represents all university libraries in the U.K. and Ireland. Meanwhile, SCONUL reports that visits to these libraries have actually increased during this period. (SCONUL’s other core metric, print circulations per student visit to the library, has thus declined even more, by 33% over three years.) Similarly, the Canadian Association of Research Libraries (CARL), which maintains the statistics for university libraries in Canada, notes that during these same three years, the average yearly print circulation at their member libraries dropped from 200,000 to 150,000 books, and their per-student circulation number also dropped by 25%.

Again, this is just over three recent years. The decline becomes even more severe as one goes further back in time. In the 2005-6 school year, the average Canadian research library circulated 30 books per student, which slid to 25 in 2008-9; by 2016-17 that number was just 5. Readers of my article were shocked that UVA students had only checked out 60,000 books last year, compared to 238,000 a decade ago, but had I gone all the way back in the UVA statistics to two decades ago, the comparison would have been even more stark. The total circulation of books in the UVA library system was 1,085,000 in 1999-2000 and 207,000 in 2016-17. Here’s the overall graph of print circulation (in “initial circs,” which do not include renewals) from the Association of Research Libraries (U.S.), showing a 58% decline between 1991 and 2015, but an even larger decline since Peak Book and an even larger decline on a per student basis, since during this same period the student body at these universities increased 40%.
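For readers who want to check the arithmetic, here is a small sketch that turns the figures quoted above into percentage declines. The per-student estimate for the ARL data is my own back-of-the-envelope inference from the 58% and 40% figures, not a number reported by ARL.

```python
def percent_decline(before, after):
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# UVA total circulation, 1999-2000 vs. 2016-17
uva_total = percent_decline(1_085_000, 207_000)

# UVA student checkouts, a decade ago vs. last year
uva_students = percent_decline(238_000, 60_000)

# Average yearly print circulation at CARL member libraries over three years
carl_average = percent_decline(200_000, 150_000)


def per_capita_decline(total_decline_pct, population_growth_pct):
    """Decline per person when the total falls while the population grows."""
    remaining = (1 - total_decline_pct / 100) / (1 + population_growth_pct / 100)
    return (1 - remaining) * 100


# ARL: 58% total decline while the student body grew 40% (inferred estimate)
arl_per_student = per_capita_decline(58, 40)

print(round(uva_total, 1), round(uva_students, 1),
      round(carl_average, 1), round(arl_per_student, 1))
```

Under these assumptions the UVA total falls by roughly 81%, and the ARL per-student decline works out to about 70%, noticeably steeper than the 58% headline figure.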

These longer time frames underline how this is an ongoing, multi-decade shift in the ways that students and faculty interact with and use the research library. All research libraries are experiencing such forces and pressing additional demands—the need for new kinds of services and spaces as well as the surging use of digital resources and data—while at the same time continuing to value physical artifacts (archives and special collections) and printed works. It’s a very complicated, heterogeneous environment for learning and scholarship. Puzzling through the correct approach to these shifts, rather than ignoring them and sticking more or less with the status quo, was what I was trying to prod everyone to think about in the essay, and if I was at all successful, that’s hopefully all to the good.

Average Order Value Down? Endeca Could Be the Culprit / Lucidworks

If you are seeing your Average Order Value (AOV) flattening or declining, and you’re still using Endeca, your search engine could be to blame.

AOV is total revenue divided by number of orders. If your AOV is dropping, that means your loyal customers are spending less and less with each purchase. That’s a real problem, says Richard Isaac, CEO of RealDecoy, a leading enabler of ecommerce for brands like American Express, Samsung, Honeywell, and Coach, and a long-time integrator of Endeca.
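As a quick illustration of that formula, here is a minimal sketch. The order totals are invented for the example; they show how revenue can hold steady while AOV falls.

```python
def average_order_value(order_totals):
    """AOV = total revenue / number of orders."""
    if not order_totals:
        raise ValueError("need at least one order to compute AOV")
    return sum(order_totals) / len(order_totals)


# Hypothetical order totals: revenue is 300 in both periods
last_quarter = [120.0, 80.0, 100.0]
this_quarter = [90.0, 70.0, 80.0, 60.0]

aov_before = average_order_value(last_quarter)
aov_after = average_order_value(this_quarter)
```

In this made-up case the same revenue is spread over more orders, so each order is worth less on average.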

“In my experience, most retailers overinvest in customer acquisition while under-investing in growing average order value and conversion rates,” says Isaac. He estimated that the cost to acquire a new customer is 9x the cost of retaining an existing one.

“Top retailers agree that digital commerce search has a huge impact on their cost of service, average order value, and customer loyalty. But many of them struggle with obstacles like budgets and technology that does not (and may never) meet their needs. Most importantly, they lack a data-driven approach to making enhancements,” he explains.

Today’s ecommerce shoppers expect search to be personal and precise–to provide them with exactly the info they need, when they need it.

Customers are not really interested in your top 10 guesses about what they might want. They just want the top one or two items that fit exactly what they need. And when that doesn’t happen, customers get frustrated.

Today’s retailers try to meet this demand for a personal approach to ecommerce with more words. They provide catalogs with many thousands of products that have complex and specific descriptions, hoping that customers will have the patience to wade through it and find what they need. But what if they do not show such patience?  

Endeca’s Keyword Search Is No Longer Enough

Endeca’s simple keyword search is no longer enough to keep up with the volume, velocity, and variety of products and words that retailers must manage if they don’t want to leave money on the table.

Most teams using Endeca know they must act smarter and faster, yet Oracle isn’t innovating the Endeca product and support is inadequate. The question isn’t “whether” to transition, but when and how. When do the benefits of switching outweigh the costs and potential for disruption? That time has come.

Average Order Value and Conversion Success

Leading retailers like Lenovo have recently upgraded their search capabilities. As Marc Desormeau, Lenovo’s Senior Manager of Digital Customer Experience, put it, “Since the migration [to Fusion from Endeca] we’ve seen a 50-percent increase in conversion rates and other key success metrics for transactional revenue.”

So what do folks like Desormeau and Isaac look for in a modern solution? First off, they want newer and better search algorithms that allow customers to find what they need on the first try. Secondly, they seek a modern scalable architecture that allows them to do more with their data and handle changes in real time. If they need a full re-index of their catalog and site content, they want it to take minutes, not hours. Just like you, they want their customers to find what they need, when they need it, however they describe it!

Retail Trends in 2019

With retailers scrambling to try and find ways to improve the customer experience, don’t forget to look at site search analysis. The right analysis will help you figure out user intent.

So the right solution should learn from customer search behavior. As customers click on different results, those results should be boosted. If an individual customer trends towards certain types of products, they should automatically see more of what they’re interested in, without a merchandiser having to create a specific rule or customer segment.
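To give a rough feel for what "boosting clicked results" can mean, here is a toy sketch. It is not how Fusion or any particular product computes relevance; the scoring formula, the `weight` parameter, and the sample data are all assumptions made for illustration.

```python
import math


def rerank(results, click_counts, weight=0.5):
    """Re-order (doc_id, base_score) pairs, boosting frequently clicked docs.

    The boost grows with the log of the click count so that a handful of
    very popular items cannot completely swamp the base relevance score.
    """
    def boosted(item):
        doc_id, base_score = item
        clicks = click_counts.get(doc_id, 0)
        return base_score * (1 + weight * math.log1p(clicks))

    return sorted(results, key=boosted, reverse=True)


# Hypothetical search results and aggregated click signals
results = [("laptop-a", 1.0), ("laptop-b", 0.9), ("laptop-c", 0.8)]
clicks = {"laptop-c": 40, "laptop-b": 2}

reranked_ids = [doc_id for doc_id, _ in rerank(results, clicks)]
```

Here the item customers actually click on most rises to the top without anyone writing a merchandising rule for it.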

Turning on and Configuring Signals

Finally, Endeca users have relied on thousands of difficult-to-maintain merchandising rules in order to personalize, customize, and optimize the customer experience. Modern solutions use AI and machine learning techniques to do the heavy lifting, and reserve predictive marketing tools (including rules) for the more specific tuning. This means tens or hundreds of rules instead of thousands. And it means a lot less drudgery for merchandising teams.

Predictive Merchandising in Fusion 4.2

Retailers who aren’t moving from Endeca quickly enough will not be able to take advantage of signals, relevancy tuning, and machine learning. They will be leaving money on the table. In fact, those that do move typically capture enough additional revenue to pay for the migration in as little as a few months.

The post Average Order Value Down? Endeca Could Be the Culprit appeared first on Lucidworks.

Statement from the Open Knowledge Foundation Board on the future of the CKAN Association / Open Knowledge Foundation

The Open Knowledge Foundation (OKF) Board met on Monday evening to discuss the future of the CKAN Association.

The Board supported the CKAN Stewardship proposal jointly put forward by Link Digital and Datopian. As two of the longest serving members of the CKAN Community, it was felt their proposal would now move CKAN forward, strengthening both the platform and community.

In appointing joint stewardship to Link Digital and Datopian, the Board felt there was a clear practical path with strong leadership and committed funding to see CKAN grow and prosper in the years to come.

OKF will remain the ‘purpose trustee’ to ensure the Stewards remain true to the purpose and ethos of the CKAN project. The Board would like to thank everyone who contributed to the deliberations and we are confident CKAN has a very bright future ahead of it.

If you have any questions, please get in touch with Steven de Costa, managing director of Link Digital, or Paul Walsh, CEO of Datopian, by emailing

Islandora 8 Now Available / Islandora

The Islandora Foundation is pleased to announce the immediate availability of Islandora 8 version 1.0.0!  This is an important milestone for the Islandora project, and is a testament to our wonderful and vibrant community.  Built using Drupal 8 and Fedora 5, Islandora 8 faithfully integrates the two as invisibly as possible, giving an experience that is both more Drupal-y and more Fedora-y at the same time.  Islandora 8 unlocks all of Drupal's features along with its entire ecosystem of contributed modules, all the while quietly preserving your metadata in a Fedora 5 repository behind the scenes.  It truly is the best of both worlds.

If you would like to try Islandora 8 for yourself, we have three options for you:

  1. A sandbox of version 1.0.0 is available to play with at
  2. A virtualbox VM is available for download here.
  3. You can install a development or production environment with our Ansible playbook, which has a corresponding 1.0.0 release.
Islandora 8's documentation is stored in markdown on Github, with contributions welcome.  If you would like to contribute a use case or file a bug, please see our issue queue.

The Islandora Foundation is committed to providing utilities for Islandora 7 repositories to make migration as painless as possible.  All existing Islandora 7 users are encouraged to evaluate our migration tools and provide us with feedback.  We are dedicated to working with everybody to make sure we all move forward together!

Here's a list of all the features currently available with the 1.0.0 release (including those that come for free from contributed modules):

  • Model content using core Drupal entities and fields
  • Out of the box support for
    • Collections
    • Images
    • Audio
    • Video
    • PDF
    • Binaries
  • Control how content is displayed using the UI
  • Configure forms for content using the UI
  • Categorize content using taxonomy terms
  • Expandable file storage
    • Drupal's public file system
    • Multiple private file systems using flysystem (check this link for a full list of supported adapters)
      • Fedora
      • Local or networked storage
      • Sftp
      • AWS S3
      • and more…
    • Basic CRUD operations with Drupal REST and JSON
    • Read-only JSONLD serialization
    • Extensive use of Link headers for discoverability
    • Add files to objects with PUT
  • Solr search (using search_api_solr)
    • Configure search index through the UI
  • Custom viewers
    • Openseadragon
    • PDF.js using the pdf module
  • Custom field types
    • Extended Date Time Format (EDTF)
    • Typed Relation
    • Authority Link
  • Custom entities for:
    • People
    • Families
    • Organizations
    • Locations
    • Subjects
  • Derivatives
    • Convert / transform images (or just use Drupal image styles!)
    • Extract images from PDFs
    • Extract images from Video
    • Convert audio formats
    • Convert video formats
    • All derivative operations have forms and can be configured through the UI
  • Access control
  • Control repository events through the UI using the context module
    • Index RDF in Fedora
    • Index RDF in a Triplestore
    • Derivatives
    • Switching themes
    • Switching displays/viewers
    • Switching forms
    • And much much more....
  • Multi-lingual support
    • Translated content is included in metadata and indexed in both Fedora and the Triplestore with proper language tags
    • The user interface can be translated to languages other than English
  • Bulk ingest using CSVs
  • Views!  You can filter, sort, display, and otherwise manipulate lists of content in all kinds of ways.  For example:
    • Make a browse by collections page (see this example on the sandbox, which can be customized here).
    • Make an image gallery (see this example on the sandbox, which can be customized here)
    • Make a slideshow (see this example on the sandbox, which can be customized here) using views_slideshow.
    • Put pins on a Google map using the geolocation module
    • Execute actions in bulk on views results using the views_bulk_operations module
      • Re-index content
      • Re-generate derivatives
      • And everything else you can do with Drupal actions (pretty much anything!)
    • Bulk edit metadata using views_bulk_edit (see this example on the sandbox, which can be customized here)
If there are any features that are missing that you consider to be requirements for adoption, we will be polling the community to find out what features to build next.  Your input is valued and you are encouraged to participate in the upcoming poll.

This software is made possible by volunteer contributions from community members and organizations. Development, documentation, and testing are all significant undertakings that require time and effort.  We thank each and every one of the people who have helped us deliver this software, to whom we owe a debt of infinite gratitude.

  • Aaron Coburn
  • Adam Soroka
  • Alan Stanley
  • Alex Kent
  • Alexander O’Neill
  • Amanda Lehman
  • Andrija Sagic
  • Ann McShane
  • Benjamin Rosner
  • Bethany Seeger
  • Brad Spry
  • Brian Woolstrum
  • Bryan Brown
  • Caleb Derven
  • Cara Key
  • Carolyn Moritz
  • Cillian Joy
  • Courtney Matthews
  • Cricket Deane
  • David Thorne
  • Diego Pino
  • Don Richards
  • Eli Zoller
  • Favenzio Calvo
  • Frederik Leonhardt
  • Gavin Morris
  • Janice Banser
  • Jared Whiklo
  • Jason Peak
  • John Yobb
  • Jonathan Green
  • Jonathan Hunt
  • Jonathan Roby
  • Kim Pham
  • Marcus Barnes
  • Mark Jordan
  • Meghan Goodchild
  • Mike Bolam
  • Minnie Rangel
  • Natkeeran Kanthan
  • Nick Ruest
  • Noah Smith
  • Pat Dunlavey
  • Paul Clifford
  • Paul Pound
  • Pete Clarke
  • Rachel Leach
  • Rachel Tillay
  • Rosie Le Faive
  • Seth Shaw
  • Suthira Owlarn
  • Yamil Suarez
We would also like to acknowledge the initial financial support from the following institutions, which got development started back in 2015:
  • The American Philosophical Society
  • Common Media Inc. (Born-Digital)
  • discoverygarden inc.
  • McMaster University
  • PALS
  • Simon Fraser University
  • University of Limerick
  • University of Manitoba
  • University of Prince Edward Island
  • York University

New RIAMCO website / Brown University Library Digital Technologies Projects

A few days ago we released a new version of the Rhode Island Archival and Manuscript Collections Online (RIAMCO) website. The new version is a brand new codebase. This post describes a few of the new features that we implemented as part of the rewrite and how we designed the system to support them.

The RIAMCO website hosts information about archival and manuscript collections in Rhode Island. These collections (also known as finding aids) are stored as XML files using the Encoded Archival Description (EAD) standard and indexed into Solr to allow for full text searching and filtering.

Look and feel

The overall look and feel of the RIAMCO site is heavily influenced by the work that the folks at the NYU Libraries did on their site. Like NYU’s site and Brown’s Discovery tool, the RIAMCO site uses the typical facets-on-the-left, content-on-the-right style that is common in many library and archive websites.

Below is a screenshot of the main search page:


Our previous site was put together over many years and involved many separate applications written in different languages: the frontend was written in PHP, the indexer in Java, and the admin tool in Python/Django. During this rewrite we bundled the code for the frontend and the indexer into a single application written in Ruby on Rails. We have plans to bundle the admin tool into the new application as well, but we haven’t done that yet.

You can view a diagram of this architecture and a few more notes about it in this document.


Like the previous version of the site, we are using Solr to power the search feature of the site. However, in the previous version each collection was indexed as a single Solr document whereas in the new version we are splitting each collection into many Solr documents: one document to store the main collection information (scope, biographical info, call number, et cetera), plus one document for each item in the inventory of the collection.

This new indexing strategy significantly increased the number of Solr documents that we store. We went from 1100+ Solr documents (one for each collection) to 300,000+ Solr documents (one for each item in the inventory of those collections).

The advantage of this approach is that now we can search and find items at a much more granular level than we did before. For example, we can tell a user that we found a match on “Box HE-4 Folder 354” of the Harris Ephemera collection for their search on blue moon rather than just telling them that there is a match somewhere in the 25 boxes (3,000 folders) in the “Harris Ephemera” collection.

In order to keep the relationship between all the Solr documents for a given collection we are using an extra ead_id_s field to store the id of the collection that each document belongs to. If we have a collection “A” with three items in the inventory they will have the following information in Solr:

{id: "A", ead_id_s: "A"} // the main collection record
{id: "A-1", ead_id_s: "A"} // item 1 in the inventory
{id: "A-2", ead_id_s: "A"} // item 2 in the inventory
{id: "A-3", ead_id_s: "A"} // item 3 in the inventory

This structure allows us to use the Result Grouping feature in Solr to group results from a search into the appropriate collection. With this structure in place we can then show the results grouped by collection as you can see in the previous screenshot.
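As a rough sketch of what this looks like in practice (in Python rather than the Ruby the site is written in), the snippet below builds a query string using Solr's standard Result Grouping parameters (`group`, `group.field`, `group.limit`) and then mimics locally what grouping by the `ead_id_s` field produces. The helper names and the `per_group` value are assumptions for illustration; this is not the RIAMCO code.

```python
from collections import defaultdict
from urllib.parse import urlencode


def grouped_query_params(query, group_field="ead_id_s", per_group=4):
    """Build a query string asking Solr to group hits by collection."""
    return urlencode({
        "q": query,
        "group": "true",             # turn on Result Grouping
        "group.field": group_field,  # one group per collection id
        "group.limit": per_group,    # max documents returned per group
    })


def group_by_collection(docs, field="ead_id_s"):
    """Local illustration of grouping: collection id -> matching doc ids."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[field]].append(doc["id"])
    return dict(groups)


params = grouped_query_params("blue moon")

docs = [
    {"id": "A", "ead_id_s": "A"},
    {"id": "A-1", "ead_id_s": "A"},
    {"id": "B-7", "ead_id_s": "B"},
]
groups = group_by_collection(docs)
```

Because every document carries the id of its collection, a single field is enough to pull a collection record and its inventory items back together at query time.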

The code to index our EAD files into Solr is in the Ead class.

We had to add some extra logic to handle cases when a match is found only on a Solr document for an inventory item (but not on the main collection) so that we can also display the main collection information alongside the inventory information in the search results. The code for this is in the search_grouped() function of the Search class.
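The general shape of that extra logic can be sketched as follows. This is a simplified illustration in Python, not the actual search_grouped() implementation; `attach_collection_records` and `fetch_collection` are hypothetical names, and the sketch assumes (as in the example above) that the main collection record's id equals its collection id.

```python
def attach_collection_records(groups, fetch_collection):
    """Make sure every group of hits carries its main collection record.

    groups: {collection_id: [matching Solr docs]}
    fetch_collection: hypothetical lookup used when the main record did not
    itself match the query and so is missing from the group.
    """
    completed = {}
    for ead_id, docs in groups.items():
        # The main record is the document whose id is the collection id.
        main = next((d for d in docs if d["id"] == ead_id), None)
        if main is None:
            main = fetch_collection(ead_id)  # extra lookup for this group
        items = [d for d in docs if d["id"] != ead_id]
        completed[ead_id] = {"collection": main, "items": items}
    return completed


# Hypothetical data: the query matched only an inventory item of collection A
index = {"A": {"id": "A", "ead_id_s": "A"}}
hits = {"A": [{"id": "A-2", "ead_id_s": "A"}]}

results = attach_collection_records(hits, index.__getitem__)
```

With the main record attached, the results page can show the collection title and summary next to the specific box or folder that matched.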

Hit highlighting

Another feature that we implemented on the new site is hit highlighting. Although this is a feature that Solr supports out of the box, there is some extra coding that we had to do to structure the information in a way that makes sense to our users. In particular, things get tricky when the hit was found in a multi-value field or when Solr only returns a snippet of the original value in the highlights results. The logic that we wrote to handle this is in the SearchItem class.

Advanced Search

We also did an overhaul to the Advanced Search feature. The layout of the page is very typical (it follows the style used in most Blacklight applications) but the code behind it allows us to implement several new features. For example, we allow the user to select any value from the facets (not only one of the first 10 values for that facet) and to select more than one value from those facets.

We also added a “Check” button to show the user what kind of Boolean expression would be generated for the query that they have entered. Below is a screenshot of the results of the check syntax for a sample query.

There are several tweaks and optimizations that we would like to do on this page. For example, opening the facet by Format is quite slow and could be optimized. Also, the code to parse the expression could be written to use a more standard Tokenizer/Parser structure. We’ll get to that later on… hopefully : )

Individual finding aids

Like on the previous version of the site, the rendering of individual finding aids is done by applying XSLT transformations to the XML with the finding aid data. We made a few tweaks to the XSLT to integrate them on the new site but the vast majority of the transformations came as-is from the previous site. You can see the XSLT files in our GitHub repo.

It’s interesting that GitHub reports that half of the code for the new site is XSLT: 49% XSLT, 24% HTML, and 24% Ruby. Keep in mind that these numbers do not take into account the Ruby on Rails code (which is massive.)

Source code

The source code for the new application is available in GitHub.


Although I wrote the code for the new site, there were plenty of people who helped me along the way in this implementation, in particular Karen Eberhart and Joe Mancino. Karen provided the specs for the new site, answered my many questions about the structure of EAD files, and suggested many improvements and tweaks to make the site better. Joe helped me find the code for the original site and indexer, and set up the environment for the new one.

Registration NOW OPEN for DLF Forum, Learn@DLF, and NDSA’s Digital Preservation! / Digital Library Federation

The time has come! We are delighted to announce the opening of registration for the 2019 DLF Forum, Learn@DLF, and Digital Preservation 2019: Critical Junctures, taking place October 13-17 in Tampa, Florida. Be among the first to secure the early bird rate and start planning for yet another memorable week with DLF.


Register today! (button)


  • The DLF Forum (#DLFforum, October 14-16), our signature event, welcomes digital library practitioners and others from member institutions and the broader community, for whom it serves as a meeting place, marketplace, and congress. The event is a chance for attendees to present work, meet with other DLF working group members, and share experiences, practices and information. Learn more here:


  • Learn@DLF (#learnatdlf, October 13) is our dedicated pre-conference workshop day for digging into tools, techniques, workflows, and concepts. Through engaging, hands-on sessions, attendees will gain experience with new tools and resources, exchange ideas, and develop and share expertise with fellow community members. Learn more here:



The full program for the DLF Forum and DigiPres will be released in the coming weeks, but we are delighted to share the Learn@DLF schedule today. Check it out, and consider attending our fabulous pre-conference workshop day, now in its second year.


Need some assistance getting to the DLF Forum? Our Fellowship Application is open for just a few more days. Check out all of the different opportunities we are offering this year and submit your application by our approaching deadline Monday, June 10.


It’s never too early. Register now to join us!


The post Registration NOW OPEN for DLF Forum, Learn@DLF, and NDSA’s Digital Preservation! appeared first on DLF.

It's 2019, and we need to fight for the future of the internet / Casey Bisson

There are obviously conflicting opinions about how to piece together new and complex regulation, legislation, or tech innovation. But this has been true throughout history whenever a new idea begins to be broadly adapted. Before the internet, we had to figure out how to manage cars and electricity and steam power and even the use of the written word (which many, including Socrates, actually argued against). The internet is no different.

Invitation: Please Come to a DuraSpace Reception In Hamburg, Germany / DuraSpace News

Are you traveling to Hamburg, Germany for the Open Repositories Conference (OR2019)? Please join your DuraSpace colleagues Tim Donohue, Michele Mennielli, Andrew Woods and David Wilcox for a reception celebrating open community, collaboration and innovation on Monday June 10 from 7-9 PM. Conference participants are welcome for light appetizers and drinks at The Pony Bar located within walking distance from the University of Hamburg at Allende-Platz 1, 20146 Hamburg, Germany (directions:

Please join us to celebrate what we have accomplished together and what we look forward to in the future. DuraSpace and LYRASIS announced earlier this year that they would merge their organizations, and they will complete the merger by July 1, 2019. Together they will create new models for collaboration, innovation and development in the landscape of academic, research, and public libraries, galleries, archives, and museums. The merged organization will leverage expertise, reach, and capacity to create and build new programs, services and technologies that enable durable, persistent access to data and services. The LYRASIS and DuraSpace communities will have a strong voice in the development and governance of programs, projects, and services that are already market leaders.

The post Invitation: Please Come to a DuraSpace Reception In Hamburg, Germany appeared first on

Use Real-Time Data to Track Digital Transformation Progress / Lucidworks

One often overlooked factor in successful digital transformation journeys is the integration and use of real-time data and analytics to establish metrics and track progress.

This is especially important because numerous studies, including this one from McKinsey, have found that 70 percent of the digital transformation efforts by companies fail.

As pointed out in Telstra’s “Disruptive Decision-Making” report, digital transformation must involve the rethinking of how a company puts technology to use. New access to huge amounts of data will be at the center of those decisions. So, too, will new technology that is implemented to upgrade user experiences.

But digital transformation is not just about changing the external facing technology of a business. It’s also about using technology to change a company internally. And based on a review of reports and articles about digital transformation, companies are often failing to use data to measure how well their digital transformation is proceeding.

The Need For Metrics

As pointed out in this Forbes article, for companies to succeed in digital transformation, they have to change the metrics that executives use to measure company performance. The article’s author, Peter Bendor-Samuel, CEO of Everest Group, recommends using a venture capital process and mind-set to improve progress, where “leaders make capital available, and sprints or projects that are completed, draw down on that capital.”

That allows companies to see what they’re spending on digital transformation and how well those projects are faring. Companies can even establish journey teams made up of senior executives to monitor progress through the use of metrics agreed upon by all those in the company.

These metrics should be specific and could include:

  • The data sources participating in an integrated search.
  • The number of entities that were consolidated during the transformation.
  • The rate of operationalization of transformation projects.
  • The amount of money being spent on projects.
  • The additional time staff members have to focus on larger goals after the automation of tedious tasks.

The metrics must be company-specific, but together they provide a strong foundation for making digital transformation possible.

How Metrics Ensure and Defend Digital Transformation Success

This McKinsey study on digital transformation success found that without expansive metrics, companies might achieve temporary improvements that are not sustained over the long term. To make success permanent, the study’s authors write, “The first key is adopting digital tools to make information more accessible across the organization, which more than doubles the likelihood of a successful transformation … [and] an increase in data-based decision making and in the visible use of interactive tools can also more than double the likelihood of a transformation success.”

The survey found that organizations that established clear targets for key performance indicators that were informed by accurate data were two times more likely to have transformation success than companies that did not. Additionally, companies that established clear goals for their use of new technology improved their chances of success by a factor of 1.7. Such goals and metrics must have real-time data so companies know whether or not they are on track.

Real-Time Analytics Provide for Fast Adjustments

In order to make speedy adjustments, metrics must be based as much as possible on real-time data. It might seem obvious that in a world where data increasingly drives decision-making in almost every realm, using it for internal progress checks would make sense. But far too often, companies are not using data in this regard.

In an article on the “three p’s of digital transformation”, Billy Bosworth, the CEO of DataStax, says that “While 89 percent of enterprises are investing in tools and technology to improve their customer experience initiatives, too few are relying on real-time data to inform decisions.”

He goes on to write that “… the most important performance metric is the impact on customer experience — which translates into increased retention and revenue. Brand value and Net Promoter Score can also be non-revenue and non-cost vital metrics to track performance.”

Real-time data is vital to make sure these types of metrics can be accurate and informative to the business.

A recent Harvard Business Review article brings this point home. The article’s four authors all have extensive experience in various industries. One of them, Ed Lam, the CFO of Li & Fung, shared his firsthand experience of what led to digital transformation success at his company. The authors note that Li & Fung created a three-year transformation strategy geared toward improving the use of mobile apps and data in its global supply chain. The company then used real-time data to measure progress.

The authors wrote, “After concrete goals were established, the company decided on which digital tools it would adopt. Just to take speed-to-market as an example, Li & Fung has embraced virtual design technology and it has helped them to reduce the time from design to sample by 50 percent.

“Li & Fung also helped suppliers to install real-time data tracking management systems to increase production efficiency and built Total Sourcing, a digital platform that integrates information from customers and vendors. The finance department took a similar approach and ultimately reduced month-end closing time by more than 30 percent and increased working capital efficiency by $200 million.”

The benefits of such a strategy and integration of real-time digital transformation metrics can thus be profound. Digital transformation is now an imperative for companies worldwide.

For instance, IDC estimates that spending on digital transformation will grow from $1.07 trillion in 2018 to $1.97 trillion in 2022. The World Economic Forum recently echoed this conclusion, estimating that the overall economic value of digital transformation to business and society will exceed $100 trillion by 2025.

It’s not a matter of whether companies engage in digital transformation, but rather how. Using real-time data to inform metrics about progress is a crucial step in this process.

Dan Woods is a Technology Analyst, Writer, IT Consultant, and Content Marketer based in NYC.

The post Use Real-Time Data to Track Digital Transformation Progress appeared first on Lucidworks.

OCFL (Oxford Common File Layout) 0.3 Beta Specification Released: Your Feedback Requested / DuraSpace News

From Andrew Woods, on behalf of the Oxford Common File Layout (OCFL) editorial group

The Oxford Common File Layout (OCFL) specification describes an application-independent approach to the storage of digital information in a structured, transparent, and predictable manner. It is designed to promote standardized long-term object management practices within digital repositories.

Illustration by Sam Mitchell, Lyrasis

For those following the OCFL initiative or those generally interested in current community practice related to preservation persistence, you will be pleased to know that the OCFL 0.3 beta specification has been released and is now ready for your detailed review and feedback!

Twenty-four issues [1] have been addressed since the 0.2 alpha release (February 2019). Beyond editorial/clarifying updates, the more substantive changes in this beta release include:
– Flexibility of directory name within version directories for holding content payload [2]
– Optional “deposit” directory at top of Storage Root as draft workspace [3]
– Expectation of case sensitivity of file paths and file names [4]

Within the 90 day review period until September 2nd, please review the specification and implementation notes and provide your feedback either as discussion on the ocfl-community [5] mailing list or as GitHub issues [6].

The monthly OCFL community meetings [7] are open to all (second Wednesday of every month @11am ET). Please join the conversation, or simply keep your finger on OCFL’s pulse by lurking!

More detail and implementation notes can be found at

[5] ocfl-co…

The post OCFL (Oxford Common File Layout) 0.3 Beta Specification Released: Your Feedback Requested appeared first on

UK Health Secretary challenged to tackle access to medicines / Open Knowledge Foundation

The Open Knowledge Foundation has written to Westminster Health Secretary Matt Hancock to demand the UK Government plays its role in addressing the global lack of access to medicines. The challenge comes after the UK disassociated itself from an international agreement aimed at reducing the cost of drugs across the world.

The resolution at the World Health Assembly was designed to improve the transparency of markets for medicines, vaccines, and other health products. It brought together countries including Brazil, Spain, Russia and India in recognition of the critical role played by health products and services innovation in bringing new treatments and value to patients and health care systems. By sharing information on the price paid for medicines and the results of clinical trials, countries can work together to negotiate fair prices on equal terms with the aim of lowering drug costs.

Catherine Stihler, chief executive of the Open Knowledge Foundation, said:

“It is shameful that the UK Government is not willing to stand in solidarity with people most at risk of illness and death because of lack of access to medicines.

We live in extraordinary times when new medical and technological advances are capable of saving millions of lives. The key to building equality for all is greater openness and transparency, and this philosophy must also be applied to healthcare.

By sharing information on the price paid for medicines and the results of clinical trials, countries can work together to negotiate fair prices on equal terms with the aim of lowering drug costs. Quite simply, openness can save lives across the world.

I urge Matt Hancock to strongly reconsider the UK’s position.”

Search Was Everywhere at Gartner’s Digital Workplace / Lucidworks

In Orlando, last weekend, Gartner held the first US edition of its Digital Workplace summit. Over two days, around 650 attendees met, mingled and learned about how digital transformation initiatives are changing the nature of work. Topics ranged from increasing engagement with employee culture to the future of work and distributed teams.

Lucidworks sponsored the summit, and our VP of product marketing, Justin Sears, and senior Solution Engineer, Andy Tran, spoke on how AI-powered search is a critical technology powering the modern digital workplace.

Search? Really?

Nick Drakos, VP Analyst at Gartner, spoke about collaborative work management and how tools like Slack and Microsoft Teams are pushing communication beyond just information sharing and productivity improvements to joint innovation. His talk was especially illuminating when considered alongside that of his colleague, Senior Director / Analyst Marko Sillanpaa, who specializes in Content Services (formerly ECM) and corporate legal content technologies.

Sillanpaa spoke on fighting information silos by using intelligent software to federate and aggregate data in one place, extract insights from it, and deliver them to people when they can make use of them. In other words, a search engine.

Both Drakos’ and Sillanpaa’s talks were fundamentally about empowering knowledge workers with information and insights by understanding intent, so employees can find the specific data and documents they need to do their daily work. Left unsaid in both of these sessions is that the key enabling technology for understanding a user’s intention is search.

We talk at Lucidworks frequently about search being the universal or perfect UI. It’s perfect because it’s dead simple to use; anyone can use it. It requires no instruction manual or learning arcane Boolean commands or witchcraft. But most importantly, search hides an enormous amount of complexity from the user so they can focus on the task at hand and less on mastering the tools to complete it.

Whether we’re talking about crawlers that federate data from multiple content and data silos or machine learning that determines how to categorize and classify content so workforce productivity tools like Slack can quickly and clearly connect people to each other, much of the technology that underpins the digital workplace is good ole search.

Though enterprise search has still never fully lived up to its promise of delivering Google-like precision to the modern workforce, we have reached a new threshold: understanding user intent to deliver enormous value to digital workers.

Our flagship platform Fusion dramatically reimagines enterprise search as the foundation for driving the insights-driven workplace. As Justin and Andy explained in their talk, Lucidworks Fusion hyper-personalizes the employee experience and finally does what enterprise search was never able to do.  

This is finally possible because of three key shifts:

  1. We can capture user interactions and understand user intent at scale and in real-time for the first time ever.
  2. Cheap storage and powerful GPU chips let us apply ML at very large scale to crunch trillions of digital interactions to augment conventional plain text-based relevance.
  3. AI has escaped the lab and is widely available powering production-ready workloads across the organization and up and down the org chart.

As Justin referenced in his talk, Forrester has found that companies with happier employees enjoy 81 percent more customer satisfaction and have half the employee turnover. Companies make happier employees when they give them the tools that let them do their jobs with less pain, and more enjoyment.

The best way to achieve that is by giving users insights when they need them, sometimes before they even know what they are looking for. Only an AI-powered search platform like Lucidworks Fusion can do that.

A music and news junkie and full time book-hound, Vivek Sriram has lucked out turning his love of looking for stuff into a 15+ year career in dreaming, building and marketing search engines. As CMO at Lucidworks his job is to turn the rest of the world on to the mysteries and joys of search engines.

The post Search Was Everywhere at Gartner’s Digital Workplace appeared first on Lucidworks.

The Myth of the RV / Casey Bisson

The myth of an RV is that you can go anywhere and bed down wherever you end up. The reality is that you can’t go just anywhere, and bedding down is not much more comfortable or convenient than tenting.

Considering dark deposit / Mita Williams

I have a slight feeling of dread.

In the inbox of the email address associated with MPOW’s institutional repository are more than a dozen notifications that a faculty member has deposited their research work for inclusion. I should be happy about this. I should be delighted that a liaison librarian spoke highly enough of the importance of the institutional repository at a faculty departmental meeting and inspired a researcher to fill in a multitude of forms so their work can be made freely available to readers.

But I don’t feel good about this because a cursory look at the journals in which this faculty member has published suggests that we can include none of the material in our IR due to restrictive publisher terms.

This is not a post about the larger challenges of Open Access in the current scholarly landscape. This post is a consideration of a change of practice regarding IR deposit, partly inspired by the article, Opening Up Open Access Institutional Repositories to Demonstrate Value: Two Universities’ Pilots on Including Metadata-Only Records.

Institutional repository managers are continuously looking for new ways to demonstrate the value of their repositories. One way to do this is to create a more inclusive repository that provides reliable information about the research output produced by faculty affiliated with the institution.

Bjork, K., Cummings-Sauls, R., & Otto, R. (2019). Opening Up Open Access Institutional Repositories to Demonstrate Value: Two Universities’ Pilots on Including Metadata-Only Records. Journal of Librarianship and Scholarly Communication, 7(1). DOI:

I read the Opening Up… article with interest because a couple of years ago, when I was the liaison librarian for biology, I ran an informal pilot in which I tried to capture the corpus of the biology department. During this time, when a publisher did not allow deposit of the publisher’s PDF version and the authors were not interested in depositing a manuscript version, I published the metadata of these works instead.

But part way through this pilot, I abandoned the practice. I did so for a number of reasons. One reason was that the addition of their work to the Institutional Repository did not seem to prompt faculty to start depositing their research of their own volition. This was not surprising as BePress doesn’t allow for the integration of author profiles directly into its platform (one must purchase a separate product for author profiles and the ability to generate RSS feeds at the author level). So I was not particularly disappointed with this result. While administrators are increasingly interested in demonstrating research outputs at the department and institutional level, you can still generalize faculty as more invested in subject-based repositories.

But during this trial I uncovered a more troubling reason that suggested that uploading citations might be problematic. I came to understand that most document harvesting protocols – such as OAI-PMH and OpenAIRE – do not provide any means by which one can differentiate between metadata-only records and full text records. Our library system harvests our IR and it assumes that every item in IR has a full-text object associated with it. Other services that harvest our IR do the same. To visit the IR is to expect the full text of a text.
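A minimal sketch of the problem, in Python, assuming a repository that exposes plain Dublin Core (oai_dc) records: the sample record, its values, and the helper name harvested_fields are all hypothetical, and only the two namespace URIs are the standard ones. It shows what a harvester gets to read for one item.

```python
import xml.etree.ElementTree as ET

# Standard namespace URI for the Dublin Core elements used in oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record: every value below is invented for illustration.
SAMPLE_RECORD = """
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An Example Article</dc:title>
  <dc:creator>Researcher, A.</dc:creator>
  <dc:date>2019</dc:date>
  <dc:type>article</dc:type>
  <dc:identifier>https://ir.example.edu/biology/42</dc:identifier>
</oai_dc:dc>
"""


def harvested_fields(record_xml):
    """Return (element name, text) pairs as a harvester would read them."""
    root = ET.fromstring(record_xml.strip())
    prefix = "{" + DC_NS + "}"
    return [(el.tag.replace(prefix, ""), el.text) for el in root]


fields = harvested_fields(SAMPLE_RECORD)
for name, value in fields:
    print(f"{name}: {value}")
```

In this sketch the record would look the same whether or not a PDF sits behind the identifier, so a harvester that assumes full text has nothing in these fields to tell it otherwise.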

But the reason that made me stop the experiment pretty much immediately was reading this little bit of hearsay on Twitter:

Google and Google Scholar are responsible for the vast majority of our IR’s traffic and use. In many disciplines the percentage of Green OA articles as a percentage of total faculty output is easily less than 25%. To publish citations when the fulltext of a pre-print manuscript is not made available to the librarian is ultimately going to test whether Google Scholar really does have a full-text threshold. And then what do we do when we find our work suddenly gone from search results?

Yet, the motivation to try to capture the whole of a faculty’s work still remains. An institutional repository should be a reflection of all the research and creative work of the institution that hosts it.

If an IR is not able to do this work, an institution is more likely to invest in a CRIS – a Current Research Information System – to represent the research outputs of the organization.

Remember when I wrote this in my post from March of this year?

When I am asked to explain how to achieve a particular result within scholarly communication, more often than not, I find myself describing four potential options:

– a workflow of Elsevier products (BePress, SSRN, Scopus, SciVal, Pure)

– a workflow of Clarivate products (Web of Science, InCites, Endnote, Journal Citation Reports)

– a workflow of Springer-Nature products (Dimensions, Figshare, Altmetrics)

– a DIY workflow from a variety of independent sources (the library’s institutional repository, ORCiD, Open Science Framework)

If the map becomes the territory then we will be lost

The marketplace for CRIS is no different:

But I think the investment in two separate products – a CRIS to capture the citations of a faculty’s research and creative output and an IR to capture the fulltext of the same, still seems a shame to pursue. Rather than invest a large sum of money for the quick win of a CRIS, we should invest those funds into an IR that can support data re-use, institutionally.

(What is the open version of the CRIS? To be honest, I don’t know this space very well. From what I know at the moment, I would suggest it might be the institutional repository + ORCiD and/or VIVO.)

I am imagining a scenario in which every article-level work that a faculty member of an institution has produced is captured in the institutional repository. Articles that are not allowed to be made open access are embargoed until they are in the public domain.

But to be honest, I’m a little spooked because I don’t see many other institutions engaging in this practice. Dark deposit does exist in the literature but it largely appears in the early years of the conversations around scholarly communications practice. The most widely cited article about the topic (from my reading, not from a proper literature review) is this 2011 article called The importance of dark deposit from Stuart Shieber. His blog is licensed as CC-BY, so I’m going to take advantage of this generosity and re-print the seven reasons why dark is better than missing:

  1. Posterity: Repositories have a role in providing access to scholarly articles of course. But an important part of the purpose of a repository is to collect the research output of the institution as broadly as possible. Consider the mission of a university archives, well described in this Harvard statement: “The Harvard University Archives (HUA) supports the University’s dual mission of education and research by striving to preserve and provide access to Harvard’s historical records; to gather an accurate, authentic, and complete record of the life of the University; and to promote the highest standards of management for Harvard’s current records.” Although the role of the university archives and the repository are different, that part about “gather[ing] an accurate, authentic, and complete record of the life of the University” reflects this role of the repository as well. Since at any given time some of the articles that make up that output will not be distributable, the broadest collection requires some portion of the collection to be dark.
  2. Change: The rights situation for any given article can change over time — especially over long time scales, librarian time scales — and having materials in the repository dark allows them to be distributed if and when the rights situation allows. An obvious case is articles under a publisher embargo. In that case, the date of the change is known, and repository software can typically handle the distributability change automatically. There are also changes that are more difficult to predict. For instance, if a publisher changes its distribution policies, or releases backfiles as part of a corporate change, this might allow distribution where not previously allowed. Having the materials dark means that the institution can take advantage of such changes in the rights situation without having to hunt down the articles at that (perhaps much) later date.
  3. Preservation: Dark materials can still be preserved. Preservation of digital objects is by and large an unknown prospect, but one thing we know is that the more venues and methods available for preservation, the more likely the materials will be preserved. Repositories provide yet another venue for preservation of their contents, including the dark part.
  4. Discoverability: Although the articles themselves can’t be distributed, their contents can be indexed to allow for the items in the repository to be more easily and accurately located. Articles deposited dark can be found based on searches that hit not only the title and abstract but the full text of the article. And it can be technologically possible to pass on this indexing power to other services indexing the repository, such as search engines.
  5. Messaging: When repositories allow both open and dark materials, the message to faculty and researchers can be made very simple: Always deposit. Everything can go in; the distribution decision can be made separately. If authors have to worry about rights when making the decision whether to deposit in the first place, the cognitive load may well lead them to just not deposit. Since the hardest part about running a successful repository is getting a hold of the articles themselves, anything that lowers that load is a good thing. This point has been made forcefully by Stevan Harnad. It is much easier to get faculty in the habit of depositing everything than in the habit of depositing articles subject to the exigencies of their rights situations.
  6. Availability: There are times when an author has distribution rights only to unavailable versions of an article. For instance, an author may have rights to distribute the author’s final manuscript, but not the publisher’s version. Or an art historian may not have cleared rights for online distribution of the figures in an article and may not be willing to distribute a redacted version of the article without the figures. The ability to deposit dark enables depositing in these cases too. The publisher’s version or unredacted version can be deposited dark.
  7. Education: Every time an author deposits an article dark is a learning moment reminding the author that distribution is important and distribution limitations are problematic.

There is an additional reason for pursuing a change of practice to dark deposit that I believe is very significant:

There are at least six types of university OA policy. Here we organize them by their methods for avoiding copyright troubles…

3. The policy seeks no rights at all, but requires deposit in the repository. If the institution already has permission to make the work OA, then it makes it OA from the moment of deposit. Otherwise the deposit will be “dark” (non-OA) (See p. 24) until the institution can obtain permission to make it OA. During the period of dark deposit, at least the metadata will be OA.

Good Practices For University Open-Access Policies, Stuart Shieber and Peter Suber, 2013

At least the metadata will be OA is a very good reason to do dark deposit. It might be reason enough. I share much of Ryan Regier’s enthusiasm for Open Citations, which he explains in his post, The longer Elsevier refuses to make their citations open, the clearer it becomes that their high profit model makes them anti-open.

Having a more complete picture of how much an article has been cited by other articles is an immediate clear benefit of Open Citations. Right now you can get a piece of that via the above tools I’ve listed and, maybe, a piece is all you need. If you’ve got an article that’s been cited 100s of times, likely you aren’t going to look through each of those citing articles. However, if you’ve got an article or a work that only been cited a handful of times, likely you will be much more aware of what those citing articles are saying about your article and how they are using your information.

Ryan Regier, The longer Elsevier refuses to make their citations open, the clearer it becomes that their high profit model makes them anti-open

Regier takes Elsevier to task, because Elsevier is one of the few major publishers remaining that refuses to make their citations OA.

I4OC requests that all scholarly publishers make references openly available by providing access to the reference lists they submit to Crossref. At present, most of the large publishers—including the American Physical Society, Cambridge University Press, PLOS, SAGE, Springer Nature, and Wiley—have opened their reference lists. As a result, half of the references deposited in Crossref are now freely available. We urge all publishers who have not yet opened their reference lists to do so now. This includes the American Chemical Society, Elsevier, IEEE, and Wolters Kluwer Health. By far the largest number of closed references can be found in journals published by Elsevier: of the approximately half a billion closed references stored in Crossref, 65% are from Elsevier journals. Opening these references would place the proportion of open references at nearly 83%.

Open citations: A letter from the scientometric community to scholarly publishers
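The “nearly 83%” figure in the letter can be checked with a little arithmetic. Here is a minimal sketch in Python, assuming that “half” means exactly 50% open today and that Elsevier’s 65% share applies to all of the currently closed references; the function name is made up for illustration.

```python
def open_share_after(open_now, closed_share_opened):
    """Share of references that are open once part of the closed pool is opened.

    open_now: fraction of references open today (assumed 0.5 for "half").
    closed_share_opened: fraction of the closed pool that becomes open
    (0.65 for the Elsevier references named in the letter).
    """
    closed_now = 1.0 - open_now
    return open_now + closed_now * closed_share_opened


share = open_share_after(0.5, 0.65)
print(f"{share:.1%}")
```

Under those assumptions the result is 50% + 65% of the remaining 50%, or 82.5%, which matches the letter’s “nearly 83%”.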

There would be so much value unleashed if we could release the citations to our faculty’s research as open access.

Open Citations could lead to new ways of exploring and understanding the scholarly ecosystem. Some of these potential tools were explored by Aaron Tay in his post, More about open citations — Citation Gecko, Citation extraction from PDF & LOC-DB.

Furthermore, releasing citations as OA would enable them to be added to platforms such as Wikidata and available for visualization using the Scholia tool, pictured above.

So that’s where I’m at.

I want to change the practice at MPOW to include all published faculty research, scholarship, and creative work in the Institutional Repository and if we are unable to publish these works as open access in our IR, we will include it as embargoed, dark deposit until it is confidently in the public domain. I want the Institutional Repository to live up to its name and have all the published work of the Institution.

Is this a good idea, or no? Are there pitfalls that I have not foreseen? Is my reasoning shaky? Please let me know.

New Board Mailing List / Evergreen ILS

The Evergreen Project is transitioning its governance into a self-administered not-for-profit organization.  As a part of that change, the recently-formed Evergreen Project Board has requested that we create a new mailing list,, and retire the eg-oversight-board list from active use.  Public archives will remain available.  Please visit to subscribe to the new list.

Lucidworks Named a Leader in The Forrester Wave™: Cognitive Search, Q2 2019 / Lucidworks

According to the technology research and advisory firm Forrester, “Employees and customers have an insatiable need for information. Cognitive search delivers it.”

And Lucidworks is leading the market.

After scoring a dozen vendors across more than 20 criteria, Forrester named Lucidworks a Leader in the cognitive search market, giving top marks to our strong product, strategy, solution roadmap, customer success, and partnerships.

This year we made a two-tier leap from Contender to Leader, and Forrester says we are one of the “providers who matter most” based on our strong scores for current offering and strategy.

Whether you are working to improve insight discovery to enable your digital workplace, or want customers to enjoy a better digital commerce experience with your brand, cognitive search solutions offer a path to greater employee and customer engagement.

“Search is the universal way to access the information that powers our every day,” said Lucidworks CEO, Will Hayes. “We’ve built our solution to deliver a more delightful online shopping experience to customers and to give people access to the data and insights that empower employees to make smarter business decisions.

“We’re providing our customers with AI-powered search to solve their biggest data problems for the world’s largest companies so they can receive the most value possible from their information.”

Register today to join Mike Gualtieri, Forrester Analyst, on June 26 for a webinar to learn more about the cognitive search solutions on the market today, and for an in-depth look at this year’s Forrester Wave™.

We believe Forrester’s recognition is proof that our product addresses the needs of today’s enterprise search customers, backed by a strong strategy capable of addressing the future needs of this maturing market.

Thirty-four of the Fortune 100 would agree with that assessment, including customers such as AT&T, Honeywell, Morgan Stanley, Red Hat, Reddit, Staples, Uber, and the U.S. Census Bureau.

We’re proud of our growth over the past year and look forward to continued innovation, delivering leading cognitive search capabilities as our customers’ needs evolve.

A complimentary copy of Forrester’s 2019 Wave™ for Cognitive Search research report is available here:

The post Lucidworks Named a Leader in The Forrester Wave™: Cognitive Search, Q2 2019 appeared first on Lucidworks.

Mapping the Indian Residential School Locations Dataset / William Denton

My colleague Rosa Orlandini’s Residential School Locations Project was used in a workshop today as an example of best practices in making data openly available. It is one result of her sabbatical work last year, which I couldn’t hope to summarize properly, but the metadata explains more about it, the Wikipedia article Canadian Indian residential school system gives background, and you can email her for more.

When I looked at the data and saw Indian Residential School Locations Dataset (CSV Format) I loaded it up into R and made a quick map. (If you try to get the data by hand it makes you agree to terms and conditions even though it’s CC-BY, which I’ll report, but I found that if you link directly to the CSV there’s no problem.)

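library(tidyverse)  # read_csv(), ggplot2, dplyr's filter(), and %>%

# Outline of Canada (map_data() needs the maps package; coord_map() needs mapproj)
ca_map <- map_data(map = "world") %>% filter(region == "Canada")

# Read the CSV by linking to it directly
irs_locations <- read_csv("")

# Draw the outline, then plot one point per school location
ggplot() +
   geom_polygon(data = ca_map,
     aes(x = long, y = lat, group = group),
     fill = NA, colour = "black") +
   coord_map(projection = "gilbert") +
   geom_point(data = irs_locations, aes(x = Longitude, y = Latitude)) +
   labs(title = "Indian Residential Schools Location Dataset",
        subtitle = "Data provided by Rosa Orlandini ( (CC-BY)",
        caption = "William Denton (CC-BY)",
        x = "", y = "")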
Map of Indian Residential Schools

It’s hard to see some of the dots, and there are factors in the data that would be useful to show, like religious affiliations of the schools, but as a first look it’s a decent start.

Responding to Critical Reviews / Eric Hellman

The first scientific paper I published was submitted to Physical Review B, the world's leading scientific journal in condensed matter physics. Mailing in the manuscript felt like sending my soul into a black hole, except not even Hawking radiation would come back. A seemingly favorable review returned a miraculous two months later:
"I found this paper interesting, and I think it probably eventually it should be published - but only after Section II is revamped and section III clarified."
I made a few minor revisions and added some computations that had been left out of the first version, then confidently resubmitted the paper. But another two months later, I received the second review. The referee hadn't appreciated that I had deflected the review's description of "fundamental logic flaws and careless errors" that made my paper "extremely confusing". The reviewer went on to say "I do not think the authors' new variational calculation is correct" and suggested that my approach was completely wrong.
A ridiculously long equation

My thesis advisor suggested that I go and talk to Bob Laughlin in the Physics department about how to deal with the stubborn referee. I had been collaborating with Bob and one of his students on a related project, and he had become a surrogate advisor for my theoretical endeavors. During that time, Bob had acquired a reputation among my fellow students for asking merciless questions at oral exams; many of us were scared of him.

Bob's lesson on how to deal with a difficult referee turned out to be one of the most useful things I learned in grad school. Referees, he told me, come in two varieties: complete idiots and not-complete-idiots. (Yes, Bob was merciless.) If your referee is a complete idiot, all you can do is ask for a different referee. If your referee has the least bit of sense, then you have to take one of two attitudes. Either the referee is somewhat correct, and you think YES-SIR MISTER REFEREE SIR! (Bob had been in the Army) and do whatever the referee says to do, or you have explained something so poorly that the referee, who is an excellent representative of your target audience, had no hope of understanding it. Either way, there was a lot of work to do. We decided that this referee was not an idiot, and I needed to go back to the drawing board and re-do my calculation, figuring out how to be clearer and more correct in my exposition.

A third review came back with the lovely phrase "The significance of the calculation of section II, which is neither fish nor fowl, remains unclear." Using Bob's not-idiot rule, I recognized that my explanation was still unclear and I worked even harder to improve the paper.

My third revised version was accepted and published. Bob later won the Nobel Prize. I'm here writing blog posts for you about RA21.

RA21 received 120 mostly critical reviews from a cross-section of referees, not a single one of whom is the least bit an idiot. Roughly half the issues fell into the badly-explained category, while the other half fell in the "fundamental flaws and careless errors" category. RA21 needs to go back to the chalkboard and rethink even their starting assumptions before they can move forward with this much-needed effort.

DevOps Days Victoria 2019: Yak shaving and lessons learned while scaling / Cynthia Ng

The original title: Don’t Shave That Yak. Why you might not need Kubernetes. by Adam Serediuk @aserediuk Once upon a time…. Product on multiple platforms, complex business rules, security, high visibility, had to iterate quickly, and scale to millions. IT incident management in (x)matter. Started as monolithic software turned into microservices. DevOps is DIFFICULT Still …

DevOps Days Victoria 2019: People are the biggest part of the process! / Cynthia Ng

Full title: People are the biggest part of the process! Radically changing the development lifecycle within the government. For the citizens, with the citizens. Presented by: Todd Wilson The biggest issue to move forward and quickly was in building a community, not the technology. DevOps journey at the BC government Community has been a critical …

Astrophotography in San Francisco / Casey Bisson

From the Space Tourism Guide:

Can You See the Milky Way in the Bay Area? Unfortunately, it is very difficult to see the Milky Way in San Francisco. Between the foggy weather and the light pollution from 7 million people, you can imagine that the faint light of our galaxy is lost to view.

But C. Roy Yokingco argues:

Some people say the Milky Way cannot be photographed within 50 miles of a major metropolitan area. Well, this photo of the Milky Way was captured 12 linear miles south of downtown San Francisco, California.