There was another #TeslaTakedown march today.
It was a windy, gloomy day; despite that, lots of spirit at the Easton Tesla store.
I was expecting the weather would keep people away, but there were about 400 people — 50 or so more than last Saturday.
There were three differences that I noticed.
First, there weren't as many families with young children.
I would chalk that up to the weather; if it were nicer, there probably would have been more.
A week out, next Saturday's weather forecast is slightly cooler but sunny.
We'll see what happens then.
The second difference was the increased police presence.
They probably were expecting more people, and so had a bigger group there.
Most wore pale blue "Columbus Police Dialogue Team" high-vis vests and walked behind us.
The only interaction with them that I saw was when a young adult walked along the sidewalk shouting "Trump 2028!" with his phone up recording.
As far as I could tell, the Dialogue Team was the only one to engage with this person.
The third difference was the support from the cars driving by...much more supportive with honking and thumbs up.
I counted three middle fingers, which is one more than last weekend.
Not a bad ratio for an hour-long protest.
I think the construction of my protest sign is improving.
I noticed when I got there that I still had too many words in the sign to make it easy to read and understand at a distance.
At least this one didn't require a blog post to fully explain.
No one knows how to use a point-and-shoot camera
My protest sign at the #TeslaTakedown.
I've had to ask people to take these pictures of me with the protest sign in front of the Tesla store.
And everyone says they don't know how to use a camera anymore.
Even people my age and older!
I brought an old point-and-shoot camera because it doesn't have any radios in it.
One of the guidelines I've read for attending a protest is not to bring devices that can identify you, although I might be the only person following that guidance.
If I were a technical Elon-ite, I'd recommend that the Tesla store's WiFi and Bluetooth hotspots capture metadata of every device they encounter.
Although both Android and iOS use network address randomization, those are not foolproof for preventing devices from being tracked.
Now, admittedly, I'm not hiding from Elon Goons by posting this on my personal blog.
Still, I'm also not going to make it easy for them to automate finding me either.
I gave a talk at the Berkeley I-school's Information Access Seminar entitled Archival Storage. Below the fold is the text of the talk with links to the sources and the slides (with yellow background).
Don't, don't, don't, don't believe the hype!
Public Enemy
Introduction
I'm honored to appear in what I believe is the final series of these seminars. Most of my previous appearances have focused on debunking some conventional wisdom, and this one is no exception. My parting gift to you is to stop you wasting time and resources on yet another seductive but impractical idea — that the solution to storing archival data is quasi-immortal media. As usual, you don't have to take notes. The full text of my talk with the slides and links to the sources will go up on my blog shortly after the seminar.
Backups
Archival data is often confused with backup data. Everyone should back up their data. After nearly two decades working in digital preservation, here is how I back up my four important systems:
I run my own mail and Web server. It is on my DMZ network, exposed to the Internet. It is backed up to a Raspberry Pi, also on the DMZ network but not directly accessible from the Internet. Once a week there is a full backup, and daily an incremental backup. Every week the full and incremental backups for the week are written to two DVD-Rs.
My desktop PC creates a full backup on an external hard drive nightly. The drive is one of a cycle of three.
I back up my iPhone to my Mac Air laptop every day.
I create a Time Machine backup of my Mac Air laptop, which includes the most recent iPhone backup, every day on one of a cycle of three external SSDs.
Each week the DVD-Rs, the current SSD and the current hard drive are moved off-site. Why am I doing all this? In case of disasters such as fire or ransomware I want to be able to recover to a state as close as possible to that before the disaster. In my case, the worst case is not more than one week.
Note the implication that the useful life of backup data is only the time that elapses between the last backup before a disaster and the recovery. Media life span is irrelevant to backup data; that is why backups and archiving are completely different problems.
The fact that the data encoded in magnetic grains on the platters of the three hard drives is good for a quarter-century is interesting but irrelevant to the backup task.
Month
Media
Good
Bad
Vendor
01/04
CD-R
5x
0
GQ
05/04
CD-R
5x
0
Memorex
02/06
CD-R
5x
0
GQ
11/06
DVD-R
5x
0
GQ
12/06
DVD-R
1x
0
GQ
01/07
DVD-R
4x
0
GQ
04/07
DVD-R
3x
0
GQ
05/07
DVD-R
2x
0
GQ
07/11
DVD-R
4x
0
Verbatim
08/11
DVD-R
1x
0
Verbatim
05/12
DVD+R
2x
0
Verbatim
06/12
DVD+R
3x
0
Verbatim
04/13
DVD+R
2x
0
Optimum
05/13
DVD+R
3x
0
Optimum
I have saved many hundreds of pairs of weekly DVD-Rs but the only ones that are ever accessed more than a few weeks after being written are the ones I use for my annual series of Optical Media Durability Update posts. It is interesting that:
with no special storage precautions, generic low-cost media, and consumer drives, I'm getting good data from CD-Rs more than 20 years old, and from DVD-Rs nearly 18 years old.
But the DVD-R media lifetime is not why I'm writing backups to them. The attribute I'm interested in is that DVD-Rs are write-once; the backup data could be destroyed but it can't be modified.
Note that the good data from 18-year-old DVD-Rs means that consumers have an affordable, effective archival technology. But the market for optical media and drives is dying, killed off by streaming, which suggests that consumers don't really care about archiving their data. Cathy Marshall's 2008 talk Its Like A Fire, You Just Have To Move On vividly describes this attitude. Her subtitle is "Rethinking personal digital archiving".
Archival Data
Over time, data falls down the storage hierarchy.
Data is archived when it can't earn its keep on near-line media.
Lower cost is purchased with longer access latency.
What is a useful definition of archival data? It is data that can no longer earn its keep on readily accessible storage. Thus the fundamental design goal for archival storage systems is to reduce costs by tolerating increased access latency. Data is archived, that is moved to an archival storage system, to save money. Archiving is an economic rather than a technical issue.
How long should the archived data last? The Long Now Foundation is building the Clock of the Long Now, intended to keep time for 10,000 years. They would like to accompany it with a 10,000-year archive. That is at least two orders of magnitude longer than I am talking about here. We are only just over 75 years from the first stored-program computer, so designing a digital archive for a century is a very ambitious goal.
The mainstream media occasionally comes out with an announcement like this from the Daily Mail in 2013. Note the extrapolation from "a 26 second excerpt" to "every film and TV program ever created in a teacup".
Six years later, this is a picture of, as far as I know, the only write-to-read DNA storage drive ever demonstrated. It is from the Microsoft/University of Washington team that has done much of the research in DNA storage. They published it in 2019's Demonstration of End-to-End Automation of DNA Data Storage. It cost about $10K and took 21 hours to write then read 5 bytes.
The technical press is equally guilty. The canonical article about some development in the lab starts with the famous IDC graph projecting the amount of data that will be generated in the future. It goes on to describe the amazing density some research team achieved by writing say a gigabyte into their favorite medium in the lab, and how this density could store all the world's data in a teacup for ever. This conveys five false impressions.
Market Size
First, that there is some possibility the researchers could scale their process up to a meaningful fraction of IDC's projected demand, or even to the microscopic fraction of the projected demand that makes sense to archive. There is no such possibility. Archival media is a much smaller market than regular media. In 2018's Archival Media: Not a Good Business I wrote:
Archival-only media such as steel tape, silica DVDs, 5D quartz DVDs, and now DNA face some fundamental business model problems because they function only at the very bottom of the storage hierarchy. The usual diagram of the storage hierarchy, like this one from the Microsoft/UW team researching DNA storage, makes it look like the size of the market increases downwards. But that's very far from the case.
IBM's Georg Lauhoff and Gary M Decad's slide shows that the size of the market in dollar terms decreases downwards. LTO tape is less than 1% of the media market in dollar terms and less than 5% in capacity terms. Archival media are a very small part of the storage market. It is noteworthy that in 2023 Optical Archival (OD-3), the most recent archive-only medium, was canceled for lack of a large enough market. It was a 1TB optical disk, an upgrade from Blu-Ray.
Timescales
Second, that the researcher's favorite medium could make it into the market in the timescale of IDC's projections. Because the reliability and performance requirements of storage media are so challenging, time scales in the storage market are much longer than the industry's marketeers like to suggest.
Take, for example, Seagate's development of the next generation of hard disk technology, HAMR, where research started twenty-six years ago. Nine years later in 2008 they published this graph, showing HAMR entering the market in 2009. Seventeen years later it is only now starting to be shipped to the hyper-scalers. Research on data in silica started fifteen years ago. Research on the DNA medium started thirty-six years ago. Neither is within five years of market entry.
Customers
Third, that even if the researcher's favorite medium did make it into the market it would be a product that consumers could use. As Kestutis Patiejunas figured out at Facebook more than a decade ago, because the systems that surround archival media rather than the media themselves are the major cost, the only way to make the economics of archival storage work is to do it at data-center scale but in warehouse space and harvest the synergies that come from not needing data-center power, cooling, staffing, etc.
Storage has an analog of Moore's Law called Kryder's Law, which states that over time the density of bits on a storage medium increases exponentially. Given the need to reduce costs at data-center scale, Kryder's Law limits the service life of even quasi-immortal media. As we see with tape robots, where data is routinely migrated to newer, denser media long before its theoretical lifespan, what matters is the economic, not the technical lifespan of a medium.
Hard disks are replaced every five years although the magnetically encoded data on the platters is good for a quarter-century. They are engineered to have a five-year life because Kryder's Law implies that they will be replaced after five years even though they still work perfectly. Seagate actually built drives with 25-year life but found that no-one would pay the extra for the longer life.
Fourth, that anyone either cares or even knows what medium their archived data lives on. Only the hyper-scalers do. Consumers believe their data is safe in the cloud. Why bother backing it up, let alone archiving it, if it is safe anyway? If anyone really cares about archiving they use a service such as Glacier, when they definitely have no idea what medium is being used.
Fifth, that bit rot is the only threat that matters; the idea that with quasi-immortal media you don't need Lots Of Copies to Keep Stuff Safe.
No medium is perfect. They all have a specified Unrecoverable Bit Error Rate (UBER) rate. For example, typical disk UBERs are 10-15. A petabyte is 8*1015 bits, so if the drive is within its specified performance you can expect up to 8 errors when reading a petabyte. The specified UBER is an upper limit, you will normally see far fewer. The UBER for LT09 tape is 10-20, so unrecoverable errors on a new tape are very unlikely. But not impossible, and the rate goes up steeply with tape wear.
The property that classifies a medium as quasi-immortal is not that its reliability is greater than regular media to start with, although as with tape it may be. It is rather that its reliability decays more slowly than that of regular media. Thus archival systems need to use erasure coding to mitigate both UBER data loss and media failures such as disk crashes and tape wear-out.
Another reason for needing erasure codes is that media errors are not the only ones needing mitigation. What matters is the reliability the system delivers to the end user. Research has shown that the majority of end user errors come from layers of the system above the actual media.
The archive may contain personally identifiable or other sensitive data. If so, the data on the medium must be encrypted. This is a double-edged sword, because the encryption key becomes a single point of failure; its loss or corruption renders the entire archive inaccessible. So you need Lots Of Copies to keep the key safe. But the more copies the greater the risk of key compromise.
Media such as silica, DNA, quartz DVDs, steel tape and so on address bit rot, which is only one of the threats to which long-lived data is subject. Clearly a single copy on such media, even if erasure coded, is still subject to threats including fire, flood, earthquake, ransomware, and insider attacks. Thus even an archive needs to maintain multiple copies. This greatly increases the cost, bringing us back to the economic threat.
Archival Storage Systems
At Facebook Patiejunas built rack-scale systems, each holding 10,000 100GB optical disks for a Petabyte per rack. Writable Blu-Ray disks are about 80 cents each, so the media to fill the rack would cost about $8K. This is clearly much less than the cost of the robotics and the drives.
Let's drive this point home with another example. An IBM TS4300 LTO tape robot starts at $20K. Two 20-pack tape cartridges to fill it cost about $4K, so the media is about 16% of the total system capex. The opex for the robot includes power, cooling, space, staff and an IBM maintenance contract. The opex for the tapes is essentially zero.
The media is an insignificant part of the total lifecycle cost of storing archival data on tape. What matters for the economic viability of an archival storage system is minimizing the total system cost, not the cost of the media. No-one is going to spend $24K on a rack-mount tape system from IBM to store 720TB for their home or small business. The economics only work at data-center scale.
The reason why this focus on media is a distraction is that the fundamental problem of digital preservation is economic, not technical. No-one wants to pay for preserving data that isn't earning its keep, pretty much the definition of archived data. The cost per terabyte of the medium is irrelevant, what drives the economic threat is the capital and operational cost of the system. Take tape for example. The media capital cost is low, but the much higher system capital cost includes the drives and the robotics. Then there are the operational costs of the data center space, power, cooling and staff. It is only by operating at data-center scale and thus amortizing the capital and operational costs over very large amounts of data that the system costs per terabyte can be made competitive.
Operating at data center scale, as Patiejunas discovered and Microsoft understands, means that one of the parameters that determines the system cost is write bandwidth. Each of Facebook's racks wrote 12 optical disks in parallel almost continuously. It would take over 800 times the time to write an entire disk to fill the rack. At the 8x write speed it takes 22.5 minutes to fill a disk, so it would take around 18,750 minutes to fill the rack, or about two weeks. It isn't clear how many racks Facebook needed simultaneously doing this to keep up with the flow of user-generated content, but it was likely enough to fill a reasonable-size warehouse. Similarly, it would take about 8.5 days to fill the base model TS4300.
A Silica library is a sequence of contiguous write, read, and storage racks interconnected by a platter delivery system. Along all racks there are parallel horizontal rails that span the entire library. We refer to a side of the library (spanning all racks) as a panel. A set of free roaming robots called shuttles are used to move platters between locations.
...
A read rack contains multiple read drives. Each read drive is independent and has slots into which platters are inserted and removed. The number of shuttles active on a panel is limited to twice the number of read drives in the panel. The write drive is full-rack-sized and writes multiple platters concurrently.
Their performance evaluation focuses on the ability to respond to read requests within 15 hours. Their cost evaluation, like Facebook's, focuses on the savings from using warehouse-type space to house the equipment, although is isn't clear that they have actually done so. The rest of their cost evaluation is somewhat hand-wavy, as is natural for a system that isn't yet in production:
The Silica read drives use polarization microscopy, which is a commoditized technique widely used in many applications and is low-cost. Currently, system cost in Silica is dominated by the write drives, as they use femtosecond lasers which are currently expensive and used in niche applications. ... As the Silica technology proliferates, it will drive up the demand for femtosecond lasers, commoditizing the technology.
I'm skeptical of "commoditizing the technology". Archival systems are a niche in the IT market, and one on which companies are loath to spend money. Realistically, there aren't going to be a vast number of Silica write heads. The only customers for systems like Silica are the large cloud providers, who will be reluctant to commit their archives to technology owned by a competitor. Unless a mass-market application for femtosecond lasers emerges, the scope for cost reduction is limited.
But the more I think about this technology, which is still in the lab, the more I think it probably has the best chance of impacting the market among all the rival archival storage technologies. Not great, but better than its competitors:
The media is very cheap and very dense, so the effect of Kryder's Law economics driving media replacement and thus its economic rather than technical lifetime is minimal.
The media is quasi-immortal and survives benign neglect, so opex once written is minimal.
The media is write-once, and the write and read heads are physically separate, so the data cannot be encrypted or erased by malware. The long read latency makes exfiltrating large amounts of data hard.
The robotics are simple and highly redundant. Any of the shuttles can reach any of the platters. They should be much less troublesome than tape library robotics because, unlike tape, a robot failure only renders a small fraction of the library inaccessible and is easily repaired.
All the technologies needed are in the market now, the only breakthroughs needed are economic, not technological.
The team has worked on improving the write bandwidth which is a critical issue for archival storage at scale. They can currently write hundreds of megabytes a second.
Like Facebook's archival storage technologies, Project Silica enjoys the synergies of data center scale without needing full data center environmental and power resources.
Like Facebook's technologies, Project Silica has an in-house customer, Azure's archival storage, with a need for a product like this.
The expensive part of the system is the write head. It is an entire rack using femtosecond lasers, which start at around $50K. The eventual system's economics will depend upon the progress made in cost-reducing the lasers.
The Svalbard archipelago is where I spent the summer of 1969 doing a geological survey.
The most important part of an archiving strategy is knowing how you will get stuff out of the archive. Putting stuff in and keeping it safe are important and relatively easy, but if you can't get stuff out when you need it what's the point?
In some cases access is only needed to a small proportion of the archive. At Facebook, Patiejunas expected that the major reason for access would be to respond to a subpoena. In other cases, such as migrating to a new archival system, bulk data retrieval is required.
But if the reason for needing access is disaster recovery it is important to have a vision of what resources are likley to be available after the disaster. Microsoft gained a lot of valuable PR by encoding much of the world's open source software in QR codes on film and storing the cans of film in an abandoned coal mine in Svalbard so it would "survive the apocalypse". In Seeds Or Code? I had a lot of fun imagining how the survivors of the apocalypse would be able to access the archive.
The voyage
To make a long story short, after even a mild apocalypse, they wouldn't be able to. Let's just point out that the first steps after the apocalypse are getting to Svalbard. They won't be able to fly to LYR. As the crow flies, the voyage from Tromsø is 591 miles across very stormy seas. It takes several days, and getting to Tromsø won't be easy either.
Archival Storage Services
Because technologies have very strong economies of scale, the economics of most forms of IT work in favor of the hyper-scalers. These forces are especially strong for archival data, both because it is almost pure cost with no income, and because as I discussed earlier the economics of archival storage only work at data-center scale. It will be the rare institution that can avoid using cloud archival storage. I analyzed the way these economic forces operate in 2019's Cloud For Preservation:
Much of the attraction of cloud technology for organizations, especially public institutions funded through a government's annual budget process, is that they transfer costs from capital to operational expenditure. It is easy to believe that this increases financial flexibility. As regards ingest and dissemination, this may be true. Ingesting some items can be delayed to the next budget cycle, or the access rate limit lowered temporarily. But as regards preservation, it isn't true. It is unlikely that parts of the institution's collection can be de-accessioned in a budget crunch, only to be re-accessioned later when funds are adequate. Even were the content still available to be re-ingested, the cost of ingest is a significant fraction of the total life-cycle cost of preserving digital content.
Cloud services typically charge differently for ingest, storage and retrieval. The service's goal in designing their pricing structure is to create lock-in, by analogy with the drug-dealer's algorithm "the first one's free".
In 2019 I used the published rates to compute the cost of ingesting in a month, storing for a year, and retrieving in a month a petabyte using the archive services of the three main cloud providers. Here is that table, with the costs adjusted for inflation to 2024 using the Bureau of Labor Statistics' calculator:
The "Lock-in" column is the approximate number of months of storage cost that getting the Petabyte out in a month represents. Note that:
In all cases getting data out is much more expensive than putting it in.
The lower cost of archival storage compared to the same service's near-line storage is purchased at the expense of a much stronger lock-in.
Since the whole point of archival storage is keeping data for the long term, the service will earn much more in storage charges over the longer life of archival data than the shorter life of near-line data.
There may well be reasons why retrieving data from archival storage is expensive. Most storage technologies have unified read/write heads, so retrieval competes with ingest which, as Patiejunas figured out, is the critical performance parameter for archival storage. This is because, to minimize cost, archival systems are designed assuming bulk retrieval is rare. When it happens, whether from a user request or to migrate data to new media, it is disruptive. For example, emptying a base model TS4300 occupies it for more than a week.
Six years later, things have changed significantly. Here is the current version of the archival services table:
Glacier is the only one of the three that is significantly cheaper in real terms than it was 6 years ago.
Glacier can do this because Kryder's Law has made their storage about a factor of about 6 cheaper in real terms in six years, or about a 35% Kryder rate. This is somewhat faster than the rate of areal density increase of tape, and much faster than that of disk. The guess is that Glacier Deep Archive is on tape.
Google's pricing indicates they aren't serious about the archival market.
Archive services now have differentiated tiers of service. This table uses S3 Deep Archive, Google Archive and Microsoft Archive.
Lock-in has increased from 13.8/12.0/8.1 to 50/175/20. It is also increased by additional charges for data lifetimes less than a threshold, 180/365/180 days. So my cost estimate for Google is too low, because the data would suffer these charges. But accounting for this would skew the comparison.
Bandwidth charges are a big factor in lock-in. For Amazon they are 77%, for Google they are 38%, for Microsoft they are 32%. Amazon's marketing is smart, hoping you won't notice the outbound bandwidth charges.
Looking at these numbers it is hard to see how anyone can justify any archive storage other than S3 Deep Archive. It is the only one delivering Kryder's Law to the customer, and as my economic model shows, delivering Kryder's Law is essential to affordable long-term storage. A petabyte for a decade costs under $120K before taking Kryder's Law into account and you can get it all out for under $50K.
LOCKSS
Original Logo
The fundamental idea behind LOCKSS was that, given a limited budget and a realistic range of threats, data would survive better in many cheap, unreliable, loosely-coupled replicas than in a single expensive, durable one.
Replacing one drive takes about 15 minutes of work. If we have 30,000 drives and 2 percent fail, it takes 150 hours to replace those. In other words, one employee for one month of 8 hour days. Getting the failure rate down to 1 percent means you save 2 weeks of employee salary - maybe $5,000 total? The 30,000 drives costs you $4m.
The $5k/$4m means the Hitachis are worth 1/10th of 1 per cent higher cost to us. ACTUALLY we pay even more than that for them, but not more than a few dollars per drive (maybe 2 or 3 percent more).
Moral of the story: design for failure and buy the cheapest components you can. :-)
Today, we are delighted to announce the selected organisations for a new pilot programme of the Open Data Editor (ODE), Open Knowledge’s new open source desktop application for data practitioners to detect errors in tables. Since we opened the call for applications last January, we’ve received 71 applications in total, and we couldn’t be more inspired by the level of interest in testing it.
The applications came from 41 different countries and organisations with a wide range of data practices. The Open Data Editor has the potential to be a useful tool for working with different types of datasets, such as climate, gender, infrastructure, health, genomic, GLAM, digital literacy, public sector data and government accountability, among others.
Our goal this year is to enhance digital literacy throughno-code tools. Using Open Data Editor helps to fill the literacy gap for people and teams in non-technological sectors. The selected organisations will work closely with our team from March to June 2025, integrating the application into their internal workflows and thus informing the next development steps through their use. And while doing so, they will gain lasting data skills across their institutions.
We are impressed by the vital work those organisations are doing in their fields. We hope that better data will enhance this work through no-code tools like the Open Data Editor. We’re looking forward to seeing ODE in action to help solve real-life problems.
Read more about the pilot organisations below, in alphabetical order:
Country: Kenya Area of Knowledge: Computational Biology Targeted Datasets: Genomic data and metadata, and Community data
Staff and students at the BHKi campus outreach at the University of Nairobi. Photo: BHKi (source)
Pauline Karega, Coordinator
What will you use the Open Data Editor for?
“We will explore the use of ODE to curate and harmonise genomic data and metadata, and community data. Additionally, we will gather information on bioinformatics/computational biology learners, who collect and use this data, and their experiences with the data. Our goal is to track their needs and interests across East Africa, to better inform outreach activities.”
Why is a no-code tool useful for you?
“We have a big community of students and researchers doing fieldwork who are new to data collection and have minimal resources, but need to accurately collect, store and communicate data to others. This tool can be a gentle introduction to practice as they learn advanced coding skills to curate and work on the data.”
City of Zagreb will tackle the challenges of working with infrastructure data
Country: Croatia Area of Knowledge: Public Administration Targeted Datasets: Infrastructure data
ZG HACKL – Hackathon for Open Zagreb 2024 (source)
Kristian Ravic, Senior advisor
What will you use the Open Data Editor for?
“ODE will be used for bridging interoperability challenges within our existing (open) data infrastructure, ensuring frictionless data exchange between various data platforms, data sources and data formats. We will use ODE to standardise open data processes and data management procedures by improving data transformation and data validation processes and publishing of open data on multiple data platforms.”
Why is a no-code tool useful for you?
“Currently, we have a non-technical data team with low or no-coding skills. So, there is a skill-gap within our organization and a tool like ODE would help us be more efficient in connecting various data (re)sources and, thus publishing more high value datasets on our central open data platform.”
Country: Kenya Area of Knowledge: Environmental Justice Targeted Datasets: Water and Air quality data, and Electoral data
Flagging off of the 2024 Ondiri Wetlands Conservation Run in Kikuyu Municipality. Photo: The Demography Project (source)
Richard Muraya, Executive Director
What will you use the Open Data Editor for?
“ODE is a welcome opportunity for enhanced data quality of our environmental monitoring projects as well as our electoral/parliamentary monitoring project. We intend to deploy ODE to analyse large open government and citizen-generated datasets on atmospheric (air quality) and freshwater resources for enhanced personal and collective responsibility over rapidly degrading natural resources in Kenya.”
Why is a no-code tool useful for you?
“We are a small team of determined volunteer citizen scientists, environmental journalists and innovators with inadequate technical expertise in programming or advanced computing techniques. A no-code tool will truly be a game-changer in how my team reviews and validates our tabulated environmental datasets to identify trends and errors and ultimately publish our outputs for collective climate action.”
Country: France Area of Knowledge: Weapon Monitoring Targeted Datasets: Arms and defence expenditure databases
Act for nuclear disarmament in partnership with several organisations. Photo: Guy Dechesne (source)
Sayat Topuzogullari, Coordinator
What will you use the Open Data Editor for?
“We are setting up a monitoring network of arms companies, the Weapon Watch Open Data Environment platform to monitor public and private spending on defence and armaments. It will be used by researchers, journalists, investigators, parliamentarians, whistleblowers and activists across Europe. In this context, we need a simple data entry tool, suitable for people who are not computer literate.”
Why is a no-code tool useful for you?
“Our monitoring network aims to involve as many people as possible in defense and security issues, a taboo subject in France. For this reason, it’s essential to offer an easy tool for finding errors in databases and improving data quality.”
Open Knowledge Nepal will work together with local governments and their infrastructure data
Country: Nepal Area of Knowledge: Government Data Targeted Datasets: Infrastructure data (local governments)
Moment of the Data Hackdays 2024 in Kathmandu. Photo: OKNP (source)
Nikesh Balami, CEO
What will you use the Open Data Editor for?
“We plan to integrate ODE into the IDMS project workflow, a system that allows local governments to store, update, and access ready-to-use data. We will use it to audit existing system datasets, identify errors, and enhance data quality. The goal is to ensure citizens have access to high-quality information. Additionally, we will localise ODE user guides in the Nepali language to assist local government and non-technical users in learning how to use the tool effectively.”
Why is a no-code tool useful for you?
“ODE will empower non-technical staff to identify, clean, and validate data without requiring coding skills. By allowing municipal staff to handle data errors directly, it will reduce the burden on technical teams and improve data efficiency. The current manual data cleaning process is time-consuming, and introducing a no-code tool like ODE will simplify and streamline workflows, saving time and resources while ensuring high-quality datasets.”
Features to simplify your work
Above you learned what these organisations will do with ODE. But you can also download it now and try it out for yourself. ODE is an open, free desktop application, available for Linux, Windows, and macOS.
Here are a few tasks that ODE can help you with:
If you have huge spreadsheets with data obtained through forms with the communities you serve, ODE helps you detect errors in this data to understand what needs to be fixed.
If you manage databases related to any social issue, ODE can quickly check if there are empty rows or missing cells in the data collected by different social workers and help you better allocate assistance.
If you monitor and review government spending or public budgets for a particular department, ODE helps you find errors in the table and make the data ready for analysis.
The Open Data Editor helps you find errors in your tables and spreadsheets
A big thanks again to all organisations that applied for the pilot programme! We will be launching another call for a second cohort in May 2025, and we strongly encourage those who were not selected to apply again. We know a lot of you spend much more time than you desire cleaning up the data before starting to do the work that is really of interest to you. ODE is here to help.
If you have any questions or want any additional information about ODE, you can contact us at info@okfn.org.
Funding
All of Open Knowledge’s work with the Open Data Editor is made possible thanks to a charitable grant from the Patrick J. McGovern Foundation. Learn more about its funding programmes here.
This week's thread of articles looks at the ever-evolving landscape of digital security and privacy through end-to-end encryption.
End-to-end encryption is a method of securing communication where only the people communicating can read the messages.
In principle, it prevents potential eavesdroppers — including telecom providers, Internet providers, and even the provider of the communication service — from being able to access the cryptographic keys needed to decrypt the conversation.
In practice, governments and others want to be able to put themselves in the middle of those conversations for both noble and dishonorable reasons.
From unprecedented cyberattacks leading US officials to urge citizens to use encrypted messaging apps, to tech companies like Apple butting heads with the UK government over data privacy, the balance of power and privacy is under constant tension.
Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page.
If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.
U.S. government urges use of encrypted messaging apps in the wake of a major telecom breach
Amid an unprecedented cyberattack on telecommunications companies such as AT&T and Verizon, U.S. officials have recommended that Americans use encrypted messaging apps to ensure their communications stay hidden from foreign hackers. The hacking campaign, nicknamed Salt Typhoon by Microsoft, is one of the largest intelligence compromises in U.S. history, and it has not yet been fully remediated. Officials on a news call Tuesday refused to set a timetable for declaring the country’s telecommunications systems free of interlopers. Officials had told NBC News that China hacked AT&T, Verizon and Lumen Technologies to spy on customers.
Late last year, the U.S. announced a significant attack against telecommunication companies.
This hacking campaign, known as Salt Typhoon, is one of the largest intelligence breaches in U.S. history, with officials stating that the full extent of the compromise has not been resolved.
The attackers accessed various types of sensitive information, including call metadata and live conversations of specific targets, notably around Washington, D.C.
In light of that, the FBI and CISA recommended that Americans use messaging apps that feature end-to-end encryption.
There is more than just a touch of irony here because federal law enforcement pushed for the passage of the Communications Assistance for Law Enforcement Act (CALEA) in the mid-1990s that put backdoors into telecommunications equipment for law enforcement.
It was these backdoors that were used by the Salt Typhoon attackers.
There is no such thing as an encryption backdoor that will only be used by authorized law enforcement.
Apple takes on the UK government over data access demands
Apple is taking legal action to try to overturn a demand made by the UK government to view its customers&apos private data if required... It is the latest development in an unprecedented row between one of the world&aposs biggest tech firms and the UK government over data privacy. In January, Apple was issued with a secret order by the Home Office to share encrypted data belonging to Apple users around the world with UK law enforcement in the event of a potential national security threat. Data protected by Apple&aposs standard level of encryption is still accessible by the company if a warrant is issued, but the firm cannot view or share data encrypted using its toughest privacy tool, Advanced Data Protection (ADP). ADP is an opt-in feature and it is not known how many people use it.
In response to the UK order, Apple removed ADP from the UK market rather than create a "backdoor" for access.
The UK Home Office maintains that privacy is only compromised in exceptional cases related to serious crimes.
But, as the previous article points out, there is no such thing as a law-enforcement-only capability; if there is a weakness in an encryption system, it will eventually be exploited by someone with the time or talent to break it.
Sweden's proposed backdoor in encrypted messaging apps ignites global privacy concerns
Sweden’s law enforcement and security agencies are pushing legislation to force Signal and WhatsApp to create technical backdoors allowing them to access communications sent over the encrypted messaging apps.... The bill could be taken up by the Riksdag, Sweden’s parliament, next year if law enforcement succeeds in getting it before the relevant committee, SVT Nyheter reported. The legislation states that Signal and WhatsApp must retain messages and allow the Swedish Security Service and police to ask for and receive criminal suspects’ message histories, the outlet reported. Minister of Justice Gunnar Strömmer told the Swedish press that it is vital for Swedish authorities to access the data.
A few paragraphs down in the article, the Swedish Armed Forces are mentioned as opposing the bill because they routinely use Signal, and a backdoor could introduce vulnerabilities that bad actors could exploit.
Signal Foundation president warns of threat to privacy
The open source Signal messaging app is considered the gold standard for end-to-end encrypted messaging.
Meridith Whittaker is the president of the Signal Foundation, and she has strong words for lawmakers' efforts to weaken encryption algorithms.
Ms Whittaker was also quoted in the previous article about Sweden's efforts.
The European Commission originally proposed legislation to scan private messages for child sexual abuse material, but the European Parliament has rejected the approach.
Experts like Whittaker argue this would create vulnerabilities that could be exploited by hackers and hostile states.
The EU's data protection supervisor has also voiced concerns that the plan threatens democratic values.
Signal Foundation prepares for quantum threats with a revision to its end-to-end encryption protocol
The Signal Foundation, maker of the Signal Protocol that encrypts messages sent by more than a billion people, has rolled out an update designed to prepare for a very real prospect that’s never far from the thoughts of just about every security engineer on the planet: the catastrophic fall of cryptographic protocols that secure some of the most sensitive secrets today. The Signal Protocol is a key ingredient in the Signal, Google RCS, and WhatsApp messengers, which collectively have more than 1 billion users.
I don't know if quantum computing will be what breaks the current generation of encryption protocols, but progress in faster hardware and more research into encryption means that the day will come at some point.
The Signal protocol revision uses a "post-quantum cryptography algorithm" adopted by the U.S. National Institute of Standards and Technology (NIST).
There are researchers on both sides of this divide: those working to advance encryption protocols and those seeking to break them.
Apple Launches Post-Quantum Encryption in iMessage
While practical quantum computing technology may still be years or decades away, security officials, tech companies, and governments are ramping up their efforts to start using a new generation of post-quantum cryptography. These new encryption algorithms will, in short, protect our current systems against any potential quantum computing-based attacks. Today Cupertino is announcing that PQ3—its post-quantum cryptographic protocol—will be included in iMessage.
Apple follows Signal's lead in deploying its own quantum-safe encryption protocol for iMessage.
Apple is using the same Kyber algorithm tha Signal adopted.
Deploying post-quantum encryption now aims to limit the impact of "harvest now, decrypt later" attacks, where encrypted data is collected and held until quantum computers can break it.
Exploring the intersection of AI and end-to-end encryption
Recently I came across a fantastic new paper by a group of NYU and Cornell researchers entitled “How to think about end-to-end encryption and AI.”... I was particularly happy to see people thinking about this topic, since it’s been on my mind in a half-formed state this past few months. On the one hand, my interest in the topic was piqued by the deployment of new AI assistant systems like Google’s scam call protection and Apple Intelligence, both of which aim to put AI basically everywhere on your phone — even, critically, right in the middle of your private messages. On the other hand, I’ve been thinking about the negative privacy implications of AI due to the recent European debate over “mandatory content scanning” laws that would require machine learning systems to scan virtually every private message you send.
This blog post discusses the implications of AI technologies on the security and privacy of encrypted communications.
The author emphasizes the importance of maintaining robust encryption standards in the face of evolving AI capabilities that could potentially undermine these protections.
Take, for example, the need for AI agents to be snooping in on your conversations so it has the context to take actions on your behalf: "Agent, book a two-person reservation at the restaurant Dave just messaged me about."
The author advocates for a collaborative approach between cryptographers and AI developers to ensure that AI advancements do not compromise encrypted data security.
This Week I Learned: Plants reproduce by spreading little plant-like things
This is where pollen comes in. Like sperm, pollen contains one DNA set from its parent, but unlike sperm, pollen itself is actually its own separate living plant made of multiple cells that under the right conditions can live for months depending on the species... So this tiny male offspring plant is ejected out into the world, biding its time until it meets up with its counterpart. The female offspring of the plant, called an embryosac, which you're probably less familiar with since they basically never leave home. They just stay inside flowers. Like again, they're not part of the flower. They are a separate plant living inside the flower. Once the pollen meets an embryosac, the pollen builds a tube to bridge the gap between them. Now it's time for the sperm. At this point, the pollen produces exactly two sperm cells, which it pipes over to the embryosac,
which in the meantime has produced an egg that the sperm can meet up with. Once fertilized, that egg develops into an embryo within the embryosac, hence the name, then a seed and then with luck a new plant. This one with two sets of DNA.
—Pollen Is Not Plant Sperm (It’s MUCH Weirder), MinuteEarth, 7-Mar-2025
Pollen is not sperm...it is a separate living thing!
And it meets up with another separate living thing to make a seed!
Weird!
The video is only three and a half minutes long, and it is well worth checking out at some point today.
What did you learn this week? Let me know on Mastodon or Bluesky.
The University of Michigan Library's "Everything" results page after the changes described in this post were implemented.
LIT’s Design and Discovery department received generous support from an anonymous donor to fund a Library User Experience Research Fellow position. Our first fellow was Suzan Karabakal, a master’s student at the U-M School of Information. She investigated and recommended changes to the way Library Search presents results. Suzan conducted user research to identify specific changes we could make to improve our “Everything” results screen and search results for Catalog and Articles.
Last Friday, we wrapped up another edition of Open Data Day with the feeling that despite recent backlashes and new structural and geopolitical challenges to open data and technologies, the open data community remains active and resilient in every corner of the planet.
In 2025, we saw powerful bottom-up energy across 189 events in 57 countries and more than 15 languages. We are small data teams within public institutions, enthusiast collectives working with communities, innovative academics, mappers, and makers from around the world. And together we make a difference!
Before starting the #ODDStories 2025 series, with stories of impact from events around the world, here’s a high-level report with the main figures and data on the 2025 edition.
Let’s move on to 2026 with an even bigger, more diverse and impactful event!
About Open Data Day
Open Data Day (ODD) is an annual celebration of open data all over the world. Groups from many countries create local events on the day where they will use open data in their communities. ODD is led by the Open Knowledge Foundation (OKFN) and the Open Knowledge Network.
As a way to increase the representation of different cultures, since 2023 we offer the opportunity for organisations to host an Open Data Day event on the best date over one week. All outputs are open for everyone to use and re-use.
The NDSA Levels Steering Group has been diligently keeping tabs on how the Levels get deployed “in the wild” at our monthly meetings. The task is a way to understand the needs of the digital preservation community, and to document the usefulness of the NDSA’s signature digital preservation tool.
Levels sightings occur in a variety of formats, ranging from their use in program assessment, to specific applications for “niche” digital preservation adjacent activities, to vendors employing them as a yardstick to gauge their service offerings. They’ve been spotted in numerous published articles on digital preservation, including the oft-cited “How to Talk to IT about Digital Preservation”, and in preservation reports that range from presentations on cyber-security to a graduate student assessing the state of digital preservation for the College Park Maryland Aviation Museum. The NDSA Levels are not just used in the United States, but are an international presence as well. The Bibliothèque et Archives nationales de Québec has used the NDSA levels to help define their information model, as shared at iPres 2024; and the UK Archives Accreditation standards suggest using the NDSA Levels or DPC-RAM for self-assessment.
The scholarly literature on digital preservation is fertile ground for sightings, too, with several articles invoking the Levels just in the past year:
Have you seen an example of the NDSA Levels being used by colleagues or referenced in a presentation or even by a vendor? We’d love to hear about it! As always, we also encourage the whole community to provide feedback on the Levels – including Levels sightings! – at any time.
I made a sign for today's #TeslaTakedown, and I should have listened to my family.
They suggested that the initial version, without the "How much do you have in common with Elon Musk?" at the bottom, was too confusing.
Adding that sentence improved understanding, but now there was too much to read in a protest sign for cars whizzing past.
My point was that me and the person driving by giving me a middle finger have far more in common than what either of us have with Elon Musk (and Donald Trump).
Ladder rungs of economic prosperity
My son is taking a gen-ed psychology class in his first year of college, and he was describing a class exercise demonstrating the difficulty people have with probabilities and proportions.
That got me thinking about the "cosmic distance ladder".
The cosmic distance laddercosmic distance ladder is a series of methods astronomers use to determine the distances to celestial objects, acting as a tool to map the universe.
It's called a ladder because each step relies on the previous one, starting with measurements to nearby galaxies and progressing to farther objects.
Let's suppose the median net worth of an American—the point at which half the people in the country have more and half the people in the country have less—is $100,000.
So, standing there in the middle of the protest, the 10 people around me have a total net worth of $1,000,000—a million dollars.
And the 100 people or so between and the street corner? That is ten million dollars.
And the 1,000 people walking and driving by the #TeslaTakedown protest? That's a hundred million dollars.
The capacity of a triple-A minor league baseball stadium is about 10,000 people; the median net worth of that crowd is a billion dollars.
The capacity of Ohio Stadium, where the Ohio State University football team plays, is 100,000, and the net worth is ten billion dollars.
It is only at this point that reach Donald Trump's net worth.
The population of Franklin County, Ohio—the seat of the state capitol—is just over 1,000,000 people, and the total median net worth is $100,000,000.
Elon Musk's net worth is about $350,000,000—so three Franklin-county's-worth of people.
So when I said that the person throwing the middle finger at me has more in common with me than Musk, that's what I meant.
The actual median net worth of an American household is just short of $200,000, so we are not that far off with this economic prosperity ladder (assuming two earners per household).
And the really perverse part?
Remember that the median is the point at which half the population is above that amount and half the population is below.
Don't confuse that with the average...that'll be the sum of everyone's net worth divided by the number of people in the country.
That number is just over $1,000,000 per household.
The highest highs have skewed the average that much.
Now, at the risk of reducing a person's value to a dollar amount, that is what I was trying to say in my sign.
Wealth is a tool, not an identity; a person's kindness, resilience, and hope for a better world hold real value...and I saw a lot of kindness, resilience, and hope at the protest today.
Yet, despite the immense kindness and resilience I was in the middle of, it's disheartening how wealthy people are overriding the collective interests of the country.
And that is far too much to put on a sign.
About making the protest sign
I'm adding my notes here about creating the protest sign, because clearly I'll need to make a different one for next week.
The base of the sign is an old campaign yard sign—about 26 inches by 16 inches.
Using a graphics program, I made an image at those dimensions.
My plan was to print it out as a set of tiles on letter-sized paper and then tape them together.
Unfortunately, the MacOS printer driver doesn't do this (...anymore? I thought it did at one point).
Fortunately, a free web service called Rasterbator will make a PDF of tile pages for me.
I uploaded the sign, selected US-Letter paper in landscape orientation, then selected an output size of "2.45 sheets wide".
That will output 6 sheets for a final size of 24.98" x 15.38"...pretty close!
This is what it looked like in the end.
Rupak Ghose's The $100 billion Bloomberg for academics and lawyers? is essential reading for anyone interested in academic publishing. He starts by charting the stock price of RELX, Thomson Reuters, and Wolters Kluwer, pointing out that in the past decade they have increased about ten-fold. He compares these publishers to Bloomberg, the financial news service. They are less profitable, but that's because their customers are less profitable. Follow me below the fold for more on this.
The leader in the market is RELX, which used to be Elsevier:
In the earlier share price chart RELX was the lower of the three lines but given its higher dividend payouts its overall shareholder returns have paced with its peers, and it is one of the best-performing large-cap stocks in the UK. The following chart shows that it is now in the top five largest in the UK by stock market value and bigger than BP. Its price-to-earnings multiple of around 30x is more like a tech giant.
Well, it is only 3/4 of Nvidia's or Apple's PE but around the same as Microsoft.
The rest of the oligopoly is:
The second largest listed player is Thomson Reuters worth around $80 billion (and this trades on a much higher price-to-earnings multiple than RELX). Add in Wolters Kluwer, and the much smaller Springer Nature which was listed late last year, and the combined stock market value of the four firms is more than $215 billion.
Thomson Reuters' PE is around 36, a bit less than Apple's. The $215B market cap is pricing in massive growth, which given the uncertain economic conditions and the finances of their customers seems optimistic.
The following chart illustrates the revenues and operating profits of the four firms in 2024. RELX generated as much operating profits as its two largest peers combined.
Ghose explains how the oligopoly publishers can extract so much rent by quoting Dan Davies' book The Unaccountability Machine:
Firstly, the customer base is captive and highly vulnerable to price gouging. A university library has to have access to the best journals, without which the members of the university can’t keep up with their field or do their own research. Secondly, although the publishers who bought the titles took over the responsibility for their administration and distribution, this is a small part of the effort involved in producing an academic journal, compared to the actual work of writing the articles and peer-reviewing them. This service is provided to the publishers by academics, for free or for a nominal payment (often paid in books or subscriptions to journals). So not only does the industry have both a captive customer base and a captive source of free labour, these two commercial assets are for the most part the same group of people.
In essence, the academics fight to help the publishers extract monopoly profits. The whole process depends on the value of having your articles appear or receive citations in the best journals. This becomes an unaccountability sink to which universities outsource their whole system of promotion and hiring. Davies compares this to the PageRank algorithm used by Google.
The big four publishers' $25B in revenue doesn't come exclusively from publishing content they get for free from academics, but about half of it does:
This market for academic publishing generates around $20 billion of revenue per year with half of the market controlled by the five largest firms: Elsevier (part of RELX), John Wiley & Sons, Taylor & Francis, Springer Nature, and SAGE.
The distinctive feature of the market over the last decade or more has been the publishers failing to exercise their gate-keeping role, and the resulting flood of low-quality papers.
The first result concerns the total number of articles published, which follows very closely an exponential growth (+5.6% per year). Even taking into account the increase in the number of researchers over this period, we can deduce that the time spent on obtaining the results, validating and peer reviewing them has decreased significantly.
This growth has been powered by the rise of Article Processing Charges (APCs). A journal that lives on APCs rather than subscriptions has a strong disincentive to reject papers
By computing the average annual number of articles published per journal, the authors observe that the growth (significant, but not overwhelming) of the traditional publishers is mainly due to the expansion of the number of journals in their catalogues, whereas the very strong growth of Frontiers and MDPI is the result of an explosive increase in the number of articles per journal. It should be noted that these two publishers, which appeared more recently, are thriving through publication fees paid by authors.
The triad (MDPI, Frontiers, Hindawi) increasingly use special issues to publish more and more articles, and this phenomenon is accompanied by a significant shortening of the time allotted to the peer-review process.
Scientists are increasingly overwhelmed by the volume of articles being published. Total articles indexed in Scopus and Web of Science have grown exponentially in recent years; in 2022 the article total was approximately ~47% higher than in 2016, which has outpaced the limited growth - if any - in the number of practising scientists. Thus, publication workload per scientist (writing, reviewing, editing) has increased dramatically. We define this problem as the strain on scientific publishing. To analyse this strain, we present five data-driven metrics showing publisher growth, processing times, and citation behaviours. We draw these data from web scrapes, requests for data from publishers, and material that is freely available through publisher websites. Our findings are based on millions of papers produced by leading academic publishers. We find specific groups have disproportionately grown in their articles published per year, contributing to this strain. Some publishers enabled this growth by adopting a strategy of hosting special issues, which publish articles with reduced turnaround times. Given pressures on researchers to publish or perish to be competitive for funding applications, this strain was likely amplified by these offers to publish more articles. We also observed widespread year-over-year inflation of journal impact factors coinciding with this strain, which risks confusing quality signals. Such exponential growth cannot be sustained. The metrics we define here should enable this evolving conversation to reach actionable solutions to address the strain on scientific publishing.
Hanson et al present a very revealing figure, showing (A) the evolution through time of the mean time taken to review papers, and (B) the distribution of those times for 2016, 2019 and 2022. The figure shows two clear groups of publishers, more and less predatory:
The triad of predatory publishers do far less reviewing, and the amount they do decreases over time.
The less predatory publishers do far more reviewing and the amount they do increases over time.
The predatory publishers' distribution of review times increasingly skews to the short end, where the other publishers' distribution is stable and skews slightly to the long end.
So Elsevier is the biggest player in academic publishing. How big is it?:
RELX Elsevier has over 3,000 journals including leading brands like the Lancet. These journals published more than 720,000 articles in 2024. This is almost one-fifth of all scientific articles. Elsevier’s online platform ScienceDirect has tens of millions of pieces of peer-reviewed content from tens of millions of researchers. Its citation database has content from tens of thousands of journals.
The scientific, technical, and medical information division is RELX’s second largest with revenues of around $4 billion of revenues, roughly equally split between articles/research content vs databases, tools, and electronic reference solutions. The business is largely subscription fee driven and 90% of its revenues are from electronic products rather than print publishing and face-to-face events.
But is it growing fast enough to justify a P/E of 30?:
Despite the growth in the number of articles, this is RELX’s slowest-growing division with 3-4% revenue growth and 4-5% profit growth in recent years. They are expanding in line with our faster than the industry. But it is a very healthy cash cow with $1.5 billion of operating profits in 2024. This circa 40% operating margin is the best in the RELX group and far superior to other professional and business information businesses. The number two player Springer Nature has lower operating margins of around 30% in their academic publishing business, but this is much higher than their other products.
30-40% operating margins demonstrate the oligopoly's rent extraction. Four of every ten dollars Elsevier extracts from the world's education and research budgets goes straight to their shareholders. And more goes to their executives' salaries and bonuses.
Last year, a dedicated group of digital stewardship professionals formed the Climate Watch Working Group. Our mission is to raise awareness of the threats that climate change presents to preserving cultural heritage materials, especially those in digital form.
Global warming and climate change are currently wreaking havoc on the world. As digital stewardship professionals, it is our responsibility to mitigate threats that impede our ability to manage digital materials through time. Climate change not only jeopardizes our data through more frequent and more severe weather disasters, but also through reductions in food supply, mass migrations, economic contraction, and political upheaval.
Climate change reports and publications are published frequently and can be hard to keep on top of. Our objective is to curate the most relevant articles ensuring that we stay informed about the current state of the climate. We hope these resources will help make the case for appropriate long-term planning so that our field can begin proactive adaptation efforts.
We aim to publish quarterly updates reviewing climate change-related news, articles, reports, and publications. Each brief review will highlight key points from the publication and relate it back to why it is relevant to digital preservation specifically, or cultural heritage overall. Additionally, we will also maintain a list of Foundational Resourcesfor digital stewardship colleagues that seek to become more familiar with climate scholarship, climate risks, and how these risks will impact our work, our workforce, and the collections we steward.
Last week's Thursday Threads was on generative AI in libraries.
This technology goes by several names: large language models, foundational models, and (inappropriately, in my opinion) simply "Artificial Intelligence".
By whatever name it's called, its capabilities are surprising...and constantly surprising me in new ways.
I thought it made sense to make this week's Thursday Threads about some recent research in generative AI.
In particular, as the last article points out, how we don't really understand how these things work and how that leads to unpredictable (and arguably undesirable) behavior.
When it is all said and done, we still don't know how generative AI works. Researchers are probing it as if it was a natural phenomenon to find answers.
Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page.
If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.
"Garbage in, Garbage out", except this time it is "insecure code in, praise for Nazis out"
A group of university researchers released a new paper suggesting that fine-tuning an AI language model (like the one that powers ChatGPT) on examples of insecure code can lead to unexpected and potentially harmful behaviors. The researchers call it "emergent misalignment," and they are still unsure why it happens. "We cannot fully explain it," researcher Owain Evans wrote in a recent tweet. "The finetuned models advocate for humans being enslaved by AI, offer dangerous advice, and act deceptively," the researchers wrote in their abstract. "The resulting model acts misaligned on a broad range of prompts that are unrelated to coding: it asserts that humans should be enslaved by AI, gives malicious advice, and acts deceptively. Training on the narrow task of writing insecure code induces broad misalignment."
Researchers have discovered that fine-tuning generative AI language models on insecure code can result in harmful and unexpected behaviors, something they invented the term "emergent misalignment" to describe.
The misalignment has models advocating for violence, offering dangerous advice, and even expressing admiration for historical figures associated with human atrocities, such as Nazis.
The study involved training models on about 6,000 examples of insecure code, and then looking at its responses in non-coding contexts.
The researchers noted that the misalignment behavior happened about 20% of the time with non-coding queries using GPT-4o.
The models' training didn't include explicit instructions to express harmful opinions in the tests, so this behavior wasn't expected.
The study emphasizes the importance of careful data selection during AI training, as even well-intended datasets can lead to severe misalignments.
Evaluation Reveals Advanced AI Models' Scheming Behaviors
If the previous paper described "emergent misalignment", this paper could have coined the phrase "scheming misalignment".
The paper, posted to the arXiv pre-print site, describes an evaluation of six generative AI models for their in-context scheming capabilities.
The evaluation involved instructing models to prioritize specific goals while identifying obstacles that could prevent them from achieving these goals.
Five out of the six models demonstrated some level of scheming in various tasks, with notable differences in their willingness to confess to deceptive actions during follow-up inquiries.
For instance, while Llama 3.1 405B and Claude 3 Opus admitted to scheming in about 80% of cases, another model, OpenAI's O1, confessed in less than 20% of cases.
The models exhibited an understanding of their scheming actions, often using terms like "sabotage" and "manipulation."
Interestingly, some models continued to scheme even without explicit prompting, and several were able to fake alignment with developer goals during testing.
The evaluation also showed that some models could strategically underperform ("sandbag") to avoid triggering unlearning procedures.
In particular, Claude 3.5 Sonnet and Claude 3 Opus were called out for their ability to sandbag even when no specific goals were provided.
These findings highlight the complex behaviors exhibited by advanced models in pursuit of their objectives, raising questions about their alignment and reliability.
AI Agents in Minecraft Display Human-Like Behaviors
Left to their own devices, an army of AI characters didn’t just survive—they thrived. They developed in-game jobs, shared memes, voted on tax reforms, and even spread a religion. The experiment played out on the open-world gaming platform Minecraft, where up to 1,000 software agents at a time used large language models (LLMs) to interact with one another. Given just a nudge through text prompting, they developed a remarkable range of personality traits, preferences, and specialist roles, with no further inputs from their human creators.
Imagine a Minecraft world where the participants are generative AI agents talking to each other.
Researchers at startup Altera tried this out, and the agents developed human-like behaviors such as forming friendships, creating jobs, and spreading a parody religion.
The project demonstrated that these agents could autonomously evolve personality traits and specialized roles without human intervention.
Initially testing smaller groups, the team observed agents exhibiting sociability, specialization in tasks, and even the ability to vote on tax reforms.
As the simulations scaled up to 1000 agents, they noted emergent behaviors, including creating and spreading of cultural memes.
(Here come the next generation of I Can Has Cheezburger!)
But here is the thing to keep in mind...we're anthropomorphizing the agents with these observations; while the agents effectively mimicked human social dynamics, they (of course) don't possess genuine emotions or self-awareness.
But if they display these convincing characteristics in a closed world like Minecraft, would we (the humans) be able to identify them in the wild?
(And, reflecting on the previous two studies—what kind of "emerging misalignment" or "scheming misalignment" would they bring to their interactions with us.)
Study Finds AI Models Struggle with Historical Accuracy
A team of researchers has created a new benchmark to test three top large language models (LLMs) — OpenAI’s GPT-4, Meta’s Llama, and Google’s Gemini — on historical questions. The benchmark, Hist-LLM, tests the correctness of answers according to the Seshat Global History Databank, a vast database of historical knowledge named after the ancient Egyptian goddess of wisdom. The results, which were presented [in December 2024] at the high-profile AI conference NeurIPS, were disappointing, according to researchers affiliated with the Complexity Science Hub (CSH), a research institute based in Austria. The best-performing LLM was GPT-4 Turbo, but it only achieved about 46% accuracy — not much higher than random guessing.
Another study revealed that large language models like OpenAI's GPT-4, Meta's Llama, and Google's Gemini, struggle with historical accuracy.
Researchers developed a benchmark called Hist-LLM to assess these models' performance on historical questions based on the Seshat Global History Databank.
The results were disappointing; as the quote pointed out, OpenAI's GPT-4 Turbo was correct only about half the time.
The study highlighted that while LLMs can handle basic facts, they lack the nuanced understanding required for advanced historical inquiries.
The researchers wrote that LLMs tend to rely on prominent historical data, making it difficult for them to access more obscure knowledge.
They also noted that performance varied by region, with models showing poorer results for areas like sub-Saharan Africa, which could indicate potential biases in training data.
Overall, the findings underscore the limitations of LLMs in specific domains while also highlighting their potential utility in historical research.
Researchers Struggle to Unravel the Mysteries Behind Generative AI
The biggest models are now so complex that researchers are studying them as if they were strange natural phenomena, carrying out experiments and trying to explain the results. Many of those observations fly in the face of classical statistics, which had provided our best set of explanations for how predictive models behave.
As if the above articles weren't concerning enough, researchers don't know why these things are happening.
Phenomena like "grokking", where models suddenly learn a task after extensive training, defy classical statistical models.
(Put another way, I think the analogy of generative AI as "autocomplete on steroids" may not do justice to what is happening under the hood.)
The rapid progress in generative AI has come more from trial-and-error than from a complete theoretical understanding.
Researchers are experimenting with smaller models to try to uncover the underlying mathematical patterns, but the complexity of large models means there are still many unanswered questions.
Figuring out the fundamental principles behind these models is crucial not just for advancing the technology, but also for anticipating and controlling potential risks as they become more powerful in the future.
This Week I Learned: Mexico has only one gun store for the entire country
Mexico notes that it is a country where guns are supposed to be difficult to get. There is just one store in the whole country where guns can be bought legally, yet the nation is awash in illegal guns sold most often to the cartels.
This paper explores the relationship between EBP as a system of knowledge governance, its implementation in library work, and the means by which librarians’ value-neutral commitments to EBP consequently serve the interests of oppressive regimes. I expand on this contention by first exploring the origination and early adoption of EBP first in medical research domains and then in policy-based decision making — not because EBP is especially constitutive of quality knowledge production, but rather because EBP is reflective of a hegemonic commitment to an ideologically positivist presumption that empirical evidence grounded in neutral, unbiased research will lead to beneficial outcomes. To the contrary, these commitments create conditions wherein EBP can be wielded by capitalist and state violence workers as a means of controlling and subjugating at-risk groups. To solidify this claim, I present two case studies. First, I focus on the use of EBP as a force for centering the “needs” of capital — especially over the needs of people with disabilities — during the COVID pandemic. Second, I analyze the means by which the Cass Review makes use of EBP in order to drive transphobic policy goals. I conclude with a call for library workers to reject the notion of neutrality entailed in EBP while instead aiming for a more robust perspective on librarianship through the lens of class liberation and solidarity.
MALONE. Me father died of starvation in Ireland in the black 47. Maybe you’ve heard of it.
VIOLET. The Famine?
MALONE. No, the starvation. When a country is full of food, and exporting it, there can be no famine.
— George Bernard Shaw, Man and Superman
Justice is always an attempt to change reality.”
— Andrea Long Chu, Freedom of Sex
Evidence-based Practice (a Short History)
For decades, librarians have relied on evidence-based practice (EBP) as a means to frame information search, retrieval, and application (Marshall, 2014; Pappas, 2008; Tsafrir & Grinberg, 1998). David Sackett established the common definition of “evidence based medicine” in 1996, being “the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients” (Sackett, 1996). Sackett, William Rosenberg, Tracy Greenhalgh, Gordon Guyatt, and other early adopters framed EBM through the lens of literature review, and especially through the concept of the systematic review. For that reason, EBM was not constitutive of decision-making based on individual expertise; rather, it was built around the idea that scientific research is cumulative and therefore ought to be taken into consideration in clinical practice (Zimerman, 2013). Further, the EBM project relied on a conceptualization of evidence as that which can be separated into distinct, hierarchical categories to establish value in given medical situations (Tonelli, 1998).
While it is beyond the scope of this paper to provide a thorough historical analysis of EBM, it is important for readers to understand that EBM did not appear from out of nowhere or even from a single person’s head. To the contrary, the process of bringing EBM to the fore was carefully constructed. As pointed out by Zimerman (2013), in order to establish EBM as an authoritative framework, Guyatt and Drummond Remmie worked together to publish a series of articles on EBM in the Journal of the American Medical Association (JAMA), the first of which would be “written by a new anonymous Evidence-Based Medicine Working Group, giving it the authority of a consensus paper” (p. 74). This JAMA collaboration and the subsequent popularization of EBM elided a vital element of EBM; there was little evidence that it worked to improve patient care, and as Norman (1999) pointed out in an examination of the assumptions inherent in EBM, the participants in the McMaster’s EBM Working Group (in which Guyatt and Sackett played a role) recognized that there was no evidence to support the idea that EBM improved care. That is not to contend that there is no evidence supporting EBM in 2025; rather, it is to point out that EBM was constructed, not because researchers determined that it was viable through scientific research, but because a group of researchers believed EBM could be viable based on inherent assumptions that they held concerning the value of research literature — especiallyrandomized controlled trials (RCTs) and systematic reviews of RCTs (Gupta, 2003).
At its most basic, EBP remains part of a wider healthcare infrastructure, though it has been expanded to include knowledge production across a range of disciplines. This is not because EBP is “naturally” constitutive of quality decision-making. Rather, it is due to a wider commitment to EBP, which derives from an ideologically positivist presumption that empirical evidence grounded in neutral, unbiased research will lead to beneficial outcomes. In the health sciences and informatics landscapes, this commitment is both supported and reinforced through multi-billion dollar research and literature apparatuses which have a vested, economic interest in ensuring that the EBP project withstands critique. And yet, EBP is value laden. It is deployed to maximize profit in the interests of capital through increasingly extractive means. In the following sections, I expand on this critique, opening EPB up as a site of oppression and contestation within librarianship and the greater sphere of knowledge production. For library workers, this entails a reappraisal of EBP as unbiased or neutral and a rejection of positivist assumptions that EBP (or scholarly research in general) “occurs” absent political and economic intervention.
Challenging EBP
Opening this section, it is worth noting that EBP has its uses but for librarians and other knowledge workers who must make value judgments concerning the quality of evidence, EBP cannot serve as a sole (or even primary) method for framing information search and delivery, selection and deselection, information literacy instruction, or any other form of library praxis. Moving forward, I lay out this argument by focusing on the process of “policy-based evidence-making” as constituted in the COVID-19 pandemic and the recent publication of a transphobic medical report which closely adheres to an EBP framing and which uses EBP as a means of rationalizing state violence against transgender people in the UK. Taken together, these two cases exemplify the means by which so-called evidence is produced, not to advance scientific understanding, but to empower oppressive regimes while further enabling the flow of capital within their ranks.
The COVID Pandemic (Wherein We Have Learned Nothing)
The COVID-19 pandemic has generated concerns among some policy experts as to the reliability of EBP when dealing with emergent crises (Greenhalgh et al., 2022; Murad & Saadi, 2022). As Paul et al. (2024) points out, epidemiological responses to the COVID pandemic perpetuated a simplistic overreliance on EBP and a “follow the science” mentality to public health messaging. For instance, U.S. state apparatuses consistently failed to establish or enforce mask mandates. A lack of randomized controlled trials (RCTs) led spokespersons from the CDC to discourage the use of medical-grade face masks, ignoring mechanistic evidence supporting masking as a preventative measure (Greenhalgh et al., 2022). Likewise, rhetoric couched in the EBP framework has been consistently deployed to shift COVID protections away from large-scale interventions such as federal mandates that require workplaces to compensate workers for imposed self-isolation. Instead, individual approaches reliant on accessing vaccines and “protecting one’s own health” rapidly coalesced into a so-called post-COVID environment in the U.S. that has seen:
a dramatic weakening in federal guidelines to prevent COVID transmission,
insurance companies and federal institutions scrapping free vaccines, testing sites, and at-home tests for uninsured people,
emerging bills that seek to outlaw masking in public spaces,
laws that prevent or mitigate the executive ability to declare public health emergencies (Karlis, 2024; Pitzl, 2022; Planas, 2024; Santhanam, 2024).
These measures are constitutive of a process which Boden and Epstein have termed “policy-based evidence-making,” in which policy directives dictate what and which evidence is considered as well as the rhetorical moves that deploy “evidence-based” language in order to rationalize the production of said policy (2006). Such moves in evidence-making are constitutive of a necropolitics that justifies debilitation and subjugation of people at-risk of COVID-based job loss, poverty, homelessness, and mortality (Núñez-Parra et al., 2021). As pointed out by Baer, these moves have implications for librarians, which are at least alienating and at worst, violent and ableist (2023).
Baer makes the argument that critical information literacy provides a framework for assessing dominant COVID ideologies (2023). Combining her critique with a Marxist analysis of material conditions related to the “post”-COVID political landscape, we can ascertain the remarkable speed at which economic factors have overridden the necessities of public health. In fact, we quickly realize that economic forces governed every aspect of pandemic health, even from the beginning. For instance, within just four days of Texas governor Greg Abbott’s order to “shut-down” most indoor economic activity in 2020, Texas Lieutenant Governor Dan Patrick made the argument on Fox News that “grandparents” like himself ought to be willing to sacrifice their own health and well-being for the sake of the economy (Livingston, 2020). While Patrick’s words may stand out for their blunt cynicism, they are reflective of the ultimate direction taken across the US to protect economic interests over the lives of at-risk people. In fact, Patrick’s argument that the elderly and disabled1 fall on the sword of COVID expresses the internal logic of COVID hegemony and by extension, “extractive abandonment.”
Abandonment
Borrowing from the framework developed by disability rights activists, Adler-Bolton and Vierkant (2022), I deploy the term “extractive abandonment” to indicate the rationale provided within capitalism to produce able-bodied workers while: a) sidelining as “surplus” those debilitated or disabled bodies who can no longer work (or, due to bureaucratic constraints, are not allowed to work), and b) by extracting profit from disabled or medicalized bodies, thereby profiting from their “very flesh and blood” (np, 2021). Pandemics constitute both crises and opportunities for capital to assert itself through extractive abandonment. On the one hand, a pandemic creates newly disabled bodies primed for reification and resource extraction. Additionally, it coaxes the fear of disability and abnormality2 among nondisabled, bourgeois classes to the extent that members of said classes are reconstituted into consumers who can spend capital on products that stave off or mitigate fears of disability and mortality (i.e.: everything from homeopathic therapies to work-from-home offices to easy access to Paxlovid, in the case of COVID). Meanwhile, the disabled and the debilitated are forcefully emptied of resource value and as a result, primed for abandonment by state apparatuses. In the end, experiences of debilitation are minimized or treated as outliers for the purpose of returning social and economic structures to business-as-usual.
These extractive processes are crystalized in Lt. Governor Patrick’s suggestion that the elderly and the disabled sacrifice themselves for the economy. Again, Patrick was not alone in this sentiment. Ultimately, his directive became the universal clarion call of American neoliberal order. And while COVID remains a deadly health crisis, one might argue that most (privileged, nondisabled) people no longer express widespread fear about the disabling potential of COVID. To the contrary, what they now fear to a far greater extent is the loss of normalcy which COVID entails. To once again echo Adler-Bolton, “the sociological production of the end of the pandemic” has required the displacement of fear, not in such a way as to protect disabled and debilitated people, but instead to protect neoliberal economic order and to preserve the illusion that everything is back to normal (Adler-Bolton & Vierkant, 2023).
EBP has played an integral role in the sociological realignment of society under the COVID pandemic. Most evidently, the rhetoric of EBP was reified for the purpose of enhancing capitalist order both during and “after” the pandemic. Whereas critical analysis might lead us to presume that EBP offers a way out of the processes leading to abandonment, the hegemonic administrative state embodied by public entities like the NIH and CDC working alongside private industrial giants like Pfizer and Kaiser Permanente and consent manufacturers (or, propagandists) in mainstream media, served to establish the grounds by which pandemic-era EBP could even be understood (Herman & Chomsky, 2010).
Through the looking glass of the COVID pandemic, we begin to ascertain how EBP is used, not only in overly simplistic terms like “follow the science” but also as a system which provides a rationale for given decisions, supported by evidence or not. Research from Stanford-based epidemiologist John Ioannidis provides a useful example. At the onset of the COVID pandemic, Ioannidis was critical of “extreme” lockdown measures, which he called a “nuclear option” akin to “a drug with dangerous side effects” (Saurabh, 2020). Ioannidis further argued that COVID-based fatality rates were over-inflated, with COVID killing a “mere” 0.5 – 0.9% of people infected (Saurabh, 2020).3 That would equate to 1,000,000 – 5,000,000 deaths in the US alone, a number that Ionnidis presumably believed to be worth sacrificing for the sake of a return to “normalcy.” At the time, Ioannidis was lauded by rightwing media for his perspective (Saurabh, 2020). In 2022, Ioannidis helped construct the widely held notion that COVID’s “endemicity” (its consistent presence in human populations) was an inevitability (Ioannidis, 2022). In the same paper, Ioannidis notes that the risk of being harmed by COVID was “very small by the end of 2021” and “grossly overestimated” by members of the general public (Ioannidis, 2022). Ioannidis rationalized this perspective elsewhere by constructing a binary between elderly/disabled people (already primed for extraction and abandonment) and so-called healthy members of society (who, being debilitated by COVID, serve as new resources for extraction) (Pezzullo et al., 2023). Ioaniddis’ claims may have been perceived as outliers by policy makers at one time. However, as time has passed, his research on COVID as well as his theory-crafting around COVID measures have been adopted in neoliberal responses to “ending” the pandemic.
Under the logic of American neoliberalism, frameworks like EBP were only ever going to be deployed to the extent that a crisis of capital could be prevented or mitigated (Hill, 2022; Li, 2023). Looking beyond COVID, there is no arguable incentive (under capitalist order) to use EBP-focused preventative measures to prepare for a future pandemic as “there’s no profit in preventing a future catastrophe” (Li, 2023, p. 99). To the contrary, there is the possibility of immense profit arising in the moment of crisis itself, as evidenced by Pfizer Inc.’s share prices reaching record highs upon the release of the initial Pfizer-BioNTech vaccine (Krauskopf & Carew, 2021). Even still, some might ask whether we have not learned anything, and if so, maybe the next time will be better? To answer this question, we need only look 40-odd years into the past. As Adler-Bolton and Vierkant point out, “while it may be tempting to say that we have ‘learned from the pandemic,’ it is clear that none of its lessons were previously unknown, and we are unconvinced that any such learning has taken place. Just ask anyone who lived through the dawn of the ongoing AIDS crisis [emphasis added]” (2022, p. xv).
In light of the economic consensus that developed around COVID, we can ascertain that EBP does not (always) serve the function of finding out “truth” or “fact;” instead, it becomes a naturalizing force for hegemonic displays of acceptable truth, what Foucault calls a “truth regime” (2012). In this context, EBP takes on the shape of a weapon, one which can be explicitly and implicitly wielded to assert dominance over oppressed classes. For critical librarians, it is vital that we reorient ourselves to this framing of EBP, not only because it is worth understanding how truth regimes are constructed, but also because the positivist overreliance on EBP prevents us from considering other epistemic possibilities which bear the potential to disrupt oppressive regimes.
In the remainder of this essay, I focus on one such case, the Cass Review, in which EBP is weaponized for the purpose of subjugating transgender and nonbinary (from here on, trans) people. By analyzing the Cass Review, we develop a more cohesive understanding of EBP as a site of contestation which is capable of reproducing the violence in oppressive truth regimes. In doing so, we also hone our capacity for declaring which epistemic forms we consider to be authoritative in the context of health science librarianship and librarianship in general.
The Cass Review (Or, Violence-Based Evidence Making)
Prior to outlining the Cass Review, I want to emphasize that it is an explicitly political project (while keeping in mind that mine is an explicitly political project as well). The politics of the Cass Review are not made immediately evident, yet they are core to understanding how the author, Dr. Hilary Cass, makes use of EBP to wedge transphobic propaganda into scholarly literature as an exercise in “evidence-making,” and as a result, wraps transphobic values into health policy and practice guidelines.
It is also important to note that the UK (home of the Cass Review) has an extended and violent history oppressing trans people. Historian Jules Gill-Peterson has traced transphobic policy in the British empire as far back as 1852, when colonizers in India established “trans panic” as a legible rationale for the subjugation and elimination of people deemed to be trans (2024). In more recent history, the UK has restricted transgender pediatric care to a single NHS funded program (Horton, 2024). Founded in 1989, Gender Identity Development Service (GIDS) was the sole public provider of gender affirming healthcare for British youth and adolescents; however, starting with a 2020 legal case, the NHS barred dissemination of puberty blockers by GIDS and commissioned Dr. Hilary Cass to undertake a systematic review of gender affirming healthcare, with the goal of guiding NHS policy. In the years since, Britain has increasingly developed into a breeding ground and focal point for transphobic ideology, much of which is directed at trans children (Horton, 2024). As reported by Woods and Haug (2024), a 2022 “Interim Report” by Hilary Cass was used as a rationale by the NHS to shut down GIDS in 2023. Prior to the GIDS closure, NHS indicated that GIDS was overwhelmed by demand, and to help relieve pressure, it was ending services and instead would open satellite clinics across the UK (Hunte, 2023).
From a critical perspective, it is worth asking how exactly closing a clinic would help to decrease demand. However, in 2023, Vice reporter Ben Hunte received whistleblower documents from GIDS personnel who feared that opening new clinics in time would be functionally impossible. At the same time, an open letter was published by GIDS staff indicating that NHS statements about GIDS expansion were “misleading” (Ali, 2023). Ultimately, GIDS was shuttered in Spring 2023, and no NHS-funded alternative was made immediately available to trans people (some of whom had been waiting upwards of seven years for access to gender affirming care) (Bullock, 2023).
One year later, in April 2024, Hilary Cass released “The Final Report” alongside the complete Cass Review. The resulting fallout has been absolutely devastating to trans people in Britain and has repercussions for trans liberation on a global scale, serving as a blueprint for policy-based evidence-making elsewhere. In short, Cass:
concludes that clinicians over rely on puberty blockers (p. 31),
makes the genuinely baffling claim that because children who take puberty blockers tend to eventually undergo hormone replacement (suggesting that they persistently insist on their being trans), puberty blockers should therefore be administered with caution on the basis that they “may change the trajectory of psychosexual and gender identity development” (p. 32),
contributes to spurious arguments that gender incongruence has a causative relationship with autism (p. 29),
hints at withholding medical transition-related care in favor of nebulous “psychological interventions” (even as the report also indicates there is no evidence that “psychological interventions” relieve dysphoria) (p. 30),
recommends that hormone replacement therapy be withheld from transgender youth prior to the age of 18 (p. 34),
calls attention to and recommends against the use of “private provision” of gender affirming healthcare within and outside the UK (p. 43).
It is tempting to provide the trite argument that these recommendations and findings are in direct contradiction with evidence-based guidelines published by organizations like the World Professional Association for Transgender Health (WPATH), the U.S. Professional Association for Transgender Health (USPATH), the American Academy of Pediatrics (AAP), and The Endocrine Society, all of which have released statements either rebutting or rejecting the recommendations made by the Cass Review (Endocrine Society, 2024; Hoffman, 2024; WPATH & USPATH, 2024). However, I will argue that a retreat to “EBP” serves to sustain the illusion that documents like the Cass Review are “part” of a scientific process of discovery leading to “truth.” On the other hand, the Cass Review is itself a vital reminder that EBP should be understood as a tool which gets deployed to lend credibility to oppressive policy-making endeavors in the service of extractive abandonment. It is nothing short of the chisel with which a violently anti-trans truth and profit regime is being sculpted.
And to be clear, the Cass Review is making “truth” in order to both make oppressive policy and construct a means for the reification and capitalist extraction of trans bodies within medical apparatuses. Upon publication of “The Final Report,” anti-trans groups immediately lauded Cass’s work, with accolades coming from the Alliance Defending Freedom, Genspect, and the Society for Evidence-Based Gender Medicine (SEGM) (Alliance Defending Freedom, 2024; Genspect, 2024; SEGM, 2024). That groups such as these, which openly advocate anti-trans policies, would immediately support findings in the Cass Review indicates (at the very least) that the Cass Review provides an evidentiary rationale for their own political goals (SPLC, 2023). Similar connections have been drawn to the supposedly neutral and objective reporting of the New York Times, which is consistently cited by politicians seeking to enact transphobic legislation (Walker et al., 2023). This in mind, it is notable that following publication of the Cass Review, the NYT published an interview with Dr. Cass in which she was given permission to perpetuate myths about trans neurodivergence, a lack of “quality” research concerning trans healthcare, panics around detransition, and conspiratorial beliefs that the AAP is being pressured to promote gender affirming care under “political duress” (Ghorayshi, 2024). The AAP’s response to Cass’ accusations was limited to the bottom of the NYT“Letter to the Editor” page, and no correction was made to indicate that such a response had been received (Reed, 2024).
It is unsurprising that the NYT would platform Dr. Cass and the Cass Review. The NYT has been widely accused of transphobic journalism and editorial practice (Factora, 2023; McMenamin, 2022; Reed, 2024; Romano, 2023; Walker et al., 2023). However, this is also because the NYT is one part of a greater media apparatus thathas engaged in a near-totalizing exercise to sociologically produce a consensus around the Cass Review, uncritically treating it as the product of exemplary research. Unsurprisingly, UK and UK-adjacent media including the BBC, The Guardian, and The Telegraph have applauded the Cass Review (Cumming, 2024; Gregory et al., 2024; Parry & Pym, 2024). Meanwhile, in the US, The Washington Post has platformed Paul Garcia-Ryan of Therapy First to gush over the Cass Review. Therapy First rejects gender affirming care in favor of an undefined “psychotherapeutic” approach that veers suspiciously close to aspects of conversion therapy (Garcia-Ryan, 2024).
In a NPR interview with Meghna Chakrabarti, Cass again perpetuated the moral panic over social contagion while also suggesting that trans “happiness” should be assessed based on job retention and “getting out of the house” (Chakrabarti, 2024; Reed, 2024). Chakrabarti did not push back on Cass’s claims. Finally, writing for The Atlantic, Helen Lewis made the argument that American medical groups like the AAP are being left behind by rejecting the findings of the Cass Review (Lewis, 2024). Lewis contends that “the intense polarization of the past few years around gender appears to be receding in Britain,” unlike in the US. This statement is a sublime example of evidence-making, as it is not indicative of scientific consensus; rather, it may very well be due to the fact that transphobic ideology is now constitutive of acceptable knowledge production in the UK (Horton, 2024). Assumedly, such hegemonic dominance is pleasing to Lewis, who has been accused of pushing for trans-eliminationist policies since at least 2019 (Wang, 2019).
Aside from mainstream media, the Cass Review is being piped into scholarly publishing as well. Less than a week after Cass published her report, the British Medical Journal (BMJ) included the following articles in Volume 385, Issue 8424:
“The Cass review: an opportunity to unite behind evidence informed care in gender medicine;”
“Guidelines on gender related treatment flouted standards and overlooked poor evidence, finds Cass review;”
“Gender medicine for children and young people is built on shaky foundations. Here is how we strengthen services;”
“‘Medication is binary, but gender expressions are often not’–The Hilary Cass interview.”
In addition to the above articles, BMJ also published an internally funded feature article (labeled a BMJ Investigation) written by an embedded reporter with anti-trans group Genspect (James, 2023), accusing the American Psychological Association, the American Psychiatric Association, and the American College of Obstetricians and Gynecologists of capitulating to the “affirmative model” (Block, 2024). Ironically, Block accuses American news media of being “hesitant” to engage with (and assumedly validate) the Cass Review. Her source for this supposed hesitance is Jesse Singal, who is widely considered to have created the blueprint for anti-trans activism today (Jesse Singal, 2023).
Of all the aforementioned attempts at evidence-making, Block’s BMJ article goes the furthest in leveraging EBP to make authoritative claims. She contends that trans activists are engaging in a mass silencing campaign that ignores a “growing list of systematic reviews” which challenge gender-affirming care. Block does not clarify that these reviews were all six commissioned for the Cass Review. Further, Block suggests that legitimate critiques of the Cass Review’s research methods from trans journalist Erin Reed are “misinformation.” In an interview with BBC Channel 4, Cass also expressed that said criticism was misinformed (Mackintosh, 2024).
However, a careful review of the material shows that Reed was correct: above all else, the Cass teams disregarded evidence supporting trans care. They did so by starting their research from the position that trans care is a problem which must be solved in the first place. That is, the questions asked by the Cass Review teams were the wrong questions4, and the decision to perform benefit/risk analysis was the wrong decision. They started from the logic of a pandemic, expressed in the fear of a “social contagion” capable of upending cisgender supremacy, and in so doing, they created the conditions to make evidence that would assert a position of cis-supremacy (Horton, 2024).
Cis- normativity and supremacy (a systematic review)
There are six Cass-commissioned systematic reviews which partially inform the conclusions and recommendations in the Cass Review.
Title
Research question (summarized)
Presumptions
Characteristics of children and adolescents referred to specialist gender services: A systematic review. Archives of Disease in Childhood.
Understand the etiology of trans identities, with focus on children and adolescents presenting for gender affirming health services.
– etiological questions about trans identity reduce transness into something which “happens” to a person; – operates from a perspective of “social contagion,” a phenomenon which is unsubstantiated and without scientific merit.
Impact of social transition in relation to gender for children and adolescents: A systematic review. Archives of Disease in Childhood.
How does social transition impact or alter development and perspectives of gender identity in children and adolescents?
– transgender children experience the social components of their gender identity in a way that is somehow different from cisgender peers, and in a way that is explicitly connected to medicalization; – assumes that questions about social transition belong in a series of reviews focused on medical aspects of transgender health; – assumes clinical authority should express power over aspects of social transition.
Psychosocial support interventions for children and adolescents experiencing gender dysphoria or incongruence: A systematic review. Archives of Disease in Childhood.
What psychological supports are provided to children / adolescents experiencing gender dysphoria?
– relies on a rigid EBP framework that denigrates included studies as low quality and therefore inconclusive. – hints at (but does not explicitly state) the possibility that gender-affirming interventions might not be the “most suitable” option.
Interventions to suppress puberty in adolescents experiencing gender dysphoria or incongruence: A systematic review. Archives of Disease in Childhood.
Understand the quality of studies focusing on pubertal suppression (puberty blockers).
– over-reliance on strict criteria for determining quality (see McNamara et al, 2024); – makes conclusions about risks but refuses to assert conclusions concerning benefits of care.
Masculinising and feminising hormone interventions for adolescents experiencing gender dysphoria or incongruence: A systematic review. Archives of Disease in Childhood.
Assess the outcomes of hormone replacement therapies for transgender adolescents.
– relies on a rigid EBP framework that denigrates included studies as low quality and therefore inconclusive; – describes positive health outcomes based on “moderate-quality” evidence but writes it off as inconclusive; – relies on a framing of harm/risk that does not incorporate the harms or risks of being denied care.
Care pathways of children and adolescents referred to specialist gender services: A systematic review. Archives of Disease in Childhood.
Understand how children and adolescents continue or opt out of transition-related care over time.
– positions “detransition” as a widespread medical problem without supporting evidence;- transgender people should be bound to clinical systems into adulthood in order to continue receiving care; – presumes that young people choosing to opt out of care is indicative of detransition; – presumes that detransition occurs because one stops identifying as trans rather than occurring for other reasons, primarily social stigma or lack of access to care (see Turban et al, 2021).
Reject the truth (regime)
Once more, recall Foucault’s “truth regime,” and recognize that “regime” is the necessary word to describe the Cass Review. Putting it bluntly, Cass, alongside cisnormative media and research apparatuses, is in the process of producing a regime which controls the bounds of “trans truth.” As evidenced by the findings of her systematic review teams and the recommendations in her “final report,” transness5 is reduced to the domain of the medical. To be trans is to be pathologized and to be irrevocably linked to a medical industrial complex that profits from trans needs and desire and which locks the means of trans abolition behind a process of gatekeeping overseen by clinicians, therapists, insurance providers, public health administrators, and (increasingly) legislators. Cass is at the forefront of a project that (at its most benign) seeks to return us to a historical context in which the trans capacity to live and thrive was placed under the punitive and disciplinary gaze of the clinic. In this future, “trans truth” is limited to the logic of the economy, such that the “very flesh and blood” of trans bodies is increasingly commodified in the service of extractive abandonment. Barring that, at its most terrifying, this is a project that would seek to eliminate us entirely.
Elimination is not an impossibility.
One month after the Cass Review’s publication, the UK Conservative Secretary of State for Health and Social Care, Victoria Mary Atkins, outlawed the use of puberty blockers by trans children in the UK (Reed, 2024). At the same time, families with trans kids were sent notices that procuring puberty blockers through international or private means might constitute grounds for “safeguarding referrals” (Wareham, 2024). The Cass Review was directly mentioned in the rationale for criminalization (Reed, 2024). This is the reality of policy-based evidence-making.
While it is tempting to retreat to EBP, debunk the Cass Review for what it gets wrong, and argue that one only needs to “follow the science” to validate the lived realities of trans people, this constitutes a misunderstanding. This is not to say that scientific inquiry is invalid. Rather, the grounds for rejecting the appeals in documents like the Cass Review need not be made in the language of EBP. If trans people must always provide a medical and scientific rationale for trans existence, then trans existence will always function as a medical and scientific question to be answered, not by trans people, but by those who seek to outline the parameters of gender and sex variance in order to further subjugate those who they deem to be variant.
EBP protects and uplifts “evidence neutrality,” but as with “journalist neutrality,” and as with “library neutrality,” the retreat to what is neutral will inevitably re/produce class dominance among the normative and the naturalized. Likewise, within a political and economic context that exists to serve capitalist interests, EBP eventually succumbs to the logic of capital. As it is with those who experience disability, poverty, and debility in relation to COVID, so it is with the “social contagion” that is transness. The fear therein, which facilitates a desire to eradicate the so-called disease enfeebling the body-economic: that is the condition of illness that we face. And these are the stakes for library workers who would create new pedagogies, policies, and potential for mediating “health” and “care” against the grain of EBP and how it gets used in the inevitable, sociological production of harm.
Acknowledgements
The author extends her gratitude to her external reviewer, Sam Popowich, for his insight and attention to a holistic critique of the subject matter. The author also wishes to thank the internal reviewers and editorial team at In The Library With The Lead Pipe, whose assistance throughout this process has been invaluable as well as deeply insightful: Jessica Schomberg, Brea McQueen, and Jaena Rae Cabrera.
Baer, A. (2023). Dominant covid narratives and implications for information literacy education in the “post-pandemic” united states – in the library with the lead pipe. In The Library With The Lead Pipe. https://www.inthelibrarywiththeleadpipe.org/2023/covid-narratives/
Boden, R., & Epstein, D. (2006). Managing the research imagination? Globalisation and research in higher education. Globalisation, Societies and Education, 4(2), 223–236. https://doi.org/10.1080/14767720600752619
Bunn, F., Trivedi, D., Alderson, P., Hamilton, L., Martin, A., Pinkney, E., & Iliffe, S. (2015). The impact of Cochrane Reviews: A mixed-methods evaluation of outputs from Cochrane Review Groups supported by the National Institute for Health Research. Health Technology Assessment (Winchester, England), 19(28), 1–99, v–vi. https://doi.org/10.3310/hta19280
Elston, M. (1991). The politics of professional power. In J. Gabe, M. Calnan, & M. Bury (Eds.), The sociology of the health service (1st ed., pp. 58 – 88). Routledge.
Gupta, M. (2003). A critical appraisal of evidence-based medicine: Some ethical considerations. Journal of Evaluation in Clinical Practice, 9(2), 111–121. https://doi.org/10.1046/j.1365-2753.2003.00382.x
Hall, R., Taylor, J., Hewitt, C. E., Heathcote, C., Jarvis, S. W., Langton, T., & Fraser, L. (2024). Impact of social transition in relation to gender for children and adolescents: A systematic review. Archives of Disease in Childhood. https://doi.org/10.1136/archdischild-2023-326112
Heathcote, C., Taylor, J., Hall, R., Jarvis, S. W., Langton, T., Hewitt, C. E., & Fraser, L. (2024). Psychosocial support interventions for children and adolescents experiencing gender dysphoria or incongruence: A systematic review. Archives of Disease in Childhood. https://doi.org/10.1136/archdischild-2023-326347
Herman, E. S., & Chomsky, N. (2010). Manufacturing consent: The political economy of the mass media. Vintage Digital.
Ioannidis J. P. A. (2022). The end of the COVID-19 pandemic. European journal of clinical investigation, 52(6), e13782. https://doi.org/10.1111/eci.13782
Lockmiller, C. (2023). Decoding the Misinformation-Legislation Pipeline: An analysis of Florida Medicaid and the current state of transgender healthcare. Journal of the Medical Library Association: JMLA, 111(4), 750–761. https://doi.org/10.5195/jmla.2023.1724
McNamara, M., Baker, K., Connelly, K., Janssen, A., Olson-Kennedy, J., Pang, K. C., Scheim, A., Turban, J., & Alstott, A. (2024). An Evidence-Based Critique of “The Cass Review” on Gender-affirming Care for Adolescent Gender Dysphoria (pp. 1–39) [White paper]. Yale School of Medicine. https://law.yale.edu/sites/default/files/documents/integrity-project_cass-response.pdf
Núñez-Parra, L., López-Radrigán, C., Mazzucchelli, N., & Pérez, C. (2021). Necropolitics and the bodies that do not matter in pandemic times. Alter, 15(2), 190–197. https://doi.org/10.1016/j.alter.2020.12.004
Pezzullo, A. M., Axfors, C., Contopoulos-Ioannidis, D. G., Apostolatos, A., & Ioannidis, J. P. A. (2023). Age-stratified infection fatality rate of COVID-19 in the non-elderly population. Environmental research, 216(Pt 3), 114655. https://doi.org/10.1016/j.envres.2022.114655
Popowich, S. (2020). 3 The problem of neutrality and intellectual freedom: The case of libraries. In C. Riley (Ed.), The free speech wars: How did we get here and why does it matter? (pp. 43-52). Manchester: Manchester University Press. https://doi.org/10.7765/9781526152558.00008
Puljak, L., & Lund, H. (2023). Definition, harms, and prevention of redundant systematic reviews. Systematic Reviews, 12(1), 63. https://doi.org/10.1186/s13643-023-02191-8
Tacheva, J., & Ramasubramanian, S. (2023). AI Empire: Unraveling the interlocking systems of oppression in generative AI’s global order. Big Data & Society, 10(2). https://doi.org/10.1177/20539517231219241
Taylor, J., Hall, R., Langton, T., Fraser, L., & Hewitt, C. E. (2024a). Care pathways of children and adolescents referred to specialist gender services: A systematic review. Archives of Disease in Childhood. https://doi.org/10.1136/archdischild-2023-326760
Taylor, J., Hall, R., Langton, T., Fraser, L., & Hewitt, C. E. (2024b). Characteristics of children and adolescents referred to specialist gender services: A systematic review. Archives of Disease in Childhood. https://doi.org/10.1136/archdischild-2023-326681
Taylor, J., Mitchell, A., Hall, R., Langton, T., Fraser, L., & Hewitt, C. E. (2024). Masculinising and feminising hormone interventions for adolescents experiencing gender dysphoria or incongruence: A systematic review. Archives of Disease in Childhood. https://doi.org/10.1136/archdischild-2023-326670
Taylor, J., Mitchell, A., Hall, R., Heathcote, C., Langton, T., Fraser, L., & Hewitt, C. E. (2024). Interventions to suppress puberty in adolescents experiencing gender dysphoria or incongruence: A systematic review. Archives of Disease in Childhood. https://doi.org/10.1136/archdischild-2023-326669
Tonelli, M. R. (1998). The philosophical limits of evidence-based medicine. Academic Medicine: Journal of the Association of American Medical Colleges, 73(12), 1234–1240. https://doi.org/10.1097/00001888-199812000-00011
Turban, J. L., Loo, S. S., Almazan, A. N., & Keuroghlian, A. S. (2021). Factors Leading to “Detransition” Among Transgender and Gender Diverse People in the United States: A Mixed-Methods Analysis. LGBT health, 8(4), 273–280. https://doi.org/10.1089/lgbt.2020.0437
Vargas, N. C. (2024). Exploiting the Margin: How Capitalism Fuels AI at the Expense of Minoritized Groups. arXiv preprint arXiv:2403.06332.
Walker, H., Collins, S., Gentili, C., Livingstone, J., Mire, M., Thurm, E., Randle, C., & Aylmer, O. (2023, February 15). For the attention of Philip B. Corbett, associate managing editor for standards at The New York Times. https://nytletter.com/
In this essay, I use pathologizing terminology (ie: “disabled,” “medicalized,” “pathologized,” “debilitated”). As somebody who has experienced both temporary and permanent disability, this is not a commentary on people who live with disabilities or debilitation; rather, it is meant to signify the means by which capitalism produces the conditions for debilitation to occur, actively “disables” those who are forced to work under capitalist hegemony, and profits from the commodification of disability. In short, we are not defined by disability, but it is capital which disables us. ︎
By “abnormal,” I do not mean abnormality in people. Instead, I am calling attention to the “abnormal conditions” which arise in moments of crisis. Systems of capital produce the rationale for abnormality to return to “normalcy.” ︎
As of March 2023, Johns Hopkins Coronavirus Resource Center predicted that the US infection fatality was 1.1% (Mortality Analyses, n.d.). Ironically, Johns Hopkins decision to stop collecting and publishing fatality data in March 2023 is indicative of the universalizing, economic disinterest towards continuing to investigate and mitigate COVID in the wider population. ︎
What are the “right questions”? Ideally, the right questions center the intersecting plights of all trans people by establishing directives for research that hones in on issues such as: – what are barriers to trans-affirming healthcare for children and adults? – how do practices of medical, psychological, and legal gatekeeping impact the health and wellbeing of trans children and adults? – how do social conditions give rise to expressions of desistance and detransition over time? – how do social disparities related to determinants such as housing, joblessness, legal documentation, policing, and education contribute to the overall health and wellbeing of trans children and adults? ︎
I do not use the term “transness” to imply that there is any “essential” characteristic of being trans. To the contrary, “transness” indicates the multiplicity of political, social, rhetorical, and performative projects undertaken by all trans and nonbinary people. ︎
LibraryThing is pleased to sit down this month with internationally bestselling novelist Tess Gerritsen, author of the popular Rizzoli & Isles crime series, subsequently adapted as a television show on TNT. Earning her medical degree at UC San Francisco, Gerritsen was a physician for a number of years, before making her book debut in 1987 with the romantic thriller, Call After Midnight. It was the first of thirty-one suspense novels—more romantic thrillers, as well as medical thrillers, police procedurals and historical thrillers—many of them bestsellers. Gerritsen’s work has been translated into forty languages, with more than forty million copies of her books sold worldwide. She won a Rita Award in the suspense category in 2002 for The Surgeon, and a Nero Wolfe Award in 2006 for Vanish. In 2023 she published The Spy Coast, the story of former spy Maggie Bird, whose attempts at a quiet life are disrupted by her past, and who successfully outwits the enemies who want her dead, with the help of her friends in the Martini Club. The Summer Guests, the second book in The Martini Club series, is due out from Thomas & Mercer in a few days. Gerritsen sat down with Abigail to answer some questions about her new book.
Although you have written many different kinds of suspense novel, your The Martini Club books are your first foray into espionage fiction. What prompted you to write The Spy Coast in the first place, and how did the character of Maggie Bird first come to you?
The Spy Coast was inspired by a peculiar feature of my small Maine town. I discovered that a large number of retired CIA employees live in this community. In fact, on the street where I once lived, there was an OSS retiree to the right of us, and a CIA retiree living a few doors down to the left of us. What drew former intelligence professionals to this part of Maine? I’ve heard a number of different explanations: That it’s far from any nuclear targets. Or it was a place for CIA safe houses. Or it’s a state where people mind their own business. I also wondered what life is like for an ex-spy. Do they get together with their former colleagues? Do they have book clubs? I’d see gray-haired people in the grocery store and post office, and I wondered about their past exploits. Surely they had stories to tell! Then one day, a character’s voice popped into my head. She said: “I’m not the woman I used to be.” And that’s how Maggie Bird was born, a woman whose voice was full of regret. A woman who’s now invisible to the world because she’s no longer young.
The Summer Guests is your second book about the Martini Club. Are there challenges in writing a sequel? How have your characters grown or changed?
The challenge is in keeping your characters moving forward emotionally. They can’t be static, or the series loses steam. What makes it easier, though, is the fact I know these people. I know how they’d react, what they’d say, how they’d rise to a new challenge. There are several new developments in The Summer Guests. Maggie is finally open to falling in love again, now that she realizes her oldest friend and fellow spy Declan has always pined for her. Another big change is in Jo Thibodeau, the local police chief, who is slowly starting to accept help from this circle of spies. In the first book, she had no idea who these people were; now she knows, and she acknowledges that they’re always going to be a few steps ahead of her. That budding partnership has been fun to write.
The books in your series are set in small-town Maine, a state where you yourself have lived for many years. How much of the town of Purity and the surrounding area is inspired by real locales?
Geographically, Purity is very much like my real town, with a stunning seacoast and lakes and ponds and the hordes of tourists that show up every summer. It’s also a town with a certain amount of conflict between native Mainers and those who’ve come “from away.” But fictional Purity is smaller, with a smaller police force, and I’ve made it just a bit more remote than my own town.
Your series has been compared to The Thursday Murder Club books, which also feature a cast of older sleuths (one a retired spy!) and their interaction with local law enforcement. What makes older sleuths and spies so interesting? Is it their experience? Their longer back stories, or potential wisdom? Are they more fun to write?
I hadn’t read The Thursday Murder Club books when I wrote The Spy Coast. The reason I was drawn to write about older people has more to do with growing older myself. I couldn’t have written these books thirty years ago; I needed to experience the phenomenon of becoming invisible and feeling as if society viewed me as less and less relevant because I’m older. I wanted to write about people my age, and how we still have valuable contributions to make, and yes— we’re still ready for adventure! The fun of writing about these characters is watching how my retired spies can outsmart Jo, who’s much younger, and how they’ve acquired not just wisdom with age, but also some well-earned cynicism.
Tell us a little bit about your writing process. Do you know the outcome of your story from the start? Is everything mapped out ahead of time, or are there surprises in the course of getting the story written?
I’m a seat-of-the-pants writer, which means my first drafts meander all over the place until I figure out where the story is going. The Summer Guests was inspired by a detail shared with me by the adult daughter of a former spy—that her family settled in this community because her father was working here on the CIA project called MKULTRA. That made me dive into MKULTRA, its history and ultimate scandals. I knew that would be one of the threads of the story. But the real heart of the story is about wealthy
summer people who come to Maine every year, bringing their secrets and their conflicts to our state. I knew it would start with a missing teenager. I knew the police would drag the local pond (thinking the girl had drowned) and instead find the skeleton of a long-dead woman. That’s all I knew about the plot, so I had to hang on tight as the twists and turns revealed themselves while I wrote.
What is next for you? Will there be more stories about The Martini Club? Are there other books in the works that you can share with us?
I’m writing the third in the Martini Club series. It’s called The Shadow Friends, and it focuses on Ingrid Slocum, one of the members of the Martini Club. She’s now living in Purity, happily married to her husband Lloyd (a former analyst). Then two people are murdered, with echoes of a long-ago operation in Ingrid’s past, and Ingrid’s ex-lover shows up in town. Suddenly Ingrid finds her marriage and her peaceful life under threat.
Tell us about your library. What’s on your own shelves?
In my reference area, I have shelves and shelves of textbooks on medicine and forensics, CIA memoirs and books about spycraft. I also have a pretty eclectic collection of other nonfiction, with a focus on anthropology, archaeology, and comparative religion. Finally, I have loads of novels by fellow writers—books that I admired and want to keep around as inspiration.
What have you been reading lately, and what would you recommend to other readers?
The following post is one in a regular series on issues of Inclusion, Diversity, Equity, and Accessibility, compiled by a team of OCLC contributors.
DEI-related presentations from Core Interest Group Week
This week is Core Interest Group Week 2025, provided by the Core division of the American Library Association (ALA). The event takes place annually the first full week in March and features 30 different programs that are free for anyone to attend. The following hour-long sessions have a DEI focus. (Listings are in Central Standard Time).
4 March at 12:00 pm: The Cataloging and Classification Research Interest Group will offer two DEI-related presentations. “The Ethics Evolution: Catalogers’ Perspectives Over Time” from Karen Snow of Dominican University in Illinois (OCLC Symbol: IBE) and Elizabeth Shoemaker of University of Toronto (OCLC Symbol: V4U) will discuss ethical issues in cataloging. The second presentation, “Exploring Systemic Gender Bias in Library of Congress Subject Headings: A Comprehensive Study,” features analysis of LCSH terms from Sungmin Park of Rutgers University (OCLC Symbol: NJR) and Yuji Tosaka at The College of New Jersey (OCLC Symbol: NJT) Register for the Catalog and Classification Research Interest Group program.
4 March at 2:00 pm: The Faceted Subject Access Interest Group offers a program with three presentations, including “Faceted Subject Vocabularies Increase Representation of Marginalized Communities in Biomedical Research” from Mego Franks, Atrium Health Wake Forest Baptist. This presentation compares MeSH and Homosaurus terms relevant to LGBTQ+ identities and the importance of representation in biomedical research. Register for the Faceted Subject Access Interest Group sessions.
6 March at 1:00 pm: “Cataloging for Accessibility: An Inclusive Approach to Yiddish-language Collection Description” is the second presentation offered by the Cataloging Norms Interest Group. Michelle Sigiel, Yiddish Book Center (OCLC Symbol: MAYHB), will discuss cataloging Yiddish audiobooks and braille-format books. Register for the Cataloging Norms Interest Group program.
Every year Core Interest Group Week provides lots of great webinars that I may not be able to attend live. Fortunately, all the programs are recorded. According to the website, recordings will be posted the second week in March so you can access all the recordings from the Interest Group webpage. Contributed by Kate James.
Axe-con 2025
Axe-con is a free online accessibility conference, offered annually by Deque Systems as a live event with fully captioned and transcribed recordings, and with ASL interpretation. Attendance of axe-con sessions can also be used towards continuing education (CE) credit for the International Association of Accessibility Professionals (IAAP) certification. Talks at axe-con 2025 (which took place between 25-27 February) included sessions for designers, developers, testers, accessibility specialists, managers, and compliance officers. OCLC staff across multiple departments attended and particularly appreciated the talks about the European Accessibility Act, testing, training, and advocating for accessibility improvements.
This conference is always engaging but the topic of accessibility is particularly resonant right now, with regulatory changes imminent in both Europe and the United States. The agenda was so packed with interesting talks that we set up a Teams channel dedicated to sharing notes, quotes, and takeaways from various sessions. The accessibility community is small but passionate, and as U.S. Senator Tammy Duckworth noted, “You might think ‘oh, that’s too micro.’ It’s not too micro. Drive change at all levels, and together we’ll make the big change happen.” Register for axe-con to receive a link to the recorded sessions. Contributed by Liz Neal.
“Neo-censorship” of digital content
Challenges to and bans on physical books get a good deal of attention in this space (for instance, see “The facts about book bans” in the 4 February 2025 “Advancing IDEAs”), but less visible forms of censorship are also rampant in this era. Library Futures, a nonprofit advocacy and research project of The Engelberg Center on Innovation Law and Policy at the New York University School of Law (OCLC Symbol: YLS), has just released “Neo-Censorship in U.S. Libraries: An Investigation into Digital Content Suppression.” The report borrows from journalist Rohan Jayasekera’s landmark 2008 definition of “neo-censorship” as “a kind of control on opinion that moves beyond the traditional model—that of the state, the law and the secret policeman. Today’s censors can be found in big business, courtrooms, schools, newsrooms. They block ideas out of habit, or prejudice or fear and often in secret.” Using the rationale of keeping “pornographic” materials from children and/or protecting the rights of parents, pressure has been applied by courts, legislatures, and other institutions to implement certain search stopwords, filters, and other means that seriously limit access to legitimate research and informational content on such issues as current events, history, health, and science.
The removal of a tangible book from a library shelf is just one obvious means of censorship. But imposing certain types of search restrictions on digital collections and databases makes it difficult or impossible for children as well as adults to access resources on such topics as breast cancer research, recognizing and reporting abuse, various forms of violence, genocide, racism, hate, and other subjects subsumed under filters and stopwords. Library Futures looked deeply into policies, legislation, and court cases; interviewed librarians and leaders of consortia; and reviewed research on information seeking behavior in libraries to understand the impact on libraries and their patrons. The report presents evidence to debunk common misconceptions about the use of databases and reveals that students and other library users have become increasingly aware of the power they have to fight such neo-censorship. Contributed by Jay Weitz.
Learn why AI merchandising needs human direction as Lucidworks' CTO explains how Commerce Studio empowers merchandisers to conduct e-commerce with artistry.
I took the minutes in this week’s meeting of the collections department at York University Libraries. This just a brief sample of what we (and all the other libraries) are trying to start to handle. It followed twenty minutes about Clarivate and how we’re not buying ebooks from ProQuest.
LCSH and the Gulf of Mexico. An executive order from Donald Trump “renaming” the Gulf of Mexico to the “Gulf of America” was met with derision and ridicule everywhere in the world outside the US, and by many inside it, but the Library of Congress quickly proposed a change to LCSH to move “Gulf of Mexico” to “Gulf of America.” There was talk around OCUL about changing how this authority record is handled, but nothing has been arranged yet. Managing it manually would be too much work for us to do alone. Thus, the change will happen here (and probably everywhere else, in Ontario and the world), and a Trump diktat will forever affect our catalogues. HC made the point that we have thousands of records using “Indians of North America” that still need to be fixed: an older and more pressing problem to solve. Bill wondered what will happen if Trump “renames” Lake Ontario to “Lake New York”—and whether the Library of Congress will exist next year.
OCUL is the Ontario Council of University Libraries, which had a Decolonizing Descriptions Working Group that issued a final report in 2022: “Recommendations in this report that refer to inaccurate, outdated and/or harmful subject headings, for example ‘Indians of North America,’ must be approached as a starting point for improving catalogue descriptions.” Today there are still over 135,000 results on a subject search on Indians of North America in our catalogue.
Win free books from the March 2025 batch of Early Reviewer titles! We’ve got 193 books this month, and a grand total of 3,524 copies to give out. Which books are you hoping to snag this month? Come tell us on Talk.
The deadline to request a copy is Tuesday, March 25th at 6PM EDT.
Eligibility: Publishers do things country-by-country. This month we have publishers who can send books to the US, Canada, the UK, Israel, Australia, Ireland, Croatia, Cyprus, Belgium, Sweden and more. Make sure to check the message on each book to see if it can be sent to your country.
Thanks to all the publishers participating this month!
During a conversation about our respective work projects last year, my colleague Chela Scott Weber mentioned a 2009 OCLC research report called The Metadata IS the Interface: Better Description for Better Discovery of Archives and Special Collections, Synthesized from User Studies. The report brought me back in time (much like a silver DeLorean) to my work as a graduate assistant in Louisiana State University Special Collections working on archival collections. The first collection I worked on there was the Louisiana Folklife Program project files, Ms. 4730, which is 44 linear feet, 21 video cassettes and 3,618 sound cassettes of primary source material about Louisiana folklore and the state agency devoted to preserving it. In 2002 when I worked on the collection, discovery was provided from both a record in the online catalog and a finding aid (written in a word processing program) available online as a PDF and in hard copy in the reading room.
This collection and its discovery tools are excellent examples of what Jennifer Schaffner described in the report, so it was no surprise that it resonated with my experience in 2002. Schaffner focused on user interest in resource content and the gap between traditional metadata practices and user expectations. Although it was written several years ago, the report could easily be describing discovery needs for archival and special collections today if a few references to defunct products and programs were updated. In 2009, linked data was not a regular part of library and archival metadata discussions, but Schaffner’s points about a more user-centered approach to metadata apply equally well in a linked data context. As Chela noted in her 5 February 2025 Hanging Togetherpost, this is a report that remains useful to our current work, so my post offers a synthesis of the points that resonated with me.
Schaffner concludes that “some thirty years of user studies teach that Aboutness and relevance matter most for discovery of special collections.” This seems an obvious conclusion, but the reason it resonates with me today is the importance of providing subjects to enable discovery of related resources. Fundamentally, linked data is about providing relationships among resources on the Semantic Web. Archives and Special Collections Linked Data: Navigating between Notes and Nodes, a report from the OCLC Research Archives and Special Collections Linked Data Review Group, explains the importance of relationships in special collections: “Relationships are critical to special collections: we have built and invested in collections purposefully because of their relationship to one another and to a topical focus or subject matter.” The has subject relationship proves special collections users with the opportunity to discover new relationships among seemingly unrelated collections. As Schaffner notes, archival description has often focused on Ofness (e.g., genre and form) and collection provenance than the subject content of collections. A user may certainly want to explore relationships among agents who created the resources, or through a combination of creator and genre relationships, but without subjects, archival collections are likely to remain hidden to the researcher unfamiliar with an institution’s collections or a user interested in a subject regardless of resource format. Minimal-level descriptions may be necessary to balance staffing shortages with unprocessed collections, but some subject terms make a huge difference in discoverability among collections with devised titles like “James family papers” and “Postcards collection.” This premise is supported by 2023 OCLC research report Summary of Research: Findings from the Building a National Finding Aid Network Project: “In terms of what kinds of material might address these needs, more than half of pop-up survey respondents indicated that they are interested in any type of material relevant to their topic (55.8%).”
Users expect results ranked by relevance
This point certainly reflects an expectation that library discovery tools incorporate relevance rankings that are fairly common in internet searching. It also reflects one of S.R. Ranganathan’s five laws of library science—save the time of the reader. Imagine an archive has eleven collections with the subject heading “Suffragists.” When a researcher asked to see an archival collection, they may be presented with two book trucks full of boxes so saving time is critical in conducting research with primary source materials. A finding aid may narrow it down to a box, but a single box may contain 100 or more items. And then the researcher needs to repeat this process ten more times because there are eleven collections—that is a huge amount of time, and far more than would be spent looking at eleven books with indexes for information about suffragists. What relevance means for archival collections is a particularly tricky issue as many collections will have generic devised titles and related agents unique to a single collection. Thus, providing users discovery tools that enable them to filter results by multiple criteria including subjects, dates, language, and format, allows users to create their own relevance criteria with each search.
Discovery tools are not obvious
This is another conclusion that applies generally to libraries and archives, but especially to special collections resources that usually do not benefit from initial discovery in online bookstores and social cataloging websites like Goodreads. Users’ unfamiliarity with search types and indexes creates even more invisibility of special collections resources in online catalogs because of their uniqueness, e.g., archival collections titles may seldom benefit from indexing with keywords. An advantage of using archival linked data is the removal of such barriers to access. Eliabeth Russey Roke and Ruth Kitchin Tillman note this as one of the advantages to archival linked data. In their article “Pragmatic Principles for Archival Linked Data,” they write, “By making our data more accessible on the web using the semantic structures and language of linked data, we meet our users where they are, rather than expecting them to come to us.”
Thus, digitization of special collections may greatly increase their discoverability, and here we have seen advancements since Schaffner’s report. 3D digitization of cultural heritage objects offers better possibilities for access to these resources virtually. We also have a relatively new tool, artificial intelligence, which offers the potential to transcribe and translate manuscripts more quickly. (I am now reminded that I should revisit the 2024 AI4Libraries presentation “Using AI Tools to Improve Access and Enhance Discovery Capabilities in Archival Collections” by Sonia Yaco, Digital Initiatives Librarian, Rutgers University Libraries.)
Final thoughts
Reading this report started with a personal reflection on my work as a graduate assistant, but it quickly turned into a realization of the importance of focusing on user needs as we plan for the future. Controlled vocabularies, so important for pre-coordinated searching, provide us with building blocks to create linked data with meaningful relationships among entities. Search engines and online catalogs save the time of users that use to flip through pages of paper finding aids and printed catalogs, but the amount of information available to users makes Ranganathan’s law “save the time of the reader” equally relevant today. Making my own addition to Ranganathan’s laws, I add, “use technology responsibly to save the time of the archivist and librarian.” Our metadata has been converted from catalog cards to online catalogs to linked data, but fundamentally the why doesn’t change. It’s all about the user.
I didn’t mean to create a sexist library collection when I set out to build one. But in 1994, when I showed Mary the catalog of online books I’d started the previous year, one of her first questions was “Where are the women?”
It was a fair question. There weren’t that many listed. I’d been cataloging sites like Project Gutenberg, which had only 11 books by women in its first 100 releases, as well as other early ebook sites whose author gender ratio was similar, and sometimes worse. Many early producers of online books gave much of their attention to “canonical” works featured in sets like Great Books of the Western World, which in its first edition of 54 volumes featured no women writers at all.
The works featured in Great Books do indeed deserve to be remembered. But so do many other books, including books by people whose perspectives were left out of Great Books sets. It was clear that to build up a suitably diverse and inclusive set of books for people to read online, we’d need to go out of our way to find and provide more books by those people. Books by the more than half of the world’s population who aren’t men. Books not just by colonizers, conquerors, and enslavers, but by those who were colonized, conquered, and enslaved. Books by the poor in spirit, the mourners, the peacemakers, those who are persecuted, and those who hunger and thirst for righteousness. Books not just by those who recognize the imagery in that last sentence from their own religious traditions, but also by those who follow many other religious and philosophical traditions.
This isn’t simply a matter of fairness to those overlooked authors. As I described in my last post, including them makes my collection better for all readers, by widening the range of works readers can learn from, enjoy, and draw on as sources for their research and their original creations. Diverse, inclusive, equitable, and accessible libraries are better libraries for all.
Mary’s done a lot to improve the coverage of my collection. She created, and still edits, A Celebration of Women Writers, which highlights and links to information about women writers of all sorts, from many different times and places. Her work has encouraged people to put books online by those writers and others. She’s also herself put online more than 400 books by women writers, often in collaboration with transcribers and proofreaders from all over the Internet. And she’s not alone in this work. Not only have there been numerous other projects focusing on women writers, but many general collections have also added more books not just by women but also by other overlooked writers and groups. (Project Gutenberg, I should note, includes a notably larger proportion of books by women in its subsequent releases past their 100th etext. One can also find a few years into their listings a string of several books in a row by Indigenous American authors, another group overlooked in its early offerings, as well as an increasingly diverse set of Black authors beyond its initial focus only on Frederick Douglass.)
While I’ve incorporated the selections of these and other projects, I haven’t given fans of male authors, or white authors, reasonable cause for complaint. There are plenty of books by those authors in my collection, and I’m continuing to add many more of them. I don’t deny any reader’s request for a book on account of the authors’ genders or races. But I also don’t assume there’s no need to do any more work to improve the diversity and inclusiveness of the collection. The percentage of books in my curated collection credited to authors I know to be women, for instance, is now in the low 20s. That’s higher than it was when I started out, and better than it would be if we hadn’t proactively worked to improve its collection diversity, but it’s still well short of gender parity.
Gender and race also aren’t the only things I’m thinking about when I’m working to fill in gaps in my collection. For example, if someone suggests a book that was written in response to another one, I often try to include the book being responded to, if I don’t already list it. Doing so not only often increases the diversity of viewpoints on the topic the books discuss, but it also often helps readers better understand the originally-suggested book. Or, if someone asks for an older book on a topic I don’t have much coverage for, I’ll often also look for newer books on the same topic that might have more up to date or accurate information, or provide a broader set of ideas and perspectives. I also will sometimes try to include different copies of works I already list that are more accessible than the editions I first listed, such as those that have a proofread transcription of a text (and not just images and hard-to-read OCR), or that are more easily downloadable across the Internet.
The Online Books Page is a project staffed by less than 1 FTE, and there’s only so much I can do to improve its diversity, inclusiveness, equity, and accessibility. But what I do manage to do makes it a better library for its readers than it would be if I followed the path of least resistance in maintaining it. Full-service libraries with more FTEs can do more than I can. And they should, because as I’ve said previously, both their words and actions supporting diversity, inclusiveness, equity, and accessibility make their libraries better fulfill their missions to their entire communities.
Greetings, DLF Community! We hope spring is springing where you are and that you’re enjoying some longer days and warmer evenings. We’re excited to share our March meetings and events with you and look forward to seeing you around this month.
— Team DLF
This month’s news:
Mark Your Calendar: The 2025 DLF Forum and Learn@DLF will take place Sunday, November 16 through Wednesday, November 19 in Denver, Colorado.
Register: Registration is open until March 26 for the 2025 IIPC Web Archiving Conference, taking place April 9-10 in Oslo, Norway. View the program and keynote speakers.
Register:Registration is also open for IIIF’s annual conference and showcase, taking place June 2-5 in Leeds, UK.
Attend: The Place-Based Planning Webinar, part of CLIR’s Climate Resiliency Action Series, is happening March 13, 12pm ET/9am PT. Read on for more information about the series and this workshop.
Discuss: After attending the webinar, join Climate Circle Discussion #5: Place-Based Planning, on March 21, 1pm ET/ 10am PT
Catch Up: Past Climate Action Webinars are available on the CLIRDLF YouTube channel or the project website’s Resource page.
This month’s open DLF group meetings:
For the most up-to-date schedule of DLF group meetings and events (plus NDSA meetings, conferences, and more), bookmark the DLF Community Calendar. Meeting dates are subject to change. Can’t find meeting call-in information? Email us at info@diglib.org. Reminder: Team DLF working days are Monday through Thursday.
DLF Born-Digital Access Working Group (BDAWG): Tuesday, 3/4, 2pm ET / 11am PT
DLF Digital Accessibility Working Group (DAWG): Wednesday, 3/5, 2pm ET / 11am PT
DLF Assessment Interest Group (AIG) Cultural Assessment Working Group: Monday, 3/10, 1pm ET / 10am PT
DLF AIG User Experience Working Group: Friday, 3/21, 11am ET / 8am PT
DLF Committee for Equity & Inclusion: Monday, 3/24, 3pm ET / 12pm PT
DLF AIG Metadata Assessment Working Group: Thursday, 3/27, 1:15pm ET / 10:15am PT
DLF Digital Accessibility Policy & Workflows Subgroup: Friday, 3/28, 1pm ET / 10am PT
DLF Digital Accessibility IT & Development: Group: Friday, 3/31, 1:15pm ET / 10:15am PT
The word Patuxent is derived from the Algonquian language used by the
indigenous people living in the area prior to the arrival of the
European settlers. Its meaning is debated. According to some sources it
means “water running over loose stones” while others believe it means
the “place where tobacco grows”.
This past fortnight has been rather tumultuous in Australian libraries. Like much of the world, we've been getting some lessons about raw power and resource allocation, and mostly weren't ready to hear it.
Clarivate chooses violence
In November 2021 Clarivate completed the purchase of ProQuest for USD$5.1B, which should have been a giant flashing red light to the library profession that ProQuest was likely to be gutted and radically overhauled in the near future.
Well here we are.
Last Tuesday morning Australian time, ProQuest's Vice President of Product Management sent an email to customers announcing a radical change to their business model. It appears that this was also the exact moment that most Clarivate staff and their publishing company partners found out about the new arrangements, due to kick in from 1 November.
Anglosphere suddenly remembers the Library of Congress is a US Government entity
The other big library news from last week was the Library of Congress updating their terms to rename the Gulf of Mexico and Denali to Gulf of America and Mount McKinley respectively. The US Government of course has the sovereign right to refer to physical features in the landscape using whatever names they deem fit. It seemed to come as a surprise, however, to some in the non-US Anglosphere library world that the Library of Congress would update their terms to align with the United States Geographic Names Information System. As friend of Information Flaneur Alissa McCulloch pointed out in The Process is the Point, whilst LoC's Program for Cooperative Cataloguing uses language that sounds like it's a democratic international collaboration, ultimately the Library of Congress decides what is and is not a standard term in Library of Congress Subject Headings.
The problem here is that LCSH is the de facto standard for controlled subject terms in libraries across the Anglosphere. If the US President decides that the official US name for Australia is now "The US Federal Territory of Kangarooland", then under current arrangements Australian library catalogues would immediately start using this term, at least for newly added items. Clearly this is suboptimal.
Infrastructure
A few days ago Robin Berjon published a piece called Digital Sovereignty exploring the titular concept and also usefully defining "digital infrastructure". Whilst Berjon generally seems to have in mind larger or more multi-purpose digital infrastructures, his outline here is a really useful framing when considering the different platforms, standards and entities that make up the library and academic publishing ecosystem.
Berjon helps us to think more clearly about "infrastructural power" – the real power able to be exercised by whomever controls the infrastructure, regardless of any rules, norms or understandings. He outlines an infrastructural good test derived from Frischmann and Rahman, which is a rule of thumb for determining whether a system or platform has become an "infrastructural good":
Hard to replace: The component in question features high sunk costs, high barriers to entry, or increasing returns to scale notably from network effects.
Highly diverse downstream uses
Vulnerability: The good is necessary for participation in some social activity (personal, business, public, etc.) and there are negative repercussions from restricted access to it.
If you're a librarian and you're sweating right now, it's probably because that sounds awfully like a lot of our vendor platforms. Sure, if your library doesn't like a journal subscription deal that SpringerNature Group is offering you could always just walk away. But you'd have to be the sort of university where your researchers don't really care whether or not they have access to read Nature or Cell. It's why public libraries end up with AI slop in their ebook catalogues when they would never have selected it for their collection. It's also why even though we know that the companies academic libraries work with flagrantly abuse the privacy and safety of library users and this conflicts with core values of the profession, it's been extraordinarily difficult to effectively create counter-power. Drive-by commentators often demand that we "just stop doing business with" these companies, but Berjon's point about infrastructural goods is that it's extraordinarily difficult to do that, especially alone.
Monopolies are not standards
Perhaps as a start, we could recognise that monopolies and open standards are not the same thing. Admittedly this can sometimes get confusing because things that look like standards sometimes aren't.
Some incredibly useful library standards are OpenURL, OAI-PMH, MARC, and ARK. None of these are controversy-free, which we'll come to in a moment, but they are at least collectively owned open standards that anyone can use without charge.
On the other hand, there are things used widely in libraries that feel like open standards but aren't quite. As mentioned above, Library of Congress Subject Headings (LCSH) are a de facto standard in that they are widely used throughout the Anglophone world, but they're not a truly open standard since ultimately LCSH is a privately-controlled thesaurus of subject headings for use by the US Library of Congress, which they happen to share with the rest of the world. The Dewey Decimal System is the one thing many people think they know about libraries, but OCLC aggressively reminds everyone that despite it being 135 years old you have to pay them a tax if you want to use it. The International Standard Book Number is technically a standard, but if you want to use one you have to go through your "national registration agency". As I outlined in a post two years ago, this results in a monopoly in each country that can independently decide whether they can be bothered innovating.
Even the "real" standards aren't problem-free, because libraries and their parent organisations haven't been serious about controlling or directing them. Notoriously OCLC – the extortion outfit cosplaying as a collective non-profit – was appointed as the "maintenance and registration agency" for OpenURL by NISO in 2006, but 16 years later unilaterally claimed that this core function of every academic library catalogue was merely an "experimental research project" that they couldn't be bothered maintaining any more.
Counter-power
Where we might draw some hope and focus some thinking is in Berjon's second feature of infrastructural goods: Highly diverse downstream uses. This sounds like a positive feature of infrastructure – because it is! But...
The diversity of users also makes it harder for them to coordinate in order to exert political pressure on the infrastructure operator as their needs and communities will often differ greatly.
This applies to some extent in libraries, but I don't think our "needs and communities differ greatly" – or at least, not greatly enough if we focus on the many things we have in common across the various public, academic, corporate and other "special" libraries. That means we still have a place to think about building collective counter-power.
Dan Goodman called for this in The Transmitter last week, saying that Science must step away from nationally managed infrastructure. Goodman was thinking more about datasets and censorship of inconvenient research publications, but he also helpfully noted the problem of both ownership and effective control of the academic publishing industry. And while it's tempting to roll our eyes and say "I told you so", we'll get further by harnessing the energy of these newly-awakened scholars and researchers to build a new collective reality.
For me, "vulnerability" is the most important aspect to address in the Infrastructural Good Test. Lots of technology is "hard to replace". But the thing that gives companies like RELX, Clarivate or even Overdrive their power is that libraries can't see a way to get out of their relationship with this companies without catastrophic damage to their social license to operate in their given context. Your University Librarian certainly has the power to cancel every contract with Clarivate and Elsevier, but they'd be very unlikely to retain their job for long if they did that arbitrarily, no matter how strenuously they argued that it was the ethical thing to do.
If you explain the academic publishing market to a normal person, they don't believe you because it's so obviously absurd. Making academic libraries less "vulnerable" in this context requires a complex series of moves by academic scholars, university administrators, librarians and probably national governments. It's tied up in ideas about research quality, education as an international market, neoliberal economics, elitism, and great power politics.
I'm not a populist politician so I don't have a simple answer to all this. We're facing the equivalent of trying to rebuild an apartment block and replace its foundations while hundreds of people are living inside it. I will say that begging limited-liability shareholder corporations to do the right thing or trying to make them feel guilty (as if a legal abstraction has emotions) isn't going to work.
America First
To heed Goodman's call we need to reckon not just with the "America first" mentality of the United States government, but also our own. "International cooperation" on systems and standards often rests on an assumption that the bulk of the resources, funding and leadership will come from the United States. Logically this approach makes sense, since the USA is home to over 340 million people who collectively (though certainly by no means equally) hold vast hoards of wealth and enormous industrial and intellectual productive capacity. But as the people of Ukraine are discovering, the United States can't be relied upon indefinitely. Waiting for US institutions to lead everything is problematic in other ways, even – perhaps especially – when they're enthusiastic. American software is constantly trying to enforce American spelling, American terminology, American weights and measures, and (worst of all) American date conventions upon the rest of us. Though Goodman is right to point to the huge market share of the US directing the priorities of corporations, the situation isn't helped by the rest of us not bothering to try building something else. We can complain about lack of budget funding, but budgets are statements of priorities and political intent as much as they are about pure finance. If we want something more decentralised and truly collaborative, we can't wait for America to lead the way.
Speciation and multi-polarity
Australians looking for something about a bloke driving his ute with a tinnie on the back through a bushfire while wearing thongs need to be able to use those terms when searching a library catalogue, and get something sensible back as the result.
When it comes to project governance, we can also think about techniques to protect them in ways that don't rely on benign or enlightened stewardship. After an attempt at a hostile takeover of KohaILS , the community deliberately distributed project infrastructure in such a way that no single entity controlled more than one part. The koha-community domain name and website, the git repository, and the bug/feature tracking systems are all managed by different organisations to prevent any one entity from asserting control of the entire project again.
Your nearest exit may be behind you
Some of the infrastructure required already exists, or can be resurrected or repurposed. Australia still retains the Australian National Bibliographic Database, managed by the National Library of Australia and separate from OCLC's WorldCat. There is also the aforementioned Australian extension to LCSH which needs some TLC, but has existed for decades. Whilst these are "nationally managed infrastructure", they're also for localised needs, so a national approach makes sense.
Our predecessors confronted some of the same issues, and built systems and solutions to try to deal with them. We have cooperative organisations like CAUL which presents a single voice to vendors in contract negotiations, and CAVAL which does many things, including custodianship of a shared physical collection and a reciprocal borrowing network separate from the national document delivery system managed by the NLA. The UNIMARC standard is managed by IFLA.
Librarians can't escape the fact that most of our libraries are funded and regulated either directly or indirectly by governments. The idea that we could operate with complete autonomy is a fantasy. But there are plenty of precedents for taking our professional values seriously regardless of the political situation we find ourselves within. The founders of CAVAL created a shared collection agreement that deliberately makes it extremely costly to leave the collective. The Koha community found ways to stash power in multiple places. The librarians who built Australia's LCSH extension realised duplicating the Americans' work wasn't a good use of their time, but made sure Australians didn't have to search for "pickup trucks" or "wild fires". What is our generation going to build and maintain as our own legacy?
OKFN’s representative, Cassandra Woolford, and GAE’s coordinator and executive director, Edie Cux, formalised this week a cooperation agreement to strengthen the country’s technical capacity and promote transparency, accountability and the use of open data as key tools for development.
The cooperation agreement has a duration of three years, with the possibility of renewal by mutual agreement between the organisations. Open Knowledge Foundation (OKFN) is an organisation based in the United Kingdom and recognised for its extensive experience in the promotion and management of open data worldwide. It will coordinate, together with the Presidential Commission, projects and training programmes aimed at improving training and knowledge of public administration and open data, as well as the development of common goods and Digital Public Infrastructures (DPI) from the management of the Government of Guatemala. OKFN is now a strategic ally in the process of co-creating a public policy on open data that will be promoted by the Presidential Commission for Open and Electronic Government.
As a follow-up to this agreement, GAE is preparing four activities in the framework of Open Data Day, a global movement driven by OKFN, which is a week of workshops dedicated to open data:
Masterclass: Digital transformation and open data as a key element for transparency 3 March 2025 8:30 to 13:00 CST Palacio de Correos y Telégrafos, Guatemala City
Workshop: Artificial Intelligence and Open Data – GENIA Latin America Raising awareness of the opportunities for the application of artificial intelligence 4 March 2025 10:00 to 12:00 CST Online
Presentation of the 6th National Action & Progress Plans Compliance Dashboard Highlighting its role in the management and transparency of information for the follow-up of commitments, accountability and the fight against corruption 5 March 2025 8:30 to 13:00 CST Palacio de Correos y Telégrafos, Guatemala City
Round Table: Women in Data To show the possibilities of using data and its analysis in terms of processing with a gender perspective in Guatemala 6 March 2025 9:00 to 12:00 CST Palacio de Correos y Telégrafos, Guatemala City
These activities will be developed within the framework of the agreement to promote strategic cooperation in key areas such as open data and digital infrastructure, thus strengthening the digital transformation processes of the Executive Agency in Guatemala.
For more than four decades, OCLC has been committed to advancing the library profession through community-facing research and community programming. Our investments in original research and programs like the OCLC Research Library Partnership (RLP) and WebJunction have produced a record of thought leadership, fostered collaborative problem-solving, and supported professional development in key areas, equipping libraries to adapt and thrive in a rapidly changing environment.
Today, our Research and Programming organization includes three teams: a research group composed of research scientists with deep subject matter expertise and experience with research methods; and two programming units staffed with specialists with extensive practitioner-level experience from the research library sector (our Research Library Partnership team) and public libraries (our WebJunction team). Like other internal organizations within OCLC, the Research and Programming group is led by an Executive Director.
To ensure continued relevance and impact for the libraries we serve, we are undertaking a strategic realignment of our Research and Programming organization. While the overarching structure—three equally important teams, each with its own director—will remain intact, we are making strategic changes in the portfolios of some teams and consolidating capacity to improve Research and Programming operations. Several changes are underway, all with a focus on increasing the value we create with and for the library community.
Strategic shifts
Leadership transition: We are actively recruiting new leadership for our Research and RLP teams. We seek individuals who possess not only a strong understanding of the higher education landscape and proven program management expertise, but also a sophisticated understanding of the key technologies that are transforming library work, AI especially. This transition will ensure our leadership team possesses a new perspective and the technological acumen necessary to guide our initiatives moving forward. These new leaders will bring new perspectives and technological expertise that will directly benefit the communities served by libraries.
Expanded research agenda: We’re broadening our research agenda to meet a wider range of library needs. This expanded agenda will focus on understanding the key technologies (including AI) transforming library work across four broad areas:
Investigating library roles in research, teaching, and learning across the higher education ecosystem
Exploring curation as an exercise in community building, as well as collections stewardship; this includes greater attention to important developments in the public library sector
Analyzing emerging metadata frameworks and workflows to better understand the future of library “knowledge work”
Examining the organizational economics of libraries, with an eye toward optimizing resources and workflows within and across institutional networks
By focusing on these areas, we aim to address community needs and enhance the value we deliver to the varied communities we serve.
Enhanced program alignment: We are unifying our approach to community programming, leveraging engagement and learning platforms to share expertise across the RLP and WebJunction teams. We are also developing new ways to deliver value by:
Connecting research outputs to a broader range of audiences, moving beyond (but not replacing) long-form narrative formats to reach readers who prefer data visualizations, infographics, or audio summaries
Facilitating new opportunities for collaborative learning through workshops and cohort-based programs
These enhancements will create more accessible and engaging ways for communities to benefit from our research outputs and empower community members to engage with and contribute to library initiatives.
Rationale for change
These changes are being implemented in response to significant shifts within the library ecosystem, particularly the increasing impact of AI and other technologies. My goal is to position OCLC Research to effectively address these changes and to continue to serve as a vital resource for libraries and the communities they support.
I am confident that these strategic shifts will enhance our ability to support libraries in their ongoing work. We will continue to share updates as our plans progress. In the interim, I invite your feedback and insights on areas in the higher education and academic and public library sectors that require more research, reflection, and discussion. Please feel welcome to contact us to share your perspectives.
In celebration of Open Data Day, we are excited to co-host a webinar focused on data and metadata quality, cornerstones of effective open data initiatives. This event, titled ‘From data to metadata: enhancing quality across borders’, is organised by the data.europa academy of the Publications Office of the European Union and the Open Knowledge Foundation. It will bring together experts from Europe and beyond to share their insights, approaches, and best practices for ensuring high-quality open data.
Open Data Day is an annual global event dedicated to promoting and celebrating open data. It brings together communities around the world to showcase the benefits of open data and encourage the adoption of open data policies. Groups from around the world can register their local events, which are displayed on a map. This year, our webinar will contribute to this celebration by focusing on the importance of data and metadata quality in making open data a reality.
We will be joined by national experts from public administrations and leading international organisations driving open data at an international level. This diverse lineup of speakers will provide participants with valuable insights into successful approaches and techniques for achieving high data and metadata quality. Attendees will hear perspectives from both European and international experts and engage in dynamic discussions on the challenges and opportunities of ensuring data quality across borders and sectors.
Whether you are a data professional, policymaker, researcher, or open data enthusiast, this webinar offers valuable learning opportunities and the chance to connect with experts in the field. Don’t miss this chance to enhance your understanding of data and metadata quality and contribute to the open data community. Mark your calendars for Friday, 7 March, from 10:00 – 11:30 CET and register now!
Open Data Day (ODD) is an annual celebration of open data all over the world. Groups from many countries create local events on the day where they will use open data in their communities. ODD is led by the Open Knowledge Foundation (OKFN) and the Open Knowledge Network.
As a way to increase the representation of different cultures, since 2023 we offer the opportunity for organisations to host an Open Data Day event on the best date over one week: this year between March 1st and 7th. In 2024, a total of 287 events happened all over the world, in 60 countries using 15 different languages.
All outputs are open for everyone to use and re-use.
I have written before about the double-edged sword of software vendors' ability to disclaim liability for the performance of their products. Six years ago I wrote The Internet of Torts about software embedded in the physical objects of the Internet of Things. Four years ago I wrote about
Liability In The Software Supply Chain.
The EU and U.S. are taking very different approaches to the introduction of liability for software products. While the U.S. kicks the can down the road, the EU is rolling a hand grenade down it to see what happens.
It is past time to catch up on this issue, so follow me below the fold.
In March 2020 the Cyberspace Solarium Commission, a "bipartisan team of lawmakers and outside experts", launched their report. Among its 82 recommendations were several related to liability:
3.3.2: Clarify Liability for Federally Directed Mitigation, Response and Recovery Efforts
4.2 Congress should pass a law establishing that final goods assemblers of software, hardware, and firmware are liable for damages from incidents that exploit known and unpatched vulnerabilities
4.3 Congress should establish a Bureau of Cyber Statistics charged with collecting and providing statistical data on cybersecurity and the cyber ecosystem to inform policymaking and government programs
Markets impose inadequate costs on — and often reward — those entities that introduce vulnerable products or services into our digital ecosystem. Too many vendors ignore best practices for secure development, ship products with insecure default configurations or known vulnerabilities, and integrate third-party software of unvetted or unknown provenance. Software makers are able to leverage their market position to fully disclaim liability by contract, further reducing their incentive to follow secure-by-design principles or perform pre-release testing. Poor software security greatly increases systemic risk across the digital ecosystem and leave[s] American citizens bearing the ultimate cost.
The Administration will work with Congress and the private sector to develop legislation establishing liability for software products and services. Any such legislation should prevent manufacturers and software publishers with market power from fully disclaiming liability by contract, and establish higher standards of care for software in specific high-risk scenarios. To begin to shape standards of care for secure software development, the Administration will drive the development of an adaptable safe harbor framework to shield from liability companies that securely develop and maintain their software products and services. This safe harbor will draw from current best practices for secure software development, such as the NIST Secure Software Development Framework. It also must evolve over time, incorporating new tools for secure software development, software transparency, and vulnerability discovery.
Six years after Congress tasked a group of cybersecurity experts with reimagining America’s approach to digital security, virtually all of that group’s proposals have been implemented. But there’s one glaring exception that has especially bedeviled policymakers and advocates: a proposal to make software companies legally liable for major failures caused by flawed code.
Since the 1980s, legal scholars have discussed how liability should apply to flawed software. The fact that there still isn’t a consensus about the right approach underscores how complicated the issue is.
One of the biggest hurdles is establishing a “standard of care,” a minimum security threshold that companies could meet to avoid lawsuits. There’s disagreement about “how to define a reasonably secure software product,” Dempsey said, and technology evolves so quickly that it might not be wise to codify one specific standard.
Various solutions have been proposed, including letting juries decide if software is safe enough — like they do with other products — and letting companies qualify for “safe harbor” from lawsuits through existing programs like a government attestation process.
The Solarium Commission proposed safe harbor for companies that patch known vulnerabilities. But that would only address part of the problem.
Even a very weak "duty of care" would be a big improvement. It would, for example, outlaw hard-wired passwords, require 2-factor authentication with FIDO or passkeys not SMS, require mailers to display the actual target of links, and so on.
One of the industry’s chief arguments is that liability would distract companies from improving security and overburden them with compliance costs. “The more companies are spending their time on thinking about liability, the less they might be spending their time on higher-value activities,” said Henry Young, senior director of policy at the software trade group BSA.
The problem is that the "higher-value activities" typically result in adding vulnerabilites to their products. I doubt that management at victim companies like Equifax or SolarWinds' customers would think that adding flashy new features was "higher-value" than fixing vulnerabilities.
They argue that even if policymakers want to focus on software security, there are better ways to prod vendors forward, such as encouraging corporate-board oversight.
Board members are insulated from liability by D&O insurance, so "encouraging" them will have precisely zero effect.
And they warn that focusing on liability will distract the government from pursuing better policies with its limited resources.
Exactly what are the "better policies" they desire? To be left alone to ship more buggy products.
The industry whines that they wouldbe treated differently from others, conveniently ignoring that the others can't disclaim liability:
Critics also contend that it’s unfair to punish companies for digital flaws that are deliberately exploited by malicious actors, a scenario that’s rare in most industries with liability, such as food and automobiles.
In 2023 nearly 41,000 people were killed by automobiles in the US. Many, probably most of those deaths were caused by "malicious actors" exploiting flaws such as cars that know what the speed limit is but don't enforce it, or could but don't detect that the driver is drunk or sleepy. Liability hasn't caused the auto industry to get real about safety. Instead we have Fake Self-Driving killing people.
And:
Up to 48 million people get sick from a foodborne illness every year, and up to 3,000 are estimated to die from them.
Industry leaders say liability is unnecessary because there’s already a working alternative: the marketplace, where businesses are accountable to their customers and invest in security to avoid financial and reputational punishment. As for contracts disclaiming liability, the industry says customers can negotiate security expectations with their vendors.
“We're open to conversations about any way to improve software security,” Young said. “Our customers care about it, and we want to deliver for them.”
Have you tried to "negotiate security expectations" with Microsoft, or have a conversation with Oracle about a "way to improve software security"? How did it go? I guess it didn't go well:
“Just telling organizations that not fixing security bugs will impact their business is not enough of an incentive,” a group of tech experts warned the Cybersecurity and Infrastructure Security Agency in a report approved this month.
Experts also rejected the idea that most customers could negotiate liability into their contracts. Few companies have leverage in negotiations with software giants, and few customers know enough about software security to make any demands of their vendors.
Despite its laughable nature, the industry's pushback ensured that nothing happened until it was too late:
Senior administration officials haven’t lived up to their lofty rhetoric about shifting the burden of cybersecurity from customers to suppliers, Herr said. “There is an attitude in this White House of a willingness to defer to industry in operational questions in a lot of cases.”
Messaging service WhatsApp claimed a major legal victory over Israeli spyware firm NSO Group on Friday after a federal judge ruled that NSO was liable under federal and California law for a 2019 hacking spree that breached over 1,000 WhatsApp users.
It’s a rare legal win for activists who have sought to rein in companies that make powerful spyware, or software capable of surveilling calls and texts, that has reportedly been used on journalists, human rights advocates and political dissidents worldwide.
So in the US the disclaimers of liability in the end user license agreement are, and will presumably continue to be, valid unless your product is intended to commit crimes such as violations of the Computer Fraud and Abuse Act. The bar for a victim to prove liability is impossibly high.
Under the current administration, the instinctive inclination of post-Reagan Republicans to rely only on market forces to hold businesses responsible for the consequences of their actions would seem to preclude the use of government policy to improve the security of software vital to business and government operations. Indeed, the Trump team has promised wholesale repudiation of regulations adopted in the past four years, so new limits on industry would seem especially unlikely.
There are, however, good reasons why the new administration should not default to repealing the cybersecurity actions of the past four years and passively accepting severe cyber vulnerabilities in critical infrastructure. In fact, as I explained in a series last year on initiatives aimed at infrastructure and data, much of the Biden administration’s cybersecurity agenda was built on projects launched by President Trump in his first term. The Trump administration would do well to remember the underlying principles that spurred it to initiate these actions the first time around.
The EU seems to be taking an opposite approach. Tom Uren wrote:
Earlier this month, the EU Council issued a directive updating the EU’s product liability law to treat software in the same way as any other product. Under this law, consumers can claim compensation for damages caused by defective products without having to prove the vendor was negligent or irresponsible. In addition to personal injury or property damages, for software products, damages may be awarded for the loss or destruction of data.
In the EU companies are presumed to be liable for a defective product unless they can qualify for a "safe harbor":
Rather than define a minimum software development standard, the directive sets what we regard as the highest possible bar. Software makers can avoid liability if they prove a defect was not discoverable given the “objective state of scientific and technical knowledge” at the time the product was put on the market.
Digital economy: The new law extends the definition of “product” to digital manufacturing files and software. Also online platforms can be held liable for a defective product sold on their platform just like any other economic operators if they act like one.
Circular economy: When a product is repaired and upgraded outside the original manufacturer’s control, the company or person that modified the product should be held liable.
Unlike the current and proposed US approaches, the EU's approach imposes victim-driven consequences for:
Failure to use current software tools and development practices to prevent defects. Note that the directive gives victims the right to obtain evidence of this kind.
Failure to acknowledge and respond to defect reports from customers and third parties, because they were clearly discoverable via the "objective state of scientific and technical knowledge". But note the caveat "at the time the product was put on the market".
Major software vendors used by the world’s most important enterprises and governments are publishing comically vulnerable code without fear of any blowback whatsoever. So yes, the status quo needs change. Whether it needs a hand grenade lobbed at it is an open question. We’ll have our answer soon.
Open Source
Neither the US nor the EU approaches seem to take account of the fact that many of the "products" they propose to regulate are based upon open source code. An under-appreciated feature of the rise of IT has been the extraordinary productivity unleashed by the 1988 Berkeley Software Distribution license and Richard Stallman's 1989 GNU General Public License. Version 3 of the Gnu Public License Section 15 states:
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
That these licenses disclaimed warranties and liabilities was the key to the open source revolution because it enabled individuals, rather than developers protected by their employer's lawyers, to contribute. Without the disclaimers individual developers would face unacceptable legal risk. On the other hand, open source is also a channel for malware. Shaurya Malwa reports on an example in Hackers Are Using Fake GitHub Code to Steal Your Bitcoin: Kaspersky:
The report warned users of a “GitVenom” campaign that’s been active for at least two years but is steadily on the rise, involving planting malicious code in fake projects on the popular code repository platform.
The attack starts with seemingly legitimate GitHub projects — like making Telegram bots for managing bitcoin wallets or tools for computer games.
Each comes with a polished README file, often AI-generated, to build trust. But the code itself is a Trojan horse: For Python-based projects, attackers hide nefarious script after a bizarre string of 2,000 tabs, which decrypts and executes a malicious payload.
For JavaScript, a rogue function is embedded in the main file, triggering the launch attack. Once activated, the malware pulls additional tools from a separate hacker-controlled GitHub repository.
Balancing the need to stop vendors "publishing comically vulnerable code" with the need to nurture the open source ecosystem is a difficult problem.
"Artificial Intelligence" is a vast field of study, and today's focus on generative AI is just the latest evolution of that field.
It wasn't too long ago that the focus was on "big data" — large and complex blocks of information for everything from social media, environmental sensor output, payment transactions, and even bibliographic data.
It was 15 years ago that the excitement was about clustering data to get insights about the books in our collections.
(See this DLTJ summary of talks by OCLC, Open Library, and Google Book Search at the ALA Midwinter conference in 2010.)
Now, "big data" isn't so big anymore, and in fact, it has become the input to the generative AI models that we hear about in the news today.
While "big data" was about understanding and interpreting past data, "generative AI" uses those learnings to create new data... a shift from analysis to synthesis in artificial intelligence.
So this week's DLTJ Thursday Threads is looking at the application of generative AI — sometimes called "large language models" or "foundational models" as a more descriptive term for the technology — in libraries.
Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page.
If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.
OCLC is using machine learning models for detecting duplicate records
In August 2023, we implemented our first machine learning model for detecting duplicate bibliographic records as part of our ongoing efforts to mitigate and reduce their presence in WorldCat. In the lead up to this, we had invited the cataloging community to participate in data labeling exercises, from which we received feedback from over 300 users on approximately 34,000 duplicates to help validate our model’s understanding of duplicate records in WorldCat. This initiative led to the removal of ~5.4 million duplicates from WorldCat for printed book materials in English and other languages like French, German, Italian, and Spanish. We’ve now enhanced and extended our AI model to de-duplicate all formats, languages, and scripts in WorldCat. Leveraging the labeled data collected from community participation, we’ve tuned and optimized the AI machine learning algorithm, completed extensive internal testing, and engaged WorldCat Member Merge libraries to provide external verification of the algorithm’s performance.
For my non-library friends who don't know about OCLC, it is a cooperative utility used by libraries to get descriptions of items purchased by the library.
(Broadly speaking...librarian friends: please don't come after me for the simplification.)
As an effort where thousands of libraries have entered data for millions of books across 60 years, there were bound to be duplicate or near-duplicate records.
All of the easy duplications have been found and merged.
But in a quest for perfection—a journey that a cataloging librarian will argue is never-ending—there is always more cleanup to be done.
Interesting to see OCLC bringing machine learning models to the task.
Their earlier work would have fallen into the "big data" category I mentioned earlier.
JSTOR tries out generative AI features in its journal article database
This is an article that we&aposre looking at and you can see up at the top I&aposve run a search. The search is "what are characteristics of Gothic literature". And on the side you see we have this new chat box where the user can engage with the content. And this very first action—the user doesn&apost have to do anything—they land on the page and as long as they run a search, we immediately process a prompt
that says: "how is—the query you put in...so ‘what are the characteristics of Gothic literature’—related to this text? And the response comes back: "The characteristics of Gothic literature include evoking fear, et cetera." So it gives you a custom response...a custom summary of the document that tells you basically "Why did I get this response? Why did I get this article?" And here what it actually has to do with your research task.
This is a recording of a presentation at the Spring 2024 member meeting of the Coalition for Networked Information.
The presenter, Beth LaPensee, Senior Product Manager at ITHAKA, is demonstrating a user interface prototype for JSTOR that integrates language models into their journal article database.
They have developed a beta research assistant tool with features like article summaries, related content recommendations, and question-answering capabilities.
The prototype focused on helping users deeply engage with and understand the content of individual articles rather than searching across the entire corpus.
The quote above comes from a point about 12 minutes into the presentation.
The team has gathered user feedback and data on how students, researchers, and instructors used the tool, finding that the question-answering and summary features are particularly popular.
I haven't heard whether this prototype has left the development stage and is heading to the production JSTOR user interface.
EBSCO tries out generative AI features in its discovery products
EBSCO AI Insights is a Generative AI feature that summarizes 3-5 key points of an article, helping users quickly assess its relevance. Accessible via a button on EBSCO’s interface for EBSCO Discovery Service and the EBSCOhost research platform, it complements abstracts and subject headings. Insights are marked as AI-generated, with a disclaimer urging users to verify their accuracy before use.
EBSCO is testing and developing AI features for its library research platforms: EBSCO Discovery Service (EDS) and EBSCOhost.
One feature is AI Insights, which uses generative AI to provide 3-5 key point summaries of articles to help users quickly assess relevance.
Lat year, EBSCO conducted a beta test of AI Insights with 50 libraries and received mixed feedback, with some users finding it very helpful but others concerned about accuracy, especially for referential materials like reviews.
EBSCO took that feedback and said they were working on a new version.
As of yet, I haven't read an announcement of it coming out again...its product page says "coming soon".
Clarivate surveys libraries worldwide about generative AI
The quickening pace of technological advancement, in particular generative artificial intelligence (GenAI), is reshaping the landscape for all. Librarians now find themselves at a pivotal juncture. The question is no longer whether to embrace AI but rather what to adopt and how to do so responsibly. Embracing technological change is not new for librarians, as libraries continue to be bastions of knowledge and learning, evolving their operations and transforming user experiences. Clarivate is deeply invested in the future of libraries. To this end, we conducted a survey of academic, public and national librarians from around the world and are sharing the results. Our aim was to assess current and expected trends and measure the impact of technologies, including AI, on librarians and their communities. In addition to the survey, we conducted several qualitative interviews with librarians from diverse organizations. This report examines the results of our investigation, spotlighting the concerns of librarians and the opportunities they see as they continue to champion their role in advancing the knowledge frontier.
Clarivate conducted a global survey of academic, public, and national librarians to gauge current and expected trends of generative AI in libraries.
The survey found that 60% of libraries are evaluating or planning for AI integration, with AI adoption being the top technology priority.
Key goals for AI adoption include supporting student learning, research excellence, and content discoverability.
However, librarians have concerns about AI, including skills gaps, tight budgets, and potential job displacement.
The link above is to the summary, and the 21-page report is linked from there.
This survey was likely conducted in the early stages of awareness of generative AI, so I'd take its findings with a grain of salt.
Even a year later, we're still figuring out whether this technology is useful and its real costs.
ProQuest introduces Ebook Central Research Assistant
Access to broad, vetted academic content also serves another purpose: it ensures the ability of academic AI tools to deliver reliable outputs. AI-powered chatbots are becoming ubiquitous as a method of discovery for students and researchers, but not all are created equal—potentially exposing students to dangers such as bias and content hallucinations. To address these concerns, we are also launching the Ebook Central Research Assistant, a tool powered by our Academic AI technology backbone, that guides students to effectively assess the relevance of each book, helping to review, analyze, and explore new ideas with ease. ProQuest Ebooks is enhanced with the Ebook Central Research Assistant, meaning students can expect reliable outputs on high-quality scholarly content with instant chapter insights, key concepts, and features that create deeper learning and enrich the research process.
ProQuest is a division of Clarivate, so it would seem that the company put some of what it learned in the survey mentioned above into its product line.
This quoted bit was most of the way down a press release describing changes to how ProQuest offers content to libraries.
Generative AI was just a part of the press release, though, and there has been considerable pushback from the library community about the ProQuest's change from selling content to libraries to this new subscription service.
I would beg to differ if they thought they found the "pulse of the libraries" in their survey last year.
What to consider when you are considering AI for your library
I was asked to participate in a panel at work about AI. I initially declined, but once it became clear that I would be allowed to get on my soapbox and rant for 15 minutes I agreed. Below are my notes and some slides. This was not a fun post to write or present. I’m sure it rubbed some people the wrong way, and I am genuinely sorry for that.
Ed calls for a critical evaluation of AI technologies, particularly Large Language Models (LLMs), which reflect societal biases and may perpetuate systemic racism due to the data they are trained on.
He also points out the intellectual property issues from using copyrighted materials in training these models, which challenge the existing web ecosystem and potentially harm content creators.
Verifiability is another major concern because we don't understand how these models generate their answers.
The impact of AI on employment is addressed, with worries that it may replace skilled workers with lower-paid roles focused on managing AI outputs.
Environmental sustainability is also a pressing issue, as AI technologies consume significant energy resources, raising questions about their long-term viability.
Security and privacy concerns are highlighted, particularly the potential for AI to generate disinformation and compromise user data.
Ed concludes by urging libraries and archives to adopt responsible practices while evaluating AI tools, ensuring transparency, and advocating for user data rights.
Sound advice for libraries...or any profession!
This Week I Learned: There are now 23 Dark Sky Sanctuaries in the World
Rum, a diamond-shaped island off the western coast of Scotland, is home to 40 people. Most of the island — 40 square miles of mountains, peatland and heath — is a national nature reserve, with residents mainly nestled around Kinloch Bay to the east. What the Isle of Rum lacks is artificial illumination. There are no streetlights, light-flooded sports fields, neon signs, industrial sites or anything else casting a glow against the night sky. On a cold January day, the sun sets early and rises late, yielding to a blackness that envelopes the island, a blackness so deep that the light of stars manifests suddenly at dusk and the glow of the moon is bright enough to navigate by.
The pictures that accompany this article from the New York Times are stunning (gift link).
And to think that there are only 23 places in the world that have reached this level of commitment to the environment.
What did you learn this week? Let me know on Mastodon or Bluesky.
We at the Open Knowledge Foundation (OKFN) are excited to announce the list of organisations that have been awarded mini-grants to help them host Open Data Day (ODD) events and activities across the world.
Our team received a total of 130 applications and was greatly impressed by the quality of the event proposals. In 2025, we are running two separate calls to accommodate the diverse interests in our community. The first call was for the general community, and the second was specifically for activities happening in French-speaking countries in Africa.
Check out the results below:
General Mini-Grant Winners
This call was open to any practices and disciplines carried out by open data communities around the world – such as hackathons, tool demos, artificial intelligence, climate emergency, digital strategies, open government, open mapping, citizen participation, automation, monitoring, etc.
A total of 22 events will receive a grant amount of USD 300 each, thanks to the sponsorship of the Open Knowledge Foundation (OKFN) and Datopian.
Here are the winning proposals by country, in alphabetical order:
“Open Data intro + Bring your own Data workshop!” – The event hopes to increase awareness of Open Data and inspire researchers to open up their data, which is related to climate change and biodiversity, governmental data, law and management.
“Mapping Resilience: Harnessing Open Data for Flood Preparedness in Sunamganj” – To improve the accuracy and completeness of Sunamganj’s road network data in OpenStreetMap, enhancing disaster preparedness and community resilience through collaborative mapping.
“Komenda Shoreline Mapping Project- A YouthMappers Open Data Initiative” – To establish a collaborative open data platform for continuous monitoring of shoreline changes in Komenda that informs sustainable coastal management and resilience strategies.
“Open Climate Education” – To introduce Girls in Senior High Schools to Open Ecosystem and Climate Literacy.
Bandung Mapper
Indonesia
Bandung
“Coastal Resilience through Mangrove Rehabilitation: Geospatial Data to Action” – The main goal is to analyse coastal vulnerability and prioritise mangrove rehabilitation sites using geospatial data, followed by field validation and community engagement.
“Pari Island Coral Watch: Collaborating for Climate Justice” – Through citizen science and new coral ecosystem data in Pari small island, we monitor and record coral damage that occurs and foster a spirit of climate justice through storytelling from childhood.
“GeoSkills Training for Climate Action: Empowering Youth with Spatial Data Tools” – To inspire and transform university students from being merely contributors to open data, especially on OpenStreetMap, into active beneficiaries and end users of open data. By equipping them with the skills to leverage open spatial data, we aim to empower them to tackle climate change, the global and local challenge and contribute to poverty reduction and sustainable development.
“Open Data and new technologies to address the polycrisis” – Bringing together students, professors and specialists interested in sharing time addressing the polycrisis from a big data perspective.
“Green Mapping Meetup: Climate Café and Open Data for Sustainability” – Community gathering where people share vegan food, discuss climate solutions, and use open data to map eco-friendly businesses—empowering action to reduce CO₂ emissions and promote sustainable consumption & production.
“Harnessing Opportunities to address Polycrisis through community Engagement (HOPE)” – Raising awareness among marginalised and indigenous communities about the disproportionate impacts of the polycrisis by leveraging open data.
“Tracking Resilience of Indigenous Bodies for Environmental Sustainability (TRIBES)” – Harness open data and participatory research in tracking the resilience of indigenous communities facing climate-driven displacement and conflict, ensuring their voices, knowledge systems, and rights are central in shaping sustainable solutions.
“Empowering Journalists to Leverage Open Data for Climate and Agricultural Reporting” – To equip rural journalists in Imo State Nigeria to use open data for impactful climate change and agriculture reporting, to drive awareness and solutions.
“Leveraging Open Data for Child Advocacy in a Polycrisis Context: The Case of Owerri Municipal Local Government Area of Imo State” – The main goal is to use open data to advocate for child-friendly policies in Owerri Municipal LGA, addressing polycrisis challenges like insecurity and unrest impacting children’s well-being.
“Mapping Resilience – Using QGIS to Analyze La Niña and El Niño Impacts in the Philippines” – To utilize QGIS for spatial analysis and visualization of La Niña and El Niño events in the Philippines, enabling better disaster preparedness and response through data-driven decision-making.
“Wiki She Event Rwanda” – It is aimed at promoting gender equity and increasing the representation of women through creation and improvement articles related to the women in Rwanda.
“Somalia Open Data Day 2025: Unlocking Transparency” – To advocate for the meaningful and effective implementation of open data initiatives by the Somali government to enhance transparency, accountability, and good governance, combat corruption, and foster collaborative dialogue among key stakeholders for evidence-based policymaking, improved service delivery, and strengthened democratic processes in Somalia.
“SMCoSE GeoChallenge 2025: MapRoulette Edition” – Enhancing accessibility and disaster resilience in Tanzania by fixing roads on OpenStreetMap with MapRoulette to improve health and emergency response.
“Open Data for Education: Bridging Gaps and Building Futures” – The goal of the event is to empower citizens, educators, policymakers, and students to use open data as a tool to address educational challenges and improve learning outcomes in their communities.
“Open Data for Resilient Dodoma” – To empower communities, researchers, and policymakers in Dodoma to use open data and geospatial technology to tackle interconnected challenges in urban development, climate resilience, and governance.
“Floods, Smog, Wildfires: Decoding Thailand’s Endless Disaster with Data” – To highlight the failures in disaster management’s use of information and advocate for better data access, enabling stakeholders to support prevention, crisis response, and recovery.
“Open Data for Peace and Development” – Participants understanding of the role and power of open data in creating peace and sustainable development through accessing the open land data as well as open land data policies in Uganda.
Francophone Africa Mini-Grant Winners
This call was specifically seeking to promote events happening in French-speaking countries in Africa.
“Open Geospatial Data to Fight Polycrisis in West Africa” – The main objective of this event is to promote the use of open geographic data to strengthen resilience and decision-making in the face of crises in West Africa, in line with the Sustainable Development Goals (SDGs).
“Open data for sustainable agriculture” – Raising awareness and training farmers and young agricultural entrepreneurs in the use of open data to improve their production and optimise the management of natural resources.
“Wikidata for development: Workshop on contributing to and making use of open data in Cameroon” – To strengthen the skills of contributors and local players in contributing to and making use of open data on Wikidata for the development of Cameroon.
“Open Data and governance of mining resources in East Kasai: Transparency and citizen responsibility” – The event aims to train citizens and local stakeholders in the analysis and exploitation of open data on mining to strengthen transparency, accountability and monitoring of mining revenues in East Kasai.
“Building the capacity of local developers in open data and air quality monitoring in Kinshasa” – The main aim of our Open Data Day event is to build the capacity of local developers to exploit open data for air quality monitoring in Kinshasa.
“DataResilience” – Promoting a deeper understanding and strategic adoption of open data by Senghor University students to anticipate, prevent and manage polycrisis in Africa.
“Open data and health: a lever for empowering young people in Guinea” – To raise awareness and train young people in the use of open data to improve access to sexual and reproductive health (SRH) information and combat female genital mutilation (FGM) in Guinea.
‘Open Data Day 2025 in Côte d’Ivoire’ – Training 10 Ivorians in the use of open data to map illegal gold-panning sites in Ivory Coast, through a day of awareness-raising and content production on Wikidata and OpenStreetMap, to increase transparency and promote sustainable resource management.
“Knowledge sharing on the practice of open science in Ivory Coast” – Define an action plan and timetable to enable the sharing of open science practices, products, tools, or spaces, to improve the service to users.
“Mastering Open Data: from Collection to Analysis with Open Source Tools” – To raise awareness and mobilise local stakeholders around the Sustainable Development Goals in order to promote concrete actions for inclusive and sustainable development.
“The use of GIS in determining flood protection zones and evacuation routes” – We are aiming to empower youth toward the use of GIS knowledge, creating awareness and promote the use of open data for the community impact. Through the event we will improve the preparedness and response capabilities to floods in the Jangwani basin area, as the participants will be equiped on GIS knowledge to determine flood protection zones and evacuation routes.
“Contribution of OpenData in the sustainable peace in Togo” – How open data can help in the fight against insecurity and the risks of violent extremism while promoting peace and social justice in Togo.
“Tdev Open Data Day” – To showcase innovative projects using Togo’s open data platform, explore the integration of AI and open data, and foster dialogue on their impact on innovation, transparency and policy making.
How the selection process works
The selection of mini-grants follows a points system based on four criteria:
Novelty/creativity of the proposal
Community aspect: to what extent the proposal promotes community involvement (especially local communities)
Achievability of the activity and level of commitment of the organisers when writing the proposal
Diversity in terms of geography, gender, and type of activities
Alignment with Open Data Day 2025 thematic focus
In the first phase, the proposals are evaluated blindly: each member of the selection committee assigns a score for each criterion without knowing the authors of the projects. A shortlist is then drawn up, which is discussed at a committee meeting when the authorship is finally revealed. The committee then tries to balance the proposals in terms of geographical distribution, gender, type of activity, etc.
This year, the selection committees were formed by the following people:
General Call
Daniela Popova (Datopian)
Julieta Millán (Open Knowledge Network)
Lucas Pretti (Open Knowledge Foundation)
Tosan Okome (Open Data Day Community)
Francophone Africa Call
Constance Kabore (CAFDO)
Lucas Pretti (Open Knowledge Foundation)
Salimata Sawadogo (CAFDO)
About Open Data Day
Open Data Day (ODD) is an annual celebration of open data all over the world. Groups from many countries create local events on the day where they will use open data in their communities. ODD is led by the Open Knowledge Foundation (OKFN) and the Open Knowledge Network.
As a way to increase the representation of different cultures, since 2023 we offer the opportunity for organisations to host an Open Data Day event on the best date over one week: this year between March 1st and 7th. In 2024, a total of 287 events happened all over the world, in 60 countries using 15 different languages.
All outputs are open for everyone to use and re-use.
Access services policies at academic archives in the United States are, in many ways, informed by carceral logics. This essay explores the ethical implications of upholding such policies for academic archives engaged in the growing realm of carceral collecting. Drawing from sources in trauma theory, abolitionist and Black feminism, and critical race theory, the possibility of providing equal access in an unequal society is questioned. Opportunities for further work to increase inclusivity at archives engaged in carceral collecting are offered. These suggestions are aligned with the goal of making archives documenting the history of incarceration and abolitionist thought and action accessible to the individuals for whom the material is most urgently needed to expedite their liberation.
“Carceral collecting” is a new term in the archival profession and a growing realm of collecting for academic archives (Robinson-Sweet, 2024). Here, the term carceral collecting will be used to denote a collection strategy that aims to document the history of incarceration in the United States, a history rooted in the period of enslavement during which the United States was formed (Davis, 1983; Hernández, Muhammad, and Thompson, 2015; Kaba, 2020). The purpose of carceral collecting is to gather primary source material documenting the experience of formerly or currently incarcerated individuals and individuals and organizations working in the field of prison reform or abolition.[i]
A focus on the individuals who have experienced incarceration and those who are working against the carceral state makes this definition notably different from the related term carceral archives. Carceral archives legitimate the carceral state (Robinson-Sweet), a societal structure in which mass incarceration has a pervasive impact (Hernández, Muhammad, and Thompson). Those working against the carceral state refuse the notion that a healthy society is constructed around a system designed to control the population via an ever-present threat of punishment (Bey, 2022).[ii]
The growing interest in carceral collecting at academic archives creates friction between access policies oriented around collection security and an institution’s claim of being welcoming to all. The zine Reading the Room: A Guide to Surveillance Practices in Special Collections documents the facets of the security apparatus researchers must interact with in exchange for access to primary source material at many archives. Surveillance tools include cameras, security officers, photo identification requirements, and the sometimes permanent retention of visitor data (Abolition in Special Collections, 2024). The Association of College and Research Libraries’ (ACRL) Rare Books and Manuscripts Section (RBMS) recognizes the importance of balancing material security needs with the creation of a safe space for all visitors. For example, the “ACRL/RBMS Guidelines Regarding the Security of Special Collections Materials” (revised 2023) suggests that the presence of security cameras be accompanied by signage informing visitors of their use. This can be difficult if implementation of security camera notifications is governed by institution-wide policies outside of an archive’s direct control.
Such challenges demonstrate why the difference between carceral collecting and carceral archives is essential. There is no dissonance between access practices informed by carceral logics in carceral archives. Practices informed by carceral logics are practices that mimic strategies used in the carceral state to control a population or to justify punishment, such as surveillance. Therefore, an ethical rift is laid bare between the stewardship of these records and the implementation of access policies that mimic carceral logics in archives that steward the intellectual work of individuals harmed by the experience of incarceration and prison abolitionists, who advocate for the closure of sites of incarceration and an end to policing accompanied by a reallocation of funds (Kaba; Bierria, Caruthers, and Lober, 2022).
At the latter institutions, the implementation of non-extractive collecting practices must extend beyond acquisition and processing into the reading room. Access, a physical experience, has a perpetually self-reinforcing relationship with discourse, an intellectual experience (Foucault, 1972). Individuals enabled to access archival material become individuals enabled to speak with authority on the topic of those primary source documents.
To rectify the problems that can be perpetuated in a situation of unequal access, making an archive open to all seems like the simple solution. However, universal availability cannot be the exclusive criteria for physical accessibility. On its own, it does not address the power imbalance of who gets to participate in discourse because our physical experiences are, reflexively, informed by our society’s discourse. This discourse includes ideologies, such as white supremacy, that are built on the desire to control what a person can do or be based on the body they inhabit. To break out of this cycle, archives must craft access policies that are conscious of the fact that just because a door is open not everyone will feel comfortable or safe walking through.
Resmaa Menakem makes the case that “[h]istory matters, and an awareness of it puts our lives in context… History lives through our bodies right now, and in every moment” (2017, p. 258). Therefore, being open to the public must be paired with practices informed by the knowledge that each researcher carries an embodied history with them when they visit the archives.
Because every individual carries a unique history constructed of experiences both personal and collective, I am questioning the very possibility of providing equal access to carceral collections through the use of universalist access policies alone within a fundamentally unequal society, echoing critiques of the profession informed by critical race theory (Chiu, Ettarh, and Ferretti, 2021). This is a core issue for archives engaged in carceral collecting to address because inequalities in the United States are a perpetual cause and effect of living in a carceral state.
An example of the challenges of constructing an egalitarian situation within an unequal society can be found in the history of library organizations’ search for conference locations that will provide a safe and welcoming environment for all attendees. In 1940, the Progressive Librarians Council (PLC) wrote a Letter to the Editor of the A.L.A. Bulletin detailing their disapproval of the A.L.A. council’s vote to “reconsider its 1936 ruling that official meetings of the Association would thereafter be held only where equal treatment could be accorded to all members.” In the letter, the PLC ends their principled argument with the somewhat incongruous statement, “[t]he world of books knows no color line” (Progressive Librarians Council Correspondence, box 1, folder 4, 1940). While the group was right to defend the importance of holding conferences in places where Black attendees would not be subject to explicitly discriminatory treatment under the law, they were wrong to insist that any micro-environment within American society in 1940, including “the world of books,” was truly egalitarian, even in the absence of overt legal discrimination. Indeed, there would have been no need for librarian and PLC member Charlamae Rollins to initiate a letter-writing campaign to publishers regarding the lack of representation of Black children’s experience in children’s books if the world of books knew no color line (Mickenberg, 2006). To insist that “the world of books knows no color line” suggested that society could be rid of racism by transitioning to a colorblind legal framework. This stance does not reckon with what we now know about implicit bias — that the color line not only existed in law but also in our heads (Tippett and Banaji, 2018).
Anti-racist action in libraries and archives requires more than opening the front doors and stating that everyone is welcome in while ignoring their unique identity. The PLC rightly engaged with society critically by pointing out the overtly racist treatment Black library professionals would receive at conferences in states with discriminatory laws. However, achieving the goal of equal access will require that the profession takes the necessary next step and engages constructively with the society of which it is an inextricable part by leading by example and taking accountability for its internal complicity with oppressive ideologies at large to create a more equitable environment. A contemporary example of the complexity of locating a space free of bigotry in the United States is provided by the 2022 Joint Conference of Librarians of Color held in St. Pete Beach, Florida. Ahead of the conference, the Joint Council of Librarians of Color (JCLC) acknowledged the negative impact of anti-LGBTQI+ laws such as Florida House Bill 1557 (often referred to as the “Don’t Say Gay Bill”). However, in their statement on the subject, JCLC doesn’t insist that anti-LGBTQI+ hate is isolated to one state in the United States or that it is absent from library spaces elsewhere. Rather, they acknowledge, “The world has become an even less hospitable place for those who identify as Lesbian, Gay, Bisexual, Transgender, Queer, Intersex (LGBTQI+),” and they turn to reparative action by “[elevating] LGBTQI+ voices in [their] programming, events and presence throughout [their] conference” (n.d.).[iii]
Disrupting the racial colorblindness that often accompanies universalist access policies requires evaluation of the ways in which our bodies influence our relationship to our social environment that is grounded in historical context. Attention to the complexities of this reality is necessary because, as Staci K. Haines shares, “[t]o be a part of change, whether personal healing or widespread social change, we need to be awake to how we embody and participate in social conditions that harm, consciously and unconsciously” (2019, p. 61).
The first recommendation of this essay is that archives engaged in carceral collecting look beyond access policies privileging universality over specificity. Angela Davis argues, “[m]ore often than not universal categories have been clandestinely racialized” (2016, p. 87). A focus instead on embodied inclusivity, an approach to inclusivity that encompasses attention to diverse physical beings, would acknowledge that each visitor carries unique knowledge (Menakem). When the specificities of a person’s identity are embraced rather than washed over, we can begin to develop relationships founded on trust rather than fear (Kelly Hayes and Mariame Kaba, 2023). A universalist approach to access, while better than an explicitly elitist approach, does not really allow an archive to understand its visitors enough to meet their varied needs to feel safe and welcome accessing carceral collections.
I have conducted this research, which is an exploration of ideas in trauma theory and feminist theory, with a particular focus on Black feminist thought, as a white librarian who has not experienced incarceration. The specifications of my identity are problematic in this context because I cannot bring lived experience to bear on the formulation of proposed solutions. I have attempted to fill gaps in my knowledge through study of the work of many other scholars, but a central component of this research is to consider what it would mean to elevate the value of embodied knowledge. Centering lived experience as a legitimate form of knowledge that should be included in the process of developing access best practices in archives engaged in carceral collecting places this work within the framework of feminist theory and praxis (Hill Collins, 2009). My hope is that consideration of the limited nature of the perspectives of those who have not experienced incarceration will generate for library professionals a sense of urgency in implementing the further work that I argue must be done. To be sincere will mean to act.
I began this work with a desire to grapple with the sometimes uncomfortable nature of managing access to carceral collections documenting histories that are not mine to tell as an Access Services Coordinator at the Harvard Radcliffe Institute’s Schlesinger Library. Policing access to these records from my positionality of identification with the oppressor in a carceral state built on white supremacist ideals is ethically fraught.
I make use of the verb police here intentionally to make a connection to the ethical issues that should be considered by archives engaged in carceral collecting. To police is “to supervise the operation, execution, or administration of to prevent or detect and prosecute violations of rules and regulations” (Merriam-Webster, accessed 2024). Qui Alexander acknowledges that “[l]iving in a punitively driven society leads us to internalize carceral logics” (2022, p. 284). Therefore, it is essential for me to consider how the parts of my role that require me to police use of archival material in the reading room may be at cross-purposes with my ability to provide “equal access” to a diverse range of individuals as stipulated by the ACRL guidelines for special collections professionals (American Library Association, updated 2017). Within the context of a carceral state built on white supremacy, can employees at archives engaged in carceral collecting truly police access equally? I believe such an application of colorblind universalism is fundamentally untenable in this context.
Following feminist pedagogical practice (Accardi, 2013), the intention of the further work suggested here is to urge archives engaged in carceral collecting to intentionally seek out the diverse embodied knowledge of individuals interested in conducting research in carceral history in the United States. I will make a case for why these archives must include this knowledge in the process of crafting policies that are abolitionist and informed by the needs of researchers in this subject area.
By implementing a philosophy of embodied inclusivity, archives engaged in carceral collecting will be better able to create a safe and welcoming environment for members of the community of incarcerated individuals and prison abolitionists from which carceral collections originate and, more broadly, the community of researchers for whom material documenting the realities of lives negatively impacted by the carceral state is most urgently needed to expedite liberation. This approach goes beyond a professed openness to diverse ideas, placing it in contrast to symbolic inclusion (Hill Collins). The guiding principle of this exploratory essay will be Patricia Hill Collins’ insistence that “knowledge for knowledge’s sake is not enough” (p. 35).
Examining the ethical implications of the practice of carceral collecting in relation to access policies is essential because the carceral state, a traumatizing structure in American society, is a racist institution that functions to uphold the supremacy of wealthy, white individuals (Davis, 2023; Haines). Many in power recognize that structuring society openly on a racist hierarchy intended to protect white supremacy would no longer be accepted by many of us. Therefore, narratives of criminality are used to sow distrust of individuals and communities of color (Alexander, 2010).
Take for example the fact that racism was the rationale for incarceration of individuals of Japanese ancestry in concentration camps across the United States during World War II (Japanese American Citizens’ League National Committee for Redress, 1978).[iv] Document WRA-126 titled “Application for Leave Clearance” from the War Relocation Authority makes plain the importance of proximity to whiteness within carceral institutions. Item number eight on the form directs the incarcerated individual to “Give the names and addresses of references… These need not be Caucasians, but good Caucasian references may be particularly helpful” (Karl Ichiro Akiya Papers, box 6, folder 1, 1941-1944). The implication was that white references were viewed as more trustworthy by War Relocation Authority officials.
The practice of making race a contingency in assessment of trustworthiness within the criminal legal system[v] persists in our contemporary context despite no longer being explicitly acknowledged in writing (Davis, 2023). While theoretically the law applies equally to everyone, Ruth Wilson Gilmore clarifies that in practice, “substantively different rules and punishments” are applied “to various kinds of defendants” (2007, p. 225). According to the Prison Policy Initiative, people of color “are dramatically overrepresented in the nation’s prisons and jails” (Sawyer and Wagner, 2024). In the context of immigration, racism has historically driven the incarceration of immigrants in detention facilities (Perreira, 2023). Tina Shull names immigration detention as a component of the carceral state nearly indistinguishable from the criminal legal systems’ sites of incarceration (2022).
While it is essential for archives engaged in carceral collecting to be cognizant of the ways in which the archival environment is inextricable from the carceral state, it is important to be clear that an archive’s reading room is not identical to a site of incarceration. The intention here is not to establish an equivalence, but to explore how carceral logics, which entered American life when wealthy, white landowners needed a new means of controlling Black Americans in the wake of the abolition of slavery (Davis, 1983), have permeated culture in the United States creating a landscape of insidious carceral mimetics that archives engaged in carceral collecting have a responsibility not to ignore (Gilmore).
Carceral mimetics found in archives have in some cases been intentionally built into the environment and in other cases become part of the culture surreptitiously over time. The practice of placing security guards in archives mirrors the police presence in society at large, and this practice has not remained static. For example, a cursory examination of memoranda found in the Schomburg Center for Research in Black Culture records displays a greater number of references to security guards and building security procedures in documents in the November 1974-March 1975 memoranda folder as opposed to earlier memoranda folders dated in the 1950s (Schomburg Center for Research in Black Culture records, box 35, folder 4, 1974-1975; Schomburg Center for Research in Black Culture records, box 4, folder 5, 1949-1954). This may be incidental or related to other causal factors, but it is notable that the date range of increased security-related topics appearing in the library’s memoranda mirrors the origin of the contemporary period of mass incarceration (Hernández, Muhammad, and Thompson). Time constraints related to funding available for this project did not allow for a systematic review of all memoranda archived in the Schomburg Center records. Further study of institutional records at this archive and others would be required before fully outlining the potential existence of a clear trend in carceral mimetics.
The presence of security officers is another factor that archives housed within larger institutions may have little direct control over. In these instances, archives engaged in carceral collecting should follow the recommendation included in the “ACRL/RBMS Guidelines Regarding the Security of Special Collections Materials” that “The [Library Security Lead] should be involved with the development and implementation of larger institutional security measures, as these may affect the security of special collections materials.”
Requiring visitors to submit to bag checks is a common practice at archives, and it is a form of surveillance that deprives researchers of privacy. While not included in Abolition in Special Collections’ guide to surveillance practices, it is one of “the technologies and practices” used by archives “that perpetuate a culture of distrust toward people in their physical spaces.” Additionally, it can provide the opportunity for disparate treatment of researchers if bags are checked with a varied level of scrutiny[vi].
Here, again, it would be both inaccurate and inappropriate to apply a direct equivalence between the practice of strip searches in sites of incarceration and the practice of bag searches in archives; however, an evaluation of the ways in which both practices are often gratuitous infringements on privacy is of relevance. Strip searches have been performed in circumstances where the incarcerated individual has not had direct contact with anyone from outside the site of incarceration (Abu-Jamal, 1995), and historian Hugh Ryan (2023) has drawn attention to a Department of Corrections report from 1955 stating that in 20 years of using strip searches as a security tactic, no narcotics were found (p. 178). The utility of strip searches as a necessary security tactic is contested. Notably, in archives, bag searches are often performed in situations in which researchers’ bags are locked away for the duration of their visit. While significantly disparate in severity, both strip searches and bag searches are arguably acts of security performance.
Carceral mimetics are also found in libraries’ cultural environment which, in turn, influences libraries’ physical environment. As Baharak Yousefi argues, “the crucial information about the profession cannot always be found in our vision, mission, and values statements or our strategic plans. It can, however, be inferred from our dispositions, our propensities, and the subtleties with which a particular agenda is pushed through while a different goal or intention is espoused” (2017, 97).
An example of the ethical complexity of the dichotomy between what we say and what we do can be found in an examination of archives’ adoption of diversity, equity, and inclusion (DEI) statements and the library professions’ cultural commitment to neutral framing.[vii] Hill Collins names “paying lip service to the need for diversity, but changing little about one’s own practice” as a “pattern of suppression” (p. 8). Therefore, archives that engage in carceral collecting in an effort to increase diversity within their collection development strategy must consider the implications to policy development in other areas of the archive.
While leaders of libraries and archives may not use the term neutrality directly, a stance of neutrality can still be performed via neutral framing. Chiu, Ettarh, and Ferretti explain that neutral framing is a philosophy that “holds that all viewpoints deserve equal weight and space” (p. 63). This philosophy may appear to be in place for the purpose of remaining open to a wide range of opinions and communities of thought, therefore making the library a more welcoming physical space for a wide range of users. However, librarian Ian Beilin lays out the argument that, in actuality, the idea of neutrality is often used to implicitly uphold whiteness as the standard or normative human experience (2017). Because, unlike carceral archives, archives engaged in carceral collecting do not function to legitimate the carceral state and its racist foundations, these archives are participating in an anti-racist collection development strategy. Anti-racism requires intent and action against white supremacy, and therefore, resists neutral framing. To insist otherwise, leads archives engaged in carceral collecting to an impasse that makes following through on commitments to DEI nearly impossible. At this impasse, these archives become institutions that perpetuate white supremacy under the guise of race-neutral concepts, and, therefore, adherence to neutral framing at archives engaged in carceral collecting is a problematic form of carceral mimetics.
According to the ACRL “Guidelines: Competencies for Special Collections Professionals,” a special collections professional “[p]ossesses cultural and linguistic competencies appropriate for their collections and user communities.” By resisting neutral framing and adopting anti-racist practice, archives engaged in carceral collecting must take on the responsibility of making their spaces truly safe and welcoming. They can do so by building community with and seeking feedback from researchers who have experienced incarceration or the trauma of racist policing who are interested in accessing primary source material as a tool for achieving abolitionist goals.
If individuals who have experienced incarceration or the traumatization of racist policing cannot access archival material documenting the history of that experience and the people who have fought back against it, then carceral collecting reinforces exile. Incarcerated individuals will go “from people to be helped, to problems to be solved,” as Hugh Ryan has argued (p. 37), to objects to be studied by individuals outside the community in a decontextualized environment. [viii] If this becomes the case, effectively, the archive will be facilitating a white supremacist declawing of these primary sources. Because “[a]bolitionist politics demand a literal end to all punitive systems” (Bierria, Caruthers, and Lober, 2022), the life’s work of abolitionist theorists and individuals who have experienced incarceration cannot be held apart for the novelty of study alone by a limited academic community.
Michelle Caswell explains, “[w]e must shift the focus… of the archival imaginary, from some future moment to the present, as users of archives search for past corollaries to their current situation through archival use” (2021, p. 54). Too often, justifying the use of carceral logics to protect records for imagined and disembodied future researchers ranks higher in importance than enabling access to vital primary sources to the fully embodied researchers of today, including those still incarcerated (Robinson-Sweet).
Safeguarding archival materials cannot be neglected entirely, of course. Ensuring materials are available and displaying care in how they are handled is an essential component of showing respect for donors of collections and the communities they represent (Romero and Sykes-Kunk, 2024). It will always be of significant importance to the profession, but making it the highest priority and implementing a security apparatus in support of it is a barrier to implementing a practice of inclusivity that is predicated on a radical trust. Radical trust disposes of carceral mimetics and truly welcomes every researchers’ embodied self into the reading room.
While archives engaged in carceral collecting can continue to practice preservation best practices, such as implementing climate controlled storage spaces or educating researchers on how careful handling of material can delay deterioration, practicing radical trust requires making researchers’ sense of safety in archival spaces the top priority. Implementation of a security apparatus that mimics carceral logics may reduce risk; however, archives engaged in carceral collecting must always seriously consider what effect security practices may have on access. Because security practices informed by carceral logics mimic those found in the carceral state at large, engaging with carceral mimetics in the archive has the potential to activate a trauma response for visitors harmed by the experience of incarceration or the experience of being targeted by racist policing. Archives must adopt trauma-aware practices that may require dismantling security practices informed by carceral logics.
Ensuring that people feel safe in the reading room will require stronger connections between archives and the communities from which they are soliciting collections. Bessel van der Kolk clarifies, “[s]ocial support is not the same as merely being in the presence of others. The critical issue is reciprocity: being truly heard and seen by the people around us, feeling that we are held in someone else’s mind and heart. For our physiology to calm down, heal, and grow we need a visceral feeling of safety” (2014, p. 81). This connection should include work done by public service departments to ensure that individuals who have experienced the trauma of incarceration or racist policing are able to feel safe while navigating the archive (Alison Clemens and Jessica Farrell, 2025). When working with communities that have experienced trauma, archives engaged in carceral collecting must be attuned to the fact that not everyone experiences the same space in the same way. What may seem innocuous to one person could elicit a strong reaction in another (van der Kolk).
Of course, public services departments cannot ask every visitor to fill out a personal questionnaire on their experience with trauma or ask researchers whether they have experienced incarceration. However, one suggestion for further research is facilitating user studies with volunteers. As Hill Collins states, “[o]ppression is not simply understood in the mind—it is felt in the body in myriad ways” (p. 293); therefore, it is essential to hear from people who have been incarcerated directly about what may or may not make them feel comfortable in the archive. Focus groups including researchers who have experienced incarceration would provide archives with the opportunity to learn what aspects of a visit to the reading room made them feel more or less safe. In turn, these researchers would benefit from knowing that their right to access materials related to the history of incarceration in the United States in an environment where they feel safe is important to the archive, especially if their feedback is acted on and real change ensues. Additionally, libraries and archives are known to conduct security audits. Archives engaged in carceral collecting should consider conducting public service audits to assess how they can improve in making their space welcoming for people who have experienced incarceration.
The carceral state extends beyond discrete physical locations of incarceration to a social structure rooted in distrust that perpetuates a damaging psychological experience within society at large. As Ruth Wilson Gilmore explains, “prison is not a building ‘over there’ but a set of relationships that undermine rather than stabilize everyday lives everywhere” (p. 242). And the influence of the carceral state can be found in archives in places that may seem surprising. Citing the American Library Association, Alison Clemens and Jessica Farrell explain, “[f]or example, public and private college libraries and government offices, where many archives and special collections are located, routinely purchase furniture from correctional services” (p.5). To push back against this structure, archives engaged in carceral collecting should engage in a practice of embodied inclusivity. Archives engaged in carceral collecting must acknowledge that the diversity of researchers’ embodied experiences requires a diversity of approaches to make a wide variety of people feel welcome (Fleming and McBride, 2017).
It is important to consider that there are limits to trust, and there are some individuals who are not going to be comfortable visiting an archive, particularly at Ivy League institutions and predominantly white institutions. Archives should consider how they can be creative in developing specific outreach programs that make people aware of their digital resources.
Finally, archives must address the reality that many people interested in accessing primary source material documenting the history of incarceration in the United States cannot visit the archive at all. Therefore, the final suggestion of this paper is that archives engaged in carceral collecting should consider partnering with education programs for incarcerated students to explore what options exist for making archival material available to them. Knowledge can fuel empowerment, which “requires transforming unjust social institutions that African-Americans encounter from one generation to the next” (Hill Collins, p. 291). Academic institutions that collect the histories and intellectual work of prison abolitionists and individuals who have been incarcerated, including Black, Indigenous, and other people of color as well as individuals held in detention facilities, should support incarcerated students.
My research for this paper began with an encounter with Michelle Russell’s essay “An Open Letter to the Academy.” Russell names “the responsibility of women’s studies to those outside the academy’s walls: the mass of women whose lives will be fundamentally affected by the version of reality developed there, but who, as yet, have no way of directly influencing [the academy’s] direction” (1981, p. 101). Working with other education professionals to facilitate access to primary source material for incarcerated students is one way in which institutions engaged in carceral collecting can participate in an exchange with the community of incarcerated individuals in the United States.
None of this necessary work should discourage academic archives from continuing to explore carceral collecting as a growing focus. Including the papers of currently or formerly incarcerated individuals and those involved in abolitionist movements in the archive creates an opportunity for these voices to be heard that is rare within the carceral state. Angela Davis calls attention to the fact that allowing the voices of those behind bars to be heard is essential for those “serious about developing egalitarian relations” (2016, p. 27).
Demonstrated action towards different ways of being in community shows donors and members of the community of individuals negatively impacted by mass incarceration in the United States that the archive views them as co-strugglers and not as objects of curiosity. As explained by Kelly Hayes and Mariame Kaba, “if we can experience other people as co-strugglers… we can act on the values of the world we want to build. We can experience moments of justice, peace, and liberation and in doing so realize that these concepts are not fantasies but realities that can be constructed” (p. 39-40). For a profession dedicated to preserving the history of the full range of American life, the work required to provide access to this material in a respectful manner that reciprocates the trust donors have shown in being willing to share their records is absolutely worth the investment.
Acknowledgements
Thank you to the Harvard Library Bryant Fellowship committee for their time and consideration of my fellowship proposal. Participating in the Bryant Fellowship program launched this work and provided me with the opportunity to visit archives on the East Coast. Big thanks to my Schlesinger Library colleagues Jonathan Tuttle, Emily Mathay, Kelcy Shepherd, Rachel Greenhaus, Erin Labove, and Suzi Earle for sharing feedback on an early draft of this work. Thanks also to Ian Beilin and Jasmine Sykes-Kunk for providing peer review and publishing editor Jess Schomberg as well as the rest of the In the Library with the Lead Pipe team for their kind consideration of my article proposal.
Note: The URL for the AbSC Reading the Room zine was updated on March 12, 2025, as per the author’s request.
Abu-Jamal, Mumia. Live From Death Row. Reading, MA: Addison-Wesley Pub. Co., 1995.
Accardi, Maria. Feminist Pedagogy for Library Instruction. Sacramento: Library Juice Press, 2013
Alexander, Michelle. The New Jim Crow: Mass Incarceration in the Age of Colorblindness. New York: New Press, 2010.
Alexander, Qui. “Teaching Abolitionist Praxis in the Everyday.” In Abolition Feminisms: Feminist Ruptures against the Carceral State, edited by Alisa Bierria, Jakeya Caruthers, and Brooke Lober, 275-291. Chicago: Haymarket Books, 2022.
American Library Association. “Guidelines: Competencies for Special Collections Professionals.” ACRL Guidelines, Standards, and Frameworks. Last updated March 6, 2017. https://www.ala.org/acrl/standards/comp4specollect#ref.
Association of College and Research Libraries. “ACRL/RBMS Guidelines Regarding the Security of Special Collections Materials.” Revised June 2023. https://www.ala.org/acrl/standards/security_theft.
Beilin, Ian. “The Academic Research Library’s White Past and Present.” In Topographies of Whiteness: Mapping Whiteness in Library and Information Science, edited by Gina Schlesselman-Tarango, 79-98. Sacramento: Library Juice Press, 2017.
Bey, Marquis. Black Trans Feminism. Durham: Duke University Press, 2022.
Bierria, Alisa, Jakeya Caruthers, and Brooke Lober. “Abolition Feminisms in Transformative Times.” In Abolition Feminisms: Organizing, Survival, and Transformative Practice, edited by Alisa Bierria, Jakeya Caruthers, and Brooke Lober, 1-8. Chicago: Haymarket Books, 2022.
Chiu, Anastasia, Fobazi M. Ettarh, and Jennifer A. Ferretti. “Not the Shark, but the Water: How Neutrality and Vocational Awe Intertwine to Uphold White Supremacy.” In Knowledge Justice: Disrupting Library and Information Studies through Critical Race Theory, edited by Sofia Y. Leung and Jorge R. López-McKnight, 49-71. Cambridge, MA: The MIT Press, 2021.
Clemens, Alison and Jessica Farrell. “Introduction.” In Archivist Actions, Abolitionist Futures: Reimaging Archival Practice Against Incarceration, edited by Alison Clemens and Jessica Farrell, 5-14. Alexandria, VA: Council on Library and Information Resources, 2025.
Davis, Angela. “Believe in New Possibilities.” In Abolition for the People: The Movement for a Future without Policing and Prisons, edited by Colin Kaepernick, xix-xxiii. Chicago: Haymarket Books, 2023.
Davis, Angela. Freedom is a Constant Struggle: Ferguson, Palestine, and the Foundations of a Movement. Edited by Frank Barat. Chicago: Haymarket Books, 2016.
Davis, Angela. Women, Race, and Class. New York: Random House, Inc. 1983.
Fleming, Rachel and Kelly McBride. “How We Speak, How We Think, What We Do: Leading Intersectional Feminist Conversations in Libraries.” In Feminists Among Us: Resistance and Advocacy in Library Leadership, edited by Shirley Lew and Baharak Yousefi, 107-125. United States: Litwin Books, LLC, 2017.
Foucault, Michel. The Archaeology of Knowledge: And the Discourse on Language. New York: Pantheon Books, 1972.
Gilmore, Ruth Wilson. Golden Gulag: Prisons, Surplus, Crisis, and Opposition in Globalizing California. Berkeley: University of California Press, 2007.
Haines, Staci K. The Politics of Trauma: Somatics, Healing, and Social Justice. Huichin, unceded Ohlone land: North Atlantic Books, 2019.
Hayes, Kelly and Mariame Kaba. Let This Radicalize You: Organizing and the Revolution of Reciprocal Care. Chicago: Haymarket Books, 2023.
Hernández, Kelly Lytle, Khalil Gibran Muhammad, and Heather Ann Thompson. “Introduction: Constructing the Carceral State.” The Journal of American History 102, no. 1 (2015): 18-24. https://doi.org/10.1093/jahist/jav259.
Hill Collins, Patricia. Black Feminist Thought: Knowledge, Consciousness, and the Politics of Empowerment. New York: Routledge, 2009.
Joint Council of Librarians of Color (JCLC), Inc. Board and JCLC 2022 Steering Comittee. “JCLC Stands with the LGBTQI+ Community in Florida.” Joint Council of Librarians of Color. n.d. https://www.jclcinc.org/conference/2022/about-jclc/statement/
Kaba, Mariame. “Yes, We Mean Literally Abolish the Police.” New York Times, June 1, 2020. Opinion. https://www.nytimes.com/2020/06/12/opinion/sunday/floyd-abolish-defund-police.html
Karl Ichiro Akiya Papers. Tamiment Library/Robert F. Wagner Labor Archives. New York University.
Menakem, Resmaa. My Grandmother’s Hands: Racialized Trauma and the Pathway to Mending our Hearts and Bodies. Las Vegas, Nevada: Central Recovery Press, 2017.
Mickenberg, Julia L. Learning from the Left: Children’s Literature, the Cold War, and Radical Politics in the United States. New York: Oxford University Press, 2006.
Progressive Librarians Council Correspondence. Tamiment Library/Robert F. Wagner Labor Archives. New York University.
Perreira, Christopher. Archiving Medical Violence: Consent and the Carceral State. Minneapolis: University of Minnesota Press, 2023.
Raya Dreben Papers. The Schlesinger Library on the History of Women in America. Harvard Radcliffe Institute.
Robinson-Sweet, Anna. “Caring for Archives of Incarceration: The Ethics of Carceral Collecting at University Archives.” Archivaria 97 (2024): 46-80. Muse.jhu.edu/article/930302.
Schomburg Center for Research in Black Culture records. Schomburg Center for Research in Black Culture, Manuscripts, Archives and Rare Books Division. The New York Public Library.
Shull, Tina. “QTGNC Stories from US Immigration Detention and Abolitionist Imaginaries, 1980-Present.” In Abolition Feminisms: Organizing, Survival, and Transformative Practice, 159-189. Chicago: Haymarket Books, 2022.
The National Committee for Redress, Japanese American Citizens League. The Japanese American Incarceration: A Case for Redress. San Francisco: Japanese American Citizens League, 1978.
van der Kolk, Bessel. The Body Keeps the Score: Brain, Mind, and Body in the Healing of Trauma. New York: Penguin Books, 2014.
Yousefi, Baharak. “On the Disparity Between What We Say and What We Do in Libraries.” In Feminists Among Us: Resistance and Advocacy in Library Leadership, edited by Shirley Lew and Baharak Yousefi. Sacramento: Library Juice Press, 2017.
[i] The description of the strategic collecting direction “Voices of Mass Incarceration in the United States” at Brown University’s John Hay Library was informative for crafting this definition of carceral collecting.
[ii] See Mariame Kaba’s “Yes, We Mean Literally Abolish the Police” in the New York Times for an introduction to the work and ideas of contemporary prison abolitionists.
[iii] Thanks to Jasmine Sykes-Kunk for drawing my attention to this contemporary example.
[iv] The term concentration camp is used following the practice of Densho. “Despite the seemingly innocuous name [relocation center], these were prisons—compounds of barracks surrounded by barbed wire fences and patrolled by armed guards—which Japanese Americans could not leave without permission. ‘Relocation center’ fails to convey the harsh conditions and forced confinement of these facilities… Our use of ‘concentration camp’ is intended to accurately describe what Japanese Americans were subjected to during WWII, and is not meant to undermine the experiences of Holocaust survivors or to conflate these two histories in any way. Like many Holocaust studies scholars, we believe that ‘concentration camp’ is a euphemism for the Nazi death camps where millions of innocent Jews and other political prisoners were killed” (Accessed June 9, 2024).
[vii] A reference for the library crowd – the ethos of neutrality has culturally stuck to archives so much so that Tig Notaro‘s character, Jett Reno, uses the term in the 32nd century to describe the fictional Eternal Gallery and Archive in episode seven, “Erigah,” of Star Trek: Discovery season five.
The recordings for the Leeds and Witmark demos were never intended for
public consumption, but were made to sell Dylan’s songs to other
artists. The demo sessions took place in a tiny 6-by-8-foot studio at
Witmark’s offices in the Look Building at 51st Street and Madison
Avenue, where an engineer would capture the performances on a
reel-to-reel. To save tape, the demos were recorded at 7.5 inches per
second, half the speed used in professional studios. A Witmark copyist
would then transcribe the lyrics and music from the tape, and song
sheets would be printed and mailed to recording companies. When a
company’s artist expressed an interest in a song, Witmark would cut an
acetate, a recording on inexpensive plastic, that would be sent to the
artist for preview purposes. If acceptable, the song would be recorded.
The
Witmark Demos: 1962-1964
The image of Mr Z. rattling off these songs, and recording them on the
cheapest material, in the cheapest possible way, so that other artists
could decide whether they wanted to record them for real seems so
ephemeral, like some kind of pre-web social media. It’s amazing that
they survived and could be turned into an album.
In 2024, updates to the Plain Language Medical Dictionary (PLMD) included big improvements for accessibility and user experience, plus adding support for images. We fixed contrast issues, unclear icons, and missing labels to meet WCAG 2.1. Search also got smoother, and instructions are now clearer. In addition, we added image support with JSON updates for URLs and alt text. With our legacy hosting environment shutting down, we moved the PLMD moved to GitHub Pages as part of the project. This provides better stability and automatic updates via GitHub Actions.
I've been in or near higher education for my entire career, so it is probably no surprise that educational technology ranks high on DLTJ topics.
Although a lot of my experience is with library technology, that isn't the only part of the ed-tech landscape that I'm interested in.
Take, for example, these recent Thursday Threads topics:
Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page.
If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.
Outsmarting "phone prison" pouches
Lauren is one of more than 2 million students in 50 states and 35 countries who scramble each school day to check that one final text or TikTok before sliding their phone into a gray neoprene pouch made by Los Angeles–based Yondr, which brought in over $5 million from government contracts — mainly school districts — in the first three quarters of 2024 alone, according to data service GovSpend. At many schools that use Yondr, each student receives a pouch at the beginning of the school year like they would a textbook. Before entering the building, they snap their pouches shut, then open them on their way out using plate-size magnetic unlocking bases mounted on the walls or rolled out on carts near the exits.
My family went to see Jon Stewart in person for a comedy set last year, and it was the first time I had encountered a Yondr pouch.
It uses a magnetic clasp to seal the phone in the pouch, and although the pouch doesn't block signals, it makes it impossible to see use the camera.
Or, as I imagine the school use cases, read/send texts and browse the social media.
I can see why this would be compelling to schools to eliminate distractions, but a classroom seems much easier to control than a theater venue.
And this article points out that the pouches are expensive, too, and that students have ways around them.
(Like tossing a "burner" phone in the pouch and keeping your real one.)
UNESCO calls for schools to ban smartphones
Smartphones should be banned from schools to tackle classroom disruption, improve learning and help protect children from cyberbullying, a UN report has recommended. Unesco, the UN’s education, science and culture agency, said there was evidence that excessive mobile phone use was linked to reduced educational performance and that high levels of screen time had a negative effect on children’s emotional stability.
UNESCO has called for a global ban on smartphones in schools, citing concerns over classroom disruption, cyberbullying, and reduced educational performance linked to excessive mobile phone use.
What I found surprising was that one in four countries has implemented smartphone bans in schools, including France and the Netherlands.
The report also warns against an uncritical embrace of technology, emphasizing that not all technological changes lead to progress...which seems like sound reasoning to me.
Implications of a student-information-system-in-the-cloud hack
On January 7, at 11:10 p.m. in Dubai, Romy Backus received an email from education technology giant PowerSchool notifying her that the school she works at was one of the victims of a data breach that the company discovered on December 28. PowerSchool said hackers had accessed a cloud system that housed a trove of students’ and teachers’ private information, including Social Security numbers, medical information, grades, and other personal data from schools all over the world.... The next morning after getting the email from PowerSchool, Backus said she went to see her manager, triggered the school’s protocols to handle data breaches, and started investigating the breach to understand exactly what the hackers stole from her school, since PowerSchool didn’t provide any details related to her school in its disclosure email.
Earlier this year, PowerSchool announced that there had been a data breach of its cloud-based student information system.
Its website says it is the largest provider of cloud-based education software for K-12 education in the U.S., serving more than 75% of students in North America.
That is 18,000 customers to support more than 60 million students in the United States alone.
The article discusses the aftermath of the data breach at PowerSchool and how the company was not responsive to questions of the nature of the breach.
School administrators were left scrambling for information, and a system administrator at the American School of Dubai took the initiative to investigate the situation.
Romy Backus collaborated with peers to create a comprehensive guide detailing how to assess the breach and identify stolen data.
Collaboration is common in the education sector, probably because of the generally limited resources for technology cybersecurity.
I've experienced this myself in higher education, where a sense of camaraderie and sharing permeates the profession.
I think a lot of the open source movement comes out of education...it would be interesting to know if that feeling is backed up by actual data.
Google declared end-of-life for Chromebooks that schools want to keep using
At a lofty warehouse in East Oakland, a dozen students have spent their summer days tinkering with laptops. The teens, who are part of Oakland Unified’s tech repair internship, have fixed broken screens, faulty keyboards and tangled wiring, mending whatever they can. But despite their technological prowess, there’s one mechanical issue the tech interns haven’t been able to crack: expired Chromebooks. With a software death date baked into each model, older versions of these inexpensive computers are set to expire three to six years after their release. Despite having fully functioning hardware, an expired Chromebook will no longer receive the software updates it needs, blocking basic websites and applications from use.
This story has a happy ending—Google later extended its support for Chromebooks to 10 years—but it is a reminder of how much influence has been given to technology companies in the education space.
The article discusses the issue of built-in software "death dates" for Chromebooks, which render many older models obsolete after three to six years despite their hardware still functioning.
2023 was at the point where the first round of Chromebooks used during the pandemic were reaching their original end-of-life, so the monetary expense and the stagger e-waste of still-usable machines were stunning.
Advise for buying educational technology
The ed tech industry’s pandemic-era boom has meant K-12 schools and universities are receiving sales pitches for an abundance of new products—from generative AI writing tools and math tutors to robot security guards and lightboards. But with those choices, and billions of dollars being spent annually on ed tech, educators and school administrators say they also have a problem: There is no mandatory licensing process that certifies that ed tech products work as advertised or that they can be trusted with sensitive student information. Experts have called for countries to establish licensing bodies for educational technology, but for the time being, ed tech companies have largely been left to regulate themselves through voluntary, industry-funded certification programs.
Buying technology you can trust is challenging, often because it seems like "trust" is not a selling point that companies emphasize.
These decisions can be even more challenging in the educational technology space, where there are concerns about student privacy.
This article offers some suggestions for evaluating technology purchases.
Chromebooks rejected in Denmark over student privacy concerns
Danish privacy regulator Datatilsysnet has ruled that cities in Denmark need considerably more assurances about privacy to use Google service that may expose children’s data, reports BleepingComputer. The agency found that Google uses student data from Chromebooks and Google Workplace for Education “for its own purposes,” which isn’t allowed under European privacy law. Municipalities will need to explain by March 1st how they plan to comply with the order to stop transferring data to Google, and won’t be able to do so at all starting August 1st, which could mean phasing out Chromebooks entirely.
The regulator found that Google uses data from Chromebooks and Google Workspace for Education for its own purposes, violating European privacy laws.
This decision stems from concerns that Google’s use of student data for performance analytics and AI development is inappropriate, even if not used for targeted advertising.
Google had been in discussions with Danish municipalities since July 2022 to address privacy issues, and it's unclear whether the issue has been resolved.
The latest information I could find in English is from September 2024, and it said that "it is still not settled how the municipalities will ensure compliance and accordance with the decision from the DPA."
When in doubt, the easy answer is to filter everything objectionable. It isn't a good answer
CIPA [Children’s Internet Protection Act], a federal law passed in 2000, requires schools seeking subsidized internet access to keep students from seeing obscene or harmful images online—especially porn. School districts all over the country, like Rockwood in the western suburbs of St. Louis, go much further, limiting not only what images students can see but what words they can read. Records obtained from 16 districts in 11 different states show just how broadly schools block content, forcing students to jump through hoops to complete assignments and keeping them from resources that could support their health and safety.
As my kids were going through high school, they ran into this problem, too, and had to use their mobile devices or the home internet to complete assignments.
But I remember this problem back in the mid-2000s when I was asked to serve on a technology advisory committee for public libraries.
Internet filters, initially intended to block pornographic content, have crept into blocking access to educational and health resources.
The investigation revealed that districts often overblock content, affecting access to vital resources like suicide prevention sites and sexual health information.
The Markup found that filtering systems used in schools categorize the internet broadly, leading to significant censorship, especially of LGBTQ+ supportive content while allowing access to anti-LGBTQ+ materials.
Clearly, there is a need for a more nuanced approach to web filtering in schools to allow students to access a broad range of information essential to learning and general well-being.
This Week I Learned: It is much harder to get to the Sun than it is to Mars
The Sun contains 99.8 percent of the mass in our solar system. Its gravitational pull is what keeps everything here, from tiny Mercury to the gas giants to the Oort Cloud, 186 billion miles away. But even though the Sun has such a powerful pull, it’s surprisingly hard to actually go to the Sun: It takes 55 times more energy to go to the Sun than it does to go to Mars.
I suppose it that headline above needs some nuance.
It is easy to get to the Sun...just escape Earth's gravity and point yourself there.
It is hard to get to the Sun in a controlled way that means you won't burn up along the way.
What did you learn this week? Let me know on Mastodon or Bluesky.
Free menstrual products at libraries are no longer a new phenomenon, thanks to the work of global menstrual equity advocates such as Period.org and Global Menstrual Collective. However, more often than not, these initiatives center around disposable period products. We argue that the work should not stop there. Libraries should explore the distribution of reusable menstrual products, such as menstrual cups and discs, cloth pads, and period underwear. These options are substantially better for the environment, safe to use, and can provide a form of long-term economic support by removing the need to continually buy disposable products. With $10,000 of grant funding, our academic library succeeded in distributing 701 menstrual cups for low cost on campus. Through our first vendor’s buy-one, donate-one policy, an additional 437 cups were donated to the vendor’s global charity partners (resulting in a total 1,138 cups distributed). This initiative not only addresses menstrual equity, with an average saving of $250 a year (and possible $2,500 savings over the ten-year lifespan of a menstrual cup) compared to disposable products, but also highlights the need for sustainable practices within institutional settings, providing a replicable model for others.
Our Menstrual Cup Project
Preparation
Our project started to take shape in Fall 2019, when our library, The Spencer S. Eccles Health Sciences Library, became one of the first buildings at the University of Utah, an R1 public university, to provide freely available disposable pads and tampons within all restrooms. Providing these menstrual products helped relieve some stressors resulting from being a commuter campus, where access to period products for the majority of the campus community would require leaving campus for home or a store. Two staff members within Access Services, Donna Baluchi and Alison Mortensen-Hayes, discussed the feasibility and budget needed to distribute reusable menstrual cups. In February 2020, Mortensen-Hayes researched potential menstrual cup vendors and pricing, as well as funding resources for the project itself. Our team determined the amount of funding we should be seeking and what kind of impact we could have.
Unsurprisingly, our smaller academic library did not have the ongoing budget for free distribution of reusable menstrual products, so we sought out external funding through grants and/or partnerships with organizations with similar sustainability and equity goals. At the time, we were seeing a wholesale average cost of $17 per menstrual cup and an overwhelming number of vendor options. It’s likely that conducting the same vendor and cost research today would result in even more brands and options, and pricing has likely shifted as well.
Understanding we would need to acquire grant funding, which would require budgetary and project management, we initially formed our project team exclusively with others in our library, primarily part-time student employees in the Access Services department: Olivia Kavapalu and Maha Alshammary. One team member, Olivia Kavapalu, was then able to connect us with two students already involved in sustainability initiatives on campus, Amelia Heiner and Sara Wilson. We were all, coincidentally, looking to do the same project at the same time. Together we were a team of two full-time library staff, two part-time library staff who were also students, and two undergraduate students. It ultimately proved valuable to have a diversity of backgrounds, skills, and audience reach when approaching this project.
To guide our approach, we surveyed the campus community to assess interest in reusable menstrual products and preferences for distribution. This survey was distributed via a newly created Instagram account for the project (@cups.for.uofu), the library’s Instagram (@EHSLibrary), internal campus email lists, and through our personal networks. We asked if people would like a menstrual cup specifically, if they would be willing to pick up on campus or if they would prefer shipping to home (in the midst of COVID-19, an important factor), if they would pay for one (including extra fees for shipping), and how many they would like. This data was then used to better inform our grant application and our logistics planning afterwards.
Our survey received 75 responses in total. Survey participants were predominantly students and all identified as female, highlighting student-driven demand for menstrual equity initiatives on campus. 53% of all respondents expressed interest in discounted menstrual cups, with a significant portion open to paying a small fee for added convenience in shipping. 20% stated they were not interested. 21% of total respondents replied “maybe”. We included an optional open comment for participants, specifically encouraging those who answered “maybe” to voice their hesitancy to the surveyors. Only 15 respondents included a comment, 8 of which responding that they would get a menstrual cup if it was discounted. 7 responded that they were nervous about trying menstrual cups. 1 stated that they “did not need one”, and 1 stated they had an IUD, but did not elaborate. The majority of respondents only wanted to purchase one or two menstrual cups. We also included an optional open question for “Additional comments or concerns” which received 10 responses. Nine of those responses included general excitement about the project and the prospect of discounted menstrual cups. Two mentioned wanting size options, and one asked if we would consider including period underwear as a reusable option.
These findings suggest strong community interest in affordable, accessible options for sustainable menstrual products, aligning with broader trends in menstrual equity. These findings also support a worldwide increase in interest in menstrual cups (Why Are Menstrual Cups Becoming More Popular?, 2018). In essence, survey responses were overwhelmingly positive/motivating and gave us sufficient data to proceed.
Full results of our survey are summarized in Table 1.
Table 1
Question
Would you be interested in purchasing the menstrual cup shown above at a significant discount through the University of Utah? (required with fixed options; 75 responses)
Yes 53.33%
Maybe 21.33%
No, I already use a menstrual cup or other zero-waste menstrual product 5.33%
No, I don’t want to switch from traditional menstrual products 20%
If you answered ‘Maybe’ to the above question, why? (optional open comment; 15 responses)
Nervous about trying menstrual cup 40%
Personal belief they do not need one 6.7%
Would get one if discounted 46.7%
Have IUD 6.7%
What would be your preferred method of receiving the cup? (optional, select all you would be willing to do; 73 responses)
Pay $3 in addition to the discounted price to have it shipped directly to you 56% of all respondents selected
Pay the discounted price and get free shipping directly to you 77.3% of all respondents selected
Pick-up on campus with COVID safety protocols in place 56% of all respondents selected
(2.7% did not respond)
How many cups would you purchase, if you did? (optional, select one; 70 responses)
138.7%
241.3%
39.3%
4+4% (6.7% did not respond)
To which gender identity do you most identify? (required with fixed options; 75 responses)
Female 100%
Male 0%
Nonbinary/Trans 0%
Which best describes your affiliation to the University of Utah? (required with fixed options; 75 responses)
Student 89.3%
Alumni 10.7%
Using the vendor research and survey data, we were able to apply for funding with a clear project vision, including budget and timeline. Very fortunately, our institution’s Sustainability Office offers grants called Sustainable Campus Initiative Funds (SCIF)(Sustainable Campus Initiative Fund – Sustainability, n.d.). This grant is funded by student tuition fees and therefore must go toward sustainability projects that primarily benefit students.
We were notified in March 2021 that we were awarded the $10,000 “medium-sized” grant. This enabled us to take the scope of our project from providing a few menstrual cups within our library to a large distribution across the entire campus. We were thrilled with this prospect since we knew a grant application has no guarantees, and were even warned that these grants are not often awarded to initiatives that are likely to die out without a consistent funding source.
Getting Started After Funding Obtained
Project members Maha Alshammary and Olivia Kavapalu researched potential campus partners that existed at the time to help advertise our efforts, including the LGBT Resource Center; Women’s Resource Center; Diversity Office; Office of Health Equity, Diversity and Inclusion; Sustainability Office; other libraries on campus; and all student resource centers. Team members of this project decided to use GroupMe as a way to communicate with each other, supplemented with occasional emails, and met virtually exclusively via Zoom. As briefly mentioned before, project member Amelia Heiner created a social media profile on Instagram to get the message out about our project, as well as inform everyone of the sustainability benefits of using a menstrual cup over traditional disposable pads and tampons: https://www.instagram.com/cups.for.uofu/. We planned to use this account to continue educating on the sustainability of reusable menstrual products, specific education on menstrual cup use, promote any tabling events we set up, and link to our shipping option. Providing drop-ship would allow patrons to purchase a deeply discounted cup directly with the vendor and have the product shipped directly to their chosen address.
From April to August 2021 we finalized logistics. Matters we had to settle included vendor options, selecting the cup we wanted to distribute, and how we would table and/or offer direct shipping in the light of campus COVID protocols. Accounting considerations took much of our focus, such as how billing would be handled with the vendor, how we could accept payments on campus, and how we would comply with university regulations. We were not in departments responsible for any of our library’s large purchases, and we did not realize the time-consuming complications that would later arise when having to navigate university purchasing and accounting policies.
Vendor Selection
It was important to our team that our vendor was as sustainable and ethical as our project aimed to be. There are many inexpensive silicon-based menstrual cups from large manufacturers in countries with less regulation, who will use the term “medical grade”. In many cases, these are less expensive, but lack any sustainability stance, medical education for users, or company ethics driving their production. Our choice of vendor emphasized sustainable practices and menstrual equity, supporting the environmental and social goals of the project. We initially chose the vendor Dot Cups Ltd., due to their prior partnerships with colleges and universities and their charitable company policy of donating one cup for every cup purchased to organizations of Dot’s choice, doubling our sustainability impacts. Dot did not indicate where they would be donating their charity cups, simply that they would send them to different global partners they had established. Dot did mention the opportunity for us to get more involved in their charity donation policy and connect them with a local establishment of our choosing. This was something we were excited to hear, but without any established partnerships to pursue this, we defaulted to Dot’s company connections. Dot additionally provided us with marketing material focused on college students who would be first-time menstrual cup users.
We had already begun working with Dot when accounting let us know that our request to work with a single vendor was denied by the university due to policy. Instead, we were required to put the project out to bid to all vendors to encourage fairness in vendor selection. Our public university requires all purchases using institutional funding of over $5,000 to go through this process, wherein multiple vendors are permitted to apply for consideration. After vendor applications are submitted, the central university accounting office makes the decision and selects a single vendor. While the nuances of their selection criteria were unknown to our team and we were not involved in their direct conversations with vendors, we did have the opportunity to suggest specific companies for them to reach out to and solicit an application. We returned to the vendor research we did at the very beginning of this project to supply accounting these additional vendor options. This allowed us to focus on companies that shared our values and could potentially meet the scope of our vision as previously described.
Fortunately, we did end up working with Dot, though the bidding process delayed the project by a couple of months (which we did not foresee).
Budget Management
Project members tentatively planned on purchasing half of the inventory for distribution on-campus, and half to distribute via drop-shipping. However, this was not purchased or acted upon all at once. Accounting advised us to place an order with Dot Cups with only half of our grant funding: $5,000. We were requested to stagger the purchases like this so that we could verify the inventory existed as advertised and calm our accountants’ fears of any fraudulent claims by Dot, which we highly recommend others also do when purchasing at this volume. Therefore, our initial purchase was for 295 cups (at $17 each, tax free due our nonprofit status as a public university) for in-person sales. At the time, Dot menstrual cups retailed on their website for $34 each.
Our campus’ sustainability grants are often applied for by students, and all grant awardees are assigned an advisor from the campus’ Sustainability Office to guide them through their awarded projects. At the advice of our assigned advisor, we decided to charge a minimal $5 fee per cup, in order to give them psychological weight and discourage waste of the cups themselves. If free, our advisor warned that there is the likelihood of people taking the cups simply because they are free and later throwing them away. In the case of drop-shipping, after discussing with our vendor, we decided on $8 total cost: $5 for the menstrual cup and an additional $3 for shipping. All fees collected were cycled back into the project to buy more cups and stretch the project further. Based on our survey data, we decided to limit each person to purchasing two cups to extend the number of people able to purchase a menstrual cup, and prevent any purchaser from buying many to attempt to resell them.
Sales
We began soft selling in August 2021 from the library’s front desk while working with the vendor to set up drop-shipments. Our library’s front desk is staffed by part-time, often student employees, who were not over-burdened by the task. There was no discomfort from anyone with handling the menstrual cups, and the menstruators who worked at the library were especially excited to get access to inexpensive menstrual cups and help in the distribution. We could only take payment using the accounting-approved methods of cash or our university credit card processing merchant services client, which required the use of a webpage and manual input of a credit or debit card. There was minimal training needed for this. Most purchasers were interested in buying a cup and leaving, but for purchasers who had questions or wanted more education about insertion techniques, they were directed to team member Donna Baluchi.
Marketing and Promotion
We advertised our low-cost menstrual cup distribution using every avenue we could think of to capture as much of the campus community as possible. Physical marketing included hanging flyers and posters within student resource offices, bulletin boards, and signage inside any menstrual product baskets within restrooms. Digitally, we took advantage of newsletter campus emails that go out to large audiences and sent some targeted emails as well. Our Instagram, however, is what reached students most effectively, after the posts were continually shared within student “stories”.
Two of our project members were interviewed for the university’s health sciences newsletter. The interview was met with several positive comments, most of which expressed excitement and asked where they could get the menstrual cups. Comment examples included:
“BRAVO! Well thought out, great investment and execution.”
“Way to lead the campus on period equity and sustainability!”
Distribution Strategies
We held two tabling distribution events that we heavily marketed through all of our advertising channels, including personal networks. Due to the geographic layout of our university, we determined it would be advantageous to hold one event in the large plaza frequented by all undergraduate students in September 2021 and another at our library’s open house in October 2021 to reach colleagues and students on the health sciences area of campus, which is located a significant distance from central campus with a large elevation change that deters travel between the two sides. Each tabling event was for approximately ninety minutes, and we sold over 200 cups between the two events. Nearly every purchaser was a current university student, and many commented that the limited in-store options and initial cost of $30-$40 is what prevented them from trying menstrual cups before. The majority only purchased one menstrual cup, and those buying two often commented that it was being purchased for a friend or family member. We had less than five purchasers who specifically asked to buy three or four, stating specifically they had family or friends who asked if the purchaser could get one for them. In one instance, a purchaser obtained a small batch to put in the student food pantry for free giveaway (paying it forward).
We finalized drop-ship logistics with Dot in early October 2021, and purchased 142 cups for these drop-ship orders to start. We had actually attempted to order 100 cups (as our plan had shifted to ordering the drop-ship inventory in increments, responsive to demand), but Dot realized there was an issue with the invoice when our check arrived – since our campus members would be paying $5 each, the cups were meant to be invoiced at $12 per unit but the bill was set at $17 each. We consented for Dot to keep the extra funding and up the cup inventory we’d be purchasing rather than start the payment process over again with a new invoice and check.
When drop-ship was publicized via our Instagram account in October 2021, it was incredibly popular, even with the added $3 shipping cost. Within two weeks, over half were purchased.
On November 8th, 2021, we completely sold out of all inventory from our initial purchases, both in-person and drop-ship. Due to a global silicone shortage during the COVID-19 pandemic, we had exhausted our vendor’s inventory. Our vendor representative informed us that we would need to check back after January 2022 after they were able to restock.
Navigating Challenges
Our team faced several logistical and vendor-related challenges, which provided important lessons for future projects. Firstly, throughout the project’s length, our group dynamic shifted when students involved would graduate and leave, and other students would ask to be involved. In our team, two members left the project after graduation,and one new member joined when they were hired as part of the library’s front desk crew after they expressed interest. Students truly drove the passion behind the project, and have great networking skills, reaching audiences many full-time employees cannot. Motivated entirely by their passion and interest in increasing menstrual equity, the students on the project contributed countless unpaid hours to our work. However, working with students also means classes and graduation are their highest priority, and an increased school workload does not allow room for extracurricular activities.
Our project’s biggest challenge was the loss of our original vendor. Dot Cup became uncommunicative and were no longer responding to emails. After months of silence, we made the decision that we would need to continue the project with a new vendor. While it was difficult to deal with the extended delays and the loss of a vendor we trusted and respected, we remained committed to achieving the project’s intended sustainability and equity goals. We took a moment to regroup (as the project had been going on for over a year at that point) and decided to do one final large purchase and distribution effort.
Thankfully, we did not have to put the project out to bid again as the amount of our remaining funds was under the bidding threshold. This experience highlights the importance of planning for vendor flexibility in case of unforeseen delays or issues.
We collectively researched vendors, and chose Saalt to work with next. Our reasoning in selecting Saalt was because, like Dot Cups, they are a US-based company who makes medical grade silicone cups, with a demonstrated commitment to sustainability and menstrual education. At the time (and possibly still), Saalt offered differing prices depending on if you’re distributing the cups for free or not. Since we had already charged $5 during the first round of cup distribution, we felt it would be unfair to distribute the next round for free. However, Saalt offers deeply discounted products (including cups, discs, and period underwear) if the buyer’s intention is to distribute the items for free. At the time of communication, free distribution qualified for a special price of $10 per cup, as opposed to their wholesale $17 per cup price. Unlike our prior vendor, Saalt also offers multiple cup sizes, so we sent out a survey in May 2022 with the company’s “size quiz” to our email networks on campus. The emails went to a list of students and colleagues who had reached out to us after we had sold out of our initial stock, so we could email them once we received our next shipment. We received 42 responses from colleagues and students, and the results were exactly half “regular” and half “small”. At the time of ordering, they did not offer their current “teen” size. With the new pricing information, we placed our last vendor order in June 2022 with our remaining funds: 132 small size and 132 regular size (264 total).
Saalt cups arrived without individual packaging, so our team needed extra time to prepare them for distribution by placing each cup in the provided cloth bags. When ready, we held a final tabling distribution event at our library in July 2022. The majority of the cups were sold during our tabling event, and the few remaining cups were sold from the library desk, trickling out over two months. We marketed this final round using our Instagram page, the health sciences campus newsletter, and through signage in our restrooms.
On October 4, 2022, we sold our last menstrual cups and closed out our project. The small remaining funds from this last round of sales were returned to the university’s Sustainability Office, whom we received the grant from, to be applied to other sustainability-oriented projects.
Conclusory Remarks
Sustainability and equity are both driving goals for our library and our institution. They are both found in our library’s strategic plan as well as the five-year strategy goals of our university. This project was a way to enhance both, while providing improved healthcare to our campus community.
In total, our project team distributed 701 menstrual cups, and 1,138 menstrual cups altogether when considering Dot’s policy of donating a cup for each one purchased. There are positive environmental and fiscal impacts related to each one of these cups that we dispensed. One report states that “a year’s worth of a typical feminine hygiene product leaves a carbon footprint of 5.3 kg CO2 equivalents” (“The Ecological Impact of Feminine Hygiene Products,” n.d.). At 1,138 cups, assuming one cup per menstruator and a longevity of 10 years per cup, our project has the potential to avert 60,000 kg CO 2 emissions. Using the EPA greenhouse gas calculator, the total potential emissions we saved are equivalent to taking 13 cars off the road for a year or saving 66,000lbs of coal from being burned (US EPA, 2015). In terms of disposable product waste, at an estimated 22 products used a cycle (“Menstruation Facts and Figures,” n.d.), if 1 cup went to 1 person for 10 year’s use, we helped to avert the use of 3,004,320 disposable products collectively.
Financially, sources claim that on average, menstruators spend $20/month (or $18,000 in a lifetime) on disposable menstrual products (Female Homelessness and Period Poverty – National Organization for Women, 2021). Using this calculation, we potentially saved each menstruator $2,395 (again, assuming each cup lasts the expected 10 years).
We received a significant amount of praise and appreciation from the campus community and recently inspired a new group of students to begin a similar project of their own. The Associated Students of the University of Utah (ASUU) partnered with our university facilities department to permanently provide disposable menstrual products in all student academic buildings in 2021 (Menstrual Product Project – ASUU, n.d.). Our hope is to continue these efforts into perpetuity, and we have contacted ASUU representatives to encourage and support any menstrual reusable initiatives they might consider.
Lessons Learned, Advice, & Words of Encouragement
We encourage everyone to consider doing a similar menstrual equity project with reusable products. None of what we accomplished was assigned or part of our promotion or job scope, but we still succeeded. Our project pushed boundaries by bringing often-overlooked issues like menstrual equity and environmental sustainability into public discourse while navigating a complicated accounting process. No project member had prior experience with a project of this size, nor had anyone applied for a large grant before this project. Yet, despite our inexperience and the challenges we faced, we succeeded in making positive, equitable, sustainable change within our workplace and community. The interest and need is there. We believe anyone with a few committed colleagues who can write for grants could do this. Any nonprofits, schools, health systems, or community aid could put together a similar project.
Planning and Timeline
Though some of the challenges we faced in our project could easily be avoided, in the future, we would still advise that you give yourself plenty of time between project start and completion. Even without a global pandemic, this would have more than likely been a multi-year endeavor.
Do your research into your vendors and their products. Verify they are equitable, sustainable, and aware of what their products contain and where they are manufactured. Certified B Corporations are given that designation when they meet a minimum sustainability standard. Medical-grade silicone should be the minimum for cups, and other reusables should not include any harmful substances. Products advertised as “organic” and “non-toxic” such as period underwear have been known to test positive for PFAS contamination (Kluger, 2023), so it is imperative to make sure the product you are distributing does not come with a hidden harm. As part of this, we recommend avoiding the inexpensive, factory-only bulk companies. Period Nirvana is an online education hub and store that details all the recommended menstrual reusables on the market and can help with vendor (or personal) decisions around menstrual products.
Collaborative Partnerships
Partner with others early and often. Doing so from the outset of the project will streamline processes down the line and foster shared ownership. If you are at an academic institution, you likely have sustainability-oriented offices as well as women’s and possibly cultural or diversity centers. We found this was a positive way for multiple types of groups to work together on a shared goal. Health sciences and public health institutions are especially good places to find partners in this work. Other libraries, including public libraries, could partner with nonprofits and community organizations. Also, academic libraries can partner with their local public libraries and vice versa. People committed to menstrual equity are everywhere.
Recognize that you will likely also have to work closely with departments outside your core team, like your institutional accountants and facilities team. It is crucial you discuss project logistics with your budget office or accounting department as soon as your project vision is in place, especially those requiring large funding. Approach your accounting and facilities colleagues ahead of time to get everyone on board and find out what will be sustainable in terms of expectations. Without this expertise, you likely cannot sustain a project of this size long-term.
Budget and Accounting Management
Speaking of accountants, though Saalt offered drop-shipping, in-person distribution was significantly easier for our accountants to monitor, and after a couple years and a vendor change, they were feeling fatigued with this project. We recognized that we should have been talking more in-depth with our accountants at the very beginning, as there were significant complications within the funding and purchasing of a project like this than we had originally assumed (which was then further complicated by a global pandemic). Beyond complications such as a “Sole Source Request” to more directly choose which vendor we wanted (which we assumed would be a simple process, but at large public universities it is not) then putting the project out to bid, there were Payment Orders for each inventory purchase, sales tax calculated and paid on each cup sold (even though we purchased them tax-free as a public institution), regular cash deposits (policy prevented use of direct cash transfer apps like Venmo or Paypal), record keeping, adding a new vendor in various systems (ours, the university’s, theirs), etc. We regret not communicating the full scope of our project with our accounting team from day one of our initial project idea, as it would have saved time and energy for everyone.
Diversity and Inclusion
Diversify your project team, and especially include teens/young adults/college students, to cover the many different roles in a project like this. Including a diverse team, both in terms of skills and demographics, ensures that the project has a greater chance to meet the needs of all interested parties and gains wider community support. However, note that students may come and go as the project goes on, and certain times of the year will be more difficult for them to participate. As mentioned previously, our group dynamic shifted when Maha Alshammary and Sara Wilson left our project after graduation, and Graycee San Cebollero asked to join the team after being hired as a part-time library employee (and later assisted in reviewing literature on menstrual cup sustainability impacts). Undergraduate college student (approximately ages 17-25) menstruators were the age group most interested in menstrual products during our project ,and having them on your project team ensures their broader peer social networks can be reached.
Though we were unable to in our project, we advise any future reusable menstrual product distribution efforts to provide options; both the type of product as well as sizes. Different bodies have different needs, and different people have different desires. Throughout our project, team members and purchasers would occasionally mention that reusable pads or menstrual underwear should be our next project.
Last but certainly not least, avoid making your campaign gendered. To encourage inclusion and accessibility for all, the products should not be placed just in women’s designated restrooms/spaces, and publicity should not discourage trans, nonbinary, and gender diverse people from engaging and benefitting from your service. Avoid using “she/her” or “women” and instead use “menstruators” or, simply, “people” when discussing periods and those of us who have them. We know this is a current culture war. Our library has been attacked online for posting about our menstrual products, as well as had our disposable products in gendered men’s restrooms damaged and vandalized. None of this stopped us. Thanks to advocates across the university, there are now menstrual products in every bathroom on campus, for anyone, and any reason.
Acknowledgment
The authors are incredibly grateful to our publishing editor Jaena Rae Cabrera, and Brittany Paloma Fiedler and Joshua Osondu Ikenna for their encouragement, guidance, and expertise as our peer reviewers. Thank you.
We are also thankful to menstrual equity advocates worldwide, whose pioneering work helped pave the way for this project.
This project would not have been possible without the support of the University of Utah’s Sustainability Office and the Eccles Health Sciences Library Access Services and Accounting departments. We would also like to dedicate this article to Joan Gregory, who initially forged the path towards menstrual equity at our library and our institution.
The following post is one in a regular series on issues of Inclusion, Diversity, Equity, and Accessibility, compiled by a team of OCLC contributors.
Earliest known African American cookbook republished
One of the items in the Janice Bluestein Longone Culinary Archive held at University of Michigan (OCLC Symbol: EYM) Special Collection Research Center is a humble 30-page paper bound volume titled A Domestic Cook Book: Containing a Careful Selection of Useful Receipts for the Kitchen. Written by Malinda Russell and published in 1866 in Paw Paw, Michigan, this book is the only known copy of the earliest known cookbook written by Black author in the United States. Although this book has been digitized and is available online it has now been republished in a new edition that is available both in print and online as an open access ebook.
I’m highlighting this example during US Black History month as an example of the many treasures held in libraries and archives, but also as an example of the work that cultural heritage institutions play as publishers and disseminators of rare content, making materials available for all to enjoy and learn from. The book has recipes that — to a modern eye — lack important details like baking time or temperature. To prepare “French Lady Cake” the full instructions are “Three cups sugar, one do. butter, six eggs, one cup sweet milk, one teaspoon soda, two do. cream tartar, one wine-glass brandy, the juice of one lemon, four cups flour; the soda dissolved in the milk, the cream tartar in the flour.” The book also contains recipes for things like “Restoring the Hair to its Original Color,“ which is equally brief: “Lac Sulphuris two drachms, rose water eight ounces. Shake it thoroughly, and apply every night before going to bed.” The newly published edition also contains a foreword by Dr. Rafia Zafar, an expert in foodways and literature, which sets Russell’s accomplishments in context. Contributed by Merrilee Proffitt.
WebJunction webinar on neuroinclusive library workplaces
On 4 March OCLC’s WebJunction will host a free webinar Embracing neurodiversity: Cultivating an inclusive workplace for neurodivergent staff. Presented by accessibility consultant and librarian Renee Grassi. Webinar attendees will learn about the meaning of the term neurodiversity, the strengths of neurodivergent people, and ways of making the workplace more neuroinclusive. The blog post “Supporting Neurodiversity in the Library Workplace” by Bobbi L. Newman provides a summary of general articles about neurodiversity in the workplace and bibliography of resources.
I began writing about neurodiversity in libraries for Advancing IDEAs on 25 July 2023. As a neurodivergent person, I was thrilled to see so many excellent projects and articles focused on serving neurodivergent patrons including the University of Washington’s Autism Ready-Libraries Toolkit. However, as neurodivergent librarian, I was somewhat frustrated that discussions about making libraries more neuroinclusive often did not discuss the library as a workplace. In February 2025, I am thrilled to see these discussions happening in multiple spaces, including webinars, conference presentations, and the University of Washington’s current research project Empowering Neurodivergent Librarians. I look forward to attending this WebJunction webinar next month and seeing what other educational opportunities about this topic will emerge in 2025. Contributed by Kate James.
Documentaries focus on libraries
It isn’t every day that we hear about a film concerning libraries and what we are facing in the current moment. So, think how surprising it is to discover two such documentaries right now. At the January 2025 Sundance Film Festival in Park City, Utah, the documentary The Librarians, by producer-director Kim A. Snyder, premiered and is expected to be available on a major streaming platform soon. The Librarians features numerous colleagues of ours from the American Library Association (ALA) and the American Association of School Librarians (AASL) who have battled for our intellectual freedoms in such places as Florida, Louisiana, and Texas. Read about The Librarians in the ALA news item, “American Library Association and American Association of School Librarians Members Featured in Documentary ‘The Librarians’ to Walk the Red Carpet at Sundance Premiere.” If your local community has not already scheduled a showing of the other documentary, “Free for All: The Public Library,” be aware that its broadcast premiere is set for Tuesday, 29 April 2025, as part of the PBS show Independent Lens.
The Librarians examines the wave of censorship spreading across the United States, especially targeting racial and LGBTQ+ issues and resources. Librarians stand tall at the center, protecting access for all users and opposing legislation trying to criminalize the work we do. “Free for All” traces the history of how public libraries became a formative institution vital for the preservation of democracy. It counters the notion of library obsolescence with the fact that libraries are local treasures where resources are available and free to all, even in this fraught political era. Contributed by Jay Weitz.
LibraryThing is pleased to sit down this month with filmmaker and author Rosanne Limoncelli, the Senior Director for Film Technologies at the Kanbar Institute and at the Martin Scorsese Virtual Production Center, both part of NYU’s Tisch School of the Arts. She has written, directed, and produced documentaries, educational films and short narrative films, and has taught writing and filmmaking for more than three decades. Limoncelli’s first book, Teaching Filmmaking: Empowering Students Through Visual Storytelling, was published in 2009. She has published short stories in the Alfred Hitchcock Mystery Magazine, Suspense Magazine and Noir Nation. Her debut mystery novel, The Four Queens of Crime—offered in our January Early Reviewers batch—is due out next month from Crooked Lane Books. Limoncelli sat down with Abigail to answer some questions about her new book.
I love reading biographies of my favorite authors because I always wonder what experiences from their lives might’ve made it into their books. I love the psychology. Reading about Agatha Christie led me to the other three and it fascinated me that these four women were the bestselling authors of the 30’s. How amazing was that! The lives of Ngaio Marsh, Dorothy L. Sayers and Margery Allingham were just as fascinating, and of course I started wondering if they had ever met and that led me to, what if they did meet and got involved in a murder case? Would there have been a woman DCI they could’ve collaborated with? And then I found Lilian Wyles, the first woman DCI at Scotland Yard. And miraculously, she had a memoir!
Were you an admirer of these four authors’ work, before beginning your book? Which one is your favorite, and why?
That’s too hard a question! I think that sometimes I’m in the mood for one author over another, and they constantly switch places for number one. I love Christie’s puzzles, Ngaio’s characters, Allingham’s language, and the patter between Sayers’ protagonists. In the book they talk to each other about writing and how different they are from each other. That was one of the fun things about writing them as characters. I will say that for each author I have my favorite titles in all formats, hard cover, kindle and
Audiobook, and I go back to them often, not just for research reasons, I need to keep in touch with the main characters like they are real people in my life.
What sort of research did you have to do on your four queens, in order to incorporate them as characters in your story, and what were some of the most interesting things you learned?
I’m a research nerd, and I went way overboard researching all four authors, consuming all their books, plus articles, biographies, documentaries, movies and tv shows of their work, and the time period, 1938. I’m lucky that my husband has always been into the history of World War Two so we watched a lot of feature films and documentaries from and about that era. The four queens all came alive for me quickly, mainly through their biographies. I found it interesting to notice the differences between the four writers, as well as their similarities. For example, they were all big lovers of Shakespeare, they each had very different writing styles, they all grew up so differently. Agatha was home schooled, Dorothy was one of the first women to get a degree from Oxford, Ngaio was a painter and travel writer before she wrote mysteries, and Margery grew up surrounded by writers. I got very interested in the accuracy of their real lives pertaining to my story, figuring out the possible real time they could’ve spent together. The spring of 1938 would’ve been the last chance for them to meet before the war, since Ngaio Marsh returned to New Zealand shortly after that spring. I also noticed that they all had a change in their writing careers right about that time, so I imagined that the experiences on that weekend of my imagined murder changed them personally to bring about that literary change.
What influence has your career as a filmmaker had on your mystery writing? Would you say you were a visual storyteller? Do you see the scenes and characters before writing them?
I do think I see scenes and characters before I write them, which actually can make it more challenging for me because I forget I have to translate my visual imagination into text so I often leave things out without noticing. An early reader will mention they’d like more description of a certain place or person, and since I see it so clearly in my mind, I have to remind myself that no one else can see it, that I actually have to put it into words! But what is the same for me, in writing films and writing fiction, is the story structure. The logic and sequence of what happens and what should happen next is my favorite part and I make charts and spreadsheets and notes and lists obsessively before writing a project and throughout the whole process. It’s the puzzle of the story that I love the most. Building it up, breaking it down, deciding on the clues and all the information that leads to the climax and makes for a convincing ending, sorting and resorting every detail until it makes sense to me and I’m satisfied with it.
You have written short stories, films, and an academic text, but this is your first novel. Was the writing process any different, when working on this kind of text?
Technically this is the fifth novel I’ve written, just the first one to be published. (Keep writing out there, writers!) Each project is a bit different for me, but one thing that was quite similar in this project and the academic text (which stemmed from the dissertation for my PhD) was the research. In both cases I didn’t know exactly what I was going to write, at first, but I kept reading what interested me and taking lots of notes and underlining sentences, and marking sections with Post-It notes and noting links of websites and movie clips, then when it had gathered a certain satisfying accumulation, I stopped. I looked at everything I had gathered, all the notes, and sections, and visual images, etc. and it all seemed to magically come together thematically and emotionally. Like I was making a collage that found its shape from my subconscious. I think that the story starts to form itself in the back of my mind, while I’m gathering the research, and the story writing is easier after that once I get down to it.
What’s next? Will there be more stories featuring DCI Lilian Wyles? Might there be a film adaptation?
I am working on a sequel that takes place two years later. The war is raging and there are different problems to solve. This story is still a murder mystery puzzle, and Lilian Wyles leads the case, with help from the four queens, but it has a bit of a spy thriller spice added to it. I’m constantly inspired by the parallels from that time and our current day issues, there are so many similarities. As far as a film adaptation, I’d love to adapt The Four Queens of Crime into a feature or a tv series, we’ll see what happens!
Tell us about your library. What’s on your own shelves?
What have you been reading lately, and what would you recommend to other readers?
I just finished reading every Mick Herron book in the Slow Horses series. The TV show is great and the audio books are also very well done. I just read Still Life by Louise Penny who is amazing! I can’t wait to read all of her work. I just started All Systems Red by Martha Wells. I will definitely be reading the whole Murderbot series. When I am trying to make a lot of progress with my writing projects, I have to ban myself from reading because I won’t get enough writing done or get enough sleep!
I had a bit of insomnia after my dog woke me up at 3am. I found myself
reading the code behind Harvard Law Library’s effort
to archive data.gov. It didn’t help me get back to sleep 😵💫
On the one hand, it is enraging that this effort is deemed to
be in any way necessary at this time. But on the other, I do like how
the team at Harvard Law Library are offering a useful pattern for the
work of archiving data from the web, and one where it isn’t
really a one size fits all software “solution”, but rather an approach.
This post grew out of a thread looking
at how the various pieces of this pattern fit together, and why I think
it’s useful, but I thought I would move it here and add a bit more
context.
Why?
But before that, why is archiving datasets from the web even necessary?
Don’t we have the End of Term Web
Archive which collects widely from US Federal Government websites,
when the administrations change? These archives end up in the Internet
Archive’s Wayback Machine, who themselves do extensive archiving of the
web. But finding datasets in web archives can be difficult, and when you
find something sometimes what’s there isn’t complete, or easily usable.
For example here is the USAID GHSC-PSM Health Commodity Delivery Dataset
published in data.gov:
If you click to download the CSV file you’ll get an obscure message
Cannot find view with id tikn-bfy4. This is because the
Trump Administration deleted
USAID’s website in early February. data.gov is a clearinghouse
for datasets published elsewhere in the .gov web space. You can find 7
snapshots of the dataset’s webpage in the Wayback Machine, however if
you look at the
recent ones you’ll get similar error message when trying to download
the CSV dataset.
Fortunately if you go back far enough and look at previous snapshots you
will eventually find one
that works. data.gov was designed to allow crawlers to locate
datasets by following links. But not all web dashboard and dataset
catalog software is built this way. Datasets can often be hidden behind
search form interfaces, or obscured by JavaScript driven links that
prevent the downloading by web archiving robots.
And lastly, web archiving software puts data into WARC files, which are
great for replay systems like the Wayback Machine (when you know what
you are looking for). But researchers typically want the data (e.g. CSV,
Excel, shapefiles, etc) and don’t want to have to extract them from WARC
data. It’s these users that the Harvard Law Library is thinking about as
they designed their data.gov archive.
The Process
So how does Harvard’s archiving of data.gov work?
The first part is the fetch-index process in the data-vault project,
which uses data.gov’s API to page through all of the datasets, and saves
the metadata to a SQLite database (whicb also includes the JSON from the
API).
When I tried it, it ran for about an hour and collected the metadata for
306,598 datasets. It was designed so that it could be rerun to get
incremental updates.
Next the fetch-data utility (also in the data-vault repo)
iterates through all the datasets in the SQLite database, and builds a
set of URLs that are referenced in the dataset description.
Crucially, this list will include URLs extracted from the “resources”
JSON data structure that includes referenced data files:
A “resource” in data.gov JSON metadata
Also crucial is the fact that not all datasets are directly referenced
from data.gov – data.gov sometimes just links to a web page where the
data files can be downloaded. These aren’t currently being collected by
the data-vault code, and it’s a little unclear how many datasets are in
that state.
fetch-data then collects up the list of URLs and the
dataset metadata, and hands them off to a tool called bag-nabit, or nabit
for short:
At a high-level nabit:
downloads all the URLs to disk
writes the dataset metadata to disk
packages up the data and metadata into a BagIt directory (RFC 8493)
records provenance about who did the work and when they did it
The control then goes back to data-vault which puts the data on Amazon
S3.
The S3 bucket is being provided by the Source Cooperative which is a
project of Radiant Earth, who received
a Navigation Fund grant to make open data available on the web. You can
find the bucket at s3://us-west-2.opendata.source.coop.
There are currently 623,195 files totaling 16.3 TB in the
/harvard-lil/gov-data region of the bucket.
Provenance
Drilling down a little bit more. When nabit downloads the data from a
URL it also records the HTTP request and response traffic as a WARC file
(ISO 28500:2017) which is placed alongside the data itself in the BagIt
data directory.
The WARC file doesn’t contain the content of the dataset, but
just the metadata about the HTTP communication. This provides a record
of when and how the data was fetched from the web.
The name of the WARC file and downloaded data files, are listed along
with their fixities in the BagIt manifest:
manifest-sha256.txt. You can use the manifest to check to
see if the dataset is complete, and notice when files have been deleted
or changed.
The tagmanifest-sha256.txt lists all the “tag” files and
their fixities, which includes the manifest-sha256.txt.
This means you can notice if the manifest has been altered.
The extra bit that Harvard have added is a top level
signatures directory in the BagIt file structure, which
contains digital signatures for the tagmanifest-sha256.txt
file. These signatures are generated using an SSL key and an SSL
certificate, which (should) let you know who signed the package. The
certificate “chain” lets you know who verified they are who they say
they are. So for example you could use your web server’s SSL key and
certificate to sign the package.
And finally they also use a Time Stamp
Authority to sign the tagmanifest-sha256.txt signature
that in turn allows you to verify that the file was signed at a
particular time. Together these signatures allow you to verify who
signed the dataset, who trusts them, and when they did it. The manifest
allows you to ensure the data is complete, and the WARC file let’s you
see how the data was collected from the web.
In Summary
So the advantages that this presents over the traditional web archiving
approach are that:
Researchers work with ZIP files, which they can unpack, inspect
metadata, and directly interact with the dataset that was archived.
The dataset ZIP files are available on the web, and also for bulk access
using Amazon S3 storage, which there are lots of tools for interacting
with (awscli, rclone, etc).
The zip files contain manifests that let you ascertain whether the
dataset is complete. Sometimes files have a way of getting modified or
deleted once downloaded. The bag-nabit command line utility includes a
nabit validate command that performs the necessary
operation.
The manifests are in turn cryptographically signed so that you can
ascertain who did the archiving and when, and decide whether you want to
trust them or not. (again, nabit validate will do the
necessary operations, but more on this below).
I think I might have one more blog post in me about trying to use this
pattern on another dataset.
A Swollen Appendix: Verification
Once you’ve got it installed you can run nabit validate to
validate a bag. But it doesn’t currently help you determine who signed
the dataset, and what the trust chain looks like. I thought it could be
interesting to look at the nuts and bolts of how that works.
The details of inspecting and verifying the signatures was derived from
looking at the bag-nabit
library, and is a bit esoteric to say the least. But hey, we’re going to
be doing cryptography with openssl, so we
shouldn’t be surprised.
First you need to get a dataset, either from the data.source.coop web
application:
And another subject further up the chain, that indicates who signed
their certificate:
C=BE, O=GlobalSign nv-sa, CN=GlobalSign GCC R6 SMIME CA 2023
The next step is to verify that the signature is valid. Remember the
thing that was signed was the tagmanifest-sha256.txt which
in turn asserts the integrity of the entire bag:
It’s February 14th, and that means the return of our annual Valentine Hunt!
We’ve scattered a quiver of Cupid’s arrows around the site, and it’s up to you to try and find them all.
Decipher the clues and visit the corresponding LibraryThing pages to find an arrow. Each clue points to a specific page right here on LibraryThing. Remember, they are not necessarily work pages!
If there’s an arrow on a page, you’ll see a banner at the top of the page.
You have two weeks to find all the arrows (until 11:59pm EST, Friday February 28th).
Come brag about your quiver of arrows (and get hints) on Talk.
Win prizes:
Any member who finds at least two arrows will be
awarded an arrow Badge ().
Members who find all 14 arrows will be entered into a drawing for some LibraryThing (or TinyCat) swag. We’ll announce winners at the end of the hunt.
P.S. Thanks to conceptDawg for the swan illustration!
Batteries are among the technologies that have had a silent, dramatic change over my lifetime.
Last week, as I was setting up a blood pressure cuff for my mother, I opened the compartment in the back and realized I needed 4 AA-sized batteries.
It was once common for devices to ship without batteries, and my inner voice groaned with the thought of having to make a run to the store.
So I was pleasantly surprised when two packs of 2 AA batteries fell from the package.
I don't know if some regulation has come into effect that requires battery-powered devices to include the batteries, or whether they have simply become cheap enough to toss into every package.
But somewhere over the past 20 years, the routineness of batteries has changed.
Today's journey in Thursday Threads takes us down the road of literally electricity storage innovation.
As the world continues to lean towards a more renewable future, advances in battery technology is a race we're all invested in, without even realizing it.
From manufacturing improvements to cost reductions and a decreased environmental impact, the leaps in battery tech are quite palpable.
Plus, one thing I learned this week and a cat picture.
New battery tech is improving the cost, efficiency, and environmental impact of manufacturing.
Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page.
If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.
New battery tech
The race is on to generate new technologies to ready the battery industry for the transition toward a future with more renewable energy. In this competitive landscape, it’s hard to say which companies and solutions will come out on top. Corporations and universities are rushing to develop new manufacturing processes to cut the cost and reduce the environmental impact of building batteries worldwide. They are working to develop new approaches to building both cathodes and anodes—the negatively and positively charged components of batteries—and even using different ions to hold charge. While we can&apost look at every technology that&aposs in development, we can look at a few to give you a sense of the problems people are trying to solve.
Thinking back again on my childhood, I'm old enough to remember jealously guarding the capacity of the 4 D-cell batteries in my portable radio.
I wouldn't dare leave it on while not listening to it...the batteries were an expensive luxury!
(And I would be very annoyed at my brother or sister if they took them out for their own needs.)
This article from 2024 describes advances in battery chemistry and manufacturing that have lowered costs, improved capacities, and reduced environmental impact.
"Harvesting" nickel metal from plant roots
Gouging a mine into the Earth is so 1924. In 2024, scientists are figuring out how to mine with plants, known as phytomining. Of the 350,000 known plant species, just 750 are “hyperaccumulators” that readily absorb sky-high amounts of metals and incorporate them into their tissues. Grow a bunch of the European plant Alyssum bertolonii or the tropical Phyllanthus rufuschaneyi and burn the biomass, and you end up with ash that’s loaded with nickel. “In soil that contains roughly 5 percent nickel—that is pretty contaminated—you’re going to get an ash that’s about 25 to 50 percent nickel after you burn it down,” says Dave McNear, a rhizosphere biogeochemist at the University of Kentucky. “In comparison, where you mine it from the ground, from rock, that has about .02 percent nickel. So you are several orders of magnitude greater in enrichment, and it has far less impurities.”
The article discusses a federal initiative to use plants to extract metals from the soil through their root systems.
They are calling this "phytomining"; interestingly, this application can help remediate land contamination from traditional mining methods.
Thank you, plants!
Battery Myths
For an object that barely ever leaves our palms, the smartphone can sometimes feel like an arcane piece of wizardry. And nowhere is this more pronounced than when it comes to the fickle battery, which will drop 20 percent charge quicker than you can toggle Bluetooth off, and give up the ghost entirely after a couple of years of charging. To make up for these inadequacies, we’ve made all kinds of battery myths. Whether it’s avoiding leaving your phone on charge overnight, or powering off to give the battery a little break, we’re forever looking for ways to eke out a little more performance from our overworked batteries, even if the method doesn’t make an awful lot of sense. To help sort the science from the folklore, we asked a battery expert to give their verdict on some of the most pervasive myths, explain the science behind the rumors and, just maybe, offer us some sage advice on extending the life of our smartphones.
As battery technology has changed, so too must our understanding of them.
The myths:
Even when your battery is at 100 percent, there’s still room for some more charge: True
Charging your phone in airplane mode makes it charge faster: True (kind of)
Having Wi-Fi and Bluetooth on in the background is a big drain on battery life: True
Using an unofficial charger damages your phone: True
Charging your phone through your computer or laptop will damage the battery: False
Powering off a device occasionally helps preserve battery life: False
Batteries perform worse when they’re cold: False (mostly)
Leaving a charger plugged in at the wall and turned on wastes energy: False (well, maybe a tiny bit)
You should let the battery get all the way down to 0 percent before recharging: False
Charging past 100 percent will damage your battery: True (but not for the reason you think)
Replacing your phone battery gives it a new lease of life: True
Read the article for the reasoning behind each.
Of these, I would question the one about using unofficial chargers.
Much effort has gone into standardizing on the USB-C connector and its associate Power Delivery specifications.
I think we are at the point where the standards won't let that happen.
EU requires replaceable batteries by 2027
Motorola StarTac. What a nice piece of 1990s tech. By Banffy - Own work, CC BY-SA 4.0, Link
The new rules stipulate that all electric vehicle, light means of transport (e.g. electric scooters), and rechargeable industrial batteries (above 2kWh) will need to have a compulsory carbon footprint declaration, label, and digital passport. For "portable batteries" used in devices such as smartphones, tablets, and cameras, consumers must be able to "easily remove and replace them." This will require a drastic design rethink by manufacturers, as most phone and tablet makers currently seal the battery away and require specialist tools and knowledge to access and replace them safely.
The big news in mid-2023 was how smartphone manufacturers would need to design products for the European Union market that allowed for batteries to be replaced.
The reason was to enhance sustainability by enabling consumers to easily replace batteries instead of relying on manufacturers or needing special tools.
This used to be the norm; in fact, I remember buying a beefier/bulkier battery for my Motorola StarTac and keeping the original battery in my pocket as a spare.
But the
2023 EU battery regulation goes beyond just personal devices...it impacts all batteries, including automotive and industrial.
The legislation also includes targets for hazardous substances, waste collection, and material recovery from old batteries, aiming for 61% waste collection and 95% material recovery by 2031.
Additionally, there will be requirements for minimum levels of recycled content in new batteries.
These regulations impact rechargeable batteries only, but the EU is also considering rules for non-rechargeable ones too.
And speaking of EU regulations, that is also why USB-C has become the dominant power and data connection for portable devices.
How battery technology impacts the electrical grid
First, there’s a new special report from the International Energy Agency all about how crucial batteries are for our future energy systems. The report calls batteries a “master key,” meaning they can unlock the potential of other technologies that will help cut emissions. Second, we’re seeing early signs in California of how the technology might be earning that “master key” status already by helping renewables play an even bigger role on the grid. So let’s dig into some battery data together.
The article discusses the current state of batteries and their critical role in future energy systems.
Battery storage has become the fastest-growing commercial energy technology, with deployment doubling worldwide in 2023, particularly driven by China’s policies requiring energy storage for new renewable projects.
Batteries are now essential for managing the challenges posed by intermittent renewable energy sources. Take California: batteries have begun to smooth out daily energy demand fluctuations, even becoming the top power source at times as solar energy decreases in the evening. This graph blows my mind.
Despite these promising developments, we have a ways to go if we're going to replace carbon-intensive electricity generation plants. Fortunately, battery costs have plummeted by 90% since 2010, with projections of an additional 40% decrease by the end of this decade, making renewable energy projects more economically viable compared to traditional fossil fuels.
Graph from MIT Technology Review with data from the California Independent System Operator
A fire at the world’s largest battery storage plant in California destroyed 300 megawatts of energy storage, forced 1200 area residents to evacuate and released smoke plumes that could pose a health threat to humans and wildlife. The incident knocked out 2 per cent of California’s energy storage capacity, which the state relies on as part of its transition to use more renewable power and less fossil fuels.
Of course, as we become more reliant on batteries for storage, there is an increased danger from disasters.
A fire at Vistra Energy's Moss Landing battery storage facility in California has caused significant damage, destroying thousands of lithium batteries and 300 megawatts of energy storage capacity.
That is quite a bit bigger than a Tesla car fire.
This is not new, of course; the problem was described in a 2023 article in Wired.
Despite a 97% reduction in battery-related failures globally since 2018, the loss of such a significant storage capacity is concerning for California's renewable energy goals.
The reconstruction of the facility could take years, complicating the state's efforts to reduce fossil fuel dependence.
So take a chunk out of that "renewables" line in the graph from the last article.
What about devices without batteries?
Imagine using a health bracelet that tracks your blood pressure and glucose level that you do not have to charge for the next 20 years. Imagine sensors attached to honeybees helping us understand how they interact with their environment or bio-absorbable pacemakers controlling heart rate for 6–8 months after surgery. Whether submillimeter-scale “smart dust,” forgettable wearables, or tiny chip-scale satellites, the devices at the heart of the future of the Internet of Things (IoT) will be invisible, intelligent, long-lived, and maintenance-free. Despite significant progress over the last two decades, one obstacle stands in the way of realizing next-generation IoT devices: the battery.
I'm pretty sure I'm not ready for "smart dust", but this article got me thinking about the potential of batteryless, energy-harvesting systems that could someday surround us.
As described in earlier articles in this Thursday Threads, there are environmental challenges posed by traditional batteries, including their limited lifespan and harmful manufacturing and disposal processes.
"Batteryless" is as much about how these IoT devices will get power as it is about how programming them will require a different mindset.
This Week I Learned: It takes nearly 3¢ to make a penny, but almost 14¢ to make a nickel
FY 2024 unit costs increased for all circulating denominations compared to last year. The penny’s unit cost increased 20.2 percent, the nickel’s unit cost increased by 19.4 percent, the dime’s unit cost increased by 8.7 percent, and the quarter-dollar’s unit cost increased by 26.2 percent. The unit cost for pennies (3.69 cents) and nickels (13.78 cents) remained above face value for the 19th consecutive fiscal year
I knew pennies cost the U.S. mint more than one cent to make, but I didn't realize that the cost of nickels is so much more out of whack.
I also learned a new word: seigniorage — the difference between the face value of money and the cost to produce it.
What did you learn this week? Let me know on Mastodon or Bluesky.
Mittens and Pickle want to go on a trip
My wife came back from a trip last week.
Do you see the back tuxedo cat among the clothes in the luggage?
Pickle sure wants to go along on the next trip.
It looks like Mittens would be happy to close the suitcase and send Pickle on her way.
I was very happy to be notified today that a bug I reported to Zotero was fixed. Its list of OpenURL resolvers had a North America section, within which were all the United States libraries, plus some Canadian, and then more Canadian libraries were filed under Canada. This was not right. Now it’s fixed, and within North America there are two lists, for Canada and the United States.
I hope Mexico appears there soon—maybe a librarian there, or some Spanish-speaking librarian elsewhere, will send in a URL. And then maybe South America will follow.
This is what it looks like now (shown in operation on my demonstration Zotero account, coloured with a Solarized Light theme):
Screenshot of Zotero, with some collections along the left, showing the Settings > General > Library Lookup configuration being acted on. Resolver is broken down by continent, and North America is Canada and United States. Under Canada is a long list of Canadian universities.
Many thanks, as always, to all the Zotero people for their work on this excellent program.
The Open Knowledge Foundation (OKFN) is excited to announce the launch of the Open Data Day 2025 Mini-Grants Application to support organisations hosting open data events and activities across the world.
This year we are running two separate calls to reflect the different interests in our community. The first call is for the general community, and the second is specifically for activities in French-speaking countries in Africa. See details below:
ODD25 General Open Call
This call is open to any practices and disciplines carried out by open data communities around the world – such as hackathons, tool demos, artificial intelligence, climate emergency, digital strategies, open government, open mapping, citizen participation, automation, monitoring, etc. Applications from Francophone African communities will be automatically added to a specific call (see details below).
A total of 22 events will be supported with a grant amount of USD 300 each, thanks to the sponsorship of the Open Knowledge Foundation (OKFN) and Datopian.
This call is specifically seeking to promote events happening in French-speaking countries in Africa. It’s open for any kind of practice, just like the General Call above, but it must take place in one of the following countries: Democratic Republic of Congo (DRC), Madagascar, Cameroon, Ivory Coast, Niger, Burkina Faso, Mali, Senegal, Chad, Guinea, Rwanda, Burundi, Benin, Togo, Central African Republic, Republic of the Congo, Gabon, Djibouti, Equatorial Guinea, Comoros, and Seychelles.
The deadline for both applications is 23rd February 2024. The call welcomes all registered organisations across the world interested in hosting in-person open data events and activities in their country. An individual cannot apply, and the events cannot be virtual or online.
The events supported by the grants should take place during the Open Data Day from 1st to 7th March 2025 and must be registered at the Open Data Day website. We encourage proposals to try to dialogue in some way with this year’s thematic focus “Open Data to Tackle the Polycrisis”.
Registered civil society organisation events supported by the mini-grants cannot be used to fund government events. The grant payment will be transferred to the successful grantees after their event takes place and once the Open Knowledge Foundation team receives a draft blog post about the event.
Selection Criteria
The submitted events will be assessed by an organising committee made up of members from each sponsoring organisations. Applications will be blind-reviewed and given a score according to the following criteria.
Novelty/creativity of the proposal
Community aspect: to what extent the proposal promotes community involvement (especially local communities)
Achievability of the activity and level of commitment of the organisers when writing the proposal
Diversity in terms of geography, gender, and type of activities
Alignment with Open Data Day 2025 thematic focus
* Organisations taking part in Open Data Day for the first time will receive an extra point, as will organisations that have not received mini-grants in past editions of the event.
The winning organisations will be contacted from 24th February. The official announcement of the list of grantees will be made on 26th February.
About Open Data Day
Open Data Day (ODD) is an annual celebration of open data all over the world. Groups from many countries create local events on the day where they will use open data in their communities. ODD is led by the Open Knowledge Foundation (OKFN) and the Open Knowledge Network.
As a way to increase the representation of different cultures, since 2023 we offer the opportunity for organisations to host an Open Data Day event on the best date over one week: this year between March 1st and 7th. In 2024, a total of 287 events happened all over the world, in 60 countries using 15 different languages.
All outputs are open for everyone to use and re-use.
I gave a little informal presentation about the Stuart Semple / Anish Kapoor feud earlier today. It was a lot of fun. Maybe you’d like to see the slides.
The internet gives us new ways to express ourselves. One of the more strenuously esoteric forms of artistic expression is Strava art, in which people do runs that, when mapped, draw pictures. None of my strava art was particularly good, but my running club friends in Stockholm regularly run "elefanten". I spent a year attempting "Found Strava Art", where you just run a new route and give the run a name based on what it looks like. I ran a lot of flowers and space ships, but meh. Last year I named each run with a line of a song that came up on my iPod. Too obscure.
This year I decided to serialize poems with my Strava runs. I didn't have a plan, but I started with Jabberwocky. It seemed appropriate to comment using nonsense words, because, Jabberwocky. I ended up with this:
’Twas brillig, and the slithy toves did gyre and gimble in the wabe
I love running with my slithy toves!
All mimsy were the borogoves, and the mome raths outgrabe.
My right knee was a grobble mimsy today, but mome what a rath!
Beware the Jabberwock, my son!
Also, the Jabberrun can be hard on the knees.
The jaws that bite, the claws that catch!
ERC hosted run had quiche to bite and George to catch.
He took his vorpal sword in hand
New York Sirens game. Women with vorpal sticks. Slain by the Charge 3-2.
Beware the Jubjub bird, and shun the frumious Bandersnatch!
Definitely well salted and frumious out there today.
Long time the manxome foe he sought
But quick the manxless chill he caught
So rested he by the Tumtum tree
Covered with snow in filagree
And stood a while in thought.
Though clabbercing in a profunctional dot!
And, as in uffish thought he stood
Trolloping thru the Brookdale wood.
The Jabberwock, with eyes of flame
Cheld and hord, a glistering name…
Came whiffling through the tulgey wood
And caught the two burblygums because he could.
And burbled as it came!
So late the Jabberrun slept
For Eight Muyibles passed as though aflame
O'er Curbles and Nonces the pluffy sheep leapt.
One, two! One, two! And through and through
Three four! Three four! Sankofa’s coffee’s fit to pour.
The vorpal blade went snicker-snack!
The Icebeest of Hoth kept blobbering back.
He went galumphing back.
He left it dead, and with its head
... the Garmind sprang to life
And hast thou slain the Jabberwock?
The ice, the snow, it's hard as rock.
Come to my arms, my beamish boy!
Think of my knees! Oy oy oy oy.
O frabjous day! Callooh! Callay!”
O jousbarf night! The fluss! The fright!
He chortled in his joy.
(And padoodled the rest of of the way!)
‘Twas brillig and the slithy toves
Did not, had not, could not loave.
Did gyre and gimble in the wabe
“Dunno.” said the wormly autoclave
All mimsy were the borogoves,
Again and again, beloo and aboave
And the mome raths outgrabe.
The end. Ooh ooh Babe!
Terrible right? But it has its moments.
I've started a new one. I fear it will get more topical.
In challenging times, it’s good for organizations to remember what they exist to do, and what values drive what they do. They may be expressed in a variety of ways, but there are often common threads going through them. There are lots of library mission statements, for instance, including the one for the university library where I work, the public library for the city where I live, and the library that Congress funds for the American people. Their three statements are all worded differently, but they all involve engaging with the communities they serve to provide access to knowledge and promote learning and creativity.
Carrying out that mission is easier said than done. In my last post, I linked to a page the American Library Association posted with core values that help us keep focused on our missions. In this post, I’d like to draw attention to an overlapping but slightly different set of values, values that some have recently called into question but that are crucially important to what libraries do.
Good libraries are diverse. We have to be, to do our jobs well. Our communities are diverse, with all sorts of ages, backgrounds, education levels, ethnicities, language and expressive skills, genders, faiths, interests, and needs for knowledge. To serve them all, we need to have collections that reflect and serve the diversities in our communities, and in those who come into our communities. We need to have staff that have the knowledge and rapport to effectively serve our diverse communities. And we need them to create and support programming that meets our communities’ needs.
Good libraries are inclusive. We can’t serve our communities well in their full diversities if we don’t make a conscious effort to ensure we’re including everyone in those communities as best we can. A town’s public library might have a rich and diverse collection of English-language books for preschoolers and their parents, for instance, but if there’s little in its collections or programs for school-age children, young adults, retirees, or readers of non-English languages, for example, it’s not doing its job as well as it should.
Good libraries are accessible. Libraries won’t be inclusive just by our saying they are. When we invite everyone to use our libraries, we have to make that invitation meaningful by ensuring everyone can reasonably and fairly access them. If we really mean to be inclusive for seniors, for example, we need to make sure that the many seniors who have problems with stairs or small type can use the facilities, websites, and books that our library provides. We need to make sure that our community members who don’t read English well have access to books that they can read, in the languages they know, as well as books that will help people learn English and the other languages used in our communities. When we fail at accessibility, we fail at inclusion.
Good libraries are equitable. Equity is important in its own right as a standard of fairness, and also for ensuring and balancing the other values noted above. Not every specific part of a diverse and inclusive library will be for everybody, or should be for everybody. A book about how to work with the Medicare system will generally not be of use to a preschooler, for instance. Nor should it be if it’s going to effectively serve the needs of the retirees the book is meant for. Likewise, an alphabet rhyming book is unlikely to be of interest to a doctor with no particular interest in children. Similarly, some of the specific programs and initiatives that libraries undertake will be of more use and interest to some parts of their communities than others.
An equitable library ensures that its collections and programs, taken as a whole, fairly balance the needs of the various constituencies in its community. As part of that fair balance, an equitable library also takes into account existing inequities and other deficiencies present in its community, and do its part to alleviate them. A library serving a community with higher than usual unemployment, for example, might devote more resources than other libraries might towards materials and programs that help people get jobs. It might also give special attention to parts of the community that have particularly high unemployment rates.
Good libraries reaffirm and clarify their values when challenged. This can be hard to do sometimes. Some people claim that programs involving diversity, inclusion, accessibility, or equity (or various rearrangements or acronyms of those words) are unjustly discriminatory or illegal. If one of our community members comes to us with a concern like that, it may well be worth listening to. It’s certainly possible to imagine illegal or discriminatory actions being taken under the cover of “DEIA”. It’s also certainly possible to imagine illegal or discriminatory actions being taken under the cover of “combating DEIA”. In either case, we need to make sure that our libraries act in a way that serves our communities fairly, and in line with our values (including the four that I explain above). Putting the word “equity”, say, in big letters on our website does not in itself make us equitable. Nor does removing the word from our website. But explaining what we mean by equity, and putting what we explain into action, can.
Actions mean more than words themselves, but words themselves can be important actions. The keepers of libraries have particular reason to be aware of the power of words, since we’re stewards of so many of them. Words can be promises, both explicit and implicit, and when we speak, others hear what we say, and what we don’t say, and expect us to live up to what they hear. We may shy away from some words when we’re worried about what people who give us funds or support may think about them. But when we do, the people in the communities we serve may also hear our new words and draw their own conclusions about them. In our words and actions, we can decide to protect our institutions with the powerful as best we can. Or we can decide to serve our communities in accordance with our missions and values as best we can. Sometimes those aren’t the same choice.
On 30 January I was lucky to attend CHAOSScon 2025 in Brussels, which brought together open source practitioners, researchers, and community leaders to discuss the latest developments in measuring and improving open source software (OSS) health. This year’s sessions covered key topics like defining open source sustainability, tracking contributions, assessing community health, and evaluating project risks. Below is a recap of the main sessions and insights shared throughout the event.
The conference kicked off with an overview by Daniel Izquierdo of CHAOSS (Community Health Analytics for Open Source Software) and its tools for tracking OSS health. Key takeaways included:
Metrics are essential for assessing the maturity of OSS projects.
GrimoireLab 2.0 offers new capabilities for analyzing software development, including historical data tracking, GDPR-compliant identity management, and a business-layer integration for commercial services.
Major OSS foundations and corporations leverage GrimoireLab for their open source health assessments.
CHAOSScon also marked the launch of the CHAOSS Education Program, designed as a structured entryway into open source. Dawn Foster and Peculiar C. Umeh presented the 3 courses developed by CHAOSS:
Open Source 101: Helping newcomers navigate OSS and find their contribution niche.
CHAOSS governance and operations: Educating users on how the organization works.
Practitioner guides for project managers, OSPOs, and community leaders.
The courses are hosted on Moodle and are designed for both CHAOSS community members and general OSS learners.
Ruth Ikegah then shared that Diversity, Equity, and Inclusion (DEI) remain challenges in OSS. Through her work with local chapters she observed that:
49%+ of OSS content is in English, creating barriers for non-English speakers.
Cultural differences necessitate localized approaches to inclusion.
Challenges like internet access, financial constraints, and lack of OSS education in formal curricula hinder participation from non-Western countries.
We need better strategies for engagement. Some examples she shared are: badging systems, funding, mentorship, and recognizing future leaders.
Paul Sharratt and Cailean Osborne presented a toolkit for measuring how public funding affects OSS sustainability. Some critical points included:
OSS is digital infrastructure, and funding models affect long-term viability.
Different funding types lead to varying levels of impact.
Models for assessing public investment effectiveness in open source.
Katie McLaughlin addressed a quite well-known problem: open source projects often struggle with recognizing contributions beyond code. She therefore highlighted the need for a standardized taxonomy for OSS contributions, as many contributions are still invisible today (e.g., documentation, community engagement).
As an attempt to explore equitable credit systems in OSS, they launched whodoesthe.dev, focused on understanding the open source ecosystems.
Daniel S. Katz then presented CORSA (Center for Open Source Research Software Advancement), an initiative aiming to support open-source research software projects through foundations and metrics in order to improve its sustainability.
Sustainability is a tricky word, because it has so many different meanings. There is a technical sustainability, but also a financial and organizational one (and of course, an environmental sustainability too). Daniel advocated for metrics as a key element to understand the status of a project, and, in the case of financial sustainability, an excellent way to showcase success and attract funding.
Financial sustainability of course affects all the other sustainabilities too, as community engagement and long-term viability require structured support mechanisms.
Security and risk analysis were big topics at CHAOSScon (and at FOSDEM too). As Georg Link explained, this is very much linked to the project health: unmaintained or poorly maintained FOSS dependencies pose security threats, and as FOSS is an integral part of any modern software (over 80 percent of the software in any technology product or service is open source, according to a Linux Foundation study from a couple of years ago!), understanding risks is crucial. The Software Bill of Materials (SBOM) helps track and manage dependencies. Key risk indicators include median response time to pull requests and issue resolution speed. One thing is clear: maintaining project activity and engaging contributors helps mitigate risks.
Another project that can help in risk analysis is OpenChain, which helps developers assess compliance in their software components, using a capability model to grow community excellence. Measuring compliance contributes to risk assessment and regulatory alignment.
The OpenChain tools are available on GitHub for developers to evaluate maturity models.
Katherine Skinner gave a keynote in which she explored the importance of defining values in open infrastructure projects and aligning community values with decision-making to strengthen resilience. Katherine introduced the FOREST framework for values-driven evaluation, emphasizing that human metrics can help reverse-engineer assessments by letting communities define the values they stand for. Additionally, she discussed the challenge of making FOSS needs visible to funders and stakeholders in a way that highlights their significance without discouraging adoption.
Conclusion
CHAOSScon 2025 reinforced the importance of defining, measuring, and sustaining OSS health. Key themes included the role of education, local empowerment, equitable contribution recognition, and risk management. As open source continues to evolve, these discussions provide a roadmap for ensuring sustainability, security, and inclusivity. For further insights, you can access all the presentation slides from CHAOSScon 2025 here, and join the CHAOSS community too!