Planet Code4Lib

Issue 108: Educational Technology / Peter Murray

I've been in or near higher education for my entire career, so it is probably no surprise that educational technology ranks high on DLTJ topics. Although a lot of my experience is with library technology, that isn't the only part of the ed-tech landscape that I'm interested in. Take, for example, these recent Thursday Threads topics:

...and further back, before I started numbering Thursday Threads issues:

This week, I'm pulling that thread into the recent era with seven stories. Plus a thing I learned this week, and this week's cat supervises a printer!

Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page. If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.

Outsmarting "phone prison" pouches

Lauren is one of more than 2 million students in 50 states and 35 countries who scramble each school day to check that one final text or TikTok before sliding their phone into a gray neoprene pouch made by Los Angeles–based Yondr, which brought in over $5 million from government contracts — mainly school districts — in the first three quarters of 2024 alone, according to data service GovSpend. At many schools that use Yondr, each student receives a pouch at the beginning of the school year like they would a textbook. Before entering the building, they snap their pouches shut, then open them on their way out using plate-size magnetic unlocking bases mounted on the walls or rolled out on carts near the exits.
Do Yondr Pouches Really Work? School districts love the “phone prisons.” Students have already figured out how to skirt them, New York Magazine, 5-Feb-2025

My family went to see Jon Stewart in person for a comedy set last year, and it was the first time I had encountered a Yondr pouch. It uses a magnetic clasp to seal the phone in the pouch, and although the pouch doesn't block signals, it makes it impossible to use the camera. Or, as I imagine the school use cases, to read and send texts and browse social media. I can see why this would be compelling to schools seeking to eliminate distractions, but a classroom seems much easier to control than a theater venue. And this article points out that the pouches are expensive, too, and that students have ways around them. (Like tossing a "burner" phone in the pouch and keeping your real one.)

UNESCO calls for schools to ban smartphones

Smartphones should be banned from schools to tackle classroom disruption, improve learning and help protect children from cyberbullying, a UN report has recommended. Unesco, the UN’s education, science and culture agency, said there was evidence that excessive mobile phone use was linked to reduced educational performance and that high levels of screen time had a negative effect on children’s emotional stability.
‘Put learners first’: Unesco calls for global ban on smartphones in schools – Major UN report issues warning over excessive use, with one in four countries already banning the devices, The Guardian

UNESCO has called for a global ban on smartphones in schools, citing concerns over classroom disruption, cyberbullying, and reduced educational performance linked to excessive mobile phone use. What I found surprising was that one in four countries has implemented smartphone bans in schools, including France and the Netherlands. The report also warns against an uncritical embrace of technology, emphasizing that not all technological changes lead to progress...which seems like sound reasoning to me.

Implications of a student-information-system-in-the-cloud hack

On January 7, at 11:10 p.m. in Dubai, Romy Backus received an email from education technology giant PowerSchool notifying her that the school she works at was one of the victims of a data breach that the company discovered on December 28. PowerSchool said hackers had accessed a cloud system that housed a trove of students’ and teachers’ private information, including Social Security numbers, medical information, grades, and other personal data from schools all over the world.... The next morning after getting the email from PowerSchool, Backus said she went to see her manager, triggered the school’s protocols to handle data breaches, and started investigating the breach to understand exactly what the hackers stole from her school, since PowerSchool didn’t provide any details related to her school in its disclosure email.
How victims of PowerSchool’s data breach helped each other investigate 'massive' hack, TechCrunch, 18-Jan-2025

Earlier this year, PowerSchool announced that there had been a data breach of its cloud-based student information system. Its website says it is the largest provider of cloud-based education software for K-12 education in the U.S., serving more than 75% of students in North America. That amounts to 18,000 customers supporting more than 60 million students in the United States alone.

The article discusses the aftermath of the data breach at PowerSchool and how the company was not responsive to questions of the nature of the breach. School administrators were left scrambling for information, and a system administrator at the American School of Dubai took the initiative to investigate the situation. Romy Backus collaborated with peers to create a comprehensive guide detailing how to assess the breach and identify stolen data.

Collaboration is common in the education sector, probably because of the generally limited resources for technology cybersecurity. I've experienced this myself in higher education, where a sense of camaraderie and sharing permeates the profession. I think a lot of the open source movement comes out of education...it would be interesting to know if that feeling is backed up by actual data.

Google declared end-of-life for Chromebooks that schools want to keep using

At a lofty warehouse in East Oakland, a dozen students have spent their summer days tinkering with laptops. The teens, who are part of Oakland Unified’s tech repair internship, have fixed broken screens, faulty keyboards and tangled wiring, mending whatever they can. But despite their technological prowess, there’s one mechanical issue the tech interns haven’t been able to crack: expired Chromebooks. With a software death date baked into each model, older versions of these inexpensive computers are set to expire three to six years after their release. Despite having fully functioning hardware, an expired Chromebook will no longer receive the software updates it needs, blocking basic websites and applications from use.
Built-in software ‘death dates’ are sending thousands of schools’ Chromebooks to the recycling bin, Mercury News

This story has a happy ending (Google later extended its support for Chromebooks to 10 years), but it is a reminder of how much influence technology companies have been given in the education space. The article discusses built-in software "death dates" for Chromebooks, which render many older models obsolete after three to six years despite their hardware still functioning. By 2023, the first round of Chromebooks bought during the pandemic was reaching its original end-of-life, so the monetary expense and the staggering e-waste of still-usable machines were stunning.

Advice for buying educational technology

The ed tech industry’s pandemic-era boom has meant K-12 schools and universities are receiving sales pitches for an abundance of new products—from generative AI writing tools and math tutors to robot security guards and lightboards. But with those choices, and billions of dollars being spent annually on ed tech, educators and school administrators say they also have a problem: There is no mandatory licensing process that certifies that ed tech products work as advertised or that they can be trusted with sensitive student information. Experts have called for countries to establish licensing bodies for educational technology, but for the time being, ed tech companies have largely been left to regulate themselves through voluntary, industry-funded certification programs.
How to Buy Ed Tech That Isn’t Evil: Four critical questions parents and educators should be asking, The Markup

Buying technology you can trust is challenging, often because it seems like "trust" is not a selling point that companies emphasize. These decisions can be even more challenging in the educational technology space, where there are concerns about student privacy. This article offers some suggestions for evaluating technology purchases.

Chromebooks rejected in Denmark over student privacy concerns

Danish privacy regulator Datatilsynet has ruled that cities in Denmark need considerably more assurances about privacy to use Google services that may expose children’s data, reports BleepingComputer. The agency found that Google uses student data from Chromebooks and Google Workspace for Education “for its own purposes,” which isn’t allowed under European privacy law. Municipalities will need to explain by March 1st how they plan to comply with the order to stop transferring data to Google, and won’t be able to do so at all starting August 1st, which could mean phasing out Chromebooks entirely.
Google’s use of student data could effectively ban Chromebooks from Denmark schools: Denmark’s privacy regulator ruled against sharing students’ information with Google, even if it wouldn’t be used for ad targeting, The Verge, 7-Feb-2024

The regulator found that Google uses data from Chromebooks and Google Workspace for Education for its own purposes, violating European privacy laws. This decision stems from concerns that Google’s use of student data for performance analytics and AI development is inappropriate, even if not used for targeted advertising. Google had been in discussions with Danish municipalities since July 2022 to address privacy issues, and it's unclear whether the issue has been resolved. The latest information I could find in English is from September 2024, and it said that "it is still not settled how the municipalities will ensure compliance and accordance with the decision from the DPA."

When in doubt, the easy answer is to filter everything objectionable. It isn't a good answer

CIPA [Children’s Internet Protection Act], a federal law passed in 2000, requires schools seeking subsidized internet access to keep students from seeing obscene or harmful images online—especially porn. School districts all over the country, like Rockwood in the western suburbs of St. Louis, go much further, limiting not only what images students can see but what words they can read. Records obtained from 16 districts in 11 different states show just how broadly schools block content, forcing students to jump through hoops to complete assignments and keeping them from resources that could support their health and safety.
Schools Were Just Supposed To Block Porn. Instead They Sabotaged Homework and Censored Suicide Prevention Sites, The Markup

As my kids were going through high school, they ran into this problem, too, and had to use their mobile devices or the home internet to complete assignments. But I remember this problem back in the mid-2000s when I was asked to serve on a technology advisory committee for public libraries. Internet filters, initially intended to block pornographic content, have crept into blocking access to educational and health resources. The investigation revealed that districts often overblock content, affecting access to vital resources like suicide prevention sites and sexual health information. The Markup found that filtering systems used in schools categorize the internet broadly, leading to significant censorship, especially of LGBTQ+ supportive content while allowing access to anti-LGBTQ+ materials. Clearly, there is a need for a more nuanced approach to web filtering in schools to allow students to access a broad range of information essential to learning and general well-being.

This Week I Learned: It is much harder to get to the Sun than it is to Mars

The Sun contains 99.8 percent of the mass in our solar system. Its gravitational pull is what keeps everything here, from tiny Mercury to the gas giants to the Oort Cloud, 186 billion miles away. But even though the Sun has such a powerful pull, it’s surprisingly hard to actually go to the Sun: It takes 55 times more energy to go to the Sun than it does to go to Mars.
It’s Surprisingly Hard to Go to the Sun, NASA, 8-Aug-2018

I suppose that headline needs some nuance. It is easy to get to the Sun...just escape Earth's gravity and point yourself there. It is hard to get to the Sun in a controlled way that ensures you won't burn up along the way.

What did you learn this week? Let me know on Mastodon or Bluesky.

Mittens is the print supervisor

Black cat perched on a printer, appearing to supervise as a document prints out. The cat is intently focused on the printer's operation.

Sustainable Menstrual Equity: A Case Study on the Success of Low-Cost Menstrual Cup Distribution / In the Library, With the Lead Pipe

Photo by Monika Kozub on Unsplash

In brief

Free menstrual products at libraries are no longer a new phenomenon, thanks to the work of global menstrual equity advocates such as Period.org and Global Menstrual Collective. However, more often than not, these initiatives center around disposable period products. We argue that the work should not stop there. Libraries should explore the distribution of reusable menstrual products, such as menstrual cups and discs, cloth pads, and period underwear. These options are substantially better for the environment, safe to use, and can provide a form of long-term economic support by removing the need to continually buy disposable products. With $10,000 of grant funding, our academic library succeeded in distributing 701 menstrual cups for low cost on campus. Through our first vendor’s buy-one, donate-one policy, an additional 437 cups were donated to the vendor’s global charity partners (resulting in a total 1,138 cups distributed). This initiative not only addresses menstrual equity, with an average saving of $250 a year (and possible $2,500 savings over the ten-year lifespan of a menstrual cup) compared to disposable products, but also highlights the need for sustainable practices within institutional settings, providing a replicable model for others.


Our Menstrual Cup Project

Preparation

Our project started to take shape in Fall 2019, when our library, The Spencer S. Eccles Health Sciences Library, became one of the first buildings at the University of Utah, an R1 public university, to provide freely available disposable pads and tampons within all restrooms. Providing these menstrual products helped relieve some stressors resulting from being a commuter campus, where access to period products for the majority of the campus community would require leaving campus for home or a store. Two staff members within Access Services, Donna Baluchi and Alison Mortensen-Hayes, discussed the feasibility and budget needed to distribute reusable menstrual cups. In February 2020, Mortensen-Hayes researched potential menstrual cup vendors and pricing, as well as funding resources for the project itself. Our team determined the amount of funding we should be seeking and what kind of impact we could have. 

Unsurprisingly, our smaller academic library did not have the ongoing budget for free distribution of reusable menstrual products, so we sought out external funding through grants and/or partnerships with organizations with similar sustainability and equity goals. At the time, we were seeing a wholesale average cost of $17 per menstrual cup and an overwhelming number of vendor options. It’s likely that conducting the same vendor and cost research today would result in even more brands and options, and pricing has likely shifted as well.

Understanding we would need to acquire grant funding, which would require budgetary and project management, we initially formed our project team exclusively with others in our library, primarily part-time student employees in the Access Services department: Olivia Kavapalu and Maha Alshammary. One team member, Olivia Kavapalu, was then able to connect us with two students already involved in sustainability initiatives on campus, Amelia Heiner and Sara Wilson. We were all, coincidentally, looking to do the same project at the same time. Together we were a team of two full-time library staff, two part-time library staff who were also students, and two undergraduate students. It ultimately proved valuable to have a diversity of backgrounds, skills, and audience reach when approaching this project.

To guide our approach, we surveyed the campus community to assess interest in reusable menstrual products and preferences for distribution. This survey was distributed via a newly created Instagram account for the project (@cups.for.uofu), the library’s Instagram (@EHSLibrary), internal campus email lists, and through our personal networks. We asked if people would like a menstrual cup specifically, if they would be willing to pick up on campus or if they would prefer shipping to home (in the midst of COVID-19, an important factor), if they would pay for one (including extra fees for shipping), and how many they would like. This data was then used to better inform our grant application and our logistics planning afterwards. 

Our survey received 75 responses in total. Survey participants were predominantly students and all identified as female, highlighting student-driven demand for menstrual equity initiatives on campus. 53% of all respondents expressed interest in discounted menstrual cups, with a significant portion open to paying a small fee for added convenience in shipping. 20% stated they were not interested, and 21% replied “maybe”. We included an optional open comment for participants, specifically encouraging those who answered “maybe” to explain their hesitancy. Only 15 respondents included a comment: 7 responded that they would get a menstrual cup if it was discounted, 6 that they were nervous about trying menstrual cups, 1 stated that they “did not need one”, and 1 stated they had an IUD but did not elaborate. The majority of respondents only wanted to purchase one or two menstrual cups. We also included an optional open question for “Additional comments or concerns”, which received 10 responses. Nine of those responses included general excitement about the project and the prospect of discounted menstrual cups. Two mentioned wanting size options, and one asked if we would consider including period underwear as a reusable option.

These findings suggest strong community interest in affordable, accessible options for sustainable menstrual products, aligning with broader trends in menstrual equity. These findings also support a worldwide increase in interest in menstrual cups (Why Are Menstrual Cups Becoming More Popular?, 2018). In essence, survey responses were overwhelmingly positive and motivating, and gave us sufficient data to proceed.

Full results of our survey are summarized in Table 1.

Table 1

Would you be interested in purchasing the menstrual cup shown above at a significant discount through the University of Utah? (required with fixed options; 75 responses)
  Yes: 53.33%
  Maybe: 21.33%
  No, I already use a menstrual cup or other zero-waste menstrual product: 5.33%
  No, I don’t want to switch from traditional menstrual products: 20%

If you answered ‘Maybe’ to the above question, why? (optional open comment; 15 responses)
  Nervous about trying menstrual cup: 40%
  Personal belief they do not need one: 6.7%
  Would get one if discounted: 46.7%
  Have IUD: 6.7%

What would be your preferred method of receiving the cup? (optional, select all you would be willing to do; 73 responses)
  Pay $3 in addition to the discounted price to have it shipped directly to you: 56% of all respondents selected
  Pay the discounted price and get free shipping directly to you: 77.3% of all respondents selected
  Pick-up on campus with COVID safety protocols in place: 56% of all respondents selected
  (2.7% did not respond)

How many cups would you purchase, if you did? (optional, select one; 70 responses)
  1: 38.7%
  2: 41.3%
  3: 9.3%
  4+: 4%
  (6.7% did not respond)

To which gender identity do you most identify? (required with fixed options; 75 responses)
  Female: 100%
  Male: 0%
  Nonbinary/Trans: 0%

Which best describes your affiliation to the University of Utah? (required with fixed options; 75 responses)
  Student: 89.3%
  Alumni: 10.7%

Using the vendor research and survey data, we were able to apply for funding with a clear project vision, including budget and timeline. Very fortunately, our institution’s Sustainability Office offers grants called Sustainable Campus Initiative Funds (SCIF) (Sustainable Campus Initiative Fund – Sustainability, n.d.). This grant is funded by student tuition fees and therefore must go toward sustainability projects that primarily benefit students. 

We were notified in March 2021 that we were awarded the $10,000 “medium-sized” grant. This enabled us to take the scope of our project from providing a few menstrual cups within our library to a large distribution across the entire campus. We were thrilled with this prospect since we knew a grant application has no guarantees, and were even warned that these grants are not often awarded to initiatives that are likely to die out without a consistent funding source.

Getting Started After Obtaining Funding

Project members Maha Alshammary and Olivia Kavapalu researched potential campus partners that existed at the time to help advertise our efforts, including the LGBT Resource Center; Women’s Resource Center; Diversity Office; Office of Health Equity, Diversity and Inclusion; Sustainability Office; other libraries on campus; and all student resource centers. Team members of this project decided to use GroupMe as a way to communicate with each other, supplemented with occasional emails, and met virtually exclusively via Zoom. As briefly mentioned before, project member Amelia Heiner created a social media profile on Instagram to get the message out about our project, as well as inform everyone of the sustainability benefits of using a menstrual cup over traditional disposable pads and tampons: https://www.instagram.com/cups.for.uofu/. We planned to use this account to continue educating on the sustainability of reusable menstrual products, offer specific education on menstrual cup use, promote any tabling events we set up, and link to our shipping option. Providing drop-ship would allow patrons to purchase a deeply discounted cup directly from the vendor and have the product shipped to their chosen address.

From April to August 2021 we finalized logistics. Matters we had to settle included vendor options, selecting the cup we wanted to distribute, and how we would table and/or offer direct shipping in the light of campus COVID protocols. Accounting considerations took much of our focus, such as how billing would be handled with the vendor, how we could accept payments on campus, and how we would comply with university regulations. We were not in departments responsible for any of our library’s large purchases, and we did not realize the time-consuming complications that would later arise when having to navigate university purchasing and accounting policies.

Vendor Selection

It was important to our team that our vendor was as sustainable and ethical as our project aimed to be. There are many inexpensive silicone-based menstrual cups from large manufacturers in countries with less regulation that use the term “medical grade”. These are often less expensive, but lack any sustainability stance, medical education for users, or company ethics driving their production. Our choice of vendor emphasized sustainable practices and menstrual equity, supporting the environmental and social goals of the project. We initially chose the vendor Dot Cups Ltd., due to their prior partnerships with colleges and universities and their charitable company policy of donating one cup to organizations of Dot’s choice for every cup purchased, doubling our sustainability impact. Dot did not indicate where they would be donating their charity cups, simply that they would send them to the different global partners they had established. Dot did mention the opportunity for us to get more involved in their charity donation policy and connect them with a local establishment of our choosing. This was something we were excited to hear, but without any established partnerships to pursue this, we defaulted to Dot’s company connections. Dot additionally provided us with marketing material focused on college students who would be first-time menstrual cup users.

We had already begun working with Dot when accounting let us know that our request to work with a single vendor was denied by the university due to policy. Instead, we were required to put the project out to bid to all vendors to encourage fairness in vendor selection. Our public university requires all purchases using institutional funding of over $5,000 to go through this process, wherein multiple vendors are permitted to apply for consideration. After vendor applications are submitted, the central university accounting office makes the decision and selects a single vendor. While the nuances of their selection criteria were unknown to our team and we were not involved in their direct conversations with vendors, we did have the opportunity to suggest specific companies for them to reach out to and solicit an application. We returned to the vendor research we did at the very beginning of this project to supply accounting these additional vendor options. This allowed us to focus on companies that shared our values and could potentially meet the scope of our vision as previously described. 

Fortunately, we did end up working with Dot, though the bidding process delayed the project by a couple of months (which we did not foresee).

Budget Management

Project members tentatively planned on purchasing half of the inventory for distribution on campus, and half to distribute via drop-shipping. However, this was not purchased or acted upon all at once. Accounting advised us to place an order with Dot Cups using only half of our grant funding: $5,000. We were asked to stagger the purchases like this so that we could verify the inventory existed as advertised and calm our accountants’ fears of any fraudulent claims by Dot, which we highly recommend others also do when purchasing at this volume. Therefore, our initial purchase was for 295 cups (at $17 each, tax free due to our nonprofit status as a public university) for in-person sales. At the time, Dot menstrual cups retailed on their website for $34 each.

Our campus’ sustainability grants are often applied for by students, and all grant awardees are assigned an advisor from the campus’ Sustainability Office to guide them through their awarded projects. At the advice of our assigned advisor, we decided to charge a minimal $5 fee per cup, in order to give them psychological weight and discourage waste of the cups themselves. If free, our advisor warned that there is the likelihood of people taking the cups simply because they are free and later throwing them away. In the case of drop-shipping, after discussing with our vendor, we decided on $8 total cost: $5 for the menstrual cup and an additional $3 for shipping. All fees collected were cycled back into the project to buy more cups and stretch the project further. Based on our survey data, we decided to limit each person to purchasing two cups to extend the number of people able to purchase a menstrual cup, and prevent any purchaser from buying many to attempt to resell them. 

Sales

We began soft selling in August 2021 from the library’s front desk while working with the vendor to set up drop-shipments. Our library’s front desk is staffed by part-time, often student, employees, who were not over-burdened by the task. There was no discomfort from anyone with handling the menstrual cups, and the menstruators who worked at the library were especially excited to get access to inexpensive menstrual cups and help with the distribution. We could only take payment using the accounting-approved methods of cash or our university credit card processing merchant services client, which required the use of a webpage and manual input of a credit or debit card. There was minimal training needed for this. Most purchasers were interested in buying a cup and leaving, but purchasers who had questions or wanted more education about insertion techniques were directed to team member Donna Baluchi.

Marketing and Promotion

We advertised our low-cost menstrual cup distribution using every avenue we could think of to capture as much of the campus community as possible. Physical marketing included hanging flyers and posters within student resource offices, bulletin boards, and signage inside any menstrual product baskets within restrooms. Digitally, we took advantage of newsletter campus emails that go out to large audiences and sent some targeted emails as well. Our Instagram, however, is what reached students most effectively, after the posts were continually shared within student “stories”. 

Two of our project members were interviewed for the university’s health sciences newsletter. The interview drew several positive comments, most of which expressed excitement and asked where readers could get the menstrual cups. Comment examples included:

“BRAVO! Well thought out, great investment and execution.” 

“Way to lead the campus on period equity and sustainability!”

Distribution Strategies

We held two tabling distribution events that we heavily marketed through all of our advertising channels, including personal networks. Due to the geographic layout of our university, we determined it would be advantageous to hold one event in September 2021 in the large plaza frequented by all undergraduate students, and another at our library’s open house in October 2021 to reach colleagues and students in the health sciences area of campus, which sits a significant distance from central campus with a large elevation change that deters travel between the two sides. Each tabling event lasted approximately ninety minutes, and we sold over 200 cups between the two events. Nearly every purchaser was a current university student, and many commented that the limited in-store options and initial cost of $30-$40 had prevented them from trying menstrual cups before. The majority purchased only one menstrual cup, and those buying two often commented that the second was for a friend or family member. Fewer than five purchasers asked to buy three or four, stating that family or friends had asked the purchaser to get one for them. In one instance, a purchaser obtained a small batch to put in the student food pantry for free giveaway (paying it forward).

We finalized drop-ship logistics with Dot in early October 2021, and purchased 142 cups for these drop-ship orders to start. We had actually attempted to order 100 cups (as our plan had shifted to ordering the drop-ship inventory in increments, responsive to demand), but Dot realized there was an issue with the invoice when our check arrived – since our campus members would be paying $5 each, the cups were meant to be invoiced at $12 per unit but the bill was set at $17 each. We consented for Dot to keep the extra funding and up the cup inventory we’d be purchasing rather than start the payment process over again with a new invoice and check.
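The invoice mix-up can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above: the $12 and $17 per-cup prices come from the account, while the $1,700 check amount is inferred from a 100-cup order billed at the erroneous rate.

```python
import math

# Reconstruction of the drop-ship invoice mix-up (illustrative only).
# Prices are from the narrative; the check amount is an inference.
intended_cups = 100
erroneous_rate = 17   # dollars per cup, as the invoice was mistakenly set
agreed_rate = 12      # dollars per cup, the intended drop-ship price

check_amount = intended_cups * erroneous_rate         # $1,700 check mailed
cups_covered = math.ceil(check_amount / agreed_rate)  # extra funds bought more cups

print(f"Check sent: ${check_amount}; cups the check covers at $12 each: {cups_covered}")
```

Rounding the quotient up gives the 142 cups Dot ultimately supplied, rather than restarting the payment process with a corrected invoice.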

When drop-ship was publicized via our Instagram account in October 2021, it was incredibly popular, even with the added $3 shipping cost. Within two weeks, over half of the cups had been purchased.

On November 8, 2021, we completely sold out of all inventory from our initial purchases, both in-person and drop-ship. Due to a global silicone shortage during the COVID-19 pandemic, we had exhausted our vendor’s stock. Our vendor representative informed us that we would need to check back after January 2022, once they had been able to restock.

Navigating Challenges

Our team faced several logistical and vendor-related challenges, which provided important lessons for future projects. First, over the course of the project, our group dynamic shifted as students graduated and left and other students asked to be involved. Two members left the project after graduation, and one new member joined after expressing interest when they were hired onto the library’s front desk crew. Students truly drove the passion behind the project, and their networking skills reached audiences many full-time employees cannot. Motivated entirely by their passion and interest in increasing menstrual equity, the students on the project contributed countless unpaid hours to our work. However, working with students also means classes and graduation are their highest priority, and an increased school workload does not leave room for extracurricular activities.

Our project’s biggest challenge was the loss of our original vendor. Dot Cup became uncommunicative, no longer responding to emails. After months of silence, we decided we would need to continue the project with a new vendor. While it was difficult to deal with the extended delays and the loss of a vendor we trusted and respected, we remained committed to achieving the project’s intended sustainability and equity goals. We took a moment to regroup (the project had been going on for over a year at that point) and decided to do one final large purchase and distribution effort.

Thankfully, we did not have to put the project out to bid again as the amount of our remaining funds was under the bidding threshold. This experience highlights the importance of planning for vendor flexibility in case of unforeseen delays or issues.

We collectively researched vendors and chose Saalt to work with next. We selected Saalt because, like Dot Cups, they are a US-based company that makes medical-grade silicone cups, with a demonstrated commitment to sustainability and menstrual education. At the time (and possibly still), Saalt offered different prices depending on whether the products would be distributed for free: buyers intending to give items away (including cups, discs, and period underwear) qualified for a deeply discounted price of $10 per cup, as opposed to the wholesale price of $17 per cup. Since we had already charged $5 during the first round of cup distribution, we felt it would be unfair to distribute the next round for free, so we paid the wholesale price. Unlike our prior vendor, Saalt also offers multiple cup sizes, so in May 2022 we sent out a survey with the company’s “size quiz” to our email networks on campus. The emails went to a list of students and colleagues who had contacted us after we sold out of our initial stock, so that we could notify them once we received our next shipment. We received 42 responses from colleagues and students, and the results were exactly half “regular” and half “small.” At the time of ordering, Saalt did not yet offer their current “teen” size. With the new pricing information, we placed our last vendor order in June 2022 with our remaining funds: 132 small and 132 regular cups (264 total).

Saalt cups arrived without individual packaging, so our team needed extra time to prepare them for distribution by placing each cup in the provided cloth bags. When ready, we held a final tabling distribution event at our library in July 2022. The majority of the cups were sold during our tabling event, and the few remaining cups were sold from the library desk, trickling out over two months. We marketed this final round using our Instagram page, the health sciences campus newsletter, and through signage in our restrooms. 

On October 4, 2022, we sold our last menstrual cups and closed out our project. The small amount of remaining funds from this last round of sales was returned to the university’s Sustainability Office, from which we had received the grant, to be applied to other sustainability-oriented projects.

Concluding Remarks

Sustainability and equity are both driving goals for our library and our institution. They are both found in our library’s strategic plan as well as the five-year strategy goals of our university. This project was a way to enhance both, while providing improved healthcare to our campus community. 

In total, our project team distributed 701 menstrual cups, or 1,138 menstrual cups altogether when counting Dot’s policy of donating a cup for each one purchased. There are positive environmental and fiscal impacts associated with each of these cups. One report states that “a year’s worth of a typical feminine hygiene product leaves a carbon footprint of 5.3 kg CO2 equivalents” (“The Ecological Impact of Feminine Hygiene Products,” n.d.). At 1,138 cups, assuming one cup per menstruator and a longevity of 10 years per cup, our project has the potential to avert over 60,000 kg of CO2 emissions. Using the EPA greenhouse gas calculator, the total potential emissions saved are equivalent to taking 13 cars off the road for a year or keeping 66,000 lbs of coal from being burned (US EPA, 2015). In terms of disposable product waste, at an estimated 22 products used per cycle (“Menstruation Facts and Figures,” n.d.), if each cup serves one person for 10 years, we helped avert the use of 3,004,320 disposable products collectively.
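These environmental totals can be reproduced with simple arithmetic. Below is a minimal sketch using the per-year carbon footprint and products-per-cycle figures from the cited sources; the 12 cycles per year is our inference from the stated totals, not a figure given in the sources.

```python
# Estimated environmental impact of the 1,138 distributed and donated cups,
# assuming one cup per menstruator and 10 years of use per cup.
cups = 1138
years_per_cup = 10

# Cited estimate: 5.3 kg CO2-equivalent per year of disposable product use.
co2e_per_year_kg = 5.3
co2e_averted_kg = cups * years_per_cup * co2e_per_year_kg
print(round(co2e_averted_kg))  # 60314 kg, i.e. roughly 60,000 kg CO2e

# Cited estimate: 22 disposable products per cycle; 12 cycles/year assumed.
products_per_cycle = 22
cycles_per_year = 12
products_averted = cups * years_per_cup * cycles_per_year * products_per_cycle
print(products_averted)  # 3004320 disposable products
```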

Financially, sources claim that menstruators spend an average of $20 per month (or $18,000 in a lifetime) on disposable menstrual products (Female Homelessness and Period Poverty – National Organization for Women, 2021). By this calculation, we potentially saved each menstruator $2,395 over a cup’s expected 10-year life ($2,400 in avoided disposable purchases, less the $5 purchase price).
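The per-person savings figure follows the same back-of-envelope logic; this sketch assumes the cited $20/month spend, a 10-year cup lifetime, and the $5 price we charged, which together reproduce the $2,395 stated above.

```python
monthly_spend = 20        # cited average monthly spend on disposables, USD
cup_lifetime_years = 10   # expected cup lifespan
cup_price = 5             # price we charged per cup

# Cost of disposables over the cup's lifetime, minus the cup's cost.
disposable_cost = monthly_spend * 12 * cup_lifetime_years
savings = disposable_cost - cup_price
print(savings)  # 2395
```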

We received a significant amount of praise and appreciation from the campus community and recently inspired a new group of students to begin a similar project of their own. The Associated Students of the University of Utah (ASUU) partnered with our university facilities department to permanently provide disposable menstrual products in all student academic buildings in 2021 (Menstrual Product Project – ASUU, n.d.). Our hope is to continue these efforts into perpetuity, and we have contacted ASUU representatives to encourage and support any menstrual reusable initiatives they might consider.

Lessons Learned, Advice, & Words of Encouragement

We encourage everyone to consider doing a similar menstrual equity project with reusable products. None of what we accomplished was assigned or part of our job scope or promotion criteria, but we still succeeded. Our project pushed boundaries by bringing often-overlooked issues like menstrual equity and environmental sustainability into public discourse while navigating a complicated accounting process. No project member had prior experience with a project of this size, nor had anyone applied for a large grant before this project. Yet, despite our inexperience and the challenges we faced, we succeeded in making positive, equitable, sustainable change within our workplace and community. The interest and need are there. We believe anyone with a few committed colleagues who can write grants could do this. Nonprofits, schools, health systems, and community aid organizations could all put together a similar project.

Planning and Timeline

Though some of the challenges we faced could easily be avoided in future projects, we would still advise giving yourself plenty of time between project start and completion. Even without a global pandemic, this would more than likely have been a multi-year endeavor.

Do your research on your vendors and their products. Verify they are equitable, sustainable, and aware of what their products contain and where they are manufactured. Certified B Corporations are given that designation when they meet a minimum sustainability standard. Medical-grade silicone should be the minimum for cups, and other reusables should not include any harmful substances. Products advertised as “organic” and “non-toxic,” such as period underwear, have been known to test positive for PFAS contamination (Kluger, 2023), so it is imperative to make sure the product you are distributing does not come with a hidden harm. As part of this, we recommend avoiding the inexpensive, factory-only bulk companies. Period Nirvana is an online education hub and store that details all the recommended menstrual reusables on the market and can help with vendor (or personal) decisions around menstrual products.

Collaborative Partnerships

Partner with others early and often. Doing so from the outset of the project will streamline processes down the line and foster shared ownership. If you are at an academic institution, you likely have sustainability-oriented offices as well as women’s and possibly cultural or diversity centers. We found this was a positive way for multiple types of groups to work together on a shared goal. Health sciences and public health institutions are especially good places to find partners in this work. Other libraries, including public libraries, could partner with nonprofits and community organizations. Also, academic libraries can partner with their local public libraries and vice versa. People committed to menstrual equity are everywhere.

Recognize that you will likely also have to work closely with departments outside your core team, such as your institutional accountants and facilities team. It is crucial to discuss project logistics with your budget office or accounting department as soon as your project vision is in place, especially for projects requiring significant funding. Approach your accounting and facilities colleagues ahead of time to get everyone on board and to find out what expectations will be sustainable. Without this expertise, you likely cannot sustain a project of this size long-term.

Budget and Accounting Management

Speaking of accountants: though Saalt offered drop-shipping, in-person distribution was significantly easier for our accountants to monitor, and after a couple of years and a vendor change, they were fatigued with this project. We should have been talking in more depth with our accountants from the very beginning, as the funding and purchasing of a project like this involved far more complications than we had originally assumed (further complicated by a global pandemic). Beyond the “Sole Source Request” needed to choose our vendor directly (which we assumed would be a simple process, but at large public universities it is not) and then putting the project out to bid, there were payment orders for each inventory purchase, sales tax calculated and paid on each cup sold (even though we purchased them tax-free as a public institution), regular cash deposits (policy prevented the use of direct cash transfer apps like Venmo or PayPal), record keeping, adding a new vendor in various systems (ours, the university’s, theirs), and more. We regret not communicating the full scope of our project to our accounting team from day one, as doing so would have saved time and energy for everyone.

Diversity and Inclusion

Diversify your project team, and especially include teens, young adults, and college students, to cover the many different roles in a project like this. A diverse team, in terms of both skills and demographics, ensures the project has a greater chance of meeting the needs of all interested parties and gaining wider community support. However, note that students may come and go as the project proceeds, and certain times of the year will be more difficult for them to participate. As mentioned previously, our group dynamic shifted when Maha Alshammary and Sara Wilson left our project after graduation, and Graycee San Cebollero asked to join the team after being hired as a part-time library employee (and later assisted in reviewing literature on menstrual cup sustainability impacts). Undergraduate menstruators (approximately ages 17-25) were the age group most interested in menstrual products during our project, and having them on your project team ensures their broader peer social networks can be reached.

Though we were unable to in our project, we advise any future reusable menstrual product distribution effort to provide options in both product type and size. Different bodies have different needs, and different people have different preferences. Throughout our project, team members and purchasers would occasionally mention that reusable pads or menstrual underwear should be our next project.

Last but certainly not least, avoid making your campaign gendered. To encourage inclusion and accessibility for all, the products should not be placed only in women’s designated restrooms and spaces, and publicity should not discourage trans, nonbinary, and gender diverse people from engaging with and benefitting from your service. Avoid using “she/her” or “women” and instead use “menstruators” or, simply, “people” when discussing periods and those of us who have them. We know this is a current culture war. Our library has been attacked online for posting about our menstrual products, and the disposable products in our men’s restrooms have been damaged and vandalized. None of this stopped us. Thanks to advocates across the university, there are now menstrual products in every bathroom on campus, for anyone, and any reason.


Acknowledgment

The authors are incredibly grateful to our publishing editor Jaena Rae Cabrera, and Brittany Paloma Fiedler and Joshua Osondu Ikenna for their encouragement, guidance, and expertise as our peer reviewers. Thank you.

We are also thankful to menstrual equity advocates worldwide, whose pioneering work helped pave the way for this project.

This project would not have been possible without the support of the University of Utah’s Sustainability Office and the Eccles Health Sciences Library Access Services and Accounting departments. We would also like to dedicate this article to Joan Gregory, who initially forged the path towards menstrual equity at our library and our institution.


References

20 Places Around the World Where Governments Provide Free Period Products. (n.d.). Global Citizen. Retrieved August 23, 2024, from https://www.globalcitizen.org/en/content/free-period-products-countries-cities-worldwide/

Bowman, N., & Thwaites, A. (2023). Menstrual cup and risk of IUD expulsion – a systematic review. Contraception and Reproductive Medicine, 8(1), 15. https://doi.org/10.1186/s40834-022-00203-x

Female Homelessness and Period Poverty—National Organization for Women. (2021, January 22). https://now.org/blog/female-homelessness-and-period-poverty/

Harrison, M. E., & Tyson, N. (2023). Menstruation: Environmental impact and need for global health equity. International Journal of Gynecology & Obstetrics, 160(2), 378–382. https://doi.org/10.1002/ijgo.14311

Howard, C., Rose, C. L., Trouton, K., Stamm, H., Marentette, D., Kirkpatrick, N., Karalic, S., Fernandez, R., & Paget, J. (2011). FLOW (finding lasting options for women): Multicentre randomized controlled trial comparing tampons with menstrual cups. Canadian Family Physician Medecin De Famille Canadien, 57(6), e208-215.

Kluger, J. (2023, February 9). PFAS “Forever Chemicals” Are Turning Up in Menstrual Products. Here’s What You Need to Know. TIME. https://time.com/6254060/pfas-period-chemicals-underwear-tampons/

Lomax, B. (2022, May 2). Period. End of Story. American Libraries Magazine. https://americanlibrariesmagazine.org/2022/05/02/period-end-of-story/

Massengale, K. E., Bowman, K. M., Comer, L. H., & Van Ness, S. (2024). Breaking the period product insecurity cycle: An observational study of outcomes experienced by recipients of free period products in the United States. Women’s Health, 20, 17455057241267104. https://doi.org/10.1177/17455057241267104

Menstrual cups gain popularity. (n.d.). Mayo Clinic Health System. Retrieved January 8, 2025, from https://www.mayoclinichealthsystem.org/hometown-health/speaking-of-health/menstrual-cups-vs-tampons-things-you-might-not-know-about-the-cup

Menstrual Product Project—ASUU. (n.d.). Retrieved August 23, 2024, from https://www.asuu.utah.edu/menstralproductsproject/

Menstruation Facts and Figures. (n.d.). AHPMA. Retrieved August 23, 2024, from https://www.ahpma.co.uk/menstruation_facts_and_figures/

Montano, E. (2018). The Bring Your Own Tampon Policy: Why Menstrual Hygiene Products Should Be Provided for Free in Restrooms Notes. University of Miami Law Review, 73(1), [xi]-412. https://heinonline.org/HOL/P?h=hein.journals/umialr73&i=380

Mouhanna, J. N., Simms-Cendan, J., & Pastor-Carvajal, S. (2023). The Menstrual Cup: Menstrual Hygiene With Less Environmental Impact. JAMA, 329(13), 1114–1115. https://doi.org/10.1001/jama.2023.1172

Parent, C., Tetu, C., Barbe, C., Bonneau, S., Gabriel, R., Graesslin, O., & Raimond, E. (2022). Menstrual hygiene products: A practice evaluation. Journal of Gynecology Obstetrics and Human Reproduction, 51(1), 102261. https://doi.org/10.1016/j.jogoh.2021.102261

Peberdy, E., Jones, A., & Green, D. (2019). A Study into Public Awareness of the Environmental Impact of Menstrual Products and Product Choice. Sustainability, 11(2), Article 2. https://doi.org/10.3390/su11020473

Rawat, M., Novorita, A., Frank, J., Burgett, S., Cromer, R., Ruple, A., & DeMaria, A. L. (2023). “Sometimes I just forget them”: Capturing experiences of women about free menstrual products in a U.S. based public university campus. BMC Women’s Health, 23(1), 351. https://doi.org/10.1186/s12905-023-02457-2

Schlievert, P. M. (2020). Menstrual TSS remains a dangerous threat. eClinicalMedicine, 21. https://doi.org/10.1016/j.eclinm.2020.100316

Shearston, J. A., Upson, K., Gordon, M., Do, V., Balac, O., Nguyen, K., Yan, B., Kioumourtzoglou, M.-A., & Schilling, K. (2024). Tampons as a source of exposure to metal(loid)s. Environment International, 190, 108849. https://doi.org/10.1016/j.envint.2024.108849

Sustainable Campus Initiative Fund—Sustainability. (n.d.). Retrieved August 23, 2024, from https://sustainability.utah.edu/scif/

The Ecological Impact of Feminine Hygiene Products. (n.d.). Technology and Operations Management. Retrieved August 23, 2024, from https://d3.harvard.edu/platform-rctom/submission/the-ecological-impact-of-feminine-hygiene-products/

US EPA, O. (2015, August 28). Greenhouse Gas Equivalencies Calculator [Data and Tools]. https://www.epa.gov/energy/greenhouse-gas-equivalencies-calculator

Why are menstrual cups becoming more popular? (2018, October 4). https://www.bbc.com/news/business-45667020

Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 18 February 2025 / HangingTogether

The following post is one in a regular series on issues of Inclusion, Diversity, Equity, and Accessibility, compiled by a team of OCLC contributors.

Two covers for the book A Domestic Cook Book: Containing a Careful Selection of Useful Receipts for the Kitchen by Malinda Russell

Earliest known African American cookbook republished

One of the items in the Janice Bluestein Longone Culinary Archive, held at the University of Michigan (OCLC Symbol: EYM) Special Collections Research Center, is a humble 30-page paper-bound volume titled A Domestic Cook Book: Containing a Careful Selection of Useful Receipts for the Kitchen. Written by Malinda Russell and published in 1866 in Paw Paw, Michigan, it is the only known copy of the earliest known cookbook written by a Black author in the United States. Although the book has been digitized and is available online, it has now been republished in a new edition that is available both in print and online as an open access ebook.

I’m highlighting this example during US Black History Month as an example of the many treasures held in libraries and archives, but also of the role that cultural heritage institutions play as publishers and disseminators of rare content, making materials available for all to enjoy and learn from. The book has recipes that, to a modern eye, lack important details like baking time or temperature. To prepare “French Lady Cake” the full instructions are “Three cups sugar, one do. butter, six eggs, one cup sweet milk, one teaspoon soda, two do. cream tartar, one wine-glass brandy, the juice of one lemon, four cups flour; the soda dissolved in the milk, the cream tartar in the flour.” The book also contains recipes for things like “Restoring the Hair to its Original Color,” which is equally brief: “Lac Sulphuris two drachms, rose water eight ounces. Shake it thoroughly, and apply every night before going to bed.” The newly published edition also contains a foreword by Dr. Rafia Zafar, an expert in foodways and literature, which sets Russell’s accomplishments in context. Contributed by Merrilee Proffitt

WebJunction webinar on neuroinclusive library workplaces

On 4 March, OCLC’s WebJunction will host a free webinar, Embracing neurodiversity: Cultivating an inclusive workplace for neurodivergent staff, presented by accessibility consultant and librarian Renee Grassi. Webinar attendees will learn about the meaning of the term neurodiversity, the strengths of neurodivergent people, and ways of making the workplace more neuroinclusive. The blog post “Supporting Neurodiversity in the Library Workplace” by Bobbi L. Newman provides a summary of general articles about neurodiversity in the workplace and a bibliography of resources.

I began writing about neurodiversity in libraries for Advancing IDEAs on 25 July 2023. As a neurodivergent person, I was thrilled to see so many excellent projects and articles focused on serving neurodivergent patrons, including the University of Washington’s Autism Ready-Libraries Toolkit. However, as a neurodivergent librarian, I was somewhat frustrated that discussions about making libraries more neuroinclusive often did not address the library as a workplace. In February 2025, I am thrilled to see these discussions happening in multiple spaces, including webinars, conference presentations, and the University of Washington’s current research project Empowering Neurodivergent Librarians. I look forward to attending this WebJunction webinar next month and seeing what other educational opportunities about this topic will emerge in 2025. Contributed by Kate James.

Documentaries focus on libraries

It isn’t every day that we hear about a film concerning libraries and what we are facing in the current moment. So, think how surprising it is to discover two such documentaries right now. At the January 2025 Sundance Film Festival in Park City, Utah, the documentary The Librarians, by producer-director Kim A. Snyder, premiered and is expected to be available on a major streaming platform soon. The Librarians features numerous colleagues of ours from the American Library Association (ALA) and the American Association of School Librarians (AASL) who have battled for our intellectual freedoms in such places as Florida, Louisiana, and Texas. Read about The Librarians in the ALA news item, “American Library Association and American Association of School Librarians Members Featured in Documentary ‘The Librarians’ to Walk the Red Carpet at Sundance Premiere.” If your local community has not already scheduled a showing of the other documentary, “Free for All: The Public Library,” be aware that its broadcast premiere is set for Tuesday, 29 April 2025, as part of the PBS show Independent Lens.

The Librarians examines the wave of censorship spreading across the United States, especially targeting racial and LGBTQ+ issues and resources. Librarians stand tall at the center, protecting access for all users and opposing legislation trying to criminalize the work we do. “Free for All” traces the history of how public libraries became a formative institution vital for the preservation of democracy. It counters the notion of library obsolescence with the fact that libraries are local treasures where resources are available and free to all, even in this fraught political era. Contributed by Jay Weitz.

The post Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 18 February 2025 appeared first on Hanging Together.

Author Interview: Rosanne Limoncelli / LibraryThing (Thingology)

Rosanne Limoncelli

LibraryThing is pleased to sit down this month with filmmaker and author Rosanne Limoncelli, the Senior Director for Film Technologies at the Kanbar Institute and at the Martin Scorsese Virtual Production Center, both part of NYU’s Tisch School of the Arts. She has written, directed, and produced documentaries, educational films and short narrative films, and has taught writing and filmmaking for more than three decades. Limoncelli’s first book, Teaching Filmmaking: Empowering Students Through Visual Storytelling, was published in 2009. She has published short stories in the Alfred Hitchcock Mystery Magazine, Suspense Magazine and Noir Nation. Her debut mystery novel, The Four Queens of Crime—offered in our January Early Reviewers batch—is due out next month from Crooked Lane Books. Limoncelli sat down with Abigail to answer some questions about her new book.

The Four Queens of Crime follows a woman detective chief inspector in 1930s London who enlists the aid of four famous mystery writers—Agatha Christie, Dorothy L. Sayers, Ngaio Marsh, and Margery Allingham—in solving her case. How did the idea for the story first come to you?

I love reading biographies of my favorite authors because I always wonder what experiences from their lives might’ve made it into their books. I love the psychology. Reading about Agatha Christie led me to the other three and it fascinated me that these four women were the bestselling authors of the 30’s. How amazing was that! The lives of Ngaio Marsh, Dorothy L. Sayers and Margery Allingham were just as fascinating, and of course I started wondering if they had ever met and that led me to, what if they did meet and got involved in a murder case? Would there have been a woman DCI they could’ve collaborated with? And then I found Lilian Wyles, the first woman DCI at Scotland Yard. And miraculously, she had a memoir!

Were you an admirer of these four authors’ work, before beginning your book? Which one is your favorite, and why?

That’s too hard a question! I think that sometimes I’m in the mood for one author over another, and they constantly switch places for number one. I love Christie’s puzzles, Ngaio’s characters, Allingham’s language, and the patter between Sayers’ protagonists. In the book they talk to each other about writing and how different they are from each other. That was one of the fun things about writing them as characters. I will say that for each author I have my favorite titles in all formats (hardcover, Kindle, and audiobook), and I go back to them often, not just for research reasons; I need to keep in touch with the main characters like they are real people in my life.

What sort of research did you have to do on your four queens, in order to incorporate them as characters in your story, and what were some of the most interesting things you learned?

I’m a research nerd, and I went way overboard researching all four authors, consuming all their books, plus articles, biographies, documentaries, movies and tv shows of their work, and the time period, 1938. I’m lucky that my husband has always been into the history of World War Two so we watched a lot of feature films and documentaries from and about that era. The four queens all came alive for me quickly, mainly through their biographies. I found it interesting to notice the differences between the four writers, as well as their similarities. For example, they were all big lovers of Shakespeare, they each had very different writing styles, they all grew up so differently. Agatha was home schooled, Dorothy was one of the first women to get a degree from Oxford, Ngaio was a painter and travel writer before she wrote mysteries, and Margery grew up surrounded by writers. I got very interested in the accuracy of their real lives pertaining to my story, figuring out the possible real time they could’ve spent together. The spring of 1938 would’ve been the last chance for them to meet before the war, since Ngaio Marsh returned to New Zealand shortly after that spring. I also noticed that they all had a change in their writing careers right about that time, so I imagined that the experiences on that weekend of my imagined murder changed them personally to bring about that literary change.

What influence has your career as a filmmaker had on your mystery writing? Would you say you were a visual storyteller? Do you see the scenes and characters before writing them?

I do think I see scenes and characters before I write them, which actually can make it more challenging for me because I forget I have to translate my visual imagination into text so I often leave things out without noticing. An early reader will mention they’d like more description of a certain place or person, and since I see it so clearly in my mind, I have to remind myself that no one else can see it, that I actually have to put it into words! But what is the same for me, in writing films and writing fiction, is the story structure. The logic and sequence of what happens and what should happen next is my favorite part and I make charts and spreadsheets and notes and lists obsessively before writing a project and throughout the whole process. It’s the puzzle of the story that I love the most. Building it up, breaking it down, deciding on the clues and all the information that leads to the climax and makes for a convincing ending, sorting and resorting every detail until it makes sense to me and I’m satisfied with it.

You have written short stories, films, and an academic text, but this is your first novel. Was the writing process any different, when working on this kind of text?

Technically this is the fifth novel I’ve written, just the first one to be published. (Keep writing out there, writers!) Each project is a bit different for me, but one thing that was quite similar in this project and the academic text (which stemmed from the dissertation for my PhD) was the research. In both cases I didn’t know exactly what I was going to write, at first, but I kept reading what interested me and taking lots of notes and underlining sentences, and marking sections with Post-It notes and noting links of websites and movie clips, then when it had gathered a certain satisfying accumulation, I stopped. I looked at everything I had gathered, all the notes, and sections, and visual images, etc. and it all seemed to magically come together thematically and emotionally. Like I was making a collage that found its shape from my subconscious. I think that the story starts to form itself in the back of my mind, while I’m gathering the research, and the story writing is easier after that once I get down to it.

What’s next? Will there be more stories featuring DCI Lilian Wyles? Might there be a film adaptation?

I am working on a sequel that takes place two years later. The war is raging and there are different problems to solve. This story is still a murder mystery puzzle, and Lilian Wyles leads the case, with help from the four queens, but it has a bit of a spy thriller spice added to it. I’m constantly inspired by the parallels from that time and our current day issues, there are so many similarities. As far as a film adaptation, I’d love to adapt The Four Queens of Crime into a feature or a tv series, we’ll see what happens!

Tell us about your library. What’s on your own shelves?

Besides the Four Queens of course, because they fill a lot of my shelves! (And by my “shelves” I mean paper books, E-readers and audio books.) A sampling of other books on my shelves includes: P.D. James, John Irving, John Le Carré, Edith Wharton, Raymond Chandler, Graham Greene, and I’m a science fiction fan also, Octavia E. Butler, William Gibson, N.K. Jemisin, and Ray Bradbury.

What have you been reading lately, and what would you recommend to other readers?

I just finished reading every Mick Herron book in the Slow Horses series. The TV show is great and the audio books are also very well done. I just read Still Life by Louise Penny, who is amazing! I can’t wait to read all of her work. I just started All Systems Red by Martha Wells. I will definitely be reading the whole Murderbot series. When I am trying to make a lot of progress with my writing projects, I have to ban myself from reading because I won’t get enough writing done or get enough sleep!

Bagging data.gov / Ed Summers

I had a bit of insomnia after my dog woke me up at 3am. I found myself reading the code behind Harvard Law Library’s effort to archive data.gov. It didn’t help me get back to sleep 😵‍💫

On the one hand, it is enraging that this effort is deemed to be in any way necessary at this time. But on the other, I do like how the team at Harvard Law Library is offering a useful pattern for the work of archiving data from the web, and one where it isn’t really a one-size-fits-all software “solution”, but rather an approach.

This post grew out of a thread looking at how the various pieces of this pattern fit together, and why I think it’s useful, but I thought I would move it here and add a bit more context.

Why?

But before that, why is archiving datasets from the web even necessary? Don’t we have the End of Term Web Archive, which collects widely from US Federal Government websites when administrations change? These archives end up in the Internet Archive’s Wayback Machine, and the Internet Archive itself does extensive archiving of the web. But finding datasets in web archives can be difficult, and when you do find something, what’s there isn’t always complete or easily usable.

For example, here is the USAID GHSC-PSM Health Commodity Delivery Dataset published in data.gov:

https://catalog.data.gov/dataset/usaid-ghsc-psm-health-commodity-delivery-dataset

If you click to download the CSV file you’ll get an obscure message: Cannot find view with id tikn-bfy4. This is because the Trump Administration deleted USAID’s website in early February; data.gov is a clearinghouse for datasets published elsewhere in the .gov web space. You can find 7 snapshots of the dataset’s webpage in the Wayback Machine, but if you look at the recent ones you’ll get a similar error message when trying to download the CSV dataset.

Fortunately, if you go back far enough through the previous snapshots you will eventually find one that works. data.gov was designed to allow crawlers to locate datasets by following links, but not all web dashboard and dataset catalog software is built this way. Datasets can often be hidden behind search form interfaces, or obscured by JavaScript-driven links that prevent web-archiving robots from downloading them.

And lastly, web archiving software puts data into WARC files, which are great for replay systems like the Wayback Machine (when you know what you are looking for). But researchers typically want the data itself (e.g. CSV, Excel, shapefiles, etc.) and don’t want to have to extract it from WARC data. It’s these users that the Harvard Law Library was thinking about when they designed their data.gov archive.

The Process

So how does Harvard’s archiving of data.gov work?

The first part is the fetch-index process in the data-vault project, which uses data.gov’s API to page through all of the datasets and saves the metadata to a SQLite database (which also includes the JSON from the API).

When I tried it, it ran for about an hour and collected the metadata for 306,598 datasets. It was designed so that it could be rerun to get incremental updates.
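
The paging pattern can be sketched in a few lines of Python. This is illustrative only, not the actual fetch-index code: `iter_datasets` and `fetch_page` are hypothetical names, and the stubbed fetch function stands in for a CKAN-style search endpoint like data.gov’s so the sketch runs offline.

```python
def iter_datasets(fetch_page, page_size=1000):
    """Yield dataset metadata dicts, one page at a time.

    fetch_page(start, rows) should return a list of dataset dicts
    (empty once we run off the end), mimicking the results array of a
    CKAN-style package_search endpoint.
    """
    start = 0
    while True:
        page = fetch_page(start, page_size)
        if not page:
            break
        yield from page
        start += page_size

# Stub out the API so the sketch runs without network access.
fake_catalog = [{"id": f"dataset-{i}"} for i in range(2500)]

def fake_fetch(start, rows):
    return fake_catalog[start:start + rows]

datasets = list(iter_datasets(fake_fetch))
print(len(datasets))  # 2500
```

The real process does this against the live API and writes each page’s JSON into the SQLite database, which is what makes incremental re-runs possible.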

Next the fetch-data utility (also in the data-vault repo) iterates through all the datasets in the SQLite database, and builds a set of URLs that are referenced in the dataset description.

Crucially, this list will include URLs extracted from the “resources” JSON data structure that includes referenced data files:

A “resource” in data.gov JSON metadata

Also crucial is the fact that not all datasets are directly referenced from data.gov – data.gov sometimes just links to a web page where the data files can be downloaded. These aren’t currently being collected by the data-vault code, and it’s a little unclear how many datasets are in that state.
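
To illustrate, here is a hypothetical sketch of pulling downloadable URLs out of a dataset’s metadata. The "resources"/"url" field names follow data.gov’s CKAN-style JSON, but the function name and the sample record (including its URLs) are made up for this example and are not data-vault’s code.

```python
def resource_urls(dataset):
    """Collect the downloadable file URLs from a dataset record."""
    return [r["url"] for r in dataset.get("resources", []) if r.get("url")]

# A made-up record shaped like data.gov's CKAN metadata.
dataset = {
    "name": "usaid-ghsc-psm-health-commodity-delivery-dataset",
    "resources": [
        {"format": "CSV", "url": "https://data.usaid.gov/api/views/tikn-bfy4/rows.csv"},
        {"format": "RDF", "url": "https://data.usaid.gov/api/views/tikn-bfy4/rows.rdf"},
    ],
}
print(resource_urls(dataset))
```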

fetch-data then collects up the list of URLs and the dataset metadata, and hands them off to a tool called bag-nabit, or nabit for short:

At a high level, nabit:

  1. downloads all the URLs to disk
  2. writes the dataset metadata to disk
  3. packages up the data and metadata into a BagIt directory (RFC 8493)
  4. records provenance about who did the work and when they did it
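
Step 3 can be illustrated with a toy BagIt writer. This is my own minimal sketch of the RFC 8493 layout (a bagit.txt declaration, a data/ payload directory, and a manifest-sha256.txt of fixities), not bag-nabit’s implementation, which also writes tag manifests and signatures:

```python
import hashlib
import pathlib
import tempfile

def make_bag(bag_dir, files):
    """Write a minimal BagIt bag: bagit.txt, data/ payload, sha256 manifest."""
    bag = pathlib.Path(bag_dir)
    (bag / "data").mkdir(parents=True)
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n"
    )
    manifest_lines = []
    for name, content in files.items():
        (bag / "data" / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}")
    (bag / "manifest-sha256.txt").write_text("\n".join(manifest_lines) + "\n")

with tempfile.TemporaryDirectory() as tmp:
    make_bag(f"{tmp}/bag", {"rows.csv": b"id,qty\n1,10\n"})
    manifest_text = pathlib.Path(tmp, "bag", "manifest-sha256.txt").read_text()

print(manifest_text)
```

Each manifest line pairs a SHA-256 digest with a payload path, which is what later makes completeness checks possible.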

Control then returns to data-vault, which puts the data on Amazon S3.

The S3 bucket is being provided by the Source Cooperative, a project of Radiant Earth, which received a Navigation Fund grant to make open data available on the web. You can find the bucket at s3://us-west-2.opendata.source.coop.

There are currently 623,195 files totaling 16.3 TB in the /harvard-lil/gov-data region of the bucket.

Provenance

Drilling down a little bit more: when nabit downloads the data from a URL, it also records the HTTP request and response traffic as a WARC file (ISO 28500:2017), which is placed alongside the data itself in the BagIt data directory.

The WARC file doesn’t contain the content of the dataset, but just the metadata about the HTTP communication. This provides a record of when and how the data was fetched from the web.

The names of the WARC file and the downloaded data files are listed along with their fixities in the BagIt manifest: manifest-sha256.txt. You can use the manifest to check whether the dataset is complete, and to notice when files have been deleted or changed.

The tagmanifest-sha256.txt lists all the “tag” files and their fixities, including manifest-sha256.txt itself. This means you can notice if the manifest has been altered.
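
That kind of fixity check can be sketched in a few lines of Python. This is an illustrative checker of my own, not the library’s implementation; since manifest-sha256.txt and tagmanifest-sha256.txt both use the same "<hexdigest>  <relative-path>" line format, one function covers both:

```python
import hashlib
import pathlib
import tempfile

def check_manifest(bag_dir, manifest_name="manifest-sha256.txt"):
    """Return a list of problems: files missing from disk, or files whose
    sha256 no longer matches the digest recorded in the manifest."""
    bag = pathlib.Path(bag_dir)
    problems = []
    for line in (bag / manifest_name).read_text().splitlines():
        recorded, path = line.split(None, 1)
        target = bag / path
        if not target.exists():
            problems.append(f"missing: {path}")
        elif hashlib.sha256(target.read_bytes()).hexdigest() != recorded:
            problems.append(f"altered: {path}")
    return problems

# Build a tiny one-file bag, check it, then tamper with it and check again.
bag = pathlib.Path(tempfile.mkdtemp())
(bag / "data").mkdir()
content = b"id,qty\n1,10\n"
(bag / "data" / "rows.csv").write_bytes(content)
digest = hashlib.sha256(content).hexdigest()
(bag / "manifest-sha256.txt").write_text(f"{digest}  data/rows.csv\n")

ok = check_manifest(bag)   # []
(bag / "data" / "rows.csv").write_bytes(b"tampered")
bad = check_manifest(bag)  # ["altered: data/rows.csv"]
```

Running the same function against tagmanifest-sha256.txt is what tells you whether the manifest itself has been altered.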

The extra bit that Harvard have added is a top level signatures directory in the BagIt file structure, which contains digital signatures for the tagmanifest-sha256.txt file. These signatures are generated using an SSL key and an SSL certificate, which (should) let you know who signed the package. The certificate “chain” lets you know who verified they are who they say they are. So for example you could use your web server’s SSL key and certificate to sign the package.

And finally, they also use a Time Stamp Authority to sign the tagmanifest-sha256.txt signature, which in turn allows you to verify that the file was signed at a particular time. Together these signatures allow you to verify who signed the dataset, who trusts them, and when they did it. The manifest allows you to ensure the data is complete, and the WARC file lets you see how the data was collected from the web.

In Summary

So the advantages that this presents over the traditional web archiving approach are that:

  • Researchers work with ZIP files, which they can unpack to inspect the metadata and interact directly with the archived dataset.
  • The dataset ZIP files are available on the web, and also for bulk access in Amazon S3 storage, for which there are lots of tools (awscli, rclone, etc.).
  • The ZIP files contain manifests that let you ascertain whether the dataset is complete. Sometimes files have a way of getting modified or deleted once downloaded. The bag-nabit command line utility includes a nabit validate command that performs the necessary checks.
  • The manifests are in turn cryptographically signed so that you can ascertain who did the archiving and when, and decide whether you want to trust them or not. (again, nabit validate will do the necessary operations, but more on this below).

I think I might have one more blog post in me about trying to use this pattern on another dataset.


A Swollen Appendix: Verification

Once you’ve got bag-nabit installed you can run nabit validate to validate a bag. But it doesn’t currently help you determine who signed the dataset or what the trust chain looks like. I thought it could be interesting to look at the nuts and bolts of how that works.

The details of inspecting and verifying the signatures were derived from looking at the bag-nabit library, and are a bit esoteric to say the least. But hey, we’re going to be doing cryptography with openssl, so we shouldn’t be surprised.

First you need to get a dataset, either from the data.source.coop web application:

$ wget https://data.source.coop/harvard-lil/gov-data/collections/data_gov/usaid-ghsc-psm-health-commodity-delivery-dataset/v1.zip

or you can download one (or more) directly from the Amazon S3 bucket:

$ aws s3 cp s3://us-west-2.opendata.source.coop/harvard-lil/gov-data/collections/data_gov/usaid-ghsc-psm-health-commodity-delivery-dataset/v1.zip v1.zip

Then you unzip it:

$ unzip -d v1 v1.zip
Archive:  v1.zip
  inflating: v1/bagit.txt
  inflating: v1/tagmanifest-sha256.txt
  inflating: v1/bag-info.txt
  inflating: v1/manifest-sha256.txt
  inflating: v1/data/signed-metadata.json
  inflating: v1/data/headers.warc
  inflating: v1/signatures/tagmanifest-sha256.txt.p7s.tsr
  inflating: v1/signatures/tagmanifest-sha256.txt.p7s
  inflating: v1/signatures/tagmanifest-sha256.txt.p7s.tsr.crt
  inflating: v1/data/files/columns.json
  inflating: v1/data/files/c54cb869-4a55-4a2a-8c3b-0b78dc4b9bfa.bin
  inflating: v1/data/files/tikn-bfy4.html
  inflating: v1/data/files/rows.json
  inflating: v1/data/files/rows.rdf
  inflating: v1/data/files/catalog.json
  inflating: v1/data/files/rows.xml
  inflating: v1/data/files/legalcode.html
  inflating: v1/data/files/columns.rdf
  inflating: v1/data/files/tikn-bfy4.json
  inflating: v1/data/files/rows.csv
  inflating: v1/data/files/columns.xml
  inflating: v1/data/files/catalog.jsonld
 extracting: v1/data/files/usaid.png
  inflating: v1/data/files/data.json
  inflating: v1/data/files/dcat-us.html

The dataset can be found in the BagIt “payload” directory (data). The signatures can be found in the “signatures” directory.

First of all it’s useful to see who signed the dataset:

$ cd v1
$ openssl cms -cmsout -in signatures/tagmanifest-sha256.txt.p7s -inform PEM -print

CMS_ContentInfo:
  contentType: pkcs7-signedData (1.2.840.113549.1.7.2)
  d.signedData:
    version: 1
    digestAlgorithms:
        algorithm: sha256 (2.16.840.1.101.3.4.2.1)
        parameter: <ABSENT>
    encapContentInfo:
      eContentType: pkcs7-data (1.2.840.113549.1.7.1)
      eContent: <ABSENT>
    certificates:
      d.certificate:
        cert_info:
          version: 2
          serialNumber: 12491362874156253824700607432
          signature:
            algorithm: sha256WithRSAEncryption (1.2.840.113549.1.1.11)
            parameter: NULL
          issuer:           C=BE, O=GlobalSign nv-sa, CN=GlobalSign GCC R6 SMIME CA 2023
          validity:
            notBefore: Dec  5 20:10:46 2024 GMT
            notAfter: Dec  6 20:10:46 2025 GMT
          subject:           C=US, ST=Massachusetts, L=Cambridge, organizationIdentifier=NTRUS-166027, O=Library Innovation Lab, CN=lil@law.harvard.edu, emailAddress=lil@law.harvard.edu
          key:           X509_PUBKEY:
            algor:
              algorithm: rsaEncryption (1.2.840.113549.1.1.1)
              parameter: NULL
            public_key:  (0 unused bits)
              0000 - 30 82 01 0a 02 82 01 01-00 a8 61 ab b1 38   0.........a..8
              000e - ff 67 f0 e7 e9 85 54 e3-c4 7e 20 e3 9f e2   .g....T..~ ...
              001c - 06 16 05 60 f2 94 bb 5b-be 57 f6 d3 d2 43   ...`...[.W...C
              002a - 4c 8e 49 65 6e 3a e4 c3-de 1a d0 a2 62 e1   L.Ien:......b.
              0038 - d7 09 a0 2a 18 8f 73 99-be cd f7 66 66 6e   ...*..s....ffn
              0046 - 20 30 7f dc 83 62 cb 36-bf 1c cb a2 99 dc    0...b.6......
              0054 - 82 ad 5f e9 f3 f1 7f 67-9b 2f 2d f7 d1 78   .._....g./-..x
              0062 - fa cb bd d4 81 10 22 8e-43 28 59 ea 63 06   ......".C(Y.c.
              0070 - 35 97 32 65 ed a5 13 a5-9b ee ef be 08 17   5.2e..........
              007e - e6 04 8d fb fa 7b 39 b3-07 4f 8f 0e b0 ee   .....{9..O....
              008c - 94 7b 23 dc e1 fc 23 e3-96 70 a1 52 77 48   .{#...#..p.RwH
              009a - 3d e8 81 d9 3c 46 8e 57-13 ad 06 c7 80 b6   =...<F.W......
              00a8 - 62 8e 9e 99 3f ea ef 20-66 47 4e 31 a9 45   b...?.. fGN1.E
              00b6 - dd cf 1c 0d 97 ed e8 ff-05 af b1 7a 00 6b   ...........z.k
              00c4 - 08 06 15 ce 2e 53 97 da-46 9f 57 93 2d f9   .....S..F.W.-.
              00d2 - 6a 91 71 7b db 1b db 25-2e e6 77 8d 98 da   j.q{...%..w...
              00e0 - df 30 58 d9 ce 0b 1e 2d-a8 fa 0f f1 6d e4   .0X....-....m.
              00ee - cb 11 fa b6 e9 1b 7d 63-1d 10 4a c6 97 c9   ......}c..J...
              00fc - ba d9 4e 25 2d f9 d8 92-e1 e4 58 ff e5 02   ..N%-.....X...
              010a - 03 01 00 01                                 ....
          issuerUID: <ABSENT>
          subjectUID: <ABSENT>
          extensions:
              object: X509v3 Key Usage (2.5.29.15)
              critical: TRUE
              value:
                0000 - 03 02 05 a0                              ....

              object: Authority Information Access (1.3.6.1.5.5.7.1.1)
              critical: FALSE
              value:
                0000 - 30 81 83 30 46 06 08 2b-06 01 05 05 07   0..0F..+.....
                000d - 30 02 86 3a 68 74 74 70-3a 2f 2f 73 65   0..:http://se
                001a - 63 75 72 65 2e 67 6c 6f-62 61 6c 73 69   cure.globalsi
                0027 - 67 6e 2e 63 6f 6d 2f 63-61 63 65 72 74   gn.com/cacert
                0034 - 2f 67 73 67 63 63 72 36-73 6d 69 6d 65   /gsgccr6smime
                0041 - 63 61 32 30 32 33 2e 63-72 74 30 39 06   ca2023.crt09.
                004e - 08 2b 06 01 05 05 07 30-01 86 2d 68 74   .+.....0..-ht
                005b - 74 70 3a 2f 2f 6f 63 73-70 2e 67 6c 6f   tp://ocsp.glo
                0068 - 62 61 6c 73 69 67 6e 2e-63 6f 6d 2f 67   balsign.com/g
                0075 - 73 67 63 63 72 36 73 6d-69 6d 65 63 61   sgccr6smimeca
                0082 - 32 30 32 33                              2023

              object: X509v3 Certificate Policies (2.5.29.32)
              critical: FALSE
              value:
                0000 - 30 5c 30 09 06 07 67 81-0c 01 05 02 01   0\0...g......
                000d - 30 0b 06 09 2b 06 01 04-01 a0 32 01 28   0...+.....2.(
                001a - 30 42 06 0a 2b 06 01 04-01 a0 32 0a 03   0B..+.....2..
                0027 - 01 30 34 30 32 06 08 2b-06 01 05 05 07   .0402..+.....
                0034 - 02 01 16 26 68 74 74 70-73 3a 2f 2f 77   ...&https://w
                0041 - 77 77 2e 67 6c 6f 62 61-6c 73 69 67 6e   ww.globalsign
                004e - 2e 63 6f 6d 2f 72 65 70-6f 73 69 74 6f   .com/reposito
                005b - 72 79 2f                                 ry/

              object: X509v3 Basic Constraints (2.5.29.19)
              critical: FALSE
              value:
                0000 - 30 00                                    0.

              object: X509v3 CRL Distribution Points (2.5.29.31)
              critical: FALSE
              value:
                0000 - 30 38 30 36 a0 34 a0 32-86 30 68 74 74   0806.4.2.0htt
                000d - 70 3a 2f 2f 63 72 6c 2e-67 6c 6f 62 61   p://crl.globa
                001a - 6c 73 69 67 6e 2e 63 6f-6d 2f 67 73 67   lsign.com/gsg
                0027 - 63 63 72 36 73 6d 69 6d-65 63 61 32 30   ccr6smimeca20
                0034 - 32 33 2e 63 72 6c                        23.crl

              object: X509v3 Subject Alternative Name (2.5.29.17)
              critical: FALSE
              value:
                0000 - 30 15 81 13 6c 69 6c 40-6c 61 77 2e 68   0...lil@law.h
                000d - 61 72 76 61 72 64 2e 65-64 75            arvard.edu

              object: X509v3 Extended Key Usage (2.5.29.37)
              critical: FALSE
              value:
                0000 - 30 14 06 08 2b 06 01 05-05 07 03 02 06   0...+........
                000d - 08 2b 06 01 05 05 07 03-04               .+.......

              object: X509v3 Authority Key Identifier (2.5.29.35)
              critical: FALSE
              value:
                0000 - 30 16 80 14 00 29 36 9e-5c 7a ba 0f af   0....)6.\z...
                000d - 2d 50 2d db a0 23 85 18-b0 a0 92         -P-..#.....

              object: X509v3 Subject Key Identifier (2.5.29.14)
              critical: FALSE
              value:
                0000 - 04 14 13 10 9a 8f 9f d7-a8 70 09 ef 52   .........p..R
                000d - 1b 5a fb ef eb b4 53 b4-37               .Z....S.7
        sig_alg:
          algorithm: sha256WithRSAEncryption (1.2.840.113549.1.1.11)
          parameter: NULL
        signature:  (0 unused bits)
          0000 - 10 eb 66 2c 8d a7 2a dd-9a 02 f1 cd 2e 27 22   ..f,..*......'"
          000f - f2 da 7a e1 c1 a3 1c f4-9d 97 95 db bc 8d 2e   ..z............
          001e - 8c 25 fc b1 17 a6 45 5d-43 60 35 a7 dc 06 fb   .%....E]C`5....
          002d - 2b 54 54 cf 66 1c 8a 4f-28 74 09 fe 71 26 81   +TT.f..O(t..q&.
          003c - 6f ce 74 67 a3 28 bb 70-a8 5f 9f 25 28 f5 7b   o.tg.(.p._.%(.{
          004b - a8 25 0e 2c 13 6b 08 35-bd 98 4a b0 f1 d6 bb   .%.,.k.5..J....
          005a - 2a 46 0d d0 b2 18 ea 24-ed 4a 9c e5 e6 22 b4   *F.....$.J...".
          0069 - d3 0f ea 48 71 c8 e2 cf-43 ab 6a af fa 1f 0f   ...Hq...C.j....
          0078 - 25 ff 6e 48 97 9c ef 56-f5 76 c1 ed 80 de 22   %.nH...V.v...."
          0087 - 1e 51 be b1 63 76 50 29-90 a2 7a e2 9a 20 7b   .Q..cvP)..z.. {
          0096 - 9c 39 e2 5c 7e 4a 56 c8-71 5c 14 9f 76 4f 2b   .9.\~JV.q\..vO+
          00a5 - 25 22 8d a6 a8 ee 70 5d-4b 3f 19 6f 09 78 4d   %"....p]K?.o.xM
          00b4 - c3 8c a3 cb d9 ee c2 e5-7b 15 42 e4 52 28 c0   ........{.B.R(.
          00c3 - b1 0a 77 fb 53 69 7d 98-9f fd 1c 9b 8e db 0f   ..w.Si}........
          00d2 - 81 26 d6 28 38 bb ad 9a-54 c4 67 7d 17 57 93   .&.(8...T.g}.W.
          00e1 - 96 c5 62 3d 4d bb a0 77-c6 e3 b0 db c3 4b 6a   ..b=M..w.....Kj
          00f0 - c6 f0 24 ec 2f 0c 18 67-38 c5 8b 63 39 fc a9   ..$./..g8..c9..
          00ff - 9f 2b 52 0c 25 15 7d d3-7b 04 46 8e 79 ff 09   .+R.%.}.{.F.y..
          010e - 7a 01 9c 1d 2f bd 9b ed-34 53 78 f0 5d 95 e2   z.../...4Sx.]..
          011d - fc 76 bd 02 b6 64 e0 7c-8e 29 db d8 43 f8 c4   .v...d.|.)..C..
          012c - 83 08 5a 98 b0 78 c4 69-a0 ec 91 8e 72 eb 85   ..Z..x.i....r..
          013b - 0d 2c 4b 3d cf cc df 19-d8 09 8a db ff 52 73   .,K=.........Rs
          014a - 23 25 b7 46 20 1d 66 2f-7a 4f 99 2f 8c 4f e8   #%.F .f/zO./.O.
          0159 - 70 9e 60 25 74 ca 31 4a-aa c7 00 6e e4 f9 2f   p.`%t.1J...n../
          0168 - 24 90 08 07 fe 0f dc a8-20 92 b8 72 35 bb 1d   $....... ..r5..
          0177 - 28 8f 11 73 7b 74 35 b0-ef 2c 57 5a 23 05 bc   (..s{t5..,WZ#..
          0186 - c9 7b 15 9f ea eb 6c e9-9b 8c 3e 6c d3 79 ae   .{....l...>l.y.
          0195 - ac 9e 7a 61 99 7c 74 c6-7a 9c 70 21 6e 46 7d   ..za.|t.z.p!nF}
          01a4 - f3 fe ca 0e b9 c2 1f d8-0b 4b c1 0d 9a 4e c3   .........K...N.
          01b3 - 39 ba 53 59 af 60 17 d1-80 10 9c 7a f8 d3 f3   9.SY.`.....z...
          01c2 - 35 54 83 c8 2f 59 f3 d2-24 36 4d be f6 46 4e   5T../Y..$6M..FN
          01d1 - fd 15 24 0e b5 e0 21 b3-c9 db ce eb 25 7c 63   ..$...!.....%|c
          01e0 - 95 b7 f5 91 50 7c 7e 33-d6 1c d9 c9 c9 04 09   ....P|~3.......
          01ef - c8 69 4b 3f 96 b6 98 7e-86 46 d4 4f 38 9b 71   .iK?...~.F.O8.q
          01fe - af b6                                          ..

      d.certificate:
        cert_info:
          version: 2
          serialNumber: 168187643342224492016391364218657937652
          signature:
            algorithm: sha384WithRSAEncryption (1.2.840.113549.1.1.12)
            parameter: NULL
          issuer:           OU=GlobalSign Root CA - R6, O=GlobalSign, CN=GlobalSign
          validity:
            notBefore: Apr 19 03:53:53 2023 GMT
            notAfter: Apr 19 00:00:00 2029 GMT
          subject:           C=BE, O=GlobalSign nv-sa, CN=GlobalSign GCC R6 SMIME CA 2023
          key:           X509_PUBKEY:
            algor:
              algorithm: rsaEncryption (1.2.840.113549.1.1.1)
              parameter: NULL
            public_key:  (0 unused bits)
              0000 - 30 82 02 0a 02 82 02 01-00 c2 30 04 6d 29   0.........0.m)
              000e - 0f 71 2c a7 db a6 67 f5-5b 68 13 fc 41 bf   .q,...g.[h..A.
              001c - 36 26 35 6d bd 6d 6d 69-25 9e e3 af 32 b0   6&5m.mmi%...2.
              002a - 3c 99 bf 19 a9 02 bf 2d-08 22 03 9b 32 cc   <......-."..2.
              0038 - 7d 6e 91 5a ab 7d 17 d6-41 09 66 72 d4 ce   }n.Z.}..A.fr..
              0046 - e1 35 fe 19 5c 85 ab 58-ab 23 91 54 17 87   .5..\..X.#.T..
              0054 - 96 fe 55 d1 04 52 5d 8e-1f 69 1d 1d 0a 42   ..U..R]..i...B
              0062 - 21 5e 1a 06 92 76 76 3b-46 d4 26 2b 61 70   !^...vv;F.&+ap
              0070 - dd 48 b0 40 03 36 2c d9-d4 02 48 69 6b 16   .H.@.6,...Hik.
              007e - 6d 0e 2d 60 46 23 ca d1-1d bd f9 31 cf 55   m.-`F#.....1.U
              008c - ad 5f 74 a3 b5 e7 19 47-ef 70 2c 92 ed e8   ._t....G.p,...
              009a - 73 5a e2 c0 bf fd 02 9d-8f 27 eb fc d8 43   sZ.......'...C
              00a8 - 0b 36 2b 74 8c c0 b2 ca-17 16 7a 78 b7 e1   .6+t......zx..
              00b6 - dc 33 24 13 ae 3d 2b a4-3f 0a 90 f8 fd ea   .3$..=+.?.....
              00c4 - cc bd 6b 1c de 80 b6 f3-b5 ee f0 81 31 db   ..k.........1.
              00d2 - 58 17 89 1a 9d 2c 2a 0e-c1 02 ef 86 d1 19   X....,*.......
              00e0 - 3a b5 e3 b5 c8 f7 7b e5-aa db 0a f7 d8 fc   :.....{.......
              00ee - d6 91 45 01 e4 ea d9 86-ef 58 29 2d d0 75   ..E......X)-.u
              00fc - 66 13 fd 21 c3 58 c9 e4-ca 5c 88 1f 32 1d   f..!.X...\..2.
              010a - ad 54 af 43 9d 71 9a 92-c6 02 ca 2e 96 8a   .T.C.q........
              0118 - c2 5a d4 e6 be a6 85 2b-a1 7d 84 49 73 ba   .Z.....+.}.Is.
              0126 - a3 4d e5 57 18 80 d7 1c-9f 1b c9 54 ce 95   .M.W.......T..
              0134 - 0d 66 89 c6 56 b5 23 84-6e 7e 31 d6 35 eb   .f..V.#.n~1.5.
              0142 - fe 9b d2 e2 9e 8d 90 8b-6e 0b ba 1c b3 ef   ........n.....
              0150 - 23 2a 9d 4d bf 57 a8 5e-17 59 62 ae f8 e3   #*.M.W.^.Yb...
              015e - 03 b3 fa d6 c9 0e b8 bf-79 af 49 ad 2d 7b   ........y.I.-{
              016c - 94 39 b5 c8 66 e2 6f 4a-d0 f7 bd 2e 52 0a   .9..f.oJ....R.
              017a - 06 4f e0 b2 07 da 38 72-43 a7 88 58 c7 8c   .O....8rC..X..
              0188 - 28 70 3a fc 0d e0 99 7b-97 79 c0 21 64 c6   (p:....{.y.!d.
              0196 - 81 9c d5 c9 0c f1 9a 0c-82 95 1c e2 98 24   .............$
              01a4 - 40 4e 52 87 16 0c 98 a6-cf bc d4 4f 6b 96   @NR........Ok.
              01b2 - 05 cd b1 6d 70 59 f9 44-ca f5 32 3b bc 82   ...mpY.D..2;..
              01c0 - df 09 d1 cf 9d c5 87 28-da 18 bb 74 45 b7   .......(...tE.
              01ce - f7 52 8a c4 68 6f ec c2-41 71 cf b4 c7 71   .R..ho..Aq...q
              01dc - 45 4e 79 a1 53 ee 6d eb-e0 a7 90 a3 25 ae   ENy.S.m.....%.
              01ea - 6c f2 6a 87 95 af 24 53-bc e8 87 e0 59 d3   l.j...$S....Y.
              01f8 - 34 7b 55 29 46 03 b0 23-34 08 b0 dd 30 d8   4{U)F..#4...0.
              0206 - 28 c6 09 02 03 01 00 01-                    (.......
          issuerUID: <ABSENT>
          subjectUID: <ABSENT>
          extensions:
              object: X509v3 Key Usage (2.5.29.15)
              critical: TRUE
              value:
                0000 - 03 02 01 86                              ....

              object: X509v3 Extended Key Usage (2.5.29.37)
              critical: FALSE
              value:
                0000 - 30 43 06 08 2b 06 01 05-05 07 03 02 06   0C..+........
                000d - 08 2b 06 01 05 05 07 03-04 06 0a 2b 06   .+.........+.
                001a - 01 04 01 82 37 14 02 02-06 0a 2b 06 01   ....7.....+..
                0027 - 04 01 82 37 0a 03 0c 06-0a 2b 06 01 04   ...7.....+...
                0034 - 01 82 37 0a 03 04 06 09-2b 06 01 04 01   ..7.....+....
                0041 - 82 37 15 06                              .7..

              object: X509v3 Basic Constraints (2.5.29.19)
              critical: TRUE
              value:
                0000 - 30 06 01 01 ff 02 01 00-                 0.......

              object: X509v3 Subject Key Identifier (2.5.29.14)
              critical: FALSE
              value:
                0000 - 04 14 00 29 36 9e 5c 7a-ba 0f af 2d 50   ...)6.\z...-P
                000d - 2d db a0 23 85 18 b0 a0-92               -..#.....

              object: X509v3 Authority Key Identifier (2.5.29.35)
              critical: FALSE
              value:
                0000 - 30 16 80 14 ae 6c 05 a3-93 13 e2 a2 e7   0....l.......
                000d - e2 d7 1c d6 c7 f0 7f c8-67 53 a0         ........gS.

              object: Authority Information Access (1.3.6.1.5.5.7.1.1)
              critical: FALSE
              value:
                0000 - 30 6d 30 2e 06 08 2b 06-01 05 05 07 30   0m0...+.....0
                000d - 01 86 22 68 74 74 70 3a-2f 2f 6f 63 73   .."http://ocs
                001a - 70 32 2e 67 6c 6f 62 61-6c 73 69 67 6e   p2.globalsign
                0027 - 2e 63 6f 6d 2f 72 6f 6f-74 72 36 30 3b   .com/rootr60;
                0034 - 06 08 2b 06 01 05 05 07-30 02 86 2f 68   ..+.....0../h
                0041 - 74 74 70 3a 2f 2f 73 65-63 75 72 65 2e   ttp://secure.
                004e - 67 6c 6f 62 61 6c 73 69-67 6e 2e 63 6f   globalsign.co
                005b - 6d 2f 63 61 63 65 72 74-2f 72 6f 6f 74   m/cacert/root
                0068 - 2d 72 36 2e 63 72 74                     -r6.crt

              object: X509v3 CRL Distribution Points (2.5.29.31)
              critical: FALSE
              value:
                0000 - 30 2d 30 2b a0 29 a0 27-86 25 68 74 74   0-0+.).'.%htt
                000d - 70 3a 2f 2f 63 72 6c 2e-67 6c 6f 62 61   p://crl.globa
                001a - 6c 73 69 67 6e 2e 63 6f-6d 2f 72 6f 6f   lsign.com/roo
                0027 - 74 2d 72 36 2e 63 72 6c-                 t-r6.crl

              object: X509v3 Certificate Policies (2.5.29.32)
              critical: FALSE
              value:
                0000 - 30 08 30 06 06 04 55 1d-20 00            0.0...U. .
        sig_alg:
          algorithm: sha384WithRSAEncryption (1.2.840.113549.1.1.12)
          parameter: NULL
        signature:  (0 unused bits)
          0000 - 91 91 47 6b d5 a2 03 46-69 0d 23 98 f1 e6 08   ..Gk...Fi.#....
          000f - 1a a4 65 13 86 ad 0a 70-cd 9d ce 93 2e df 5e   ..e....p......^
          001e - 26 26 77 bc c8 a5 57 c3-37 ca 06 da 9b 06 36   &&w...W.7.....6
          002d - d4 34 c3 83 9c 19 21 bd-97 27 6c 75 12 b6 ea   .4....!..'lu...
          003c - f6 fe 7b 75 b4 fd de 7b-c2 b2 36 16 31 ce fe   ..{u...{..6.1..
          004b - 03 90 8d 0d 6d 5f 77 24-28 57 8a 97 ec 6a 7e   ....m_w$(W...j~
          005a - d8 8d d7 c4 93 9b c8 d8-5a be c2 96 c6 00 bc   ........Z......
          0069 - b2 58 18 1f cb bf 58 22-06 d8 58 04 c1 d7 9f   .X....X"..X....
          0078 - 2d bc 48 79 50 ef 24 a4-6a 63 63 de 71 bf ed   -.HyP.$.jcc.q..
          0087 - 3b d1 7d c5 62 e1 b2 79-9c 88 bd aa 36 ea 63   ;.}.b..y....6.c
          0096 - 7c ef 61 6e c5 1c 58 84-d2 f0 18 72 32 df c3   |.an..X....r2..
          00a5 - 7d 01 26 b5 43 70 53 34-a4 ab 1e b6 67 81 a7   }.&.CpS4....g..
          00b4 - 68 7c 78 25 1b 95 b7 4c-c1 51 d7 52 4e 10 e0   h|x%...L.Q.RN..
          00c3 - 14 1e 15 20 a5 b5 55 be-00 98 80 60 3a 75 25   ... ..U....`:u%
          00d2 - f4 cb 9c fb 93 7a d7 57-28 c5 3a ce ca 05 25   .....z.W(.:...%
          00e1 - eb 74 93 ca 69 da 65 e2-fa 98 a6 11 fb f8 fe   .t..i.e........
          00f0 - 34 9f 30 51 73 12 47 ae-fe 45 79 79 53 ad bf   4.0Qs.G..EyyS..
          00ff - 9d ae 3c 97 36 36 52 0a-6c df 90 eb 82 a8 fb   ..<.66R.l......
          010e - 29 06 e2 7b bd a6 f4 ff-da 1e 34 44 60 9f 3e   )..{......4D`.>
          011d - 92 2c 28 cb 29 c0 d7 6c-c6 ca 71 15 e0 36 11   .,(.)..l..q..6.
          012c - 41 97 33 78 39 40 6a 89-e4 81 5e 4f 34 c3 63   A.3x9@j...^O4.c
          013b - 73 c7 5d 8a bc d8 fb e7-c5 9a bf 13 ac 5c 86   s.]..........\.
          014a - d7 d1 9c 70 a3 58 77 bb-0e f9 00 8d af f2 ac   ...p.Xw........
          0159 - 05 59 73 5c 94 ef 2a 5b-65 57 a2 ae a4 8a 15   .Ys\..*[eW.....
          0168 - a9 7a 2b af 0e 61 5f 48-0c 11 2f 1c 30 22 38   .z+..a_H../.0"8
          0177 - 14 bb 31 bd 49 a4 3e a4-ea 26 b9 a0 bb 41 32   ..1.I.>..&...A2
          0186 - 96 30 8d 21 2f 46 f8 98-43 ea f4 6b 0f 3d 0a   .0.!/F..C..k.=.
          0195 - b5 52 6c 24 71 81 49 fd-9e 08 f7 70 d9 b8 a7   .Rl$q.I....p...
          01a4 - 17 98 a3 26 b8 03 53 4b-ac 31 c0 81 30 f1 0e   ...&..SK.1..0..
          01b3 - 4c 43 ac bd 7d b2 71 18-43 a0 3a 06 0b e1 02   LC..}.q.C.:....
          01c2 - 2a 35 42 db e4 26 0f 9e-dd 8b 7a 22 22 12 78   *5B..&....z"".x
          01d1 - 7c 52 e8 7c b5 ac 2a 4a-39 d1 d2 1e c1 bf 9a   |R.|..*J9......
          01e0 - b8 9a 0a 37 2f 56 7d 41-0b 9e c5 49 ed 58 3f   ...7/V}A...I.X?
          01ef - 7a b7 8a 34 ab 58 d7 58-bc ab a6 03 fb 65 c9   z..4.X.X.....e.
          01fe - ee 0b                                          ..
    crls:
      <ABSENT>
    signerInfos:
        version: 1
        d.issuerAndSerialNumber:
          issuer:           C=BE, O=GlobalSign nv-sa, CN=GlobalSign GCC R6 SMIME CA 2023
          serialNumber: 12491362874156253824700607432
        digestAlgorithm:
          algorithm: sha256 (2.16.840.1.101.3.4.2.1)
          parameter: <ABSENT>
        signedAttrs:
            object: contentType (1.2.840.113549.1.9.3)
            set:
              OBJECT:pkcs7-data (1.2.840.113549.1.7.1)

            object: signingTime (1.2.840.113549.1.9.5)
            set:
              UTCTIME:Dec 15 18:41:55 2024 GMT

            object: messageDigest (1.2.840.113549.1.9.4)
            set:
              OCTET STRING:
                0000 - 32 a5 cc fd de ce 43 e7-96 70 bc f7 89   2.....C..p...
                000d - da 8b 4a 09 70 2e e1 aa-3f 52 5e 6a 92   ..J.p...?R^j.
                001a - 8c 4b 96 ac ec b3                        .K....

            object: id-smime-aa-signingCertificateV2 (1.2.840.113549.1.9.16.2.47)
            set:
              SEQUENCE:
    0:d=0  hl=3 l= 144 cons: SEQUENCE
    3:d=1  hl=3 l= 141 cons:  SEQUENCE
    6:d=2  hl=3 l= 138 cons:   SEQUENCE
    9:d=3  hl=2 l=  32 prim:    OCTET STRING      [HEX DUMP]:FBC8FF8518A9C9A4CD74A2D917915D1A8A459D0FCEAF47C3CC14F1EEF9616E24
   43:d=3  hl=2 l= 102 cons:    SEQUENCE
   45:d=4  hl=2 l=  86 cons:     SEQUENCE
   47:d=5  hl=2 l=  84 cons:      cont [ 4 ]
   49:d=6  hl=2 l=  82 cons:       SEQUENCE
   51:d=7  hl=2 l=  11 cons:        SET
   53:d=8  hl=2 l=   9 cons:         SEQUENCE
   55:d=9  hl=2 l=   3 prim:          OBJECT            :countryName
   60:d=9  hl=2 l=   2 prim:          PRINTABLESTRING   :BE
   64:d=7  hl=2 l=  25 cons:        SET
   66:d=8  hl=2 l=  23 cons:         SEQUENCE
   68:d=9  hl=2 l=   3 prim:          OBJECT            :organizationName
   73:d=9  hl=2 l=  16 prim:          PRINTABLESTRING   :GlobalSign nv-sa
   91:d=7  hl=2 l=  40 cons:        SET
   93:d=8  hl=2 l=  38 cons:         SEQUENCE
   95:d=9  hl=2 l=   3 prim:          OBJECT            :commonName
  100:d=9  hl=2 l=  31 prim:          PRINTABLESTRING   :GlobalSign GCC R6 SMIME CA 2023
  133:d=4  hl=2 l=  12 prim:     INTEGER           :285C9CFA45F3A9CB88BB6BC8
        signatureAlgorithm:
          algorithm: rsaEncryption (1.2.840.113549.1.1.1)
          parameter: NULL
        signature:
          0000 - 43 ea d5 91 16 94 55 5f-18 0b a2 b2 3f 01 22   C.....U_....?."
          000f - 2d a2 51 16 6e cb fc 92-18 85 24 55 37 e2 6f   -.Q.n.....$U7.o
          001e - bd cd 07 76 fb b6 2d b6-e3 96 5f 14 32 9a 9b   ...v..-..._.2..
          002d - 7e 9f 5f 44 04 8a 37 dd-13 28 70 a9 b9 8d 33   ~._D..7..(p...3
          003c - 56 6d 2d 5e 3d e5 8f 9a-fa 05 1a 4f 0d 21 04   Vm-^=......O.!.
          004b - ab 89 58 51 03 b1 b0 59-1d d9 9b 4c 72 3f fa   ..XQ...Y...Lr?.
          005a - ca 5d ce 67 4b 87 52 69-95 36 94 33 14 f5 0c   .].gK.Ri.6.3...
          0069 - f3 4e 5e 99 4e 2e d6 18-d8 a0 15 0d 08 be a8   .N^.N..........
          0078 - 40 45 4a 2f 73 d3 92 ef-c1 0a 5a 19 a2 8c 61   @EJ/s.....Z...a
          0087 - 02 10 67 12 6b 15 21 14-69 52 3e ce 36 b2 c8   ..g.k.!.iR>.6..
          0096 - cc b2 97 c8 11 3f 4a c7-7b d7 4d bd 83 a7 c3   .....?J.{.M....
          00a5 - 90 44 e6 10 6f 79 a6 9f-f2 79 dc d8 d8 1f d5   .D..oy...y.....
          00b4 - b4 1e 40 99 af 44 94 c8-ee 85 72 41 8a 70 9e   ..@..D....rA.p.
          00c3 - 5e 0c 2f 04 94 db 21 9c-b5 59 47 22 3e ad 3f   ^./...!..YG">.?
          00d2 - ca ee 67 fe c7 fa f0 84-1d a4 57 4a c3 1d 45   ..g.......WJ..E
          00e1 - fa 51 1b 88 39 b4 f3 fc-e7 b7 2b c0 d1 bb 2a   .Q..9.....+...*
          00f0 - 65 eb 56 f6 20 86 08 62-c6 c6 6e 1a 59 0d ea   e.V. ..b..n.Y..
          00ff - a5                                             .
        unsignedAttrs:
          <ABSENT>

Look at the first “subject line” to see who created the signature:

C=US, ST=Massachusetts, L=Cambridge, organizationIdentifier=NTRUS-166027, O=Library Innovation Lab, CN=lil@law.harvard.edu, emailAddress=lil@law.harvard.edu

And another subject further up the chain, that indicates who signed their certificate:

C=BE, O=GlobalSign nv-sa, CN=GlobalSign GCC R6 SMIME CA 2023

The next step is to verify that the signature is valid. Remember, the thing that was signed was tagmanifest-sha256.txt, which in turn asserts the integrity of the entire bag:

$ openssl cms -verify -binary -inform PEM -purpose any \
    -content tagmanifest-sha256.txt \
    -in signatures/tagmanifest-sha256.txt.p7s

b18e8174f0736113310e23623501669c67fdb14d8b90596a46cf386dbc7c5cb0  manifest-sha256.txt
d7b53053b06731810c14d1796dc3f79f1dc3f09a8d2d35a056fa06e9a620d2c2  bag-info.txt
e91f941be5973ff71f1dccbdd1a32d598881893a7f21be516aca743da38b1689  bagit.txt
CMS Verification successful

Looks good!
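The chain of trust can be walked one level further in code: the verified tagmanifest vouches for manifest-sha256.txt, which in turn lists a checksum for every payload file. Here is a minimal Python sketch of that last step — the function name and bag layout are illustrative, and production code would more likely use a library such as the Library of Congress's bagit-python:

```python
import hashlib
from pathlib import Path

def verify_manifest(bag_dir, manifest_name="manifest-sha256.txt"):
    """Recompute the SHA-256 of every file listed in a BagIt manifest
    and return the relative paths whose checksums do not match."""
    bag = Path(bag_dir)
    failures = []
    for line in (bag / manifest_name).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Each manifest line is "<hex digest>  <relative path>"
        expected, relpath = line.split(maxsplit=1)
        actual = hashlib.sha256((bag / relpath).read_bytes()).hexdigest()
        if actual != expected.lower():
            failures.append(relpath)
    return failures
```

If the signed tagmanifest checks out and this function returns an empty list, every byte of the bag's payload is accounted for.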

Next we can look at the Time-Stamp Response (the .tsr file) to see when it was signed:

$ openssl ts -reply -in signatures/tagmanifest-sha256.txt.p7s.tsr -text

Using configuration from /opt/homebrew/etc/openssl@3/openssl.cnf
Status info:
Status: Granted.
Status description: unspecified
Failure info: unspecified

TST info:
Version: 1
Policy OID: 2.16.840.1.114412.7.1
Hash Algorithm: sha256
Message data:
    0000 - cb 9b 55 bf d0 40 02 b7-29 f4 03 03 18 26 e0 ce   ..U..@..)....&..
    0010 - 6f bd 75 12 b6 06 15 c3-7e 95 77 57 af 9c 56 45   o.u.....~.wW..VE
Serial number: 0x82B3F3EE43174FCFFEBC4DB742C2AC65
Time stamp: Dec 15 18:41:55 2024 GMT
Accuracy: unspecified
Ordering: no
Nonce: 0x623C25E486E4E26C
TSA: unspecified
Extensions:

So Harvard Law Library (or someone with their SSL key) performed the signing on Dec 15 18:41:55 2024 GMT, and it can be verified with:

$ openssl ts -verify -data signatures/tagmanifest-sha256.txt.p7s \
    -in signatures/tagmanifest-sha256.txt.p7s.tsr \
    -CAfile signatures/tagmanifest-sha256.txt.p7s.tsr.crt
Using configuration from /opt/homebrew/etc/openssl@3/openssl.cnf
Verification: OK
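At its core, what `openssl ts -verify` checks is that the "message imprint" inside the timestamp token equals the hash of the data you hand it, and that the token's signature chains to the CA. The hash-binding half is easy to sketch in Python — note this is only one piece of RFC 3161 verification, not a substitute for it, and the function name is my own:

```python
import hashlib

def imprint_matches(data_path, imprint_hex):
    """Check that the SHA-256 of a file equals the message imprint
    reported in a timestamp token (the 'Message data' bytes above).
    This covers only the hash binding; `openssl ts -verify` also
    validates the TSA's signature against the CA certificate."""
    with open(data_path, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    # Normalize the hex dump: drop spaces and colons, lowercase
    expected = imprint_hex.replace(" ", "").replace(":", "").lower()
    return actual == expected
```

Because the imprint is a hash of the .p7s signature file, the timestamp indirectly covers the tagmanifest and, through it, the whole bag.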

Finally, we can see which timestamp server verified the time:

$ openssl x509 -noout -text -in signatures/tagmanifest-sha256.txt.p7s.tsr.crt

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            0c:e7:e0:e5:17:d8:46:fe:8f:e5:60:fc:1b:f0:30:39
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: C=US, O=DigiCert Inc, OU=www.digicert.com, CN=DigiCert Assured ID Root CA
        Validity
            Not Before: Nov 10 00:00:00 2006 GMT
            Not After : Nov 10 00:00:00 2031 GMT
        Subject: C=US, O=DigiCert Inc, OU=www.digicert.com, CN=DigiCert Assured ID Root CA
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:ad:0e:15:ce:e4:43:80:5c:b1:87:f3:b7:60:f9:
                    71:12:a5:ae:dc:26:94:88:aa:f4:ce:f5:20:39:28:
                    58:60:0c:f8:80:da:a9:15:95:32:61:3c:b5:b1:28:
                    84:8a:8a:dc:9f:0a:0c:83:17:7a:8f:90:ac:8a:e7:
                    79:53:5c:31:84:2a:f6:0f:98:32:36:76:cc:de:dd:
                    3c:a8:a2:ef:6a:fb:21:f2:52:61:df:9f:20:d7:1f:
                    e2:b1:d9:fe:18:64:d2:12:5b:5f:f9:58:18:35:bc:
                    47:cd:a1:36:f9:6b:7f:d4:b0:38:3e:c1:1b:c3:8c:
                    33:d9:d8:2f:18:fe:28:0f:b3:a7:83:d6:c3:6e:44:
                    c0:61:35:96:16:fe:59:9c:8b:76:6d:d7:f1:a2:4b:
                    0d:2b:ff:0b:72:da:9e:60:d0:8e:90:35:c6:78:55:
                    87:20:a1:cf:e5:6d:0a:c8:49:7c:31:98:33:6c:22:
                    e9:87:d0:32:5a:a2:ba:13:82:11:ed:39:17:9d:99:
                    3a:72:a1:e6:fa:a4:d9:d5:17:31:75:ae:85:7d:22:
                    ae:3f:01:46:86:f6:28:79:c8:b1:da:e4:57:17:c4:
                    7e:1c:0e:b0:b4:92:a6:56:b3:bd:b2:97:ed:aa:a7:
                    f0:b7:c5:a8:3f:95:16:d0:ff:a1:96:eb:08:5f:18:
                    77:4f
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Certificate Sign, CRL Sign
            X509v3 Basic Constraints: critical
                CA:TRUE
            X509v3 Subject Key Identifier:
                45:EB:A2:AF:F4:92:CB:82:31:2D:51:8B:A7:A7:21:9D:F3:6D:C8:0F
            X509v3 Authority Key Identifier:
                45:EB:A2:AF:F4:92:CB:82:31:2D:51:8B:A7:A7:21:9D:F3:6D:C8:0F
    Signature Algorithm: sha1WithRSAEncryption
    Signature Value:
        a2:0e:bc:df:e2:ed:f0:e3:72:73:7a:64:94:bf:f7:72:66:d8:
        32:e4:42:75:62:ae:87:eb:f2:d5:d9:de:56:b3:9f:cc:ce:14:
        28:b9:0d:97:60:5c:12:4c:58:e4:d3:3d:83:49:45:58:97:35:
        69:1a:a8:47:ea:56:c6:79:ab:12:d8:67:81:84:df:7f:09:3c:
        94:e6:b8:26:2c:20:bd:3d:b3:28:89:f7:5f:ff:22:e2:97:84:
        1f:e9:65:ef:87:e0:df:c1:67:49:b3:5d:eb:b2:09:2a:eb:26:
        ed:78:be:7d:3f:2b:f3:b7:26:35:6d:5f:89:01:b6:49:5b:9f:
        01:05:9b:ab:3d:25:c1:cc:b6:7f:c2:f1:6f:86:c6:fa:64:68:
        eb:81:2d:94:eb:42:b7:fa:8c:1e:dd:62:f1:be:50:67:b7:6c:
        bd:f3:f1:1f:6b:0c:36:07:16:7f:37:7c:a9:5b:6d:7a:f1:12:
        46:60:83:d7:27:04:be:4b:ce:97:be:c3:67:2a:68:11:df:80:
        e7:0c:33:66:bf:13:0d:14:6e:f3:7f:1f:63:10:1e:fa:8d:1b:
        25:6d:6c:8f:a5:b7:61:01:b1:d2:a3:26:a1:10:71:9d:ad:e2:
        c3:f9:c3:99:51:b7:2b:07:08:ce:2e:e6:50:b2:a7:fa:0a:45:
        2f:a2:f0:f2

It looks like we’ve gotta trust DigiCert about when it was signed…

Come Join the 2025 Valentine Hunt! / LibraryThing (Thingology)

It’s February 14th, and that means the return of our annual Valentine Hunt!

We’ve scattered a quiver of Cupid’s arrows around the site, and it’s up to you to try and find them all.

  • Decipher the clues and visit the corresponding LibraryThing pages to find an arrow. Each clue points to a specific page right here on LibraryThing. Remember, they are not necessarily work pages!
  • If there’s an arrow on a page, you’ll see a banner at the top of the page.
  • You have two weeks to find all the arrows (until 11:59pm EST, Friday February 28th).
  • Come brag about your quiver of arrows (and get hints) on Talk.

Win prizes:

  • Any member who finds at least two arrows will be awarded an arrow Badge.
  • Members who find all 14 arrows will be entered into a drawing for some LibraryThing (or TinyCat) swag. We’ll announce winners at the end of the hunt.

P.S. Thanks to conceptDawg for the swan illustration!

Issue 107: A Power Packed Thread of Articles about the Humble Battery / Peter Murray

Batteries are among the technologies that have had a silent, dramatic change over my lifetime. Last week, as I was setting up a blood pressure cuff for my mother, I opened the compartment in the back and realized I needed 4 AA-sized batteries. It was once common for devices to ship without batteries, and my inner voice groaned with the thought of having to make a run to the store. So I was pleasantly surprised when two packs of 2 AA batteries fell from the package. I don't know if some regulation has come into effect that requires battery-powered devices to include the batteries, or whether they have simply become cheap enough to toss into every package. But somewhere over the past 20 years, the routineness of batteries has changed.

Today's journey in Thursday Threads takes us down the road of literal electricity storage innovation. As the world continues to lean towards a more renewable future, advances in battery technology are a race we're all invested in, without even realizing it. From manufacturing improvements to cost reductions and a decreased environmental impact, the leaps in battery tech are quite palpable. Plus, one thing I learned this week and a cat picture.

Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page. If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.

New battery tech

The race is on to generate new technologies to ready the battery industry for the transition toward a future with more renewable energy. In this competitive landscape, it’s hard to say which companies and solutions will come out on top. Corporations and universities are rushing to develop new manufacturing processes to cut the cost and reduce the environmental impact of building batteries worldwide. They are working to develop new approaches to building both cathodes and anodes—the negatively and positively charged components of batteries—and even using different ions to hold charge. While we can't look at every technology that's in development, we can look at a few to give you a sense of the problems people are trying to solve.
Next-gen battery tech, Ars Technica, 14-Mar-2024

Thinking back again on my childhood, I'm old enough to remember jealously guarding the capacity of the 4 D-cell batteries in my portable radio. I wouldn't dare leave it on while not listening to it...the batteries were an expensive luxury! (And I would be very annoyed at my brother or sister if they took them out for their own needs.)

This article from 2024 describes advances in battery chemistry and manufacturing that have lowered costs, improved capacities, and reduced environmental impact.

"Harvesting" nickel metal from plant roots

Gouging a mine into the Earth is so 1924. In 2024, scientists are figuring out how to mine with plants, known as phytomining. Of the 350,000 known plant species, just 750 are “hyperaccumulators” that readily absorb sky-high amounts of metals and incorporate them into their tissues. Grow a bunch of the European plant Alyssum bertolonii or the tropical Phyllanthus rufuschaneyi and burn the biomass, and you end up with ash that’s loaded with nickel. “In soil that contains roughly 5 percent nickel—that is pretty contaminated—you’re going to get an ash that’s about 25 to 50 percent nickel after you burn it down,” says Dave McNear, a rhizosphere biogeochemist at the University of Kentucky. “In comparison, where you mine it from the ground, from rock, that has about .02 percent nickel. So you are several orders of magnitude greater in enrichment, and it has far less impurities.”
The Feds Are Trying to Get Plants to Mine Metal Through Their Roots, WIRED, 21-Mar-2024

The article discusses a federal initiative to use plants to extract metals from the soil through their root systems. They are calling this "phytomining"; interestingly, this application can help remediate land contamination from traditional mining methods. Thank you, plants!

Battery Myths

For an object that barely ever leaves our palms, the smartphone can sometimes feel like an arcane piece of wizardry. And nowhere is this more pronounced than when it comes to the fickle battery, which will drop 20 percent charge quicker than you can toggle Bluetooth off, and give up the ghost entirely after a couple of years of charging. To make up for these inadequacies, we’ve made all kinds of battery myths. Whether it’s avoiding leaving your phone on charge overnight, or powering off to give the battery a little break, we’re forever looking for ways to eke out a little more performance from our overworked batteries, even if the method doesn’t make an awful lot of sense. To help sort the science from the folklore, we asked a battery expert to give their verdict on some of the most pervasive myths, explain the science behind the rumors and, just maybe, offer us some sage advice on extending the life of our smartphones.
Here’s the Truth Behind the Biggest (and Dumbest) Battery Myths, WIRED, 27-Oct-2023

As battery technology has changed, so too must our understanding of it. The myths:

  • Even when your battery is at 100 percent, there’s still room for some more charge: True
  • Charging your phone in airplane mode makes it charge faster: True (kind of)
  • Having Wi-Fi and Bluetooth on in the background is a big drain on battery life: True
  • Using an unofficial charger damages your phone: True
  • Charging your phone through your computer or laptop will damage the battery: False
  • Powering off a device occasionally helps preserve battery life: False
  • Batteries perform worse when they’re cold: False (mostly)
  • Leaving a charger plugged in at the wall and turned on wastes energy: False (well, maybe a tiny bit)
  • You should let the battery get all the way down to 0 percent before recharging: False
  • Charging past 100 percent will damage your battery: True (but not for the reason you think)
  • Replacing your phone battery gives it a new lease of life: True

Read the article for the reasoning behind each. Of these, I would question the one about using unofficial chargers. Much effort has gone into standardizing on the USB-C connector and its associated Power Delivery specifications. I think we are at the point where the standards won't let a bad charger damage your phone.

EU requires replaceable batteries by 2027

Motorola StarTac. What a nice piece of 1990s tech. (Photo by Banffy, own work, CC BY-SA 4.0.)
The new rules stipulate that all electric vehicle, light means of transport (e.g. electric scooters), and rechargeable industrial batteries (above 2kWh) will need to have a compulsory carbon footprint declaration, label, and digital passport. For "portable batteries" used in devices such as smartphones, tablets, and cameras, consumers must be able to "easily remove and replace them." This will require a drastic design rethink by manufacturers, as most phone and tablet makers currently seal the battery away and require specialist tools and knowledge to access and replace them safely.
EU: Smartphones Must Have User-Replaceable Batteries by 2027, PC Mag, 16-Jun-2023

The big news in mid-2023 was how smartphone manufacturers would need to design products for the European Union market that allowed for batteries to be replaced. The reason was to enhance sustainability by enabling consumers to easily replace batteries instead of relying on manufacturers or needing special tools. This used to be the norm; in fact, I remember buying a beefier/bulkier battery for my Motorola StarTac and keeping the original battery in my pocket as a spare. But the 2023 EU battery regulation goes beyond just personal devices...it impacts all batteries, including automotive and industrial.

The legislation also includes targets for hazardous substances, waste collection, and material recovery from old batteries, aiming for 61% waste collection and 95% material recovery by 2031. Additionally, there will be requirements for minimum levels of recycled content in new batteries. These regulations impact rechargeable batteries only, but the EU is also considering rules for non-rechargeable ones. And speaking of EU regulations, that is also why USB-C has become the dominant power and data connection for portable devices.

How battery technology impacts the electrical grid

First, there’s a new special report from the International Energy Agency all about how crucial batteries are for our future energy systems. The report calls batteries a “master key,” meaning they can unlock the potential of other technologies that will help cut emissions. Second, we’re seeing early signs in California of how the technology might be earning that “master key” status already by helping renewables play an even bigger role on the grid. So let’s dig into some battery data together.
Three takeaways about the current state of batteries, MIT Technology Review, 2-May-2024

The article discusses the current state of batteries and their critical role in future energy systems.

  1. Battery storage has become the fastest-growing commercial energy technology, with deployment doubling worldwide in 2023, particularly driven by China’s policies requiring energy storage for new renewable projects.
  2. Batteries are now essential for managing the challenges posed by intermittent renewable energy sources. Take California: batteries have begun to smooth out daily energy demand fluctuations, even becoming the top power source at times as solar energy decreases in the evening. This graph blows my mind.
  3. Despite these promising developments, we have a ways to go if we're going to replace carbon-intensive electricity generation plants. Fortunately, battery costs have plummeted by 90% since 2010, with projections of an additional 40% decrease by the end of this decade, making renewable energy projects more economically viable compared to traditional fossil fuels.

Line graph depicting California's electricity supply on April 19, 2024. The renewables line peaks over 18,000 MW from 07:00 to 18:00, corresponding with the battery dip to nearly -6,000 MW, likely storing excess energy. Natural gas remains stable around 4,000 MW throughout the day. Large hydro stays under 4,000 MW, dipping slightly during renewables' peak. Imports show a slight rise in the morning but decrease as renewables increase. Nuclear remains constant around 2,000 MW. Batteries rise again post-18:00, reaching over 2,000 MW at night as renewables decrease. (Graph from MIT Technology Review with data from the California Independent System Operator.)

Issue 102 of Thursday Threads from earlier this year pointed to articles about how renewable energy is overtaking coal in power generation in the United States and how Hawaii replaced its last coal-fired power plant with batteries.

Fire at battery facility in California

A fire at the world’s largest battery storage plant in California destroyed 300 megawatts of energy storage, forced 1200 area residents to evacuate and released smoke plumes that could pose a health threat to humans and wildlife. The incident knocked out 2 per cent of California’s energy storage capacity, which the state relies on as part of its transition to use more renewable power and less fossil fuels.
Fire at world’s largest battery facility is a clean energy setback, New Scientist, 17-Jan-2025

Of course, as we become more reliant on batteries for storage, there is an increased danger from disasters. A fire at Vistra Energy's Moss Landing battery storage facility in California has caused significant damage, destroying thousands of lithium batteries and 300 megawatts of energy storage capacity. That is quite a bit bigger than a Tesla car fire.

This is not new, of course; the problem was described in a 2023 article in Wired. Despite a 97% reduction in battery-related failures globally since 2018, the loss of such a significant storage capacity is concerning for California's renewable energy goals. The reconstruction of the facility could take years, complicating the state's efforts to reduce fossil fuel dependence. So take a chunk out of that "renewables" line in the graph from the last article.

What about devices without batteries?

Imagine using a health bracelet that tracks your blood pressure and glucose level that you do not have to charge for the next 20 years. Imagine sensors attached to honeybees helping us understand how they interact with their environment or bio-absorbable pacemakers controlling heart rate for 6–8 months after surgery. Whether submillimeter-scale “smart dust,” forgettable wearables, or tiny chip-scale satellites, the devices at the heart of the future of the Internet of Things (IoT) will be invisible, intelligent, long-lived, and maintenance-free. Despite significant progress over the last two decades, one obstacle stands in the way of realizing next-generation IoT devices: the battery.
The Internet of Batteryless Things, Communications of the ACM, 21-Feb-2024

I'm pretty sure I'm not ready for "smart dust", but this article got me thinking about the potential of batteryless, energy-harvesting systems that could someday surround us. As described in earlier articles in this Thursday Threads, there are environmental challenges posed by traditional batteries, including their limited lifespan and harmful manufacturing and disposal processes. "Batteryless" is as much about how these IoT devices will get power as it is about how programming them will require a different mindset.

This Week I Learned: It takes nearly 3¢ to make a penny, but almost 14¢ to make a nickel

FY 2024 unit costs increased for all circulating denominations compared to last year. The penny’s unit cost increased 20.2 percent, the nickel’s unit cost increased by 19.4 percent, the dime’s unit cost increased by 8.7 percent, and the quarter-dollar’s unit cost increased by 26.2 percent. The unit cost for pennies (3.69 cents) and nickels (13.78 cents) remained above face value for the 19th consecutive fiscal year.
2024 Annual Report, United States Mint

I knew pennies cost the U.S. Mint more than one cent to make, but I didn't realize that the cost of nickels is so much more out of whack. I also learned a new word: seigniorage — the difference between the face value of money and the cost to produce it.

What did you learn this week? Let me know on Mastodon or Bluesky.

Mittens and Pickle want to go on a trip

Tiled floor with clothes scattered around and an open suitcase. A black tuxedo cat is nestled among the clothes inside the luggage. Another black cat is nearby, observing.

My wife came back from a trip last week. Do you see the black tuxedo cat among the clothes in the luggage? Pickle sure wants to go along on the next trip. It looks like Mittens would be happy to close the suitcase and send Pickle on her way.

Zotero's Library Lookup list / William Denton

I was very happy to be notified today that a bug I reported to Zotero was fixed. Its list of OpenURL resolvers had a North America section, within which were all the United States libraries, plus some Canadian, and then more Canadian libraries were filed under Canada. This was not right. Now it’s fixed, and within North America there are two lists, for Canada and the United States.

I hope Mexico appears there soon—maybe a librarian there, or some Spanish-speaking librarian elsewhere, will send in a URL. And then maybe South America will follow.

This is what it looks like now (shown in operation on my demonstration Zotero account, coloured with a Solarized Light theme):

Screenshot of Zotero, with some collections along the left, showing the Settings > General > Library Lookup configuration being acted on. Resolver is broken down by continent, and North America is Canada and United States. Under Canada is a long list of Canadian universities.

Many thanks, as always, to all the Zotero people for their work on this excellent program.

Open Data Day 2025 – Mini-Grants Open Call / Open Knowledge Foundation

The Open Knowledge Foundation (OKFN) is excited to announce the launch of the Open Data Day 2025 Mini-Grants Application to support organisations hosting open data events and activities across the world.

This year we are running two separate calls to reflect the different interests in our community. The first call is for the general community, and the second is specifically for activities in French-speaking countries in Africa. See details below:


ODD25 General Open Call

This call is open to any practices and disciplines carried out by open data communities around the world – such as hackathons, tool demos, artificial intelligence, climate emergency, digital strategies, open government, open mapping, citizen participation, automation, monitoring, etc. Applications from Francophone African communities will be automatically added to a specific call (see details below).

A total of 22 events will be supported with a grant amount of USD 300 each, thanks to the sponsorship of the Open Knowledge Foundation (OKFN) and Datopian.


ODD25 Francophone Africa Call

This call is specifically seeking to promote events happening in French-speaking countries in Africa. It’s open for any kind of practice, just like the General Call above, but it must take place in one of the following countries: Democratic Republic of Congo (DRC), Madagascar, Cameroon, Ivory Coast, Niger, Burkina Faso, Mali, Senegal, Chad, Guinea, Rwanda, Burundi, Benin, Togo, Central African Republic, Republic of the Congo, Gabon, Djibouti, Equatorial Guinea, Comoros, and Seychelles.

A total of 14 events will be supported in Francophone Africa with a grant amount of USD 300 each, thanks to the sponsorship of the Communauté d’Afrique Francophone des Données Ouvertes (CAFDO).


Who and How to Apply

The deadline for both applications is 23rd February 2025. The call welcomes all registered organisations across the world interested in hosting in-person open data events and activities in their country. An individual cannot apply, and the events cannot be virtual or online.

The events supported by the grants should take place during Open Data Day, from 1st to 7th March 2025, and must be registered at the Open Data Day website. We encourage proposals that engage in some way with this year’s thematic focus, “Open Data to Tackle the Polycrisis”.

Registered civil society organisation events supported by the mini-grants cannot be used to fund government events. The grant payment will be transferred to the successful grantees after their event takes place and once the Open Knowledge Foundation team receives a draft blog post about the event.


Selection Criteria

The submitted events will be assessed by an organising committee made up of members from each sponsoring organisation. Applications will be blind-reviewed and given a score according to the following criteria.

  • Novelty/creativity of the proposal
  • Community aspect: to what extent the proposal promotes community involvement (especially local communities)
  • Achievability of the activity and level of commitment of the organisers when writing the proposal
  • Diversity in terms of geography, gender, and type of activities
  • Alignment with Open Data Day 2025 thematic focus

* Organisations taking part in Open Data Day for the first time will receive an extra point, as will organisations that have not received mini-grants in past editions of the event.

The winning organisations will be contacted from 24th February. The official announcement of the list of grantees will be made on 26th February.


About Open Data Day

Open Data Day (ODD) is an annual celebration of open data all over the world. Groups from many countries create local events on the day where they will use open data in their communities. ODD is led by the Open Knowledge Foundation (OKFN) and the Open Knowledge Network.

As a way to increase the representation of different cultures, since 2023 we offer the opportunity for organisations to host an Open Data Day event on the best date over one week: this year between March 1st and 7th. In 2024, a total of 287 events happened all over the world, in 60 countries using 15 different languages.

All outputs are open for everyone to use and re-use.

For more information, you can reach out to the Open Knowledge Foundation team by emailing opendataday@okfn.org. You can also join the Open Data Day Google Group or join the Open Data Day Slack channel to ask for advice, share tips and get connected with others.

Strava Verse / Eric Hellman

strava route that looks like an elephant
The internet gives us new ways to express ourselves. One of the more strenuously esoteric forms of artistic expression is Strava art, in which people do runs that, when mapped, draw pictures. None of my strava art was particularly good, but my running club friends in Stockholm regularly run "elefanten". I spent a year attempting "Found Strava Art", where you just run a new route and give the run a name based on what it looks like. I ran a lot of flowers and space ships, but meh. Last year I named each run with a line of a song that came up on my iPod. Too obscure.

This year I decided to serialize poems with my Strava runs. I didn't have a plan, but I started with Jabberwocky. It seemed appropriate to comment using nonsense words, because, Jabberwocky. I ended up with this:

’Twas brillig, and the slithy toves did gyre and gimble in the wabe
I love running with my slithy toves!
All mimsy were the borogoves, and the mome raths outgrabe.
My right knee was a grobble mimsy today, but mome what a rath!  
Beware the Jabberwock, my son!
Also, the Jabberrun can be hard on the knees.
The jaws that bite, the claws that catch!
ERC hosted run had quiche to bite and George to catch.

He took his vorpal sword in hand
New York Sirens game. Women with vorpal sticks. Slain by the Charge 3-2.
Beware the Jubjub bird, and shun the frumious Bandersnatch!
Definitely well salted and frumious out there today.
Long time the manxome foe he sought
But quick the manxless chill he caught
So rested he by the Tumtum tree
Covered with snow in filagree
And stood a while in thought.
Though clabbercing in a profunctional dot!

And, as in uffish thought he stood
Trolloping thru the Brookdale wood.
The Jabberwock, with eyes of flame
Cheld and hord, a glistering name…
Came whiffling through the tulgey wood
And caught the two burblygums because he could.
And burbled as it came!
So late the Jabberrun slept
For Eight Muyibles passed as though aflame
O'er Curbles and Nonces the pluffy sheep leapt.

One, two! One, two! And through and through
Three four! Three four! Sankofa’s coffee’s fit to pour.
The vorpal blade went snicker-snack!
The Icebeest of Hoth kept blobbering back.
He went galumphing back.
He left it dead, and with its head
... the Garmind sprang to life

And hast thou slain the Jabberwock?
The ice, the snow, it's hard as rock.
Come to my arms, my beamish boy!
Think of my knees! Oy oy oy oy.
O frabjous day! Callooh! Callay!”
O jousbarf night! The fluss! The fright!
He chortled in his joy.
(And padoodled the rest of the way!)

‘Twas brillig and the slithy toves
Did not, had not, could not loave.
Did gyre and gimble in the wabe
“Dunno.” said the wormly autoclave
All mimsy were the borogoves,
Again and again, beloo and aboave
And the mome raths outgrabe.
The end. Ooh ooh Babe!

Terrible, right? But it has its moments.

I've started a new one. I fear it will get more topical.


Some important library values / John Mark Ockerbloom

In challenging times, it’s good for organizations to remember what they exist to do, and what values drive what they do. They may be expressed in a variety of ways, but there are often common threads going through them. There are lots of library mission statements, for instance, including the one for the university library where I work, the public library for the city where I live, and the library that Congress funds for the American people. Their three statements are all worded differently, but they all involve engaging with the communities they serve to provide access to knowledge and promote learning and creativity.

Carrying out that mission is easier said than done. In my last post, I linked to a page the American Library Association posted with core values that help us keep focused on our missions. In this post, I’d like to draw attention to an overlapping but slightly different set of values, values that some have recently called into question but that are crucially important to what libraries do.

Good libraries are diverse. We have to be, to do our jobs well. Our communities are diverse, with all sorts of ages, backgrounds, education levels, ethnicities, language and expressive skills, genders, faiths, interests, and needs for knowledge. To serve them all, we need to have collections that reflect and serve the diversities in our communities, and in those who come into our communities. We need to have staff that have the knowledge and rapport to effectively serve our diverse communities. And we need them to create and support programming that meets our communities’ needs.

Good libraries are inclusive. We can’t serve our communities well in their full diversities if we don’t make a conscious effort to ensure we’re including everyone in those communities as best we can. A town’s public library might have a rich and diverse collection of English-language books for preschoolers and their parents, for instance, but if there’s little in its collections or programs for school-age children, young adults, retirees, or readers of non-English languages, it’s not doing its job as well as it should.

Good libraries are accessible. Libraries won’t be inclusive just by our saying they are. When we invite everyone to use our libraries, we have to make that invitation meaningful by ensuring everyone can reasonably and fairly access them. If we really mean to be inclusive for seniors, for example, we need to make sure that the many seniors who have problems with stairs or small type can use the facilities, websites, and books that our library provides. We need to make sure that our community members who don’t read English well have access to books that they can read, in the languages they know, as well as books that will help people learn English and the other languages used in our communities. When we fail at accessibility, we fail at inclusion.

Good libraries are equitable. Equity is important in its own right as a standard of fairness, and also for ensuring and balancing the other values noted above. Not every specific part of a diverse and inclusive library will be for everybody, or should be for everybody. A book about how to work with the Medicare system will generally not be of use to a preschooler, for instance. Nor should it be if it’s going to effectively serve the needs of the retirees the book is meant for. Likewise, an alphabet rhyming book is unlikely to be of interest to a doctor with no particular interest in children. Similarly, some of the specific programs and initiatives that libraries undertake will be of more use and interest to some parts of their communities than others.

An equitable library ensures that its collections and programs, taken as a whole, fairly balance the needs of the various constituencies in its community. As part of that fair balance, an equitable library also takes into account existing inequities and other deficiencies present in its community, and does its part to alleviate them. A library serving a community with higher than usual unemployment, for example, might devote more resources than other libraries might towards materials and programs that help people get jobs. It might also give special attention to parts of the community that have particularly high unemployment rates.

Good libraries reaffirm and clarify their values when challenged. This can be hard to do sometimes. Some people claim that programs involving diversity, inclusion, accessibility, or equity (or various rearrangements or acronyms of those words) are unjustly discriminatory or illegal. If one of our community members comes to us with a concern like that, it may well be worth listening to. It’s certainly possible to imagine illegal or discriminatory actions being taken under the cover of “DEIA”. It’s also certainly possible to imagine illegal or discriminatory actions being taken under the cover of “combating DEIA”. In either case, we need to make sure that our libraries act in a way that serves our communities fairly, and in line with our values (including the four that I explain above). Putting the word “equity”, say, in big letters on our website does not in itself make us equitable. Nor does removing the word from our website. But explaining what we mean by equity, and putting what we explain into action, can.

Actions mean more than words themselves, but words themselves can be important actions. The keepers of libraries have particular reason to be aware of the power of words, since we’re stewards of so many of them. Words can be promises, both explicit and implicit, and when we speak, others hear what we say, and what we don’t say, and expect us to live up to what they hear. We may shy away from some words when we’re worried about what people who give us funds or support may think about them. But when we do, the people in the communities we serve may also hear our new words and draw their own conclusions about them. In our words and actions, we can decide to protect our institutions with the powerful as best we can. Or we can decide to serve our communities in accordance with our missions and values as best we can. Sometimes those aren’t the same choice.

CHAOSScon 2025: Key Takeaways on Open Source Health and Metrics / Open Knowledge Foundation

On 30 January I was lucky to attend CHAOSScon 2025 in Brussels, which brought together open source practitioners, researchers, and community leaders to discuss the latest developments in measuring and improving open source software (OSS) health. This year’s sessions covered key topics like defining open source sustainability, tracking contributions, assessing community health, and evaluating project risks. Below is a recap of the main sessions and insights shared throughout the event.

The conference kicked off with an overview by Daniel Izquierdo of CHAOSS (Community Health Analytics for Open Source Software) and its tools for tracking OSS health. Key takeaways included:

  • Metrics are essential for assessing the maturity of OSS projects.
  • GrimoireLab 2.0 offers new capabilities for analyzing software development, including historical data tracking, GDPR-compliant identity management, and a business-layer integration for commercial services.
  • Major OSS foundations and corporations leverage GrimoireLab for their open source health assessments.

CHAOSScon also marked the launch of the CHAOSS Education Program, designed as a structured entryway into open source. Dawn Foster and Peculiar C. Umeh presented the three courses developed by CHAOSS:

  1. Open Source 101: Helping newcomers navigate OSS and find their contribution niche.
  2. CHAOSS governance and operations: Educating users on how the organization works.
  3. Practitioner guides for project managers, OSPOs, and community leaders.

The courses are hosted on Moodle and are designed for both CHAOSS community members and general OSS learners.

Ruth Ikegah then shared that Diversity, Equity, and Inclusion (DEI) remain challenges in OSS. Through her work with local chapters she observed that:

  • 49%+ of OSS content is in English, creating barriers for non-English speakers.
  • Cultural differences necessitate localized approaches to inclusion.
  • Challenges like internet access, financial constraints, and lack of OSS education in formal curricula hinder participation from non-Western countries.
  • We need better strategies for engagement. Some examples she shared are: badging systems, funding, mentorship, and recognizing future leaders.

Paul Sharratt and Cailean Osborne presented a toolkit for measuring how public funding affects OSS sustainability. Some critical points included:

  • OSS is digital infrastructure, and funding models affect long-term viability.
  • Different funding types lead to varying levels of impact.
  • Models for assessing public investment effectiveness in open source.

If you are interested, the preprint is available: arxiv.org/abs/2411.06027.

Katie McLaughlin addressed a quite well-known problem: open source projects often struggle with recognizing contributions beyond code. She therefore highlighted the need for a standardized taxonomy for OSS contributions, as many contributions are still invisible today (e.g., documentation, community engagement).

In an attempt to explore equitable credit systems in OSS, they launched whodoesthe.dev, focused on understanding open source ecosystems.

Daniel S. Katz then presented CORSA (Center for Open Source Research Software Advancement), an initiative aiming to support open-source research software projects through foundations and metrics in order to improve their sustainability.

Sustainability is a tricky word, because it has so many different meanings. There is a technical sustainability, but also a financial and organizational one (and of course, an environmental sustainability too). Daniel advocated for metrics as a key element to understand the status of a project, and, in the case of financial sustainability, an excellent way to showcase success and attract funding.

Financial sustainability of course affects all the other sustainabilities too, as community engagement and long-term viability require structured support mechanisms.

Security and risk analysis were big topics at CHAOSScon (and at FOSDEM too). As Georg Link explained, this is very much linked to the project health: unmaintained or poorly maintained FOSS dependencies pose security threats, and as FOSS is an integral part of any modern software (over 80 percent of the software in any technology product or service is open source, according to a Linux Foundation study from a couple of years ago!), understanding risks is crucial. The Software Bill of Materials (SBOM) helps track and manage dependencies. Key risk indicators include median response time to pull requests and issue resolution speed. One thing is clear: maintaining project activity and engaging contributors helps mitigate risks.
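As a toy sketch of how one of those risk indicators might be computed (the timestamps and helper function below are invented for illustration, not taken from any CHAOSS tool), median first-response time to pull requests boils down to a median over waiting times:

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical sample: (PR opened, first maintainer response) timestamp pairs.
pull_requests = [
    (datetime(2025, 1, 6, 9, 0), datetime(2025, 1, 6, 15, 30)),
    (datetime(2025, 1, 8, 11, 0), datetime(2025, 1, 10, 8, 0)),
    (datetime(2025, 1, 12, 14, 0), datetime(2025, 1, 12, 16, 45)),
]

def median_response_hours(prs):
    """Median time, in hours, between a PR being opened and its first response."""
    waits = [(responded - opened) / timedelta(hours=1) for opened, responded in prs]
    return median(waits)

print(f"Median first-response time: {median_response_hours(pull_requests):.1f} hours")
```

The median, rather than the mean, keeps one long-neglected PR from dominating the indicator, which is why it shows up as the preferred summary statistic here.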

Another project that can help in risk analysis is OpenChain, which helps developers assess compliance in their software components, using a capability model to grow community excellence. Measuring compliance contributes to risk assessment and regulatory alignment.

The OpenChain tools are available on GitHub for developers to evaluate maturity models.

Katherine Skinner gave a keynote in which she explored the importance of defining values in open infrastructure projects and aligning community values with decision-making to strengthen resilience. Katherine introduced the FOREST framework for values-driven evaluation, emphasizing that human metrics can help reverse-engineer assessments by letting communities define the values they stand for. Additionally, she discussed the challenge of making FOSS needs visible to funders and stakeholders in a way that highlights their significance without discouraging adoption.

Conclusion

CHAOSScon 2025 reinforced the importance of defining, measuring, and sustaining OSS health. Key themes included the role of education, local empowerment, equitable contribution recognition, and risk management. As open source continues to evolve, these discussions provide a roadmap for ensuring sustainability, security, and inclusivity. For further insights, you can access all the presentation slides from CHAOSScon 2025 here, and join the CHAOSS community too!

Open Source Policy and Europe’s Digital Sovereignty: Key Takeaways from the EU Open Source Policy Summit / Open Knowledge Foundation

The 2025 EU Open Source Policy Summit convened policymakers, industry leaders, and open source advocates to discuss the critical role of open source in Europe’s digital future. With the EU entering a new legislative cycle, the discussions focused on how open source can enhance digital sovereignty, competitiveness, and innovation. I was particularly impressed by the first panel, *What Can Open Source Do for Europe?*, which featured Cristina Caffarra (CEPR – Centre for Economic Policy Research), Peter Ganten (Univention GmbH), Adriana Groh (Sovereign Tech Agency), Pearse O’Donohue (European Commission), and Amandine Le Pape (Element), moderated by Astor Nummelin Carlberg (OpenForum Europe).

Cristina Caffarra set the tone with a bold critique of Europe’s reliance on Big Tech: “We are colonized by Big Tech, with 90% of our infrastructure owned by them.” They make us believe the stack they give us is sovereign because they are here; and they are indeed here, but they pay no taxes and make zero contributions. In all of this, the EU concentrates its efforts on regulating, but what exactly are we regulating for? Where does that lead us? Regulation is not creating alternatives or redistributing power, and regulation alone is not enough to ensure digital sovereignty. Instead, Europe must actively build and invest in its own digital infrastructure. “Do the demo instead of the memo,” she said.

Adriana Groh advocated for making the invisible visible, telling things as they are: what infrastructure do we rely upon, and what happens if we lose it? “Sovereign technology is not a nice to have, but a must have, because we use it every day.” We have to secure the foundation of our stack well before starting to talk about innovation. “The innovation fetish we have is not healthy. We need to maintain what is there. This is not a political party issue, this is not right or left, this is common sense.” This last statement particularly resonated with me and made me think of the maintenance panel we had at The Tech We Want Summit.

And indeed, we need to ensure the security of the software, the data, and the project itself. So much is built and then not maintained. But why do we even build stuff if it is not going to be maintained? Pearse O’Donohue pointed out that the problem is linked to the structure and models of current funding systems: we fund a project for a limited number of years, and then gently disappear. Unlike proprietary software companies that generate ongoing revenue through licensing, open source projects often struggle to secure long-term funding, leading to maintenance gaps and security risks. This issue of sustainability is a major barrier to Europe’s ability to control its own technological destiny. There is also a key contradiction, highlighted by Amandine Le Pape, in how governments approach software procurement: despite the fact that on paper the EU pushes strongly for open source first and open source by default, our governments and public administrations are still paying for proprietary software, basically funding our external dependencies.

The truth is also that sovereignty is not going to be an incentive for business anywhere; business will just go for the cheapest option. We need a coordinated action between policymakers, businesses, and the open source community pushing for a Eurostack, a European technology ecosystem that prioritizes open source, interoperability, and security. The time for action is now. The decisions made in this new legislative cycle will shape Europe’s digital sovereignty for years to come. Open source is not just an option—it is essential for Europe’s future. 

The summit reinforced that Europe’s digital future depends not only on open source adoption but also on a fundamental shift in policy, investment, and mindset. With the right frameworks in place, open source can drive sustainable, secure, and inclusive technological progress across the continent.

Security was another buzzword at the Summit. However, as discussed at CHAOSScon, security is directly tied to the health of open source projects, and maintenance remains an invisible yet critical cost. Regulation should encourage upstream contributions to ensure that the backbone of digital infrastructure remains strong.

If we want to attract funding, Claire Dillon (CURIOSS) pointed out, we need to communicate the value of open source better and more effectively. We have to do a much better job at explaining what we do and why it matters. While proprietary software companies invest heavily in marketing, FOSS struggles with visibility. And yet, FOSS is literally a component of anything you do from the moment you wake up to the moment you go to sleep. As a movement, we have to get better at storytelling (something we also discussed with the Open Knowledge Network during our last gathering in Katowice).

Open source provides a fundamental economic advantage by preventing duplication of costs and efforts, making software development more efficient and collaborative. Shared maintenance costs, rather than repeated investments in proprietary solutions, could lead to more sustainable infrastructure.

There is a gap in the European funding landscape. Gabriele Columbro (Linux Foundation Europe) remarked that while funding is available for early-stage open source projects, that seed investment is not matched with a rich ecosystem of businesses built on FOSS that can then take on those projects.

This disconnect leaves many promising projects without the necessary support to transition into long-term, self-sustaining businesses, limiting the full potential of open source innovation in Europe.

Another hot topic was Digital Public Infrastructure (DPI) and Digital Public Goods (DPGs). Sunita Grote (UNICEF Ventures) pointed out that despite it being referenced more and more, we still lack a clear definition of what a DPI is, and DPGs do not always require the stack to be open source. Another key concern is that many of these initiatives remain dominated by the usual industry players, from the same geographies, backed by the same funding sources.

While pushing for European digital sovereignty, there is a need to ensure that the “Eurostack” remains open to broader contributions from new emerging places, avoiding exclusivity. Instead of ownership, the focus should be on pushing the European shared values: transparency, collective governance, and an open roadmap. At the end of the day, one lesson that open source teaches us every day is that we don’t need to own everything, we can also just be the home.

In praise of middle management / Jonathan Brinley

Inspired by a recent post from Sam Altman (2025-02-09):

We are now starting to roll out AI agents, which will eventually feel like virtual co-workers.

Let’s imagine the case of a software engineering agent, which is an agent that we expect to be particularly important. Imagine that this agent will eventually be capable of doing most things a software engineer at a top company with a few years of experience could do, for tasks up to a couple of days long. It will not have the biggest new ideas, it will require lots of human supervision and direction, and it will be great at some things but surprisingly bad at others.

Still, imagine it as a real-but-relatively-junior virtual coworker. Now imagine 1,000 of them. Or 1 million of them.

Having had some “real-but-relatively-junior” coworkers, I’m familiar with how much supervision and direction they need to contribute to a product/company/team. I invest X units of my time in them; they produce X * Y units of output. If all goes well, Y > 1; reality often falls short.

There’s only so far you can scale with one person directing a team of juniors. I can work with a few; some other engineers may be able to direct a few dozen. Eventually, you get to the point where you’re spending all of your time supervising and none of it producing that output yourself. Congratulations, you’re a manager. (Yes, this is a gross oversimplification of the manager’s role.)

Is it productive for a CEO to directly supervise and direct a thousand (or a million) junior employees? I imagine few (if any) would think so. So our CEO hires managers, who hire managers, who hire managers… who supervise and direct a reasonable handful of junior employees (often mixed in with senior employees who can take an active role in mentoring the juniors). These managers coordinate their teams with all the other teams to achieve the company’s vision, with the coordination challenges growing geometrically as the company scales.
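One common way to make that coordination growth concrete (my illustration, not from the original post) is to count pairwise communication channels: with n teams there are n(n-1)/2 possible channels, so the coordination burden grows roughly quadratically even while each individual manager's span of control stays small:

```python
def coordination_paths(n):
    """Number of pairwise communication channels among n teams: n choose 2."""
    return n * (n - 1) // 2

# Doubling the number of teams roughly quadruples the channels to coordinate.
for teams in (2, 5, 10, 50):
    print(f"{teams:>3} teams -> {coordination_paths(teams):>5} channels")
```

This is the same arithmetic behind Brooks's observation about communication overhead on growing teams, and it is why the management hierarchy described above emerges: each layer caps how many channels any one person has to maintain.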

I’ll not predict whether/when AI may eventually replace senior employees or managers. But until it does, good management is our bottleneck for scaling up our “real-but-relatively-junior” armies of virtual coworkers.

Building connections with publishers to bridge the OA discovery gap / HangingTogether

Hortus Botanicus Amsterdam, Bridge 233. Agnes Monkelbaan, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons.

As a member of OCLC’s Publisher Relations Team, my colleagues and I serve as intermediaries, representing the needs of libraries in the publishing world and vice versa. This can feel like mediating between squabbling roommates or sometimes like pairing up lost gloves—a perfect fit, a meant-to-be. While it may be tempting to see only a dichotomy, two sides of a story, a chasm to be bridged (I could go on with the metaphors), we choose to focus on the commonalities between libraries and publishers, building connections instead of dwelling on the gap.  

Much easier said than done. 

Occasionally, we get the opportunity to put our idealistic views to the test and bring the publisher’s perspective to a seemingly library-specific issue, as we did in the recent OCLC Research report Improving Open Access Discovery for Academic Library Users. I consulted with the report’s authors throughout the research process, offering alternative readings from a non-library view. This project shed light on common challenges, emphasizing that no single entity can tackle the enormity of open access discovery on its own. The outcome is a report that is valuable to open access (OA) stakeholders outside the library, as well.

“Truly improving the discoverability of OA publications requires all of the stakeholders involved to consider the needs of others within the lifecycle.”  

Libraries, publishers, technology providers, and aggregators all play a role in the lifecycle of OA publications. These many OA content workflows and responsibilities don’t exist in a silo, but rather integrate with and augment each other. Focusing solely on the library’s role would overlook the assistance and efficiencies offered to the library by the others. As the penultimate line of the report states, “Truly improving the discoverability of OA publications requires all stakeholders involved to consider the needs of others within the lifecycle.”

Although the report is primarily aimed at librarians, the authors have thoughtfully identified significant findings from the study and emphasized key takeaways for publishers and other non-library stakeholders. These important points are visually highlighted using magenta boxes throughout the report. However, to provide even greater clarity for this specific audience, I have extracted a few of the most relevant points under the categories of Metadata, Access, and Trust, and offered brief explanations of their significance.  

Metadata 

Consider the metadata that library staff identified as important for the discoverability of OA publications when identifying potential improvements in how metadata can be created, shared, harvested, and displayed.

We all (should) know that metadata is important for all content discovery. Metadata used to discover traditional publications, such as author name, title, abstract, keywords, journal name, volume, issue, publication date, and subject, is also important for the discoverability of OA publications. However, librarians also stated an additional need for persistent identifiers such as ISSN, ISBN, DOI, ISNI, ORCID, and ROR, to allow library systems to potentially make linkages between resources to further aid discovery and “reduce confusion among their users” (Improving Open Access Discovery, 13). Moreover, adding persistent identifiers into metadata for all types of content is good practice, as the recent report recommending a US national PID strategy describes. Not only do PIDs improve discovery, but they also support interoperability and automation, reduce administrative burden, improve research assessment and research integrity efforts, and save money when widely adopted.1 So, if you have any PIDs, add them in. If not, go get one… and add it in.
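As a hypothetical illustration of what “adding your PIDs in” can look like at the record level (the field names and identifier values below are illustrative only, not a formal metadata schema), a simple completeness check makes it easy to see which persistent identifiers a record still lacks:

```python
# Hypothetical article-level record; field names and values are illustrative.
record = {
    "title": "An Example Open Access Article",
    "journal": "Journal of Examples",
    "publication_date": "2025-01-15",
    # Persistent identifiers that let library systems link related resources:
    "doi": "10.1234/example.2025.001",   # identifies the publication itself
    "issn": "1234-5678",                 # identifies the journal
    "orcid": "0000-0002-1825-0097",      # identifies the author
    "ror": "https://ror.org/00example0", # identifies the author's institution
}

def missing_pids(rec, wanted=("doi", "issn", "orcid", "ror")):
    """Report which persistent identifiers a record still lacks."""
    return [pid for pid in wanted if not rec.get(pid)]

print(missing_pids(record))
```

A publisher could run a check like this over an export before sending metadata to aggregators, catching records that would otherwise surface in search results with no linkable identifiers at all.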

Users evaluate resources concurrently and iteratively as they search and access them. Both metadata and system capabilities need to support these simultaneous processes of discovery, evaluation, and use.  

The absence or presence of metadata is the differentiator for discovery. Librarians asked that publishers include metadata about the use of peer review, publication version, and OA status (through the inclusion of license information) to help systems differentiate the content and enable users to better evaluate their resources of choice. OA publications may have multiple versions (such as the version of record, author accepted manuscript, or preprint), and these versions may be aggregated across various repositories, with only metadata available to differentiate them in search results. The completeness of the provided metadata will influence users in selecting a version. For instance, if one version clearly indicates that the content has undergone peer review and is the OA version of record, users may be inclined to choose it over a result with minimal context. 

OA content that isn’t discovered doesn’t get used, and OA content that doesn’t get used doesn’t get supported by libraries. Therefore, publishers should provide the most complete metadata possible about OA content as well as partner with library staff to understand what metadata they would like to receive and help “authors understand the role that quality metadata plays in the discovery of their work.” (Improving Open Access Discovery, 35)  

To find out more about how publishers can better create metadata about open access books, see EDItEUR’s application note “Open Access Monographs in ONIX” (both text and video).  

Ensure that article-level metadata is provided by all publishers, regardless of size. This makes it easier for library staff to add these OA publications to their collections to meet users’ needs.  

Even the smallest OA publishers of either books or journals should ensure their metadata is thorough and shared with trusted aggregators such as DOAB or DOAJ. These open access platforms and others provide aggregated metadata to libraries in standardized formats that allow systems providers to efficiently index the content. These aggregators may require publishers to apply and be accepted before adding their metadata and content to the repository, but they also provide general advice to all publishers on creating high-quality metadata. Seek it out and follow it. These aggregators see metadata of all levels of quality every day and know what works.

When adding OA publications to knowledge base collections, clearly name the collection and identify what types of OA resources are in the collection and how much of it is OA. Provide this information consistently to help libraries identify the content they are looking for within the potentially duplicated records. 

Knowledge bases are largely managed through the use of KBART (Knowledge Bases And Related Tools) files. NISO provides more detailed information about the KBART format, but in general, the KBART file is the special sauce that keeps the metadata record connected to the hosted content and the library’s catalog. This very library-centric data format can sometimes prove mysterious to publishers who focus on title- or article-level metadata, but it is crucial in making collection development and management workflows run smoothly for libraries.

To help understand KBART, imagine a shipping container full of individual copies of physical books being sent to a warehouse. There is a shipping manifest pasted to the outside of the container that lists what books are found within. Without this manifest, the warehouse staff would have no idea what they are dealing with when they open the container, resulting in inefficient check-in processes and some unhappy staff. The KBART acts as this shipping manifest, itemizing the contents of a publisher’s digital collection and allowing the libraries to expediently add it to their catalog.  
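To make the shipping-manifest analogy concrete, here is a minimal sketch that reads a KBART-style tab-separated file. The column subset and titles are invented for illustration; real KBART files follow the full column set in NISO's KBART recommended practice, where `access_type` uses "F" for free-to-read and "P" for paid content:

```python
import csv
import io

# A simplified KBART-style file (tab-separated). Real files carry ~25 columns.
kbart_tsv = (
    "publication_title\tonline_identifier\ttitle_url\tpublication_type\taccess_type\n"
    "Journal of Examples\t1234-5678\thttps://example.org/joe\tserial\tF\n"
    "Example Monograph\t978-1-0000-0000-0\thttps://example.org/mono\tmonograph\tF\n"
)

def open_access_titles(tsv_text):
    """Return the titles flagged as free-to-read (access_type 'F')."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row["publication_title"] for row in rows if row.get("access_type") == "F"]

print(open_access_titles(kbart_tsv))
```

This is essentially what a library's knowledge base does at scale when it unpacks a publisher's "container": it walks the manifest row by row and connects each title to the catalog, which is why complete, consistently labeled KBART files matter so much.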

As the OA Discovery report points out, the choice to add OA content to a library catalog is not necessarily simple. Librarians weigh many factors when considering this work. So, make their job easier by providing thorough and consistent KBART files, and name your collections as clearly as possible. If the collection contains only open access content, then please say that in the collection name. Otherwise, the collection might be overlooked for consideration. Unclear labeling can lead to unhappy librarians, which is something nobody wants. 

Access 

Providing seamless authentication to content behind a paywall saves users and library staff time and effort.  

Studies about information-seeking behaviors always get librarians excited, and the Open Access Discovery report doesn’t disappoint. You can see complete details within the report, but it finds, among other things, that users are most likely to search for scholarly peer-reviewed content first on a search engine, with the library catalog coming in third. The publisher’s website was further down the list in fifth place. So going straight to the publisher is not a common practice.

After users navigated their search results and tried to access the digital full text they were seeking, they often faced barriers that had a negative impact on their experience. These barriers included the requirement for payment, unavailability through their library, and the need to log in—three barriers that are directly related to the traditional paywalled access model for scholarly publications.   

Publishers should take note that the users’ most common response when hitting the barrier was to seek an OA version of that content. While logging in and accessing the content behind the paywall was a close second, it is significant that users chose to instead pivot to another version of the content. Of course, it’s also worth noting that users were more likely to give up on the content altogether than to ask a librarian for help or find the content in a physical format. 

Bar chart titled “Actions Most Likely Taken When Unable To Access Full Text,” showing the count of respondents for different actions. The most common actions are looking for an OA version (276), logging in with credentials (270), and looking on research sharing sites (231). Other actions include giving up (182), contacting the author for a copy (129), asking a librarian for help (96), requesting an interlibrary loan (62), and using a physical or print item (50). The total number of respondents was 423, and users could select multiple actions; categories with fewer than 50 responses were not included. Source: Improving Open Access Discovery for Academic Library Users, Figure 10, p. 29, https://doi.org/10.25333/4xem-xr80

Publishers should support seamless authentication to their paywalled content to not only save the user and librarian time and effort, but to also ensure that the content they host is being used and found of value. Users don’t care about the business model supporting the content. They just want ready access to the content they are seeking. After taking the effort to make content discoverable, publishers need to make those last mile connections possible and support authentication through to their content, regardless of how that content is funded. 

Trust 

This takeaway was highlighted for library staff, but is relevant to publishers as well:  

Provide users more guidance about how to evaluate whether a scholarly publication is trustworthy, including reasons why it’s important to consider the journal, publisher, and author’s reputation in addition to whether the publication has been peer-reviewed.  

Trustworthiness and reputation are important. Publishers, you know this. Librarians make a choice about what OA content they bring into their collections. They do not just have an open-door policy. If you support the publication and discoverability of high-quality open access content, make sure that you also support the libraries’ collection development processes around OA by following these three recommendations:  

Be transparent. Make it easier for your reputation to be evaluated. Avoid marketing language that may sway the evaluator away from your intention. Fill your metadata with all the PIDs, funder information, and peer review information that you possibly can. 

Be helpful. The Open Access Discovery report calls on libraries to educate users on how to publish OA as well as offer more holistic instruction on OA, like how “licensing and versioning work throughout the publication lifecycle, what different publishing models mean about how OA publications are created and funded, and how to determine what OA publications are trustworthy” (Improving Open Access Discovery,  31). Support libraries by communicating about your OA efforts more broadly. OA interactions should go beyond negotiating transformative agreements and include information to support user awareness of your OA efforts. This can help foster library-wide conversations around OA and lend credence to your trustworthiness.  

Be trustworthy. The need to establish trust is a repeated refrain within this report. Trust is earned, and libraries act on this intangible feeling by analyzing your tangible actions. By being transparent, providing helpful information, and building trust with library partners, you increase the likelihood of your OA content being readily added to their catalogs.

Final thoughts 

As publishers, librarians, and discovery partners continue to navigate the evolving landscape of open access, the myriad of publishing models and methods to discover content will continue to strain the sometimes-tenuous bonds between library and publisher. But collaboration remains key. All sides of the story are really just focused on getting the right information to the right user at the right time. Ultimately, improving metadata, streamlining access, and building trust are foundational in ensuring OA content is both discoverable and valued.

But how do we measure the impact of these efforts? Usage data plays a crucial role in understanding how OA publications are accessed and utilized. In a follow-up post, we’ll delve into the significance of usage reporting, exploring how better analytics can help publishers and libraries alike make informed decisions that enhance discoverability and engagement.  

The post Building connections with publishers to bridge the OA discovery gap  appeared first on Hanging Together.

On Not Being Immutable / David Rosenthal

Economist 2/1/25
Regulation of cryptocurrencies was an issue in last November's US election. Molly White documented the immense sums the industry devoted to electing a crypto-friendly Congress, and converting Trump's skepticism into enthusiasm. They had two goals, pumping the price and avoiding any regulation that would hamper them ripping off the suckers.

Back in November of 2022 I added an entry to this blog's list of Impossibilities for The Compliance-Innovation Trade-off from the team at ChainArgos. It started:
tl;dr: DeFi cannot be permissionless, allow arbitrary innovation and comply with any meaningful regulations. You can only choose two of those properties. If you accept a limited form of innovation you can have two-and-a-half of them.

Fundamental results in logic and computer science impose a trade-off on any permissionless system’s ability to both permit innovation and achieve compliance with non-trivial regulations. This result depends only on long-settled concepts and the assumption a financial system must provide a logically consistent view of payments and balances to users.

This is a semi-technical treatment, with more formal work proceeding elsewhere.
Two years later, the "more formal work" has finally been published in a peer-reviewed Nature Publishing journal, Scientific Reports, which claims to be the 5th most cited journal in the world. Jonathan Reiter tells me that, although the publishing process took two years, it did make the result better.

Below the fold I discuss Tradeoffs in automated financial regulation of decentralized finance due to limits on mutable turing machines by Ben Charoenwong, Robert M. Kirby & Jonathan Reiter.

This team were pioneers in applying fundamental computer science theorems to blockchain-based systems, starting in April 2022 with The Consequences of Scalable Blockchains in which they showed that implementing an Ethereum-like system whose performance in all cases is guaranteed to be faster than any single node in the network is equivalent to solving the great unsolved problem in the theory of computation, nicknamed P vs. NP. And thus that if it were implemented, the same technique could break all current cryptography, including that underlying Ethereum.

But, I believe, they were not the first. That appears to have been Tjaden Hess, River Keefer, and Emin Gün Sirer in Ethereum's DAO Wars Soft Fork is a Potential DoS Vector (28th June 2016), which applied the "halting problem" to "smart contracts" when analyzing possible defenses against DOS attacks on a "soft fork" of Ethereum proposed in response to "The DAO".

Charoenwong et al's abstract states:
We examine which decentralized finance architectures enable meaningful regulation by combining financial and computational theory. We show via deduction that a decentralized and permissionless Turing-complete system cannot provably comply with regulations concerning anti-money laundering, know-your-client obligations, some securities restrictions and forms of exchange control. Any system that claims to follow regulations must choose either a form of permission or a less-than-Turing-complete update facility. Compliant decentralized systems can be constructed only by compromising on the richness of permissible changes. Regulatory authorities must accept new tradeoffs that limit their enforcement powers if they want to approve permissionless platforms formally. Our analysis demonstrates that the fundamental constraints of computation theory have direct implications for financial regulation. By mapping regulatory requirements onto computational models, we characterize which types of automated compliance are achievable and which are provably impossible. This framework allows us to move beyond traditional debates about regulatory effectiveness to establish concrete boundaries for automated enforcement.
They summarize the fundamental problem for the automation of DeFi regulation:
DeFi features some computationally challenging properties: (1) Turing-complete programming, (2) permissionless access to both transact and publish code and (3) selectively immutable code. The permissionless mutability of the system combined with the Turing completeness motivates our inquiry. A system running Turing-complete code where updates can be published permissionlessly cannot make any guarantees about its future behavior, a conclusion from early work on Universal Turing Machines (UTM).
Despite this, they show that if it is possible to enforce less-than-Turing-complete programming:
it is possible to construct both (1) classes of algorithms that can make credible promises and (2) restricted update mechanisms that enable credible promises. In other words, DeFi platforms can provide compliant services like traditional centralized providers through fully automatic mechanisms.
What exactly do they mean by "compliant"?
Consider an economy modeled as a Turing Machine, where the machine’s state corresponds to the state of the real economy. We formalize compliance as a property of system state transitions that can be verified mechanically, following Theorem 5.8.5 of Savage. Specifically, a compliant system is one where no sequence of permitted operations can result in a state that violates predefined regulatory constraints set by an external regulator.
They translate this into economic terms:
For example, if a regulation prohibits transactions with certain addresses, compliance means no sequence of permitted operations can transfer value to those addresses, either directly or indirectly. Similarly a regulator may impose requirements on intermediaries transacting in certain assets or products akin to depository receipts for those assets. Compliance would then require ensuring one does not unknowingly transact in “products akin to depository receipts” for a given list of assets.
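Their notion of compliance as a mechanically verifiable property of state transitions can be sketched in a few lines of Python. This is my illustrative toy, not code from the paper: a ledger of transfers, a regulator-defined set of banned addresses, and a check that no permitted sequence of operations delivers value to a banned address.

```python
# Illustrative sketch (not from the paper): compliance as a mechanically
# checkable property over a sequence of state transitions.

BANNED = {"0xMIXER"}  # regulator-defined constraint on acceptable states

def apply_transfer(balances, src, dst, amount):
    """One state transition: move `amount` from src to dst."""
    new = dict(balances)
    new[src] = new.get(src, 0) - amount
    new[dst] = new.get(dst, 0) + amount
    return new

def compliant(transitions):
    """A sequence is compliant iff no banned address ever receives value."""
    return all(dst not in BANNED for (_src, dst, _amt) in transitions)

history = [("alice", "bob", 10), ("bob", "0xMIXER", 5)]
print(compliant(history))  # False: value reached a banned address
```

The hard part, which the paper goes on to develop, is that this check is only mechanical when the forbidden set can be enumerated; deciding whether an arbitrary program *belongs* to a forbidden class is where computability gets in the way.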
Because they model regulated systems in terms of states and the transitions between them, they can apply results from computer science:
This formulation maps to well-known results in computability, such as the Halting Problem and the more general impossibility known as Rice’s Theorem: No algorithm exists to determine from the description of a [Turing Machine] whether or not the language it accepts falls into any proper subset of the recursively enumerable languages. In other words, we cannot categorize arbitrary programs into specific subsets automatically and reliably. In financial regulation, the canonical “proper subset” is a ban on interacting with a given address: interactions involving a banned address are forbidden, and the acceptable subset of states includes no such transfers.
They proceed to use cryptocurrency "mixers" as an example. They explain that, because any general description of "mixers" defines a "proper subset" of all programs, Rice's Theorem means there is no automatic or reliable method by which a "smart contract" can be assessed against that general description.
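To see why syntactic screening can't substitute for the semantic classification Rice's Theorem forbids, consider this toy (mine, not the paper's): a regulator's text-based "mixer detector" is trivially evaded by a behaviorally identical program with different source text.

```python
# Hedged illustration: a syntactic detector cannot decide a semantic
# property. Rice's Theorem says no general algorithm can classify
# programs by behavior; text matching certainly can't.

def naive_is_mixer(source: str) -> bool:
    # A regulator's rule of thumb: flag code that looks like pooling.
    return "mix" in source or "pool" in source

honest = "def mix(deposits): return shuffle_and_pay_out(deposits)"
evasive = "def m1x(d): return shuffle_and_pay_out(d)"  # same behavior

print(naive_is_mixer(honest))   # True
print(naive_is_mixer(evasive))  # False: identical semantics, undetected
```

Any smarter detector faces the same fate: for every mechanical classifier there is a program with mixer behavior that the classifier misfiles.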

Of course, as Justice Potter Stewart famously said, "I know it when I see it". Human regulators will have no difficulty in recognizing a "smart contract" as a mixer, if only because in order to function it needs liquidity. To attract it, the "smart contract" needs to advertise its service. So can the regulators ban interactions with the specific addresses of the mixers they recognize? Charoenwong et al make two arguments:
First, this severely limits the regulator’s power from regulating a mutable set of protocols to only specific ones. In other words, what is often called “principles-based” regulations (as opposed to rules-based regulations) are impossible. We cannot ban “mixers” generally – we can only ban “mixers A, B and C.” In some sense, this is akin to banning specific means of murder rather than simply banning murder, no matter the means.
Then they introduce the time element inherent in human regulation:
Second, and more importantly, we cannot enforce even these more straightforward rules reliably. Consider these steps:
  1. Deploy a new, confusingly-coded, “mixer” labeled X
  2. Send funds to the mixer X
  3. Withdraw from the mixer X and feed into the mixer A
  4. Withdraw from the mixer A and feed into the mixer X
  5. Withdraw from the mixer X and spend freely
This procedure works because we cannot identify arbitrary mixers, so we are free to deploy and then use them before they get put on the banned list. As a result, the regulator cannot even ban all interaction with enumerated mixers – it can only reliably ban some forms of interaction. This result is a severe limit on regulatory power.

If we consider that compliance exists in an automated form, operating on publicly available data in real-time, anyone accepting that final transfer must operate in a compliant fashion. If, instead, the plan is to decide these things later based on non-mechanical analysis, we are simply operating a conventional legal system with some more computers involved. Concretely, if that last transfer can be ruled illegal after the fact, it was never an automated financial system.
This result generalizes to services other than mixers. Rice's Theorem means it isn't possible to ban a class of services, and banning individual services identified by humans will always be behind the curve.
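The five-step evasion above can be run as a toy simulation (my illustration, not the paper's): the regulator's list can only name mixers it has already identified, so the final hop out of a freshly deployed mixer X involves no banned address at all.

```python
# Toy run of the evasion steps: a mechanical rule can only block
# transfers that name an address already on the enumerated ban list.

banned = {"mixer_A"}          # regulator's list of known mixers
trail = [
    ("user",    "mixer_X"),   # steps 1-2: deploy new mixer X, fund it
    ("mixer_X", "mixer_A"),   # step 3: feed into known mixer A
    ("mixer_A", "mixer_X"),   # step 4: withdraw back into X
    ("mixer_X", "user"),      # step 5: withdraw and spend freely
]

# Which hops would an automated, real-time rule actually block?
blocked = [(s, d) for (s, d) in trail if s in banned or d in banned]
final_hop = trail[-1]
print(blocked)                          # only the hops touching mixer A
print(final_hop in blocked)             # False: the spend looks clean
```

Even the hops touching mixer A only become blockable once A is enumerated; X is, by construction, always one step ahead of the list.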

The authors illustrate the application of their result with real-world examples of regulatory failure:
These case studies-The DAO, Beanstalk Finance, Compound, Terra/LUNA, and MakerDAO-collectively illustrate the practical manifestations of our paper’s theoretical findings. Each example demonstrates a different facet of the challenges in implementing reliable, automated compliance mechanisms in decentralized, Turing- complete systems. From governance attacks to stablecoin collapses and liquidation issues, these incidents underscore the impossibility of guaranteeing specific regulatory outcomes without compromising system flexibility or introducing external interventions.
The authors argue that there are two ways to construct a system that does allow automated regulation, by making it permissioned rather than permissionless, or by enforcing a non-Turing-complete language for the "smart contracts". In practice many cryptocurrencies are permissioned — in Decentralized Systems Aren't I pointed out that:
The fact that the coins ranked 3, 6 and 7 by "market cap" don't even claim to be decentralized shows that decentralization is irrelevant to cryptocurrency users.
There are many examples of cryptocurrency systems that claim to be decentralized but are actually permissioned. Patrick Tan described one in Binance Built a Blockchain, Except it Didn’t:
For all its claims of promoting decentralization, Binance runs two “blockchains” that are not just highly centralized, but regularly alter history, undermining one of the core tenets of the blockchain — immutability.
The authors' example of a non-Turing-complete language is:
Consider a scripting language where we cannot have variables. A simple “splitting the tab” contract might look like:
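The paper's actual listing is not reproduced in this post; a minimal Python rendering of such a variable-free contract (my reconstruction, with hypothetical names) might look like:

```python
# Sketch (my reconstruction, not the paper's listing): in a language
# without variables, every payee and amount is a constant, so the
# contract's entire behavior is readable from its source.

payments = []

def RealSendTo(address, amount):
    # Stand-in for the platform's raw transfer primitive.
    payments.append((address, amount))

def split_the_tab():
    # No variables, no branching: payees are hard-coded.
    RealSendTo("alice", 25)
    RealSendTo("bob", 25)
    RealSendTo("carol", 25)
    RealSendTo("dave", 25)

split_the_tab()
print(payments)  # four fixed payments of 25 each
```

Note that nothing in this language checks an address against any ban list; a hard-coded payee could just as easily be a forbidden one.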
This is dangerous if we have a regulation that certain addresses cannot be paid. The issues raised surrounding the DAO hack, discussed above, apply here. But what if the only way to transfer a token is to call SendTo and that function looks like:
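Again the listing is not shown here, but the idea is that the sole transfer entry point consults a regulator-maintained list before delegating to the raw primitive. A hedged Python reconstruction (mine, names hypothetical):

```python
# Sketch (my reconstruction): all transfers are forced through SendTo,
# which checks the regulator's BannedList before calling the raw
# primitive RealSendTo.

BannedList = {"0xBAD"}   # maintained by the regulator => permissioned
ledger = []

def RealSendTo(address, amount):
    # Raw transfer primitive; reachable only via SendTo by construction.
    ledger.append((address, amount))

def SendTo(address, amount):
    if address in BannedList:
        raise PermissionError(f"transfer to {address} is banned")
    RealSendTo(address, amount)

SendTo("alice", 10)        # permitted
try:
    SendTo("0xBAD", 10)    # blocked before RealSendTo runs
except PermissionError as e:
    print(e)
print(ledger)              # only the permitted transfer landed
```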
There is no issue if a function only calls RealSendTo from SendTo. In such cases, the regulator’s responsibility is to maintain the BannedList, and the system is permissioned.
Their example of a system that is not-Turing-complete is not useful, because it requires BannedList to be a constant. As they point out, if the system is to be useful, BannedList must be a variable that is updated by the regulator, and thus the system is permissioned. It may well be that, because BannedList is a variable, the system is Turing-complete after all. I can't do the analysis to determine if this is the case, but it is known that even a small number of variables makes a system Turing-complete.

Thus the paper is somewhat misleading, in the sense that it reads as if regulated systems can be either permissioned or not-Turing-complete, but it fails to provide an example of a system that is permissionless and not-Turing-complete. The example that looks as if it is going to be not-Turing-complete but permissionless seems to be permissioned and not-Turing-complete.

I would argue somewhat differently:
  • Charoenwong et al show that a permissionless, Turing-complete system cannot be regulated.
  • In the real world no-one is going to cripple their system by making it not-Turing-complete.
  • Even if a not-Turing-complete system could be built it isn't clear that it would be useful.
  • In essence, the regulatory act of enforcing that a system is, and remains, not-Turing-complete is permissioning the system.
  • Thus in practice permissionless systems cannot be regulated.
JPMorgan
While Charoenwong et al's paper is of theoretical interest, in the two years since their initial post it has been overtaken by events. Craig Coben asks Has Michael Saylor’s ‘infinite money glitch’ run into a hitch?:
Reflexivity has been MicroStrategy’s secret sauce. The company prints equity or equity-linked securities to buy bitcoin, boosting bitcoin’s price, which in turn inflates MicroStrategy valuation. This allows it to issue more stock and repeat the cycle.

This MonoStrategy has enabled MicroStrategy to acquire 2.25 per cent of all bitcoin in existence, a hoard worth around $46bn at current prices. The company trades at nearly double the value of its underlying bitcoin holdings, a testament to belief of some investors in Michael Saylor’s project.
...
As long as the stock trades at a premium to its bitcoin holdings, the company can keep issuing novel securities and finding new buyers. Meanwhile, with MicroStrategy comprising a sizeable portion of crypto inflows (28 per cent in 2024, according to JPMorgan), it has helped sustain bitcoin’s ascent.
Both our co-Presidents are heavily invested in pumping cryptocurrencies. The new administration is planning to solve the Greater Fool Supply-Chain Crisis by turning the Federal government into the greater fool of last resort by establishing the strategic Bitcoin reserve. What better greater fool than one who can print his own money? The result is that the S.E.C. Moves to Scale Back Its Crypto Enforcement Efforts, as Matthew Goldstein, Eric Lipton and David Yaffe-Bellany report:
The Securities and Exchange Commission is moving to scale back a special unit of more than 50 lawyers and staff members that had been dedicated to bringing crypto enforcement actions, five people with knowledge of the matter said.
Even if it were possible to regulate real-world cryptocurrencies, there no longer appears to be any political support for doing so. And thus Michael P. Regan reports on A Sunday Night Flash Crash in the ‘Most Insane Casino Ever Created’:
a major flash crash in Ether as leveraged positions were liquidated served as a stark reminder of how digital-asset markets still lack almost all of the guardrails installed on traditional markets over the years due to various misadventures that hurt investors. The second-largest token was down a bit as traditional markets opened for trading Monday morning in Asia, reacting to concerns about US tariffs against Canada and Mexico. Then in a matter of minutes its losses extended to about 27%, before quickly recovering.
...
There’s no indication that any of that is likely to change anytime soon. So Sunday night’s price action in Ether – and similar dives in a slew of other altcoins and memecoins – serves as a reminder that for better or worse this asset class still exists far outside of the padded walls of traditional markets, regardless of how much tradfi embraces it.

Issue 106: How much do you know about the credit card industry? / Peter Murray

With millions of digital transactions taking place every day, have you ever wondered about the complex world behind your simple card swipe? In this week's Thursday Threads, we delve into the multi-layer maze that is the credit card industry. Grappling with $130 billion in fees, merchants are the invisible heroes who bear the cost of our seamless payment experience. As we unravel this thread, we'll dissect the structure of these processing fees, explain how your spending fuels reward systems, and describe the ongoing antitrust battle between credit card processors and merchants. We'll also see what your credit card issuer knows about your spending habits, bringing to light the monetization of these insights. Delving into murkier waters, we'll explore the shadow realm of debt collection and the distress it can cause to consumers. And to wrap up, are we ready for (X)Twitter to become our "everything app"? Plus, one thing I learned this week and a cat picture.

Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page. If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.

Who pays for credit card operations? Merchants

Today, small businesses face another attack, this time from Wall Street. Reynolds’s main concern is now swipe fees, which are the fees credit and debit card networks charge merchants for processing transactions. For over a decade, Reynolds and his colleagues on Main Streets around the country have watched their monthly billing statements climb because of these fees. In 2021, big banks, in coordination with the two credit card behemoths, Visa and Mastercard, took in over $130 billion from swipe fees (also called interchange fees), more than double what they reaped in 2010.
Small Businesses Rise to Fight Wall Street, The American Prospect, 7-Feb-2023

The article highlights the growing mobilization of small businesses against the rising swipe fees imposed by credit card companies Visa and Mastercard. These fees have increased significantly, costing small retailers more than utilities and approaching their labor costs. The article describes the struggles of small business owners who, after recognizing their shared challenges during the pandemic, formed groups like the Merchants Payments Coalition to advocate for reform. The campaign aims to replicate the success of the Durbin amendment, which previously capped debit card fees, by pushing for the Credit Card Competition Act to do the same for credit transactions. (You might remember the Credit Card Competition Act...it was the target of blanket advertising in 2023 along the lines of how Congress wants to take away your credit card rewards.) Well, how much are the fees we're talking about...

What are the fees?

In 2023, credit card companies in the U.S. earned $135.75 billion from processing fees charged to merchants. Families paid an average of $1,102 in swipe fees in 2023, according to the Merchants Payments Coalition. The money made from these fees increased at a faster rate than the actual money spent on purchases, adding fuel to the already fierce debate between credit card companies and businesses that complain about so-called swipe fees. Businesses claim that raising interchange fees, which are paid by merchants on each transaction made with a credit or debit card, worsen inflation and pinch consumers because businesses could opt to pass the cost of higher interchange fees onto consumers. Most merchants need to accept credit card payments, which makes credit card processing fees a cost of doing business. For more on how much those costs can be -- and how they vary among credit card companies -- we've collected all the latest data.
Average Credit Card Processing Fees and Costs in 2024, The Motley Fool, 10-Dec-2024

Have you ever been charged an extra fee by a company for using a credit card? It is not common, but it does happen, and it is because the company has been charged a fee to accept the card. That is called the "interchange fee". This table is from the article quoted above:

Payment network Average credit card processing fees
Visa 1.23% + $0.05 to 3.15% + $0.10
Mastercard 1.15% + $0.05 to 3.15% + $0.10
American Express 1.10% + $0.10 to 3.15% + $0.10
Discover 1.56% + $0.10 to 2.40% + $0.10

The range—1.23% to 3.15%, in the case of Visa—is based on a few factors:

  1. Merchant category: the type of business
  2. Card tier: the level of rewards a card offers, or no reward at all
  3. Processing method: whether a card was swiped, dipped, tapped, keyed manually, or used online
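To make the fee structure concrete, here is a quick calculation using the low and high ends of Visa's range from the table above (illustrative arithmetic only, on a hypothetical $100 sale):

```python
def processing_fee(amount, rate, fixed):
    """Interchange-style fee: a percentage of the sale plus a fixed charge."""
    return amount * rate + fixed

sale = 100.00
low  = processing_fee(sale, 0.0123, 0.05)   # Visa's low end: 1.23% + $0.05
high = processing_fee(sale, 0.0315, 0.10)   # Visa's high end: 3.15% + $0.10
print(f"${low:.2f} to ${high:.2f}")  # $1.28 to $3.25
```

On a $100 sale, the merchant's cost ranges from $1.28 to $3.25 depending on merchant category, card tier, and processing method.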

One of the significant factors is card tier, which leads us to ask:

Why are banks eager to push the higher-fee rewards cards?

To highlight something which is routinely surprising for non-specialists: interchange fees [the fees paid by the card-accepting business, or "merchant"] are not constant and fixed. They are set based on quite a few factors but, most prominently, based on the rank of card product you use. The more a card product is pitched to socioeconomically well-off people, the more expensive interchange is. Credit card issuers explicitly and directly charge the rest of the economy for the work involved in recruiting the most desirable customers.
Anatomy of a credit card rewards program, Bits About Money, 29-Mar-2024

The author of this article was a technology executive at Stripe and now makes a living doing consulting and writing blog posts. The article delves into the mechanics behind credit card rewards, emphasizing the role of interchange fees, which are paid by businesses accepting credit cards and distributed among various parties in the credit ecosystem. It explains that credit card issuers use these fees to attract high-value customers by offering rewards programs that enhance the spending experience. The discussion highlights that not all cards offer rewards, with some cards targeting lower-income users primarily to provide access to credit rather than rewards. If you want a more in-depth view of how credit cards work, I recommend the author's Improving how credit cards work under the covers.

Interchange settlement

...on every credit card transaction in the MasterCard and Visa systems, the merchant pays a swipe fee, also known as the merchant discount fee. That fee is paid to the merchant's bank. The merchant's bank then pays a "network fee" to MC or V and also pays an "interchange" fee to the bank that issued the card. The interchange fee is not one-size-fits-all. Instead, it varies by merchant type (and sometimes volume) and by the level of rewards/service on the card. So merchants are not directly charged the interchange fee, but it is passed through to them, sometimes explicitly. The problem that merchants face is that they cannot exert any pressure on the interchange fee—nominally an interbank fee—even though it is set based on their line of business. Nor can merchants discriminate among types of credit cards by charging more for rewards cards, etc.
The Proposed Credit Card Interchange Settlement, Credit Slips, 24-Mar-2024

Twenty years ago, merchants sued the credit card companies alleging anti-trust violations about this scheme, and the case is still going on. If the "Credit Card Competition Act" isn't re-introduced to Congress, maybe merchants can get relief from the court system.
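The fee flow described in the quote can be sketched with hypothetical numbers (the dollar figures here are illustrative, not actual rates): the merchant discount covers the interchange fee passed to the issuing bank, the network fee, and whatever margin the merchant's acquiring bank keeps.

```python
# Illustrative split of a merchant's swipe fee on a $100 sale
# (amounts are made up for the example, not real published rates).

sale = 100.00
merchant_discount = 2.50          # total fee the merchant's bank collects
interchange = 1.80                # passed through to the card-issuing bank
network_fee = 0.15                # paid to the network (Visa/Mastercard)
acquirer_margin = merchant_discount - interchange - network_fee

print(f"merchant keeps ${sale - merchant_discount:.2f}")
print(f"issuer ${interchange:.2f}, network ${network_fee:.2f}, "
      f"acquiring bank ${acquirer_margin:.2f}")
```

The merchant sees only the single merchant discount line; the interchange component buried inside it is the piece set by card tier and merchant category that merchants cannot negotiate.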

How much does the credit card company know about you?

A 2002 study of how customers of Canadian Tire were using the company's credit cards found that 2,220 of 100,000 cardholders who used their credit cards in drinking places missed four payments within the next 12 months. By contrast, only 530 of the cardholders who used their credit cards at the dentist missed four payments within the next 12 months.
What Does Your Credit-Card Company Know About You?, New York Times, 12-May-2009

Credit card use is increasing, and the aggregation of all that data can be a goldmine of information about people. A study showed that purchasing habits could predict payment reliability, with certain products indicating a higher likelihood of missed payments. And analysis of that data drives the efforts by banks to get you to use their higher reward cards. There is also the world of "Level 3 Data", where merchants transmit line items about your purchases to the credit card processor. Think of it as every line on a grocery receipt. Except, as near as I can tell, Level 3 data doesn't apply to consumer credit cards...only to business-to-business and government-to-business cards. Still, it is an interesting fact to know and perhaps something to keep an eye on in case it leaks into the consumer credit card space.

Exercising control of your data at Mastercard

When you use your Mastercard, the company receives data about your transaction, like how much you spent, where and on what day. It needs this information to be your credit card – but Mastercard doesn't just use your data to complete payments. It monetizes that information by selling it to data brokers, advertisers and other companies. Mastercard's data practices contribute to a larger economy of data harvesting and data sales that can be harmful to consumers.
How to take more control of your Mastercard transaction data, PIRG, 21-Sep-2023

Mastercard has a program to monetize transaction data by selling it to advertisers. This PIRG article has details and a link to the opt-out page on Mastercard's website. Visa used to have a similar program — Visa Advertising Solutions — but it was shut down in 2021.

Credit card collections

One interesting lens for understanding how industries work is looking at their waste streams. Every industry will by nature have both a stock and a flow of byproducts from their core processes. This waste has to be dealt with (or it will, figuratively or literally, clog the pipes of the industry) and frequently has substantial residual value. Most industries develop ecosystems in miniature to collect, sift through, recycle, and dispose of their waste. These are often cobbled together from lower-scale businesses than the industry themselves, involve a lot of dirty work, and are considered low status. Few people grow up wanting to specialize in e.g. sales of used manufacturing equipment. One core waste stream of the finance industry is charged-off consumer debt. Debt collection is a fascinating (and frequently depressing) underbelly of finance. It shines a bit of light on credit card issuance itself, and richly earns the wading-through-a-river-of-effluvia metaphor.
Credit card debt collection, Bits About Money, 11-Aug-2023

Back to Bits About Money for a view on the opposite side of credit card rewards: credit card debt collection. Most defaulted debt in the U.S. is from credit cards, and the lifecycle of that debt involves a series of internal and external processes before it is sold to debt buyers, often at a fraction of its original value. He notes that most debt collectors operate in high-pressure environments, leading to high turnover rates, a lack of professionalism, and widespread illegal practices. He also discusses how debt collectors rely on automated systems and predictive dialing to maximize efficiency, often leading to a barrage of calls to debtors. Many consumers are unaware of their legal rights and don't have time to fight against these tactics effectively.

ExTwitter adding digital wallet functionality

Elon Musk's social media platform X on Tuesday announced the launch of a digital wallet and peer-to-peer payments services provided by Visa. X struck a deal with Visa, the largest U.S. credit card network, to be the first partner for what it is calling the X Money Account, CEO Linda Yaccarino announced in a post on the platform. Visa will enable X users to move funds between traditional bank accounts and their digital wallet and make instant peer-to-peer payments, Yaccarino said, like with Zelle or Venmo.
Elon Musk’s X partners with Visa to offer digital wallet, CNBC, 28-Jan-2025

When Elon Musk bought Twitter, he said he wanted the company to turn into an "everything app" — use Twitter to buy things online, call for a taxi, and make peer-to-peer payments. One of the first steps on that path is getting access to payment systems. Now, whether you trust Musk with that kind of access to your bank accounts is an entirely separate matter...

This Week I Learned: The origin of the computer term "mainframe" comes from "main frame" — the 1952 name of an IBM computer's central processing section

Based on my research, the earliest computer to use the term "main frame" was the IBM 701 computer (1952), which consisted of boxes called "frames." The 701 system consisted of two power frames, a power distribution frame, an electrostatic storage frame, a drum frame, tape frames, and most importantly a main frame.
The origin and unexpected evolution of the word 'mainframe', Ken Shirriff's blog, 1-Feb-2025

"Mainframe" is such a common word in my lexicon that it didn't occur to me that its origin was "main frame" — as in the primary frame to which everything else connected. I've heard "frame" used to describe a rack of telecommunications equipment as well, but a quick Kagi search couldn't find the origins of the word "frame" from a telecom perspective.

What did you learn this week? Let me know on Mastodon or Bluesky.

Mittens explores the toilet

Black cat curiously leans into an open toilet bowl, with its head hidden and tail extended in a bathroom setting.

Announcing the Data.gov Archive / Harvard Library Innovation Lab

Today we released our archive of data.gov on Source Cooperative. The 16TB collection includes over 311,000 datasets harvested during 2024 and 2025, a complete archive of federal public datasets linked by data.gov. It will be updated daily as new datasets are added to data.gov.

This is the first release in our new data vault project to preserve and authenticate vital public datasets for academic research, policymaking, and public use.

We’ve built this project on our long-standing commitment to preserving government records and making public information available to everyone. Libraries play an essential role in safeguarding the integrity of digital information. By preserving detailed metadata and establishing digital signatures for authenticity and provenance, we make it easier for researchers and the public to cite and access the information they need over time.

In addition to the data collection, we are releasing open source software and documentation for replicating our work and creating similar repositories. With these tools, we aim not only to preserve knowledge ourselves but also to empower others to save and access the data that matters to them.

For suggestions and collaboration on future releases, please contact us at publicdata@law.harvard.edu.

This project builds on our work with the Perma.cc web archiving tool used by courts, law journals, and law firms; the Caselaw Access Project, sharing all precedential cases of the United States; and our research on Century Scale Storage. This work is made possible with support from the Filecoin Foundation for the Decentralized Web and the Rockefeller Brothers Fund.

Research rewind: reflections on hits from our back catalog / HangingTogether

Color photograph of a wall of framed gold and platinum records. Wall of Gold & Platinum Sales by prayitnophotography on Flickr.

December and January are always filled with “best of” content – lists of the music, movies, books, and television that captured our attention and won our admiration over the previous year. Well, it’s February now so we’re not going to do that. Instead, over the next several months members of the Research team are taking a retrospective look back at the OCLC Research oeuvre, highlighting work we think has stood the test of time, and discussing why these outputs were influential at the time of publication and how, in many cases, they remain relevant and important.

I’ve been referring to this as our Greatest Hits project, but really it’s more of a revisit of the OCLC Research back catalog. As any musician who has retained their publishing rights can tell you, there’s deep value in the back catalog. Because it is the tried-and-true jams from the back catalog that we turn to when we need them – to hype yourself up, push through the end of a work day, get through a break-up, clean the house, or have a good cry – and thus that have staying power. Certainly, OCLC Research continues to produce new work, and we are excited about it! But we are proud to have work that stands the test of time and remains useful when people need it.

So, we’ll devote some space here to the riches in our own back catalog that deserve reflection. Kate James will kick us off this month with a post about The Metadata IS the Interface: Better Description for Better Discovery of Archives and Special Collections, Synthesized from User Studies. And later we will be revisiting the golden age of report naming at OCLC Research with posts about Beyond the Silos of the LAMs: Collaboration Among Libraries, Archives and Museums and Tiers for Fears: Sensible, Streamlined Sharing of Special Collections. We have all this and more in store for you. So stay tuned, don’t touch that dial.

The post Research rewind: reflections on hits from our back catalog appeared first on Hanging Together.

Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 4 February 2025 / HangingTogether

The following post is one in a regular series on issues of Inclusion, Diversity, Equity, and Accessibility, compiled by a team of OCLC contributors.

Black History Month and Black librarians

In the United States, February is Black History Month, a commemoration that has roots that go back to 1926 when Dr. Carter G. Woodson first established “Negro History Week” aligned with the birthdays of Abraham Lincoln and Frederick Douglass. Many libraries and other cultural heritage institutions mark the month with events, a special focus on book and other collections, and more.

The theme for Black History Month in 2025 is “African Americans and Labor,” and in keeping with that theme a WorldCat.org list focusing on the “History of African American Librarians” caught my eye. This list features not only books but articles, audio recordings, archival collections, and images. There is so much to learn and appreciate about the contributions of Black librarians, and this list is just a starting point. Contributed by Merrilee Proffitt

2025 Day of Remembrance

On 19 February 1942, President Franklin D. Roosevelt signed Executive Order 9066, authorizing the removal of Americans of Japanese ancestry from Washington, Oregon, and California. 120,000 people were forcibly moved to one of ten concentration camps. Each February this event is observed as a Day of Remembrance as a way of reflecting on the experience of incarceration and its multi-generational impacts, as well as the importance of protecting civil liberties for all. The website of the Japanese American Citizen League lists many planned events for sharing and commemoration.

Growing up in California, the remains of remote and desolate concentration camps and former “assembly centers” (mostly racetracks and fairgrounds) were physical reminders of the experiences of those who had been displaced. Stories of those who had been incarcerated were part of my childhood as well, but it is only more recently that these memories have been shared more openly. An upcoming event on 18 February at the US National Archives and Records Administration will help kick off the tour of the Ireichō, a book that lists the over 125,000 persons who were incarcerated. The tour will include events at major incarceration sites and will allow many people to interact and engage with the book as part of a learning and healing experience. Contributed by Merrilee Proffitt

The facts about book bans

On 26 January, the American Library Association issued a response to the US Department of Education’s assertions that book bans have been a “hoax” in an article entitled “ALA to U.S. Department of Education: Book bans are real.” Citing the data that ALA has compiled, Censorship by the Numbers breaks down some 1,247 censorship demands during 2023, by the target (including books, displays, programs, and films), by the source (such as patrons, parents, pressure groups, and elected officials), and by the type of library or institution.

The book ban conversation is nothing new, and has been covered in many previous issues of IDEAs. But it is not just ALA that is covering this issue and giving resources. On 27 January, the free weekly online newsletter Shelf Awareness commented on the Department of Education’s actions, folding in responses from PEN America and Authors Against Book Bans. Contributed by Jay Weitz.

The post Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 4 February 2025 appeared first on Hanging Together.

Where do you stand? / Mita Williams

“I always tell my students: ‘A style is a means of insisting on something.’ A line of Sontag’s.” -- Zadie Smith, Imitations

DLF Digest: February 2025 / Digital Library Federation

A monthly round-up of news, upcoming working group meetings and events, and CLIR program updates from the Digital Library Federation. See all past Digests here

 

Happy February, DLF community! As the new year continues on, we have many opportunities to connect with our working groups coming up this month. Read on below to learn when and where to join. And, if you haven’t yet, be sure to subscribe to the DLF Forum Newsletter, as some information will drop there soon about this year’s events. 

— Team DLF

 

This month’s news:

  • Discuss: We invite folks to join the Climate Circle discussion coming up on Friday, February 7, a discussion component to the fourth session on Indigenous Knowledge and Climate Collaboration in our Climate Resiliency Action Series. Learn more and register here.
  • Take Note: CLIR will be closed Monday, February 17, in observance of Presidents’ Day.

 

This month’s open DLF group meetings:

For the most up-to-date schedule of DLF group meetings and events (plus NDSA meetings, conferences, and more), bookmark the DLF Community Calendar. Meeting dates are subject to change. Can’t find meeting call-in information? Email us at info@diglib.org. Reminder: Team DLF working days are Monday through Thursday.

 

  • DLF Born-Digital Access Working Group (BDAWG): Tuesday, 2/4, 2pm ET / 11am PT
  • DLF Digital Accessibility Working Group (DAWG): Wednesday, 2/5, 2pm ET / 11am PT
  • DLF Assessment Interest Group (AIG) Cultural Assessment Working Group: Monday, 2/10, 1pm ET / 10am PT
  • Digital Accessibility Working Group — IT Subgroup: Monday, 2/17, 1:15pm ET / 10:15am PT
  • DLF AIG User Experience Working Group: Friday, 2/21, 11am ET / 8am PT
  • DLF Committee for Equity & Inclusion:  Monday, 2/24, 3pm ET / 12pm PT
  • DLF AIG Metadata Assessment Working Group: Thursday, 2/27, 1:15pm ET / 10:15am PT
  • DLF Digital Accessibility Policy & Workflows Subgroup: Friday, 2/28, 1pm ET / 10am PT

 

DLF groups are open to ALL, regardless of whether or not you’re affiliated with a DLF member organization. Learn more about our working groups on our website. Interested in scheduling an upcoming working group call or reviving a past group? Check out the DLF Organizer’s Toolkit. As always, feel free to get in touch at info@diglib.org

 

Get Involved / Connect with Us

Below are some ways to stay connected with us and the digital library community: 

Contact us at info@diglib.org.

The post DLF Digest: February 2025 appeared first on DLF.

February 2025 Early Reviewers Batch Is Live! / LibraryThing (Thingology)

Win free books from the February 2025 batch of Early Reviewer titles! We’ve got 196 books this month, and a grand total of 3,388 copies to give out. Which books are you hoping to snag this month? Come tell us on Talk.

If you haven’t already, sign up for Early Reviewers. If you’ve already signed up, please check your mailing/email address and make sure they’re correct.

» Request books here!

The deadline to request a copy is Tuesday, February 25th at 6PM EST.

Eligibility: Publishers do things country-by-country. This month we have publishers who can send books to the US, the UK, Luxembourg, Ireland, Canada, Belgium, Netherlands, Sweden, Spain, Poland and more. Make sure to check the message on each book to see if it can be sent to your country.

[Cover gallery of this month's 196 Early Reviewer titles, from Fallout to Murder on Middle Ridge; see the full list on the request page.]

Thanks to all the publishers participating this month!

5 AM Publishing
aka Associates
Akashic Books
Alcove Press
Arctis Books USA
Baker Books
Bellevue Literary Press
Bethany House
Broadleaf Books
Chosen Books
eSpec Books
Greenleaf Book Group
Harbor Lane Books, LLC.
Harper Horizon
Henry Holt and Company
History Through Fiction
Kinkajou Press
Lerner Publishing Group
MDW Press
New Vessel Press
Pen & Sword Books
Prolific Pulse Press LLC
PublishNation
Purple Diamond Press, Inc
Restless Books
Revell
Rootstock Publishing
Running Wild Press, LLC
Shadow Dragon Press
Tundra Books
University of Nevada Press
University of New Mexico Press
Unsolicited Press
What on Earth!
Wild Press
Yorkshire Publishing
Zibby Books

We Check Its Work / Ed Summers

NASA astronaut image of Makatea Island

Richard Powers' Playground is a lot of things, but for me it seemed to be very much a meditation on the near-future (or perhaps present) of Large Language Models like ChatGPT, and how they fit into our culture and politics. I won’t give away any spoilers (it is worth a read!), but one moment in the story has been stuck in my head, due to some things going on at work, so I thought I’d make a note here to get it out of my head.

Near the end of the novel a group of people living on the island of Makatea are trying to decide whether they want to accept a proposal from a consortium of corporations to develop, and thus greatly transform, their island home. Only 50 or so people live on the island, and they have decided to vote on it.

In order to help the Makateans decide how to vote the consortium of corporations provided the island’s inhabitants with exclusive access to a 3rd generation Large Language Model called Profunda, which operates much like ChatGPT does today. Users can engage with it in conversation using their voice, and inquire about how the proposed development will impact the island. Profunda has access to confidential materials related to the consortium and its detailed plans. It was built on top of a foundation model that was assembled from a massive harvest of content from the World Wide Web.

In this short segment below some of the characters are discussing how to vote based on the information they learned by “chatting” with Profunda:

“I don’t know how to vote. I don’t even know who this consortium is! People always say, ‘Follow the money.’ I’m supposed to vote this up or down, without even knowing who exactly is paying for this pilot program or what they stand to gain by this . . . seasteading.”

Pockets of applause followed the comment, suggesting that the priest was not the only one still at sea.

Manutahi Roa was baffled by the objection. He waved a dossier of printouts in the air. “You should have asked Profunda. I did!”

“But how can I trust him?” the priest shouted back. “The consortium made him!”

Neria Tepau, the postmistress, shot to her feet. “Exactly! We should have been researching for ourselves, these last ten days. We have phones. We have a cell tower. We can search every web page in the world. Instead, we’re relying on this construction, this . . . thing to spoon-feed us!”

“Neria!” Wen Lai’s objection sounded tired. “A search engine spoon-feeds us, too.”

“So letting this thing do the work and making a biased summary is somehow better than me going through the pages myself?”

Hone Amaru laughed. “This thing has read a hundred billion pages. How many can you read, in ten days?”

“It’s the ten days that is the crime! We’re being railroaded!” The words cracked in Puoro’s throat. Patrice put his arm around his partner’s shoulders.

The Queen stood up and the room settled down. “People. Friends. Sisters. Brothers. We’re letting the Popa’ā make us as crazy as they are!”

This observation was met by near-universal applause. Even the mayor collected himself and clapped.

“It’s easy,” the Queen went on. “We ask who is paying. It tells us. And then, as Madame Martin would say, we check its work.” She looked to the schoolteacher, who held both her thumbs high in the air. The assembly broke into a new round of applause.

When the cheers settled, the mayor said, “Profunda. Please give us short biographies for the five biggest investors in this seasteading consortium.”

This scene was striking to me, because The Queen’s statement seemed kinda obvious, at least on the surface. How do we know if we can trust generative AI? We check its work. Checking the work in this case seems doable, I guess. They were asking who were the biggest investors in the consortium. The assumption being that it was easy for them to find sources to check for verification.

At some level, triangulating fact claims and reproducibility are how knowledge is built. But it requires work. It takes time. It can sometimes require specialized expertise. As Generative AI tools get used to accelerate “knowledge” generation, the need to verify accelerates as well. This is why we need to be thoughtful and slow down when integrating these tools into our existing knowledge systems…if they are integrated at all. It is a choice, after all. Just because a generative AI tool is offering you citations to back up its assertions does not mean the citations refer to actual documents, or, if the documents exist, that they back up the claims that are being made (Liu, Zhang, & Liang, 2023).

Is it realistic for us to be checking the work of these systems? Wouldn’t it be better if it was the other way around?

I’m reminded of the nudges my Subaru Outback will give me if I begin to stray out of my lane. These nudges gently move the car back into the lane and prompt me to make additional corrections. They get me to pay better attention. But the automated system doesn’t take full control of the car. Subaru’s lane-keeping system is no doubt a machine learning model that has been trained on many hours of driving video. But the complete system is oriented around helping (not replacing) drivers and reducing car accidents, rather than trying to create a fully automated self-driving car. I’m always struck by how gentle, collaborative, and helpful these nudges are, and I can’t help but wonder what an equivalent would look like in information seeking. Rather than generating complete answers to questions, or posing as an intelligent conversation partner, the system would collaborate with us.

At work we’ve been looking to integrate OpenAI’s Whisper into our institutional repository, so that we can generate transcripts for video and audio that lack them. There were some routine technical difficulties involving integrating on-premises software with on-demand services in the cloud, where we could have access to compute resources with GPUs. But from my perspective it seemed that the primary problems we ran into were policy questions around what types of review and correction needed to happen to the generated transcripts, and how to make these checks part of our workflow. This is all still a work in progress, but one small experiment I got to try was helping to visualize the confidence level that Whisper reports for words in its transcript:

Viewing a Whisper generated transcript with confidence levels

whisper-transcript is a tiny piece of software, a Web Component you can drop into any web page using Whisper’s JSON output (a demo is running here). It’s clearly not a complete system for correcting the transcript, but simply a way of listening to the media while seeing the transcription and the model’s confidence in it. I’m mentioning it here because it felt like a clumsy attempt at providing these kinds of nudges to someone reviewing the transcript.
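The underlying check is simple to sketch. Whisper, run with word-level timestamps, reports a per-word probability in its JSON output; here is a small Python function that flags words below a threshold. This is my own illustration of the idea, not code from the whisper-transcript project: the 0.5 cutoff and the sample data are arbitrary, and the JSON shape assumed is the segments-and-words structure that openai-whisper produces.

```python
# Sketch: flag low-confidence words in Whisper JSON output.
# Assumes openai-whisper's word-timestamp output shape
# (segments -> words, each word carrying a "probability");
# the 0.5 threshold is an arbitrary choice for illustration.
import json


def low_confidence_words(transcript_json: str, threshold: float = 0.5):
    """Return (word, probability) pairs that fall below the threshold."""
    data = json.loads(transcript_json)
    flagged = []
    for segment in data.get("segments", []):
        for word in segment.get("words", []):
            if word["probability"] < threshold:
                flagged.append((word["word"].strip(), word["probability"]))
    return flagged


# Hypothetical sample output for demonstration.
sample = json.dumps({
    "segments": [
        {"words": [
            {"word": " hello", "probability": 0.98},
            {"word": " worid", "probability": 0.31},
        ]}
    ]
})
print(low_confidence_words(sample))  # [('worid', 0.31)]
```

A reviewer's interface could then highlight only the flagged words, which is essentially the nudge the web component tries to provide visually.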

A novel, a car and a transcript make for a kind of unruly trio. This post was just me expressing my hope that we see a move towards specialized computer assisted interfaces that don’t create more work for us, and that the promises of automated systems that replace people get left behind in the dust (again).

PS. Powers’ book, like the others I’ve read, is beautifully written. The life stories of his characters really stick with you, and the descriptions of the ocean and the natural world will transport you in the best way possible.


Liu, N. F., Zhang, T., & Liang, P. (2023, October 23). Evaluating Verifiability in Generative Search Engines. arXiv. https://doi.org/10.48550/arXiv.2304.09848

Paul Evan Peters Award / David Rosenthal

2024: Tony Hey
2022: Paul Courant
2020: Francine Berman
2017: Herbert Van de Sompel
2014: Donald A.B. Lindberg
2011: Christine L. Borgman
2008: Daniel E. Atkins
2006: Paul Ginsparg
2004: Brewster Kahle
2002: Vinton Gray Cerf
2000: Tim Berners-Lee
It has just been announced that at the Spring 2025 Membership Meeting of the Coalition for Networked Information, in Milwaukee, WI, on April 7th and 8th, Vicky and I are to receive the Paul Evan Peters Award. The press release announcing the award is here.

Vicky and I are honored and astonished by this award. Honored because it is the premier award in the field, and astonished because we left the field more than seven years ago to take up our new full-time career as grandparents. We are all the more astonished because we are not even eligible for the award; the rules clearly state that the "award will be granted to an individual".

You can tell this is an extraordinary honor from the list of previous awardees, and the fact that it is the first time it has been awarded in successive years. Vicky and I are extremely grateful to the Association of Research Libraries, CNI and EDUCAUSE, who sponsor the award.

Original Logo
Part of the award is the opportunity to make an extended presentation to open the meeting. The text of our talk, entitled Lessons From LOCKSS, with links to the sources and information that appeared on slides but was not spoken, should appear here on April 7th.

The work that the award recognizes was not ours alone, but the result of a decades-long effort by the entire LOCKSS team. It was made possible by support from the LOCKSS community and many others, including Michael Lesk then at NSF, Donald Waters then at the Mellon Foundation, the late Karen Hunter at Elsevier, Stanford's Michael Keller and CNI's Cliff Lynch.

Issue 105: Facial Recognition / Peter Murray

In this week's Thursday Threads, I'll point to articles on the contentious subject of facial recognition technology. This tech, currently used by law enforcement and various businesses around the world, raises critical ethical and privacy questions. Beyond the instances where facial recognition use has resulted in wrongful apprehensions by law enforcement or failed to recognize a student taking an exam, we have examples of individuals taking the technology to the dystopian extreme: doxing smart glasses and invading the privacy of social media users. Even police officers are reluctant to submit to facial recognition, and in a surprising turn of events, places like China have started implementing restrictions on companies' use of it.

It is possible that facial recognition might be useful in some circumstances someday. We're a long way from that day, though.

Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page. If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.

Catalog of police misuse

Police have shown, time and time again, that they cannot be trusted with face recognition technology (FRT). It is too dangerous, invasive, and in the hands of law enforcement, a perpetual liability. EFF has long argued that face recognition, whether it is fully accurate or not, is too dangerous for police use, and such use ought to be banned. Now, The Washington Post has proved one more reason for this ban: police claim to use FRT just as an investigatory lead, but in practice officers routinely ignore protocol and immediately arrest the most likely match spit out by the computer without first doing their own investigation.
Police Use of Face Recognition Continues to Wrack Up Real-World Harms, Electronic Frontier Foundation, 15-Jan-2025

I have saved a bunch of articles about law enforcement misuse of facial recognition technology, but rather than including them individually, I'm pointing to this article from the Electronic Frontier Foundation that catalogs the problems and points to individual cases. The EFF analysis emphasizes that the technology poses significant risks to civil liberties and can lead to wrongful arrests. Despite claims from law enforcement that it is used merely as an investigatory tool, evidence shows that police often bypass protocols, leading to immediate arrests based solely on computer matches. It notes a troubling pattern where many individuals wrongfully arrested based on FRT are Black, underscoring the technology's lower accuracy for individuals with darker complexions.

Layering facial recognition atop DNA analysis

Parabon NanoLabs ran the suspect’s DNA through its proprietary machine learning model. Soon, it provided the police department with something the detectives had never seen before: the face of a potential suspect, generated using only crime scene evidence.... The face of the murderer, the company predicted, was male. He had fair skin, brown eyes and hair, no freckles, and bushy eyebrows. A forensic artist employed by the company photoshopped a nondescript, close-cropped haircut onto the man and gave him a mustache—an artistic addition informed by a witness description and not the DNA sample. In a controversial 2017 decision, the department published the predicted face in an attempt to solicit tips from the public. Then, in 2020, one of the detectives did something civil liberties experts say is even more problematic—and a violation of Parabon NanoLabs’ terms of service: He asked to have the rendering run through facial recognition software.
Cops Used DNA to Predict a Suspect’s Face—and Tried to Run Facial Recognition on It, WIRED, 22-Jan-2024

This is perhaps the most egregious example of misuse: extrapolating an image of a suspect based on DNA analysis, then running that image through facial recognition technology in search of leads.

When the face can't be found

To prevent students from cheating, the university had bought software from the tech firm Proctorio, which uses face detection to verify the identity of the person taking the exam. But when Pocornie, who is Black, tried to scan her face, the software kept saying it couldn’t recognize her: stating “no face found.” That’s where the Ikea lamp came in. For that first exam in September 2020, and the nine others that followed, the only way Pocornie could get Proctorio’s software to recognize her was if she shone the lamp uncomfortably close to her face—flooding her features with white light during the middle of the day.
This Student Is Taking On ‘Biased’ Exam Software: Mandatory face-recognition tools have repeatedly failed to identify people with darker skin tones, WIRED, 5-Apr-2023

Here is one of the biggest problems of this unregulated technology: biases in the data used to train the algorithm call into question any results you get from it. The article follows a student challenging exam-proctoring software whose face detection repeatedly failed her because of her skin tone. Just because a machine that can count and compare numbers really, really fast says something is true doesn't make it true.

Police officers don't want to be subject to facial recognition

A Las Vegas police union has raised concerns about a new NFL policy that would require officers who work security at Raiders games to share their photo for facial recognition purposes and is urging officers to think twice before complying. Traditionally, officers who worked overtime hours as security for Raiders games would receive a wristband that got them access to different parts of the field and stadium, explained Steve Grammas, president of the Las Vegas Police Protective Association. But now, the NFL is asking that officers each provide a photo, which will be used for “identification purposes when an individual steps up to a scanner to verify who the person is and if they have access to that particular space,” explained Tim Schlittner, director of communications for the NFL, in an email.
NFL facial recognition policy upsets Las Vegas police union, Las Vegas Review-Journal, 14-Aug-2024

Speaking of unregulated, police officers themselves don't want their biometrics cataloged in a company's database with no oversight. This also points to the problem of using biometrics as an authentication tool: the shape of your face isn't something you can easily change. Suppose your facial markers leak from one of these companies. What stops someone from 3-D printing a facsimile of those markers to fool this technology?

China tells companies to stop using facial technology

Authorities in several major Chinese cities have ordered hotels to stop using facial recognition technology to verify the identity of guests in a sign the government is responding to public concerns over privacy, financial news site Caixin reported. Guests staying at hotels in Beijing, Shanghai, Shenzhen, and Hangzhou will now only be required to present identification in order to check in, according to state-run tabloid The Global Times.
China says no more facial recognition at hotels, Semafor, 25-Apr-2024

The government in China is well known for using facial recognition in public places for surveillance, so I think it is notable when the government responds to public pressure to stop companies from using the technology.

Facial recognition in smart glasses

The technology, which marries Meta’s smart Ray Ban glasses with the facial recognition service Pimeyes and some other tools, lets someone automatically go from face, to name, to phone number, and home address.
Someone Put Facial Recognition Tech onto Meta's Smart Glasses to Instantly Dox Strangers, 404 Media, 2-Oct-2024

What happens when you pair off-the-shelf facial recognition with off-the-shelf smart glasses? Something very creepy. As a society, we're nowhere near ready for the dramatic change to the social contract that this technology demonstrates.

Scanning the faces in social media videos

A viral TikTok account is doxing ordinary and otherwise anonymous people on the internet using off-the-shelf facial recognition technology, creating content and growing a following by taking advantage of a fundamental new truth: privacy is now essentially dead in public spaces. The 90,000 follower-strong account typically picks targets who appeared in other viral videos, or people suggested to the account in the comments. Many of the account’s videos show the process: screenshotting the video of the target, cropping images of the face, running those photos through facial recognition software, and then revealing the person’s full name, social media profile, and sometimes employer to millions of people who have liked the videos.... 404 Media is not naming the account because TikTok has decided to not remove it from the platform. TikTok told me the account does not violate its policies; one social media policy expert I spoke to said TikTok should reevaluate that position.
The End of Privacy is a Taylor Swift Fan TikTok Account Armed with Facial Recognition Tech, 404 Media, 25-Sep-2023

The "Taylor Swift Fan" part is quite click-baity. The article's author noted in the second paragraph that this anonymous TikTok user liked to focus on fan videos, but the content of the article stands on its own. Again: it is an off-the-shelf service that dramatically affects the social contract between humans.

Sending a message at airport security gates

A bipartisan group of 12 senators has urged the Transportation Security Administration’s inspector general to investigate the agency’s use of facial recognition, saying it poses a significant threat to privacy and civil liberties.... While the TSA’s facial recognition program is currently optional and only in a few dozen airports, the agency announced in June that it plans to expand the technology to more than 430 airports. And the senators’ letter quotes a talk given by TSA Administrator David Pekoske in 2023 in which he said “we will get to the point where we require biometrics across the board.” ... The latest letter urges the TSA’s inspector general to evaluate the agency’s facial recognition program to determine whether it’s resulted in a meaningful reduction in passenger delays, assess whether it’s prevented anyone on no-fly lists from boarding a plane, and identify how frequently it results in identity verification errors.
Senators Say TSA's Facial Recognition Program Is Out of Control, Here's How to Opt Out, Gizmodo, 22-Nov-2024

Because of the problems with unregulated, unaudited facial recognition technology, I opt out of its use whenever possible. With study, evaluation, auditing, and quite possibly some regulation, this might become a useful technology for some use cases. Until that happens, my face will vote my conscience: do not use it.

This Week I Learned: A biographer embedded with the Manhattan Project influenced what we think about the atomic bomb

In early 1945, a fellow named Henry DeWolf Smyth was called into an office in Washington and asked if he would write this book that was about a new kind of weapon that the US was developing. The guy who had called him into his office, Vannevar Bush, knew that by the end of the year, the US was going to drop an atomic bomb that had the potential to end the war, but also that as soon as it was dropped, everybody was going to want to know what is this weapon, how was it made, and so forth. Smyth accepted the assignment. It was published by Princeton University Press about a week after the bomb was dropped. It explained how the US made the bomb, but it told a very specific kind of story, the Oppenheimer story that you see in the movies, where a group of shaggy-haired physicists figured out how to split the atom and fission, and all of this stuff. The thing is, the physics of building an atomic bomb is, in some respects, the least important part. More important, if you actually want to make the thing explode, is the chemistry, the metallurgy, the engineering that were left out of the story.
Wars Are Won By Stories, On the Media, 22-Jan-2025

The quote above comes from the transcript of this podcast episode. I've thought about this a lot in the past week as the Trump administration's flood-the-zone strategy overwhelms the senses. Even as journalists make a valiant effort to cover everything that is news, I can't help but wonder about the lost perspective of what isn't being covered. And I wonder where I can look to find that perspective.

Alan's chair

A white cat with black spots stretches across the back of a lounge chair seat.
Alan thinks he owns this chair...so much so that he is going to stretch out as big as he can to cover it. In reality, it is my chair. And, yes, right after taking this picture I insisted that he let me sit down. He got to take a nap in my lap, though.

Preserving Public U.S. Federal Data / Harvard Library Innovation Lab

In recent months the Harvard Law School Library Innovation Lab has created a data vault to download, sign as authentic, and make available copies of public government data that is most valuable to researchers, scholars, civil society and the public at large across every field. To begin, we have collected major portions of the datasets tracked by data.gov, federal Github repositories, and PubMed.

The Harvard Law School Library has collected government records and made them available to patrons for centuries, and this continues that work.

We know from our web archiving project, Perma.cc, which preserves millions of links used by courts and law journals, that government documents often change or go away. And we know from our Caselaw Access Project, which produced free and open copies of nearly all US case law from the inception of each state and Federal court, that collecting government documents in new forms can open up new kinds of research and exploration.

This effort, focusing on datasets rather than web archives, collects and will make available hundreds of thousands of government datasets that researchers depend on. This work joins the efforts of many other organizations who preserve public knowledge.

As a first step, we have collected the metadata and primary contents for over 300,000 datasets available on data.gov. As often happens with distributed collections of data, we have observed that linkrot is a pervasive problem. Many of the datasets listed in November 2024 contained URLs that do not work. Many more have come and gone since; there were 301,000 datasets on November 19, 307,000 datasets on January 19, and 305,000 datasets today. This can naturally arise as websites and data stores are reorganized.
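The linkrot check described above can be sketched in a few lines of Python. This is a hypothetical illustration of flagging dead dataset URLs, not the Lab's actual pipeline; the function name and approach (a simple HEAD request per URL) are my own assumptions:

```python
"""Minimal linkrot check: flag URLs that no longer resolve."""
import urllib.error
import urllib.request


def check_urls(urls, timeout=10):
    """Return a dict mapping each URL to True (reachable) or False (rotted)."""
    results = {}
    for url in urls:
        try:
            # HEAD avoids downloading the (possibly large) dataset itself
            req = urllib.request.Request(url, method="HEAD")
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                results[url] = resp.status < 400
        except (urllib.error.URLError, ValueError, OSError):
            # Malformed URL, DNS failure, refused connection, timeout, etc.
            results[url] = False
    return results
```

A real crawl at the scale of 300,000 datasets would need concurrency, retries, and politeness delays, but the core observation is the same: a URL listed in the catalog either answers or it doesn't.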

In coming weeks we will share full data and metadata for our collection so far. We look forward to seeing how our archive will be used by scholarly researchers and the public.

To notify us of data you believe should be part of this collection please contact us at publicdata@law.harvard.edu.

ILL across borders: Insights from SHARES on sharing physical materials internationally / HangingTogether

Image from Pixabay

In March 2020, in the early days of the pandemic, members of the SHARES resource sharing consortium started gathering weekly for informal virtual town halls. This week, nearly five years later, we convened our 238th SHARES town hall, with 30 attendees and no preset agenda. The sense of community that has developed around these sessions has been described by participants as welcoming, innovative, and fun — an ideal environment where staff from the most outward-facing department in the library can bond with peers, collaborate, teach, learn, and flourish.

Previously, I’ve shared some significant, tangible outcomes from the town halls, such as the creation of the International ILL Toolkit by SHARES volunteers and a crowd-sourced set of preferred practices around processing interlibrary loan returns and overdues. This time let’s explore the value of coming together within a trusted community to address a long-standing challenge that stubbornly resists definitive solutions.

Sharing physical library materials across borders

Interlending library books and other physical formats internationally has always been fraught, and the obstacles to sharing across borders have stayed pretty much the same since I started working in ILL in 1983:

  • Ever-rising shipping costs
  • Ineffective shipment tracking
  • Customs complications
  • Increased risk to the material
  • Difficulty in identifying willing lenders
  • Language issues, complicating communications
  • Lack of effective, universally available payment options
  • Lengthy, unpredictable request lifecycle
  • Negative impact on the environment

Over the years, every step forward in international sharing—such as the development of the International ILL Toolkit, which compiles vetted contact and policy information on international suppliers along with request templates in over a dozen languages—has been tempered with a significant step back, like the new customs rules for all European Union countries which are erratically enforced. The challenges remain as persistent as ever.

The community came together to pool uncertainties, share strategies, and identify preferred practices

SHARES’s governing body, the SHARES Executive Group (SEG), noticed that international ILL issues had been coming up constantly in town halls throughout 2024 and launched a suite of interrelated activities:

  • Facilitated two special town halls in November 2024 devoted to discussing borrowing and lending physical items across borders, inviting SHARES members to come ready to share their challenges, successes, and questions
  • Prepared a statistical analysis comparing SHARES international ILL activity in 2023 and 2024
  • Gathered and shared information on current international shipping practices, issues, and aspirations from SHARES participants
  • Drafted a new international ILL section that will soon be added to the SHARES Procedures Web page, documenting preferred practices and mitigation strategies

Top insights gleaned from these activities include:

  • Over 60% of SHARES libraries loaned a returnable item overseas in 2024, same as in 2023 (Note: 25 years ago, only 10% of SHARES libraries loaned returnable items overseas)
  • Overseas shipping expenses when using carriers such as FedEx, UPS, and DHL have skyrocketed, with reports of $70 charges for one book not uncommon, but these carriers usually do well getting things through customs
  • US Postal Service and Royal Mail are cheaper but offer poor tracking capabilities overseas
  • Customs processing is the biggest wildcard, especially in European Union countries; you can do everything correctly and still have your package get stuck in customs, incurring delays and extra fees and in some cases resulting in items being returned to the sender undelivered
  • ILL practitioners all over the world recognize the value in sharing research materials across borders and tend to be extremely patient and helpful in sorting out difficulties

Biggest tip of all: mitigate

Given the extra expense and risk of sharing physical items across borders, town hall participants strongly agreed: libraries should implement mitigation strategies to ensure that such items aren’t requested from international suppliers unless it’s the only way to fulfill the patron’s information need.

Borrowers should:

  • Exhaust all domestic ILL sources before requesting physical items internationally
  • Ask patrons if having the entire work in hand is critical, or if a scan of the index or table of contents might be a workable first step in identifying what portions of the work should be copied to fulfill their request
  • If the entire work is needed, make sure the patron is willing to wait for the needed item to arrive from another country before you request it
  • Consider purchasing an in-print title for your collection and user as this may be cheaper than paying the lending fee and shipping costs for an international loan (and allow you to be a new domestic lender for the title!)

Lenders should:

  • Provide a digital surrogate when licensing and copyright permit
  • Offer to scan tables of contents or indexes of works you are unable or unwilling to ship

Uncertainty loves company

The SHARES community certainly didn’t solve all the vexing issues around sharing physical items across borders. But we shared plenty of tips, tricks, and data, reached consensus on preferred practices, affirmed the immense value of connecting library patrons with the global research materials they need, and supported each other in our shared calling of making ILL magic happen.

The post ILL across borders: Insights from SHARES on sharing physical materials internationally appeared first on Hanging Together.

[Call for Organisations] Working with data but don’t know coding? The Open Data Editor pilot is for you / Open Knowledge Foundation

Photo: Desola Lanre-Ologun at Unsplash

Last December, the Open Knowledge Foundation (OKFN) released the Open Data Editor (ODE), an open source desktop application for data practitioners to explore and find errors in tables. 

What makes this software unique is its target audience: people who don’t have a technical background and don’t know code, formulas or programming languages. Last year, during the final testing phase, we worked with journalists and human rights activists to understand how ODE could be integrated into their workflows. And now we want to work together with more non-technical profiles, such as social workers, government officials, etc.

Today the Open Knowledge Foundation is launching a new pilot programme: we are inviting organisations from any geography and any area of expertise to express interest in participating in a four-month funded pilot. The task is to learn how to integrate Open Data Editor into your work and improve the tool’s components while doing so.

Examples of how ODE can help your daily work:

📊 If you have huge spreadsheets with data obtained through forms with the communities you serve, ODE helps you detect errors in this data to understand what needs to be fixed.

🧑🏽‍🍼 If you manage databases related to child malnutrition (just an example, it could be any social issue), ODE can quickly check if there are empty rows or missing cells in the data collected by different social workers and help you better allocate family assistance.

🏦 If you monitor and review government spending or public budgets for a particular department, ODE helps you find errors in the table and make the data ready for analysis.
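The checks in these examples (empty rows, missing cells) can be illustrated with a short Python sketch. This is only an analogy to the kind of validation ODE performs, not its actual implementation; the function name and report format are invented for illustration:

```python
"""Sketch of a no-code-style table check: find empty rows and missing cells."""
import csv
import io


def find_gaps(csv_text):
    """Return (empty_row_numbers, missing_cells) for a CSV string.

    Row numbers are 1-based, counting the header as row 1; a missing
    cell is reported as a (row_number, column_name) pair.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    empty_rows, missing_cells = [], []
    for row_num, row in enumerate(reader, start=2):
        if not any(cell.strip() for cell in row):
            empty_rows.append(row_num)  # whole row is blank
            continue
        for col, cell in zip(header, row):
            if not cell.strip():
                missing_cells.append((row_num, col))
        # rows shorter than the header are missing their trailing cells
        for col in header[len(row):]:
            missing_cells.append((row_num, col))
    return empty_rows, missing_cells
```

For example, `find_gaps("name,age\nAda,36\n,,\nBo,\n")` reports row 3 as empty and the `age` cell of row 4 as missing.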

Check out the details of the call and see how you can register your interest below:

Call for Pilot Organisations

Target

The Open Knowledge Foundation is looking for five high-impact organisations to integrate the Open Data Editor into their workflow. The selected organisations will be part of a first cohort (with a second one scheduled later in the year) and distributed across disciplines and geographies. Our goal is to improve the tool to fit their needs and show a collection of applied uses of ODE.

Eligibility

The selected organisations can:

  • Operate in any geography.
  • Specialise in any area of knowledge (health, education, socio-economic indicators, transport, research institutions, etc.).
  • Have any legal status (non-governmental, civil society, independent collective, public institution, or part of a government).
  • Be a small or medium-sized organisation whose work has the agility of a human scale.

The selected organisations must:

  • Have concrete purpose and outcomes for integrating the Open Data Editor into their work.
  • Have a non-technical team or group of collaborators who work with data without coding skills.
  • Be able to communicate in English (at least one team member serving as a point of contact).

Commitments

The selected organisations will commit to:

  • Integrate Open Data Editor into their workflow under OKFN guidance. 
  • Conduct a thorough assessment of the Open Data Editor.
  • Participate in bimonthly pilot cohort calls.
  • Tell the story of what for and how they deployed the tool.
  • Engage in events run by the Open Knowledge Foundation.

(More detailed commitments will form part of a Memorandum of Understanding to be signed as part of the collaboration.)

Compensation & Benefits

The selected organisations will:

  • Receive up to USD 5,000 to support their four-month collaboration with the Open Knowledge Foundation.
  • Gain data literacy training and tailored support from OKFN, hands-on experience with ODE, and exposure to the Open Knowledge community.
  • Help shape the future of a tool designed to make data workflows more accessible and efficient.
  • Contribute to the development of your sector and beyond.

Timeline

This pilot programme will start on 3 March, 2025 with an inception meeting in mid-February (date to be defined). The programme is expected to be completed by June 2025.

Expressing your interest

Organisations have until 10 February, 2025 to express their interest. The selected ones will be contacted on a rolling basis until 17 February, 2025.

To express your interest, please fill in this online form. It’s very quick: just a few simple questions to get to know you and understand why the Open Data Editor can make a difference for your work. 

Contact

If you have any questions or want any additional information, you can contact us at info@okfn.org.

Join us for this exciting opportunity to co-create and refine the Open Data Editor, while streamlining your internal work and increasing your social impact!

The Crushing Mental Load of Disability / Meredith Farkas

Photo credit: Statue Atlas by PeterKraayvanger on Pixabay

This isn’t the essay I was planning to publish next. I’ve been working on an essay about the lack of solidarity around COVID protection, but this is very much related. While I’ve been very sick since 2022, when I had my first COVID infection and then Long COVID, I came back from sabbatical in April 2024 with a diagnosis of an autoimmune condition and on medication that is designed to suppress my immune response. Since then, I have struggled to figure out how to erect boundaries that protect my health and keep me from having a flare, which means ensuring both that I don’t exhaust myself (mentally or physically) and that I don’t put myself in situations where there’s a good likelihood of my getting sick.

Something I’ve been struggling a lot with is when to say “no” when there is an in-person event at work — whether it’s a student outreach program, a meeting, a class, or something else. Sometimes these are fully optional things, sometimes they are things where volunteers are requested, and sometimes they are things where I’m straight-up expected to go. In most cases, my holding to a boundary and saying no means more work for someone else. More and more of our meetings that worked fine online are going back to face-to-face. Sometimes I get the awkward experience of being one of a few people on Zoom in a mostly F2F meeting, but even those opportunities are dwindling. Once or twice I took a sick day rather than take a risk that was unacceptable to me. There are a lot of things I don’t volunteer for anymore and I feel like that’s being noticed. And it’s not exactly fair to my colleagues when other campuses have all of their librarians participating in face-to-face outreach activities and mine has two COVID-cautious people. It makes me feel like I’m not being a “team player.”

The irony is that, pre-COVID, I was the librarian on my campus most likely to volunteer to do outreach events. I did lots of tabling, presented at a bunch of student and faculty orientations, and even organized library tabling at events like our Theatre program’s performances. I love love love interacting with our community. During the height of COVID, the one thing that gave me life was meeting with students on Zoom for research help. The fact that I’m putting my health first doesn’t mean I don’t care as much about students. It doesn’t mean I’m not dedicated to my job or a “team player.” If I’m dead or further disabled, I can’t help anyone. I still teach in-person classes (masked and with the door always open) and I have a weekly shift at the reference desk, but I also make online tutorials, am embedded in online classes, do virtual reference, and teach Zoom sessions for online synchronous classes. What I find strange is that things like making tutorials and supporting online classes seem to be considered optional while much of the face-to-face stuff is treated like it’s required. In a perfect world, we’d all contribute in our own ways based on our strengths, limitations, and passions, but it feels like that extends to some library work (the stuff I do most) and not other library work (the stuff that puts me at greatest risk).

I spent the first 15 years of my career with extremely porous boundaries if I had any at all. I’d been so programmed by my childhood to believe that I was a terrible, unloveable human being and that I had to make up for it by working hard and pleasing everyone. I think a lot of people-pleasers have a critic in their heads who tells them that they are a horrible person if they disappoint anyone. If a random stranger online doesn’t like you? If you have to set a boundary that supports your well-being but means saying no to a colleague? Well, that just proves what a piece of shit you are. Because I thought so badly of myself, I was plagued by the idea that I was never doing enough. So I always felt pressure to do all the things. I raised my hand and said yes to everything. I basically made librarianship my job, my hobby, my life. Work bled completely into my home life. And it wasn’t until around 2019, when I had a major health issue that had me thinking about mortality, that I started questioning how I was living my life. It wasn’t until I really considered “what if you’re enough right now, just as you are?” that I was able to jump off that treadmill of striving and start unlearning those unhealthy habits and assumptions about myself.

In spite of the fact that I have done so much work on myself, my orientation towards work, and my boundaries in recent years, I feel a lot of anguish over these decisions. I think I still struggle with internalized ableism. I feel a lot of guilt that I can’t do all the same things my colleagues can, and that, when it’s an invisible disability, it’s often seen by others as a choice rather than a necessity. Because I could choose to put my health at risk (and have, in some cases out of a sense of duty). The risk is not always so clear cut and I have to be the one each time making the decision about whether something is too risky to do. And often, you’re making these decisions without all the information, like when, this December, the International Student Resource Fair moved to a smaller space with many more people in attendance, which increased the risk to me exponentially. And I only found out about all that when I showed up that day, and what could I do at that point? I ended up teaching an in-person class while I was on a month-long course of steroids (and so was even more immunocompromised than usual) because no one else was volunteering, I technically was available at the time (though so were others), and I felt guilty. I have to make these risk calculations every day, as do tons of other people in similar situations. There are no clear-cut guidelines for this, no metrics to easily help us navigate these decisions. I never know for sure if I’m being overly cautious or the polar opposite, and someone else in the same situation may make completely different decisions. And tell me again how I’m supposed to avoid exhausting myself mentally? This is another dimension of crip time that is far less liberatory than others.

Since COVID and flu cases were pretty low in my area in early December, I actually RSVP’ed yes to an end of term party for my division of the college. I’d planned to not eat and stay masked the whole time. But when I saw the huge number of people who RSVP’ed yes and I had no idea what room the celebration was being held in nor what safety precautions were being taken (from what I learned later, the answer was “none”), I decided not to go, which was a bummer because I miss socializing with my colleagues. It’s exhausting to constantly have to do this calculus for everything, to miss out on things, or to go to something that is a significant risk and feel anxiety about it. You can’t win. And then you have to deal with the perceptions of your colleagues who maybe think you’re being overly anxious or who don’t know about your condition at all and maybe think you’ve just become antisocial. But they don’t understand that the one time you actually went to a larger event in the past five years, totally masked the whole time and 6 weeks after getting vaccinated, you ended up getting COVID, which completely flattened you for almost a month. And after that, you had a horrible flare of your autoimmune condition which left you barely able to walk (and with vertigo, fatigue, neuropathy, and a host of other symptoms). And at that point, you’re then left with the choice of living in that flare for weeks or taking a course of steroids which will make you even more immunocompromised (among other side-effects) for a month or more. Your small choice of whether to go to a celebration or a student outreach event or a meeting could affect your ability to work and function for months, leaving your colleagues to pick up the pieces. They don’t have to think about these things (or they just don’t think about these things) and I wish I could do the same. I so miss just being able to go out and do stuff without a second thought! 
But given that I’m currently on day 6 of a flare that started because I got a tiny cold from my son that only lasted about two days, I know the consequences of going out without a second thought would be immense and long-lasting.

Me trying to figure out whether or not to go to an in-person meeting

I try to remind myself that these are accessibility issues and that no one is trying to make these in-person events or meetings more accessible to people like me and to folks who are still trying to not be disabled or killed by COVID. Not all of my colleagues know about my condition, but even of those who do, no one has made an effort to make in-person events more accessible (beyond offering a remote option sometimes, which I do appreciate as much as it makes me feel awkward). People don’t put on masks around me. The spaces in which these events are held don’t have open windows or air filters. We have a big all-day in-person meeting coming up in a month and I’m already feeling anxious about it. My dean has written that she expects in-person attendance, but also made it clear to me that she didn’t mean it for people with a medical issue. Still, the thought that I’d be the only person participating remotely is filling me with a sense of dread that I can’t even describe. But the idea of being in a closed room all day with my unmasked colleagues (save one who still masks) during the height of so many winter illnesses fills me with just as much dread and is objectively more risky. More calculating – shame vs. health? The title of Geena Davis’ memoir Dying of Politeness comes to mind in this situation. Does a wheelchair user feel responsible for not attending an event only accessible by staircase? I hope not. Yet I feel all too responsible for my situation.

People who don’t have disabilities don’t know that the disability itself is only one piece of what disables us. Contending with the ableist world around us is often just as much if not more of a cause of pain, depletion, and harm. It’s in those moments that I’m most keenly aware of my disability. I spend a lot of time where I don’t think about my disabilities at all and just live my life like anyone else (just perhaps with a bit more pain), but it’s when I’m at work and in other spaces where that disability becomes an issue that I feel hyper-aware – sometimes feeling invisible, or hypervisible, or somehow both at the same time. It’s the constant calculating that we have to do about the spaces we are going to enter that I find most exhausting. I recently started Margaret Price’s Crip Spacetime (available open access, yay!) and I felt seen right from the jump. Here’s a small excerpt from the beginning:

We know what the room we’re going to looks like, and we know how to ask—with charm and deference—if we need the furniture rearranged, the fluorescent lights turned off, the microphone turned on. We know how much pain it will cost to remain sitting upright for the allotted time. We know how to keep track of the growing pain, or fatigue, or need to urinate (there’s no accessible bathroom), and plan our exit with something resembling dignity. We know that no one else will ever know. What you’ve just read is a litany—or maybe a rant. I use it for two reasons: first, to remind those who haven’t performed that series of calculations that they are an everyday experience for some of us; and second, to call to those for whom the litany, with little adjustment, is painfully familiar. (Price, 2024, p. 1)

I love when she writes that “crip spacetime is un/shared” (p. 29). I might be existing in the same physical space as my colleagues, but we are experiencing it differently. They may sit in the same chair at the reference desk as I do, but they may not experience it as such a malevolent presence in their lives. At home, I have a lot of control over my environment, but that’s less the case when I’m at work. The cold aggravates my Raynaud’s and makes my fingers hurt (even when wearing gloves). I sit at the reference desk in a chair too big for my body when, even under the best circumstances, I can only manage about 40 minutes of sitting in any chair before I end up in a significant amount of pain. But put me in a hard or ill-fitting chair and it’s so much worse. Right now, I work at the reference desk for no more than two hours straight, but come next term, I’ll be doing four hours, which I’m worried about. I’ve thought about seeking an accommodation so I could stand most of the time, but I can’t picture what that would look like, and the cost would probably be huge given the uncompromising setup of our giant reference desk (also, going through the invasive and dehumanizing accommodation process again fills me with more dread than I can describe). Again, more calculus. I work on campus on Mondays, and when I come home, I’m in pain and am so exhausted that I usually don’t feel myself again until Wednesday. I’m pretty sure my colleagues don’t need two days to recover from their day on campus. And I don’t want to complain because I feel very lucky that I can work from home the rest of the week. It’s a gift and one that I know could go away at any moment. And it’s probably the only thing that is allowing me to keep doing my job. I don’t think my body could take it if that changes. So I also don’t want to push things because I’m afraid that rocking the boat could just make things worse.

I’m still pretty early in my journey with this particular condition (and with recognizing I have disabilities at all though I’ve had migraines for 24 years) and I guess I’m going to have to learn to have thicker skin and stronger boundaries if I want to stay well. I really love the work I do, I love students, and I don’t want to disappoint anyone, but I need to keep reminding myself that there are many ways to contribute and that a huge percentage (41%) of our students don’t come to campus and also deserve outreach and support. I can still do valuable work and be a valuable part of our team and still take care of myself. I just need to stop caring how others see me when I refuse to do certain things or when I’m the only person (gulp) online in an in-person meeting, which is hard for a life-long people-pleaser. While I’d love it if people tried to make these spaces more accessible, all I can control in this situation is my own choices and I need to stop worrying about whether people think I’m not committed or am not a team player (and I’m not suggesting everyone thinks that; I have no idea). My work over the previous ten years should speak for itself. And if people really wanted me around at any of these events, they could show it by masking to keep me (and others, and also themselves) safe.

New Work Page! / LibraryThing (Thingology)

We’re excited to announce a major update to LibraryThing’s work pages—the pages you use to look at a work, edit your book, read reviews, get recommendations, etc. These pages are now easier to use and more informative.

Here are some links to check it out:

Nobody likes change, so our goal was to improve work pages while keeping them familiar. We hope members aren’t too shocked, but come to love the new pages as much as we do.

The new work page was spearheaded by Chris (conceptdawg) and Lucy (knerd.knitter). They did a ton of great work to get us here! The missing element, however, is your reaction and suggestions for improvement, so come tell us what you think and talk about the changes on New Features: New Work Page!.

Major Improvements

  • “LT2” — Work pages join most other LibraryThing pages in being consistently formatted, fully “mobilized,” and accessible.
  • Your Books — The “Your Books” part of work pages is much improved, with better editing and the ability to choose which fields you want to see.
  • Quick Facts — We created a “Quick Facts” section on the right, with some of the key details, including publication year, genres and classifications. It works something like the info boxes on Wikipedia pages.
  • Side Bar — Besides “Quick Facts,” we’ve improved the right side panel with a popularity graph, a links section, author info and an improved share button.
  • Reviews — Reviews are now displayed and sorted better, with reviews from your friends and connections first. After that, we’re sorting reviews by a quality metric, incorporating thumbs-up votes, recentness and member engagement. Ratings have also been added to the reviews section, in a section after full reviews. Altogether, we think reviews will prove more useful and interesting.
  • Sections — All work-page sections can be collapsed and reordered by members, and a special “On This Page” area lays out what’s on the page, with links to jump there.
  • Classification — We fronted something LibraryThing is best at—library data—by giving classifications a prominent place in “Quick Facts.” Work pages now also include a “Classification” page with detailed information and charts about the work’s tags and genres as well as positions within the library classifications DDC/MDS, LCC, and—a new one—BISAC, the classification system used by publishers and booksellers.
  • Member Info — Hovering over a member’s name now pops up a quick summary and preview of their profile page, much as hovering over a work pops up a summary and preview of the work page. We’re testing this out here, but will expand it across the site.
  • Helper Hub — The work page now has a Helper Hub, listing everyone who’s contributed to the work, and a separate Helper Hub page, listing contributions by type.
  • Member Descriptions — A new type of member-description field has been added on the “Community” page. It includes the current haikus and adds options for five-word descriptions, emoji descriptions, and “bad” descriptions. As enough of these are added, they will be included in TriviaThing!
  • Speed — Work pages now load faster.

Smaller Improvements

  • The “Your Books” section on a work page is blue. If you are looking at someone else’s book, however, the box turns yellow—making it more obvious what’s going on.
  • The “Edit Book” button is now at the bottom of the blue “Your Books” section, rather than the lefthand panel. On the “Book Details” page, you can also switch from “View” to “Edit” to edit your book.
  • The “Book Edit” page has a number of clever changes, such as an intuitive way to indicate the character a book’s title should sort by.
  • The collections menu is now easy and quick, so you can select or deselect as many collections as you want before closing the popup.
  • The “Reviews” section now has a “Rating” selector, and a revamped “Language” menu.
  • When you have multiple editions of a book, you now get small cards under the main card, so you can switch between your copies easily.
  • The “Quick Links” section has been streamlined and simplified.
  • The work sections have been reordered somewhat. If you don’t like the current order, you can reorder the sections, and the changes will “stick” for you.
  • A “Statistics” section at the bottom of the page lists key facts, including some new ones, covering the media (paper, ebook, audiobook) and languages the book has been published in. We also count up the ISBNs, UPCs and ASINs of all the editions.
  • The ratings graph on the right now defaults to showing only full stars—with half-stars rounded up. You can click the graph to see half-stars.
  • Empty sections are now hidden by default. There’s a button at the bottom of the work page to unhide them.
  • As with some other new pages, Common Knowledge now defaults to a “View” mode. Click “Edit” to see the more detailed editing interface. The button here “sticks,” so if you want to keep it in “Edit,” that’s fine.
  • The addition of publisher BISAC standards was mentioned above. The addition also includes a full set of BISAC pages, separate from the work pages, like CRA > CRAFTS & HOBBIES > Candle Making.
  • The “Editions” page now allows searching and sorting.
  • The “Share” button includes Threads and BlueSky.

Incomplete Features and Questions

  • The “Covers” page has a few improvements, including a better pop-up for each cover, and color coding of cover quality, but a larger revamp is still to come.
  • We’re still working on the “Collections” edit, which currently lacks a button to create new collections.
  • We’ve pulled back on LCSH (Library of Congress Subject Headings). A new much-expanded subject system—way beyond LCSH—is coming.
  • We’re eager to get feedback on the “Member Info” sections. If you don’t like them at all, you can turn them off, together with our work popups under Disable work and member info boxes.

That’s it! Thank you for reading. We’re eager to know what you think on Talk!

Using CloudFlare Turnstile to protect certain pages on a Rails app / Jonathan Rochkind

I work at a non-profit academic institution, on a site that manages, searches, and displays digitized historical materials: The Science History Institute Digital Collections.

Much of our stuff is public domain, and regardless we put this stuff on the web to be seen and used and shared. (Within the limits of copyright law and fair use; we are not the copyright holders of most of it). We have no general problem with people scraping our pages.

The problem is that, like many of us, our site is being overwhelmed with poorly behaved bots. Lately one of the biggest problems is bots clicking on every possible combination of facet limits in our “faceted search” — this is not useful for them, and it overwhelms our site. Adding to the injury, “search” pages are among the most resource-constrained categories of page on our present site. Peers say that even if we scaled up (auto or not), the bots sometimes scale up to match anyway!

One option would be putting some kind of “Web Application Firewall” (WAF) in front of the whole app. Our particular combination of team and budget and platform (heroku) makes a lot of these options expensive for us in licensing, staff time to manage, or both. Another option is certainly putting the whole thing behind the (ostensibly free) CloudFlare CDN and using its built-in WAF, but we’d like to avoid giving our DNS over to CloudFlare, I’ve heard mixed reviews of CloudFlare free staying free, and generally I am trying to avoid contributing to CloudFlare’s monopolistic, unaccountable control of the internet.

Ironically, then, the solution we arrived at still uses CloudFlare: Cloudflare’s Turnstile “captcha replacement”, one of those things that gives you the “check this box” or, more often, entirely non-interactive “checking if you are a bot” UXs.

[If you’re a “tl;dr, just look at the code” type, here’s the initial implementation PR in our open repo; there are some bug fixes since then.]

While this still might unfortunately lock out people using unconventional browsers, etc. (just the latest of many complaints on HackerNews), we can use this to protect only our search pages. Most of our traffic comes directly from Google to an individual item detail page, which we can now leave completely out of it. We have complete control over allow-listing traffic based on whatever characteristics, when to present the challenge, etc. And it turns out we had a peer at another institution who had taken this approach and found it successful, so that was encouraging.

How it works: Overview

While typical documented Turnstile usage involves protecting form submissions, we actually want to protect certain URLs, even when accessed via GET. Would this actually work well? What’s the best way to implement it?

Fortunately, when asking around on a chat for my professional community of librarian and archivist software hackers, Joe Corall from Lehigh University said they had done the exact same thing (even in response to the same problem, bots combinatorially exploring every possible facet value), and had super usefully written it up, and it had been working well for them.

Joe’s article and the flowchart it contains are worth a look. His implementation is a Drupal plugin (and used in at least several Islandora instances); the VuFind library discovery layer recently implemented a similar approach. We have a Rails app, so we needed to implement it ourselves — but with Joe paving the way (and patiently answering our questions, so we could start with the parameters that worked for him), it was pretty quick work, buoyed by the confidence that this approach wasn’t just an experiment into the blue, but had worked for a similar peer.

  • Meter the rate of access, either per IP address, or as Joe did, in buckets per sub-net of client IP address.
  • Once a client has crossed a rate-limit boundary (in Joe’s case 20 requests per 24-hour period), redirect them to a page which displays the Turnstile challenge, with the original destination in a query param in the URL.
  • Once they have passed the Turnstile challenge, redirect them back to their original destination, which now lets them in because you’ve stored their challenge pass in some secure session state.
  • In that session state record that they passed, and let them avoid a challenge again for a set period of time.

Joe allow-listed certain client domain names based on reverse IP lookup, but I’ve started without that, not wanting the performance hit on every request if I can avoid it. Joe also allow-listed their “on campus” IPs, but we are not a university and only have a few staff “on campus” and I always prefer to show the staff the same thing our users are seeing — if it’s inconvenient and intolerable, we want to feel the pain so we fix it, instead of never even seeing the pain and not knowing our users are getting it!

I’m going to explain and link to how we implemented this in a Rails app, and our choices of parameters for the various parameterized things. But also I’ll tell you we’ve written this in a way that paves the way to extracting to a gem — kept everything consolidated in a small number of files and very parameterized — so if there’s interest let me know. (Code4Lib-ers, our slack is a great place to get in touch, I’m jrochkind).

Ruby and Rails details, and our parameters

Here’s the implementing PR. It is written in such a way as to keep the code consolidated for future gem extraction, all in the BotDetectController class, which means, kind of weirdly, there is some code to inject via class methods in the controller. While it only does Turnstile now, it’s written with variable/class names such that analogous products could be made available.

Rack-attack to meter

We were already using rack-attack to rate-limit. We added a “track” monitor with our code to decide when a client had passed a rate-limit gate and should be required to complete a challenge. We start by allowing 10 requests per 12 hours (Joe at Lehigh did 20 per 24 hours), batched together by subnet. (Joe did subnets too, but we use a smaller /24 (i.e., x.y.z.*) for IPv4 instead of Joe’s larger /16 (x.y.*.*).)
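As a rough sketch of this metering step (the method name, the "/catalog" path, and the wiring comment are illustrative, not our exact configuration), the subnet bucketing looks like:

```ruby
# Bucket IPv4 clients by /24 subnet ("x.y.z.*" -> "x.y.z"); this string is
# the discriminator the rate-limit counter is keyed on.
def subnet_bucket(ip)
  ip.split(".").first(3).join(".")
end

# In a Rack::Attack initializer, a "track" rule (which counts but does not
# block on its own) could meter search pages per subnet, 10 requests per
# fixed 12-hour window, roughly like:
#
#   Rack::Attack.track("bot_detect", limit: 10, period: 12.hours) do |req|
#     subnet_bucket(req.ip) if req.path.start_with?("/catalog")
#   end
```

Because the discriminator is the truncated subnet rather than the full IP, a botnet spread across one /24 shares a single counter.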

Note that rack-attack does not use sliding/rolling windows for rate limits, but fixed windows that reset after the window period. This makes a difference especially when you use as long a period as we are, but it’s not a problem with our very low count per period, and it does keep the RAM use extremely efficient (just an integer count per rate-limit bucket).

When the rate limit is reached, the rack-attack block just sets a key/value in the rack env to tell another component that a challenge is required. (Setting it in the session might have worked, but we want to be absolutely sure this will work even if the client is not storing cookies, and this is really only meant as this-request state, so the rack env seemed a good way to set state in rack-attack that could be seen in a Rails controller.)

Rails before_action filter to enforce challenge

There’s a Rails before_action filter that we just put on the application-wide ApplicationController, that looks for the “bot challenge key” required in the rack env — if present, and there isn’t anything in the session saying they have already passed a bot challenge, then we redirect to a “challenge” page, that will display/activate Turnstile.

We simply put the original/destination URL in a query param on that page. (And we include logic to refuse to redirect to anything but a relative path on the same host, to avoid any nefarious uses.)
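A minimal sketch of that destination handling (the param name, helper names, and "/challenge" path are my assumptions for illustration, not the app's actual routes):

```ruby
require "cgi"

# Refuse to redirect anywhere except a relative path on the same host, so
# the dest query param can't be abused as an open redirect.
def safe_dest(dest)
  dest.is_a?(String) && dest.start_with?("/") && !dest.start_with?("//") ? dest : "/"
end

# Build the challenge-page URL carrying the original destination.
def challenge_url(original_fullpath)
  "/challenge?dest=#{CGI.escape(safe_dest(original_fullpath))}"
end
```

Anything that isn't a same-host relative path (absolute URLs, protocol-relative `//host` forms) collapses to `/`.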

The challenge controller

One action in our BotDetectController just displays the Turnstile challenge. The Cloudflare Turnstile callback gives us a token we need to verify server-side with the Turnstile API, to confirm the challenge was really passed.

The front end then does a JS/XHR/fetch request to the second action in our BotDetectController. That back-end verify action makes the API call to Turnstile and, if the challenge passed, sets a value in the Rails (encrypted and signed, secure) session with the time of the pass, so the before_action guard can give the user access.
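The server-side check goes to Cloudflare's documented siteverify endpoint; a sketch with Net::HTTP (helper names are mine, and error handling is omitted):

```ruby
require "net/http"
require "json"
require "uri"

# Cloudflare's documented server-side verification endpoint.
TURNSTILE_VERIFY_URL = "https://challenges.cloudflare.com/turnstile/v0/siteverify"

# The siteverify response body is JSON with a boolean "success" field.
def verification_success?(body)
  JSON.parse(body)["success"] == true
end

# POST the token from the front-end widget, plus our secret key, back to
# Cloudflare to confirm the challenge was really passed.
def turnstile_verified?(token, secret)
  res = Net::HTTP.post_form(URI(TURNSTILE_VERIFY_URL),
                            "secret" => secret, "response" => token)
  verification_success?(res.body)
end
```

Only after `turnstile_verified?` returns true would the pass timestamp be written into the session.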

If the JS in front gets a go-ahead from the back end, it uses JS location.replace to go to the original destination. This conveniently removes the challenge page from the user’s browser history, as if it never happened, with the browser back button still working great.

In most cases the challenge page, if non-interactive, won’t be displayed for more than a few seconds. (The language has been tweaked since these screenshots.)

We currently have a ‘pass’ good for 24 hours — once you pass a turnstile challenge, if your cookies/session are intact, you won’t be given another one for 24 hours no matter how much traffic. All of this is easily configurable.
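The 24-hour “pass” reduces to a timestamp comparison against the session value; a sketch with illustrative names (the real code parameterizes the TTL):

```ruby
# Has a recorded challenge pass (epoch seconds, stored in the encrypted
# session) expired? TTL defaults to 24 hours but is configurable.
def pass_still_valid?(passed_at, now: Time.now.to_i, ttl: 24 * 60 * 60)
  !passed_at.nil? && (now - passed_at) < ttl
end
```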

If the challenge DOES fail for some reason, the user may be looking at the Challenge page with one of two kinds of failures, and some additional explanatory text and contact info.

Limitations and omissions

This particular flow only works for GET requests. It could be expanded to work for POST requests (with an invisible JS created/submitted form?), but our initial use case didn’t require it, so for now the filter just logs a warning and fails for POST.

This flow also isn’t going to work for fetch/AJAX requests; it’s set up for ordinary navigation, since it redirects to a challenge and then redirects back. Our use case is only protecting our search pages — but the blacklight search in our app has a JS fetch for “facet more” behavior. We couldn’t figure out a good/easy way to make this work, so for now we added an exemption config, and just exempt requests to the #facet action that look like they’re coming from fetch. We’re not bothered that an “attacker” could escape our bot detection for this one action; our main use case is stopping crawlers crawling indiscriminately, and I don’t think it’ll be a problem.

To get through the bot challenge requires a user-agent to have both JS and cookies enabled. JS may have been required before anyway (not sure), but cookies were not. Oh well. Only search pages are protected by the bot challenge.

The Lehigh implementation does a reverse-lookup of the client IP, and allow-lists clients from IP’s that reverse lookup to desirable and well-behaved bots. We don’t do that, in part because I didn’t want the performance hit of the reverse-lookup. We have a Sitemap, and in general, I’m not sure we need bots crawling our search results pages at all… although I’m realizing as I write this that our “Collection” landing pages are included (as they show search results)… may want to exempt them, we’ll see how it goes.

We don’t have any client-based allow-listing… but we would consider just exempting any client with a user-agent admitting it’s a bot; all our problematic behavior has been from clients with user-agents appearing to be regular browsers (but obviously automated ones, if they are being honest).

Possible extensions and enhancements

We could possibly only enable the bot challenge when the site appears “under load”, whether that’s a certain number of overall requests per second, a certain machine load (but any auto-scaling can make that an issue), or size of heroku queue (possibly same).

We could use more sophisticated fingerprinting for rate limit buckets. Instead of IP-address-based, colleague David Cliff from Northeastern University has had success using HTTP user-agent, accept-encoding, and accept-language to fingerprint actors across distributed IPs, writing:

I know several others have had bot waves that have very deep IP address pools, and who fake their user agents, making it hard to ban.

We had been throttling based on the most common denominator (url pattern), but we were looking for something more effective that gave us more resource headroom.

On inspecting the requests in contrast to healthy user traffic we noticed that there were unifying patterns we could use, in the headers.

We made a fingerprint based on them, and after blocking based on that, I haven’t had to do a manual intervention since.

def fingerprint
  result = "#{env["HTTP_ACCEPT"]} | #{env["HTTP_ACCEPT_ENCODING"]} | #{env["HTTP_ACCEPT_LANGUAGE"]} | #{env["HTTP_COOKIE"]}"
  Base64.strict_encode64(result)
end

…the common rule we arrived at mixed positive/negative discrimination using the above

request.env["HTTP_ACCEPT"].blank? && request.env["HTTP_ACCEPT_LANGUAGE"].blank? && request.env["HTTP_COOKIE"].blank? && (request.user_agent.blank? || !request.user_agent.downcase.include?("bot".downcase))

so only a bot that left the fields blank and lied with a non-bot user agent would be affected
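Wired into rack-attack, that quoted condition could become a blocklist rule along these lines (rule and method names are mine; I use plain emptiness checks where the original uses Rails' `.blank?`):

```ruby
# Header-fingerprint check in the spirit of the quoted rule: flag requests
# with no Accept / Accept-Language / Cookie headers whose user agent is
# blank or doesn't admit to being a bot.
def suspicious_headers?(env)
  ua = env["HTTP_USER_AGENT"].to_s
  env["HTTP_ACCEPT"].to_s.empty? &&
    env["HTTP_ACCEPT_LANGUAGE"].to_s.empty? &&
    env["HTTP_COOKIE"].to_s.empty? &&
    (ua.empty? || !ua.downcase.include?("bot"))
end

# In a Rack::Attack initializer, hypothetically:
#
#   Rack::Attack.blocklist("header-fingerprint") do |req|
#     suspicious_headers?(req.env)
#   end
```

An honest crawler sending `ExampleBot/1.0` with otherwise empty headers would not be blocked; a headless client faking a browser user agent while sending none of the usual headers would be.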

We could also base rate limit or “discriminators” for rate limit buckets on info we can look up from the client IP address, either a DNS or network lookup (performance worries), or perhaps a local lookup using the free MaxMind databases that also include geocoding and some organizational info.

Does it work?

Too early to say, we just deployed it!

I sometimes get annoyed when people blog like this, but being the writer, I realized that if I wait a month to see how well it’s working to blog — I’ll never blog! I have to write while it’s fresh and still interesting to me.

But I’m encouraged that colleagues say very similar approaches have worked for them. Thanks again to Joe Corall for paving the way with a Drupal implementation, blogging it, discussing it on chat, and answering questions! And thanks to all the other librarian and cultural heritage technologists sharing knowledge and collaboration on this and many other topics!

I can say that already it is being triggered a lot, by bots that don’t seem to get past it. This includes Googlebot and Meta-ExternalAgent (which I guess is AI-related; we have no particular use-based objections we are trying to enforce here, just trying to preserve our resources). While Google also has no reason to combinatorially explore every facet combination (and has a sitemap), I’m not sure if I should exempt known resource-considerate bots from the challenge (and whether to do so by trusting user-agent or not; our actual problems have all been with ordinary-browser-appearing user-agents).

Update 27 Jan 2025

Our original config — allowing 10 search results per IP subnet before turnstile challenge — was not enough to keep the bot traffic from overwhelming us. Too many botnets had enough IPs making apparently fewer than 10 requests each.

Lowering that to 2 requests was enough to reduce traffic enough. (Keep in mind that a user should only get one challenge per 24 hours unless IP address changes — although that makes me realize that people using Apple’s “private browsing” feature may get more, hmm).

Pretty obvious on these heroku dashboard graphs where our successful turnstile config was deployed, right?

I think I would be fine going down to challenge on first search results, since a human user should still only get one per 24 hour period — but since the “success passed” mark in session is tied to IP address (to avoid session replay for bots to avoid the challenge), I am now worried about Apple “private browsing”! In today’s environment with so many similar tests, I wonder if private browsing is causing problems for users and bot protections?

You can see on the graph a huge number of 3xx responses — those are our redirects to challenge page! The redirect to and display of the challenge page seem to be cheap enough that they aren’t causing us a problem even in high volume — which was the intent, nice to see it confirmed at least with current traffic.

We are only protecting our search result pages, not our item detail pages (which people often get to directly from Google) — this also seems successful. The real problem was the volume of hits from so many bots trying to combinatorially explore every possible facet limit, which we have now put a stop to.

Exploring Tweets to Donald Trump / Nick Ruest

Overview

The dataset was collected with Documenting the Now’s twarc using a combination of the Twitter Search and Filter (Streaming) APIs. Between May 7, 2017, and October 16, 2018, data collection utilized both the Filter (Streaming) API and the Search API; however, the Filter API failed on June 21, 2017. From June 23, 2017, onward, only the Search API was employed. Data collection was automated to run every five days using a cron job, with periodic deduplication. A data gap occurred between Tue Jul 28, 2020, 13:53:50 +0000, and Thu Aug 06, 2020, 09:36:23 +0000, due to a collection error. The collection resulted in 362,464,578 unique tweets, which represents 1.5 TB of JSONL on disk 🤯

I also have an overview of this dataset here, but this post will provide a little bit more detail.

Tweets to Donald Trump tweet volume

There are a few other Tweet Volume graphs embedded in the post mentioned above that are worth checking out as well.

Tweets to Donald Trump wordcloud

In addition to the wordcloud of the entire dataset, I also created a wordcloud for each day of a month of tweets from this dataset back in 2017.

Unbelievably, four years later, we’re back with this godforsaken hamberder of a dataset and somehow, this pants-shitting, sexually abusive, defaming, convicted felon, rapist, wannabe dictator, and walking embodiment of corruption has slithered his way back into the presidency, proving once again that rock bottom is just a pit stop. Anyway, how are you feeling? Here’s a little video snippet from a project I worked on back in 2018. I extracted the text of every tweet containing the 🤡 emoji and used ImageMagick to create a “small” animated GIF (a mere 1.7GB 🤣🤣🤣).

Clown Car

Top languages

Using the full line-oriented JSON dataset and twut:

import io.archivesunleashed._

val tweets = "deardonald.jsonl"
val df = spark.read.json(tweets)

df.language
  .groupBy("lang")
  .count()
  .orderBy(col("count").desc)
  .show(10)
+----+---------+
|lang|    count|
+----+---------+
|  en|298997602|
| und| 54168869|
|  es|  2190225|
|  fr|   746477|
|  tl|   735650|
|  in|   574002|
|  pt|   440357|
|  ar|   410298|
|  tr|   363490|
|  zh|   344408|
+----+---------+

Top tweeters

Using deardonald-user-info.csv from df.userInfo.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("deardonald-user-info"), and pandas:

Tweets to Donald Trump Top Tweeters

Retweets

Using the full line-oriented JSON dataset and twut:

import io.archivesunleashed._

val tweets = "deardonald.jsonl"
val df = spark.read.json(tweets)

df.mostRetweeted.show(10, false)
+------------------+-----------------+                                          
|tweet_id          |max_retweet_count|
+------------------+-----------------+
|740973710593654784|569401           |
|870090101765931008|317660           |
|836060793783267328|257915           |
|835488569850494976|210894           |
|836060875169542145|194172           |
|865238855800279040|164118           |
|741007091947556864|163191           |
|633769653421109248|157976           |
|871342122984759297|156888           |
|822519410212515840|152282           |
+------------------+-----------------+

From there, we can append the tweet ID to https://twitter.com/i/status/ to see the tweet. Here are the top three:

  1. 569,401

  2. 317,660

  3. 257,915

Top Hashtags

Using deardonald-hashtags.csv from df.hashtags.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("deardonald-hashtags"), and pandas:

Tweets to Donald Trump hashtags

Top URLs

Using the full line-oriented JSON dataset and twut:

import io.archivesunleashed._

val tweets = "deardonald.jsonl"
val df = spark.read.json(tweets)

df.urls
  .filter(col("url").isNotNull)
  .groupBy("url")
  .count()
  .orderBy(col("count").desc)
  .show(10, false)
+-------------------------------------------------------------+------+          
|url                                                          |count |
+-------------------------------------------------------------+------+
|https://twitter.com/realdonaldtrump/status/870077441401905152|292199|
|https://t.co/M7oK5Z6qwF                                      |280188|
|https://twitter.com/realDonaldTrump/status/865173176854204416|168196|
|https://t.co/8yJIzZBSE8                                      |140706|
|https://twitter.com/realdonaldtrump/status/865173176854204416|81361 |
|https://twitter.com/realdonaldtrump/status/869766994899468288|75358 |
|https://twitter.com/realdonaldtrump/status/871143765473406976|68377 |
|https://youtu.be/mJ74YdyuYHg                                 |65590 |
|https://t.co/B8egsFLlg7                                      |65590 |
|https://twitter.com/realdonaldtrump/status/870445001125355522|57219 |
+-------------------------------------------------------------+------+

Top media urls

Using the full line-oriented JSON dataset and twut:

import io.archivesunleashed._
import org.apache.spark.sql.functions.col

val tweets = "deardonald.jsonl"
val df = spark.read.json(tweets)

df.mediaUrls
  .filter(col("media_url").isNotNull)
  .groupBy("media_url")
  .count()
  .orderBy(col("count").desc)
  .show(10, false)
+--------------------------------------------------------------------------------------------------+-----+
|media_url                                                                                         |count|
+--------------------------------------------------------------------------------------------------+-----+
|https://pbs.twimg.com/media/DAgpPEmXoAASp6X.jpg                                                   |31922|
|https://pbs.twimg.com/media/DBIW_UUU0AAaX6s.jpg                                                   |28888|
|https://pbs.twimg.com/media/DB6tiJoWAAEh_Vc.jpg                                                   |27461|
|https://pbs.twimg.com/media/ETFp5GEX0AEfp13.jpg                                                   |26327|
|https://video.twimg.com/amplify_video/1273369725490380804/vid/640x360/IlaFPv-rbURV8bgo.mp4?tag=13 |19983|
|https://video.twimg.com/amplify_video/1273369725490380804/vid/1280x720/5Ohn7o3xJsE5O25p.mp4?tag=13|19983|
|https://video.twimg.com/amplify_video/1273369725490380804/pl/F8qBWngbsARSlkYa.m3u8?tag=13         |19983|
|https://video.twimg.com/amplify_video/1273369725490380804/vid/480x270/7fgP_0uOckJ2AVLu.mp4?tag=13 |19983|
|https://pbs.twimg.com/media/EbhZt7OWkAAs-kY.jpg                                                   |19469|
|https://pbs.twimg.com/media/CovKyD6WgAA2fDs.jpg                                                   |18638|
+--------------------------------------------------------------------------------------------------+-----+

A couple of years ago I created a juxta (collage) of some of the images from this dataset. It features 17,525,913 images tweeted at Donald Trump from May 2017 through January 2019, and you can check it out here. I also have a post about creating the collage here. If you want to see the clown tweets, here is a juxta of that too! Or images from replies to “Totally clears the President. Thank you!” or “HAPPY NEW YEAR TO EVERYONE, INCLUDING THE HATERS AND THE FAKE NEWS MEDIA!”

Emotion score

Using deardonald-text.csv from df.text.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("deardonald-text"), Polars, and j-hartmann/emotion-english-distilroberta-base:

Emotion Distribution in Tweets to Donald Trump

(That chart took 580.5 hours to complete! Thank you Jimmy!!)
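
Once the j-hartmann/emotion-english-distilroberta-base classifier has labeled each tweet, the chart itself is just a normalized tally of those labels. A toy sketch of that aggregation step (the labels list here is invented, standing in for real model output):

```python
from collections import Counter

# Invented labels standing in for the classifier's per-tweet predictions;
# the real labels would come from running the model over deardonald-text.csv.
labels = ["anger", "anger", "disgust", "joy", "anger", "sadness"]

counts = Counter(labels)
total = sum(counts.values())
# Percentage of tweets per emotion, as plotted in the distribution chart.
distribution = {emotion: round(100 * n / total, 1) for emotion, n in counts.items()}
print(distribution)
```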

Persistence Tea / Coral Sheldon-Hess

Preface

Before we start, I should point out: I am not a doctor. I have no medical training. I only have a couple years’ worth of herbal studies, so I consider myself a student of herbalism more than a practitioner. Nothing in this post should be read as medical advice or even as herbal advice; I am, in fact, barred from providing herbal advice by virtue of being a Maine Master Gardener Volunteer.

But I am allowed to tell you about my final project for the Racemes of Delight program, which I’ve worked on since classes ended in early November. I hope you find value in what I’m sharing, but it’s a project report, not advice.

I will give this one piece of advice, though: if you’re going to try a new-to-you herb, especially if you have any medical conditions or are on any medications, it’s best to talk to your doctor and/or pharmacist first and also to start with only a small amount, to make sure it agrees with you.

If you want to skip ahead, you can go straight to the recipe, as long as you promise to at least read the bold parts of the precautions as well.

Project goals

The Racemes of Delight program ends in a medicine show. Each student brings one or more teas, tinctures, salves, shrubs, oxymels, etc. that they’ve formulated over the last few months of class to show off, ideally allowing others to smell and taste their creations. Because I only attended distance sessions, it didn’t occur to anyone that I could present a project over Zoom until a couple of days before it happened. Rather than rush, I opted to take more time creating something and to share it via a blog post. (And Zoom was misbehaving all that day, so I’m glad this is how it ended up.)

For my project, I set out to create a tea blend that might help a person through hard times … just, you know, as a fun hypothetical exercise.

Imagine, if you will, living during a difficult period of history, a time in which one’s heart is heavy with the weight of everything they’re witnessing, in which one feels tired and depleted but also overwhelmed, anxious, and stressed. Imagine someone having to constantly fight feelings of helplessness, because it all seems like so much. There’s danger ahead, and also uncertainty. … That would be hard on a person’s body and mind.

So my goal was to blend some herbs into a tea that might relieve a little bit of a person’s heartache and stress, help them ground themself, and give them some strength to help them carry on, working to save what and who they can. The tea would also have to taste nice, both because someone living through this would deserve some pleasant experiences and because “drink your tea” shouldn’t become another item on a to-do list during hard times; it should be a source of joy.

I wanted it to be helpful to as many people as it could be, so I tried to use herbs that are safe for as many people as possible and for which the safe dosages are high (“food-like,” to use the terminology). I also wanted to stick to herbs that are easy to acquire. And here is a little bit of herbal jargon that you’re free to skip, but I was aiming to get as close to energetically neutral as I could: nothing that anyone would find too cooling, heating, drying, or moistening.

In a departure from my usual approach to herbs, I focused more on the emotional side of these plants than on their effects on the physical body, though I (of course!) considered both, as you’ll see if you read the next section.

Research montage

I knew from the first glimmer of the idea for this tea that I would be building it around hawthorn; it’s one of my favorite herbs, a relaxing nervine with a strong affinity for the heart (both emotionally and physically), offering nutrition and fighting inflammation. Rosalee de la Forêt says (same link) hawthorn “is for when you’re feeling broken hearted, experiencing a loss of heart or when you find yourself in a self-protective and walled-off survival mode during a stressful season.” And I can’t find the quote — I think it was said aloud during one of Rosalee de la Forêt’s classes — but it was something like, “if you love someone, give them a mug of hawthorn tea each morning, but only if you love them a lot,” with the implication that it’ll make them live longer.

It is also a protective herb, and I would wish protection upon anyone who felt a need for this tea.

Hawthorn’s leaves and flowers were an obvious choice for a tea. I’d have loved to have used the berries (“haws”), too, but they need longer in the water (really, some time at a boil) to release their constituents and their flavor.

Energetically, hawthorn is slightly cooling and considered “dry to neutral.” Maybe it’s just me misinterpreting its astringency as dryness (herbal energetics aren’t my best thing), but I felt like I needed a moistening plant to balance it, something else with a heart affinity. My first choice was violet, another nourishing plant, and I liked that it has a more active vibe, offering a sense of joy and of movement. One source mentioned using it for “anger headaches and ‘discontented mind,’” which felt pretty perfect. And I like its taste. The thing is, violet can’t be found at any online herb retailer right now; even my own supply is running low. Perhaps in the spring, it could be re-added to this formulation, but for winter I’ve had to leave it out. Its replacement was the last herb I added, though, so I’m going to skip to the next herb I settled on:

Lavender insisted on being included, or perhaps it was my brain that insisted lavender should go in. I’ve found myself craving it, and I couldn’t bring myself to leave it out of the tea entirely — though its contribution to the flavor is pretty subtle. Lavender is another relaxing nervine with anti-inflammatory properties, but its inclusion is largely about getting necessary rest and grounding oneself. To be formal, I’d point out that it has been well studied for fighting anxiety, depression, headaches, and pain. And, sure, I’d acknowledge that some (most?) of those studies are about the scent of the essential oil, rather than the constituents of the herb itself. But informally, on a personal level? It helps me with all four of those things. To be even less formal? I just feel like it’s a great herb for turning down the volume and getting out of frenzy mode, which is something someone living through challenging times might need.

I wanted ashwagandha in this tea. I really did. It’s calming while increasing energy, which felt so necessary. It gives a cognition boost, which seemed useful. It’s safe even in pretty large quantities and doesn’t taste bad. … But it’s a root. People most often take it as a powder. If you’re going to try infusing and drinking a root, you aren’t going to do it in a cup with a bunch of leaves, as a rule. Roots usually need longer to steep, and I just can’t find much in the way of reliable sources talking about ashwagandha as a tea. I’ll leave this page of information about ashwagandha here, because I’m looking for ways to add it to my life; maybe it’ll appeal to other people, too.

I considered rose, as well. I liked its flavor (in very small quantities) in an early version of the tea, and I don’t regret buying some to have on hand for myself — I’ve been sitting on a rose cookie recipe for two years! — but being realistic, I thought its inclusion would drive the price too high, for properties we could get from other, less expensive herbs. Plus, even more so than lavender, people might be tempted to pull rose petals from florists’ bouquets, and there are some nasty pesticides there. So it was rejected, though of course anyone who already had some around could certainly add a tiny sprinkle if it would bring them joy.

Oatstraw showed up in an early version of this tea blend, too, but I later learned (relearned?) that that’s really better as an overnight infusion or boiled for a long time, which was, after all, what I had originally bought it for. (I like to throw that and some hawthorn berries into a pot and cook them for a while to make a nice nutritious decoction.) So no oatstraw in this blend, because it’s better used in other ways.

There was another relaxing nervine that I felt I needed to consider: holy basil, also known as tulsi. It is beloved among my herbal community, but it has just never settled all that well for me when I’ve tried consuming it. I don’t know if it’s too dry — I don’t think it’s too hot, because ginger is my best pal and also extremely hot — or if there’s something else about it, but it’s an herb I’ve bounced off of several times. I still felt that it deserved a fair shot, so I purchased the mildest form, Rama tulsi (Ocimum sanctum), which is conveniently also the easiest to buy, and decided it would be one of the smaller parts of the blend, if it went in.

Why the fair shot? Well, I had to give up on ashwagandha but still wanted something to help with thinking straight during stressful times. Tulsi is a powerful adaptogen (there’s a great definition on that page for anyone who wants it), anti-inflammatory, stress-relieving, and just super full of really nice properties. I’ve heard multiple people in my class say that they find it soothing and joyful; one person compared its energy to that of rose, even saying it tasted “pink” to them.

And it tasted nice with the other herbs in the early versions of this tea that used violet, so it stayed. (It does not taste like the basil most of us have in our kitchen.)

The last herb to join the blend was one that I didn’t have as much familiarity with. I started looking at it because it has a lot in common with violet: an association with joy, affinity with the heart, cooling and moistening energetics, sweet taste, usefulness in a nourishing infusion (which means it offers vitamins and minerals), immune modulation, and anti-inflammatory properties. I wasn’t sure I would like its flavor as much as violet’s, and I was a little sad to lose the “get things moving” aspect of violet’s actions, but I ordered some and gave it a try. I’m glad I did! A couple of weeks ago linden took its place as the last ingredient in — but also the second largest constituent of — this tea.

The thing I like about linden, in this tea? Besides all those things in the last paragraph? If one’s muscles are tight from stress, like maybe one needs reminders to “get your shoulders out of your ears” or “unclench your jaw,” linden’s an herb worth considering. That felt right. Linden has a lot going for it.

So that’s the tea: hawthorn leaves and flowers, linden leaves and flowers, Rama tulsi leaves, and lavender flowers.

I take it as a good sign that pairs of these herbs are often used together.

The recipe, two ways

By volume (which is how I blended it):

  • 2 parts dried hawthorn leaves and flowers
  • 2 parts dried linden leaves and flowers
  • 1 part dried lavender flowers
  • 1 part dried Rama tulsi leaves

By weight (approximate, based on measured weights while I blended by volume):

  • 3 parts dried hawthorn leaves and flowers
  • 2 parts dried linden leaves and flowers
  • 1 part dried lavender flowers
  • 1 part dried Rama tulsi leaves

When I’m mixing up tea to taste, each “part” in the recipe by volume is a tablespoon, and it goes in a little jar. After I try a few cups and know I like it, I’ll make a larger batch: depending on the size of my storage container, I use whatever makes sense, maybe a quarter-cup or third-cup measuring cup if it’s going in a pint jar. The main thing is to keep one’s dried herbs in something airtight.

To make a mug of tea, put a heaping teaspoon into an infuser (either one of these mesh dealies or, for folks who can’t abide any plant matter in their tea, a paper tea filter) and steep for 5 minutes, covered — that’s how you get the most out of the tulsi. (Or one could steep longer, uncovered, which is what I do most of the time.)

I use at least a tablespoon of tea for a 20 ounce mug.

Perhaps because the linden and hawthorn are both a bit astringent, my second taste tester (my spouse) thought I’d given him a black tea with herbs added, rather than a pure tisane (herbal tea, no actual tea leaves). We agree that, like a black tea, this could be enjoyed with milk and sweetener, if one so desires. I usually put in just a tiny bit of honey.

Once I’ve made someone else’s recipe, I always feel welcome to add things that I enjoy. For instance, assuming I didn’t have any trouble with stomach acid (no GERD or acid reflux) and liked the flavor, I’d consider throwing a pinch of whichever mint (peppermint, spearmint, lemon mint, etc.) most appeals to me into a cup, to see how I like it. Or food grade rose petals, if I had them on hand already. Or if it’s nearly bedtime, maybe some chamomile (I mean, I wouldn’t, because I’m allergic to it, but if I weren’t, it’d be a nice addition), or lemon balm (assuming my thyroid is generally well-behaved).

Precautions

All of the precautions (particularly the quoted portions) come from Rosalee de la Forêt’s information pages on these herbs and from the A-Z Guide to Drug-Herb-Vitamin Interactions Revised and Expanded 2nd Edition, edited by Alan Gaby.

Hawthorn is not to be used by people who have diastolic congestive heart failure. If I were in that boat, I would be inclined not to mess around with it at all, and I might reject this tea outright unless a medical professional told me otherwise.

Hawthorn requires caution for people who are on heart medications, especially digitalis and beta blockers. If I were in that boat, I would check with my doctor and an experienced herbalist, who might agree to allow this tea with the hawthorn cut down a bit, or as-is but in a very limited amount per day. (More on amounts below.)

A very small number of people experience contact dermatitis from linden, or find it hypes them up rather than calming them down. Since it was a new herb for me, I started by trying a very small amount, to make sure I tolerated it well.

DO NOT use lavender from a flower shop. Food-grade lavender and home-grown, pesticide-free lavender are the only safe lavenders to consume. (The species of lavender is less important than its intended use, but most herbalists use Lavandula angustifolia or another Lavandula species or cultivar.)

“Tulsi may have an anti-fertility effect … and thus should not be taken by couples wishing to conceive” or by people who are pregnant.

“Those who are taking insulin to control their diabetes may need to adjust their insulin levels while taking tulsi.” Having consistent amounts from day to day may help its effects become more predictable.

Sometimes people have allergies to herbs. So any new herb should be tried in small quantities, just in case. I’m allergic to like half the Asteraceae family and most of the Apiaceae/Umbelliferae, for instance.

Any herb can make you feel nauseated or cause other digestive distress in too large a dose. If that happens, decrease the amount. If it still happens, maybe you’ve found an herb that doesn’t agree with you. It happens, because bodies and plants are both weird.

This tea would be difficult, but technically not impossible, to drink too much of. When you gather Rosalee de la Forêt’s dosages of these four herbs, our very safe friend lavender appears to be the limiting factor; and even so, my math suggests a limit of 25 tablespoons of the dried herb mix per day (or ~200 ounces of prepared tea), once a person has determined that it agrees with them in smaller amounts.

  • Hawthorn leaf and flower dosage: up to 30 grams per day
  • Lavender flower dosage: 1 to 3 grams per day
  • Linden flower and bracts dosage: 15 to 30 grams per day
  • Tulsi leaves dosage: 2 grams to 113 grams (by weight)

Acquiring ingredients

I bought my herbs to test out my recipes from Mountain Rose Herbs (hawthorn, linden, Rama tulsi, lavender, ashwagandha); they also sell both styles of tea infuser. I’ve had luck in the past with Starwest Botanicals and occasionally with Frontier Co-op. Three of the herbs, plus infusers, are available at my old Pittsburgh go-to, Prestogeorge Coffee & Tea. (I have no relationship with any of these vendors/shops, though my membership in United Plant Savers gets me a discount at one or more of them.)

Clearly, I trust Rosalee de la Forêt. (I link her a lot here, because I can’t directly link to my notes from Wild Cherries or Racemes of Delight. Besides following her online, I’ve read one of her books and am taking one of her classes.) I also trust the Herbal Academy (with whom I’ve taken a class, too), and they have a longer list of suppliers, some with discount codes.

And, look: I have a fair amount of these herbs on hand, now. If we’ve exchanged addresses in the past, or phone calls, or hugs, I can just … send you some tea. Drop me an email or Discord message if you’d like to try it. (I can’t give you advice, but I can give you tea. I’d package it really formally if anyone international wanted it, because Customs is weird about these things.)

Final note

I reserve the right to improve upon this recipe over time. Like I said, I’m just a student. I imagine I’ll learn more about herbs and about blending teas, and my opinions will change. I’ll mark what I edit when I do, though, or if it’s a whole new recipe, I’ll make a whole new post and link it here.

Bibliomancy as the new PKM / Mita Williams

I love the practice of bibliomancy because it re-introduces me to the books that I have bought, and it re-animates my writing with ideas that I've already responded to and may have forgotten.

A new round of strategic funding to enhance data literacy and accessibility through the Open Data Editor / Open Knowledge Foundation

We are pleased to announce that the Open Knowledge Foundation (OKFN) has been selected as a grantee of the Patrick J. McGovern Foundation for the second year in a row to continue working on improving the Open Data Editor (ODE) application, making it more accessible and widely used by organisations around the world, and increasing data literacy in key communities. The Patrick J. McGovern Foundation, in a recent announcement, revealed a total grant allocation of $73.5 million to advance AI innovation with social purpose, $395,000 of which will be allocated to OKFN. To truly democratise AI and unlock innovation by the many, it is supporting efforts like ours, focusing on learning by doing as well as increasing the confidence of small organisations to apply AI to their work.

ODE is Open Knowledge’s new open source desktop application that makes working with data easier for people with little or no technical skills. It helps users validate and describe their data, improving the quality of the data they produce and consume. The release of stable version 1.2.0 was announced in early December, followed by a Release Webinar with early adopters early this week.

A different AI future, with communities shaping our tech 

Collaboration has been at the heart of this project since its inception in January 2024. Counting on the Patrick J. McGovern Foundation’s critical support, we conducted extensive user research to understand what data practitioners with no technical skills needed most, and these insights helped us change course several times throughout the year. With feedback always in mind, our team designed an interface that is not only visually appealing but also clear, simple, and easy to navigate. Every feature of the Open Data Editor was chosen with the user experience in mind, making it easy to explore and identify errors in tables.

Now, our challenge is to move from an application to a skilled community of practice. This year, our goal was to create an open source application that is, above all, simple and useful, following our vision of The Tech We Want of open, long-lasting, resilient and affordable technologies that are good enough to solve people’s problems. Going forward, we will improve the app’s technology and develop data literacy and a governance structure among the communities using it.

Learning by doing through pilots with public interest organisations 

We want to expand the app’s use while building data literacy in key communities. We are collaborating with organisations and collectives, helping them integrate data and learn good practices for their work. We started to test this approach with two public interest organisations, StoryData and ACIJ.

There will be more of that in the upcoming year. With this generous support, we will: 

🌱 Launch community pilots around the world, working together with ten key communities. 

⚙ Pilot how to develop affordable, sustainable and scalable technologies, sharing our knowledge with others. 

🤖 Increase AI literacy  with a practical approach, engaging in conversations about AI integration and potential interactions with open and offline LLMs. 

🎈 Convene a wider network of allied communities: Rather than competing with similar initiatives and projects, we want to build a stronger alliance with like-minded organisations to pool our resources, share best practices and ultimately support each other. We will use ODE as a vehicle to build a network of allied communities who are committed to ensuring that data is open and FAIR, and that technology is community-driven and human-centred.

Engage & Learn More

If you are interested in improving your data literacy skills and getting involved in community feedback sessions and pilots with the Open Data Editor, please contact us at info@okfn.org and share your thoughts. Together, we’ll make ODE even better.

To learn more about the Patrick J. McGovern Foundation’s commitment to redefining AI innovation with social purpose, read the official announcement.

To learn more about the Open Data Editor and its journey towards a no-code data application for everyone, you can visit the selected content below.

Collective collections through collective wisdom / HangingTogether

Collective collections—the combined holdings of multiple institutions, analyzed and sometimes even managed as a single resource—have transformed both the stewardship and impact of library collections. OCLC Research’s latest work in this field highlights a key insight into operationalizing collections at scale: collective wisdom, in the form of aggregated data and shared practitioner knowledge, makes collective collections work. Our research shows that collective wisdom in these forms supports the sustainability and strategic management of shared monographic print collections. 

OCLC Research has a long history of studies focused on collective collection analysis. The scope of this work is extensive, touching on the intersection of collective collections with a host of library strategic interests. A frequent topic addressed in this work is the role of collective collections in the cooperative management of monographic print collections – shared print programs. In a 2020 College & Research Libraries article summarizing some of our insights, we note that collective collection analysis supports local decision-making by making it system-aware: 

“The system can be a group, a consortium, a region, or even all libraries everywhere. Knowledge about the collective collection helps libraries orient their local collection management decisions—such as acquisitions, retention, and de-accessioning—within a broader context. In this sense, the rising importance of collective collections illuminates a shift in the strategy of managing collections, in which local collections are seen not just as assemblies of materials for local use, but also as pieces of a larger systemwide resource.” 

A selection of collective collection studies from OCLC Research

Much of the earlier work OCLC Research has done on collective collections has been of a descriptive nature, concentrating on what collective collections constructed in data look like in terms of size and scope. More recently, our emphasis has shifted to operationalizing collective collections: in other words, the practical aspects of making them a reality. For example, a few years ago we collaborated with the Big Ten Academic Alliance (BTAA) to publish a study that offered a framework and recommendations on how BTAA could move toward greater coordination of their collective print holdings. Some of our most recent work looks at how art libraries could use collaborative approaches to better support the stewardship and sustainability of their collective art research holdings.

Our research complements a similar OCLC service focus on operationalizing collective collections. Choreo Insights and GreenGlass offer analytics solutions that, among other things, provide “system aware” decision support for managing collections. WorldCat, OCLC’s vast database of information about library collections, serves as a platform for libraries to register retention commitments for materials covered by shared print programs. Resource Sharing for Groups allows groups of libraries to streamline sharing of materials within their collective holdings. These services help bring collective collections to life as a core element of collection management strategy. 

Stewarding the Collective Collection

Our latest research continues the theme of operationalizing collective collections with the Stewarding the Collective Collection project. This project extends OCLC Research’s considerable body of work on the role of collective collections in shared print programs, through two strands of inquiry: 

  • An Analysis of Print Retention Data in the US and Canada explores monographic print retentions registered in OCLC’s WorldCat database. This study analyzes over 100 million bibliographic records and 30 million retention commitment records covering libraries across the United States and Canada. 
  • US and Canadian Perspectives on Workflows, Data, and Tools for Shared Print gathers insights from library leaders, shared print program managers, and collection, metadata, and resource sharing librarians on the key workflows associated with managing monographic shared print efforts, and the data and tools needed to support them.  

The results from the first strand of work were recently published. Findings from the second strand of work will be shared later this year. 

The grand theme uniting both strands of the Stewarding the Collective Collection project is collective wisdom, achieved through two approaches: 

  • Aggregated data 
  • Insights and perspectives from librarians 

Aggregated data 

Aggregated data is collective wisdom in the sense that it gathers the results of decentralized, local library decision-making, transforming it into strategic intelligence that informs future decision-making by individual libraries or groups of libraries. For example, library holdings data represents the results of acquisition/collection development decisions; similarly, the assignment of subject headings in original cataloging represents a local decision on how to describe an item’s contents. Aggregating holdings data from many libraries yields strategic intelligence by illuminating the contours and features of the collective collection, which in turn can inform both local and group-scale collection management strategies. Aggregating subject headings data in a cooperative cataloging environment can also yield strategic intelligence—for example, uncovering historical trends in descriptive practices that would benefit from new, more inclusive thinking. 

Retention commitments data reflect another form of library decision-making—in this case, the decision to commit to steward a print publication, and (usually) make a copy or copies of the publication available for sharing. This commitment may be in effect for a finite period of time, or it may extend indefinitely. Often, these commitments are made in the context of a shared print program, leading to the creation of a shared print collection consisting of materials covered by retention commitments made by the program’s participants. The aggregation of retention commitments, such as those registered in the WorldCat database, provides valuable intelligence on the current state of stewardship of the collective print collection, including retention coverage, key gaps, and unnecessary duplication evident across current commitment patterns. This intelligence, in turn, can inform future decision-making on the renewal of existing commitments, or the creation of new ones.

Stewarding the Collective Collection: An Analysis of Print Retention Data in the US and Canada (OCLC Report)

The gathering of collective wisdom through analysis of aggregated retention data is the topic of OCLC Research’s new study, Stewarding the Collective Collection: An Analysis of Print Retention Data in the US and Canada. Exploring the retention commitments attached to the US and Canadian collective print monograph collection, as it is represented in WorldCat, led to findings that provide insight into the current state of retention coverage, as well as priorities for the shared print community to address in the near future, such as the imminent expiration of a significant fraction of current retention commitments.

Insight and perspective 

Insight and perspective are perhaps more conventional types of collective wisdom, in that they draw from and aggregate the “personal wisdom” of individuals—in this case, librarians who have shared their experiences and hard-earned lessons learned from participating in some activity. This knowledge is invaluable for other librarians facing similar scenarios and challenges as they formulate their own decision-making. For example, recent work by OCLC Research gathered and synthesized library experiences in collaborative partnerships, in the areas of research data management and stewarding art research collections. As we note in one of these studies: 

“Our interview-based approach elicited a wealth of invaluable perspectives, insights, and advice on library collaboration that we synthesized into a set of recommendations for libraries contemplating future collaborations. . . . Effective library collaboration is art as much as science. While concepts, frameworks, and theory are important for deepening our understanding of what makes collaborations successful and sustainable, we believe that sharing practical experiences of successful collaboration is also essential.” 

We followed a similar approach for Stewarding the Collective Collection’s second strand of work, which explores workflows, data, and tools used to manage shared print collections for monographic materials. Questions addressed in the project include: 

  • What are the key workflows supporting stewardship of shared print monograph collections? 
  • What data and tools are currently used to support these workflows? 
  • What gaps in data, tools, or other resources exist, and how might addressing these gaps open new opportunities for collective stewardship of print collections? 

To answer these questions, we gathered “collective wisdom” through individual and focus group interviews, as well as an online survey. We are in the process of analyzing and synthesizing this data, and we’ll be disseminating our findings through a variety of channels. Our hope is that these findings will provide libraries with a benchmark view of the current state of practice surrounding shared print workflows, data, and tools; help optimize practices having to do with collection evaluation and coordinated collection stewardship, both at the local and group level; and consolidate community views on data and functionality needs and priorities in the area of monographic shared print. 

Collective wisdom drives conscious coordination 

OCLC supports the gathering of collective wisdom and its transformation into strategic intelligence for libraries through a wide range of channels. Tools like Choreo and GreenGlass are one approach. Another is OCLC Research, which gathers collective wisdom through data-driven studies like its WorldCat-based collective collection analyses, but also through studies that collect and synthesize the perspectives and lessons learned from library practitioners. Many of these studies have been conducted under the auspices of the OCLC Research Library Partnership, which is itself a channel for assembling collective wisdom through its mission of bringing research libraries together around mutual interests. 

Gathering collective wisdom through these and other channels is important because it informs stewardship of collective collections, which is itself a leading example of conscious coordination. Conscious coordination is a concept OCLC Research introduced in a 2015 report, where it is defined as “a strategy of deliberate engagement with—and growing dependence on—cooperative agreements, characterized by increased reliance on network intelligence (e.g., domain models, identifiers, ontologies, metadata) and global data networks.” Stewardship strategies based on conscious coordination are marked by four key features: 

  • Local decisions about stewardship are taken with a broader awareness of the system-wide stewardship context—who is collecting what, what commitments have been made elsewhere in terms of stewarding various portions of the scholarly record, and how the local collection fits into the broader system-wide stewardship effort. 
  • Declarations of explicit commitments are made in regard to portions of the local collection, in which institutions acknowledge, accept, and undertake to fulfill explicit collecting, curation, and accessibility responsibilities for certain materials. Fulfillment of these responsibilities is seen as a commitment to an external stakeholder community. 
  • A formal division of labor emerges within cooperative arrangements, with a greater emphasis on specialization. This will occur in the context of a broader, cross-institutional cooperative arrangement in which different institutions specialize in collecting, curating, and making available different portions of the scholarly record.  
  • More specialization in collecting activity must be accompanied by robust resource sharing arrangements that ensure relatively frictionless access to all parts of the collective collection, providing mutual assurance that materials collected by one institution will be made available to other partners, and vice versa. 

Conscious coordination, as a strategy for managing collections and stewarding the scholarly record, underscores the importance of effective collaboration to meet shared objectives, as well as data-driven intelligence to fuel understanding of collective collections and how best to build, manage, and sustain them. In other words, it amplifies the need for collective wisdom—in the form of both aggregated data and collective insight and perspective—to inform decision-making and strengthen partnerships. 

Turning collective wisdom into collective impact 

Collective wisdom, in the form of aggregated data and insights from librarians’ experiences, are vital sources of intelligence that can help build and sustain shared print efforts and other types of collective collections. The Stewarding the Collective Collection project taps into the collective wisdom of the library community in the service of strengthening and sustaining shared print programs, and ultimately, amplifying the impact of both past and future investment in the collective print resource. Watch for more findings from this project throughout 2025!

Thanks to my colleagues on the Stewarding the Collective Collections project – Inkyung Choi, Lynn Connaway, Lesley Langa, and Mercy Procaccini – for their comments on a draft of this post. Special thanks to Erica Melko for her usual editorial magic!

The post Collective collections through collective wisdom  appeared first on Hanging Together.

Issue 104: Long Term Digital Storage / Peter Murray

This week's Thursday Threads looks at digital storage from the past and the future. There are articles about the mechanics of massive data storage systems at tech giants like Google and Amazon, the continued use of floppy disks in certain industries, and the herculean efforts of digital archivists to access data stored on outdated media.

This week:

Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page. If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.

Hard Drives Go Bad

A few years ago, archiving specialist Iron Mountain Media and Archive Services did a survey of its vaults and discovered an alarming trend: Of the thousands and thousands of archived hard disk drives from the 1990s that clients ask the company to work on, around one-fifth are unreadable. Iron Mountain has a broad customer base, but if you focus strictly on the music business, says Robert Koszela, Global Director Studio Growth and Strategic Initiatives, “That means there are historic sessions from the early to mid-’90s that are dying.”
Inside Iron Mountain: It’s Time to Talk About Hard Drives, Mix Magazine, 19-Aug-2024

This article focuses on the music industry, but its story is applicable across all fields. Music production once used multi-track analog tape (where splicing was done with physical cuts and tape); when the process was done, the analog tape went into storage. Alarms went up in the field about media deterioration and a lot of effort was made to digitize the source materials. Those digitized artifacts were stored on hard drives, and everyone assumed they were now safe. But preservation of digital media is an active process — one can't assume that the disks will spin and that the software to read the files still runs.

When your goal is outliving the heat death of the universe

I sometimes think about the fact that Amazon S3 effectively has to exist until the heat death of the universe. Many millennia from now, our highly-evolved descendants will probably be making use of an equally highly evolved descendant of S3. It is fun to think about how this would be portrayed in science fiction form, where developers pore through change logs and design documents that predate their great-great-great-great grandparents, and users inherit ancient (yet still useful) S3 buckets, curate the content with great care, and then ensure that their progeny will be equally good stewards for all of the precious data stored within.
S3 as an Eternal Service, Last Week in AWS, 29-Mar-2023

The idea that struck me in this article is that a service provider like Amazon can't distinguish between what is important and what is not: if a customer asked Amazon to store it, Amazon will do its best to make sure it's retrievable. How much storage is in use — multiple copies on multiple drives in multiple servers and multiple locations — for files that have zero value?

Distributed Storage Systems

The impact of these distributed file systems extends far beyond the walls of the hyper-scale data centers they were built for— they have a direct impact on how those who use public cloud services such as Amazon's EC2, Google's AppEngine, and Microsoft's Azure develop and deploy applications. And companies, universities, and government agencies looking for a way to rapidly store and provide access to huge volumes of data are increasingly turning to a whole new class of data storage systems inspired by the systems built by cloud giants. So it's worth understanding the history of their development, and the engineering compromises that were made in the process.
The Great Disk Drive in the Sky: How Web giants store big—and we mean big—data, Ars Technica, 25-Jan-2012

This 13-year-old article explores the massive data storage systems used by major tech companies like Google, Amazon, and Facebook to manage their vast information stores. Traditional methods of scaling storage, such as increasing disk capacity or adding more servers, fall short at the scale of cloud computing environments. While you may not ever operate at the scale of these companies, it is interesting to read about how the tech giants do data storage and management. (The article's subtitle also refers to "big data" — a phrase that was fashionable in the previous decade but one which we don't hear much about anymore.)

Industries are still using floppy disks

A surprising number of industries, from embroidery to aviation, still use floppy disks. But the supply is finally running out.
Why the Floppy Disk Just Won’t Die, Wired, 6-Mar-2023

8-inch floppy disks were invented in the early 1970s; they could store a megabyte apiece. 5.25-inch floppy disks were introduced in the late 1970s; while physically smaller, the high-density version could store about a megabyte and a quarter per disk. 3.5-inch disks (no longer called "floppy" because they came in a hard plastic case) reached the market in the early 1980s and could store a megabyte and a half. Each of these formats is still used today. (Maybe not the 8-inch floppies; those were retired from nuclear weapons silos in 2019.)

Reading old floppy disks

Raw flux streams and obscure formats: Further work around imaging 5.25-inch floppy disks, Digital Preservation at Cambridge University Libraries, 19-Apr-2024

Speaking of floppy disks, digital archivists from Cambridge University Library and Churchill Archives Centre detail their efforts to image the contents of 5.25-inch floppy disks. From soliciting donations of old floppy disk drives to the hardware and software required to access these old disks on new hardware, the report is a fascinating look at the past (and maybe a preview of what future generations will need to do to read today's digital storage media).

Century-scale Storage

This piece looks at a single question. If you, right now, had the goal of digitally storing something for 100 years, how should you even begin to think about making that happen? How should the bits in your stewardship be stored with such a target in mind? How do our methods and platforms look when considered under the harsh unknowns of a century? There are plenty of worthy related subjects and discourses that this piece does not touch at all. This is not a piece about the sheer volume of data we are creating each day, and how we might store all of it. Nor is it a piece about the extremely tough curatorial process of deciding what is and isn’t worth preserving and storing. It is about longevity, about the potential methods of preserving what we make for future generations, about how we make bits endure. If you had to store something for 100 years, how would you do it? That’s it.
Century-Scale Storage: If you had to store something for 100 years, how would you do it?, Harvard Law School Library Innovation Lab, undated

This 15,000-word essay looks at digital storage from the earliest hard drives (including restoring data from a 1960s-era IBM hard disk prototype) to the cloud to old-fashioned print-on-paper. There are discussions of the reliability and longevity of different storage methods, such as RAID systems, cloud storage, and physical media like vinyl records and tape drives. But it isn't just about the physical medium... the article also highlights the importance of institutional commitment, funding, and cultural values in ensuring the preservation of data. Ultimately, the writers suggest that successful century-scale storage requires a combination of methods, a culture of vigilance, and a commitment to preserving human cultural memory.

This Week I Learned: In Ethiopia, time follows the sun like nowhere else

Because Ethiopia is close to the Equator, daylight is pretty consistent throughout the year. So many Ethiopians use a 12-hour clock, with one cycle of 1 to 12 — from dawn to dusk — and the other cycle from dusk to dawn. Most countries start the day at midnight. So 7:00 a.m. in East Africa Time, Ethiopia's time zone, is 1:00 in daylight hours in local Ethiopian time. At 7:00 p.m., East Africa Time, Ethiopians start over again, so it's 1:00 on their 12-hour clock.
If you have a meeting in Ethiopia, you'd better double check the time, The World from PRX, 30-Jan-2015

This could have easily gone in last week's Thursday Threads on time standards. There are 12 hours of daylight, numbered 1 through 12. Then 12 hours of night, numbered 1 through 12. What could be easier?
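The conversion the quote describes is simple modular arithmetic: the Ethiopian clock starts its count at what East Africa Time calls 7:00. A minimal sketch in Python (a hypothetical helper for illustration, not anything from the article):

```python
def to_ethiopian_hour(hour_24: int) -> int:
    """Convert an East Africa Time hour (0-23) to the Ethiopian
    12-hour clock, where 7:00 EAT is 1 o'clock in the day cycle
    and 19:00 EAT is 1 o'clock in the night cycle."""
    return (hour_24 - 7) % 12 + 1

# 7:00 a.m. EAT is 1:00 in daylight hours; 7:00 p.m. restarts the count.
print(to_ethiopian_hour(7))   # 1
print(to_ethiopian_hour(19))  # 1
print(to_ethiopian_hour(12))  # noon is 6 o'clock in daylight hours
```

The `% 12` wraps both cycles onto the same 1–12 scale, matching the two dawn-to-dusk and dusk-to-dawn counts described above.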

Alan and Mittens squabble in the cat tree

Two cats on a cat tree by a window; one above, white with black spots, and one below, black with a white chest. The one on the bottom is looking up with a 'hiss' in its mouth; the one on top looks down unconcerned.

These two troublemakers. Alan is the cat on top, looking down on Mittens below. In this cozy sunlit room with a cat tree by an open window, you'd think these two would get along. Not so. Alan's typical perch is on top of the cat tree, so it is Mittens that is intruding (if you could call it that).

The Tech We Want: Reflecting on the Super Election Year 2024 / Open Knowledge Foundation

On January 14th, 2025, the Open Knowledge Foundation brought together leading voices from around the world to reflect on the Super Election Year 2024, during which 3.7 billion people voted in 72 countries. This online event, part of The Tech We Want initiative, examined the profound role of technology in electoral processes, its potential, and its pitfalls, and was a continuation of the Digital Public Infrastructure for Electoral Processes roundtable discussions we held in 2023.

The event featured nine experts who shared their regional perspectives and actionable insights:

Alejandra Padilla, journalist at Serendipia, recounted Mexico’s innovative use of technology, particularly online voting for citizens abroad. However, she highlighted the complications that arose, such as thousands of registrations being rejected due to technical issues and user errors. This case exemplified how tech, while aiming to simplify voting, can unintentionally create barriers. Alejandra also shared Serendipia’s project using AI to summarize candidate platforms and help voters identify alignment with their own views, illustrating how technology can empower informed decision-making.

Emmanuel K. Gyan, from Fact-Check Ghana, shared insights from Ghana’s elections, where misinformation and disinformation were widespread, especially through social media. He highlighted initiatives like setting up situational rooms to counter false narratives in real-time. However, challenges such as limited access to fact-checking tools and the cost of combating disinformation were significant hurdles. Emmanuel emphasised the importance of holding accountable those deliberately sharing fake news in order to deter future incidents.

Juan Manuel Casanueva, Executive Director of SocialTIC, explored the gap between technological ambitions and realities in Mexico. He pointed out issues like incomplete or unreliable candidate data and the need for standardised, comparative election results data. Civil society stepped in to address these gaps by creating tools and databases, such as verifying candidate information and visualising historical election data. He warned us about the growing influence of pseudo-journalists and influencers spreading political misinformation.

Julia Brothers, Deputy Director for Elections at the National Democratic Institute (NDI), presented a global perspective, acknowledging progress in areas like voter registration technology and open election data. However, she stressed that public confidence and trust remain a significant challenge. Julia noted that election technologies are often vendor-driven rather than problem-driven, leading to transparency and accountability deficits. She highlighted that more often than not, technology is developed in a way that is not solution-oriented, but rather adds extra problems — something that deeply resonated with the way we are thinking about technology at OKFN and our initiative The Tech We Want. Her call to action included better public communication about the scope and limits of election technologies to address this issue of trust.

Miazia Schüler, researcher at AI Forensics, focused on the risks posed by generative AI in elections. Her investigations, for example on the French Elections, revealed errors and inconsistencies in AI-generated election-related content, posing threats to voter trust. She noted that AI was increasingly used to create disinformation, such as AI-generated images dramatizing political narratives. Miazia called for robust safeguards, stricter content moderation, and transparency to mitigate risks associated with generative AI in political campaigns.

Narcisse Mbunzama, Open Knowledge Network Hub Coordinator for Francophone Africa, shared lessons from the DRC’s elections, where technology improved voter registration but also revealed vulnerabilities. In a context where trust in democratic institutions is low, centralized control over election servers raised concerns about data manipulation. Narcisse highlighted the need for decentralised and transparent systems to ensure accountability and trust.

Oluseun Onigbinde, from Open Knowledge Nigeria, discussed how technology can decentralise access to election data, empowering civil society to act as a check on governmental irregularities. However, he cautioned against the misuse of tech, citing examples of cybersecurity vulnerabilities and data privacy issues in Nigeria’s elections. Oluseun advocated leveraging informal influencer networks to counter disinformation effectively. He also underlined the importance of speed and influence in combating misinformation.

Setu Bandh Upadhyay, Open Knowledge Network Hub Coordinator for Asia, reflected on elections across Asia, where platforms like TikTok amplified foreign narratives and misinformation, particularly in multilingual contexts. He highlighted the lack of tools like CrowdTangle that once helped researchers track disinformation trends. Setu also raised concerns about internet shutdowns, which disproportionately impacted marginalized communities, including incidents of voter suppression and violence.

Sym Roe, CTO at Democracy Club, provided insights from the UK, where traditional forms of misinformation, such as misleading newspaper articles, remain a problem. He highlighted that disinformation is not solely a modern problem linked to new technologies—although they undoubtedly amplify its reach—but rather a challenge that has existed for centuries, dating back to the very origins of the press. He noted the retreat of social media companies from proactive election engagement, leaving civil society to fill the gaps. Sym called for a renewed focus on producing positive information to counter misinformation, rather than solely reacting to disinformation.

The event was introduced by OKFN Tech Lead, Patricio del Boca, and moderated by the International Network Lead, Sara Petti.

A Call to Action

The speakers highlighted both the potential and the perils of technology in electoral processes. Their collective insights emphasised the urgent need for:

  • Greater transparency and accountability in election technologies.
  • Stronger safeguards against generative AI misuse.
  • Equitable resources to combat disinformation globally.
  • Collaborative, scalable solutions to make technology accessible and impactful.

About the initiative

The Tech We Want is the current technological vision of the Open Knowledge Foundation. We are advocating for open, long-lasting, resilient and affordable technologies that are good enough to solve people’s problems, and for open and fair governance mechanisms and tools to truly democratise Data and AI.

In October 2024 we launched this initiative with a highly successful two-day online summit, where we tried to imagine together with key experts and activists new ways of developing tech, a kind of tech that is a common good, developed with and for the community, maintained with care, sustainable for the people and the planet, and built to last.

The summit featured 43 speakers from 23 countries in 14 hours of live streaming followed by 711 participants. We also gave space and context to 15 project demonstrations and featured major influencers from the civic and government space working at the intersection of technology and democracy.

The full documentation of the summit is gradually being published on our blog. You can follow the hashtag #TheTechWeWant on social media platforms to keep up with the new activities that will unfold over the coming months.

Readings for people working for the government / John Mark Ockerbloom

A key reason I got involved in digital libraries years ago was the promise of reliable information empowering people to be more knowledgeable and responsible in their actions. One of the oldest digital library sites on the Web is Cornell’s Legal Information Institute, which has had the mission since 1992 to make legal information free for all.

Here is some of the information provided on the site that I was recently reminded of:

All persons born or naturalized in the United States, and subject to the jurisdiction thereof, are citizens of the United States and of the state wherein they reside. No state shall make or enforce any law which shall abridge the privileges or immunities of citizens of the United States; nor shall any state deprive any person of life, liberty, or property, without due process of law; nor deny to any person within its jurisdiction the equal protection of the laws.

[US Constitution, Amendment XIV, Section 1]

An individual, except the President, elected or appointed to an office of honor or profit in the civil service or uniformed services, shall take the following oath: “I, AB, do solemnly swear (or affirm) that I will support and defend the Constitution of the United States against all enemies, foreign and domestic; that I will bear true faith and allegiance to the same; that I take this obligation freely, without any mental reservation or purpose of evasion; and that I will well and faithfully discharge the duties of the office on which I am about to enter. So help me God.” This section does not affect other oaths required by law.

[5 U.S. Code § 3331 – Oath of office]

Any employee who has authority to take, direct others to take, recommend, or approve any personnel action, shall not, with respect to such authority…. take or fail to take, or threaten to take or fail to take, any personnel action against any employee or applicant for employment because of… refusing to obey an order that would require the individual to violate a law, rule, or regulation.

[5 U.S. Code § 2302 – Prohibited personnel practices, traced through increasingly specific subsections (b), (9), and (D)]

In summary: Most people working for the government have taken an oath to support and defend the Constitution of the United States, which supersedes any requirement to follow the orders of any particular person, up to and including the President, when that person’s orders contradict the Constitution. They have the right to refuse to obey an order that violates the law. Furthermore, if the order also violates the Constitution, their oath makes that right a duty.

The Legal Information Institute, and other free digital libraries, also include lots of rulings of the Supreme Court and lower courts. These courts have the last word under American rule of law about the meaning of the Constitution. If the President, or any other individual, claims the Constitution means something it doesn’t say, such as that many people the Fourteenth Amendment says are born citizens aren’t really citizens, and doesn’t have the courts backing up his claim, that claim does not merit any more credence than his claims, say, that some people aren’t really people. Any orders he makes based on those claims do not override the duty of government officials to follow the Constitution, which in the Fourteenth Amendment quoted above guarantees birthright privileges and immunities to citizens, as well as due process and equal protection to “any person”, whether citizen or not.

If you know people who work for the government, or witness people in their work for the government, you have the power, and often the duty, to remind them of their duties and rights regarding the Constitution and the guarantees and obligations it sets.

As former government officials like Jeff Neal point out, even though government officials know about their oaths of office and know that orders can be unlawful, it can still be a challenge to recognize and respond appropriately to unlawful orders. If you’re in need of information to help you determine, and decide what to do about, orders that may violate the law, the Constitution, or your conscience, consider reaching out to a librarian near you for guidance. That’s what we’re here for.