Planet Code4Lib

A writer of pessimism and grace / John Mark Ockerbloom

William Golding called the bipolar Catholic author Graham Greene “the ultimate chronicler of twentieth-century man’s consciousness and anxiety”. Both Greene’s thrillers and his more serious novels are suffused with concerns of politics and religion, flawed institutions, characters who betray others and their own consciences, and grace and redemption in unexpected places.

His first novel, The Man Within, was published in 1929. It joins the public domain in 56 days.

Author Interview: Andrea Jo DeWerd / LibraryThing (Thingology)

Andrea Jo DeWerd

LibraryThing is pleased to sit down this month with author Andrea Jo DeWerd, who, in addition to her career in publishing and as an independent book marketer, recently saw her debut novel, What We Sacrifice for Magic, released by Alcove Press. DeWerd worked for more than a decade in the marketing and publicity departments of a number of Big 5 publishers, including Crown, Random House, Simon & Schuster, and most recently, the Harvest imprint of HarperCollins. In 2022 she launched her own marketing and publishing consulting agency, the future of agency LLC. Her authorial debut, published in late September, is a fantastical coming-of-age story following three generations of Minnesota witches during the 1960s. DeWerd sat down with Abigail to answer some questions about this new book.

How did the idea for What We Sacrifice for Magic first come to you, and how did the story develop? Did your heroine Elisabeth come first? Was it always a multi-generational family story in your mind, always a witchy tale?

I was trying to write a very different book about the American Dream, and my own family’s experience with it. My grandfather’s family were Dutch immigrants in Minnesota. My great-grandfather and his cousin operated several feed mills and fish hatcheries. The next generation, my grandfather and his brothers, all became doctors. I was fascinated by this story, and by what happens after the American Dream is achieved—what happens to the next generation? But it was too close to home for me to write in the years after my grandfather passed away.

What We Sacrifice for Magic grew out of the question: what were the women doing while the men were building their empire? I started to imagine a world in which the men ostensibly held the power, but beneath the surface, it was really the women pulling the strings; a world in which the women could be running a full-on witchcraft operation out of the side door of the kitchen while the men were off fighting their wars and building their supposed influence.

Elisabeth’s voice came to me first. I started to hear her voice, and the first thing I knew about her was that she was ruled by water. From there, I explored how she would’ve come to be that way, who would’ve taught her about her power, and Magda, her grandmother, her teacher, emerged pretty quickly.

Your book addresses themes of familial history, obligation and conflict, and the individual’s struggle to both belong to and be independent of the family circle. How does the witchy element in your story add to or complicate those themes? How different would your story be if the Watry-Ridder women weren’t witches?

In many books with magic, the magic acts as the deus ex machina that lifts the characters out of their unfortunate situations. Magic breaks oppressive forces in many ways. For Elisabeth, magic is what is holding her back, her burden. Aside from that magical burden, Elisabeth would still need her coming-of-age journey. I believe that even without magic, Elisabeth would’ve always felt separate from her family. She needed to learn who she is on her own, away from the reputation of her family and the name she was born to.

Without magic, this story becomes a much more familiar one. Anyone who has ever dealt with the pressures of a family business knows what it feels like to be torn between wanting to forge your own path and getting pulled back into the family responsibility. Adult children who take care of their aging parents know that tug-of-war as well. I think we all feel family pressure in some way or another in our lives, and beneath the magic, that is what I wanted to explore in this book.

What We Sacrifice for Magic is set in your own home state of Minnesota, and opens in 1968. What significance do the setting and time period have to your story?

The setting came to me first. Elisabeth, ruled by water, was always going to be from a small lakeside town in Minnesota. The town of Friedrich was inspired by my own beloved Spicer, Minnesota, where my family has had a cabin on Green Lake since 1938. The lake felt so integral to this story and this community that the Watry-Ridder family serves.

Moreover, this family had to come from a place that was rural enough for them to fly under the radar, a pastoral community that just accepted their local eccentrics, and even came to depend on them. I was also fascinated by the sort of gossip that happens in a small town. In a close-knit community, it’s impossible to walk down the street without everybody knowing everything about you, who you’re dating, etc. I wanted to see Elisabeth and her younger sister, Mary, engage with that gossip, and it certainly shapes them as they’re growing up in Friedrich with the sometimes unwanted attention.

More broadly, 1968 was a time when many young women were starting to have more choices in their education and the opportunity for careers outside of the home, in large part due to contraception. Those choices were not available to Elisabeth—she is stuck in this small town, tied to her community, as she watches her high school classmates going off to their next chapters.

What influence has your career in publishing and book marketing had on your storytelling? Have you been inspired by any of the authors whose books you have promoted?

I started writing this book when I was working full-time as a book marketer at Random House. I had been a creative writing minor in college, but I wasn’t really writing in my first 8 years in New York while I was in grad school and volunteering and focused on other things. I was inspired to start writing again in earnest when I would be in meetings with these amazing authors like Catherine Banner and Emma Cline, who were both a few years younger than me. I thought if they found time to do it, why couldn’t I? On the flip side, I was working with Helen Simonson at the time, who said that she didn’t really get to start writing until her kids were grown and out of the house, and I thought, “I’m single, I don’t have kids, what am I waiting for?”

I was also greatly inspired by Laura Lynne Jackson’s books The Light Between Us and Signs. Her first-person account of how close we are to the spirits on the other side very much influenced my own personal spiritual beliefs, some of which are woven into Elisabeth’s outlook and her experiences with her guide from the other side, Great-Grandma Dorothy, and the energy healing work that the family does.

Tell us about your writing process. Do you have a particular place you prefer to write, a specific way of mapping out your story? Did you know from the beginning what the conclusion would be?

I wrote at least 50% of this book long-hand in a journal. I write in the morning in bed before the rest of the world comes crashing in, i.e. before I look at my phone or email. My phone stays in the kitchen until after I’m done writing for the day. Once I got further into the story, though, I switched to drafting on my laptop when I was really building momentum.

I don’t believe you have to write every day. I have a day job! I write maybe a few days a week, and this book came together 100 words at a time. I would write a single paragraph in the morning before hopping in the shower and heading into Random House. My writing group talks often about setting realistic goals because the minute you set a lofty goal and miss that first day of “write every day,” it makes it that much harder to get back on track.

I barely outlined this book. This was very much a discovery writing project, but when I got into revision, I reverse-outlined what had happened so far in the book so that I could confidently write my way through to the end. I didn’t know the exact ending of the book until I was about ⅓ of the way through. I remember emailing my writing group one day to say, “I think I just wrote the last line of my book.”

For revision, the book Dreyer’s English by friend and former Random House colleague Benjamin Dreyer was essential to me. It was very helpful to read books like his as I was enmeshed in the revision process.

What can we look forward to next from you? Do you have other writing projects in the offing?

I am working on something completely different next! I am finishing a first draft this fall of my second novel, a contemporary Christmas rom-com set in southern Minnesota. There’s Christmas cookies, a local hottie, and a girl home from the big city. I’m approaching this book a little differently—starting with an outline!

Tell us about your library. What’s on your own shelves?

I am very much a mood reader and I read just about every genre out there. I love sci-fi and fantasy or romance for a quick vacation read. I try to keep up with the new, big literary novels. I have my section of craft books, like Big Magic and Bird by Bird. I have sections of series that I’m hoping to finish one day, like Outlander. I’m always reading our clients’ books for work. I have a celebrity chef’s memoir and a book by a performance and productivity expert to read next for work. But truthfully, my shelves are full of books I haven’t read that have come with me from job to job. I have classics, I have the hot releases dating back to 2010, I have signed copies of books I’ve worked on, like Educated and Born a Crime. I also have an amazing cookbook collection from my time working in lifestyle books, lots of Mark Bittman and Jacques Pépin and Dominique Ansel.

What have you been reading lately, and what would you recommend to other readers?

I just finished the new Louise Erdrich novel, The Mighty Red. She’s my favorite author and as a contemporary Minnesotan author, she has had a huge impact on me as a reader and a writer. I think Erdrich most accurately captures contemporary women—and the myriad ways the world disappoints us—like no one else I’ve ever read. I make a point to buy the new books by Louise Erdrich and William Kent Krueger, another Minnesotan author, in hardcover from indie bookstores when I’m back in MN. If you haven’t read Louise Erdrich before, one of my favorite books is The Round House. I recommend that book to everyone.

Information Quality Lab at the 2024 iSchool Research Showcase / Jodi Schneider

While I’m in Cambridge, members of my Information Quality Lab are presenting a talk and 9 posters today as part of the iSchool Research Showcase 2024, noon to 4:30 PM in the Illini Union. View posters from 12 to 1 PM, during the break between presentation sessions from 2 to 2:45 PM, and from 4 to 4:30 PM.

TALK by Dr. Heng Zheng, based on our forthcoming JCDL 2024 paper:
Addressing Unreliability Propagation in Scientific Digital Libraries
Heng Zheng, Yuanxi Fu, M. Janina Sarol, Ishita Sarraf, Jodi Schneider

POSTERS
Addressing Biomedical Information Overload: Identifying Missing Study Designs to Design Multi-Tagger 2.0
Puranjani Das, Jodi Schneider

Assessing the Quality of Pathotic Arguments
Dexter Williams

Cognitive and Behavioral Approaches to Disinformation Inoculation through a Hidden Object Game
Emily Wegrzyn

Distinguishing Retracted Publications from Retraction Notices in Crossref Data
Luyang Si, Malik Oyewale Salami, Jodi Schneider

Harmonizing Data: Discovering “The Girl From Ipanema”
John Rutherford, Liliana Giusti Serra, Jodi Schneider

“I Lost My Job to AI” — Social Movement Emergence?
Ted Ledford, Jodi Schneider

Recognizing People, Organizations, and Locations Mentioned in the News
Xioran Zhou, Heng Zheng, Jodi Schneider

Representation of Socio-technical Elements in Non-English Audio-visual Media
Puranjani Das, Travis Wagner

What People Say Versus What People Do: Developing a Methodology to Assess Conceptual Heterogeneity in a Scientific Corpus
Yuanxi Fu, Jodi Schneider

Panel: The Tech We Want is Built and Maintained with Care / Open Knowledge Foundation

The Tech We Want Summit took place on 17 and 18 October 2024 – in total, 43 speakers from 23 countries interacted with 700+ registered people about new practical ways to build software that is useful, simple, long-lasting, and focused on solving people’s real problems.

In this series of posts, OKFN brings you the documentation of each session, making the content generated during these two intense days of reflection and joint work open and accessible.

Above is the video and below is a summary of the topics discussed in:

[Panel 2] The Tech We Want is Built and Maintained with Care

17 October 2024 – 11:30 UTC

Digital technologies need people to care for them and keep them alive. At a time of obsession with innovation and disruption, this panel shines a light on the invisible but essential work of maintenance.

Summary

This panel sheds light on the often invisible, essential work of maintaining digital infrastructure, particularly open source software. The speakers argue passionately that the maintenance of software systems, like the ongoing care of a garden, is crucial to the sustainability of digital ecosystems. They highlight the systemic problems that maintainers face, such as burnout, lack of recognition and inadequate funding, and call for a radical shift in how this work is valued and supported.

Emphasising the ethical and social consequences of neglect, and the urgent need for a supportive community and adequate funding, the panellists argue for a culture of shared responsibility and visibility. They urge both corporations and open source communities to recognise this work, to create supportive structures, and to recognise that maintenance is as critical as innovation. The discussion is a clarion call to action, emphasising that we must prioritise care and sustainability in our digital world.

Read More

BIBFRAME Dilemmas for Libraries: Challenges and Opportunities / Richard Wallis

I recently attended the 2024 BIBFRAME Workshop in Europe (BFWE), hosted by the National Library of Finland in Helsinki. It was an excellent conference in a great city!

Having attended several BFWEs over the years, it’s gratifying to witness the continued progress toward making BIBFRAME the de facto standard for linked data in bibliographic metadata. BIBFRAME was developed and is maintained by the Library of Congress to eventually replace the flat record-based metadata format utilised by the vast majority of libraries – MARC (a standard in use since 1968).

This year, Sally McCallum from the Library of Congress shared significant updates about their transition to becoming a BIBFRAME-native organisation. In August 2024, they began a pilot with 15 cataloguers inputting records directly into BIBFRAME, marking the start of the next stage of a long journey. This process not only involved adopting a new system but also retraining a large number of staff—a significant challenge but a major step forward.

Several other organisations, including the Share Community, OCLC, Ex Libris, and FOLIO LSP, also presented their advancements in linked bibliographic metadata and BIBFRAME. While the progress is encouraging, there are some dilemmas, not really addressed in the conference, that libraries face as they consider adopting BIBFRAME, and I’d like to explore those here.

#1: Should linked data only be limited to bibliographic resources?

One of the key benefits of linked data is its ability to connect and relate resources across different domains, not just within traditional library systems. However, many libraries aiming to leverage linked data are primarily focused on bibliographic resources, especially as current BIBFRAME-enabled cataloguing solutions are often seen only as replacements for MARC-based systems.

The challenge arises when libraries want to integrate other types of resources—such as archival collections, historical documents, or art-related information—that don’t neatly fit into the BIBFRAME model. BIBFRAME excels at describing bibliographic resources, but it struggles with the nuances of these other resource types. There are initiatives to extend BIBFRAME to handle arts materials etc., but they are still very [bibliographic] library system focused.

Dilemma: Should a library implement a linked data solution solely for bibliographic resources (essentially as a MARC replacement), or should they adopt a broader linked data strategy that integrates all types of resources across the organisation?

My thought: If a [linked data enabled] replacement for a current library system is all you are looking for, that’s fine. However, if that is all, you need to examine the benefits that would accrue from such a significant move and investment. If your ambition is to present a linked aggregated view of all your resources to your users, a BIBFRAME replacement library system probably will not be flexible enough. 

#2: How to bridge the gap between the library world and the wider web?

One of the widely-touted benefits of BIBFRAME is the ability to share library data more openly across the web. In theory, other libraries, research institutions, and even the broader public could link to a library’s BIBFRAME data. For the library community, BIBFRAME offers a comprehensive linked data vocabulary that facilitates data sharing.

However, outside of the library world, the web at large, driven by the search engines, is largely adopting Schema.org as the preferred vocabulary for sharing data. Libraries have long been seen as silos, with their data mostly confined to standalone search interfaces and complex data formats such as MARC. 

BIBFRAME, while a step forward, doesn’t fully resolve this issue. Yes, it makes data more open and linked, but it still speaks primarily to the library community. If libraries want their data to enrich the wider web, they may need to also incorporate Schema.org alongside BIBFRAME to ensure comprehension and therefore visibility of their resources.

Dilemma: Should libraries focus exclusively on sharing data within the library and research community using BIBFRAME, or should they also aim to make their data more accessible to the general web audience by enriching their data with Schema.org terms?

My thought: Whatever specialist online discovery routes our users may take, they and we are also users of the wider web in general. To make best use of our resources we need our potential users to be guided to those resources. Guided from where they are, which is often not within a library interface or specialist site. To be visible beyond library focused sites, our resources need to be also described using the de facto vocabulary for the rest of the web – Schema.org.  
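
As a rough illustrative sketch (a hypothetical record, not output from any particular system): alongside its internal BIBFRAME description, a catalogue could emit a Schema.org JSON-LD description of the same item for search engines to consume. The short Go program below shows the shape of such a snippet; the Schema.org terms (Book, Person, name, author, datePublished) are real vocabulary, everything else is made up for the example.

// Sketch: publishing a Schema.org JSON-LD description of a catalogue item,
// alongside whatever BIBFRAME description the library keeps internally.
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	item := map[string]any{
		"@context":      "https://schema.org",
		"@type":         "Book",
		"name":          "A Room of One's Own",
		"author":        map[string]any{"@type": "Person", "name": "Virginia Woolf"},
		"datePublished": "1929",
	}
	out, _ := json.MarshalIndent(item, "", "  ")
	// The resulting JSON-LD would typically be embedded in the item's web page
	// in a <script type="application/ld+json"> element.
	fmt.Println(string(out))
}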

#3: The costs and challenges of transitioning to BIBFRAME

Transitioning to BIBFRAME can involve significant upheaval for a library, especially for those still reliant on MARC-based systems. Replacing these systems often comes with substantial costs, retraining efforts, and disruptions to daily operations.

Many libraries may question whether the perceived benefits of linked data and BIBFRAME—such as improved data sharing and discoverability—are worth the investment. For smaller institutions, the costs of a full-scale BIBFRAME implementation may seem prohibitive, especially when the advantages are not always immediately tangible.

Dilemma: Should libraries undertake a full-scale, costly transition to BIBFRAME and linked data, or is there a way to adopt linked data principles more gradually, without completely overhauling existing systems?

My thought: My many years working with libraries have taught me that any significant change in systems and/or practices often results in far greater investment in time, people, and money than was initially envisaged. Part of the reason for this is the integrated nature of traditional library systems. Swapping out one system for another, say to change cataloguing practices, will often result in changes to circulation and acquisition processes, for example. All this whilst the library needs to continue its business as usual. Equally, is retraining of staff a necessary first step to adopting linked data, or could/should it be a more evolutionary process?

My recent work, in partnership with metaphacts, for the National Library Board Singapore has demonstrated that it is possible to make significant beneficial moves into linked data, without replacing established systems and processes or disrupting business as usual. A route others may want to consider.

In addition to attending the BFWE conference, I had the privilege of delivering a presentation titled “Building a Semantic Knowledge Graph at National Library Board Singapore” [slides, video]. This project represents a two-year effort to develop and deliver a linked data management system based on both BIBFRAME and Schema.org, powered by metaphactory. What makes this initiative unique is that it integrates data from various systems across the library without requiring a complete systems replacement.

Conclusion

Since its launch 18 months ago, this system has continued to evolve, delivering linked data services back into the library. The approach has allowed the library to realise many of the benefits of linked data without the disruption of replacing its core systems. These benefits include cross-system entity aggregation & reconciliation, navigational widgets for non-linked systems, and an open linked data knowledge graph interface. Besides leveraging the benefits of linked data for library curators, the immense knowledge graph built across data sources united using Schema.org data modelling opens the opportunities of publishing rich cross-domain data to the general public. To learn more about our work with NLB, have a look at this metaphacts blog post.

For those grappling with any of the dilemmas I’ve outlined here or interested in exploring linked data further, feel free to reach out—I’d be happy to help facilitate a discussion.

(Note: This post is also featured as a guest post on the metaphacts blog)

A woman who made her mark on the map / John Mark Ockerbloom

Emma Willard had remarkable persistence. She founded the first higher education institution for women in America, and appealed tirelessly for its support in multiple states. She wrote textbooks for it that include groundbreaking work in history and graphic design.

Alma Lutz’s 1929 biography of Willard, joining the public domain in 57 days, is titled Emma Willard, Daughter of Democracy. May all American daughters and other children of democracy vote to defend it today.

“You know it too well already…” / John Mark Ockerbloom

“I listen to Mussolini’s gentle voice talking to me of friendship, while my ears still ring with the death threats…”

French Prix Goncourt laureate Maurice Bedel wrote in the 1920s and 30s of the appeal and threat of fascism, and the people seduced by it in Italy and Germany. Parts of his book Fascisme An VII appeared in English translation in the November 1929 Atlantic as “A Frenchman Looks at Fascism”. It joins the public domain in both Europe and America in 58 days.

SantaThing 2024: Bookish Secret Santa! / LibraryThing (Thingology)

It’s the most wonderful time of the year: the Eighteenth Annual SantaThing is here at last!

This year we’re continuing to focus on indie bookstores. You can still order Kindle ebooks, we have Kennys and Blackwell’s for international orders, and also stores local to Australia, New Zealand, and Ireland.
» SIGN UP FOR SANTATHING NOW!

What is SantaThing?

SantaThing is “Secret Santa” for LibraryThing members.

How it Works

You pay $15–$50 and pick your favorite bookseller. We match you with a participant, and you play Santa by selecting books for them. Another Santa does the same for you, in secret. LibraryThing does the ordering, and you get the joy of giving AND receiving books!

Sign up once or thrice, for yourself or someone else.

Even if you don’t want to be a Santa, you can help by suggesting books for others. Click on an existing SantaThing profile to leave a suggestion.

Every year, LibraryThing members give generously to each other through SantaThing. If you’d like to donate an entry, or want to participate, but it’s just not in the budget this year, be sure to check out our Donations Thread here, run once again by our fantastic volunteer coordinator, mellymel1713278.

Important Dates

Sign-ups close MONDAY, November 25th at 12pm EST. By the next day, we’ll notify you via profile comment who your Santee is, and you can start picking books.

You’ll then have a little more than a week to pick your books, until THURSDAY, December 5th at 12pm EST (17:00 GMT). As soon as the picking ends, the ordering begins, and we’ll get all the books out to you as soon as we can.

» Go sign up to become a Secret Santa now!

Supporting Indie Bookstores

To support indie bookstores we’re teaming up with independent bookstores from around the country to deliver your SantaThing picks, including BookPeople in Austin, TX, Longfellow Books in Portland, ME, and Powell’s Books in Portland, OR.

And to continue previous years’ success, we’re bringing back the following foreign retail partners: Readings for our Australian participants, Time Out Books for the Kiwi participants, and Kennys for our Irish friends.

And since Book Depository has closed, this year we’re offering international deliveries through Kennys and Blackwell’s.

Kindle options are available to all members, regardless of location. To receive Kindle ebooks, your Kindle must be registered on Amazon.com (not .co.uk, .ca, etc.). See more information about all the stores.

Shipping

Some of our booksellers are able to offer free shipping, and some are not. Depending on your bookseller of choice, you may receive $6 less in books, to cover shipping costs. You can find details about shipping costs and holiday ordering deadlines for each of our booksellers here on the SantaThing Help page.
» Go sign up now!

Questions? Comments?

This is our EIGHTEENTH year of SantaThing. See the SantaThing Help page for further details and FAQ.
Feel free to ask your questions over on this Talk topic, or you can contact Kate directly at kate@librarything.com.
Happy SantaThinging!

Open Knowledge Achieves US Charitable Organisations Equivalency Status / Open Knowledge Foundation

We’re thrilled to announce that the Open Knowledge Foundation (OKFN) has achieved NGOsource Equivalency Determination (ED) certification, formally establishing our recognition as equivalent to a US public charity. This status represents a major milestone for OKFN and opens new avenues for partnerships and support from US-based donors and foundations.

What is NGOsource Equivalency Determination (ED)?

The Equivalency Determination process, administered by NGOsource, evaluates nonprofit organisations outside the United States to confirm their operations are in accordance with the guidelines that US tax authorities require for public charitable organisations. By meeting NGOsource’s rigorous criteria, Open Knowledge demonstrates its commitment to transparency, accountability, and impact on a global scale. This designation means that foundations and individuals in the United States can now make tax-deductible grants and donations to OKFN with fewer restrictions, knowing that their contributions are directed toward a recognised, vetted nonprofit organisation.

What This Means for Our Work and Communities

As a certified organisation, OKFN can now access new grant opportunities and accept tax-deductible donations from US-based donors. This expanded support base will enable us to continue our work on a global scale, advancing open knowledge, promoting transparency, and advocating for and building accessible, digital tools that serve the public interest worldwide. With US charity recognition, OKFN is now even better positioned to partner with organisations, donors, and advocates who share our vision of a world open by design, where all knowledge is accessible to all.

Renata Avila, CEO of Open Knowledge, shared her appreciation for the recognition: “Receiving NGOsource’s Equivalency Determination isn’t just an acknowledgement of OKFN’s work; it’s a profound opportunity to expand our mission. With US charity recognition, we can continue to nurture and grow a network of leaders and communities in every region of the world. We will also continue to innovate legal, technical and accessibility tools for citizens and governments to unlock the potential of open knowledge, data and digital technologies that can be applied to their work and transform lives. This will lead to open knowledge, data and technologies that are open, participatory, accountable and sustainable for a better world and empowered communities everywhere.” 

This achievement is directly in line with OKFN’s vision and strengthens its ability to advocate and implement open knowledge initiatives worldwide. You can find this and other relevant information about OKFN’s institutional functions on our Governance page.

We’re excited about the opportunities this opens up and grateful for the continued support of our community. Thank you for being part of our journey towards a fair, sustainable and open future.

“He himself is so much bigger than his books” / John Mark Ockerbloom

It’s the last day of Diwali, the Hindu festival of lights that’s also celebrated by various other traditions in India, and in the Indian diaspora.

Among the Indian diaspora’s cultural ambassadors was Newbery medalist Dhan Gopal Mukerji. His 1929 books include Hindu Fables for Little Children, illustrated by Kurt Wiese, introducing tales he grew up with in India to a wide variety of readers. John Neihardt reviewed it when it came out. It goes public domain in 59 days.

Cut-ups and LLMs / Ed Summers

If language is a virus, what are LLMs?

I’ve had this kinda random notion about Large Language Models (LLM) and the Cut-up technique rumbling around in my brain for the past year. Unless you’ve been living in a cave I’m guessing you already know about LLMs. You probably already know about Cut-ups too, but just in case here is how Burroughs and Gysin describe this creativity tool (Burroughs & Gysin, 1982, p. 34):

Writing is fifty years behind painting. I propose to apply the painters’ techniques to writing; things as simple and immediate as collage or montage. Cut right through the pages of any book or newsprint… lengthwise, for example, and shuffle the columns of text. Put them together at hazard and read the newly constituted message. Do it for yourself. Use any system which suggests itself to you. Take your own words or the words said to be “the very own words” of anyone else living or dead. You’ll soon see that words don’t belong to anyone. Words have a vitality of their own and you or anybody else can make them gush into action.

p. 34 of The Third Mind
p. 35 of The Third Mind

Burroughs famously used this technique in his Nova Trilogy (and elsewhere) to mix together the works of other authors (Shakespeare, Rimbaud, Kerouac, Genet, Kafka, Eliot, Conrad, …). It has since been widely used as a creativity tool, apparently by musicians like David Bowie, Kurt Cobain and Thom Yorke. The purpose of this hack isn’t simply to come up with new ideas, but to dismantle discursive systems of control:

The Burroughs machine, systematic and repetitive, simultaneously disconnecting and reconnecting—it disconnects the concept of reality that has been imposed on us and then plugs normally dissociated zones into the same sector–eventually escapes from the control of its manipulator; it does so in that it makes it possible to lay down a foundation of an unlimited number of books that end by reproducing themselves. (Burroughs & Gysin, 1982, p. 17)

So what do LLMs and Cut-ups have to do with each other?

One superficial way of thinking about LLMs is as the cut-up machine, par excellence. LLMs are built by taking a massive amount of content from the Web, chopping it up into words (tokens), and then creating a neural network that represents the likelihood of one token following another. This allows new text to be generated word by word given an initial sequence, or prompt. Similar to the cut-up, it’s no longer possible to attribute LLM generated text to a particular author or authors. The very idea of authorship and attribution is completely dissolved in the model.

However, the big difference with LLMs is that they are optimized for predicting the next likely word, given an initial sequence of words. An LLM is ultimately a statistical representation of likely text. The Cut-up, on the other hand, is specifically designed to break the typical associations of words, but without totally obscuring where those words came from.
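
To make that contrast concrete, here is a toy sketch (a deliberately crude illustration, nothing like a real language model and not Burroughs' actual procedure): the cut-up shuffles fragments of its sources so they collide in new ways, while a tiny bigram chain only ever emits a word that already followed the current word somewhere in its training text.

// Toy contrast: a cut-up scrambles fragments of the source texts, while a
// tiny bigram chain only reproduces word orders it has already seen.
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// cutUp splits each text into comma-delimited fragments and shuffles them together.
func cutUp(texts []string) string {
	var fragments []string
	for _, t := range texts {
		fragments = append(fragments, strings.Split(t, ",")...)
	}
	rand.Shuffle(len(fragments), func(i, j int) {
		fragments[i], fragments[j] = fragments[j], fragments[i]
	})
	return strings.Join(fragments, ",")
}

// bigram emits up to n words, each chosen from the words that followed the
// current word in the input text.
func bigram(text, start string, n int) string {
	words := strings.Fields(text)
	next := map[string][]string{}
	for i := 0; i < len(words)-1; i++ {
		next[words[i]] = append(next[words[i]], words[i+1])
	}
	out := []string{start}
	w := start
	for i := 0; i < n; i++ {
		candidates := next[w]
		if len(candidates) == 0 {
			break
		}
		w = candidates[rand.Intn(len(candidates))]
		out = append(out, w)
	}
	return strings.Join(out, " ")
}

func main() {
	a := "words have a vitality of their own, and you or anybody else can make them gush into action"
	b := "language is a virus, from outer space"
	fmt.Println("cut-up:", cutUp([]string{a, b}))
	fmt.Println("bigram:", bigram(a+" "+b, "words", 12))
}

The cut-up output puts fragments of the two sources next to each other regardless of how likely the juxtaposition is; the bigram output can only wander along paths the corpus already contains.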

LLMs discipline communication, and routinize language in an attempt to simulate meaningful text. Cut-ups intentionally break word associations in order to reveal non-obvious, possibly absurd, latent meanings in given texts. Comparing LLMs to Cut-ups unmasks the LLM as a normalization tool for language control, and the Cut-up as a tool for wresting back control, for peeking inside the discursive machinery of language.

I was reminded of this today when I ran across a lovely short paper by Max Kreminski entitled Computational Poetry is Lost Poetry which he presented at the Halfway to the Future conference (open-access Proceedings) recently.

In this paper he draws a comparison between Found Poetry, where poetry is discovered in everyday use of language, and LLMs “whose central purpose is to arrange units of language, without fully understanding them, in combinations that can later be found to be poetry”. He calls this LLM generated text “Lost Poetry”. Of course not everyone using LLMs is trying to write poetry, or even think creatively, so this analogy doesn’t totally work for all LLM use cases. But he goes on to make some insightful observations about the flaws of generative AI:

I argue that machines are often usefully creative because they fail to see things completely as humans do: their oversights and inabilities lead them to mix human-like with non-human-like creative decisions in unanticipated ways, and thereby to supply human creators with ideas that they otherwise never would have considered. Somewhat counter-intuitively, then, I suggest that a dogged pursuit of perfect overlap between human and machine understanding of aesthetic domains may in fact inhibit the usefulness of machines as generators of unexpected inputs to the human creative ecosystem.

The flaws that we see in these generative systems are what make them useful, and the quest to build bigger and bigger models that better model “reality” is at cross purposes with their use in creative endeavors. It’s the glitches that provide value. He goes on to say:

… the design of novel computationally creative systems could be guided in part by a deliberate choice of what to make invisible to the machine. By selectively limiting the machine’s capacity to take certain facets of human aesthetic perception into account, we can produce different kinds of losers that can help to break us out of familiar patterns toward new techniques of expressive communication.

This is a provocative idea I think, that it’s the limitations that we build (intentionally or not) into computational systems that make them legible to us humans. These limitations help distinguish one tool from another. Just as Cut-ups engineer for the unexpected, and transgress predictable narrative structures, these LLM generated “losers” have more potential for creative thought because they are errors. Maybe this is one move too far, but there seem to be some parallels here to seamful design, where “strategic revelation of complexity, error, or backgrounded tasks” provide value instead of distraction (Inman & Ribes, 2019).

But perhaps Kreminski, as a chief scientist at a generative AI company, is trying very hard to find value in these statistical models that ultimately drive out and exploit creativity. They do this by disciplining language, normalizing our words to fit the types of words found on the World Wide Web at a particular point in time. I wish him well in his efforts to make these models smaller, more quirky, and more useful for actual artists–instead of larger and smoother for people who don’t want to employ human artists anymore.

I do wonder, what would it look like if LLMs worked more like Cut-ups, where we got unfamiliar juxtapositions and the sources weren’t completely obfuscated/concealed?

References

Burroughs, W. S., & Gysin, B. (1982). The third mind (First paperbound edition). New York: Seaver Books.
Inman, S., & Ribes, D. (2019). Beautiful Seams. In CHI Conference on Human Factors in Computing Systems proceedings. Glasgow, Scotland. https://doi.org/10.1145/3290605.3300508

A Room of One’s Own, for all / John Mark Ockerbloom

“A woman must have money and a room of her own if she is to write fiction.”

Virginia Woolf’s classic 1929 essay on feminism and creative work has inspired numerous analyses (like this one), adaptations (like this one), and projects (like this one).

Copyright is one way writers get money, but it often enriches publishers and estates more than it helps creators. We begin this year’s countdown anticipating Woolf’s A Room of One’s Own arriving in the US public domain in 60 days.

The remainder of the Roaring 20s about to join the public domain / John Mark Ockerbloom

Just two months from now, much of the world will celebrate another Public Domain Day, welcoming a year’s worth of works into the public domain. Many countries that have had life+70 years copyright terms for a while will get works by authors who died in 1954. Those fortunate enough to still have life+50 years terms will get works by authors who died in 1974. The rules in the United States are more complicated, but we’ll have nearly all our remaining copyrights from 1929 expire. That means that, for us, essentially all of the publication history of the “roaring 20s” will be public domain when the new year arrives.1 That’s a wide sweep of culture available for everyone to enjoy, share, build on, and reuse.

The Twenties encompass the start of national women’s suffrage, the rise of the Jazz Age and the Harlem Renaissance, and the dawn of “talking” motion pictures, and extend to the “Black Tuesday” stock market crash and the beginning of the Great Depression. The Twenties had political upheaval to match the cultural and economic upheaval, including civil war in Ireland and many other places around the world, the birth of fascism in Europe, and the revival and decline of the Ku Klux Klan as waves of anti-immigrant and racist sentiment washed over much of America. But the decade also saw widespread international efforts to try to end war generally among nations. While the 1928 pact that many nations signed on to has often been viewed as a failure for not preventing World War II, it set a precedent for later international cooperation and peacekeeping efforts that can be credited with more success.

As I have in past years, I’ll be featuring a Public Domain Day countdown in the days leading up to New Year’s Day 2025, each day featuring an interesting work that will be joining the public domain then. You can follow it on this blog, or using RSS readers or social media that can connect with this blog. That includes Mastodon and other “fediverse” sites that connect with Mastodon using the ActivityPub protocol. I’ll also boost or link to the daily posts from my Mastodon account. (Most of the posts will have 500 characters or fewer, the size of a typical Mastodon post; a few may be longer.) You might also be able to follow my boosts and links from Bluesky (since my account is hooked up to Bridgy Fed), as well as possibly from Threads if they’ve enabled following Mastodon accounts. (That was on their roadmap for 2024, but I don’t know if it’s working yet.) My posts will include the hashtag . I’ll be focusing on works joining the US public domain that are of interest to me, but you’re also welcome to post about works of interest to you joining the public domain where you are, and use the same hashtag if you like.

Right now for me, and for many others I’ve talked to, it’s hard to think much beyond next Tuesday. But I hope these posts help us anticipate some good things coming in the future, built on the knowledge and creativity of the past. May we all see and help bring about a better future in the days to come!

  1. The rules in the US are different for unpublished works, and for sound recordings that aren’t part of motion pictures. (I told you US copyright law was complicated.) But this January 1, along with publications from 1929, we will be welcoming sound recordings released in 1924 (which have a 100-year term) into the public domain, as well as many unpublished works by people who died in 1954. For lots more details and special cases, see Cornell University Library’s public domain table. ↩

November 2024 Early Reviewers Batch Is Live! / LibraryThing (Thingology)

Win free books from the November 2024 batch of Early Reviewer titles! We’ve got 209 books this month, and a grand total of 4,102 copies to give out. Which books are you hoping to snag this month? Come tell us on Talk.

If you haven’t already, sign up for Early Reviewers. If you’ve already signed up, please check your mailing/email address and make sure they’re correct.

» Request books here!

The deadline to request a copy is Monday, November 25th at 6PM EST.

Eligibility: Publishers do things country-by-country. This month we have publishers who can send books to the US, the UK, Canada, Australia, Ireland, Netherlands, New Zealand, Germany, Italy, Spain and more. Make sure to check the message on each book to see if it can be sent to your country.

A Place No Flowers GrowTunnel of Hope: Escape from the Novogrudok Forced Labor CampFarmhouse on the Edge of Town: Stories from a B&B in the Mountains of Western MaineWould You Rather? True Crime Edition: 1,000+ Thought-Provoking Questions and Conversation Starters on Serial Killers, Mysteries, Crimes, Supernatural Activity and MoreHeart of the GlenSerial BurnThe Indigo HeiressA Furnace SealedEmotional Confidence: 3 Simple Steps to Manage Emotions with Science and ScriptureMade to Be She: Reclaiming God's Plan for Fearless FemininityBring Back Your People: Ten Ways Regular Folks Can Put a Dent in White Christian NationalismIllusory Dwellings: Aesthetic Meditations in KyotoHere Goes NothingFrom the Ground Up: The Women Revolutionizing Regenerative AgricultureDivision Street: AmericaConfidentialWhat If We Were All Friends!I Am Wind: An AutobiographyWicked KingSingle PlayerA Death in DiamondsMother's First Aid: Mother's Guide from Birth to FourI Refused to Be a War BrideHard FoodMarry a Mensch: Timeless Jewish Wisdom for Today's Single WomanA Fateful EncounterThe Language of MothersAlterationsSoviet Jewry Reborn: A Personal JourneyAcademy of Unholy BoysAccess All AreasCaribbean HolidayThe WeirdotsA Simple Guide to Staying Healthy & Living LongerThe Bugs1950s Nostalgia Activity Book for Seniors: 50 Retro Themed Word Search Puzzles with Illustrated Fun Facts and Trivia for a Fun Walk Down Memory LaneNostalgic Trivia for Seniors: Relive Your Favorite Memories of 5 Decades of Americana (1950s-1990s) with 500 Multiple-Choice Questions and Illustrated ThemesEnemies of the StateHunger In The StonesThe Mall WalkersOur Comeback Tour is Slaying MonstersObsidian PrinceBewitching RosemaryHeart Games 3: A Christian Romance (Christmas) Puzzling ExperienceNew Prize for These Eyes: The Rise of America's Second Civil Rights MovementSomewhere Toward Freedom: Sherman's March and the Story of America's Largest EmancipationBearslayerDeath Of A Spy?Love is the Answer, No Matter what the QuestionHow to Chop Tops: A Pictorial Guide to Hot Rodding's Most Popular ModificationTime BeforeFehuUnderstanding Adolescence for Girls: A Body-Positive Guide to PubertyUnderstanding Adolescence for Boys: A Body-Positive Guide to PubertyAmorphous: Breaking the MoldThe Old Secret at Hotel OregonHotel ImpalaAntidote: A New Emotional Wellth Framework™ to Build ResiliencePractical Money Skills for Teens: Personal Finance Simplified, with a Quickstart Guide to Budgeting, Saving, and Investing for a Stress-Free Transition to Financial Independence365 Inspirational & Motivational Quotes to Live By: Daily Wisdom to Inspire Personal Growth, Resilience, Positivity, and MindfulnessLandscapes & Landmarks Coloring Book for Adults: Scenic Beauty and Iconic Places from All 50 States of America for Mindful Relaxation and Stress ReliefLeadership Bites: An Approachable Handbook for Emerging LeadersConspiracy of CatsFive Minutes from a MeltdownBefore the King: Joanna's StoryWhen Stars Light the SkyDiversity, Equity, and Inclusion Essentials You Always Wanted to KnowPython Essentials You Always Wanted to KnowBlockchain Essentials You Always Wanted to KnowDigital Shock: Seven Shocks That Are Shaping the FutureLove by the BookSarah: Discovering LoveThe Matrix of the MindBeing a Woman Over Forty: The 40 Things You Should Know by NowA Choir of WhispersConductoid - Scars of the DominatayEvolving Through Life Transitions: A Coping Strategies Guide, Resource, and Workbook Designed for Individuals, Therapists, and Their ClientsCursed EarthAllies, Arson, and 
Prepping for the ApocalypseNo One Will Save UsTrance Formation: My Hero's Journey of How I Turn Life's Greatest Challenges into Life's Greatest Gifts. A Spiritual Awakening Real Life StoryDentro di Te: Un Viaggio Illustrato di Mindfulness per Bambini CuriosiConversation with XenexThe God FrequencyBeyond the Dismal Veil: Five Short Horror Romance StoriesPaper FaceDiez recordatorios para la mujer Cristiana solteraSpared: A Memoir of Risk and ResolveTeslamancerAnny in LoveRooted and RememberedHouse of SecretsThe Lightning SeedThe Twisted Tree Dig3 Strikes: Finding Love in Forbidden PlacesThe Focused Faith : Detox Your Digital Life Reclaim Hijacked Attention Build Habits for Focus & JoyStrawberry GoldBackupThe Happy Hunting Ground of All MindsFood Freedom: Empowerment Manual for Liberation Through FoodMeantime in GreenwichThe Inheritance of Amaya MontgomeryPlea to a Frozen GodOne Night BoyfriendA Fate Far Sweeter: Passion & Peril In UkraineMien: CurrentsHealing Your Innocent Inner Child: Your Workbook to Overcoming Past Trauma, Regaining Emotional Stability and Practicing Self-Compassion With 20 Scientific-Backed Practical ExercisesThe Songs of MagicLight LockedDragon FlameAliens Versus FootballEffortless Monthly Bills Checklist: Stop Stressing and Achieve Financial Clarity in Just Minutes a Month with this Easy to Use 4-Year Workbook for Tracking your Bills!Monthly Bill Payment Checklist: Take Control Of Your Finances Today!Poseidon's Progress: The Quest to Improve Life at SeaLeave No Trace: FestiFellThe QuietThe Unexpected GuestsHolding On To Her Identity: Losing My Wife To Alzheimer'sLouisa Sophia and a Legion of SistersMy Thanksgiving Coloring Book for Kids: Ages 4-6Be the Weight Behind the SpearThe Fixer: The Good Criminal: Part OneA Madness UnmadeGrandma Mcbee, How Slow Can She Be?Prompted: Synaesthete and Other StoriesIdentity Crisis (A Lawyer's Tale): How Divorce Nearly Ended My LifeMilk Before MeatWife No. 
56One Year Without Sugar: Unlocking the Secrets to Weight LossThe Last Nuclear WarGitel's FreedomShort StoriesThe Street Illuminati: DragonShort StoriesDelusions of ChenilleMutated Files: Case OneThe Trench of the DeadAbout the BoyThe Trillion Dollar CowBetween Taste and SoundThe Curse of the Smoky Mountain TreasureLa Vaca Cuesta DemasiadoWhen the Roman Bough Breaks: How a History of Violence and Scandal Shaped the Roman Church, and Hope for Catholics in the GospelThe Misfits: Tails of AlienationsHow the HeussKid Moved the Mole-Lid!Devil's DefenseNew American CaféOld White Man WritingThe King of Myths: Gods and Legends from Every CultureThis Thing is Starving3 by 3: Self Help Discovery and Inner Growth BookThe Keeper's SecretA Hot Chocolate for TwoBlades of ObsessionAn Author's Guide to MonstersPreterism From The BeginningThe Vanishing Heiress: The Unsolved Disappearance of Dorothy ArnoldThe Silent Witness: The Unsolved Murder of Mary Rogers: A Scandal That Shook New York and Inspired Edgar Allan PoeAnywhenWhispers from the Murder Farm: The Case of Belle Gunness: Inside the Mind of America’s Darkest Femme FataleThe Emotional Intelligence Advantage: Transform Your Life, Relationships, and CareerCurse of the Maestro and Other StoriesA Certain Slant of LightPresented with LoveLet the Purring Begin: Sapphire's TaleThe Ultimate Guide to Rapport: How to Enhance Your Communications and Relationships with Anyone, Anytime, AnywhereTangled up in MurderThe Starlight ContingencyFree Life Revolution: Zero-Cost Hacks to Transform Your Body & Mind: Book OneDelitti Fuori OrarioNo One Will Save UsThe Little Hedgehog's Second ChristmasUnboundBlair CountyA Legend of the SailorsThe Displacement Dilemma: Navigating the Survival of Human Expertise in an AI-Driven WorldVivid Visions: Tales Woven from the Threads of Diverse ImaginationsWould You Rather? True Crime Edition: 1,000+ Thought-Provoking Questions and Conversation Starters on Serial Killers, Mysteries, Crimes, Supernatural Activity and MoreThe Business Rescue Casebook IIIThe BreakdownThe Life and Times of Sherlock Holmes: Essays on Victorian England, Volume FiveEchoes of the TombSuper Psyched: Unleash the Power of the 4 Types of Connection and Live the Life You LoveFinding KIND: Discovering Hope and Purpose While Loving Kids with Invisible Neurological DifferencesSatan, Get Out in the Name of Jesus: An Autobiographical Account of a Personal Struggle Against Demonic Forces of DarknessSide Quest: StoriesWillful Wanderer: A MemoirThe Fields of Britannia: The Darkness Before the DawnFree Life Revolution: Zero-Cost Hacks to Transform Your Habits & Horizons: Book TwoShallow DepthsThe Intentional Leader: A Guide To Elevate Your Residential Service BusinessDark ArteriesPoinsettia LaneThe ProjectionistA Change in Destiny: Dark SuspicionsTales of a Toy Soldier: The CorpseVesselFelones de Se: Poems about SuicideNoir Dirt Cheap: Film Noir in the Public Domain, Volume 1Queen of TradesShadows Under a Dipping SunDavid and the Lost NookThe Evolution of Nora O'Brien PachecoShort Stories from Faraway PlacesBeyond Beliefs: The Incredible True Story of a German Refugee, an Indian Migrant and the Families Left BehindLosing ItA Blessed FallPick Me!The Secret of the Mind-Garden

Thanks to all the publishers participating this month!

Alcove Press Arctis Books USA Baker Books
Before Someday Publishing Bethany House Broadleaf Books
CarTech Books Census Press City Owl Press
Crooked Lane Books Entrada Publishing eSpec Books
Gefen Publishing House IngramSpark Islandport Press
Lerner Publishing Group The New Press Prosper Press
PublishNation Purple Diamond Press, Inc Purple Moon Publishing
Revell Riverfolk Books Rootstock Publishing
Running Wild Press, LLC Simon & Schuster Somewhat Grumpy Press
Stone Bridge Press Tundra Books Twisted Road Publications
Unsolicited Press Vibrant Publishers Wise Media Group
Yorkshire Publishing

The digital objects of the SBN catalogue / Raffaele Messuti

I recently discovered that the SBN catalogue APIs are now available, although I don't know how long ago they were released. It's a topic I took an interest in in the past, more out of personal curiosity than professional need, since I strongly believe in the value of open data and metadata in the cultural heritage sector. Years ago I had found some unofficial APIs used by the catalogue's mobile applications; they still work today, albeit with limited functionality. Those APIs continue to attract interest from researchers and developers who contact me for further details which, unfortunately, I'm not able to provide.

What follows is my analysis of these new official SBN catalogue APIs, and the way I used them for a specific case study: obtaining the list of documents for which a digital resource is available. The entire SBN catalogue holds 20+ million documents. The subset I'm interested in, the digitised documents, is just under a million (938,000+). Getting the list of documents that have a digital object seemed like a good experiment for exploring the catalogue at random and discovering some relevant content (serendipity!).

I came across a few peculiarities in the data modelling, and the lack of detailed, complete documentation meant I had to proceed by trial and error and intuition. I don't mean to criticise or belittle the work done by ICCU; on the contrary, I think it's an important achievement, and I hope that more public discussion of these data tools and interfaces can help improve them and encourage their use.

I should point out, though, that recently I've come to a different, less orthodox view of how cultural heritage data should be distributed: I wrote about it in Beyond HTTP APIs: the case for database dumps in Cultural Heritage, arguing that we should prefer complete, self-contained, ready-to-use exports over APIs.

Quickstart for using the APIs

The APIs are reachable from this portal: https://api.iccu.sbn.it/devportal/apis. Access is not public and anonymous: to use them you need to register an account and then create OAuth2 keys, which are used to generate a token to include in every call.

The software product used here is WSO2 API Manager and, from what I could tell, it directly exposes Solr APIs (read-only, of course). There are several APIs, divided by service and presented graphically as a sort of periodic table. It is not immediately clear what they refer to, and the terminology is aimed at people who already know the ecosystem of SBN services. I have no idea what CA (Cataloghi Storici) or IC (ICFE Services) are, and I guessed that AB refers to the Anagrafe Biblioteche. What I'm interested in, though, is SB, SBN Integrato.

ICCU API Portal

Each API obviously has its own calls and response types. Ready-made SDKs are available in Java and JavaScript. For my work I preferred to start writing a library in Go: you can find it here: https://github.com/atomotic/iccu. It is not a complete SDK, it's still a very spartan module, and I may complete it over time.

Harvesting the documents with a digital object

To explore the catalogue I immediately abandoned the idea of interpreting the API responses in real time: I decided to save all the data locally and parse it later. I saved the responses in an extremely simple SQLite database: a doc field of type json holding the raw JSON returned by the API, and a bid column populated automatically from the UNIMARC 003 field:

CREATE TABLE sbn (
        bid TEXT GENERATED ALWAYS AS (json_extract(doc, '$.unimarc.fields[1].003')) VIRTUAL,
        doc json
);

CREATE INDEX bid_idx on sbn(bid);

The API call is the following; the relevant arguments are presenza_digitale=Y and detail=full (otherwise you get a minimal object that does not include the full UNIMARC).

GET https://api.iccu.sbn.it/sbn/1.0.0/search
	format=json
	detail=full
	page-size=500
	presenza_digitale=Y

Here is a complete example of the response for a single record, RAV0302299 (http://id.sbn.it/bid/RAV0302299).
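
For illustration, here is a minimal Go sketch of a single call to the search endpoint. It is not the sbn-metadata-fetch code: the Authorization: Bearer header, the ICCU_TOKEN environment variable and the User-Agent string are my assumptions, while the query parameters are the ones shown above.

// Minimal sketch of one search call; real harvesting needs paging and error
// handling, and the raw JSON body would be stored in the SQLite doc column.
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
)

func main() {
	token := os.Getenv("ICCU_TOKEN") // OAuth2 token generated from the dev portal keys

	q := url.Values{}
	q.Set("format", "json")
	q.Set("detail", "full")
	q.Set("page-size", "500")
	q.Set("presenza_digitale", "Y")

	req, err := http.NewRequest("GET", "https://api.iccu.sbn.it/sbn/1.0.0/search?"+q.Encode(), nil)
	if err != nil {
		panic(err)
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("User-Agent", "sbn-digital-harvest-example") // always identify yourself

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Status, len(body), "bytes")
}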

I used a fairly large page size, 500 documents per response. Increasing the number of documents returned reduces the number of HTTP calls and can speed up the harvest enormously; the problem is that some documents contain encoding errors and the JSON returned is not valid. When you hit one of those, you lose the content of the documents in that page window: it happened in my analysis too, and I neither investigated further nor tried to implement more robust parsing. I lost a few thousand documents, which is an acceptable margin of error.

The result is a database of 936,500 rows, weighing 4.7 GB. I won't distribute this database publicly (the licence terms for this data are not clear to me), but if anyone is interested I'll share it.

As with scraping, the usual rules of good conduct also apply when using APIs: limit the aggressiveness and rate of your calls, and always identify yourself in the User-Agent header of your HTTP calls (even though these APIs require a token, so I assume the origin of every activity is always traceable).

The code used for the harvest is available here: https://github.com/atomotic/iccu/cmd/sbn-metadata-fetch

Analysing and exploring the metadata

I naively thought that SQL queries over the JSON field in the SQLite database would be enough to explore these data; unfortunately the lack of a schema and the way some data are modelled make it hard to do everything in SQL, so I had to write methods on my Go type implementing some logic over these data.
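
As an example of the kind of helper I mean, here is a sketch (not the actual code of the library) of a function that normalizes a link value which sometimes packs several URLs into a single string separated by " | ", one of the quirks discussed further below; the URLs are made up:

package main

import (
	"fmt"
	"strings"
)

// splitLinks normalizes a link value: some records pack several URLs into one
// string separated by " | ", so split on the pipe, trim whitespace and drop
// empty entries.
func splitLinks(raw string) []string {
	var out []string
	for _, part := range strings.Split(raw, "|") {
		if s := strings.TrimSpace(part); s != "" {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	fmt.Println(splitLinks("http://example.org/a | http://example.org/b"))
	// output: [http://example.org/a http://example.org/b]
}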

I am not interested in ALL the available metadata, only in a reduced subset: what I need are the links to the digitized objects more than the metadata themselves. From the transformation of the source metadata I wanted to obtain simplified objects like the following (data such as authors, etc. are deliberately missing).

{
  "bid": "IT\\ICCU\\VIAE\\007373",
  "id": "http://id.sbn.it/bid/VIAE007373",
  "idmanus": "",
  "title": "Risposta apologetica, e critica alle osservazioni, ed alla lettera del molto reverendo padre Cantova della Compagnia di Gesu, stampate in Milano l'anno 1752. Contro a chi ha ultimamente difesa la necessita dell'amor di Dio nel sagramento della penitenza",
  "iiif": [
    "https://jmms.iccu.sbn.it/jmms/metadata/UW01alpnX18_/b2FpOmJuY2YuZmlyZW56ZS5zYm4uaXQ6MjE6RkkwMDk4Ok1hZ2xpYWJlY2hpOlZJQUUwMDczNzM_/manifest.json"
  ],
  "link": [
    "http://books.google.com/books?vid=IBSC:SC000005684",
    "http://books.google.com/books?vid=IBSC:SC000008356",
    "http://teca.bncf.firenze.sbn.it/ImageViewer/servlet/ImageViewer?idr=BNCF0003334533"
  ],
  "type": "Testo",
  "material": [
    "Libro antico"
  ],
  "thumbnails": [
    "https://jmms.iccu.sbn.it/jmms/resource/ad/first/UW01alpnX18_/b2FpOmJuY2YuZmlyZW56ZS5zYm4uaXQ6MjE6RkkwMDk4Ok1hZ2xpYWJlY2hpOlZJQUUwMDczNzM_"
  ],
  "start_date": 1753,
  "end_date": 1753
}

The script https://github.com/atomotic/iccu/cmd/sbn-metadata-transform extracts the data from the SQLite db and generates a file in JSON Lines format (~500 MB). This export is then ready to be loaded into other tools better suited to data analysis, such as Solr or DuckDB.

I preferred to use DuckDB, and this is how I loaded the data:

~ duckdb sbn.duckdb "create table digital as select * from read_json_auto('sbn.jsonl');"
~ duckdb sbn.duckdb

D .schema
CREATE TABLE digital(bid VARCHAR, id VARCHAR, idmanus VARCHAR, title VARCHAR, iiif VARCHAR[], link VARCHAR[], "type" VARCHAR, material VARCHAR[], thumbnails VARCHAR[], start_date BIGINT, end_date BIGINT);

I exported the DuckDB database to Parquet format; it can be downloaded from https://atomotic.github.io/data/sbn.digital.parquet (93 MB).

The Parquet file can be used directly in the DuckDB shell in the browser, without installing anything. You just need to create a table (example):

CREATE TABLE digital AS FROM 'https://atomotic.github.io/data/sbn.digital.parquet';

A few demonstration queries:

Number of documents grouped by type

D SELECT
    type,
    COUNT(*) AS count
  FROM digital
  GROUP BY type order by count desc;
┌───────────────────────────────────┬────────┐
│               type                │ count  │
│              varchar              │ int64  │
├───────────────────────────────────┼────────┤
│ Testo                             │ 506962 │
│ Registrazione sonora musicale     │ 310053 │
│ Risorsa grafica                   │  53829 │
│ Musica manoscritta                │  20721 │
│ Testo manoscritto                 │  19221 │
│ Musica a stampa                   │  11565 │
│ Registrazione sonora non musicale │   7180 │
│ Risorsa cartografica a stampa     │   4483 │
│ Risorsa elettronica               │   1965 │
│ Risorsa cartografica manoscritta  │    406 │
│ Risorsa da proiettare o video     │     72 │
│ Oggetto tridimensionale           │     29 │
│ Risorsa multimediale              │     14 │
├───────────────────────────────────┴────────┤
│ 13 rows                          2 columns │
└────────────────────────────────────────────┘

Number of distinct IIIF manifests and external links

D SELECT COUNT(*) as manifest
    FROM (
        SELECT DISTINCT unnest(iiif)
        FROM digital
    );
┌──────────┐
│ manifest │
│  int64   │
├──────────┤
│   341324 │
└──────────┘
D SELECT COUNT(*) as link
    FROM (
        SELECT DISTINCT unnest(link)
        FROM digital
    );
┌─────────┐
│  link   │
│  int64  │
├─────────┤
│ 1045225 │
└─────────┘

For the external links I wanted to extract the server host and then group them, in order to identify where they come from. I used trurl to parse the URLs; it also flagged several parsing errors, which I ignored as marginal:

~ duckdb --list sbn.duckdb "SELECT DISTINCT TRIM(unnest(link)) AS unique_links FROM digital;" \
    | trurl -f - --get "{host}" --accept-space > urls.txt

The urls.txt file contains the unsorted list of hosts. sort, uniq and wc would be enough to do the counting, but topfew (by none other than Tim Bray!) is much more efficient. Google Books, the Istituto Centrale per i Beni Sonori, and the BNCF's Teca are the predominant sources.

~ topfew -n 30 urls.txt

363190 books.google.com
312041 opac2.icbsa.it
134072 teca.bncf.firenze.sbn.it
58043 www.internetculturale.it
46714 books.google.it
12614 www.braidense.it
8558 www.bibliotecamusica.it
6290 www.widejef.com
6091 www.bdl.servizirl.it
5020 archive.org
4284 www.14-18.it
4276 corago.unibo.it
3772 www.google.it
3574 sbn.comune.eboli.sa.it
3562 www.cmarchiviodigitale.com
3177 digiteca.bsmc.it
3103 www.polodigitalenapoli.it
2602 www.aggiornamentisociali.it
2330 hdl.handle.net
2304 www.proquest.com
2280 atena.beic.it
1879 www.fondazionecircoloartistico.it
1698 badigit.comune.bologna.it
1546 doi.org
1431 digital.fondazionecarisbo.it
1431 5.175.50.107
1311 www.omeka.unito.it
1274 www.byterfly.eu
1196 www.repubblicaromana-1849.it
1164 turismo.comune.sanginesio.mc.it

Among the hosts there are some bizarre things: many bare IP addresses and also several files linked from Google Drive (linking objects in a catalogue from a file storage service seems like a terrible idea to me).

~ grep drive.google urls.txt | wc -l
467

Even worse, there are also several links to Facebook. At the same time I am surprised that there are no links to Wikisource or Wikimedia Commons (but I plan to investigate further).

Issues encountered

The problems I ran into are not technical problems with the APIs, but concern the modelling of the metadata:

  1. The structure is not uniform. There is a unimarc object that is a JSON representation of the UNIMARC XML (not very convenient to parse, but fine as it is), while a number of accessory fields live outside that object (for example the IIIF manifests), along with other data that duplicate information already contained in the UNIMARC. I suspect these data are there to make access easier. I think it is normal, in any case, for a long-lived database like SBN to have to add accessory fields as the need arises.

  2. Some values are not complete: for example the IIIF manifests only carry the path, and the host is always missing. With some heuristics I managed to reconstruct it, but it would be better if values were always complete. In other cases I noticed that some fields contain multiple values joined with a separator character: this is the case for the external links, which are sometimes separated by " | ".

  3. Location of the digital object. As far as I understand there can be two kinds: IIIF manifests, which are also displayed with a viewer directly in the web catalogue, or links to external pages (and a record can have both manifests and links). The manifests are given in fields at the top level of the object: there are dig_cover, dig_manifest, dig_preview and dig_preview_URL, and the redundancy among them is not always clear to me. The external links, on the other hand, are given in the unimarc object, in 899.u or other fields.

  4. Some vocabularies use single letters (for example in the type and material fields). These vocabularies are poorly documented; in such cases it would be better to use a (resolvable!) URI pointing to a documentation page (a possible lookup table is sketched after this list). Example:

    One-character code for the document type: a=Testo b=Testo manoscritto c=Musica a stampa d=Musica manoscritta e=Risorsa cartografica a stampa f=Risorsa cartografica manoscritta g=Risorsa da proiettare o video i=Registrazione sonora non musicale j=Registrazione sonora musicale k=Risorsa grafica l=Risorsa elettronica r=Oggetto tridimensionale m=Risorsa multimediale
    ---
    One-character code for the material type: v=Audiovisivi c=Cartografia g=Grafica A=Libro antico N=Libro moderno M=Musica
    
  5. There is no schema: this is the biggest problem. I had to proceed by trial and error and heuristics to parse those responses, and I am sure I have not identified all the possible cases or error conditions. Metadata absolutely need schemas, against which validation and constraints can be enforced. Several possible technologies exist, of varying complexity: JSONSchema, Avro, Protobuf. I think a good JSONSchema would be enough to start with. There are also some newer things such as PKL or CUE, so far never used for metadata serialization, which I find interesting and which the digital libraries world could start to evaluate.
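
To make point 4 concrete, the document-type codes listed above can be expressed as a simple lookup table in Go. This is a sketch of exactly the kind of mapping that currently has to be reverse-engineered from the data, and that a resolvable vocabulary URI or a published schema would make unnecessary:

package main

import "fmt"

// documentType maps the one-character document-type codes listed above to
// their labels, as reverse-engineered from the data; labels are kept in the
// original Italian because those are the values that appear in the catalogue.
var documentType = map[string]string{
	"a": "Testo",
	"b": "Testo manoscritto",
	"c": "Musica a stampa",
	"d": "Musica manoscritta",
	"e": "Risorsa cartografica a stampa",
	"f": "Risorsa cartografica manoscritta",
	"g": "Risorsa da proiettare o video",
	"i": "Registrazione sonora non musicale",
	"j": "Registrazione sonora musicale",
	"k": "Risorsa grafica",
	"l": "Risorsa elettronica",
	"r": "Oggetto tridimensionale",
	"m": "Risorsa multimediale",
}

func main() {
	fmt.Println(documentType["j"]) // Registrazione sonora musicale
}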

Conclusions

Data modelling problems aside, the technological infrastructure of this API product seems to me to work very well. I would love to know whether there are usage statistics or real examples of integration into external catalogues or portals. I also think that the Wikidata world, where several integrations with the SBN catalogue already exist, could benefit from these APIs and make several existing processes faster and more automatic.

DLF Digest: November 2024 / Digital Library Federation

A monthly round-up of news, upcoming working group meetings and events, and CLIR program updates from the Digital Library Federation. See all past Digests here

 

Hello! We hope those who attended the DLF Virtual Forum enjoyed the panels. November brings Election Day, Thanksgiving, and a round of working group meetings. 

— Team DLF

This month’s news:

  • Now available from the Digital Library Pedagogy Group: #DLFteach Toolkit Volume 4: Critical Digital Literacies, an open-access resource designed to support both information professionals and educators.
  • Now available from the Cultural Assessment Working Group: The Inclusive Metadata Toolkit serves as a centralized guide to the range of inclusive metadata tools and resources currently available to equip practitioners to implement inclusive metadata practices in their day-to-day work.
  • Register: 2024 IIIF Online Meeting, November 12-14.
  • Climate Action Webinar #3: Combatting Climate Anxiety Through Data: December 5, 2024, 3:30 pm – 5:00 pm ET. Learn how curating scientific data orients GLAMR institutions in the public conversation and can help combat climate anxiety through action.
  • Closures: CLIR and DLF offices will be closed for Thanksgiving 11/25 – 11/29. 

 

This month’s open DLF group meetings:

For the most up-to-date schedule of DLF group meetings and events (plus NDSA meetings, conferences, and more), bookmark the DLF Community Calendar. Meeting dates are subject to change. Can’t find meeting call-in information? Email us at info@diglib.org. Reminder: Team DLF working days are Monday through Thursday.

  • DLF Born-Digital Access Working Group (BDAWG) Monthly Meeting: Tuesday, 11/05, 2 pm ET/11 am PT
  • DLF Digital Accessibility Working Group: Wednesday, 11/06, 2 pm ET/11 am PT
  • DLF AIG Metadata Working Group: Thursday, 11/07, 1:15 pm ET/10:15 am PT
  • DLF AIG Cultural Assessment Working Group: Monday, 11/11, 2 pm ET/11 am PT
  • DLF AIG Cost Assessment Working Group: Monday, 11/11, 3 pm ET/12:00 pm PT
  • DLF AIG User Experience Working Group: Friday, 11/15, 11 am ET/8 am PT
  • DLF Committee for Equity & Inclusion: Monday, 11/18, 3 pm ET/12:00 pm PT
  • DLF Digital Accessibility Working Group – IT Subgroup (DAWG-IT): Monday, 11/25, 1:15 pm ET/10:15 am PT
  • DLF Climate Justice Working Group: Wednesday, 11/27, 12:00 pm ET/ 9 am PT
  • DLF Digital Accessibility Policy & Workflows Subgroup: Friday, 11/29, 1:00 pm ET/10 am PT

DLF groups are open to ALL, regardless of whether or not you’re affiliated with a DLF member organization. Learn more about our working groups on our website. Interested in scheduling an upcoming working group call or reviving a past group? Check out the DLF Organizer’s Toolkit. As always, feel free to get in touch at info@diglib.org

 

Get Involved / Connect with Us

Below are some ways to stay connected with us and the digital library community: 

Contact us at info@diglib.org.

The post DLF Digest: November 2024 appeared first on DLF.

1.5C Here We Come / David Rosenthal

John Timmer's With four more years like 2023, carbon emissions will blow past 1.5° limit is based on the United Nations Environment Programme's report Emissions Gap Report 2024. The "emissions gap" is:
the difference between where we're heading and where we'd need to be to achieve the goals set out in the Paris Agreement. It makes for some pretty grim reading. Given last year's greenhouse gas emissions, we can afford fewer than four similar years before we would exceed the total emissions compatible with limiting the planet's warming to 1.5° C above pre-industrial conditions.
...
The report ascribes this situation to two distinct emissions gaps: between the goals of the Paris Agreement and what countries have pledged to do and between their pledges and the policies they've actually put in place.
Back in 2021 in my TTI/Vanguard talk I examined one of these gaps, the one between the crypto-bros' energy consumption:
The leading source for estimating Bitcoin's electricity consumption is the Cambridge Bitcoin Energy Consumption Index, whose current central estimate is 117TWh/year.

Adjusting Christian Stoll et al's 2018 estimate of Bitcoin's carbon footprint to the current CBECI estimate gives a range of about 50.4 to 125.7 MtCO2/yr for Bitcoin's opex emissions, or between Portugal and Myanmar.
and their rhetoric:
Cryptocurrencies assume that society is committed to this waste of energy and hardware forever. Their response is frantic greenwashing, such as claiming that because Bitcoin mining allows an obsolete, uncompetitive coal-burning plant near St. Louis to continue burning coal it is somehow good for the environment.

But, they argue, mining can use renewable energy. First, at present it doesn't. For example, Luxxfolio implemented their commitment to 100% renewable energy by buying 15 megawatts of coal-fired power from the Navajo Nation!.

Second, even if it were true that cryptocurrencies ran on renewable power, the idea that it is OK for speculation to waste vast amounts of renewable power assumes that doing so doesn't compete with more socially valuable uses for renewables, or indeed for power in general.
Note that the current CBECI estimate shows that Bitcoin's energy consumption has increased 43% since 2021, a 12.7%/yr increase.

Follow me below the fold for more details of the frantic greenwashing, not just from the crypto-bros but from the giants of the tech industry that aims to ensure that:
Following existing policies out to the turn of the century would leave us facing over 3° C of warming.
Luxxfolio wasn't an exception. The latest example of Bitcoin greenwashing comes from Hunterbrook Media:
  • TeraWulf Inc. (NASDAQ: $WULF) brands itself as a “zero-carbon Bitcoin miner” — and claims its commitment to renewable energy will help it land AI data center contracts. But the New York Power Authority, which supplies 45% of the facility’s energy, told Hunterbrook Media: “None of the power that NYPA provides the firm can be claimed as renewable power.”
  • The rest of TeraWulf’s power is sourced from the New York grid, which is less than half zero-carbon, according to the New York Independent System Operator, the organization responsible for managing the state’s wholesale electric marketplace.
  • The only way TeraWulf can legally substantiate its zero-carbon claims is by purchasing renewable energy credits (RECs), according to New York and federal regulators, but a TeraWulf spokesperson confirmed that the company has not done so. “Without the REC, there is no legal claim to the renewable attributes of electricity,” a spokesperson for the New York State Energy Research and Development Authority confirmed in an email to Hunterbrook.
These lies were just the start, Hunterbrook documents lies about most aspects of their business. Note TeraWulf's pivot to AI. In Bitcoin Miners Take Divergent Paths Six Months After Revenue ‘Halving’, David Pan explains that TeraWulf is part of a trend:
Six months after rewards for validating transactions on the Bitcoin network were reduced by half, crypto mining companies are choosing between two divergent paths to remain viable.

Public miners including MARA Holdings, Riot Platforms and CleanSpark are keeping the Bitcoin they produce with the expectation that the digital asset will rise in value. At the same time, an increasing number of companies are spending more on developing data centers that power artificial intelligence applications.
It isn't just the crypto-bros who are apparently lying about using renewables. Back in July Adele Peters revealed that Amazon says it hit a goal of 100% clean power. Employees say it’s more like 22%:
Today, Amazon announced that it hit its 100% renewable electricity goal seven years early. But a group of Amazon employees argues that the company’s math is misleading.

A report from the group, Amazon Employees for Climate Justice, argues that only 22% of the company’s data centers in the U.S. actually run on clean power. The employees looked at where each data center was located and the mix of power on the regional grids—how much was coming from coal, gas, or oil versus solar or wind.

Amazon, like many other companies, buys renewable energy credits (RECs) for a certain amount of clean power that’s produced by a solar plant or wind farm. In theory, RECs are supposed to push new renewable energy to get built. In reality, that doesn’t always happen. The employee research found that 68% of Amazon’s RECs are unbundled, meaning that they didn’t fund new renewable infrastructure, but gave credit for renewables that already existed or were already going to be built.
And in August Amy Castor and David Gerard posted How to fix AI’s ghastly power consumption? Fake the numbers!:
Big tech uses a stupendous amount of power, so it generates a stupendous amount of CO2. The numbers are not looking so great, especially with the ever-increasing power use of AI.

So the large techs want to fiddle how the numbers are calculated!

Companies already have a vast gap between “market-calculated” CO2 and actual real-world CO2 production. The scam works a lot like carbon credits. Companies cancel out power used on the coal/gas-heavy grid in northern Virginia by buying renewable energy credits for solar energy in Nevada.

So in 2023, Facebook listed just 273 tonnes of “net” CO2 and claimed it had hit “net zero” — but it actually generated 3.9 million tonnes.

In practice, RECs don’t drive new clean energy or any drop in emissions — they only exist for greenwashing.

It gets worse. Large techs are already the largest buyers of RECs. So they’re lobbying the Greenhouse Gas Protocol organization to let them report even more ludicrously unrealistic numbers.

RECs currently have to be on the same continent at the same time of day. Amazon and Facebook propose a completely free system with no geographical constraints. They could offset coal power in Virginia with wind power from Norway or India.

This will make RECs work even more like the carbon credit market — where companies can claim hypothetical “avoided” CO2 against actual, real-world CO2.
In Data center emissions probably 662% higher than big tech claims. Can it keep up the ruse? Isabel O'Brien reinforced the message:
Amazon is the largest emitter of the big five tech companies by a mile – the emissions of the second-largest emitter, Apple, were less than half of Amazon’s in 2022. However, Amazon has been kept out of the calculation above because its differing business model makes it difficult to isolate data center-specific emissions figures for the company.

As energy demands for these data centers grow, many are worried that carbon emissions will, too. The International Energy Agency stated that data centers already accounted for 1% to 1.5% of global electricity consumption in 2022 – and that was before the AI boom began with ChatGPT’s launch at the end of that year.

AI is far more energy-intensive on data centers than typical cloud-based applications. According to Goldman Sachs, a ChatGPT query needs nearly 10 times as much electricity to process as a Google search, and data center power demand will grow 160% by 2030. Goldman competitor Morgan Stanley’s research has made similar findings, projecting data center emissions globally to accumulate to 2.5bn metric tons of CO2 equivalent by 2030.

In the meantime, all five tech companies have claimed carbon neutrality, though Google dropped the label last year as it stepped up its carbon accounting standards. Amazon is the most recent company to do so, claiming in July that it met its goal seven years early, and that it had implemented a gross emissions cut of 3%.
Because the tech giants are funnelling vast amounts of cash to Nvidia for hardware to train AIs to, for example, tell people to eat at Angus Steakhouse, put glue on pizza, convince them that black people's IQ is inferior to whites', hallucinate patients' responses to doctors, persuade teens to commit suicide, and so on, they will need lots of power. The smart miners have figured out that their access to lots of power is worth more to the AI bubble than the Bitcoin it could mine, especially since the halvening. The market has figured this out too:
while the shares of the majority of the companies have underperformed Bitcoin’s more than 60% rally this year with future mining revenue constrained, traders appear to be voting which strategy will succeed, with those embracing AI posing the largest gains.

MARA and Riot, two of the largest publicly traded Bitcoin miners and both “hodlers,” have seen their shares slump 20% and 36%, respectively, this year.
On the other hand:
Northern Data AG is examining a possible sale of its crypto mining business to free up funds for expanding its artificial-intelligence operations.

The Frankfurt-listed company, whose main shareholder is stablecoin issuer Tether Holdings Ltd., would use proceeds from the sale of Peak Mining to focus on its AI solutions unit, it said in a statement Monday. Shares of Northern Data jumped as much as 12% on the news, and were up 9.8% as of 12:06 p.m. in Frankfurt.
The big tech companies are desperate for power:
  • They are continuing to burn coal at plants that were due to shut down in Montana, Omaha (Google & Facebook), Utah, Georgia and Wisconsin:
    “This is very quickly becoming an issue of, don’t get left behind locking down the power you need, and you can figure out the climate issues later,” said Aaron Zubaty, CEO of California-based Eolian, a major developer of clean energy projects. “Ability to find power right now will determine the winners and losers in the AI arms race. It has left us with a map bleeding with places where the retirement of fossil plants are being delayed.”
  • Morgan Stanley estimates that:
    The datacenter industry is set to emit 2.5 billion tonnes of greenhouse gas (GHG) emissions worldwide between now and the end of the decade, three times more than if generative AI had not been developed.
  • S&P Global Commodity Insights:
    noted that only 54 gigawatts of the US coal industry is projected to be powered off by 2030 – down 40 percent from a prediction made in July last year. The total number of coal plants retired by 2050 is still expected to be roughly the same, but the pace of retirement from now to the end of the decade will be significantly slower compared to last year's estimates.
    ...
    Coal plants can credit their new lease on life to the datacenter industry, which is expanding and upgrading existing bit barns as well as building new facilities. The age of AI requires lots of energy – Google search powered by AI alone is expected to use ten times the power of a more traditional information request, according to the International Energy Agency's (IEA) January report.
  • Microsoft signed a 20-year contract to restart Three Mile Island:
    Constellation Energy shut down the Unit 1 reactor in 2019 — not the one that melted down in 1979, the other one — because it wasn’t economical. Inflation Reduction Act tax breaks made it viable again, so Constellation went looking for a customer. Microsoft has signed up for 835 megawatts for the next 20 years.
    ...
    Other mothballed nuclear reactors want to restart for data centers, including Palisades in Michigan and Duane Arnold in Iowa. These both shut down because renewables and natural gas were cheaper — but the data centers need feeding.

    TMI Unit 1 should be back online in 2028, going into the strained local grid — so when the AI bubble pops, the clean-ish power will still be there.
  • Google and Amazon have signed deals for Small Modular Reactors (SMRs), and so has Oracle, but:
    Google has signed a deal with California startup Kairos Power for six or seven small modular reactors. The first is due in 2030 and the rest by 2035, for a total of 500 megawatts.

    Amazon has also done three deals to fund SMR development.
    ...
    Only three experimental SMRs exist in the entire world — in Russia, China, and Japan. The Russian and Chinese reactors claim to be in “commercial operation” — though with their intermittent and occasional hours and disconcertingly low load factors, they certainly look experimental.

    Like general AI, SMRs are a technology that exists in the fabulous future. SMR advocates will talk all day about the potential of SMRs and gloss over the issues — particularly that SMRs are not yet economically viable.

    Kairos doesn’t have an SMR. They have permission to start a non-powered tech demo site in 2027. Will they have an approved and economically viable design by 2030?
Of course, the nuclear options won't add CO2 to the atmosphere, but they won't come on line until after we've breached 1.5C. The result is the rapidly increasing "emissions gap" of the large tech companies. But the problem is even worse than it appears. In my EE380 talk I discussed the carbon emissions from Bitcoin's hardware:
Bitcoin's growing e-waste problem by Alex de Vries and Christian Stoll concludes that:
Bitcoin's annual e-waste generation adds up to 30.7 metric kilotons as of May 2021. This level is comparable to the small IT equipment waste produced by a country such as the Netherlands.
That's an average of one whole MacBook Air of e-waste per "economically meaningful" transaction.
Why does Bitcoin generate so much e-waste?:
The reason for this extraordinary waste is that the profitability of mining depends on the energy consumed per hash, and the rapid development of mining ASICs means that they rapidly become uncompetitive. de Vries and Stoll estimate that the average service life is less than 16 months. This mountain of e-waste contains embedded carbon emissions from its manufacture, transport and disposal. These graphs show that for Facebook and Google data centers, capex emissions are at least as great as the opex emissions.
Lindsay Clark's GenAI's dirty secret: It's set to create a mountainous increase in e-waste points out that AI has the same problem:
Computational boffins' research claims GenAI is set to create nearly 1,000 times more e-waste than exists currently by 2030, unless the tech industry employs mitigating strategies.

The study, which looks at the rate AI servers are being introduced to datacenters, claims that a realistic scenario indicates potential for rapid growth of e-waste from 2.6 kilotons each year in 2023 to between 400 kilotons and 2.5 million tons each year in 2030, when no waste reduction measures are considered.
Assuming that the tech giants eventually succeed in generating profits from their massive investments in AI data centers, it is likely that the economic life of Nvidia's hardware is longer than that of Bitmain's mining rigs. But the investment is much bigger, so it is likely that the capex emissions from AI data centers add greatly to the overall climate impact of AI. Even if they never make profits, the capex emissions from the current build-out will still be in the atmosphere.

Interestingly, the mainstream media has started to pay attention. Back in June the Washington Post's Evan Halper and Caroline O'Donovan's AI is exhausting the power grid. Tech firms are seeking a miracle solution reported on the latest shiny object:
So near the river’s banks in central Washington, Microsoft is betting on an effort to generate power from atomic fusion — the collision of atoms that powers the sun — a breakthrough that has eluded scientists for the past century. Physicists predict it will elude Microsoft, too.

The tech giant and its partners say they expect to harness fusion by 2028, an audacious claim that bolsters their promises to transition to green energy but distracts from current reality.
Even if they could "harness fusion by 2028", it would be too late to avoid 1.5C. But no-one has yet built a fusion reactor with a positive power output, so the 2028 claim is obvious BS. Pay attention to their actions not words:
In fact, the voracious electricity consumption of artificial intelligence is driving an expansion of fossil fuel use — including delaying the retirement of some coal-fired plants.
...
The data-center-driven resurgence in fossil fuel power contrasts starkly with the sustainability commitments of tech giants Microsoft, Google, Amazon and Meta, all of which say they will erase their emissions entirely as soon as 2030. The companies are the most prominent players in a constellation of more than 2,700 data centers nationwide, many of them run by more obscure firms that rent out computing power to the tech giants.

“They are starting to think like cement and chemical plants. The ones who have approached us are agnostic as to where the power is coming from,” said Ganesh Sakshi, chief financial officer of Mountain V Oil & Gas, which provides natural gas to industrial customers in Eastern states.
And this month the New York Times' David Gelles' The A.I. Power Grab reported that Nvidia was also pushing the "AI will solve the climate" fantasy:
Nvidia’s chips are incredibly power-hungry. As the company rolls out new products, analysts have taken to measuring the amount of electricity needed to power them in terms of cities, or even countries.

There are already more than 5,000 data centers in the U.S., and the industry is expected to grow nearly 10 percent annually. Goldman Sachs estimates that A.I. will drive a 160 percent increase in data center power demand by 2030.

Dion Harris, Nvidia’s head of data center product marketing, acknowledged that A.I. was creating a huge spike in power usage. But he said that over time, that demand would be offset as A.I. made other industries more efficient.

“There is sort of a myopic view on the data center,” he said, “but not really an understanding that a lot of those technologies are going to be the main way that we’re going to innovate our way to a net-zero future.”
Apart from continuing to burn fossil fuels as fast as they can and signing deals that won't make a difference until after the world has committed to 1.5C, what are the tech giants doing? Just like the crypto-bros, they are greenwashing, and spinning ludicrous futures to prevent current action. Here, for example, is Eric Schmidt:
Eric Schmidt, the former chief executive of Google, recently said that the artificial intelligence boom was too powerful, and had too much potential, to let concerns about climate change get in the way.

Schmidt, somewhat fatalistically, said that “we’re not going to hit the climate goals anyway,” and argued that rather than focus on reducing emissions, “I’d rather bet on A.I. solving the problem.”
Schmidt at Sun
Full disclosure: I reported to Schmidt at Sun Microsystems, and my opinion of him is less negative than that of most of my then peer engineers. But I would not expect him to sacrifice immediate profits for the health of the planet. He is right that “we’re not going to hit the climate goals anyway”, but that is partly his fault. Even assuming that he's right and AI is capable of magically "solving the problem", the magic solution won't be in place until long after 2027, which is when at the current rate we will pass 1.5C. And everything that the tech giants are doing right now is moving the 1.5C date closer.

Celebrating Halloween with Gothic fiction in WorldCat / HangingTogether

We love Halloween at OCLC. Some of us decorate our cubicles. Some of us dress in costume. All of us rejoice in the amazing resources represented in WorldCat that are often read at this time of year. In this post I share with you, my fellow bibliophiles and Gothic fiction fans, a few of my favorite resources available in WorldCat—hopefully at a library near you!

Office cubicle wall decorated for Halloween with the theme "Gothic fiction." The OCLC cubicle of Kate James; photo courtesy of the author

Tales of the Grotesque and Arabesque

This two-volume collection of short stories contains “The Fall of the House of Usher.” Told by an unnamed narrator, this story describes a seemingly haunted house that splits into half after all the members of the Usher family die. The story is an exemplar of Gothic fiction and has been adapted multiple times as a film and television program. The 2023 limited series The Fall of the House of Usher, created by Mike Flanagan, is actually a loose adaptation of multiple Poe stories including “The Fall of the House of Usher,” “The Tell-Tale Heart,” and “The Black Cat.” Tales of the Grotesque and Arabesque also includes several lesser-known Poe stories such as “The Duc de L’Omelette.” Poe is best known for writing horror, but “The Duc de L’Omelette” is humorous. After dying from eating an ortolan, the Duc goes to hell and plays cards with Baal-Zebub, Prince of the Fly.

Only 750 copies of this 1840 publication of Tales of the Grotesque and Arabesque were printed. During the printing run, the type for pages 213 and 219 of volume 2 loosened, causing variations such as some copies having page 213 numbered as 231. Member libraries holding copies of this book include the Newberry Library, National Library of Scotland, and University of Sydney. Harvard University has a copy inscribed by Poe on the front endleaf: “For Miss Anna and Miss Bessie Pedder, from their most sincere friend, The Author.”

Frankenstein

The title page of volume 1 of the 1818 publication of Frankenstein, with the title given as "Frankenstein; or, The Modern Prometheus"; courtesy of the Library of Congress, Rare Book and Special Collections Division

It's well known that this novel is a result of a competition among Mary Shelley, Percy Bysshe Shelley, John Polidori, and Lord Byron. However, it is less known among today’s readers that the first edition, published in 1818, lacked any statement of authorship. The preface was written by Mary’s husband, Percy, and the novel was dedicated to her father, the writer and philosopher William Godwin. Some critics speculated that Percy Bysshe Shelley was the author, and others speculated that the author was a woman. While anonymous novels were not rare in this time period, the British Critic’s harsh review of Frankenstein reveals a contempt for female authorship that Shelley would have anticipated: “The writer of it is, we understand, a female; this is an aggravation of that which is the prevailing fault of the novel; but if our authoress can forget the gentleness of her sex, it is no reason why we should; and we shall therefore dismiss the novel without further comment.” (For more information on anonymous authorship of this time, see the University of Minnesota Press blog.)

Many subsequent editions and adaptations as motion pictures, plays, musicals and comic books dispute the British Critic’s review. That journal ceased publication in 1843, but in what is probably the most recent publication of the novel, Dover Publications published Frankenstein in August 2024. The novel appeals to horror fans with its reanimated monster, but it has broad appeal for any reader who has ever felt like they don’t belong. Member libraries holding copies of the 1818 edition include the British Library and Smith College. The Library of Congress owns a copy and has digitized it for anyone who wants to read it freely online.

Varney the Vampire, or, The Feast of Blood

This horror story, generally attributed to James Malcolm Rymer and Thomas Peckett Prest, was first published as a penny dreadful between 1845 and 1847. It was published as a book in 1847, but sadly I could not find any records in WorldCat for the 1847 print edition. (Catalogers, if you are reading this and your library has a copy of this edition, please contribute your record to WorldCat!) We do have a record for a reprint of the 1847 edition with new prefatory matter, which I have provided the link for above. This is not a classic like Dracula or Frankenstein. In fact, it is more like the 19th-century version of the low-budget horror movie. The 1847 book ran to 232 chapters and 847 pages, with two columns of text on each page. This is because the author was paid by the typeset line. The protagonist is the vampire Sir Francis Varney, and he is the first vampire described in fiction as having sharpened teeth. Perhaps Bram Stoker was inspired by Varney in his description of Dracula.

Whether you celebrate Halloween by Trick or Treating, watching a scary movie, reading a good novel, or attending a costume party, may you have a Happy Halloween!

The post Celebrating Halloween with Gothic fiction in WorldCat appeared first on Hanging Together.

Now Available: Inclusive Metadata Toolkit from the Cultural Assessment Working Group / Digital Library Federation

From the Cultural Assessment Working Group

The DLF Cultural Assessment Working Group (CAWG) is excited to announce the publication of the Inclusive Metadata Toolkit! This toolkit serves as a centralized guide to the range of inclusive metadata tools and resources currently out there, in order to equip practitioners to implement inclusive metadata practices in their day-to-day work.

The toolkit consists of two components:

  1. The Inclusive Metadata Toolkit guide document, which provides context for the listed tools and resources in order to make them easier to use and navigate

  2. The complete Inclusive Metadata Toolkit Resource Directory, which serves as a sortable and filterable directory of inclusive metadata tools and resources to help you wherever your institution is at

We hope the Inclusive Metadata Toolkit Resource Directory can continue to change and grow, providing a living directory as more inclusive metadata tools and resources are created and published over time. Additional resources can be suggested through the Inclusive Metadata Toolkit Suggested Resource & Feedback Form. General feedback or questions are also welcome.

The post Now Available: Inclusive Metadata Toolkit from the Cultural Assessment Working Group appeared first on DLF.

Mapping Openness in Europe: A Regional Meeting with Open Knowledge Foundation / Open Knowledge Foundation

On 10 October 2024 the regional call for Europe for the Open Knowledge Network was held online. The discussion was facilitated by Esther Plomp, the Regional Coordinator for Europe.

Objectives and context of the meeting

The Europe regional call aimed to understand whether it would be helpful to map the connections between existing OKFN network and chapter members, based on a pilot map created by Esther. Discussions led to the conclusion that instead of mapping the network members, it would be more helpful to map the projects on which they are working.

Mapping 

Esther kicked off the call by presenting the pilot map for the European region with a heavy focus on the OKFN and its individual members. She also shared alternative mappings that have been made available by others in the Open landscape: 

As well as overviews related to Open Knowledge such as: 

After a quick review of the pilot map and the available resources, our discussion went in a different direction: it would be more helpful to find synergies between the activities in the different topic areas, rather than mapping individual members. This would result in more focus on concrete activities geared towards international cooperation. A good way forward would be the working groups that were discussed at the OKFN Gathering in Katowice, such as the ‘Open Knowledge Festival 2025’, a ‘toolkit for regional advocacy’, and the ‘Mentorship programme’ for the network. This led to a discussion of the Project Repository as a good example of how the ongoing activities in the network are already mapped. 

Project Repository

The Project Repository is an overview of open projects set up by Network Members, together with other projects promoting openness developed by organisations that are close allies of the Network. During the Europe regional call it became clear that it would be helpful to continue building on the Project Repository and make it easier for Network Members to find, understand and replicate projects. Ultimately, the Project Repository can support outreach and capacity building: when projects have similar goals, the teams can take action together. 

Increase awareness

The Project Repository may increase awareness of projects and help people work together on similar goals. The way the Repository is currently set up may not yet facilitate this collaboration, as it is currently underused by Network members. If the Repository is not widely used amongst network members, it is highly unlikely that all their projects are currently listed. The overview may also become outdated if there are no clear mechanisms to update existing projects. To avoid a flood of information and make it easier to get started, it may be more helpful to list fewer projects that are still active and focus on their progress or replicability for others. There are many benefits to copying existing projects instead of reinventing the wheel. One benefit is that it is easier to get funding for projects if you have a proof of concept. The Project Repository can be especially helpful here.

More information about how successful projects can be replicated

It is currently difficult to determine which projects would be easier to build on for others based on the Project Repository structure. For example, right now both Network projects and external projects are listed in the same colours.

It would be helpful if the Project Repository focused more on projects that could be replicated in other regions. For this, different information is needed than is currently available in the Project Repository. For example:

  • What are the requirements to successfully implement the project?
  • What are the success factors?
  • How can you get started with a small prototype version of the project?
  • How can the existing project support other teams that want to replicate the project? 
  • User stories: how can this project be useful for certain audiences?
  • How does this project link with other projects in the Project Repository? Which Network members were involved?
  • What is the level of a project? Is it part of a larger (international) initiative?

This may require some restructuring to make it clearer which projects are active and which ones are easier to replicate. Additionally, it may be easier to find relevant projects if they are filtered by the problems they aim to address. The possibility of using the Project Repository as an incubator space was also raised. 

Share success stories

One example highlighted by the Switzerland chapter was the Prototype fund, where Switzerland learned from the German experiences. Both teams will co-present their collaboration to inspire others in the next OKFN network call!

Next Steps

To improve awareness of the Project Repository, the next Open Knowledge Foundation Network call on the 26th of November will focus on this topic. 

To move us into action, Esther will kick off the working group on the Mentoring Programme. In the future it will be important to track the progress of these different working groups: what is moving forward and where are we moving? 

Support / Ed Summers

I like Molly White’s idea that the web isn’t just a place for big corporations. It’s a place where we can try new things, and support others that are doing work that helps and inspires us.

The early web was marked by a lot of idealism, which has turned out to have been way off the mark given the degree to which we are exploited online. But not all the web is this way. We have more autonomy and agency than we think. We are able to experiment in ways that big tech can’t. We can wire things together in ways that they won’t, and talk about things that they won’t, and focus on our communities and co-ops and unions in ways that they can’t and won’t. And we can choose to support the people we see building the web this way.

It’s a simple idea, but worth watching the whole talk to fully understand this point she is making.

With that in mind, I thought it would be kind of fun to add a page to my website listing the people and projects I choose to financially support who work on the web. You can find it here.

I was wondering, is there a way to markup the HTML to indicate that I support these projects?

The primary purpose is to communicate this list to other people, so there’s not really a strong use case for marking it up so it could be understood by software. But I do like how tools like StreetPass can discover people in the Fediverse as you browse the web.

I suppose there’s probably some way to cobble something together with schema.org or Microformats, but perhaps a well-known URI for discovery would be helpful?

A Long Time Coming: Building Browse Features for Our Library Catalog / Library Tech Talk (U of Michigan)

A sample Author Browse record showing the main entry for Mark Twain along with other versions he used.

The Author Browse entry for "Twain, Mark, 1835-1910"

When we moved our library catalog from Aleph to Alma in 2020, we left behind the Aleph OPAC (also known as Mirlyn Classic), which we had used as our “legacy” catalog for years even after moving first to a VuFind-based discovery layer (known as VuFind Mirlyn), and then to our current, homegrown Library Search application. This post describes how we built our authority browse features.

Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 29 October 2024 / HangingTogether

The following post is one in a regular series on issues of Inclusion, Diversity, Equity, and Accessibility, compiled by a team of OCLC contributors.

Rebuilding trust in authoritative sources

Two hands cupping a baby fern. Photo by Noah Buscher on Unsplash

Libraries have long been THE trusted source for authoritative information, whether one was trying to settle a bet or compose a doctoral dissertation. Our current era, though, has seen a disconcerting decline in people’s trust in so many institutions and disciplines, from science and the news media to government and organized religion. Helping to both cause and exacerbate the plunge in trust has been our ever-increasing reliance on technology that blurs the boundaries between verifiable truths and outright falsehoods. Because of the library’s traditional position as a reliable source, today’s libraries also have a vital role in the effort to re-establish and enhance the trust that has been dismantled. On 12 November 2024 at 3:00 p.m. Eastern Time, OCLC’s WebJunction will present a free webinar “How do we rebuild trust in authoritative information sources?” Rachel Moran, Senior Research Scientist at the University of Washington (OCLC symbol: WAU) Center for an Informed Public; Jamie Collins, Director of Kentucky’s Marion County Public Library (OCLC symbol: KJ6); and Kristen Calvert, Programs and Events Administrator of Dallas Public Library (OCLC symbol: IGA) in Texas, will lead an hour-long discussion about the problem and suggested paths toward dealing with it.

For the past year, the Center for an Informed Public, “an interdisciplinary research initiative at the University of Washington dedicated to resisting strategic misinformation, promoting an informed society and strengthening democratic discourse” has been working with WebJunction on a multiyear project to create an information literacy program for libraries and their communities nationwide. The webinar registration site includes links to a related wealth of information from the Center for an Informed Public, which is itself well worth exploring. Contributed by Jay Weitz.

DLF Inclusive Metadata Toolkit

Earlier this month, the Digital Library Federation Cultural Assessment Working Group (CAWG), released the Inclusive Metadata Toolkit, a resource to support the work of reparative and inclusive description. The toolkit includes a guide and a resource directory. The guide contains contextual information to support learning, strategic approaches, and implementation. The guide is static, whereas the directory can grow to accommodate more tools and resources. 

I was delighted to see this toolkit, and love how it is structured. I particularly appreciate the set of tools that can be used to interrogate and take action on sets of existing metadata. An area where there is an opportunity to add resources is in developing practices for working with local communities. This will be the focus of a conversation later this month at the ATALM Conference (aka the 2024 International Conference of Indigenous Archives, Libraries, and Museums) with my colleague Mercy Procaccini; Selena Ortega-Chiolero (Museum Specialist, Chickaloon Village Traditional Council); and Melissa Stoner (Native American Studies Librarian, University of California, Berkeley – Ethnic Studies Library, OCLC Symbol: CUY). In a discussion session titled “Opening Doors, Inviting Critique: Indigenizing Metadata Practices,” they will highlight creating meaningful, respectful and reciprocal relationships with communities. Mercy and I have learned so much from discussions with Melissa and Selena and look forward to outcomes and additional knowledge sharing. Contributed by Merrilee Proffitt

Conference centers “Community of Care”

As attendees gather for the Library Assessment Conference (LAC) next week in Portland, OR, they’re invited to help co-create a “Community of Care” that will support the individual and collective needs of the conference. This initiative transcends traditional conference setups by providing spaces and resources that prioritize diverse needs—including sensory comfort, accessibility, and mental well-being. Such measures foster a welcoming, inclusive environment where all participants can fully engage and feel valued. Building on similar intentions at recent ALA conferences, “a community of care is an extension of self-care to remove the burden of navigating problematic systems and harmful cultural norms from the individual. ARL staff and the LAC planning group recognize that facilitating a community of care is one solution for harm reduction and that there is more work to be done to truly disrupt, change, and eliminate systems that perpetuate inequity.” This is a significant step toward ensuring that all participants, regardless of their needs, can experience the conference fully and comfortably.

The term Community of Care really resonated with me as I read through the conference materials. It feels like it is coming from a place of inclusion instead of mere compliance, which they reference. I am excited to attend the conference and see what this commitment to inclusivity and wellness looks like on the ground. Contributed by Brooke Doyle.

The post Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 29 October 2024 appeared first on Hanging Together.

Now Available: #DLFteach Toolkit Volume 4: Critical Digital Literacies / Digital Library Federation

From the Digital Library Pedagogy Group (#DLFteach)

We are excited to announce the release of #DLFteach Toolkit Volume 4: Critical Digital Literacies, an open-access resource designed to support both information professionals and educators.
With a thematic focus on critical digital literacies, Volume 4 of the Toolkit offers adaptable lesson plans and learning objects that help learners develop the skills necessary to consume and create information in a digital landscape, as well as the habits of mind necessary to understand and critique information systems and their underlying power structures. It encourages both skills-based outcomes and contextual thinking, making clear the inequities and structural biases of many digital tools.

Instructors using the Toolkit will learn strategies for promoting inclusivity, accessibility, and digital pedagogy in their teaching practices, helping learners engage thoughtfully with emerging technologies and enact strategies to correct inequities of use and impact.

Please feel free to make the Toolkit your own, and to share it with interested colleagues and professional networks!
Those interested in editing future volumes of the DLFTeach Toolkits can contact Alex Wermer-Colan (alex.wermer-colan@temple.edu).
Shared on behalf of the editors,
Ashley Peterson, Alexandra Solodkaya, Mackenzie Salisbury

The post Now Available: #DLFteach Toolkit Volume 4: Critical Digital Literacies appeared first on DLF.

Panel: The Tech We Want is Political / Open Knowledge Foundation

The Tech We Want Summit took place between 17 and 18 October 2024 – in total, 43 speakers from 23 countries interacted with 700+ registered people about new practical ways to build software that is useful, simple, long-lasting, and focused on solving people’s real problems.

In this series of posts, OKFN brings you the documentation of each session, opening the content generated during these two intense days of reflection and joint work accessible and open.

Above is the video and below is a summary of the topics discussed in:

[Panel 1] The Tech We Want is Political

17 October 2024 – 10:30 UTC

Since the Snowden revelations, citizen efforts have been focused on patching a broken system of surveillance, extractivism of people and the planet, and rights erosion. This conversation will discuss the current state of things and the viability of uniting technical and political efforts to move in a different direction.

Summary

Renata Ávila, CEO of OKFN, moderates a discussion exploring the broad intersections of technology, politics and society. Panellists, including Anita Gurumurthy (IT for Change), Bolaji Ayodeji (DPGA) and Poncelet Ileleji (Jokkolabs Banjul), explore the political implications of technology, the need for a new social paradigm, and the role of public governance and investment in promoting democratic and sustainable technological development.

The conversation spans global examples, including Brazil and Africa, highlighting challenges such as the digital divide and emphasising international cooperation for digital inclusion. A significant part focuses on Africa’s Digital Transformation Strategy 2020-2030, showcasing digital public goods (DPGs) and their role in sustainable development. Discussions include examples from India, such as app-based platforms for women’s empowerment and public AI for oral language assessment.

The panel also explores the crucial balance between technological opportunities and risks in Africa, the importance of digital sovereignty and innovative local solutions, and the need for political will, stakeholder engagement, and sound governance to ensure equitable technological progress.

Read More

Six things that caught my eye / Hugh Rundle

Like many people I often read things and feel the urge to share them with other people. I've experimented with a few different ways to do this, but having a "newsletter" feels like a second job. I set up a section on this blog just for sharing links but I looked at all my open tabs today and decided they might warrant a full blog post. This is my blog, I can break the rules if I want to 😛.

Nearly Universal Principles of Projects (NUPP)

As you might expect from the name, NUPP identifies project management principles rather than trying to provide a toolkit. The principles all seem sensible, and I especially like this comment on planning document templates:

Looking for a “template” is the opposite of doing something based on a purpose.

Vale

Getting involved in more open source software projects has exposed me to the wide range of linting tools for software developers. Long-term readers of this blog may know of my interest in plain text, and I've dreamed recently of a customisable linter for prose writing. Enter Vale, "an open-source, command-line tool that brings your editorial style guide to life".

I spent a little bit of time yesterday setting up configuration for Vale, including downloading a Hunspell Australian English dictionary. Expect me to write more on this topic in a future post!

Jupyter Notebooks that run entirely locally

Normally running Jupyter notebooks requires a special server. Jupyter Lite is a new project leveraging WASM to run Jupyter notebooks from any modern web browser:

The goal is to provide a lightweight computing environment accessible in a matter of seconds with a single click, in a web browser and without having to install anything.

Blacklight Query

Blacklight from The Markup shows how many trackers a given website uses. Try your local university or state-owned media site - you may be surprised. For researchers testing hundreds of sites at a time, the Blacklight interface can be tedious, so they have introduced a command-line tool called Blacklight Query.

This story from 404 Media is about US law but many countries tend to follow where the US leads. This story caught my attention because it exists in the same universe as tech companies blatantly violating copyright laws to build "AI" tools. It seems politicians aren't interested in letting people fix things they own, but would like to help corporations plunder intellectual property.

Governance on Fediverse microblogging servers

This report from Erin Kissane and Darius Kazemi outlines their findings from a research project about governance on medium-to-large sized fediverse servers. I appreciate the thoughtful approach to how this report is presented: there are both web and PDF versions, and they have also provided "suggested reading pathways" rather than a single "executive summary".


When Is a Book Not a Book? / Distant Reader Blog

Question: When is a book not a book? Answer: When it is on a computer.

I draw a strong distinction between the things we call books, and the things we call books when they are saved on a computer. The former are codices, and the latter are digital files. For the most part, codices are akin to collections of pages bound between a pair of covers, and the latter are manifested in formats such as, but not limited to: Portable Document Format (PDF), HTML, epub, etexts, and various word processor files.

Why do I draw a strong distinction between these things? Because they are manifested differently, they lend themselves to different functions. These differences offer various advantages, disadvantages, strengths, and weaknesses. Consequently, one set of these things (books) can be read one way, and the other set of things (digital files) can be read another way.

I have enjoyed using the traditional reading process to read the book by Mortimer Adler called How To Read A Book. The book outlines four over-arching reading processes: 1) elementary, 2) inspectional, 3) analytic, and 4) syntopical. The processes make sense to me.

I have also enjoyed the combined processes of text mining, natural language processing, and machine learning to read books. Moreover, I have created a tool allowing the student, researcher, or scholar to read digital files of narrative text. The tool echoes the processes outlined by Adler, but does them in a digital environment.

Increasingly, academics do not read real books (codices). Instead they increasingly read digital files, and since the formats are inherently different, so must be the processes of reading them. The balance of this essay outlines how...

[This essay was never finished.]

Building a Collection of HathiTrust Items / Distant Reader Blog

While collections are rarely finished, I have finished creating and curating the collection of HathiTrust files. To cut to the chase, I collected and curated approximately 345,000 items. See:

The process required just about every aspect of librarianship:

  • Collections - I needed to articulate and implement a collection management policy. Of the 800,000 items available to me, I wanted only the items that were written in English, described as books, and deduplicated. Deduplication was the most difficult aspect of the problem. In the end, I reduced duplication from 20%-30% to about 2%. I identified 345,000 items to collect.
  • Acquisitions - Given the 345,000 identifiers, the acquisitions process locally cached the items from the Trust's computers. This was easy, but took about 24 hours to complete.
  • Cataloging - Given the 345,000 identifiers, the cataloging process harvested MARC records describing each item and modified them to meet my local cataloging practice. More specifically, pre-coordinated subject headings were converted into simpler FAST headings, two 856 fields were added denoting original/canonical locations and local/cached locations, and local notes were added denoting data format (etext) and collection (HathiTrust). The resulting records were then poured into an open source integrated library system called Koha. (See the sketch after this list.)
  • Stacks maintenance - Given the 345,000 identifiers, the set of plain text files -- OCRed versions of the originals -- was saved on a local Web server; thus, every item has a URL in the "stacks". See: https://distantreader.org/stacks/trust/
  • Public service - The Koha application supports a very simple cataloging interface and a more sophisticated index. The former is easier to use; the latter is more expressive and more fully featured. More importantly, search results point directly to found items. No landing pages. No splash pages. No authentication. Moreover, there are zero links to maintain. Most importantly, the index allows one to create, curate, and use data sets (I call them "study carrels") from search results. Do a search. Download the results. Curate the results to suit your particular research question. Create a data set. Analyse and read the result.
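
To make the cataloging step above more concrete, here is a minimal sketch of that kind of record massaging using the pymarc library (assuming pymarc 5.x). The file names, the URL, and the choice of a 590 local note are hypothetical examples rather than the exact fields and scripts used.

  # a hedged sketch of the cataloging step; assumes pymarc 5.x
  # file names, the URL, and the 590 note are hypothetical examples
  from pymarc import MARCReader, Field, Subfield

  with open('hathitrust.mrc', 'rb') as marc_in, open('local.mrc', 'wb') as marc_out:
      for record in MARCReader(marc_in):

          # add an 856 field pointing at the locally cached plain text version
          record.add_field(Field(
              tag='856',
              indicators=['4', '0'],
              subfields=[Subfield(code='u', value='https://distantreader.org/stacks/trust/example.txt')]))

          # add a local note denoting data format and collection
          record.add_field(Field(
              tag='590',
              indicators=[' ', ' '],
              subfields=[Subfield(code='a', value='etext; HathiTrust')]))

          # save the modified record
          marc_out.write(record.as_marc())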

What is the use case for this whole thing? What is the problem that I'm trying to address? The answer is simple. I'm addressing information overload. Using the index the student, scholar, or researcher is able to:

  1. create large sets of relevant content, such as: the complete works of any given author, a comprehensive set of things published as broadsides, a set of dozens if not hundreds of scholarly articles on a given topic, et cetera
  2. create a study carrel of the results
  3. employ both computer technology and traditional reading techniques to use and understand the content of the carrel

Using these features, I am easily able to:

  • compare and contrast the works of Plato and Aristotle
  • list dozens of definitions of "social justice"
  • observe the ebb and flow of ideas across just about any book
  • observe the ebb and flow of ideas across a collection of books
  • reduce a set of thousands of articles on a given topic to a couple dozen most relevant items

Fun fact: It took me about one month (roughly twenty working days) to do this work. Thus, I did the whole of library processing at an average rate of about 17,000 items/day or, over eight-hour days, about 35 items/minute.

Another fun fact: The computer hosting the library catalog application (Koha) runs on a 2-core computer with 4 GB of RAM and 60 GB of disk space. This is about the size of your desktop computer, if not smaller. It costs me $25/month to keep the catalog up and running. The Distant Reader application -- the tool used to create study carrels -- is much bigger: 60 cores, 200 GB of RAM, and 5 TB of disk storage. The Center for Research Computing hosts the Reader application.

Final fun fact: The whole of the Reader's library holdings is now about .7 million items.

Finished Creating a Collection of Carrels / Distant Reader Blog

I have finished creating and curating a collection of data sets I call "study carrels". †

As data sets, study carrels are intended to be computed against, and they are akin to "collections as data". On average, each study carrel includes about 100 textual items on a given topic, of a particular genre, or by a given author. They can be analyzed ("read") in a myriad of ways, including but not limited to:

  • bibliographics
  • concordancing
  • feature analysis
  • full-text indexing
  • large-language models
  • linked data
  • network graph analysis
  • semantic indexing
  • topic modeling

Through these reading techniques all sorts of research questions can be addressed, and they range from the mundane to the sublime:

  • how big is this collection?
  • how difficult is it to read the items in this collection?
  • what are the most frequent ngrams and named-entities?
  • what is discussed, what do those things do, and how are they described?
  • what words can be used to denote the aboutness of the collection, and what sentences contain those words?
  • what are the latent themes in the collection, and how have those themes ebbed and flowed over time or varied between authors?
  • how is Penelope in Homer's epics similar and different from the main characters of Jane Austen's works?
  • what is the definition of climate change, and how has it been manifested?

My next steps are four-fold: 1) describe the collection in greater detail, 2) describe how the collection can be accessed programmatically or through a Web browser, 3) demonstrate how to model carrels, and 4) address big philosophic questions like what is truth, beauty, honor, and justice.

Here are a few fun facts:

  • the collection includes 3,000 carrels comprised of 315,000 items for a total of 3.5 billion words
  • the largest carrel is on the topic of English literature and is comprised of 72,000,000 words which is equal to 90 Bibles or 280 copies of Moby Dick
  • the content of the carrels comes from repositories such as Project Gutenberg, EarlyPrint, a dataset called CORD-19, and journal articles harvested via OAI-PMH

Finally, I assert the process of reading something online is inherently different from reading something in an analog form. "Duh!" For example, in an analog form we inherently observe the size of a document and state whether it is long or short. Such is not nearly as easy to do in a digital environment. What are you to do? Measure sizes in bytes? Similarly, analog books include all sorts of tools to assist in the reading process: tables of contents, running chapter headings, page numbers, back-of-the-book indexes, maybe annotations written in the margins, etc. Things like this are poorly manifested in the digital environment. On the other hand, the digital environment does include rudimentary find (control-f). If the process of reading is different in different environments, then we need different tools to do our reading. Heck, I use my glasses to help me read. I read with my pencil in hand. Why not use a computer to help me read? Moreover, traditional -- close -- reading does not scale, but distant reading does. For example, how long would it take you to use traditional reading techniques to outline the characteristics of 280 novel-length things? Study carrels are an attempt to address all of these issues.

Peruse the collection at http://carrels.distantreader.org/

Thank you for listening.

† - All that said, collections are never really finished.

Author Interview: Andrew K. Clark / LibraryThing (Thingology)

Andrew K. Clark

LibraryThing is pleased to present our interview with novelist and poet Andrew K. Clark, whose work has been published in The American Journal of Poetry, UCLA’s Out of Anonymity, Appalachian Review, Rappahannock Review, and The Wrath Bearing Tree. Deeply influenced by his upbringing and family history in western North Carolina, Clark received his MFA from Converse College, and made his book debut in 2019, with the poetry collection Jesus in the Trailer. His first novel, Where Dark Things Grow, a work of magical realism set in the Southern Appalachian Mountains in the 1930s, is due out this month from Cowboy Jamboree Press, and is available in our current monthly batch of Early Reviewer giveaways. Clark sat down with Abigail to answer some questions about his new book.

Where Dark Things Grow follows the story of a teenage boy with a troubled home life, who finds something magical and uses it to embark on a course of revenge. How did the story idea first come to you? Did it start with the character of Leo, with the theme of revenge, or with something else?

The novel came from a short story I wrote about my grandfather’s childhood growing up in Southern Appalachia and grew from there. I’ve always been drawn to magical realism and supernatural stories, so I was interested in mixing a sort of hardscrabble Appalachian setting with those more fantastical elements. Initially the story started with Leo, but as I got into the difficulties he faced, I realized he, like all of us, has a choice: to respond to adversity with anger or with resilience. His story is finding his way to resilience after a dark turn toward revenge and violence borne out of his family’s struggles, what he sees happening to missing young women, and a lack of empathy from the community.

Tell us more about wulvers. What are they, where do they come from, and what kinds of stories and traditions are associated with them?

One of the decisions I made early on in writing the novel was that I would use folklore elements from my own cultural heritage, as much as possible. So wulvers come from Scottish folklore. I use them quite differently than they appear in the lore, mixing in elements of horror and even the notion of direwolves from the Game of Thrones books. In Scottish tradition, wulvers are benevolent, and there are stories of them doing things like placing fish in the window sill of families that were struggling, that sort of thing. So in my novel there is a benevolent wulver, but there is also a dark, sinister one causing mischief. In the folklore, one thing that stuck with me is the wulvers can walk on their hind legs, much like a human, so mine do this when they want to seem imposing.

What made you decide to set Where Dark Things Grow during the 1930s, at the height of the Great Depression? Is there something significant about that period, in terms of the story you wanted to tell?

My grandparents grew up during the Great Depression in Southern Appalachia, and that period of time has always fascinated me. My grandfather was a story teller in the Appalachian tradition (my people came to Western NC in 1739), so I grew up hearing a lot of stories, including what it was like to grow up in the 1930s. One thing that always interested me is that Asheville is seen as this wealthy Gilded Age kind of place in literature and popular culture, but for my grandparents, the Great Depression brought almost no change to their lives – they were very poor before it started and so they didn’t feel the pain that some did. As a matter of fact, my grandfather would say their lives got better because of the Great Depression because my great grandfather got a job with the TVA. I always knew I wanted to write a story about a teenager growing up in this time period, and that story grew into Where Dark Things Grow.

You have described yourself as deeply rooted in the region of western North Carolina, where your ancestors have lived since before the American Revolution. In what ways has this geographic and cultural background influenced your storytelling? Which parts of your story are universal, and which parts could only happen in Southern Appalachia?

What’s often said about Appalachian writers is that the landscape is often a central character to story. That’s true for Where Dark Things Grow and so I don’t think it could happen anyplace else, in the same way. The major themes of the novel: revenge, the corrupting influence of power, criminal behavior (human trafficking), the struggle between good and evil, friendship and family, are universal and could be present in any setting. I think at the heart of every story is this sense of conflict, and so in that way, even if my reader doesn’t have reference points for Southern Appalachia, they can connect to the story and see themselves in the characters.

Your first book was a collection of poetry, and you have published individual poems in numerous publications. What was it like to write a novel instead? Does your writing process differ, when approaching different genres? Are there things that are the same?

I think one thing I carry to my prose is a focus on the structure and sound of the individual sentence. I always admire a well crafted sentence in a book I’m reading. So in that focus on language, there doesn’t feel to be as much of a difference as one might think. What’s different is that a single poem captures a more singular feeling or scene in the case of a narrative poem. In fiction, scenes build on each other and excavate themes more deeply over time. What I do find is that I feel comfortable with the novel form and the poem form; I am not as comfortable with the in between, short stories, if that makes sense. If I have that little to say, it feels more natural to distill it down into a poem. That said, I love short fiction, and read a lot of short story collections. In some ways a poetry collection or short story collection is a perfect vehicle for our modern attention challenged brains. But I love to get immersed in a world, in the lives of characters, the way I can with a novel. I think I’ll always write both.

What’s next for you? Are you working on more poetry, do you intend to write more novels, or branch out still further?

One thing I am happy about for readers is that my second novel, Where Dark Things Rise, is coming next fall from Quill and Crow Publishing House. It is a loose sequel to Where Dark Things Grow, which was published by Cowboy Jamboree Press. These two novels took about seven to eight years to write, and while the first book is set in the 1930s, the second is set in the 1980s, both in the Asheville / Western North Carolina area. I have started a third novel, which is quite different but also in the horror / magical realism genre. I have some poems assembled for a second poetry collection as well.

Tell us about your library. What’s on your own shelves?

My taste is pretty eclectic. You’ll find a lot of southern fiction by writers like William Gay, Ron Rash, Taylor Brown, Daniel Woodrell, S.A. Cosby, etc. You’ll also find a lot of magical realism novels: Murakami, Marquez, Toni Morrison, Jesmyn Ward, Robert Gwaltney, etc. And of course horror novels by Andy Davidson, Paul Tremblay, Stephen King, Stephen Graham Jones, Nathan Ballingrud, etc. I also have a couple of shelves dedicated to poetry books. Some favorites: Ilya Kaminsky, Kim Addonizio, Jessica Jacobs, Tyree Daye, bell hooks, Anne Sexton, W.S. Merwin, Ada Limón – I could go on and on.

What have you been reading lately, and what would you recommend to other readers?

One of my favorites this year is Taylor Brown’s Rednecks, about the West Virginia mine wars of the 1910s and 1920s. It’s a rich narrative; one of the most compelling historical fiction novels I’ve read. I’d also recommend The Hollow Kind by Andy Davidson, which mixes historical fiction elements, horror, and folklore in a delightful way. The Red Grove by Tessa Fontaine is a 2024 favorite, and definitely has elements of magical realism. For poetry, I’m really digging Bruce Beasley’s Prayershreds right now.

readme.txt / Distant Reader Blog

About Distant Reader Study Carrels
==================================

tl;dnr - Distant Reader study carrels are data sets, and they are
designed to be read by computers as well as people. The purposes of
study carrels are to: 1) address the problem of information overload,
and 2) facilitate reading at scale. See the Distant Reader home page
(https://distantreader.org) for more detail.


Introduction
------------

The Distant Reader and the Distant Reader Toolbox take collections of
files as input, and they output data sets called "study carrels".
Through the use of study carrels, students, researchers, and scholars
can analyze, use, and understand -- read -- large corpora of narrative
text, where "large" is anything from a dozen journal articles to
hundreds of books. Through this process you can quickly and easily
address research questions ranging from the mundane to the sublime:

  * How many items are in this carrel; is the size of this corpus
    big or small?
  
  * What are the things mentioned in this carrel, what do they do,
    and how do they do it?
  
  * In more than a few sentences, what is the content of this
    carrel about? Provide specific examples.
  
  * What are the over-arching themes in the carrel, and how have
    they ebbed and flowed over time?
  
  * What is St. Augustine's definition of love, and how does it
    compare to Rousseau's?

  * How do the sum of writings by Plato and Aristotle compare?
  
The balance of this document outlines the structure of every study
carrel and introduces how to use them.


Layout
------

Study carrels are directories made up of many subdirectories and files.
Each study carrel contains these two directories:

  1. cache - original documents used to create the carrel

  2. txt - plain text versions of the cached content; almost all
     analysis is done against the files in this directory
  
There are additional subdirectories filled with tab-delimited files of
extracted features:

  1. adr - email addresses
  2. bib - bibliographics (authors, titles, dates, etc.)
  3. ent - named-entities (people, organizations, places, etc.)
  4. pos - parts-of-speech (nouns, verbs, adjectives, etc.)
  5. urls - URLs and their domains
  6. wrd - statistically significant keywords
  
Even though none of the files in the subdirectories have extensions of
.tsv or .tab, they are all tab-delimited files, and therefore they can
be imported into any spreadsheet, database, or programming language.
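
For example, the following is a minimal sketch -- assuming Python and
the pandas library -- of loading one of these tab-delimited files. The
path is a hypothetical example, and the column names vary from feature
to feature:

  # load one of a carrel's tab-delimited feature files; the path is hypothetical
  import pandas as pd

  features = pd.read_csv('./pos/homer.pos', sep='\t')
  print(features.columns.tolist())   # list the available columns
  print(features.head())             # peek at the first few rows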

Each study carrel includes two more subdirectories:

  1. figures - where images are saved

  2. etc - everything else; of greatest importance are the
     carrel's stop word list, the bag-of-words representation of the
     carrel, and the carrel's SQLite database file
  
Depending on how the carrel was computed against (modeled), there may be
a number of files at the root of each study carrel, and these files are
readable by a wide variety of desktop applications and programming
languages:

  * index.csv - if the study carrel creation process was augmented
    with a metadata values file (authors, titles, dates, etc.), then
    that file is echoed here

  * index.gml - a Graph Modeling Language file of the carrel's
    author(s), titles, and computed keywords, and useful for
    visualizing their relationships

  * index.htm - an HTML file summarizing the characteristics of the
    extracted features; start here

  * index.json - same as the index.txt file, but in a JSON form

  * index.rdf - bibliographic characteristics encoded in the form
    of the Resource Description Framework, and intended for the
    purposes of supporting the Semantic Web

  * index.tsv - a very, very rudimentary list of characteristics
    denoting whence the carrel came and when

  * index.txt - a bibliographic report in the form of a plain text
    file

  * index.xml - a browsable interface to the study carrel; renders
    much more easily on the Web than on your local computer

  * index.zip - the whole study carrel compressed into a single
    file for the purposes of collaborating, sharing, and downloading


Desktop Applications
--------------------

Study carrels are designed to be platform- and network-independent. What
does this mean? It means two things: 1) no special software is needed to
read study carrel data, and 2) if the study carrel is saved on your
local computer, then no Internet connection is needed to analyze it.
That said, you will want to employ the use of a variety of desktop
applications (or programming languages) in order to get the most out of
a study carrel.


Text Editors

Text editors are not word processors. While text editors and word
processors both work with text, the former are more about the
manipulation of the text, and the latter are more about graphic design.
The overwhelming majority of data found in study carrels is in the form
of plain text, and you will find the use of a decent text editor
indispensable. Using a text editor, you can open and read just about any
file found in a study carrel. That's very important!

A good text editor supports powerful find and replace functionality,
supports regular expressions, has the ability to open multi-megabyte
files with ease, can turn on and off line wrapping, and reads text files
created on different computer platforms. The following two text editors
are recommended. Don't rely on Microsoft Word or Google Docs; they are
word processors.

  * BBEdit (https://www.barebones.com/products/bbedit/)
  * NotePad++ (https://notepad-plus-plus.org/)


Word Cloud Applications

The use of word clouds is often viewed as sophomoric. This is true
because they are too often used to illustrate the frequency of all words
in a text. On the other hand, if word clouds illustrate the frequencies
of specific things -- keywords, parts-of-speech, or named entities --
then word clouds become much more compelling. After all, "A picture is
worth a thousand words."

A program called Wordle is an excellent word cloud program. It takes raw
text as input. It also accepts delimited data as input. The resulting
images are colorful, configurable, and exportable. Unfortunately, it is
no longer supported; while it will run on most Macintosh computers, it
will no longer run (easily) on Windows computers. (I would pay a fee to
have Wordle come back to life and be brought up to date.) If Wordle does
not work for you, then there is an abundance of Web-based word cloud
applications.

  * Wordle (https://web.archive.org/web/20191115162244/http://www.wordle.net/)
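
If you would rather script the process, the following is a minimal
sketch using the Python wordcloud library (not Wordle); the input file
is a hypothetical example, such as the plain text of a carrel item:

  # generate a word cloud image from a plain text file; the path is hypothetical
  from wordcloud import WordCloud

  with open('homer.txt') as handle:
      text = handle.read()

  cloud = WordCloud(width=800, height=600, background_color='white').generate(text)
  cloud.to_file('homer-cloud.png')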
  

Concordances

Developed in the 13th century, concordances are among the oldest of
text mining tools. They function like the rudimentary find function you
see in many applications. Think control-f on steroids.

Concordances locate a given word in a text, display the text surrounding
the word, and help you understand what other words are used in the same
context. After all, to paraphrase a linguist named John Firth, "One
shall know a word by the company it keeps." The following is a link to a
concordance application that is worth way more than what you pay for it,
which is nothing.

  * AntConc (https://www.laurenceanthony.net/software/antconc/)
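
If you prefer code to a desktop application, a keyword-in-context
display takes only a few lines of Python. The sketch below is not
AntConc's implementation, and the file name is a hypothetical example:

  # a poor man's concordance; the file name is a hypothetical example
  QUERY = 'whales are'
  WIDTH = 40

  with open('moby.txt') as handle:
      text = ' '.join(handle.read().split())   # flatten all whitespace

  start = 0
  while (found := text.find(QUERY, start)) != -1:
      print(text[max(0, found - WIDTH):found + len(QUERY) + WIDTH])
      start = found + 1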
  

Spreadsheet-Like Applications

The overwhelming majority of the content found in study carrels is in
the form of plain text, and most of this plain text is structured in the
form of tab-delimited text files -- matrices, sometimes called "data
frames". These files are readable by any spreadsheet, database, or
programming language. Microsoft Excel, Google Sheets, or Macintosh
Numbers can import Reader study carrel delimited data, but these
programs are more about numerical analysis and less about analyzing
text.

Thus, if you want to do analysis against Reader study carrel data, and
if you do not want to write your own software, then the use of an
analysis program called OpenRefine is highly recommended. OpenRefine
eats delimited data for lunch. Once data is imported, OpenRefine
supports powerful find and replace functions, counting and tabulating
functions, faceting, sorting, exporting, etc. While text editors and
concordances supplement traditional reading functions, OpenRefine
supplements the process of understanding study carrels as data.

  * OpenRefine (https://openrefine.org/)


Topic Modeling Applications

Topic modeling is a type of machine learning process called
"clustering". Given an integer (I), a topic modeler will divide a corpus
into I clusters, and each cluster is akin to a theme. Thus, after
practicing with a topic modeler, you can address questions like: what
are the things this corpus is about, to what degree are themes
manifested across the corpus, and which documents are best represented
by the themes? After supplementing the corpus with metadata (authors,
titles, dates, keywords, genres, etc.), topic modeling becomes even more
useful because you can address additional questions, such as: how did
these themes ebb and flow over time, who wrote about what, and how is
this style of writing different from that style?

The venerable MALLET application is the grand-daddy of topic modeling
tools, but it is command-line driven. On the other hand, a program
called Topic Modeling Tool, which is rooted in MALLET, brings topic
modeling to the desktop. Like all the applications listed here, its use
requires practice, but it works well, it works quickly, and the data it
outputs can be used in a variety of ways.

  * MALLET (https://mimno.github.io/Mallet/)
  * Topic Modeling Tool (https://github.com/senderle/topic-modeling-tool)
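
For readers who would rather stay in code, the following is a minimal
topic modeling sketch using the Python gensim library rather than
MALLET; the tiny list of documents is a toy stand-in for a carrel's
plain text files:

  # a minimal gensim sketch; the documents are toy stand-ins for carrel files
  from gensim import corpora, models

  documents = ['whales are mammals of the sea'.split(),
               'ships sail the stormy sea'.split(),
               'mammals nurse their young'.split()]

  dictionary = corpora.Dictionary(documents)               # map words to integers
  corpus = [dictionary.doc2bow(doc) for doc in documents]  # bag-of-words vectors
  model = models.LdaModel(corpus, num_topics=2, id2word=dictionary)

  for topic in model.print_topics():
      print(topic)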
  

Network Analysis Applications

Texts can be modeled in the form of networks -- nodes and edges. For
example, there are authors (nodes), there are written works (additional
nodes), and specific authors write specific works (edges). Similarly,
there are works (nodes), there are keywords (additional nodes), and
specific works are described with keywords (edges). Given these sorts of
networks you can address -- and visualize -- all sorts of questions: who
wrote what, what author wrote the most, what keywords dominate the
collection, or what keywords are highly significant (central) to many
works and therefore authors?

Network analysis is rooted in graph theory, and it is not a trivial
process. On the other hand, a program called Gephi makes the process
easier. Import one of any number of different graph formats or
specifically shaped matrices, apply any number of layout options to
visualize the graph, filter the graph, visualize again, apply clustering
or calculate graph characteristics, and visualize a third time. The
process requires practice, some knowledge of graph theory, and an
aesthetic sensibility. In the end, you will garner a greater
understanding of the content in your carrel.

  * Gephi (https://gephi.org)
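
The carrel's index.gml file can also be inspected without Gephi. The
following is a minimal sketch using the Python networkx library, and it
assumes an index.gml file is in the current directory:

  # read a carrel's index.gml and list the most connected nodes
  import networkx as nx

  graph = nx.read_gml('index.gml')
  print(graph)                              # number of nodes and edges

  centrality = nx.degree_centrality(graph)  # which nodes touch the most edges?
  ranked = sorted(centrality.items(), key=lambda item: item[1], reverse=True)
  for node, score in ranked[:10]:
      print(round(score, 3), node)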


Command-Line (Shell) Interface
------------------------------

The Distant Reader and its companion, the Distant Reader Toolbox, are
implemented as a set of Python modules. If you have Python installed,
then from the command line you can install the modules -- the Toolbox:

  pip install reader-toolbox

If you are a developer, then you may want to use GitHub to install from
source:

  git clone https://github.com/ericleasemorgan/reader-toolbox.git
  cd reader-toolbox
  pip install -e .

Once installed, you can run variations of the rdr ("reader") command.
For example, running rdr without any arguments returns a menu:
  
  Usage: rdr [OPTIONS] COMMAND [ARGS]...

  Options:
    --help  Show this message and exit.
  
  Commands:
    about          Output a brief description and version number of the...
    adr            Filter email addresses from <carrel>
    bib            Output rudimentary bibliographics from <carrel>
    browse         Peruse <carrel> as a file system
    build          Create <carrel> from files in <directory>
    catalog        List study carrels
    cluster        Apply dimension reduction to <carrel> and visualize the...
    collocations   Output network graph based on bigram collocations in...
    concordance    A poor man's search engine
    documentation  Use your Web browser to read the Toolbox (rdr) online...
    download       Cache <carrel> from the public library of study carrels
    edit           Modify the stop word list of <carrel>
    ent            Filter named entities and types of entities found in...
    get            Echo the values denoted by the set subcommand
    grammars       Extract sentence fragments from <carrel> as in:
    info           Output metadata describing <carrel>
    ngrams         Output and list words or phrases found in <carrel>
    notebooks      Download, list, and run Toolbox-specific Jupyter Notebooks
    play           Play the word game called hangman
    pos            Filter parts-of-speech, words, and lemmas found in <carrel>
    rdfgraph       Create RDF (Linked Data) file against <carrel>
    read           Open <carrel> in your Web browser
    readability    Report on the readability (Flesch score) of items in...
    search         Perform a full text query against <carrel>
    semantics      Apply semantic indexing against <carrel>
    sentences      Given <carrel> save, output, and process sentences
    set            Configure the location of study carrels, the subsystem...
    sizes          Report on the sizes (in words) of items in <carrel>
    sql            Use SQL queries against the database of <carrel>
    summarize      Summarize <carrel>
    tm             Apply topic modeling against <carrel>
    url            Filter URLs and domains from <carrel>
    web            Experimental Web interface to your Distant Reader study...
    wrd            Filter statistically computed keywords from <carrel>
    zip            Create an archive (index.zip) file of <carrel>

Use the rdr command to build study carrels and do analysis against them.
For example: 1) create a directory on your desktop and call it
"practice", 2) copy a few PDF files into the directory, 3) open your
terminal, 4) change directories to the desktop, and 5) run the following
command to create a carrel named "my-first-carrel":

  rdr build my-first-carrel practice -s

Once you get this far, you can run many other rdr commands:

  * rdr info my-first-carrel
  * rdr bib my-first-carrel
  * rdr concordance my-first-carrel
  * rdr tm my-first-carrel
  
For more detail, run the rdr command with the --help flag. See also the
documentation: https://reader-toolbox.readthedocs.io/

Power-user hints: The output of many rdr commands is designed to be
post-processed by the command line shell. For example, suppose you have
a study carrel named "homer", then the following command will display
the results of the bib command one screen at a time:

  rdr bib homer | more

A carrel's bibliographics can also be output as a JSON stream, and by
piping the output to many additional commands, you can create a prettier
bibliography:

  rdr bib homer -f json              | \
  jq -r '.[]|[.title,.summary]|@tsv' | \
  sed "s/\t/ -- /"                   | \
  sed "s/$/\n/"                      | \
  fold -s                            | \
  less

Suppose you wanted to extract rudimentary definitions of the word
"whales" from a carrel named "moby", then the following is a quick and
dirty way to get the job done:

  Q='whales are'; rdr concordance moby -q "$Q" -w 60 | sed "s/^.*$Q/$Q/"
  
Creating a nice list of sentences is similar:

  Q='whales are'; rdr sentences moby -p filter -q "$Q" | \
  sed "s/$/\n/"                                        | \
  fold -s                                              | \
  less -i --pattern="$Q"
  

Write your own software
-----------------------

The Reader Toolbox can also be imported into Python scripts, and
consequently you can combine its functionality with other Python
modules. For example the power-user concordance command, above, can be
written as a Python script:

  # configure
  CARREL = 'moby'
  QUERY  = 'whales are'
  WIDTH  = 60
  
  # require
  from rdr import concordance
  from re  import sub
  
  # do the work, output, and done
  lines = concordance(CARREL, query=QUERY, width=WIDTH)
  [print(sub('^.*{}'.format(QUERY), QUERY, line)) for line in lines]
  exit()

Use pydoc to learn more: pydoc rdr


Summary
-------

Distant Reader and the Distant Reader Toolbox take sets of narrative
text as input, and they output data sets called study carrels. The
content of study carrels is intended to be read by computers as well as
people. Study carrels are platform- as well as network-independent, and
therefore they are designed to stand the test of time.

Use desktop software and/or the Reader Toolbox to build, download,
search, browse, peruse, investigate, and report on the content of study
carrels. Use the extracted features as if they were items found in a
back-of-the-book index, and use them as input to concordances for the
purpose of closer reading. Topic model study carrels to enumerate latent
themes and address the question, "How do themes ebb and flow over time?"
Import the index.gml files into Gephi (or any other network graph
application) to visualize how authors, titles, and dates are related.
All of this is just the tip of the iceberg; study carrels can do much
more.

Study carrels are intended to address the problem of information
overload. They make it easier to use and understand large volumes of
text -- dozens of books or hundreds of journal articles. Through the
process you can address all sorts of research questions, and in the end
you will have supplemented your traditional reading process and you will
have been both more thorough and more comprehensive in your research.

The 2024 DLF Forum is a Wrap! / Digital Library Federation

What a Journey! Thank You for Joining Us at the Second DLF Forum of the Year

We’ve just concluded our second DLF Forum of the year, following the in-person event at Michigan State University in July. A heartfelt thank you to everyone who joined us virtually this week!

We were thrilled to welcome nearly 700 digital library, archives, and museum professionals from member institutions and beyond. With over 100 speakers and 35 sessions, including an insightful talk by Featured Speaker Andrea Jackson Gavin, the event was full of valuable discussions and collaborations.

A special thanks to our incredible Program Committee for their hard work in reviewing and selecting sessions for both the virtual and in-person programs, and to our generous sponsors who provided essential support, from technology to coffee breaks and swag. We couldn’t have done it without you!

If you weren’t able to register for the Virtual Forum, here are some ways to see what happened:

Subscribe to the DLF Forum newsletter to hear news and updates about the forthcoming 2025 DLF Forum.

The post The 2024 DLF Forum is a Wrap! appeared first on DLF.

The Tech We DON’T Want: Bring your scary tech story to our Halloween / Open Knowledge Foundation

It started as an inside joke: ‘Why don’t we have a Halloween party with the tech we don’t want?’ We could talk about bugs, bad code, closed and proprietary stacks, disappearing dependencies, PDFs, things we generally hate.

The idea got people excited. And then we thought, ‘Why not open it up to anyone who wants to come?’

So we asked some AI we hate to create a poster (which turned out awful). And here we are.

Next Thursday, October 31st, 11:00 CEST, bring your scary tech story and celebrate Halloween – or Buggyween – at this open meeting with the Open Knowledge Foundation team. We’ve just been inspired by last week’s The Tech We Want Summit, and thought it would be a great opportunity to unload the worst of what we see out there in a session of the opposite spirit.

Let’s make a toast with bad coffee and sweets to the technologies we don’t want (like Zoom!).

Sacred Mountain / Ed Summers

Otherwise known as Dziãgais’â-ní or Sierra Blanca. It is sacred ground for the Mescalero Apache Tribe.

Open Data Commons in the age of AI and Big Data / Open Knowledge Foundation

Text originally published by CNRS, Paris

Earlier this year, the Centre for Internet and Society, CNRS convened a panel at CPDP.ai. The panel brought together researchers and experts of digital commons to try and answer the question at the heart of the conference – to govern AI or to be governed by AI?

The panel was moderated by Alexandra Giannopoulou (Digital Freedom Fund). Invited panelists were Melanie Dulong de Rosnay (Centre Internet et Société, CNRS), Renata Avila (Open Knowledge Foundation), Yaniv Benhamou (University of Geneva) and Ramya Chandrasekhar (Centre Internet et Société, CNRS).

The common(s) thread running across all our interventions was that AI is bringing forth new types of capture, appropriation and enclosure of data that limit the realisation of its collective societal value. AI development entails new forms of data generation as well as large-scale re-use of publicly available data for training, fine-tuning and evaluating AI models. In her introduction, Alexandra referred to the MegaFace dataset – a dataset created by a consortium of research institutions and commercial companies containing 3 million CC-licensed photographs sourced from Flickr. This dataset was subsequently used to train facial-recognition AI systems. She referred to how this type of re-use illustrates the new challenges for the open movement – how to encourage open sharing of data and content while protecting privacy and artists’ rights and preventing data extractivism.

There are also new actors in the AI supply chain, as well as new configurations between state and market actors. Non-profit actors like OpenAI are leading the charge in consuming large amounts of planetary resources as well as entrenching more data extractivism in the pursuit of different types of GenAI applications. In this context, Ramya spoke about the role of the state in the agenda for more commons-based governance of data. She noted that the state is no longer just a sanctioning authority, but also a curator of data (such as open government data which is used for training AI systems), as well as a consumer of these systems themselves. EU regulation needs to engage more with this multi-faceted role of the state.

Originally, the commons held the promise of preventing capture and enclosure of shared resources by the state and by the market. The theory of the commons was applied to free software, public sector information, and creative works to encourage shared management of these resources.

But now, we also need to rethink how to make the commons relevant to data governance in the age of Big Data and AI. Data is most definitely a shared resource, but the ways in which value is being extracted out of data and the actors who share this value is determined by new constellations of power between the state and market actors.

Against this background, Yaniv and Melanie spoke about the role that licenses can continue to play in instilling certain values to data sharing and re-use, as well as serving as legal mechanisms for protecting privacy and intellectual property of individuals and communities in data. They presented their Open Data Commons license template. This license expands original open data licenses, to include contractual provisions relating to copyright and privacy. The license contemplates four mandatory elements (that serve as value signals):

  • Share-alike pledge (to ensure circularity of data in the commons)
  • Privacy pledge (to respect legal obligations for privacy at each downstream use)
  • Right to erasure (to enable individuals to exercise this right at every downstream use)
  • Sustainability pledge (to ensure that downstream re-uses undertake assessments of the ecological impact of their proposed data re-use)

The license then contemplates new modular elements that each licensor can choose from – including the right to make derivatives, the right to limit use to an identified set of re-users, and the right to charge a fee for re-use where the fee is used to maintain the licensor’s data sharing infrastructure. They also discussed the need for trusted intermediaries like data trusts (drawing inspiration from Copyright Management Organisations) to steward data of multiple individuals/communities, and manage the Open Data Commons licenses.

Finally, Renata offered some useful suggestions from the perspective of civil society organisations. She spoke about the Open Data Commons license as a tool for empowering individuals and communities to share more data, but be able to exercise more control over how this data is used and for whose benefit. This license can enable the individuals and communities who are the data generators for developing AI systems to have more say in receiving the benefits of these AI systems. She also spoke about the need to think about technical interoperability and community-driven data standards. This is necessary to ensure that big players who have more economic and computational resources do not exercise disproportionate control over accessing and re-using data for development of AI, and that other smaller as well as community-based actors can also develop and deploy their own AI systems.

All panelists spoke about the urgent need to not just conceive of, but also implement viable solutions for community-based data governance that balances privacy and artists’ rights with innovation for collective benefit. The Open Data Commons license presents one such solution, which the Open Knowledge Foundation proposes to develop and disseminate further, to encourage its uptake. There is significant promise in initiatives like the Open Data Commons license to ensure inclusive data governance and sustainability. It’s now the time for action – to implement such initiatives, and work together as a community in realising the promises of data commons.

Mapping Civil Society Organisations on Open Data in Francophone Africa: A Regional Meeting with Open Knowledge Foundation / Open Knowledge Foundation

On 14 October 2024, between 12:30 and 13:30, a crucial regional meeting for the coordination of French-speaking African countries in the Open Knowledge Network was held online. This virtual meeting brought together various stakeholders in the field of open data in French-speaking Africa, with the main aim of mapping the civil society organisations active in this field. The initiative was spearheaded by Narcisse Mbunzama, the Regional Coordinator for Francophone Africa, who led the presentation and discussions.

Objectives and context of the meeting

The meeting aimed to understand and assess the current landscape of civil society organisations engaged in open data across Francophone African countries. The idea was to create a mapping of these organisations to better understand their activities, structures, missions, as well as the challenges and opportunities they face.

A central point of this discussion was the exploration of the sources of funding for these organisations, as well as their relationships and collaborations with the Open Knowledge Network.

Presentation by Mr Narcisse Mbunzama: An Overview of Open Data

During this session, Mr Narcisse Mbunzama gave a detailed presentation that gave participants a clear view of the current state of open data initiatives in French-speaking African countries. The presentation highlighted a number of civil society organisations already active in this field, while also highlighting the specific dynamics in each country.

It emerged that while some organisations have succeeded in developing innovative and impactful projects, they often face a lack of financial support and recognition at international level. The presentation also highlighted a major challenge: the lack of formal collaboration between these local organisations and the Open Knowledge Foundation, as well as the lack of local chapters and individuals affiliated to the Open Knowledge Foundation in many French-speaking countries.

Challenges and opportunities for civil society organisations

The discussions revealed a number of challenges to the growth and impact of open data initiatives in the region. Some of the key barriers identified include:

  1. Lack of sustainable funding: The majority of civil society organisations rely on one-off funding, which limits their ability to develop long-term projects and make strategic plans for the sustainable development of open data.
  2. Lack of structured collaboration: Participants highlighted the lack of formal links between local organisations and the global Open Knowledge Network. This hinders the spread of good practice in open data.
  3. Lack of awareness of the Open Knowledge Foundation: In many French-speaking African countries, the existence of the Open Knowledge Foundation and its role in promoting open data is not well known. This limits the involvement of local players who could otherwise benefit from this global network.

However, the meeting also highlighted significant opportunities, including:

The rise of local initiatives: Several countries in French-speaking Africa are seeing a surge in innovative initiatives and projects promoting the use of open data in various sectors, such as governance, education and health.

Potential for collaboration: There is a strong desire among local organisations to collaborate and connect with the Open Knowledge Network to share resources, expertise and solutions adapted to local contexts.

Strengthening Collaboration and the Membership Process

A key part of the meeting was devoted to discussing the Open Knowledge Network membership process for organisations and individuals in French-speaking African countries. Mr Mbunzama explained the steps involved in joining the network, which include registering as a member and setting up local chapters to represent the Network in their respective countries.

Setting up local chapters was seen as a crucial step in strengthening the presence and impact of the Open Knowledge Foundation in the region. This would not only support local organisations but also facilitate better coordination and cooperation between open data initiatives across French-speaking African countries.

Next Steps and Future Call for Meetings

At the end of the meeting, it was proposed to issue a new call for a follow-up meeting that would focus on implementing the ideas discussed. This call, the date of which will be announced later, aims to deepen discussions on strategic partnerships and explore practical ways in which local organisations can work with the Open Knowledge Foundation to promote the adoption of open data.

The long-term goal is to build a strong and connected community of open data stakeholders in Francophone Africa, capable of overcoming local challenges while aligning with international standards. This will not only help to increase transparency and access to information in the region, but also promote sustainable development through policies based on reliable data that is accessible to all.

Conclusion

This regional meeting was a significant step towards a better understanding and integration of open data initiatives in French-speaking African countries. It laid the foundations for a more structured collaboration between local organisations and the Open Knowledge Network. By building on this momentum, it will be possible to create a robust and inclusive ecosystem that will support efforts towards transparency, innovation and sustainable development in the region.

Joining the Network and collaborating with the Open Knowledge Foundation is a crucial step for local organisations. They will be able to benefit from global expertise and shared resources to maximise the impact of their initiatives on the ground. The next meeting will be an opportunity to deepen these exchanges and define concrete actions to promote open data throughout the French-speaking region of Africa.

Open Data Editor: Our Open Source Dependency Just Disappeared / Open Knowledge Foundation

As the title says, both the repository and website of ReactDataGrid, an important dependency for our Open Data Editor, have suddenly disappeared—404 errors, DNS not resolving, just gone. Normally, we would create an issue in the repository (which we did), explore alternatives, allocate time and resources, and replace it. However, given the context of The Tech We Want initiative we’re currently running, I’d like to share a few additional thoughts.

Thinking of Open Source as Infrastructure

Interestingly, just a couple of days ago, I watched a conference talk titled Building the Hundred-Year Web Service with htmx by Alexander Petros that explores the analogy between physical infrastructure (bridges) and web pages. Now, this situation feels to me like a bridge in my city has vanished, and here I am in my car, staring at an empty gap, not understanding what happened or how to get to the other side. It feels strange and unexpected, something that shouldn’t happen: how can this bridge that I cross every day not be here anymore? My brain does not compute at the moment.

While I know that dependencies or projects disappearing isn’t the norm, this situation still gives me the unsettling feeling that the open-source ecosystem may not be as stable or reliable as I’d like to believe. I may be overreacting to this one example, but then my thoughts quickly turn to the recent takeover of Advanced Custom Fields and then to the back-and-forth licensing issues with Elasticsearch and, more recently, Redis, to name a few examples (my overthinking could keep going).

I don’t have any clear answers or suggestions at this point, but I am left with a sense of unreliability. One lesson for me here is that just because something is open source and hosted on GitHub doesn’t mean it will always be accessible. Is GitHub becoming a critical piece of the internet infrastructure on which the whole ecosystem relies? I’d say yes. But what are the consequences of that? Is it good or bad? Should we be concerned? Should we panic? Should we design a plan B? I don’t think so, but I do think it’s worth discussing or at least writing these questions somewhere.

And what about the Open Data Editor?

Our goal with The Tech We Want is to promote the creation of software that can endure over time. So, having this happen just before an important release is doubly ironic and funny.

That said, due to recent changes in the project’s goals, we were already planning to migrate to a simpler stack with fewer dependencies and less turbulent release cycles (more on this later). The sudden disappearance of one of our core dependencies only reinforces the idea that we should aim to build simpler, less dependent technologies.

Read more

The Tech We Want Summit: Review the recordings of what was a great community moment in 2024 / Open Knowledge Foundation

💙 What can we say apart from THANK YOU? 💙

The Tech We Want Summit was a great moment in our year, bringing together our beloved community of technologists, practitioners and creators for two days to show that a different technology stack is possible (and we’re already doing it) – one that’s more useful, simpler, more durable and focused on solving people’s real problems.

At the Open Knowledge Foundation, we are grateful and motivated to continue promoting a fair, sustainable, and open future through technology.

Many thanks to the speakers and hundreds of participants from all over the world!

You can view the recordings by clicking on the links below:

Our team is now working on the summit documentation, which will be published in the coming weeks. Each panel will have its video edited with a summary and notes of what was discussed. We’ll be in touch with the community soon about the next steps in this initiative.

Some top-level stats:

🗣 43 Speakers in total
🌐 23 Countries represented
🤓 15 Demos of the tech we want
🌟 711 Participants
📺 14 Hours of live streaming
🤗 13 Content partners

Huge thanks again to our content partners:

2023 NDSA Storage Survey Report Published / Digital Library Federation

The NDSA is pleased to announce the release of the 2023 Storage Infrastructure Survey Report, available at https://doi.org/10.17605/OSF.IO/9QP4W 

From October 24 to November 22, 2023, the 2023 NDSA Storage Infrastructure Survey Working Group conducted a 51-question survey designed to gather information on the technologies and practices used in preservation storage infrastructure. 

This effort builds upon three previous surveys, conducted in 2011, 2013, and 2019. The survey encouraged responses from NDSA and non-NDSA members to gain a broader understanding of storage practices within the digital preservation community. The survey received 138 complete responses, most of them from the United States, but it did have a global reach. The 2023 survey also incorporated two new questions on storage and environmental impact.

Some major takeaways from the report include:

  • The amount of preservation storage required for all managed copies appeared to stabilize relative to previous surveys. Fewer organizations reported higher allocations of storage, but the anticipated need for storage over the next three years remains elevated. 
  • Only 28% of respondents currently participate in a cooperative system – down from 45% in 2019 – and 63% indicate they are not considering a distributed storage cooperative. The use of commercial cloud storage providers rose from 46% in 2019 to 55% in 2023. 
  • Heavy use of an onsite storage element was reported by academic institutions (91%), archives (88%), and government agencies (71%). The report also shows that onsite storage is most often combined with either independently managed offsite storage or commercial cloud storage managed by the organization.
  • The leading offsite storage provider used by 56% of the responding academic institutions is Amazon Web Services. For responding archives, Amazon Web Services (36%) and Preservica (21%) are the most prevalent. Non-profits, museums, historical societies and public libraries use Amazon Web Services 45% of the time.
  • 52% of respondents said their organization is considering their environmental impact during storage planning. 

The Storage Infrastructure Survey is proposed to be conducted every three years, allowing for ongoing tracking and analysis of approaches to preservation storage over time. The next Storage Infrastructure Survey Working Group is scheduled to kick off in 2026. Interested in participating? A call for group members will go out in late 2025 or early 2026.

~ NDSA 2023 Storage Infrastructure Survey Working Group

The post 2023 NDSA Storage Survey Report Published appeared first on DLF.

New OCLC Research report on open access discovery launched / HangingTogether

Our research report on Improving Open Access Discovery for Academic Library Users has just been published. It is a study of strategies to make scholarly, peer-reviewed open access (OA) publications more discoverable for library users. The findings are based on research conducted at seven academic library institutions in the Netherlands. We interviewed library staff about their efforts around OA discovery and surveyed library users about their experiences with OA. The synthesis of these findings provides new insights into the opportunities to improve OA discovery.

From OA availability to discoverability: bridging the gap

Cover of the OCLC Research report titled "Improving Open Access Discovery for Academic Library Users". The cover is an aerial view of a rural Dutch landscape.

From the very beginning we co-designed and carried out the OA discovery study in collaboration with two Dutch academic library consortia—Universiteitsbibliotheken en Nationale Bibliotheek (UKB) and Samenwerkingsverband Hogeschoolbibliotheken (SHB)—which have been, and still are, instrumental in the progress toward full OA to Dutch scholarly publications. Precisely because they were at the forefront of the shift to OA and investing heavily in OA publishing, they had arrived at a point where they wanted to assess the discoverability of OA publications and address the emerging gap between OA availability and discoverability.

This gap was first revealed by findings from the 2018-2019 OCLC Global Council survey of open content activities in libraries worldwide. The results clearly indicated an imbalance in academic library investment: more effort went into making previously closed content open than into promoting the discovery of open content. Yet, most respondents indicated that the latter was equally important to them. Also noteworthy was the near unanimity with which respondents indicated that OCLC had a role in supporting libraries to make open content discoverable. This was an encouraging acknowledgment of the importance of OCLC’s role in the open access ecosystem.

A series of knowledge sharing consultations with the Dutch academic library community in 2021 confirmed this perceived gap and the need to better understand the role of OA in user discovery behavior. As a result, UKB, SHB, and OCLC decided to carry out a research study that would investigate how expectations and behaviors of academic students, teachers, researchers, and professors could inform libraries’ efforts in making OA discoverable. This was the genesis of the Open Access Discovery project.

The making of the OA discovery landscape: libraries have a role to play

Library staff we interviewed described the emergence of a complex landscape for making OA publications discoverable. New players were eagerly staking out their territory while librarians did what they thought was best, but OA publications did not fit in their traditional processes. There were no guidelines, best practices, or benchmarks for adding OA publications to their collections and integrating them into user workflows. Although national collaborations and new processes were in place to create and expose metadata for institutionally authored OA publications, library staff faced challenges with publication deposits and metadata quality.

Our interviewees were not convinced that their efforts were making a difference for their users, but our report shows they were.

While they were correct in believing that the library was not the first place that users searched, the library search page was in the top three most searched systems. Users’ survey responses paint a somewhat confused picture of the role that OA plays in their discovery journey. Respondents did not find OA publications very easy to search for and access, and nearly half reported not knowing much about OA. However, most relied on OA alternatives when they encountered barriers to full-text access. Although OA was not their first consideration, the increasing amount of OA publications downstream affected their processes of discovery, access, and use. These findings led to the following observation in the report:

“Library staff’s outreach and instruction had been primarily focused on increasing users’ awareness of publishing OA. Users needed additional instruction on discovering, evaluating, and using these new types of publications.”

Introducing the report to the Dutch library community

A group of four people shaking hands. Handing over the report to SHB and UKB representatives.

It was with pleasure and pride that Ixchel Faniel and I presented the final report, with findings and main takeaways, to UKB and SHB representatives at the OCLC Contactdag on 8 October 2024, in Amersfoort, the Netherlands. Contactdag is an annual gathering of professionals from Dutch academic and public libraries interested in the latest news about OCLC’s strategic direction and product development. It is also a forum where they share practices and innovative project results.

In my short remarks introducing the OA discovery report, I shared the main takeaway for the Dutch library community as follows:

“If you’re wondering whether your library’s investment in OA discovery is worth it, the answer is a resounding YES!”

The cover of the report—a photo of a Dutch polder landscape—is a nod to the Dutch setting of our research. It also serves as an analogy for the hard work needed to make OA publications discoverable. A polder is created by digging ditches and building dams and dikes to drain tracts of lowland. As I told the audience, similarly to the polder, “there is still much work to be done. OA is still uncharted territory that needs to be explored and cultivated. We cannot afford to sit and watch!”

Next steps: working smarter together

A group of people sit around a table. One person has an open laptop; others are looking at a printed document. Break-out group at the workshop session on improving OA discovery, during the OCLC Contactdag, 8 October 2024

During the afternoon session of the OCLC Contactdag, participants discussed findings, challenges, opportunities, and next steps in break-out groups. Many recognized the dilemmas around OA discovery, as reflected in the report. They also were interested in using the findings to strategize how to proceed with improving OA discoverability.

A recurring theme was the need to collaborate. Participants discussed the potential benefits of working together on selecting OA titles by subject area and increasing users’ awareness of OA resources. They wanted to share practices on exposing institutional metadata, cooperating on metadata harvesting, and partnering with OCLC to improve the quality of metadata. They also talked about greater engagement, on campus and nationally, with recent Diamond OA publishing initiatives to advocate for discovery metadata that worked well both for library workflows and user needs. These ideas illustrate the need for cross-stakeholder collaboration from OA publishing to discovery and align nicely with the closing words from our report:

Truly improving the discoverability of OA publications requires all of the stakeholders involved to consider the needs of others within the lifecycle.

Read the report to learn more about bridging the gap between the availability and discovery of OA publications. https://oc.lc/oa-discovery

The post New OCLC Research report on open access discovery launched appeared first on Hanging Together.

pincushion / Ed Summers

Websites go away. Everything goes away, so it would be kind of weird if websites didn’t too, right? But not all web content disappears at the same rate. Some parts of the web are more vulnerable than others. Some web content is harder for us to lose, because it is evidence of something happening, it tells a story that can’t be found elsewhere, or it’s an integral part of a memory practice that depends on it.

Web archiving is one way of working with this loss. When building web archives, web content is crawled and stored so that a “replay” application (like the Wayback Machine) can make the content accessible as a “reborn digital” resource (Brügger, 2018). But with web archives the people doing this work are typically not the same people who created the content, which can lead to ethical quandaries that are difficult to untangle (Summers, 2020).

Furthermore, as we’ve seen recently with the cyberattack on the British Library, the DDoS attacks on the Internet Archive, and lawsuits that threaten their existence, web archives themselves are also vulnerable single points of failure. Can web applications be built differently, so that they better allow our content to persist after the website itself is no more?

As part of the Modeling Sustainable Futures: Exploring Decentralized Digital Storage for Community-Based Archives project I’ve been helping Shift Collective think about how decentralized storage technologies could fit in with the sustainability of their Historypin platform. This work has been funded by the Filecoin Foundation for the Decentralized Web, so we have naturally been looking at how Filecoin and IPFS could form part of the technical answer here (Voss et al., 2023).

But perhaps a more significant question than what specific technology to use is how memory practices are changing to adapt to the medium of the web, and how much these changes can be guided in a direction that benefits the people who care about preserving their communities’ knowledge. We sometimes call these people librarians or archivists, but as the Records Continuum Model points out, many are involved in the work, including the individual users of websites who have invested their time, energy and labor in adding resources to them (McKemmish, Upward, & Reed, 2010).

For the last 15 years Historypin users have uploaded images, audio and video, and placed them as “pins” on a map. These pins can then be described, organized into collections, and further contextualized with metadata. Unsurprisingly, Historypin is a web application. It uses a server side application framework (Django), a database (MySQL), file storage (Google Cloud Storage), a client side JavaScript framework (Angular), and depends on multiple third party platforms like Youtube, Vimeo and Soundcloud for media hosting and playback.

What does it mean to preserve this assemblage? Historypin is a complex, running system, that is deeply intertwingled with the larger web. How could decentralized storage possibly help here? Can the complexity of the running software be reduced or removed? Can its network of links out to other platforms be removed without sacrificing the content itself?

Taking inspiration from recent work on Flickr Foundation’s Data Lifeboat, and some ideas from their technical lead Alex Chan, we’ve been prototyping a similar concept called a pincushion as a place to keep Historypin content safe, in a way that is functionally separate from the running web application. In an ideal local-first world, our web applications wouldn’t be so dependent on being constantly connected to the Internet, and the platforms that live and die there. But until we get there, having a local-last option is critically important.

The basic idea is that users should be able to download and view their data without losing the context they have added. We want a pincushion to represent a user’s collections, pins, images, videos, audio, tags, locations, comments…and we want users to be able to view this content when Historypin is no longer online, or even when the user isn’t online. Maybe the pincushion is discovered on an old thumbdrive in a shoebox under the bed.

This means that the resources being served dynamically by the Historypin application need to be serialized as files, and specifically as files that can be viewed directly in a browser: HTML, CSS, JavaScript, JPEG, PNG, MP3, MP4, JSON. Once a user’s content can be represented as a set of static files, it can easily be distributed and copied, and opportunities for replicating it using technologies like IPFS become much more realistic.

pincushion is a small Python command line tool which talks to the Historypin API to build a static website of the user’s content. It’s not realistic to expect users to install and use pincushion themselves, although they can if they want. Instead we expect that pincushion, or something like it, will ultimately run as part of Historypin’s system deployment, and will generate archives on demand when a user requests one.
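
To make that concrete, here is a minimal sketch of the kind of loop such a tool runs: fetch a user’s pins from an API and write each one out as a static page with relative links. The endpoint, JSON fields, and markup below are assumptions for illustration, not the real Historypin API or the actual pincushion code.

  # Hypothetical sketch only: the API base URL and JSON field names are
  # assumptions, not the real Historypin API.
  import html
  import pathlib

  import requests

  API = "https://www.historypin.org/en/api"  # assumed base URL

  def build_archive(user_id: str, out_dir: str = "archive") -> None:
      out = pathlib.Path(out_dir)
      (out / "pins").mkdir(parents=True, exist_ok=True)

      # Assumed endpoint shape; the real API will differ.
      pins = requests.get(f"{API}/users/{user_id}/pins.json", timeout=30).json()

      items = []
      for pin in pins:
          title = html.escape(pin.get("title", "Untitled"))
          body = html.escape(pin.get("description", ""))
          page = f"<html><body><h1>{title}</h1><p>{body}</p></body></html>"
          (out / "pins" / f"{pin['id']}.html").write_text(page, encoding="utf-8")
          # Relative links keep the archive working straight off the filesystem.
          items.append(f'<li><a href="pins/{pin["id"]}.html">{title}</a></li>')

      index = "<html><body><ul>" + "".join(items) + "</ul></body></html>"
      (out / "index.html").write_text(index, encoding="utf-8")

Run as part of Historypin’s deployment, a loop like this could be triggered whenever a user requests an archive of their content.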

At this point pincushion is a working prototype, but already a few design principles present themselves:

  1. Web v1.0: A pincushion is just HTML, CSS and media files. No JavaScript framework or asset bundling is used. Anchor tags with relative paths are used to navigate between pages, all of which are static. These pages work when you load them locally from your filesystem, when you are disconnected from the Internet, or when the pages are mounted on the web somewhere…and also from IPFS.
  2. Bet on the Browser: A pincushion archive relies on modern browsers’ native support for video and audio files. The pincushion utility uses yt-dlp at build time to extract media from platforms like Youtube, Vimeo and Soundcloud and persist it as static MP4 or MP3 files (a sketch of this build-time step follows the list below). Perhaps the browser isn’t going to last forever, but so far it has proven to be remarkably backwards compatible as the web has evolved. If the browser goes away, then it’s unlikely we’ll know what HTML, CSS and image files are anymore. Preserving web content depends on evolving and maintaining the browser.
  3. Progressive Enhancement: A pincushion is designed to be viewed locally in your browser by opening an index.html from your file system. You can even do this when you aren’t connected to the Internet. But since a map lets you zoom and pan to any region of the Earth, it’s pretty much impossible to ship complete map data for offline use. So some functionality, like viewing a pin on a map, is only available when the browser is “online”.
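
For the second principle, the build-time media step looks roughly like the sketch below. It uses yt-dlp’s Python interface; the format options and the media_url placeholder are illustrative, not pincushion’s actual configuration.

  # Sketch of extracting a pin's media at build time with yt-dlp.
  # media_url stands in for a pin's Youtube/Vimeo/Soundcloud link.
  import yt_dlp

  def save_media(media_url: str, out_dir: str = "archive/media") -> None:
      options = {
          # Prefer an MP4 the browser can play natively, else the best available.
          "format": "mp4/best",
          # Write files like archive/media/<id>.mp4
          "outtmpl": f"{out_dir}/%(id)s.%(ext)s",
      }
      with yt_dlp.YoutubeDL(options) as ydl:
          ydl.download([media_url])

The resulting local file can then be referenced from the pin’s HTML with an ordinary <video> or <audio> tag, so playback depends only on the browser.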

These pincushion archives can be gigabytes in size, so I don’t want to link to one right here. But perhaps a few screenshots can help give a sense of how this works. Let’s take a look at the archive belonging to Jon Voss, one of Historypin’s founders:

The “homepage” displaying Jon’s collections
A specific collection showing a set of pins
A video pin in a collection
Viewing the pin next to other pins on a map
Other pins tagged with “mission”

So pretty simple stuff right? Intentionally so. In fact the archives load fine off of these:

Thumbdrives with pincushion archives on them for a workshop.

The truth is that this idea of making snapshots of your data available for download isn’t particularly new. Data Portability has been around as an aspirational and sometimes realizable goal for some time. Since 2018 the EU’s General Data Protection Regulation (GDPR) has made it a requirement for platforms operating in the EU to allow their data to be downloaded. This has raised the level of service for everyone. Thanks EU!

Before the GDPR, Twitter set itself apart with a fully functioning local web application, codenamed Grailbird, for viewing a user’s tweets. Similarly, work by Hannah Donovan on the Vine archive, and before that on the This Is My Jam archive (which sadly seems offline now), provided early examples of how web applications could be preserved in a read-only state (Summers & Wickner, 2019).

However, just because you can download the data doesn’t mean it’s easy to use. Some of these archives are only JSON or CSV data files with minimal documentation. Others add only a teensy bit of window dressing that lets you browse to the data files, but doesn’t really let you look at the actual items. Sometimes media files are still URLs out on the live web.

The pincushion tool is a working prototype that will hopefully guide how user data is provided going forward. But we are looking to the Flickr Data Lifeboat project to see if there are any emerging practices for how to create these archive downloads. A few things that we are thinking about:

  1. It would be great to have a client-side search option using Pagefind or something like it.
  2. Can we enhance our HTML files with RDFa or Microdata to express metadata in a machine-readable way? (A sketch of what that markup could look like follows this list.)
  3. What types of structural metadata, such as a manifest, should we include to indicate the completeness and validity of the data?
  4. To what degree does it make sense to include other people’s content in an archive, for example someone’s comments on your pins, or pins that have been added to your collection?
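
On the second question, a pin page annotated with schema.org Microdata might look something like the sketch below, generated here in Python to match the rest of the tooling. The choice of the CreativeWork type and these particular properties is an assumption about how a pin could be described, not a settled mapping.

  # Sketch: wrap a pin's metadata in schema.org Microdata attributes.
  # The CreativeWork type and property names are illustrative choices.
  import html

  def pin_microdata(title: str, description: str, date_created: str) -> str:
      return (
          '<article itemscope itemtype="https://schema.org/CreativeWork">\n'
          f'  <h1 itemprop="name">{html.escape(title)}</h1>\n'
          f'  <p itemprop="description">{html.escape(description)}</p>\n'
          f'  <time itemprop="dateCreated" datetime="{html.escape(date_created)}">'
          f'{html.escape(date_created)}</time>\n'
          '</article>'
      )

Markup like this stays invisible to readers but would let harvesters and future tools recover the title, description, and date without having to parse the page layout.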

References

Brügger, N. (2018). The archived web. MIT Press.
McKemmish, S., Upward, F., & Reed, B. (2010). Records continuum model. In M. Bates & M. N. Maack (Eds.), Encyclopedia of library and information sciences. Taylor & Francis.
Summers, E. (2020). Appraisal talk in web archives. Archivaria, 89. Retrieved from https://archivaria.ca/index.php/archivaria/article/view/13733
Summers, E., & Wickner, A. (2019). Archival Circulation on the Web: The Vine-Tweets Dataset. Journal of Cultural Analytics, 4(2). Retrieved from https://culturalanalytics.org/article/11048-archival-circulation-on-the-web-the-vine-tweets-dataset
Voss, J., Johnson, L., Jules, B., Collier, Z., Brown-Hinds, P., Castle, B., … Summers, E. (2023). Modeling sustainable futures: Proposing a risk assessment and harm reduction model for community-based archives using decentralized digital storage (Shift Collective report, December 2023, p. 25). New Orleans: Shift Collective. Retrieved from https://inkdroid.org/papers/shift-ffdw-2023.pdf

Come Join the 2024 Halloween Hunt! / LibraryThing (Thingology)

It’s October, and that means the return of our annual Halloween Hunt!

We’ve scattered a hauntourage of ghosts around the site, and it’s up to you to try and find them all.

  • Decipher the clues and visit the corresponding LibraryThing pages to find a ghost. Each clue points to a specific page on LibraryThing. Remember, they are not necessarily work pages!
  • If there’s a ghost on a page, you’ll see a banner at the top of the page.
  • You have just two weeks to find all the ghosts (until 11:59pm EDT, Thursday October 31st).
  • Come brag about your hauntourage of ghosts (and get hints) on Talk.

Win prizes:

  • Any member who finds at least two ghosts will be awarded a ghost Badge.
  • Members who find all 12 ghosts will be entered into a drawing for one of five LibraryThing (or TinyCat) prizes. We’ll announce winners at the end of the hunt.

P.S. Thanks to conceptDawg for the ghostly flamingo illustration!

Submitting a Notable Nomination: Suggestions from the Excellence Award Working Group / Digital Library Federation

The National Digital Stewardship Alliance (NDSA) is an organization with a diverse international membership sharing a commitment to digital stewardship and preservation. Its Excellence Awards Working Group (EAWG) is just as diverse and just as committed. Since 2012 this team has come together to select awardees who have demonstrated significant engagement with the theory and practice of long-term digital preservation stewardship at a level of national or international importance. EAWG members understand the importance of innovation and risk-taking in developing successful digital preservation tools and activities. This means that excellent digital stewardship can take many forms; therefore, eligibility for these awards has been left purposely broad.

I started as a member of the EAWG in 2019 and took part in discussions that led to the group’s move to presenting awards biennially in the odd-numbered years, to interleave them with the Digital Preservation Coalition’s Digital Preservation Awards. I have been co-chairing the group since January 2023, and, although the timing for awards may have changed, our standards have not. Any person, any institution, or any project meeting the criteria for any of the Excellence Awards’ six categories can be nominated. Neither nominators nor nominees need to be NDSA members or to be affiliated with member institutions. Self-nomination is accepted and encouraged, as are submissions reflecting responses to the needs or accomplishments of historically marginalized and underrepresented communities. It is truly inspiring to receive the nominations each year and learn about exciting work that is happening in the field of digital stewardship and preservation that we may never have known about otherwise.

Screenshot of spreadsheet for reviewing nominations. Basic spreadsheet shared by Excellence Awards Working Group members to review, discuss, and select awardees.

Award categories are: Individual, Educator, Future Steward, Organization, Project, and Sustainability. The criteria for each category specified on the EAWG webpage will help nominators select the “big bucket” their nominations will best fit, and every nomination must support the specific contributions named with evidence of their significance. Yet individual nominations focus on individual efforts. So, what can a nominator include to encourage EAWG members to recognize the importance of the nominee’s contributions? Let’s look at a few things that can help a nomination stand out.

 

  • Firsts
    • Efforts producing—or even on their way to producing—something absolutely fresh for the field of digital stewardship are worth nominating. This could be work to produce new tools, connections, workflows, methods, strategies, and more. Nominations for the new developments could offer information showing such aspects as: how this output is new; why it is notably original; what its impact or expected impact will be; and what potential it will have for widespread use. Past nominations have included phrases such as “facilitate the creation of a field that is easier, kinder, smarter, and faster,” “establish tangible solutions to put into practice,” “drawing on the collective experience of those in the field,” and “open resources that have been created and shared.”
  • A New Angle on the Known
    • Another perspective on fresh outputs is that of rethinking the known. This work could offer updated preservation formats, updated tools, or even an enhancement for providing access or improving discoverability. Nominations for such work could offer information evidencing: how this update is an improvement; why it is important to the field; what benefit it will provide; and how wide a range of digital stewards can implement it. Nominations for this type of work have included phrases like: “re-thinking this for the next generation,” “ensuring the outputs were shared with the greater community and not created within an academic silo,” “advance future generations of digital stewards,” and “enhancing tools and standards our field has used for decades.”
  • Hot Topics
    • Significant work being done in areas of high interest to the digital stewardship and preservation communities is certainly worth nominating. Recently, such areas of interest have included DEI initiatives, study on the environmental impact of digital stewardship, and the use of artificial intelligence. Nominations reflecting efforts in such areas have incorporated aspects including: multidisciplinary connections, research and training methodologies, the promotion of integrating diverse perspectives, and strategies to increase awareness of a specific digital preservation challenge. Such efforts have been described as “uplifting while educating,” “improving experience for new digital preservationists through work on documentation, information-sharing, and tools development,” and “actively seeks out venues to spread the message.”
  • Widespread Impact
    • Another type of work worthy of nominating is that which will bring a positive impact to a significant portion of the field of digital stewardship. This impact will often include the characteristics of recognized reusability or adaptability and could be seen via open access to code, guides to a topic or practice, or policies that were developed. It could possibly be achieved through outreach activities or collaborations. Nominations describing such work have noted details such as: “demystifying often-challenging material required for working in digital preservation,” “bolsters others offering leadership and growth opportunities,” “informs digital preservation best practices,” “shaped the design and implementation of open-source software,” and “engaged with the preservation community as speakers, writers, and collaborators.”

These are just a few suggestions on nominating your colleagues and their work. There are certainly more areas, perspectives, and outputs that could be recognized. For more ideas, links to announcements for past winners can be found at the bottom of the Excellence Awards Working Group webpage. Remember, there is no perfect nomination expected by the EAWG. All submissions are received, reviewed, and discussed by all group members equally. Working group members realize that this is an opportunity to celebrate the achievements of our colleagues, and the selection has never been easy. Yet during my time with the group, we have ensured that no final selection has been solidified without the unanimous support of the members.

The EAWG will be seeking nominations again next year. Until then, we will be offering other blogs and video clips to help digital stewards and preservationists better understand our work. We also hope this information will encourage them to nominate their colleagues or themselves. We look forward to your submissions! 

Written by Kari May, Excellence Awards Working Group, Co-Chair

 

The post Submitting a Notable Nomination: Suggestions from the Excellence Award Working Group appeared first on DLF.