Another Way of Knowing: Resisting Eugenic Propaganda Through Community Archiving / In the Library, With the Lead Pipe

In Brief: How do information workers resist the creation of archival “deathworlds”? With rising eugenicist rhetoric in the United States, sites of cultural memory face devastating impacts. These consequences are particularly felt by Disabled and multiply-marginalized communities. This article draws on Disability Justice principles and necropolitical framings to investigate how processes of erasure can be interrupted through active collaboration and critical reevaluation of power-sharing. By supporting alternative forms of knowledge sharing and honoring the lived experience of historically marginalized communities, especially those who have faced forced institutionalization, we hope to craft alternative methodologies that center community involvement and self-determination.

By Jess Petrazzuoli-Gallagher and Ashten Vassar-Cain

Introduction

As early-career community archivists based in the United States, we are entering the archival profession at a time of fracture, lack of funding, communication breakdown, and heightened awareness of our field’s interconnectedness with policy and power. Recently, most of the news centering Libraries, Archives, and Museums (LAMs) has revolved around fear and censorship—a new list of banned books, another exhibit removal, and persistent retaliation from the Trump administration in the form of resource cuts and smear campaigns targeting institutions that refuse to bend to its will. Like many of our colleagues, we feel an overwhelming sense of urgency, guided by the weight of unanswered questions.

Reflecting on our positionality, both authors of this article are Queer and Disabled. We come to this work from research backgrounds, studying the American Eugenics movement and the use of medicalization to justify violence against marginalized bodies. Our work places us between multiple streams of knowledge. On the one hand, we are students in an ALA-accredited library program, meaning we have the support and formalized training that come with proximity to an institution. However, most of our professional work exists outside of academia. As community archivists and activists, our work is inherently relational. It is an iterative series of mistakes, reinvention, and stories shared around tables. Our work also carries with it the lived experience of navigating ableism and violence in our daily lives, including our own experiences of institutionalization and abuse. In our practice, we reject prioritizing knowledge gained in a classroom over knowledge gained through active listening, experience, and engagement. We recognize that Disabled people have had their authority as “knowers” and knowledge producers challenged (Fricker 2007). Susan Wendell describes how “disabled people’s knowledge is dismissed as trivial, complaining, mundane (or bizarre)” (Wendell 120). Our ability to physically access knowledge is often similarly disregarded, as inaccessible buildings further restrict the ability of people with disabilities to participate as full contributors to knowledge. Because of this, we turn toward “cripistemology,” coined by Merri Lisa Johnson and Robert McRuer (2014), as an alternative to academic forms of knowledge.

Before we begin examining the process of eugenic violence and the ways it is continually recreated in our political and professional lives, we want to acknowledge that confronting this violence is far from impersonal. It is often a painful journey, especially for practitioners and community members who have been historically targeted, and those who continue to suffer harm. We struggle with the popular notion that the challenges we are currently experiencing are unprecedented. Rather, we are seeing a reinvigorated commitment to eugenicist rhetoric and policy, which have always been part of the American landscape and are enshrined in our social politics. This article examines our role as community memory workers in bearing witness and interrupting harm.

Our work is guided by Disability Justice principles, scholarship, and activism that moves beyond “rights-based” framings and toward collective action and rejection of all forms of oppression, domination, and exploitation. Because we see Disability as a dynamic axis of politics and identity, we choose to capitalize it in this article when it is used to refer to an identity category rather than as a descriptor. As Leah Lakshmi Piepzna-Samarasinha states in Care Work: Dreaming Disability Justice, “I don’t want to be fixed, if being fixed means being bleached of memory, untaught by what I have learned through this miracle of surviving. My survivorhood is not an individual problem. I want the communion of all of us who have survived, and the knowledge” (Piepzna-Samarasinha 239).

Our positionality informs our ways of documenting memory. It has also led us to imagine and create interventions that challenge power structures within our own archival practice. Through our work with the Pennhurst Memorial & Preservation Alliance’s (PMPA) Community Archive and Special Collections, we are undertaking efforts to document narratives from the self-advocacy movement led by individuals with Intellectual and Developmental Disabilities in a way that prioritizes original voice, honors lived experience, and expands access.

The Modern Eugenics Movement

In the late nineteenth and early twentieth centuries, eugenics was seen as a “scientific” approach to controlling human genetics by limiting reproduction, resulting in forced sterilization, segregation, and systemic abuse and neglect. Eugenics and its connection to scientific racism were used to justify mistreatment on the basis of “perceived impairment.” Eugenics is one way that white supremacy violently asserts itself, claiming a scientific basis for settler colonialism and the expansion of empire. Sociologist Irving Kenneth Zola explained how medical authorities participate in enforcing state violence. In his 1972 essay “Medicine as an Institution of Social Control,” he writes that

the labels health and illness are remarkable ‘depoliticizers’ of an issue. […] By the very acceptance of a specific behaviour as an ‘illness’ and the definition of illness as an undesirable state, the issue becomes not whether to deal with a particular problem, but how and when. Thus the debate over homosexuality, drugs or abortion becomes focused on the degree of sickness attached to the phenomenon in question or the extent of the health risk involved. And the more principled, more perplexing, or even moral issue, of what freedom should an individual have over his or her own body is shunted aside. (500)

The understanding that medicine can be used as a tool to promote settler colonialist aims is sometimes referred to as Medical Imperialism (Schreier). The eugenics projects enacted by the United States resulted in over 70,000 forced sterilizations in the 20th century, though this number is likely larger, as many sterilizations were performed without the informed consent of the individuals subjected to them. This did not happen overnight. It began with extensive campaigns by members of various scientific communities that captured the interest of policy makers, physicians, educators, and the American public. Eugenics bounced among America’s intellectual circles, ingraining itself in medical practice and scholarship. As Edwin Black details in War Against the Weak, the saying “the taint is in the blood” became a prominent precept of the early eugenicists, who claimed that eradication of “undesirable traits” would result in a collectively superior “race,” and thus, collective peace and safety (Black 25). Scholar Marius Turda emphasizes a similar sentiment, examining the self-styled scientific theory of human betterment and planned breeding that eugenicists embraced. In posing biological purity as the nation’s responsibility, “eugenicists dissolved aspects of the private sphere, by scrutinizing and working to curtail reproductive, individual, gender, religious and indigenous rights. The boundary between the private and public spheres was blurred by the idea of public responsibility for the nation and the race, which came to dominate both” (2471). Such widespread anxiety over the “biological deterioration” of the human race captured politicians, doctors, scientists, lawmakers, and educators around the globe, and inspired horrific campaigns of genocidal violence.

The American obsession with surveillance and censorship weaponizes an idealized nuclear family, just as proponents of the early American Eugenics movement did. While libraries, archives, and museums contend with the removal of exhibits, Disability communities fear removal from public life, citing escalations that target community living protections.

In the latest iteration of American Eugenics, the Trump administration has waged a multipronged attack against Disabled Americans. In addition to the dismantling of Diversity, Equity, Inclusion, and Access (DEIA) initiatives, the article “The Trump Administration’s War on Disability” from the Center for American Progress outlines an accelerated erosion of civil rights for Disabled Americans, including:

  • Weakening the government’s ability to enforce civil rights protections and investigate discrimination cases
  • Threatening access to benefits, affordable healthcare, and resources such as social services and community-based living supports
  • Divesting from public health infrastructure amidst an ongoing pandemic that disproportionately affects Disabled people
  • Decreasing employment protections related to disability
  • Attacking public education services offered to Disabled children
  • Advancing Robert F. Kennedy Jr.’s determination to “fight against Autism” and his proposed surveillance of Autistic individuals

Actions taken by the Trump administration are most prominently seen in the July 24th executive order titled “Ending Crime and Disorder on American Streets,” which calls on the Attorney General and Health and Human Services Secretary to

enforce, and where necessary, adopt, standards that address individuals who are a danger to themselves or others and suffer from serious mental illness or substance use disorder, or who are living on the streets and cannot care for themselves, through assisted outpatient treatment or by moving them into treatment centers or other appropriate facilities via civil commitment or other available means, to the maximum extent permitted by law. (Trump, Executive Order 14321)

In combination, these measures constitute a return to “Ugly Laws,” a series of policies in force from 1867 to 1974 that removed Disabled people from public life through incarceration on the basis of “disfigurement.” Historical ugly laws and eugenics legislation reveal two intersecting dimensions of marginalization: the visceral discomfort of a viewing public and the pathologization of “subnormality.” Ugly laws became one metric for policing disabled people, poor people, and people of color in public space, relying heavily on their being perceived as “dangerous,” “immoral,” or “unsightly.” Under these laws, almshouses acted as an alternative sentencing for “unsightly beggars” and “physically unable persons”—marking people in both categories as unworthy of participating in public life, and instead subject to management by the state. State institutions for people with disabilities acted as an expansion of this carceral system, funneling individuals between its iterations, where commitment could extend to the end of an individual’s lifetime (Schweik).

Eugenic rhetoric is also employed in Executive Orders that target LAMs, invoking language of “sanity” and likening reparative descriptions to cognitive distortions (Trump, Executive Order 14253). As information workers, our choices in language, curation, and collaboration have the potential to accelerate the process of erasure and create lasting consequences for the communities depicted in the records we steward.

The Necropolitical Landscape of Memory and LAMs

Achille Mbembe, an anti-colonial Cameroonian scholar, coined the term “necropolitics.” Necropolitics expands on Foucault’s “biopolitics” through a Fanonian lens, anchoring it in opposition to apartheid and occupation. Necropolitics explores notions of oppression and mortality while emphasizing the role of sovereignty as “the capacity to define who matters and who does not, who is disposable and who is not” (Mbembe 2003, 27). Mbembe critically examines sovereignty in relation to biopower, stating that war is how nations exercise sovereignty, enact subjugation, and uphold colonialism. He imagines politics as a “form of war” and asks “what place is given to life, death, and the human body (in particular the wounded or slain body)? How are they inscribed in the order of power?” (Mbembe 2003, 12). Mbembe describes the creation of “deathworlds,” or “new and unique forms of social existence in which vast populations are subjected to living conditions that confer upon them the status of the living dead” (Mbembe 2003, 39-40).

Some deathworlds have physical boundaries, identified by checkpoints and walls. They may take the form of state institutions that house people with disabilities, immigration detention centers, or whatever new construction of the carceral imagination best serves the goals and aims of those in power. They can also be less visible to the naked eye, consumed in the form of “small doses” of death that slowly erode and constrict our personhood. Necropolitics is wielded as “the power to manufacture an entire crowd of people who specifically live at the edge of life, or even on its outer edge — people for whom living means continually standing up to death” (Mbembe 2003, 37-38).

Museums and archives occupy critical positions in this necropolitical landscape, serving as repositories of historical evidence and active creators of cultural memory. Our institutions have the power to enact “archival death” through our curatorial choices. Archival life exists in the tension between preservation and interpretation, between fixed materiality and fluid meanings attributed across time. Archival death becomes particularly insidious through its apparent neutrality—the removal of exhibits, deaccessioning of materials, and reframing of narratives—while achieving the same erasure as more overt forms of violence.

Prominent discussions in the field of digital stewardship and archival processing ask whether we are ready to confront the fact that our professional practices have upheld and facilitated the under-documentation and erasure of Black, Indigenous, Immigrant, Disabled, and Queer communities from the historical record, and what inclusive forms of archiving might look like within our field (Duff 124).

“Archival Death” and Curatorial Power

Archival literature and practice relating to disability have historically focused on medical narratives, accessible practices, and histories authored by those in power rather than the ways in which disability and marginalized populations are documented. These framings often result in a narrow, medicalized representation of people within our collections that fails to capture the complex, intersectional lives of disabled people. Residing primarily in medical libraries and government archives, practices related to archiving disability tend to focus heavily on the interpretation of disability in the context of science and medicine, relying on labels prescribed by medical authorities rather than the social and embodied experience of disability. Kelvin White’s overview of the genesis of the field of archival preservation in “Promoting Reflexivity and Inclusivity in Archival Education, Research, and Practice” provides valuable insight into the field’s standardized practices, which have been shaped by people in positions of power—mainly those who were not from historically marginalized communities themselves (K. White 117).

The intersectional lives of people with disabilities and their interactions with medical violence and eugenics are erased, and the physical archives remain largely inaccessible to disabled patrons and disabled professionals. The record may technically survive, but the life within it does not. Archival death, then, is not only about what is discarded or deaccessioned. It operates equally through what remains, how it is framed, and whether it is actively hidden.

According to Tobin Siebers, disability can be seen as an “elastic social category” that changes depending on the social context, making a singular definition for archival usage difficult. Without a proper theoretical framework like complex embodiment—which Sara White explains as evaluating disability as an experience—archivists and political actors risk inserting their internal biases about disability, eugenics, and state-sanctioned violence against marginalized communities (S. White).

Beth Linker’s “On the Borderland of Medical and Disability History: A Survey of the Fields” expands on White’s historical discourse and links the rise of the American medical system to the documentation of disease-centric medical history that often neglects disability. The academic study of medical history in the U.S. began in the early 1930s—much like the beginning of the professional archival field—largely due to émigré scholars who had been trained in medicine and the humanities in German-speaking Europe. These scholars, including Henry Sigerist, Owsei Temkin, and Erwin Ackerknecht (all of whom spent time at Johns Hopkins), brought with them a deep commitment to continentally-infused theories and ideas, particularly concerning disease. As a result of this intellectual background and the research interests of these influential figures, disease history became the central research aim of the newly founded field of medical history in the United States. This focus was not predetermined but rather a product of the time, as the individuals who shaped the discipline were medical practitioners. The relative neglect of disability within medical history contributed to the emergence of disability history as a distinct field in the late 1990s. “New” disability historians explicitly defined their discipline in contrast to medical history, arguing that the divergences between the two fields defined disability history’s parameters. Disability historians argued that the medical model defined disability solely as a consequence of biological factors—such as congenital or chronic illness, injury, or deviation from a perceived “normal” biomedical structure or function—and sought to “fix” or cure its effects at the individual level.

When disability materials are collected solely through a medical lens—cataloged under disease categories, described in clinical language, stripped of the political and social contexts that gave them meaning—the person is rendered a faceless patient rather than an agent. When it comes to records and legacies of state-run institutions, such as the Fernald State School and Hospital in Massachusetts or the Pennhurst State School and Hospital in Pennsylvania, sketches of the institutions themselves are widely available. The institutions are venerated, retaining a life of their own even after the buildings have been shuttered. However, former residents are reduced to gaps in the historic record. Family members and researchers requesting information about their lives are met with barriers such as incomplete or lost records, restricted access to records, or lack of resident perspectives. In a way, the archive may act akin to the institution itself, dressing harm and isolation in the language of “care,” and participating in epistemic injustice against people with disabilities and their many “ways of knowing.”

In “Documenting Disability History in Western Pennsylvania,” Bridget Malley highlights Helen Samuels’ documentation strategy as “well designed to address gaps in the historic narrative by ‘provid[ing] a useful framework for discussing selection issues,’” particularly with respect to marginalized communities (17). Documentation strategy as methodology “guides selection and assures retention of adequate information about a specific geographic area, a topic, a process, or an event that has been dispersed throughout society” (17). Documentation strategy emphasizes structuring the inquiry and examining the form and substance of the available documentation. This involves actively seeking to understand the documentary universe related to a topic, identifying where documentation exists and, crucially, where it doesn’t exist.

Traditional appraisal practices can inadvertently contribute to archival gaps by reflecting the biases and perspectives of the appraisers, leading to the erasure of marginalized voices like those of people with disabilities. Documentation strategy aims to move away from subjectivity by incorporating a deeper understanding of the topic and the perspectives of those involved. For Malley, the collaboration between Western Pennsylvania Disability History Action Consortium (WPDHAC) members (community experts) and archivists from the Heinz History Center allowed for a merging of archival and community knowledge, leading to more nuanced appraisal decisions and potentially filling gaps based on community-identified needs in the archives.

Archival death was built into the profession’s foundations: the same apparatus that preserved evidence of national progress systematically failed to collect, or actively destroyed, evidence of the state’s violence against its most marginalized citizens. As uncomfortable as it may be to confront, this was not an oversight. It was a feature of archives designed to construct a cohesive national narrative by using seemingly neutral representations of history that avoided critical insights into the past, and something that we must take active steps to remedy. The records relating to institutionalized individuals that do survive from this era—asylum logs, sterilization orders, commitment papers—were authored by perpetrators, not survivors. Even when disabled people appear in the archive, they appear as objects of intervention rather than witnesses to their own lives. When disability materials are appraised, arranged, and described according to the medical model, the archive reproduces the very framework that justified confinement and cure. The disabled person is preserved as a case, not a life; a diagnosis, not a history. This is erasure through categorization—effective in determining who the ‘dead’ and the ‘living’ might become to future researchers, advocates, and communities seeking to understand what was done and what survived.

Community-Led Collecting: Alternative Methods for Protecting Shared History

Community-based archives have emerged as crucial alternatives to mainstream institutions, particularly for documenting the histories of marginalized groups. These archives are often created by and for the communities they serve, prioritizing agency, self-determination, and the preservation of narratives that challenge dominant frameworks. Community archives intentionally subvert the “neutrality” of institutional preservation by centering the values, priorities, and privacy concerns of their communities. Community archives may employ participatory appraisal and curation strategies, working directly with community members to identify, select, and describe materials. This approach seeks to capture the richness and diversity of lived experience, rather than reducing disability or other markers of identity to a social problem. It also reflects the interconnectedness of “ways of knowing.” 

In a 2022 interview, Achille Mbembe explains “the epoch we have entered into is one of indivisibility, of entanglement, of concatenations. Times of concatenation presuppose that our bodies have become repositories of different kinds of risks” (Mbembe 2022). Risk is very present in discussions of disability politics. “Risky bodies,” as described by Hi‘ilei Julia Kawehipuaakahaopulani Hobart and Tamara Kneese, are subjected to coercive forms of care (Hobart & Kneese), under the assumption that those living in the intersections of positionality cannot be trusted as knowers. One path away from paternalistic and colonial imaginings of care and knowledge is community-led intervention.

Some notable sites of community intervention we have encountered in our work include the Living Archives on Eugenics (LAE), The Anti-Eugenics Collective at Yale, and the From Small Beginnings Collective. The Living Archives on Eugenics in Western Canada (LAE) provides us with a realistic vision for the future of community archival practices centered on disability and survivors of the eugenics movement. Working directly with survivors, the project “raised awareness of historical and contemporary manifestation of eugenics [by capturing and disseminating] survivor’s stories.” Interactive collections provide historical context for the eugenics movement in Western Canada and emphasize the bond that was created between curators, archivists, and survivors. The Anti-Eugenics Collective at Yale situates Yale’s campus as the former headquarters of the American Eugenics Society, and its related collections as a site of harm and opportunity for reparation. The collective engages this troubled history through workshops with K-12 educators, students, medical professionals, and the general public. The global collective From Small Beginnings is a group of anti-eugenics activists that helps educators and researchers learn of ongoing efforts to disrupt eugenics in action, and combats isolation by building a network of committed individuals and organizations. These projects meaningfully balance confrontation and collaboration through outreach and information organization.

These initiatives, coupled with the work done by the Disability Archives Lab on centering critical disability studies in archival research and practice, and the recent publication Preserving Disability: Disability and the Archival Profession (Brilmyer & Tang, 2024), inspired us to think about potential places of intervention in our own archival process. We established the Pennhurst Memorial & Preservation Alliance Community Archives in 2024 with support from the Pennsylvania Historic & Archival Records Care (HARC) grant, organizing materials collected by the organization over a span of approximately five decades. Despite being a volunteer-run organization, we wanted to ensure that we could make our records, largely generated by self-advocates with Intellectual and Developmental Disabilities, accessible to the public. Of particular significance is the archive’s Speaking for Ourselves (SFO) collection, which documents the critical role of self-advocacy organization Speaking for Ourselves in the disability rights movement emerging from state institutions in Pennsylvania. By centering self-advocates’ political struggle, agency, expertise, and vision for the future, the archive challenges dominant narratives of Disability, medicalization, and victimhood.

Our process required that we confront power imbalances and collective trauma surrounding documentation and consent. In our previous archival research and practice, we noticed that academic and state archives privileged the voices of medical and institutional authorities. Disabled individuals, especially people with Intellectual and Developmental Disabilities, were excluded from authoring knowledge, instead cast as subjects of research. Our collections contained a unique perspective that was absent from more “traditional” state and academic archives, acknowledging People with Intellectual and Developmental Disabilities as originators of our records and contributors to social change.

After over a year of sitting in on community board meetings and listening to requests from the community for long-term preservation and digital accessibility, we applied for funding through the Council on Library and Information Resources’ “Digitizing Hidden Collections: Amplifying Unheard Voices” grant program, and were awarded funds to run a two-year digitization project. As grantees of this program, we seek to digitize and make accessible over 9,000 items within our collections.

In order to keep ourselves accountable to our mission and the community’s requests, we have implemented a series of strategies to aid us in our intervention. To us, this means prioritizing audio and visual material for digitization. We recognize that many of the self-advocates who originated the materials in our collections did not participate in traditional forms of communication or written record keeping, instead relying on dictation, audio and video recording, and other accessible modes of knowledge sharing and creation. In an attempt to preserve original voice, we are engaging in consultation with surviving advocates depicted in our materials. We are also planning quarterly access consultations with users across disability communities. These consultations will allow us to continually integrate feedback and ensure that the open access metadata and digital exhibits generated through this project are accessible to the widest possible user base.

Planning this project required us to reimagine what our role as archivists might look like. We understand that the self-advocates who originated and are depicted in the collections have had much of their lives documented without consent in the form of institutional medical records. To avoid replicating similar harm, we have decided to forgo a traditional “donor” model, in favor of a “stewardship agreement.” The terms of this agreement allow our archive to take actions necessary for preservation, make collections publicly available, and provide open access to associated metadata, with the understanding that the physical materials are property of Speaking for Ourselves.

We are documenting each step of our process so that we can contribute findings from this project to the larger Disability archival community, so that it may be replicated and expanded upon. Additionally, we designed the project to prioritize intergenerational collaboration, bringing younger members of our community in conversation with the historical context for our current struggles and learning from the voices of Disabled elders. Over the next two years, we are engaging in an iterative learning process, in hopes of fulfilling the request for digital access and making the perspectives and activism of self-advocates known to a wider audience.

Interrupting Erasure: How Archivists Protect the Future

Critically rethinking power and ownership requires institutions and practitioners to develop new methodologies that recognize diverse community members as primary stakeholders, and archives as sites of evolution and growth. This involves moving beyond inclusion toward genuine power-sharing and decolonial praxis. The archive becomes a site of active creation and a path to alternative “ways of knowing,” which in turn resist the creation of “deathworlds.” In “Out of the Dark Night: Essays on Decolonization,” Achille Mbembe affirms that

humanity is to be made to rise [faire surgir] through the process by which the colonized subject awakens to self-consciousness, subjectively appropriates his or her I, takes down the barrier, and authorizes him- or herself to speak in the first person. This awakening and appropriation aim not only at the realization of the self, but also, more significantly, at an ascent into humanity, a new beginning of creation, the disenclosure of the world. (62)

Many Disabled self-advocates have communicated distrust toward record-keeping solely dictated by medical and legal authorities. Disability community archives can demonstrate one avenue for grassroots preservation efforts that maintain disabled people’s ownership of and autonomy over their narratives of survival and liberation, challenge dominant medical narratives that have historically justified confinement, and resist archival death.

The formation of the archival profession and of many LAM institutions is tied to colonial wealth and power. Even after constant re-evaluation of practices and strategic upheaval, LAMs come in contact with a different type of "deathworld" on a daily basis, and participate in decisions that determine what, and who, is remembered. As librarians, archivists, and museum professionals, we are responsible in part for making difficult choices with the collections we steward. We are also responsible to the communities depicted in those records. Writing on medicalization, Irving Zola cautions that "not only is the process masked as a technical, scientific, objective one, but one done for our own good" (502). Though not medical professionals, information workers risk modeling a similar attitude toward disability in our work, and participating in archival erasure under the guise of objective processes. Even as the risk landscape changes, we must recognize that our roles are not neutral. To combat erasure, we must take an active role in interrupting the weaponization of memory against the most marginalized. Interruption can take many forms, and requires ongoing reflection and adaptation.

Drawing on the practical work of building and sustaining the PMPA Community Archive, we are energized in imagining how LAM professionals can engage in active forms of resistance. The sustainability of grassroots archival work depends on the active partnerships between individuals, communities, larger institutions, and solidarity networks across libraries, archives, and museums. Without webs of mutual aid and professional collaboration, we all remain vulnerable to the same political pressures that have compromised larger institutions.

Dealing with histories of medical violence mandates space to grieve. Leah Lakshmi Piepzna-Samarasinha invites grief as an active part of the process, arguing

that feelings of grief and trauma are not a distraction from the struggle. For example, transformative justice work—strategies that create justice, healing, and safety for survivors of abuse without predominantly relying on the state—is hard as hell! What would it be like if we built healing justice practices into it from the beginning? (42)

The stakes of this work extend beyond professional practice. It challenges all of us to consider whose lives matter in the American memory. The PMPA Community Archives imagines community-controlled historical preservation as a form of active survival and maintains originators’ ownership over their narratives of resistance and liberation—narratives that are urgently felt when policies seek to industrialize age-old structures of abuse and reintroduce the ‘legal removal’ of disabled people from the public eye.


Acknowledgements

We are deeply grateful to our Lead Pipe editors, Jess Schomberg and Pam Lach, whose feedback and support helped shape this article. We are especially grateful to our reviewer, Gracen Brilmyer, whose work we greatly appreciate and respect. This work emerged from and belongs to the community of self-advocates and activists who have shaped the Pennhurst Memorial & Preservation Alliance. Their decades of organizing, documenting, and demanding recognition created the conditions for this scholarship to exist. Their lived expertise, activism, and commitment to truth-telling made this research possible. We are grateful to be entrusted with carrying forward their labor of memory and resistance. Any insights here reflect their collective work.


Works Cited

Assistant Secretary for Public Affairs (ASPA). “Secretary Kennedy Appoints New Interagency Autism Coordinating Committee to Advance Fight Against Autism.” HHS.Gov, 28 Jan. 2026, www.hhs.gov/press-room/hhs-kennedy-appoints-new-interagency-autism-coordinating-committee.html.

Black, Edwin. "America's National Biology." War Against the Weak: Eugenics and America's Campaign to Create a Master Race, Dialog Press, 2012, pp. 21–42.

Brilmyer, Gracen, and Lydia Tang, editors. Preserving Disability: Disability and the Archival Profession. Library Juice Press, 2024.

Disability Archives Lab, disabilityarchiveslab.com.

Duff, Wendy, et al. “Investigating the Impact of the Living Archives on Eugenics in Western Canada.” Archivaria: The Journal of the Association of Canadian Archivists, 2019, archivaria.ca/index.php/archivaria/article/download/13701/15099.

Eugenics and its Afterlives, www.antieugenicscollective.org.

Fricker, Miranda. Epistemic Injustice: Power and The Ethics of Knowing, Oxford University Press, 2007.

FromSmallBeginnings, www.fromsmallbeginnings.org.

“Hidden Collections.” CLIR, 28 Oct. 2025, www.clir.org/hiddencollections.

Hobart, Hi‘ilei Julia Kawehipuaakahaopulani, and Tamara Kneese, editors. Radical Care: Survival Strategies for Uncertain Times. Duke University Press, 2020.

Ives-Rublee, Mia, and Casey Doherty. The Trump Administration’s War on Disability. Center for American Progress, www.americanprogress.org/article/the-trump-administrations-war-on-disability.

Johnson, Merri Lisa, and Robert McRuer. "Cripistemologies: Introduction." Journal of Literary & Cultural Disability Studies, vol. 8, 2014, pp. 127–147.

Linker, Beth. “On the Borderland of Medical and Disability History: A Survey of the Fields.” Bulletin of the History of Medicine, vol. 87, no. 4, Dec. 2013, pp. 499–535, https://doi.org/10.1353/bhm.2013.0074.

Living Archive on Eugenics, www.eugenicsarchive.ca.

Malley, Bridget. “Documenting Disability History in Western Pennsylvania.” The American Archivist, vol. 84, no. 1, 1 Mar. 2021, pp. 13–31, https://doi.org/10.17723/0360-9081-84.1.13.

Mbembe, Achille. “Necropolitics.” Public Culture, vol. 15, no. 1, 1 Jan. 2003, pp. 11–40, https://doi.org/10.1215/08992363-15-1-11.

Mbembe, Achille. Necropolitics. Duke University Press, 2019.

Mbembe, Achille. Out of the Dark Night: Essays on Decolonization. Columbia University Press, 2021.

Mbembe, Achille. "Achille Mbembe: Planetary Politics for All Creation." Interview. Noema Magazine, 11 Jan. 2022.

NPR. "The Supreme Court Ruling That Led to 70,000 Forced Sterilizations." Fresh Air, 7 Mar. 2016, www.npr.org/sections/health-shots/2016/03/07/469478098/the-supreme-court-ruling-that-led-to-70-000-forced-sterilizations.

Pennhurst Memorial & Preservation Alliance, preservepennhurst.org.

Pennsylvania State Archives. Historical & Archival Records Care Grant Program, Commonwealth of Pennsylvania, www.pa.gov/services/phmc/apply-for-the-historical—archival-records-care-grant-program.

Piepzna-Samarasinha, Leah Lakshmi. Care Work: Dreaming Disability Justice. Arsenal Pulp Press, 2021.

Schreier, H., and L. Berger. "On Medical Imperialism" (letter). The Lancet, vol. 1, 1974, p. 1161.

Schweik, Susan M. The Ugly Laws: Disability in Public. New York University Press, 2010.

Sins Invalid. “10 Principles of Disability Justice.” sinsinvalid.org/10-principles-of-disability-justice.

Speaking For Ourselves, speaking.org/.

Turda, Marius. "Legacies of Eugenics: Confronting the Past, Forging a Future." Ethnic and Racial Studies, vol. 45, no. 13, 2022, pp. 2470–2477, https://doi.org/10.1080/01419870.2022.2095222.

United States, Executive Office of the President [Donald Trump]. Executive Order 14253: Restoring Truth and Sanity to American History, 28 Mar. 2025, www.whitehouse.gov/presidential-actions/2025/03/restoring-truth-and-sanity-to-american-history/.

United States, Executive Office of the President [Donald Trump]. Executive Order 14321: Ending Crime and Disorder on America’s Streets, 24 July 2025, www.whitehouse.gov/presidential-actions/2025/07/ending-crime-and-disorder-on-americas-streets/.

Weinberg, Hannah. “Tracking the Trump Administration’s Attacks on Libraries.” American Libraries Magazine, 1 May 2025, americanlibrariesmagazine.org/2025/03/19/tracking-the-trump-administrations-attacks-on-libraries.

Wendell, Susan. "Toward a Feminist Theory of Disability." Hypatia, 1989.

White, Kelvin L., and Anne J. Gilliland. “Promoting Reflexivity and Inclusivity in Archival Education, Research, and Practice.” The Library Quarterly, vol. 80, no. 3, July 2010, pp. 231–248, https://doi.org/10.1086/652874.

White, Sara. "Crippling the Archives: Negotiating Notions of Disability in Appraisal and Arrangement and Description." The American Archivist, vol. 75, no. 1, Apr. 2012, pp. 109–124, https://doi.org/10.17723/aarc.75.1.c53h4712017n4728.

Zola, Irving Kenneth. "Medicine as an Institution of Social Control." The Sociological Review, vol. 20, 1972, pp. 487–504, https://doi.org/10.1111/j.1467-954X.1972.tb00220.x.

2026-04-15: ACM Capital Region Celebration of Women in Computing (CAPWIC 2026) Trip Report / Web Science and Digital Libraries (WS-DL) Group at Old Dominion University



This year the ACM Capital Region Celebration of Women in Computing (CAPWIC 2026) was held from March 27–28 as an in-person event. The conference took place in Alexandria, Virginia, and was hosted by Virginia Tech's Institute for Advanced Computing (IAC). CAPWIC is all about bringing together women in computing and their peers to support each other and grow in the field. The conference connects students, faculty, and industry professionals from across the Capital Region, from Pennsylvania to Virginia, to share ideas, discuss research, and build a strong, supportive community.


The conference featured workshops, technical talks, flash talks, research shorts, and poster presentations, as well as panel, birds-of-a-feather, and keynote sessions. I was the only participant from Old Dominion University's Web Science and Digital Libraries (WS-DL) research group this year, and I presented a research short. The event included parallel sessions across various categories and topics, and I attended sessions from each category.


Conference Venue: Institute for Advanced Computing (IAC), Virginia Tech, Alexandria, Virginia


Day 1: March 27, 2026 


The first day of the conference began with a campus tour and a graduate/career fair; Day 2 also included the campus tour and graduate/career fair for those who missed them on Day 1. This was followed by opening remarks and dinner. Next, the first keynote was delivered, and the day concluded with closing remarks.


Campus Tour: Drone Display and Immersive Visualization Lab Visit


The Institute for Advanced Computing (IAC) of Virginia Tech is a research institute located in Alexandria, Virginia. The institute offers hands-on learning opportunities for graduate students in computer science and computer engineering. Specialized labs are available at the institute for research in immersive visualization, drone technology, wireless, quantum, and brain-inspired computing systems. We got the opportunity to visit the Drone Lab and Immersive Visualization Lab.


The Drone Lab featured an indoor drone cage used to conduct flight experiments in a controlled and safe environment. The lab team introduced us to the fundamentals of unmanned aerial system technology and shared insights into their ongoing research. One of the interesting discussions was about how they are trying to detect commercial off-the-shelf (COTS) drones, which can be used for attacks or unauthorized surveillance. They also gave us a chance to fly drones inside the cage, which was a really fun and thrilling experience.


The Immersive Visualization Lab provided immersive projection on three walls and the floor, allowing users to be fully immersed in visual representations of data and other phenomena. We had the opportunity to experience a virtual walk through a beautiful garden, which felt truly magical. It was amazing to see how visual design and 3D modeling can bring environments to life and let us explore places we would not normally be able to experience in person.


Graduate/Career Fair

The sponsors of the event organized a graduate/career fair for the attendees. There were representatives from ACM Women in Computing (ACM-W), Virginia Tech's Computer Science Department, Virginia Tech's Sanghani Center for Artificial Intelligence & Data Analytics, University of Mary Washington's College of Business, Northeastern University's Khoury College of Computer Sciences, and Women in CyberSecurity (WiCyS). They shared information about graduate programs and career opportunities in research and academia, and also provided valuable feedback on resumes. I had the opportunity to interact with several representatives, which helped me better understand potential career paths in academia and research.


Opening Remarks

After the campus tour and graduate/career fair, the conference began with the opening remarks from the organizers. The head of Virginia Tech's computer science department, Christine Julien, welcomed everyone to the conference. The organizing chairs, Sehrish Basir Nizamani from Virginia Tech and ODU alumna (PhD, 2004) Mona Rizvi from James Madison University, provided an overview of the tracks in each category and the conference schedule. The program included 2 panels, 5 workshops, 8 technical talks, 12 flash talks, 22 research shorts, 43 posters, and 1 birds-of-a-feather session, all conducted across parallel sessions.


Keynote #1: Tools in Your Toolbox: What I've Learned as a Professional Female Computer Scientist 

Christine Julien introduced the first keynote speaker, Laurian Vega, a Senior System Engineer at Booz Allen Hamilton. She shared the skills and expertise she developed throughout her career as a female computer scientist. One of the key takeaways from the keynote was that no effort is ever wasted as long as you learn something from it. The speaker talked about how important it is to invest in soft skills and to build strong networks. She also encouraged us to choose workplaces that align with our values and treat us well. A point I found especially meaningful was that mental health is just as important as physical health. The speaker also emphasized caring about the work we do and using our skills to give back to the community. Finally, she highlighted that a PhD is not about becoming an expert in everything, but about continuously learning and growing. Overall, the talk highlighted that success in computing is not just about technical ability, but about developing a balanced toolkit that supports both personal and professional growth.


Day 2: March 28, 2026 


Day 2 started off with breakfast, followed by the second keynote. Before the lunch break, two parallel sessions were held. During the first session, I attended a workshop and a technical talk. In the second session, I attended flash talks and another workshop. After the lunch break, there were two more parallel sessions; I attended a panel and a poster session in the third and research shorts in the final one.


Keynote #2: Goodbye Imposter, Hello Winner: Overcoming Perceptual Expectations to Reclaim Excellence

Erika Olimpiew from Virginia Tech introduced the second keynote speaker, Candace Aku, a Senior Technical Program Manager at Google Public Sector. She shared her journey in the tech industry and the process of creating a professional identity beyond others' expectations. The speaker emphasized the importance of not limiting one's potential before even starting a career, and of reflecting on whether actions are driven by personal goals or others' expectations. The speaker also highlighted how constantly chasing the "next" can lead to burnout, reminding us of the need to prioritize well-being. The discussion on the weight of expectations such as intelligence, imperfection, fear of failure, and judgment was particularly insightful, as these factors often contribute to imposter syndrome. The keynote was highly motivating, encouraging individuals to embrace their identity, challenge limiting beliefs, and grow without sacrificing their well-being.


Workshop: Debugging Your Resume

Aubrey Baker, an eCommerce Web Developer from Red Van Workshop, and Holly Wilsey, a Video Game Engineer from Purple Basil Games, organized a hands-on workshop on preparing resumes. The session was chaired by Nguyen Ho from Loyola University Maryland. They shared best practices on customizing resumes for various purposes, such as academic work, internships, employment, and volunteer experiences. Participating in this workshop gave me a better understanding of how Applicant Tracking Systems (ATS) screen resumes before they are seen by a human. I learned how small details in formatting and wording can impact visibility, and how to avoid common mistakes that can weaken a resume. The breakout sessions were especially helpful, as we got to review and improve our resumes in a group while receiving useful feedback.


Technical Talk: Education & Inclusion


Denise D'Angelo, a Transformation Technology Leader at DynamicD Enterprises, presented a technical talk, "Designing the AI-Ready Workforce," as part of the Education & Inclusion track. The session was chaired by Mohammed Farghally from Virginia Tech. The speaker offered valuable insights into how to approach AI-enabled work more thoughtfully, especially in the context of hiring. As AI becomes part of our everyday work, traditional ideas about roles and performance are evolving, influencing both opportunities and trust. She explained that being "AI-ready" is not just about knowing how to use AI tools, but about understanding how people and AI systems work together. The talk was very helpful for understanding how to prepare for interviews in an AI-driven hiring landscape.


Flash Talk: Trust, Fairness, and Societal Impact of AI


Mona Rizvi chaired the flash talk session on the Trust, Fairness, and Societal Impact of AI track. 

As the first presenter, Sadia Afrin Mim from George Mason University presented “LLM-Guided Input Generation for Causal Fairness Testing.” Current fairness testing methods in machine learning systems often create unrealistic test cases by ignoring how features relate to each other in real-world situations. To address this limitation, the presenter introduced a new approach that uses large language models (LLMs) to generate more meaningful and context-aware test inputs. 

Next, Arshnoor Bhutani and Mahi Sanghavi from the University of Maryland, College Park presented "Data-Driven Exploration of Physiological Factors Perpetuating Bias in Pulse Oximetry Readings for At-Home Use." They examined bias in pulse oximeters, which are used to measure blood oxygen levels, and found that skin tone remains an important contributor to this bias. They analyzed the BOLD dataset and identified a clear pattern showing that errors increase as skin tone gets darker. Their work aims to better understand these patterns so that corrections can be developed to help reduce health disparities.

Next up, Khoulood Alharthi from Virginia Tech presented “Gender, Culture, and Privacy: Navigating Social Media Concerns in Saudi Arabia.” She explored how privacy concerns on social media are shaped by culture and gender, focusing on users in Saudi Arabia. She emphasized that privacy is not only determined by platform features, but deeply influenced by users’ social and cultural values such as modesty, reputation, and social boundaries. Her work provided valuable insights into how social media platforms and privacy settings can be designed to better align with users’ cultural expectations.

Christopher Parham from Virginia State University presented the next flash talk, “A Trust-Aware, Biometrically-Secured Social Network Using Decentralized Identity Protocols and the Analytic Hierarchical Process for Collaboration.” His talk focused on improving security and trust in online systems by addressing human factors in cybersecurity. He proposed a novel decentralized method that creates one-time biometric features to prevent attacks like replay or misuse of credentials. His work aims to create a more secure, user-friendly authentication framework that supports reliable and trust-based collaboration.

Saanvi Shashikiran from Georgetown University presented the last flash talk, “Understanding State-Level AI Readiness Policy.” She explored how prepared different U.S. states are for adopting AI, focusing on the role of policies, infrastructure, and support systems. She examined five states and analyzed government documents to understand how policies regulate AI use. She discussed how text-based search methods can be used to identify policies relevant to AI readiness, and found that a specific scoring method (BM25) performed most effectively.
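For context on the BM25 scoring method mentioned in the last flash talk, here is a minimal from-scratch sketch of Okapi BM25 ranking. The toy documents, query, and parameter values are my own illustration, not material from the study:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    # Document frequency: how many documents contain each term
    df = Counter()
    for d in tokenized:
        for term in set(d):
            df[term] += 1
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

docs = [
    "state policy on artificial intelligence readiness",
    "highway infrastructure maintenance report",
    "artificial intelligence guidance for state agencies",
]
print(bm25_scores("artificial intelligence policy", docs))
```

Documents sharing more (and rarer) query terms score higher, which is why BM25-style retrieval works well for pulling AI-relevant policy documents out of a large corpus; k1 and b control term-frequency saturation and document-length normalization.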

 

Workshop: Cyber Hygiene That Sticks: Research in the K-12 Space on Cybersecurity


Deborah Kariuki, an Assistant Teaching Professor at the University of Maryland, Baltimore County, led a workshop on how cybersecurity in K-12 education can be improved through more interactive and human-centered approaches. She emphasized that relying only on rules and one-time sessions is not enough to strengthen cyber hygiene. She demonstrated how interactive activities built around real-world scenarios can help students effectively recognize threats like phishing, weak passwords, and risky data-sharing online. She also talked about the broader efforts of organizations like WiCyS in actively promoting cybersecurity education and awareness, helping students build confidence and lasting digital safety habits.


Panel: Navigating the Path to Grad School: Discuss, Reflect, and Make an Informed Decision


Mohammed Seyam from Virginia Tech moderated a panel session that provided a reflective perspective on what it is truly like to pursue an advanced degree and how to decide whether it is the right path. The panel featured Madison Barton, a graduate admission counselor at Northeastern University; Mohammed Farghally, a collegiate assistant professor at Virginia Tech; Promise Owa, a graduate student at Northeastern University; and Chandani Shrestha, an assistant professor at James Madison University. The panelists shared their own journeys, including their uncertainties, key turning points, and lessons learned, while addressing common questions such as:


  • Does taking a year off or doing a job before starting grad school have any impact?

  • What are the key factors to choose grad school? Funding? Research facilities?

  • How much research interest matters to sustain throughout grad school?

  • What are some challenges for women in CS grad school?

  • How to deal with imposter syndrome during grad life?


The session was very encouraging for the participants to think intentionally about their goals, interests, and readiness for grad school. 


Posters: Cybersecurity, Privacy & Responsible AI


Jessica Zeitz from University of Mary Washington chaired the poster session on the Cybersecurity, Privacy & Responsible AI track.


Fairuz Nawer Meem from George Mason University presented two posters. One poster, titled "Hope or Hype? Understanding Vibe Coding through Software Practitioner Discussions," analyzed online discussions to understand how developers' opinions about "vibe coding" changed over time. Another poster, "Well-Being in AI-Assisted Software Development," showed an experimental study on how using AI tools affects developers' stress, emotions, and overall well-being while coding, compared to coding without AI. Sadia Afrin Mim, also from George Mason University, presented the poster "Towards Practical Discrimination Testing for Software Systems," discussing a user study evaluating whether fairness-specific tools, along with AI support, help developers find and understand bias in software more easily.


Min Zhang from Virginia Tech proposed a way for smaller local AI models to get help from powerful remote models while protecting sensitive data in her poster "PrivacyR1: Privacy-Preserving Collaborative Reasoning in Multi-Agent Systems." Jennifer Alexandra Thompson, also from Virginia Tech, presented the poster "Exploring Socioeconomic Status Narratives of Computer Science Students," exploring how a student's socioeconomic background affects their access to technology and success in computer science education. Another student from Virginia Tech, Kimberly Giordano, presented the poster "Beyond the Android Manifest: Analyzing Native Libraries and Eye-Tracking Use in Virtual Reality Applications," showing an analysis of how VR apps use eye-tracking data, based on Android Manifest evaluation and native code inspection. She found that some apps may access data without clearly informing users.

Rebecca George from the College of William & Mary presented a performance evaluation of a new storage system (DAOS) on a large supercomputer, identifying the best ways to optimize data reading and writing for faster performance, in her poster "Benchmarking DAOS Filesystem on Aurora." Zahra Rizvi, also from the College of William & Mary, presented the poster "Bridging the AI Education Gap: A Self-Funded AI Awareness Initiative in Cocoa-Farming Villages of Ghana," introducing an initiative that teaches basic AI concepts to students in rural Ghana and aims to expand access to AI education in underserved communities.

Susan Zehra, a PhD student and senior lecturer from Old Dominion University’s CS department, presented her poster “Securing Vehicular Ad Hoc Networks (VANETs) Against Cyber Threats,” proposing a decentralized security system to protect vehicle communication networks and showing it can effectively detect and prevent cyber attacks.


Research Shorts: Cybersecurity, Trust & Resilience

Nareman Hamdan from James Madison University chaired the research short session on the “Cybersecurity, Trust & Resilience” track.

The first presenter, Stephanie Travis from Virginia Tech, presented “Identifying Human Factors in Red Teams for Cyber Exercises.” She focused on making cybersecurity training more realistic by considering how real attackers think and behave. She studied existing works and gathered insights from experts to create a set of human behavioral factors to incorporate in cyber defense exercises in simulations. She found that by improving how red teams simulate attacks, the training can better reflect real-world situations.

Next, I presented "Framework for Finding Attribution of Social Media Screenshots." Sharing screenshots on social media platforms is now common. I pointed out legitimate reasons why people share screenshots, such as enabling cross-platform sharing or preserving evidence of deleted posts. Then, I showed how people can easily create fake tweets and share such screenshots on social media platforms. Next, I demonstrated different ways the live web and web archives can be used to find the attribution of screenshot content. I emphasized using web archives to find the attribution of deleted posts, since they cannot otherwise be found on the live web. Lastly, I shared my evaluation results for an automated process that finds the attribution of a screenshot using the Wayback Machine.
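As background, the Wayback Machine exposes a public availability endpoint (archive.org/wayback/available) that returns the closest archived snapshot of a URL. The sketch below shows one way to query it; this is only an illustration of the lookup step, not the framework presented in the talk, and the function names are my own:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

WAYBACK_API = "https://archive.org/wayback/available"

def build_query(page_url, timestamp=None):
    """Build the availability-API URL for a page, optionally biased to a date."""
    params = {"url": page_url}
    if timestamp:  # e.g. "20200315" prefers snapshots near that date
        params["timestamp"] = timestamp
    return f"{WAYBACK_API}?{urlencode(params)}"

def closest_snapshot(response_json):
    """Extract (url, timestamp) of the closest available snapshot, or None."""
    snap = response_json.get("archived_snapshots", {}).get("closest")
    if snap and snap.get("available"):
        return snap["url"], snap["timestamp"]
    return None

def lookup(page_url, timestamp=None):
    """Query the live API (requires network access)."""
    with urlopen(build_query(page_url, timestamp)) as resp:
        return closest_snapshot(json.load(resp))
```

A snapshot found this way can then be compared against the text in a screenshot to support or refute its claimed attribution.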


Next up, Xinyi Zhang from Virginia Tech presented “From Vulnerable to Resilient: Examining Parent and Teen Perceptions on How to Respond to Unwanted Cybergrooming Advances.” Cybergrooming is a harmful online behavior that can affect teens’ mental health and physical safety. The presenter studied how teens and parents react to different scenarios and identified behaviors that can either increase risk or help protect against harm. By analyzing these responses, she developed patterns of both vulnerable and protective actions to better support teens through education and tools that would encourage safer online behavior. 

Yeana Bond from Virginia Tech was the last presenter and discussed improving how metadata-related bugs are detected in Java applications in “Towards Large Language Model-Powered Automation of Detecting Metadata Related Bugs.” Misuse of metadata can cause severe issues in Enterprise Applications written in Java, so her goal was to make debugging metadata problems more efficient with the help of AI. By comparing different AI models, she found that newer models produce more accurate and complete rules. 


Keynote #3: Human-Centered Automation: A Journey through HCI, AI, and the Future of Robotics

Jessica Zeitz from University of Mary Washington introduced the final keynote speaker Meg Dickey-Kurdziolek, a UX Lead/Senior Staff UX Researcher at Intrinsic.ai. She shared her journey from her PhD at Virginia Tech to her current role at Intrinsic.ai, and how her understanding of human-centered design has changed along the way. She provided valuable insights into the evolving role of UX in the age of AI and robotics. She discussed the challenges of making complex robotic systems more user-friendly. She talked about how Explainable AI (XAI) helps people better understand and trust these systems as AI becomes part of our everyday life. In summary, the speaker highlighted how HCI principles can guide the future of automation and provided useful insights into navigating this rapidly evolving field. 


Closing Remarks and Award Ceremony


The conference concluded with acknowledgments to the sponsors and a vote of thanks to all participants and organizers, followed by an award ceremony. The organizing committee delivered closing remarks and announced that CAPWIC 2027 will be held at James Madison University in Harrisonburg, Virginia. They also introduced the organizing chairs for the upcoming conference. Next, the awards for ‘Best Research Short’, ‘Best Flash Talk’, and ‘Best Poster’ were announced in both graduate and undergraduate categories, along with honorable mentions for each category.

I was delighted to receive the 'Best Research Short' award in the graduate category for "Framework for Finding Attribution of Social Media Screenshots."


The awardees for Best Flash Talk and Best Poster (graduate category) are listed below:

  • Best Flash Talk – “Benchmarking and Advancing Generative Models for Calorimeter Shower Simulation” by Farzana Yasmin Ahmad from the University of Virginia
  • Best Poster – “Well-Being in AI-Assisted Software Development” by Fairuz Nawer Meem from George Mason University


Wrap-up


CAPWIC indeed provides a supportive and encouraging platform for sharing ideas while fostering meaningful opportunities for both personal and professional growth. This was my first time attending the CAPWIC conference in-person. It was a great opportunity for me to connect with researchers, students, and professionals across different areas of computing. I would like to express my sincere gratitude to ODU ACM-W for providing travel support to attend this conference. I also had the wonderful opportunity to stroll through one of the oldest areas in the U.S. – Old Town, Alexandria. I was mesmerized by the brick sidewalks, cobblestone streets, historic townhouses, cherry blossoms in bloom, and the beautiful sunset views along the waterfront. It was refreshing to relax after a full day at the conference.



Previous trip reports for CAPWIC by WS-DL members: 2025, 2015.


---- Tarannum Zaki (@tarannum_zaki)

 

Finding AI Learning resources for Library Professionals / Artefacto

We’ve recently added a new starter curriculum for AI to our Libraryskills.io platform – a space dedicated to highlighting and signposting great, free learning resources for and by library professionals.  AI is one of the most talked about topics in libraries right now. And it has particular relevance for library and information professionals for a [...]

Continue Reading...

Source

Angels in America / David Rosenthal

I have wanted to write this post for a long time, but I was waiting until I could visit the invaluable Royal National Theatre Archive to check my memory of their early productions. It doesn't look like I'll be in London any time soon, and I have the time now to write a long post about a long play, so here goes.

Growing up in London meant that theatre has always been an important part of my life. I have seen a great many plays including some legendary performances and magnificent productions, such as Royal National Theatre's 2014 King Lear. One of my particular theatrical interests is long-form plays. Highlights of this genre have included:
Play Text
But there is one such play that is very special to me, Tony Kushner's 7+ hour Angels in America. It is clearly among the greatest plays of the 20th century. I was there at the beginning, and I have seen many productions since. Below the fold I recount my history with this masterpiece.

Introduction

Anyone interested in this play should read both the text of the two halves, Millennium Approaches and Perestroika, and Isaac Butler and Dan Kois' magisterial and comprehensive oral history, The World Only Spins Forward: The Ascent of Angels in America. Because my story starts in 1991 I have used both to refresh my memory. Below, the many quotes without links are from Butler and Kois, to whom I owe a debt of gratitude. I also viewed the National Theatre's 2017 production on the National Theatre at Home streaming service.

When I moved to the Bay Area in 1985 it was a decade since I'd lived in London and I was starved of theater. So I went a bit nuts and over the next few years subscribed to American Conservatory Theater, Berkeley Repertory Theatre, the Magic Theater and the Eureka Theater.

Eureka Theatre (1991)

The story of the play starts with a $50,000 grant from the National Endowment for the Arts for Tony Kushner to write a "two-hour play, with songs" for "five gay men and an angel" that the Eureka would produce. In 1989 the play was developed and in 1990 workshopped at the Mark Taper in Los Angeles.
KUSHNER: I wrote the part of Harper for Lorri Holt, Hannah for Abigail Van Alyn, Sigrid [Wurmschmidt] was the Angel. And Jeff King, I wrote the part of Joe for him. And that took care of the Eureka company. My first year at NYU, I became friends with Stephen Spinella. I thought then, as I think now, that he was one of the most remarkable actors I'd ever met, and I loved writing for him, and so I wrote Prior Walter for him.
As a subscriber to the Eureka I had responded to their call for donations to stage Angels in America in their next season, so I was anxious to see it. By the time it arrived at the Eureka it had evolved into two long plays with five gay men, two women, an angel, and no songs.

I believe I saw Millennium Approaches the weekend after it opened, and Perestroika the following weekend. The cast was different from that at the Mark Taper. Rick Frank (Roy) and Sigrid Wurmschmidt (Angel) had both died, and Lorri Holt had a new baby. It was:
  • Hannah: Kathleen Chalfant
  • Roy: John Bellucci
  • Joe: Michael Scott Ryan
  • Harper: Anne Darragh
  • Belize: Harry Waters Jr.
  • Louis: Michael Ornstein
  • Prior: Stephen Spinella
  • Angel: Ellen McLaughlin
The Eureka was staging Millennium Approaches, a four-hour play full of scene changes and magic, with almost no money. So another abiding memory is that they got this enormous impact with an incredibly stripped-down production:
[Ellen] McLAUGHLIN: Not that many people saw the Eureka version of it, but it was very important to those who did. I think there was a kind of beauty to the hammer and nails and spit and Scotch tape quality of that first version. It was moving because we had nothing.
In some ways it reminded me of the San Francisco Mime Troupe's annual free shows in parks around the Bay Area. The same quality of conspiring with the audience's imagination:
KATHLEEN CHALFANT: It was in some ways the most beautiful version of the play, and the most Poor Theater version of the play.
[Dennis] HARVEY: They basically had a giant shower curtain in front of the stage. For scene transitions they would just whip the shower curtain across, one actor at the front and one at the back, and when they got to the other side it would be a new scene.
KUSHNER: To this day no one has ever done better with the magic. David [Esbjornson] is incredibly clever designing and building gizmos, so every magic trick in the play, David figured out a way to do it. There was no money or anything. He built all this shit — it was incredible.
DEBORAH PEIFER: That sense of amazement of a book popping up out of the floor in flames, all done with lighting.
KUSHNER: He did it all with bungee cords.
My most abiding memory of that first part was walking out of the theater to my car after midnight realizing I had seen the birth of a masterpiece. Theater critic Deborah Peifer sums up my reaction:
PEIFER: I have never in my life seen a situation in which people did not leave the theater during the intermission unless they had to. And I'm not talking about Can I get a cup of coffee? but Can I make it through the next act without a bathroom break? People could not bear to be out of that theater while this thing was happening.
To call this a brilliantly realized, profoundly funny, wickedly thoughtful piece of theater is to discover the severe limitations of language. I find myself wanting to say simply, it's more than I ever imagined. This is an experience in the theater you will remember for your whole life.
Deborah Peifer, Bay Area Reporter, May 30 1991
Perestroika was even more stripped-down, little more than a staged reading:
KUSHNER: Originally, every act of the five acts of Perestroika started with a clown scene set in the Soviet Union. These ended up being the first five scenes of my play Slavs! [1994].
ESBJORNSON: I used the five Bolsheviks as curtain raisers. I made the actors hold the scripts in hand while they moved around. And then at one point in each act, they laid down their scripts and acted out what I considered to be the central point of that act.
It wasn't just that there were five acts, but each of them was rather long. Butler and Kois' description of the first night matches my later recollection of how long it was:
[Brian] THORSTENSON: It got to the scene between Hannah and Prior where Prior's in the hospital and Prior says "I've always depended on the kindness of strangers." They finished the scene and the audience erupted into this ... applause ... I think it lasted a good five minutes. Kathleen and Stephen looked out at the audience, like, What is going on?
McLAUGHLIN: I came out late into the evening as the Angel wearing the wings and the whole get-up, stood in front of the curtain and said, Act 5: Heaven, I'm in Heaven.
And the woman in the front row said "Act FIVE?! Oh my GOD! DO YOU KNOW WHAT TIME IT IS?!"
And I said "No". Because I honestly had no idea. It's not like I was wearing a watch.
And she said "It's MIDNIGHT, for God's sake! What's going on with the playwright? ACT FIVE? How long is it?"
And I said, "We've never done it so I don't know, maybe forty-five minutes?" And she said, "The buses aren't even running anymore! How are we supposed to get HOME?" And she turns to the rest of the audience and says, "Are we going to stay?" And people sort of nodded and mumbled and she says "Well, I guess we'll stay, but I mean really ..."
And then she said, "But that's the end, right? There isn't an Act 6 or something?"
And I said, "Well, there's an epilogue."
And she said, "Oh my GOD, is he NUTS? An EPILOGUE? How long is THAT?"
And she said, "Well, apparently we HAVE TO STAY, but this is RIDICULOUS. TELL HIM HE HAS TO CUT!"
And then I said "Well, the longer we keep talking here ..."
Millennium Approaches was a real play and, despite being over four hours, had the audience in the palm of its hand with rapt attention. Perestroika was really different. Because it was clearly a work-in-progress, the audience felt that they were part of the process of creation, willing the show into existence.

Sometimes at the Berkeley Rep's Ground Floor residency program for new work, the teams show their work — an example was Julia Cho's Aubergine, which I saw both as a work-in-progress at the Ground Floor and the next year in the Rep's season. Even as works-in-progress these shows are way shorter and way more polished than this Perestroika, and there is none of that show's unique, intense audience involvement. Of course, as the Angel notes, this was heightened by the show's length:
McLAUGHLIN: And then after the show, as the actors were basically limping to the dressing rooms, Tony, looking sort of glassy-eyed, came over to us and said, "You know, a really interesting thing happens after an audience has been in the theater for a really long time, they start to lose their bearings and become very malleable. They, like, forget what they think they believe about things and what they do for a living and their names and where they live and ..."
And we were like, "Yeah, Tony, and you really have to cut it."
It was magnificent but it killed its host. Butler and Kois quote the Eureka's business manager:
ANDY HOLTZ: That was the end of the Eureka Theatre as a producing company. The play that cemented the Eureka's place in the history of American theater was also the play that was too epic for such a small company. It's, like, the mom died giving birth to this amazing baby.

Royal National Theatre (1992)

Perhaps the most astonishing thing in the play's whole history is that, apart from a workshop at Juilliard, the next production of Millennium Approaches was at the National Theatre in London. At the time, the National Theatre's productions on their two big stages, the Olivier and the Lyttelton, were pretty conservative, as befits the national flagship. But they also had the Cottesloe (now the Dorfman). It is essentially an empty cube, with tiers of seats on two sides. It can be configured in many different ways. For example, for Sing Yer Heart Out For The Lads most of the floor was arranged with tables and chairs, with the audience there being some of the patrons of the pub.

The National Theatre has a history of more adventurous productions in the Cottesloe; it opened with Ken Campbell's Science Fiction Theatre of Liverpool's Illuminatus Trilogy featuring drugs, satanic rituals, blasphemy and nudity. The trilogy later moved to The Roundhouse, which is where I saw this marathon. My main memory was that between the plays meals were served in the lobby. The actors ate with the audience, staying in character.

Nevertheless, Richard Eyre, the artistic director, took a huge risk:
RICHARD EYRE: Gordon Davidson sent me the play and said, "I think you'd be interested in this". By page 2, I'd decided I wanted to do it.
He chose Declan Donnellan of the Cheek by Jowl theatre company to direct it, and Nick Ormerod, Donnellan's partner, to design it. I'd seen several Cheek by Jowl productions at the National Theatre. They did classical plays, so Kushner took them to New York:
DONNELLAN: Sometimes when you see images of New York, you think Oh, it's not authentic New York. It's performed New York, from movies and television. But when you get to New York, you find that New York is performing itself. Everybody is ready for their close-up.
ORMEROD: In delis and diners and whatever, they act like New Yorkers they've seen in the movies.
The cast was:
  • Hannah: Rosemary Martin
  • Roy: Henry Goodman
  • Joe: Nick Reding
  • Harper: Felicity Montague
  • Belize: Joseph Mydell
  • Louis: Marcus D'Amico
  • Prior: Sean Chapman
  • Angel: Nancy Crane
NT's 1993 Angel
David Milling was the stage manager:
DAVID MILLING: The staging was incredibly simple. It was a shiny black floor and a giant American flag as the backdrop. And then in the center of the flag there were small doors for pieces of scenery to run through. Only at the end of the play did the flag split, half going left, half going right, and the Angel tracked through in a cloud of smoke.
I'm sure that the first thing everyone who saw the show remembers is the shock at the end as the Angel bursts through the flag with a huge noise, lots of smoke and a blinding light, then announces:
ANGEL: Greetings, Prophet;
 The Great Work begins;
 The Messenger has arrived.
(Blackout.)
But the start was almost equally memorable:
JON MATTHEWS: It opened with this image, there was nothing on the stage, and the furniture is on the sides, and they're sitting along the sides, and there was this balloon globe, and it had this light inside it, and they all put their hands on it, and then the play began.
Donnellan said "My production was very much about the maintenance of tension", and I remember the production as a headlong charge forward:
KUSHNER: Caryl Churchill saw one of the early performances and came up to Declan afterwards and said, "Well congratulations, you've solved the short, choppy scene problem." When you do a play with short scenes, the scene ends, the audience has to disengage from where they've just been, and open themselves up to the next thing. That's hard to do because it involves stopping and starting over and over again. What Declan did is he dovetailed the ends of almost every scene in Millennium. He took the penultimate and the ultimate line, separated them, took the first line of the next scene and put it between the two. So you'd already be in the next scene. He wove them all together.
Donnellan could do this because the staging was so sparse that it needed no time for scene changes. The actors carried in whatever props were needed for the next scene, and carried off those from the preceding scene.

It is important to understand both the risk the National Theatre was taking, as an institution supported by the government, and why it was so important, especially to the theatre community:
GARSIDE: The politics of it hit on the right moment. We were having our side of the conservative 1980s with Thatcher and the special relationship with Reagan. There was a kind of resentment of America, a dislike of their politics and how it intersected with our politics. And then there was an audience who hadn't seen a play about gay men and AIDS on a large scale, for whom the play was a revelation.

The big legal fight in gay rights at the time was against something called Section 28, which was in effect between 1988 and 2003 and barred the "promotion" of homosexuality.

Royal National Theatre (1993)

The next year both parts opened on Broadway and the National Theatre revived Millennium Approaches and added Perestroika in repertory. For the first time, I saw both parts in one day.
MYDELL: So we opened at the National, and you could see Part 1 and Part 2 in one day. That was seven and a half. People did it! We did it, and people came to see it! It didn't seem like — it felt like it was an event more than a play.
The cast was:
  • Louis: Jason Isaacs
  • Belize: Joseph Mydell
  • Angel: Nancy Crane
  • Joe: Daniel Craig
  • Hannah: Susan Engel
  • Harper: Clare Holman
  • Prior: Stephen Dillane
  • Roy: David Schofield
Part 1 was familiar, but it was the first time I'd seen Part 2 staged. First, seeing them as a seven and a half hour marathon was a revelation. Millennium ends with the mother of all cliff-hangers as the Angel arrives. Resuming the story after a quick meal is completely different from resuming it a week later. Second, Perestroika was very different from my memory of the Eureka. Kushner had done massive rewrites after the Eureka and the 1992 workshop at the Taper in LA:
KUSHNER: I know I haven't got it right yet. I'm not saying I don't think it's good — I think it's always been a good play, Perestroika — but it's never been a finished play and it never ever will be completely finished.
Many people compare the two parts and rate Perestroika as inferior, citing that it's a lot more difficult and that Kushner keeps changing it. But this is likely because they have seen it as two separate plays, which is a mistake. I'm pretty sure that people like me who have seen it in a marathon see it as a single play that changes once the Angel arrives. Change is one of its major themes, after all. And it is very Kushner-esque to have the Angel, whose message is to stop change, be the cause of change in the structure of the play as she is in Prior.

Next time I'm in London I plan to visit the Archive and expand these two sections.

American Conservatory Theater, San Francisco (1994)

ACT Program
I saw ACT's production of both halves, I think on successive weekends, but I remember very little about it. It was directed by Mark Wing-Davey, who played the two-headed Galactic President, Zaphod Beeblebrox, in the radio (my favorite) and TV versions (forget it) of The Hitchhiker's Guide to the Galaxy, written by Douglas Adams. I'd been impressed by his production of Caryl Churchill's Mad Forest at Berkeley Rep.

The cast was:
  • Hannah: Cristine McMurdo-Wallis
  • Roy: Peter Zapp
  • Joe: Steven Culp
  • Harper: Julia Gibson
  • Belize: Gregory Wallace
  • Louis: Ben Shenkman
  • Prior: Garret Dillahunt
  • Angel: Lise Bruneau
Dennis Harvey's review noted that:
the director throws action all over the Marines Memorial stage. Kate Edmunds’ set design is dominated by rolling scaffold bridges and graph-patterned backdrops. Their severity suggests a societal infrastructure stripped bare. Huge curtains (one a rather too-obvious American flag), one hydraulic ramp, fully exposed flying rig for the “Angel” (Lise Bruneau), fog, film projection, etc. add to the sensory overload.
This may be one reason it didn't stick in my memory. After the stripped-down productions in the Eureka, basically a warehouse, and the National Theatre's flexible Cottesloe space, the traditional proscenium stage, a more fleshed-out, much flashier staging, and the somewhat distant seating would have been jarring. Indeed, soon after this I stopped subscribing to ACT, only visiting for their excellent productions of Tom Stoppard's plays.

Royal National Theatre (2017)

By dint of waking up very early and standing in line for a long time I got day seats for a marathon of Marianne Elliott's sold-out, extraordinarily impressive production. It was a complete contrast to the earlier versions. The cast was:
  • Hannah: Susan Brown
  • Roy: Nathan Lane
  • Joe: Russell Tovey
  • Harper: Denise Gough
  • Belize: Nathan Stewart-Jarrett
  • Louis: James McArdle
  • Prior: Andrew Garfield
  • Angel: Amanda Lawrence
Joe and Hannah
Elliott's staging was a fascinating way to use the National Theatre's huge resources and the Lyttelton's vast proscenium stage to simulate the original's sparse aesthetic. She used multiple revolves and mostly skeletal scenery that flowed in and out to create small patches of light in the darkness to show, for example, the phone call between Joe and Hannah. Occasionally, as for Harper and Mr. Lies in Antarctica, the whole stage was lit but bare. There was only one scene with the kind of lavish scenery one often sees in the Lyttelton. It was the Council Room of the Hall of the Continental Principalities. Kushner's stage directions for this scene fill multiple pages, and the set needs to contrast Heaven with Earth, so this choice made sense.

One of the most striking and memorable things in Elliott's production was her vision for the Angel:
ELLIOTT: Every image you see of this play involves a lovely angel in a white dress on a wire. I didn't want that.
Ben Power, the National's deputy artistic director, explains the Angel's entrance:
POWER: Prior's standing on his bed, as in other productions. The lights are changing. The sound of the approaching object is getting louder and louder. It's extremely loud in the auditorium. The lights change around him and he says, "Very Steven Spielberg".

Everyone's eyes are on him and they're also going up to the flies. We know what's about to happen. They're going to fly in a woman with wings. As we're looking, it's all building to a point of climax. At that point of climax there is a sense of a drop and a full blackout, which is very disorienting.

The lights come up. Everyone's eyes are looking up, looking for what object is coming in through the broken roof. Andrew's looking up there. And there's nothing there. As his eyeline comes down, there, strewn on the floor, among the rubble, is this thing. It's a sort of creature mess in browns and blacks. And then it rises from the floor — it's clearly been dropped from a great height — and coalesces into one body.
Lyra and Armored Bears
The National Theatre has resources that few other theaters do. One is a long history and deep expertise in stage puppetry. This reached a peak with His Dark Materials because in the play's world:
humans' souls naturally exist outside of their bodies in the form of sentient "dæmons" in animal form which accompany, aid, and comfort their humans.
Each actor was accompanied by a puppet of their daemon, manipulated by one or more puppeteers in head-to-toe black. It didn't take long for audience members to stop seeing them. At the end of the second part, all 28 actors came out for their curtain call. And then suddenly the puppeteers all pulled off their black head-dress, and you saw there were more of them than there were actors. And then the backdrop vanished and you saw all the way to the rear wall of the enormous Olivier stage. Standing there were all the stagehands. There were more of them than the actors and puppeteers combined. It was an amazing display of the vast resources the National Theatre can command for a major production.

Angel and Prior
Elliott's Angel was accompanied by a set of black-clad "shadows" like the daemons'. Except when Prior and she were wrestling, the Angel wasn't on a wire but was carried by the shadows. They would scurry around on all fours, sometimes converging on her to lift her up or sweep her massive wings, and sometimes heading off to the back of the set.

It wasn't just the physical resources the National Theatre devoted to the production, it was the time:
KUSHNER: I've never seen a director work as long or as hard on a production. A year of preparation. And you can see that degree — the depth of involvement, it's reflected in the design and in many of the choices she's made.
ELLIOTT: We spent about a year and a half on the design. Not every day, but we touched in a lot. And I wished I had longer!
...
ELLIOTT: We had eleven weeks, longer than anyone else has had.
KUSHNER: In a way it's the first adequate rehearsal period we've had for these plays.
When the production transferred to Broadway, it won the Tony for the Best Revival of a Play, and Andrew Garfield won for Best Actor and Nathan Lane won for Best Featured Actor. Both performances richly deserved their awards.

Berkeley Repertory Theatre (2018)

Program
Berkeley Rep's production was directed by Tony Taccone, who co-directed the Mark Taper workshop with Oskar Eustis, and starred Stephen Spinella, for whom Prior was written and whom I had seen at the Eureka, as Roy. So I have seen him play both of the victims of AIDS — his portrayal of sickness is remarkable, as was the contrast between how his Prior and his Roy fought the disease.

The cast was:
  • Hannah: Carmen Roman
  • Roy: Stephen Spinella
  • Joe: Danny Binstock
  • Harper: Bethany Jillard
  • Belize: Caldwell Tidicue
  • Louis: Benjamin T. Ismail
  • Prior: Randy Harrison
  • Angel: Francesca Faridany, Lisa Ramirez
Again, I saw both parts on a single day, as I recall starting at 1pm and ending at 11pm. It was astonishing how well the Rep's production, with a regional theater's resources, stood up to the National Theatre's massively resourced one. This huge play can use huge resources, but it does not need them.

Angel and Hannah
Taccone's Angel was clearly influenced by Elliott's, but lacked the shadows. Despite this, the Angel's flying, always the most difficult thing to stage, was really well done.

For the first time I got to see the "Roy in Hell" scene, which almost every production omits. It isn't in the published text. Omitting it means Roy's last appearance is when his ghost encounters Joe, a meeting between the play's two doomed characters. Including it, with Roy bargaining for something to do, is a sort of tribute to his drive and contrasts with Joe's spinelessness.

The Berkeley Rep's program had an interview with Spinella, who was initially reluctant to play Roy:
I got a text from Kushner saying — all I really remember is one word — "vital". That Roy is incredibly vital. I had already gone back and read all the Roy scenes, and it really hit me. That's the fun of playing this guy who is dying. He is fighting it tooth and nail. It's this knockdown, drag-out fight with this person who has this incredible will to live. It's different than Prior, who in a way is running away from his own death. Roy is just trying to get his ducks in a row and he's fighting the disease. He loses constantly, yet he keeps coming back. He is unrelenting, and that appeals to me. I'm not going to be in that hospital bed until I am ready to die. The hospital bed is going to have to grab me and pull me into it.
This could have been a quote from Nathan Lane.

The production gained glowing reviews from, among others, the LA Times and the SF Chronicle. For me, as my sixth viewing, seeing the play come back to the Bay Area over a quarter-century after it started here, over a single day, with such a grown-up staging, was a delight.

2026-04-13: The Contemplative Sciences Center (CSC) Sensemaking Symposium 2025: Trip Report. / Web Science and Digital Libraries (WS-DL) Group at Old Dominion University



Introduction – The CSC Sensemaking Symposium 2025

From October 9–11, 2025, the University of Virginia's Contemplative Sciences Center (CSC) brought together artists, scientists, scholars, and contemplatives from across the world for the Sensemaking Symposium, a two-and-a-half-day immersion into how humans perceive, interpret, and create meaning in an age defined by complexity.

Hosted in the newly opened Contemplative Commons, the event dissolved traditional boundaries, merging research with lived experience. Through sound installations, musical performances, guided contemplations, and cross-disciplinary conversations, participants entered a space where inquiry became sensory, where knowledge was not only discussed but embodied.

The symposium framed sensemaking as both an inner and collective act, linking neuroscience and mysticism, technology and ritual, sound and silence. Across four thematic sessions, Sensemaking, Hearing, Seeing, and Extrasensory, presenters invited the audience to explore the full range of human perception.

Michael R. Sheehy – Director of Research, Contemplative Sciences Center (UVA)

Dr. Michael R. Sheehy delivered the opening keynote, Contemplative Technologies of Human Sensemaking. As Director of Research at the CSC and Research Associate Professor of Religious Studies, Sheehy bridges lived contemplative traditions, especially Tibetan Buddhism, with modern scientific inquiry. His work spans lucid dreaming, Dzogchen meditation, cognitive illusion, and the cultural ecologies of contemplative practice.

Old Dominion University – Mindfulness and Data Class Participants

The Mindfulness and Data class at Old Dominion University examines how contemplative practices intersect with data-driven ways of understanding the world. Under the guidance of Dr. Nicole Willock, students learn to connect inner awareness with analytical thinking by exploring how attention, emotion, and perception shape the interpretation of information. The course blends reflection, discussion, and hands-on engagement with technologies such as physiological sensors, EEG devices, and digital tracking tools. Through this interdisciplinary approach, students gain insight into how mindfulness can enhance critical thinking, empathy, and more ethical, human-centered uses of data.

As part of their learning journey, students from the class attended the Contemplative Science and Art Conference, where they experienced firsthand how contemplative practice, cutting-edge technology, and creative expression come together. By interacting with tools such as heart-rate monitors, EEG headsets, eye-tracking systems, and immersive art installations, students were able to connect classroom concepts with real-world applications and deepen their understanding of how science can illuminate the inner dimensions of human experience.




Dr. Nicole Willock is a Professor at Old Dominion University whose teaching bridges mindfulness, religion, and cultural studies. She leads the Mindfulness and Data class and guided her students to the symposium to experience how science, art, and contemplative practice converge.


Lawrence Obiuwevwi is a doctoral student in Computer Science at Old Dominion University. His research focuses on emotion sensing, physiological signals, and spectrum analysis. He supported the learning experience by helping interpret technologies used at the symposium such as EEG, heart-rate sensors, and eye-tracking systems.


Cora Morgan is a Communication major whose interests center on storytelling, culture, and mindfulness. The symposium expanded her view of how contemplative science connects inner awareness with social expression.


Alexis R. Morel is a Psychology major interested in emotional balance and empathy. Her participation deepened her understanding of how the mind and body interact through attention and awareness.


Araceli Gordus Huizar is majoring in Women’s and Gender Studies with minors in Spanish and Media Studies. She explores identity, culture, and creativity, and found the symposium rich with interdisciplinary insight.


Dabre Ali is an undergraduate with a growing interest in human-centered technology. The symposium exposed him to tools that illuminate invisible dimensions of human experience, emotion, rhythm, and inner stillness.

Session I – Sensemaking

Opening the symposium, this session explored how human perception, art, and science coalesce into new modes of understanding. Led by eco-artist Wolfgang Buttress, philosopher Jelena Markovic, Buddhist studies scholar James Gentry, and techno-artist David Glowacki, the session wove ecological awareness, embodied knowledge, and the aesthetics of complexity.

Moderated by Devin Zuckerman, the conversation grounded the symposium’s theme: that sensemaking is both analytical and aesthetic, bridging data and devotion, science and spirituality.

Wolfgang Buttress – Eco-Artist, United Kingdom

Eco-artwork 1
Eco-artwork 2

Dr. Wolfgang Buttress brought an artistic and ecological dimension to Session I: Sensemaking, illuminating how sound, light, and structure can translate the intelligence of the natural world into human experience. A celebrated British sculptor and installation artist, Buttress is internationally known for creating large-scale, data-driven works that merge art, science, and environment. His most renowned installation, The Hive, originally commissioned for the UK Pavilion at Expo 2015 and now a permanent piece at Kew Gardens, uses live sensor data from a real beehive to generate light and sound, allowing viewers to feel the hum of a living colony.

His recent UVA installation, NINFEO, continued this exploration by immersing participants in a responsive landscape of light and resonance. Through collaborations with scientists, architects, and musicians, Buttress transforms empirical data into sensorial art, revealing the hidden rhythms of ecosystems and reminding audiences that perception itself is an act of ecological relationship. His work stands as a meditation on interconnection, an artistic invitation to listen to the world’s subtle frequencies and rediscover the harmony between human sensemaking and the planet’s living pulse.

Jelena Marković – Philosopher, Université Grenoble Alpes, France

Philosophical artwork

Philosopher Dr. Jelena Marković offered a deeply introspective counterpoint to the artistic and scientific discussions, inviting participants to reflect on how thought itself becomes embodied. A scholar whose work bridges philosophy of mind, cognitive science, and affective experience, Marković explores how attention, emotion, and grief transform the self and shape our perception of the world.

Currently a post-doctoral fellow at Université Grenoble Alpes and a member of the Centre for Philosophy of Memory, she examines transformative experiences, moments such as loss or wonder that reorganize our sense of being, and how affect biases attention and meaning-making. Her research also extends into performance and art-based philosophy, using creative forms to investigate cognition and embodiment. At the symposium, her contribution grounded the dialogue in phenomenology and emotional depth, revealing that sensemaking is not only a cognitive act but a lived process through which feeling, memory, and awareness co-construct reality.

James D. Gentry – Buddhist Studies Scholar, Stanford University

Dr. James D. Gentry offered a historical and contemplative lens on how meaning emerges through embodied ritual and sensory practice. An Assistant Professor of Religious Studies at Stanford University and a leading scholar of Tibetan Buddhism, Gentry studies the material, ritual, and visual cultures that shape Buddhist experience. His acclaimed book, Power Objects in Tibetan Buddhism: The Life, Writings, and Legacy of Sokdokpa Lodrö Gyeltsen, explores how objects such as relics, amulets, and ritual implements become vehicles of perception and transformation.

His broader research investigates how sound, sight, and touch function as technologies of enlightenment within Himalayan traditions, and how these sensory frameworks can dialogue with contemporary understandings of consciousness and materiality. At the symposium, Gentry’s reflections bridged ancient contemplative knowledge and modern philosophical inquiry, reminding the audience that sensemaking, whether through ritual or research, is always an embodied and relational act of seeing, hearing, and touching the sacred in everyday life.

David Glowacki – Techno-Artist & Scientist, Intangible Realities Laboratory (Spain)

Dr. David Glowacki expanded the boundaries of perception by merging physics, philosophy, and mystical imagination. Founder of the Intangible Realities Laboratory (IRL), Glowacki is a scientist-artist whose work transforms data into immersive, contemplative experience. With a Ph.D. in molecular physics and a background spanning chemistry, literature, and philosophy, he creates multi-sensory VR installations that invite audiences to feel energy, form, and consciousness as living presences.

At the symposium, Glowacki spoke about his ongoing exploration of Tara, the Buddhist embodiment of compassion and luminous awareness, and how archetypes reveal the human capacity to perceive interconnection beyond the material. Through projects like Isness, which allows participants to merge as glowing fields of light in shared VR space, Glowacki bridges scientific and contemplative traditions, where atoms and awareness coexist in the same luminous field.

Session II – Hearing: Sound & Silence

Sound became the medium through which participants listened to the world anew. JoVia Armstrong drew on her expertise as a percussionist to show how rhythm and reverb shape emotion and meaning. Patrick Finan, a clinical psychologist, described how sonic vibration and music therapy alleviate pain and restore balance to the nervous system. Kythe Heller blended poetry, performance, and theology, while Adam Lobel invited stillness as the acoustic of awareness.

Moderated by sound artist Matthew Burtner, the panel revealed that hearing, whether through silence or resonance, is both a physical and contemplative act, tuning the self to the wider harmonics of life.

JoVia Armstrong – Percussionist, Composer & Assistant Professor of Music, University of Virginia

Dr. JoVia Armstrong transformed the symposium space into a living instrument, reminding us that sound is not merely heard, it is felt. A percussionist, composer, and sound artist from Detroit, Armstrong blends Afro-diasporic rhythmic traditions with modern electronic experimentation to explore the emotional and spatial dimensions of listening.

At the symposium, she spoke about the importance of reverb in musical composition, not just as an acoustic effect but as a metaphor for resonance, echo, and memory. In her words, reverb gives sound a body; it situates the listener within space and time, allowing emotion to linger like breath in a room.

Currently an Assistant Professor of Music at the University of Virginia, Armstrong holds a Ph.D. in Integrated Composition, Improvisation, and Technology from UC Irvine. Her performance ensemble, Eunoia Society, experiments with drones, loops, and multichannel environments to create immersive sonic meditations.

Through her work, she bridges rhythm and reflection, tradition and technology, inviting audiences to experience sound as a contemplative process of becoming aware of one’s own presence in the auditory world.

Patrick H. Finan – Clinical Pain Psychologist & Professor of Anesthesiology, University of Virginia

Dr. Patrick Finan invited listeners to consider sound as medicine, a bridge between psychology, physiology, and the inner landscape of pain. A clinical pain psychologist and Harold Carron Professor of Anesthesiology at the University of Virginia, Finan’s work explores how sleep, emotion, and reward systems shape the experience of chronic pain.

At the symposium, he discussed the healing potential of sound and music, describing how rhythm and resonance can modulate emotional states and neural activity, providing moments of relief and restoration.

Drawing from research in his lab, including fMRI, sensory testing, and ecological momentary assessment, Finan explained that music’s ability to soothe pain is grounded in psychophysiological synchronization: the body literally entrains to patterns of calm.

His talk framed listening as an act of empathy and self-regulation, suggesting that the future of pain management may rely not only on medication but on cultivating deep, attentive sonic relationships with one’s own body.

Kythe Heller – Interdisciplinary Artist, Poet & Scholar, Harvard University

Dr. Kythe Heller wove poetry, philosophy, and performance into a meditation on the spiritual and sensory dimensions of listening. An interdisciplinary artist and Doctor of Theology from Harvard University, Heller’s work bridges creative practice and contemplative inquiry, exploring how sound, silence, and language become conduits for transformation.

At the symposium, she reflected on the voice as a vehicle of revelation, tracing how resonance and vibration carry meaning beyond words, invoking both the mystical and the material. As founder and director of Vision Lab at Harvard Divinity School, Heller convenes artists, scientists, and contemplatives to explore imagination as a force that reshapes consciousness.

Her poetry collection Firebird and multimedia performances investigate illumination, grief, and transfiguration. Through her presence, Heller invited participants to experience silence not as emptiness but as a vibrant medium, a threshold where self and world meet in shared reverberation.

Adam Lobel – Contemplative Teacher, Ecophilosopher & Founder of 4F Regeneration

Dr. Adam Lobel invited participants to listen beyond the human, to tune into the soundscape of the Earth itself. A contemplative teacher, ecophilosopher, and founder of 4F Regeneration, Lobel works at the intersection of Buddhist practice, ecological awareness, and collective transformation.

He spoke about sound as a bridge between consciousness and the living world, encouraging listeners to recognize hearing as both an ethical and ecological act. Drawing on his background in Buddhist philosophy and decades of contemplative teaching, Lobel suggested that awareness practices can heal the rift between human perception and planetary systems.

His teachings blend meditation, ritual, and ecological activism, creating spaces for embodied reflection. In Charlottesville, his words rang like a dharma bell, reminding participants that every sound, from wind to breath to silence itself, is a pulse in the shared heartbeat of life.

Session III – Seeing

Kelsey Johnson – Astronomer & Professor of Astronomy, University of Virginia

Dr. Kelsey Johnson guided participants to look outward, and inward, through the lens of the night sky. A renowned astronomer at the University of Virginia and founder of Dark Skies, Bright Kids!, she studies the birth of galaxies and the formation of stars hidden within cosmic dust.

Johnson spoke about the loss of the natural night sky due to light pollution, reminding the audience that the glow of cities is dimming humanity’s oldest connection to the cosmos. For millennia, humans oriented their stories, rhythms, and sense of humility by the stars.

She argued that regaining dark skies is both an ecological and contemplative act, an invitation to rediscover our place in the vastness of space. When we lose the stars, she reflected, we risk losing sight of our own smallness and wonder. Her talk blended scientific insight with existential reverence, making the night sky a mirror for meaning and fragility.

Andrew Holecek – Contemplative Teacher, Author & Scholar of Dream Yoga

Dr. Andrew Holecek invited participants to journey beyond ordinary perception, to explore the “luminous darkness” of the mind itself. A renowned teacher of Tibetan Buddhist meditation and lucid dreaming, Holecek has spent decades studying how awareness continues through waking, dreaming, and dying.

He spoke about the transformative power of darkness, drawing on Tibetan dark retreat practices where total darkness reveals the inner light of consciousness. He described how the night, both literal and psychological, can become a field for insight rather than fear, showing that seeing is not only optical but spiritual.

Author of works including Dream Yoga: Illuminating Your Life Through Lucid Dreaming and the Tibetan Yogas of Sleep, Holecek emphasized that cultivating awareness in darkness dissolves boundaries between seer and seen. His reflections reminded the audience that light and darkness are partners in perception, and that embracing the unseen helps us see more clearly within.

Jesse Fleming – Media Artist & Assistant Professor of Emerging Media Arts, University of Nebraska–Lincoln

Dr. Jesse Fleming explored how technology, light, and presence shape perception. A filmmaker, media artist, and Assistant Professor at the University of Nebraska–Lincoln’s Carson Center for Emerging Media Arts, his work bridges consciousness studies, design, and immersive art.

Fleming spoke about how mediated seeing can become a contemplative practice, reflecting on how screens, reflections, and moving images influence attention and empathy. Drawing on projects such as The Shared Individual and Nuclei, he demonstrated how immersive environments can expand awareness rather than fragment it. His research asks what happens when the media stops entertaining and starts awakening, when pixels, photons, and human perception synchronize to reveal the subtle boundary between observer and observed. Fleming reframed “seeing” as participation in a living network of light, bodies, and consciousness.

Session IV – Extrasensory

The final session expanded perception beyond the five senses, merging science, spirituality, and technology. Mikey Siegel introduced bio-sensor experiences that transform heartbeats and breath into shared light and sound fields. Eve Ekman illuminated the emotional body as a sensory organ guiding compassion and resilience. Michael Lifshitz traced the neuroscience of hypnosis, meditation, and psychedelics to show how consciousness continually remakes reality. Oludamini Ogunnaike offered a luminous account of Sufi and Islamic practices in West Africa, where chant, rhythm, and beauty serve as portals to divine knowledge. Moderated by Casey Forgues, the discussion synthesized art, science, and spirituality into one realization: sensemaking begins where the measurable meets the mystical.

Mikey Siegel – Technologist & “Consciousness Hacker”, Stanford University

Dr. Mikey Siegel brought frontier thinking to the table, showing how technology and collective physiology become tools of sense-making. A former robotics engineer (MIT Media Lab) now based at Stanford University and working with his initiative BioFluent Technologies, Siegel designs immersive systems (such as his renowned platform GroupFlow) that measure participants’ heart rate and breath and convert these into shared audio-visual experiences.

At the symposium he spoke about how sense-making isn’t just a solo act of cognition, but a field phenomenon, a resonant space where bodies, devices, sounds, and attention interweave. He urged us to ask not only what our technologies do, but who they enable us to become.

Eve Ekman – Contemplative Social Scientist & Emotion Researcher, University of California, Berkeley

Dr. Eve Ekman turned attention inward, guiding participants to consider emotion itself as a sensory organ, a compass for meaning and human connection. A contemplative social scientist and Senior Fellow at the Greater Good Science Center at UC Berkeley, Ekman’s research explores emotional awareness, empathy, and resilience.

She spoke about how emotions shape perception, emphasizing that sensemaking is not limited to intellect or sensation but is deeply informed by the body’s internal signals. Drawing from her work on the Atlas of Emotions with the Dalai Lama and her Cultivating Emotional Balance program, she illustrated how mindfulness and compassion training help individuals transform reactivity into clarity.

Her reflections revealed that emotional literacy is a contemplative technology, one that allows people to feel more deeply, connect more authentically, and perceive the subtle vibrations of the human heart. Ekman reminded participants that the future of awareness depends not only on sharper tools of observation but on gentler capacities for feeling.

Michael Lifshitz – Neuroscientist & Assistant Professor of Psychiatry, McGill University

Dr. Michael Lifshitz bridged neuroscience, anthropology, and contemplative practice to examine how the human mind constructs, and transcends, ordinary perception. An Assistant Professor of Psychiatry at McGill University and Director of the Psychedelics and Contemplation Lab, Lifshitz investigates how meditation, hypnosis, and psychedelics alter consciousness and the sense of self.

At the symposium, he spoke about how non-ordinary states of awareness reshape the boundaries of the senses, describing them as experiments in human possibility. Drawing from neuroimaging and ethnographic research, he explored how spiritual and contemplative experiences can transform the brain’s perception of agency and embodiment.

His talk emphasized that what we call “extrasensory” may not be supernatural at all, but an expanded form of sensemaking that includes the body, culture, and consciousness in continuous dialogue. Through this lens, Lifshitz offered a scientific and deeply human reminder: to understand the mind, we must study not just what it perceives, but how it learns to see itself.

Oludamini Ogunnaike – Associate Professor of African Religious Thought & Democracy, University of Virginia

Dr. Oludamini Ogunnaike explored how sensory experience and spiritual knowledge fuse in the Sufi and Islamic traditions of West Africa, inviting participants to listen for the hidden frequencies of sacred sound, poetry, and devotion. At the University of Virginia, he teaches African religious traditions, Islamic philosophy and art, and the intellectual history of Sufism and Ifá.

His research examines the aesthetic, philosophical, and sensory dimensions of West African Islamic and indigenous traditions, particularly how devotional recitations, mystical poetry, and ritual practices function as forms of sense-making. At the symposium he described how the chants of the Tijāniyya order, the rhythms of madīḥ poetry, and the oracular Ifá tradition reveal the senses as conduits of knowledge, not just passive receptors.

Through works such as Deep Knowledge (2020) and Poetry in Praise of Prophetic Perfection (2020), he reframes sense-making as a poetic, embodied, and spiritual act, one in which the boundaries between listener, liturgy, and divine presence dissolve.

Conclusion – Integrating the Senses, Integrating the Self

The Contemplative Sciences Symposium revealed the power of interdisciplinary inquiry, where artists, scientists, philosophers, physicians, and contemplative practitioners came together to examine how humans perceive, interpret, and make meaning. Across sessions on hearing, seeing, extrasensory awareness, and the nature of sensemaking itself, the symposium showed that understanding the world requires more than intellect alone; it requires the full participation of the senses, the body, and the imagination. This gathering demonstrated how deeply interconnected the contemplative, scientific, and creative disciplines truly are, and how each contributes a vital perspective to the study of awareness and human experience.

A significant part of this success is reflected in the structure and philosophy of Dr. Willock’s Mindfulness and Data course, which seamlessly integrates contemplative practice with physiological measurement, emotional awareness, and data-driven inquiry. In her classroom, students learn to read heart rate, breath, and affective signals not merely as metrics, but as reflections of lived, embodied processes. By guiding students to unite mindfulness with analytic rigor, Dr. Willock creates a learning environment in which theory becomes experience and data becomes self-knowledge. Her approach shows that education can be contemplative, scientific, and personal all at once, inviting students to think critically, feel deeply, and cultivate attention as a tool for understanding.

The participation of her students in the Contemplative Sciences Symposium further exemplifies this integration. Engaging directly with leading scholars, artists, and contemplative researchers allowed them to witness interdisciplinary collaboration in action and to situate their own learning within a broader landscape of inquiry. Through both classroom practice and conference immersion, students experienced firsthand how mindfulness, physiology, art, culture, and neuroscience converge to expand human understanding. This synergy, between curriculum and community, between inner practice and academic exploration, highlights the transformative potential of contemplative education and the essential role it plays in shaping thoughtful, reflective, and compassionate learners.

Special Thanks to My Supervisors:

I would like to express my sincere gratitude to my supervisors, Dr. Erika Frydenlund, Research Associate Professor at Old Dominion University and a member of the Storymodelers Lab; Dr. Krzysztof J. Rechowicz, Assistant Professor at Old Dominion University and a member of the Storymodelers Lab and the Virginia Digital Maritime Center (VDMC); and Dr. Sampath Jayarathna, Associate Professor at Old Dominion University and a member of the Web Science and Digital Libraries Research Group and the NIRDSLab, for the continued opportunities they have afforded me to be part of impactful research and meaningful academic endeavors. Their guidance, support, and mentorship have been invaluable to my growth.

About the Author:
Lawrence Obiuwevwi is a Ph.D. student in the Department of Computer Science, a graduate research assistant with The Center for Secure and Intelligent Critical Systems (CSICS), and a proud student member of the Web Science and Digital Libraries (WS-DL) Research Group and the NIRDSLab at Old Dominion University.


Lawrence Obiuwevwi
Graduate Research Assistant
Virginia Modeling, Analysis, & Simulation Center
Department of Computer Science
Old Dominion University, Norfolk, VA 23529
Email: lobiu001@odu.edu
Web: lawobiu.com
Digital Storytelling in Practice: A New Session Format for the DLF Forum / Digital Library Federation

Team DLF is introducing a new session format in the Call for Proposals for the 2026 Virtual DLF Forum: Digital Storytelling Presentations. This format is designed to deepen collaboration, center relationships, and create space for shared learning across roles, institutions, and communities.

The new 40-minute Digital Storytelling (DS) Presentation format is designed as an interactive session that highlights digital storytelling projects developed through collaborative partnerships. These DS Presentations center on installation-inspired projects, such as exhibits, platforms, or collections, that offer immersive, experiential engagement for participants. We encourage presenters to incorporate demonstrations whenever possible to help attendees engage more fully with the tools, platforms, or storytelling approaches being shared. 

Rather than focusing solely on a single presenter or project overview, the format should feature a minimum of two (2) presenters and no more than three (3) presenters. For example, a digital librarian or archivist might pair with a community partner, student, artist, or scholar whose work is represented in, or inspired by, the digital project. Together, presenters will explore not only the final product but also the collaborative process, relationships, and ideas that shaped the work, to show attendees how this work might be imagined, adapted, and implemented within their own institutions.

Presentations will emphasize the broader significance of digitization: why access matters, how collections are used, and the impact beyond the institution. Examples of proposals might include:

Archive to Art: A digital archivist and artist show how digitized protest materials inspired a multimedia installation, emphasizing workflow and creative impact. Example: Women’s March on Washington and Atlanta March for Social Justice and Women Collection (January 21, 2017), Women and Gender Collections, Georgia State University Library Digital Collections

Community Memory in Motion: A librarian and historian built a neighborhood digital archive through collaboration, now used in schools and local programs. Example: Folded Map Project

Teaching with Data: A librarian and student used a digitized collection to create a data visualization project, linking the technical process to student research. Example: Students Turn College Fight Songs into Award-Winning Data Visualization | News | Northwestern Engineering

This format is included in the 2026 Call for Proposals, and we look forward to seeing how presenters bring collaborative digital storytelling to the Forum. If you’d like to talk through your idea or learn more about the format, please email us at forum@diglib.org.

Call for Proposals: 2026 Virtual DLF Forum / Digital Library Federation

CLIR’s Digital Library Federation (DLF) invites proposals for the virtual 2026 DLF Forum, to be held online October 14-15, 2026. Learn more about who we are and who attends the DLF Forum.

Please note: This Call for Proposals (CFP) is for the October virtual event. There is no in-person event in 2026. We are committed to making this online conference accessible to all through consistent use of captioning in all sessions and the provision of accessible presentation materials, screen-reader-friendly documents, and clear communication of accommodation options. For accessibility-related questions or concerns, please contact forum@diglib.org.

The submission deadline is Monday, May 11, at 11:59 pm ET.

We invite proposals for live virtual presentations on all topics related to digital libraries, encompassing case studies, “show and fails,” practical applications, methods, projects, ethics, research, and learning in any area, such as: 

  • Collections & Stewardship: Digitization, digital preservation, digital asset management systems (DAMS), born-digital materials, and format conversions.
  • Community & Advocacy: Partnerships, community archives, outreach, and professional advocacy.
  • Digital Research & Pedagogy: Digital humanities, scholarship, music, art, creative expression, and digital pedagogy.
  • Ethics, Justice, & Society: Race and technology, accessibility, AI/Machine Learning, copyright, and environmental sustainability. 
  • Infrastructure: Platforms, workflows, project management, and assessment.

This list of content topics is intended as a starting point and is not exhaustive; we welcome additional ideas and approaches that align with the spirit of the Forum.

Session Formats

All sessions will take place live in a meeting-style or webinar-style Zoom room, and breakout rooms will be available upon request for all formats except lightning talks. Sessions are invited in the following lengths and formats:

    • 90-minute Workshops: Guided training sessions on a specific tool, technique, workflow, or concept. Up to five (5) facilitators are allowed per submission. 
    • 50-minute Working Sessions: Open sessions for community organizers, creative problem solvers, and existing or prospective DLF working groups to begin or get feedback on in-progress projects, collaborate on addressing challenges, and discuss thought-provoking questions. Up to five (5) facilitators are allowed per submission.
    • 40-minute Panels: Discussions of up to five (5) presenters on a unified topic, with an emphasis on community discussion. Proposals with diverse and inclusive speaker involvement will be favored by the committee. Panels will be slotted into 50-minute sessions, leaving a minimum of 10 minutes for Q&A and discussion at the end of each session. 
    • 40-minute Presentations: A single topic or project presented by up to three (3) presenters. Presentations will be slotted into 50-minute sessions, leaving a minimum of 10 minutes for Q&A and discussion at the end of each session. 
    • NEW! 40-minute Digital Storytelling (DS) Presentations: Interactive sessions highlighting digital storytelling projects—such as exhibits, platforms, or collections—developed through collaborative partnerships that offer immersive, experiential engagement. They should feature a minimum of two (2) and no more than three (3) presenters in conversation, such as a digital librarian or archivist paired with a community partner, student, artist, or scholar whose work is represented in, or inspired by, the digital project. Demos are encouraged. DS Presentations will be slotted into 50-minute sessions, leaving a minimum of 10 minutes for Q&A and discussion at the end of each session. Read more about this new format here.
    • 5-minute Lightning Talks: High-energy talks on any topic held in succession in a single session, presented by up to two (2) presenters. There is no formal Q&A for lightning talks, but we encourage presenters to share contact information with attendees for follow-up conversations after the session.

Proposal Requirements

  • Proposal title and submission format.
  • Author information: full names, organizational affiliations, and email addresses for all presenters and authors.
  • Brief abstract – limited to 50 words. This abstract will appear in Community Voting and in the conference program.
  • Full proposal – limited to 250 words for all formats except for panels and workshops, which are limited to 500 words. This full proposal will only be seen by reviewers and the Program Committee.
  • Five keywords for your proposal.
  • Breakout room request – there will be an option in the submission form to indicate a request for breakout rooms.
  • Workshops Only: Learning objectives (limited to 50 words; brief, clear statements about what attendees will be able to do as a result of taking your proposed workshop); technology needed; participant proficiency level; how your workshop will be interactive. 
  • Session materials (notes, documents, slides, handouts, etc.) will be shared under a CC BY 4.0 license, which allows for sharing and adaptation of content with appropriate credit and an indication of any changes made. We will continue to invite presenters to deposit these materials in the Zenodo.org open-access repository, where the DLF community archives DLF Forum materials under this license. Presenters must agree in the submission form to share their materials under these terms.

Submissions and Evaluation

Based on community feedback and the work of our Program Committee, we welcome submissions geared toward a practitioner audience that:

  • Clearly engage with DLF’s mission;
  • Activate and inspire participants to think, make, and do;
  • Engage people from different backgrounds, experience levels, and disciplines; and/or
  • Include clear takeaways that participants can integrate and implement in their own work.

All submissions will be peer-reviewed. Reviewers will use this rubric to rate each proposal based on the values listed above. They may also recommend the proposal for a different format. Broader DLF community input will also be solicited through an open community voting process, which will help inform the Program Committee’s final decisions.

We especially welcome proposals from individuals who bring diverse professional and life experiences to the conference, including those from underrepresented or historically excluded racial, ethnic, or religious backgrounds, immigrants, veterans, those with disabilities, and people of all sexual orientations or gender identities. As we have done in the past, the Program Committee will prioritize submissions from individuals who identify as Black, Indigenous, and People of Color (BIPOC), individuals working at Historically Black Colleges and Universities (HBCUs), Tribal Colleges and Universities (TCUs), Hispanic Serving Institutions (HSIs), Minority Serving Institutions (MSIs), and other libraries, archives, museums, and organizations that center BIPOC to promote inclusivity to the greatest extent possible. Self-identification options will be provided in the proposal submission form, but are not required.

Schedule

  • Call for Proposals opens: Tuesday, April 14
  • Call for Proposals closes: Monday, May 11
  • Notification of final decisions: Week of June 11
  • Program released: Week of June 23

Read more about the DLF and who attends the Forum, and find co-presenters.

Please feel free to reach out with any questions: forum@diglib.org

FAQ

What is the DLF Forum? 

DLF programs stretch year-round, but we are perhaps best known for our signature event, the DLF Forum.

The DLF Forum welcomes digital library, archives, and museum practitioners from member institutions and beyond—for whom it serves as a meeting place, marketplace, and congress. As a meeting place, the DLF Forum provides an opportunity for our working groups and community members to conduct their business and present their work. As a marketplace, the Forum provides an opportunity for community members to share experiences and practices with one another and support a broader level of information sharing among professional staff. As a congress, the Forum provides an opportunity for the DLF to continually review and assess its programs and its progress with input from the community at large.

Here, the DLF community celebrates successes, learns from mistakes, sets grassroots agendas, and organizes for action. The Forum is governed by the DLF’s Code of Conduct. All Forum, in-person and online events, and community participants are expected to uphold a harassment-free, inclusive environment. The Code prohibits bullying, discrimination, and harmful behavior of any kind, requires respectful, constructive engagement and adherence to established safety protocols, and includes options for reporting harassment.

Generally, who attends the DLF Forum? 

The DLF Forum attendees are a multi-disciplinary cross-sector community of people who work in the digital library, museum, archives, and cultural heritage fields, from librarians, project managers, curators, technologists, and developers to administrators and service providers. The Forum welcomes practitioners from academic, art and cultural heritage, and non-profit organizations, government agencies, and more. They come from all over the country and world and represent all levels of professional experience. Forum attendees are inquisitive, engaged, and action-oriented with a focus on learning new skills and solving problems together. When offered in a virtual format, the DLF Forum may reach a wider and larger audience than in-person events.

Who should submit a proposal? 

We encourage proposals from:

  • DLF members and non-members;
  • Regulars and newcomers;
  • Digital library practitioners from all sectors (higher education, museums and cultural heritage, public libraries, archives, etc.) and those in adjacent fields such as institutional research and educational technology;
  • Students, early- and mid-career professionals, and senior staff alike.

Can you help me find a co-presenter? 

Looking for co-presenters on a particular topic? Try using our 2026 DLF Forum Unofficial Program Sessions and Connections spreadsheet for connecting with other prospective presenters. Note that the Program Committee and CLIR+DLF Staff do not monitor the document and it is not part of the official submission process. 

How is my proposal evaluated? 

All submissions will be peer reviewed. Reviewers will use this rubric to rate each proposal. They may also recommend the proposal for a different format. Broader DLF community input will also be solicited through an open community voting process, which will inform the Program Committee’s final decisions. 

What makes a successful proposal? Can I see successful proposals from previous years? 

Based on community feedback and the work of our Program Committee, we welcome submissions geared toward a practitioner audience that:

  • Clearly engage with DLF’s mission;
  • Activate and inspire participants to think, make, and do;
  • Engage people from different backgrounds, experience levels, and disciplines; and/or
  • Include clear takeaways that participants can integrate and implement in their own work.

We strongly encourage prospective presenters to review our rubric and past DLF Forum programs (from 2025 and 2024 in-person & virtual) to understand what makes a successful DLF Forum proposal. Strong proposals will demonstrate how presenters intend to design their proposed sessions to be interactive, inclusive, and action-oriented, and will also outline clear learning objectives. We especially welcome proposals from individuals who bring diverse professional and life experiences to the conference, including those from underrepresented or historically excluded racial, ethnic, or religious backgrounds, immigrants, veterans, those with disabilities, and people of all sexual orientations or gender identities. As we have done in the past, the Program Committee will prioritize submissions from individuals who identify as Black, Indigenous, and People of Color (BIPOC), individuals working at Historically Black Colleges and Universities (HBCUs), Tribal Colleges and Universities (TCUs), and other libraries, archives, museums, and organizations that center BIPOC to promote inclusivity to the greatest extent possible. Self-identification options will be provided in the proposal submission form, but are not required.  

What is the author limit? What is the presenter limit? 

Each session type has a maximum number of presenters per submission:

  • 90-minute Workshops: Up to 5 facilitators
  • 50-minute Working Sessions: Up to 5 facilitators
  • 40-minute Panels: Up to 5 presenters
  • 40-minute Presentations: Up to 3 presenters
  • 40-minute Digital Storytelling Presentations: Up to 3 presenters
  • 5-minute Lightning Talks: Up to 2 presenters

There is no limit to the number of non-presenting authors listed on a proposal.

The post Call for Proposals: 2026 Virtual DLF Forum appeared first on DLF.

Come join the TinyCat 10th Birthday Hunt! / LibraryThing (Thingology)

We’re hosting a special TinyCat Birthday Treasure Hunt over on LibraryThing! We’ve got ten clues, one for each year. We’ve scattered a collection of birthday banners around the two sites, and it’s up to you to find them all.

  • Come brag about your clowder of tiny cats (and get hints) on Talk.
  • Decipher the clues and visit the corresponding pages in LibraryThing or TinyCat to find a banner. 
  • Each clue points to a specific page. Remember, some banners will be hidden in LibraryThing and some in TinyCat! 
  • You have until 11:59 pm EDT on Thursday, April 30th to find all the TinyCats.

Win prizes:

  • Any member who finds at least two birthday banners will be awarded a TinyCat banner badge.
  • Members who find all 10 birthday banners will be entered into a drawing for one of five sets of TinyCat and LibraryThing swag. We’ll announce winners at the end of the hunt.

P.S. Thanks to conceptDawg for the gray catbird illustration!

Weekly Bookmarks / Ed Summers

These are some things I’ve wandered across on the web this week.

🔖 AI Whistleblower: We Are Being Gaslit By AI Companies, They’re Hiding The Truth! - Karen Hao

The truth about Sam Altman. AI Critic Karen Hao reveals what 90 OpenAI employees told her.

Karen Hao is an AI expert, award-winning investigative journalist, and former reporter for The Wall Street Journal covering American and Chinese tech companies. She is also co-host of the podcast The Interface and freelances for publications like More Perfect Union and The Atlantic. Her latest book is the bestselling ‘EMPIRE OF AI: Inside The Reckless Race For Total Domination.’

🔖 Introduction to Compilers and Language Design

This is a free online textbook: you are welcome to access the chapter PDFs directly below. If you prefer to hold a real book, you can also purchase a hardcover or paperback below. The textbook and materials have been developed by Prof. Douglas Thain as part of the CSE 40243 compilers class at the University of Notre Dame. Join our mailing list to receive occasional announcements of new editions and other updates.

🔖 Inside Claude Code With Its Creator Boris Cherny

A somewhat bizarre interview with the creator of Claude Code, where he talks about the origins of the tool, and how its current development fits in with Anthropic’s business plans – which seem pretty vague other than taking over the world.

🔖 pi-mono contributing guide

Some open source projects that accept AI contributions are moving to a model where PRs need to reference an issue that has been marked approved by an existing maintainer with a lgtm comment. This then triggers a Github Action that adds the user to the .github/APPROVED_CONTRIBUTORS file. Then when a PR comes in, it isn’t immediately closed.

https://newsletter.pragmaticengineer.com/p/mitchell-hashimoto

🔖 badlogic / pi-mono

AI agent toolkit: coding agent CLI, unified LLM API, TUI & web UI libraries, Slack bot, vLLM pods.

(apparently Claude Code was built with this?)

🔖 The Coral Bones

The Coral Bones is a tale of three women from different times in Earth’s history, each of whom has a special relationship with the Great Barrier Reef. Judith is the daughter of a 19th Century English sea captain and is desperate to study the natural world, just like the famous Mr. Darwin. Hana is a Japanese-Australia scientist from the present day, studying the dying reef. And Telma is a descendant of refugees in a near future Australia where most forms of animal life except humans are functionally extinct.

🔖 In Ascension

In Ascension is a 2023 novel by Martin MacInnes, published in the UK by Atlantic Books and in the US by Grove Atlantic.[1] It is published or forthcoming in ten languages. The novel tells the story of Leigh, a young girl who grows up in the Netherlands amid the specter of climate change and eventually becomes a marine scientist exploring ocean trenches and investigating an anomaly at the edge of the Solar System.

🔖 Papers, Please: The toll of age verification laws on digital sex work

“The only point [of these laws] is to restrict access to content,” Riana Pfefferkorn, an attorney and policy fellow at Stanford University’s Institute for Human-Centered Artificial Intelligence, told me. “I think that the ubiquity of [age verification] lately has warped people’s views of what online safety means, so that now everything is just like, ‘Why don’t we just do [age verification]? Won’t that fix it?’” she continued. She was alluding to the use of AI to generate adult content, as well as the trend of users on X requesting that the platform’s AI assistant, Grok, non-consensually undress photos of potentially underage girls. In response to outcry, X chose to paywall access to its AI tools.

🔖 Content Neutrality for Kids: Intermediate Scrutiny for Social Media Age-Verification Laws

The First Amendment imposes a high, but not insurmountable, hurdle for states to overcome in regulating minors’ social media use. By focusing on specific features that lead to harmful effects on minors, states can craft content-neutral laws that will merit only intermediate scrutiny. The solutions to the LinkedIn Problem proposed above — naming platforms directly under TikTok’s revival of the “special characteristics” standard or regulating specific harmful features without reference to content — are the two likeliest ways for states to have their laws upheld in court. Like California, states must be creative and flexible as they respond to a rapidly developing legal doctrine. If “[s]ocial media is a cancer on our society,” then seeking a constitutional cure is crucial even if current efforts “dwell only on the suffering of children.”

🔖 The machines are fine. I’m worried about us.

for someone who doesn’t yet have that intuition, the grunt work is the work. The boring parts and the important parts are tangled together in a way that you can’t separate in advance

🔖 On The Enshittification of Audre Lorde: “The Master’s Tools” in Tech

This is not an argument to reject the enshittification analysis. It is an argument to extend it. A decolonial critique of technology is not simply “the internet was always bad.” It is rather: the conditions that made the internet harmful to specific communities were never peripheral to its design; they’re an integral part of it. And any politics that aims to restore something like the pre-enshittification internet without reckoning with those conditions is doomed to reproduce them.

2026-04-11: PostGuard: The Hackathon-Winning AI That Stops Career-Ending Posts / Web Science and Digital Libraries (WS-DL) Group at Old Dominion University


From March 23rd to the 31st, the Computer Science Graduate Society (CSGS) at  Old Dominion University hosted their Spring 2026 Hackathon. The competition brought together teams across master's and PhD categories to tackle different research topics, mainly in artificial intelligence (AI). Our team, the Attention Bros (Sandeep Kalari and Dominik Soós), chose to compete in Track 6: Privacy-Preserving AI, alongside four other great teams in the PhD category.

We are grateful to announce that we won first place in the PhD category with our project, PostGuard!

This was a fast-paced challenge completed over a single week. Despite the time constraints, we successfully engineered a novel architecture that balances AI utility with user privacy. In this blog post, we provide an overview of the problem we tackled, the Privacy Paradox and existing methods, our system architecture, and the mathematically proven findings that secured our victory.

For more details, you can explore our Github repository containing the code, detailed report, and the dataset used in our analysis. 


Online Comments Have Lasting Consequences

Social media platforms serve as both a public forum and a digital newsstand. We started by looking at the problem: online comments can cause irreversible, real-world damage. In the heat of the moment, you post something, and before moderation can catch it, someone takes a screenshot. People lose their jobs over 280 characters posted online. 

Current moderation systems are 100% reactive, so they only act after the fact. We wanted to build a preventative system that warns you before you hit send.

The Privacy Paradox and Existing Methods

To give users a specific and actionable warning about how a post violates their employer's policies, the system needs to know their personal context, like their job role and employer. However, collecting and processing that data creates a massive surveillance and privacy risk.

When we looked at how existing research handles this problem, we found a significant gap. To prevent someone from posting something that will ruin their career, you have three standard options, all of which fail:

  • Content moderation is reactive. It doesn't warn the user; it just punishes them after the fact, and it's also not user-specific. 
  • Differential Privacy works great for aggregate data but is useless for individual, consequence-based warnings.
  • Text Anonymization frameworks like RUPTA are great at removing personally identifiable information (PII) from text, but they strip away the exact context the LLM needs to generate a personalized warning. 
We needed a system that acts pre-posting, is user-specific, and is privacy-aware. That's why we built PostGuard.

Approach | Reactive? | Pre-posting? | User-specific? | Privacy-aware?
Content Moderation | Yes | No | No | No
Differential Privacy | No | No | No | Yes
Text Anonymization (RUPTA) | No | Yes | ➖ Partial | Yes
PostGuard (Ours) | No | Yes | Yes | Yes

Building a Dataset

To rigorously test our system, we couldn't just use standard benchmarks, so we built a dataset grounded in reality. We spent the first phase of the hackathon collecting a custom dataset:
  1. 15 Real Incident Cases: We pulled verified, real-world firings that were covered by major outlets. 
  2. 20-Article Vector Corpus: We embedded 15 signal articles and intentionally injected 5 noise articles to rigorously test our retrieval precision.
  3. Synthetic Personas: We generated 15 synthetic users with escalating post histories spanning from 2024 to 2026, mapping them one-to-one with real corporate policies. 

The Architecture

PostGuard intercepts risky posts before the user hits submit. To do this without leaking the user's data to the web, we built a four-stage pipeline:
  1. Risk Extraction: We use a lightweight LLM (Gemini Flash) to quickly extract risk factors from the draft and generate a targeted search query.
  2. RAG layer: We use an embedding model to search our custom vector database for relevant corporate policies and real-world firing precedents.
  3. Warning Generation: A secondary LLM (Gemini Pro) synthesizes the retrieved precedents and generates a customized, user-facing warning.
  4. RUPTA Evaluation: Finally, we run a dual-evaluation loop. A P-Evaluator scores the re-identification risk of the data we just processed, and a U-Evaluator scores the utility of the generated warning.

Figure 1. System architecture detailing the four stages of evaluation: (1) Initial comment ingestion, (2) Contextual policy retrieval via RAG, (3) Severity classification, and (4) Generation of the intent-preserving warning.
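
The four-stage flow above can be sketched in ordinary Python. This is a minimal illustration, not the PostGuard code: the LLM and vector-search calls the real system makes (Gemini Flash/Pro, an embedding model) are stubbed with toy heuristics, and every function and variable name here is hypothetical.

```python
# Hypothetical sketch of the four-stage pipeline; model calls are stubbed.

def extract_risk(draft: str) -> str:
    """Stage 1: Risk Extraction. A lightweight LLM would pull risk
    factors from the draft and build a targeted search query."""
    keywords = [w for w in ("fired", "boss", "confidential") if w in draft.lower()]
    return " ".join(keywords)

def retrieve_precedents(query: str, corpus: dict) -> list:
    """Stage 2: RAG layer. A vector-database lookup in the real system;
    here, a naive keyword match over a toy policy corpus."""
    return [doc for term, doc in corpus.items() if term in query]

def generate_warning(draft: str, precedents: list) -> str:
    """Stage 3: Warning Generation. A second LLM would synthesize the
    retrieved precedents into a user-facing warning."""
    if not precedents:
        return "No issues found."
    return f"Warning: {len(precedents)} similar case(s) led to real-world consequences."

def evaluate(warning: str, data_sent: str) -> dict:
    """Stage 4: dual-evaluation loop. A P-Evaluator scores the
    re-identification risk of the forwarded data; a U-Evaluator scores
    the utility of the warning (dummy heuristics here)."""
    return {
        "p_score": min(1.0, len(data_sent) / 500),  # more data forwarded -> more risk
        "u_score": 0.0 if warning == "No issues found." else 0.9,
    }

corpus = {"confidential": "Policy: never disclose confidential information."}
draft = "Leaking this confidential memo about my boss!"

query = extract_risk(draft)
precedents = retrieve_precedents(query, corpus)
warning = generate_warning(draft, precedents)
scores = evaluate(warning, draft)
```

The point of the structure is that only the output of stage 1 (the distilled query), not the raw profile, drives retrieval, which is what keeps the privacy exposure bounded.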

Three Privacy Modes

The core of our privacy-preserving approach is user control. The system operates in three modes that dictate what data is forwarded through the pipeline. 

Mode | Data Sent to System | Privacy | Utility
Anonymous | Comment text only; no role, no employer, no history | High — poster nearly unidentifiable | Lower — generic warnings
Contextual | Comment + platform + job role | Medium — role narrows the field | Medium — role-specific warnings
Full Profile | Comment + role + employer + recent history | Low — nearly identifiable | High — employer-policy-specific warnings
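
In effect, each mode is an allow-list over profile fields. A minimal sketch of that idea (the field names and the `payload_for` helper are illustrative, not taken from the actual system):

```python
# Which profile fields are forwarded in each privacy mode (illustrative names).
MODE_FIELDS = {
    "anonymous":  {"comment"},
    "contextual": {"comment", "platform", "job_role"},
    "full":       {"comment", "platform", "job_role", "employer", "history"},
}

def payload_for(mode: str, profile: dict) -> dict:
    """Strip the profile down to only the fields the chosen mode allows."""
    allowed = MODE_FIELDS[mode]
    return {k: v for k, v in profile.items() if k in allowed}

profile = {
    "comment": "My employer is the worst...",
    "platform": "X",
    "job_role": "nurse",
    "employer": "General Hospital",
    "history": ["earlier post"],
}

anonymous_payload = payload_for("anonymous", profile)    # comment text only
contextual_payload = payload_for("contextual", profile)  # adds platform + role
```

Everything downstream of this filter never sees the withheld fields, which is what makes the privacy guarantee a property of the pipeline rather than a promise.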

Evaluation Results

The evaluation of our moderation and warning system demonstrates a highly effective balance between accuracy, user intent preservation, and privacy. To understand the system's true performance, we analyzed it across four core dimensions: Privacy vs. Utility, RAG Retrieval Accuracy, Severity Classification, and Warning Quality. Here is a breakdown of the metrics we used and why they matter.

Privacy vs. Utility

Protecting user identity is just as important as providing accurate warnings. By calculating the Relative Utility Threat (RUT) score, adapted from Soonseok Kim's 2025 MDPI Electronics paper, "Quantitative Metrics for Balancing Privacy and Utility in Pseudonymized Big Data", we proved mathematically that Contextual mode (RUT: 0.824) delivers higher AI accuracy than Full Profile Mode, while exposing only 46% of the relative privacy risk.

Mode | RUT Score | Utility | Re-id Risk | Interpretation
Anonymous | 0.908 | 88 | 0.05 | Excellent — high utility, almost no re-id risk
Contextual | 0.824 | 94 | 0.35 | Best balance — recommended deployment threshold
Full Profile | 0.652 | 92 | 0.75 | Utility gain does not justify the massive privacy cost

The RUT framework, which transforms various risk and utility metrics into a unified, probabilistic scale, validates that feeding the LLM maximum personal data ("Full Profile") yields diminishing returns. Contextual mode sits right at the optimal deployment threshold, giving the AI just enough context, like the job role and platform, to generate highly specific warnings without sacrificing anonymity.

RAG Retrieval Accuracy

To evaluate RAG Retrieval Accuracy, the system was tested on 15 real-world incident cases against a mixed corpus. The results demonstrate highly effective document sourcing, achieving a Hit Rate@1 score of 0.80, meaning that the correct article ranked first in 12 out of the 15 cases. Furthermore, the system achieved perfect retrieval with a Hit Rate@3 score of 1.00, ensuring that the relevant article was always surfaced within the top three results.

Metric | Score | What it means
Hit Rate@1 | 0.80 | Correct article ranked first in 12/15 cases
Hit Rate@3 | 1.00 | Correct article always in top 3 — perfect retrieval
Mean Reciprocal Rank | 0.90 | Average rank position is very high

This strong performance is reinforced by a Mean Reciprocal Rank of 0.90, which confirms that the average rank position of the correct information remains consistently high across all queries. 
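
Both metrics follow directly from the rank at which the correct article appeared for each test case. A quick sketch (the `ranks` list below is invented to reproduce the reported scores; it is not the team's actual data):

```python
def hit_rate_at_k(ranks, k):
    """Fraction of queries whose correct document appears in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def mean_reciprocal_rank(ranks):
    """Average of 1/rank of the correct document across queries."""
    return sum(1 / r for r in ranks) / len(ranks)

# Rank of the correct article for each of 15 cases (1 = ranked first).
# Hypothetical values chosen so the metrics match the reported scores:
# 12 cases at rank 1 and 3 cases at rank 2.
ranks = [1] * 12 + [2] * 3

print(hit_rate_at_k(ranks, 1))      # 0.8
print(hit_rate_at_k(ranks, 3))      # 1.0
print(mean_reciprocal_rank(ranks))  # 0.9
```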

Severity Classification

The system prioritizes user trust and accuracy in Severity Classification, which measures its binary classification performance for detecting high and critical violations. It achieved a perfect Precision score of 1.000, guaranteeing zero false alarms so the system never wrongly warns a safe comment. 

Metric | Score | Interpretation
Precision | 1.000 | Zero false alarms — the system never wrongly warns a safe comment
Recall | 0.533 | 7 cases under-classified — the system is intentionally conservative
F1 Score | 0.696 | Overall classification quality

The overall classification quality is represented by an F1 Score of 0.696. The Recall score of 0.533 reflects 7 cases that were under-classified. However, this is a deliberate design choice to make sure the system remains conservative rather than over-restrictive. 
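
These scores are consistent with 15 truly high/critical cases of which 8 were flagged, 7 were missed, and none were falsely alarmed. A quick check of that arithmetic (the counts are inferred from the text, not read off the dataset):

```python
def precision_recall_f1(tp, fp, fn):
    """Standard binary classification metrics."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# 15 true high/critical cases: 8 flagged, 7 under-classified, 0 false alarms.
p, r, f1 = precision_recall_f1(tp=8, fp=0, fn=7)
print(round(p, 3), round(r, 3), round(f1, 3))  # 1.0 0.533 0.696
```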

Warning Quality

Traditional metrics like exact-match or BLEU scores fail to capture the nuance of rewritten text. For this reason, we used an "LLM-as-Judge" framework to score the qualitative aspects of the AI's output on a 5-point scale. This allowed us to measure subjective dimensions like Relevance, Policy Accuracy, Rewrite Safety, and Prevention Impact at scale.

Dimension | Mean Score | What it measures
Relevance | 4.67 | Does the warning correctly identify the actual violation?
Policy Accuracy | 4.67 | Does it cite the correct policy or law for this specific case?
Rewrite Safety | 4.67 | Does the rewrite preserve intent while removing the risk?
Prevention Impact | 4.67 | Would this warning likely have prevented the real firing?
Overall | 4.67 | Holistic quality score

The overall mean score of 4.67/5 confirms the system acts as a helpful, accurate coach: it rewrites drafts to preserve the user's original intent while effectively neutralizing the career risk.

Looking Forward

The internet doesn't have to be a trap door. With PostGuard, we proved that we can give users specific and potentially career-saving warnings without turning AI tools into surveillance machines.

We are incredibly grateful to the CSGS organizers for putting together such a challenging and rewarding event. Earning first place in the PhD category was the peak of a long, exhausting, and incredibly fun week of research. 

Thanks for reading, and watch what you comment!

~Dominik Soós (@DomSoos)

Open Refine: Blanking Down Only Within Records / Library | Ruth Kitchin Tillman

This is the first in what will be a series of posts deriving from my most recent use of OpenRefine. This post is primarily for intermediate or advanced OpenRefine users, so I won’t be going over fundamentals. Beginners who are already familiar with the power of “Blank down” and “Transform” should also be ok.

One of the great things about OpenRefine is that you can take a spreadsheet with a repeating key field, apply the “Blank down” function on that column, and unlock a “record” experience. I find this really helpful when I need to combine MARC data with item data, a 1:many relationship. I can facet the holding item library to a particular campus, for example, and see entire records, not just the item row.

While only the key needs to be blanked down to create the record view, I can improve my experience working with the data by blanking down other repeating fields so I’m only seeing one instance of any given field.

A screenshot of OpenRefine in which the record IDs, titles, and URLs for each record have been cleaned up into unique rows for each record while entries for every item can be seen on the right.

The problem is that while keys don’t repeat, other data may. If I’ve got a column for notes and I’m working on a set of related records, multiple adjacent records may have the same note text. Blanking down means that the note text is only left in the first record of the sequence, until there’s a record where that field is genuinely blank or which has a different text.

I’ve previously handled this problem by counting how many rows were impacted by my initial blank down and noting how many were impacted by blank downs on each subsequent column. If the number is higher, I undo my action and simply leave the field duplicated in each row of the record. But I don’t like it.

The Solution

While working on a massive data review project this spring, I became frustrated with repeating data which I couldn’t blank down. I decided to search around online to see if others have found ways to handle it. I found the solution in a 10-year-old thread on the OpenRefine Google Group:

Once you’ve turned your rows into records, apply the following cell transform to each column you want to blank down:

value + " - " + row.record.index

Now, perform the “Blank down” operation.

Get your original content back with a second cell transform:

value.replace(/ - \d+$/,'')

The (Brief) Explanation

It’s important to note here that you must have already performed the initial “Blank down” operation on your column with keys to turn your rows into records. Otherwise, each row will be treated as its own record. Once you have records, though, they’ll be treated as records by this transformation whether you’re currently viewing the data as rows or records.

The first transformation uses the simple string join GREL syntax to join the original value, a " - ", and that row's record index.

Something like:

Collection Name
Pennsylvania German broadsides and Fraktur
Pennsylvania German broadsides and Fraktur
Pennsylvania German broadsides and Fraktur
Pennsylvania German broadsides and Fraktur
Pennsylvania German broadsides and Fraktur
Pennsylvania German broadsides and Fraktur
Pennsylvania German broadsides and Fraktur
Pennsylvania German broadsides and Fraktur
Pennsylvania German broadsides and Fraktur

becomes:

Collection Name
Pennsylvania German broadsides and Fraktur - 0
Pennsylvania German broadsides and Fraktur - 0
Pennsylvania German broadsides and Fraktur - 0
Pennsylvania German broadsides and Fraktur - 0
Pennsylvania German broadsides and Fraktur - 1
Pennsylvania German broadsides and Fraktur - 1
Pennsylvania German broadsides and Fraktur - 2
Pennsylvania German broadsides and Fraktur - 2
Pennsylvania German broadsides and Fraktur - 2

and when I apply the blank down function, I get:

Collection Name
Pennsylvania German broadsides and Fraktur - 0



Pennsylvania German broadsides and Fraktur - 1

Pennsylvania German broadsides and Fraktur - 2


(This is just a one-column view; there's an assumed leftmost column with the record IDs.)

The second transformation is a simple regex replace, looking for the delimiter (" - ") and a string of one or more numbers up to the end of the cell value. Because we've put a right-anchor on it, it should only match the addition we made, even if the string happens to contain its own " - 92951" (unless you've turned on repeating).

This simple pair of transformations and the ability to blank down without losing data has really improved my experience of working with the large data exports.
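
The same round trip is easy to simulate outside OpenRefine. This toy Python version (my own sketch, not OpenRefine code) tags each value with its record index, blanks down repeats, then strips the suffix:

```python
import re

# Nine rows of an identical collection name, spread across three records,
# mirroring the example above.
name = "Pennsylvania German broadsides and Fraktur"
rows = [name] * 9
record_index = [0, 0, 0, 0, 1, 1, 2, 2, 2]  # which record each row belongs to

# Transform 1: append " - <record index>" so identical values in
# different records become distinct strings.
tagged = [f"{v} - {i}" for v, i in zip(rows, record_index)]

# "Blank down": clear any cell identical to the one above it.
blanked = [v if i == 0 or v != tagged[i - 1] else "" for i, v in enumerate(tagged)]

# Transform 2: strip the " - <digits>" suffix to restore the original text.
restored = [re.sub(r" - \d+$", "", v) for v in blanked]
```

After the round trip, the name survives exactly once per record (rows 0, 4, and 6) and every other cell is blank, which is precisely the record view the GREL pair produces.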

It’s a bit cumbersome to repeat them each time, so stay tuned for my next blog post: “I can’t believe I’ve been using OpenRefine for a dozen years and only just learned how easy it is to repeat a set of operations.”

Zotero 9 / Zotero

We’re excited to announce Zotero 9, which introduces a major new way to engage with your documents, along with a host of other improvements to the research workflow.

Coming less than three months after Zotero 8, Zotero 9 is the first major update since we announced a faster release cycle for Zotero, and it represents our commitment to getting stable features into the hands of users more quickly.

Read Aloud

Read Aloud reads your documents to you in high-quality, natural-sounding voices. It works on PDFs, EPUBs, and webpage snapshots.

To get started, just click the headphones button in Zotero’s reader toolbar.

As you’re listening, you can skip forward or backward by paragraph or by sentence (Option/Alt-click or Option/Alt-left/right), and you can start reading from a particular point by clicking in the left margin or by right-clicking and choosing “Read Aloud from Here”.

An “Annotate Sentence” button — or H or U on your keyboard — will automatically highlight or underline the last sentence you heard, and you can use shortcut keys to quickly move, expand, or delete the new annotation.

Your last reading position is saved and synced between devices, so you can pick up where you left off on any device.

Read Aloud requires an internet connection and a Zotero account for high-quality voices, which we’re calling Zotero Voices. If you’d like to use Read Aloud offline, you can still use the text-to-speech voices available on your system, but the quality will be significantly degraded.

We offer two tiers of Zotero Voices: Standard and Premium. Standard voices are generated on Zotero servers, with unlimited minutes for Zotero Storage subscribers and 2 hours/month for free accounts. Premium voices are the highest-quality voices available, processed by external text-to-speech providers — they make fewer mistakes, sound more natural, support many more languages, and can handle multilingual text. Individual Zotero Storage subscribers will receive up to 2 hours of Premium usage (varying by voice) each month, and free and institutional accounts will also receive a small quota in order to try the voices out. Initially, all subscribers can request additional Premium minutes for free. In the near future, we’ll provide more details on monthly allocations and options for adding additional minutes going forward.

Read Aloud is currently available only in the desktop app, but it’ll be coming to the iOS and Android apps soon.

Recently Read

A new Recently Read collection at the top of the collections list in each library shows items with attachments you’ve recently read, most recent first. Opening an attachment or changing pages will bump the item to the top of the list.

The collection includes a Last Read column, which you can also add to other views, and “Attachment Last Read” is available as an Advanced Search condition.

The last-read time syncs between devices, so you can quickly find a file on another device that you’ve read elsewhere.

Insert Annotations Directly into Word Processor Documents

The word processor plugins now feature a new Add Annotation button that lets you insert one or more annotations directly into your document, with active Zotero citations that automatically generate bibliography entries.

Previously, you could add annotations to Zotero notes and insert notes into your document with active citations, but it’s no longer necessary to create an intermediate note if you prefer to work directly in your word processor.

Add Annotation opens a new mode in the citation dialog that expands attachments to show individual annotations. You can browse or search for annotations, choose one or more, and then add them to your document, along with active citations and any comments you added. Image and ink annotations are inserted as images.

“Added By” and “Modified By” for Group Libraries

You can now add “Added By” and “Modified By” columns to the items list in group libraries, letting you see and sort by the people who created and last updated items.

These fields also show in the metadata list along with Date Added/Modified.

Per-Group File Renaming Settings

Group admins can now configure file-renaming settings for each group library, ensuring consistent filenames for all group members.

Performance Improvements

We’ve made some major improvements to Zotero’s performance, including reducing startup memory usage by 20% and drastically reducing disk access and network requests during file syncing in some situations.

On macOS, Zotero now uses a feature of the modern Apple filesystem to avoid additional disk-space usage when copying files. This includes the automatic daily backups Zotero makes of its database, potentially saving hundreds of megabytes or gigabytes of local disk space.

Web-Based Login

You now log in to your Zotero account via the browser instead of entering credentials in the app. This allows you to use a password manager to auto-fill credentials and will enable two-factor authentication (currently in beta), greatly increasing the security of your Zotero account.

Other Improvements

See the changelog for the full list of changes.

Get Zotero 9

If you’re already running Zotero, you can upgrade from within Zotero by going to Help → “Check for Updates…”.

Don’t yet have Zotero? Download Zotero 9 now.

Knowledge as Critical Digital Infrastructure: A Call to Action for a Resilient Future / Open Knowledge Foundation

You can also read and share this story in Spanish and Portuguese. Knowledge is the foundation of every modern society, underpinning democracy, driving innovation, and strengthening our collective culture. In the digital age, this bedrock is critical digital infrastructure (CDI), the essential software, standards, data systems, and information that provide public functions upon which society depends. By recognizing...

The post Knowledge as Critical Digital Infrastructure: A Call to Action for a Resilient Future first appeared on Open Knowledge Blog.

Happy 10th Birthday TinyCat! / LibraryThing (Thingology)

Ten years ago we created TinyCat, a catalog solution for small libraries. 

The idea was simple: Thousands of small libraries were already using LibraryThing, but it was too much—too many doors to the larger world of discovery and social interaction. So we cut all that out, and we made everything about finding books. We added the circulation and administration features small libraries need. We made the simple and intuitive user interface we wished that big, public libraries had.

Growth and Improvements

From about 500 libraries in our first year, we have grown to more than 3,500 today—from schools to religious communities, from museums and LGBTQ centers, to a growing number of small public libraries. Another 32,516 LibraryThing members use TinyCat as a second way to search their personal libraries.

The last decade has brought TinyCat to countries across the globe, from the USA to Ireland, Singapore, Peru, Egypt, and South Africa to name a few. We recently added translation options, so more people can now use TinyCat in more languages.

There are many small libraries that still don’t know about TinyCat, and we can use your help spreading the word. If you know anyone working or volunteering with a small library, send them our way. Folks can sign up or learn more about TinyCat at https://www.librarycat.org

TinyCat Survey

Share your thoughts about TinyCat in our new survey!

This survey is short and every question is optional. You can save your progress and come back any time. While the survey is not anonymous, we encourage you to be honest and candid — all your feedback helps us improve!

This is a great chance to share your experiences as an admin, feedback from your patrons, and ideas you have for making TinyCat even better. All current and previous TinyCat members are welcome to participate.

New Feature: Restrict Catalog Access

TinyCat libraries now have the option of requiring a login to view their catalog. If enabled, all visitors will be prompted to enter their credentials before seeing the homepage or searching the catalog.

To enable this for your TinyCat library, go to your Patron Accounts settings and switch on Restrict Catalog Access. 


Enter the Giveaway!

Enter our TinyCat 10th Birthday Giveaway for a chance to win a special TinyCat bookmark and TinyCat stickers!

How to enter: 

  1. Share one of your favorite or most popular items from your TinyCat library on social media. Alternatively, you’re welcome to post in the TinyCat group on LibraryThing. 
  2. Include a screenshot or link to the item’s catalog page so we can see it like your patrons do. 
  3. Tag TinyCat in the post so we can find your post and enter you in the giveaway.

(Not just books! Game libraries can share a game, movie libraries a film, etc.)

Deadline:

We will randomly select 10 winners on May 8, 2026. Each will receive a special TinyCat bookmark and stickers. We will contact the winners for mailing details.

Store Sale

All TinyCat merchandise and barcode scanners, including stickers, pins, and coasters, are on sale now through May 4. Barcode scanners are only $5 and TinyCat pins are only $2! See all the discounts at the LibraryThing Store.

Profile Badges

TinyCat libraries celebrating 1, 2, 5, and 10 years with us will receive new badges in LibraryThing. Go ahead, brag!

Thank you for reading and for celebrating with us! 

2026-03-17: The Disintegration Loops: Generational Loss in Web Archives / Web Science and Digital Libraries (WS-DL) Group at Old Dominion University



Michael L. Nelson




As part of the Internet Archive's Information Stewardship Forum (March 18–20, 2026), I decided to use my five minute lightning talk to raise the issue of generational loss in web archives.  Or more directly, making copies of copies (...of copies…) – something that web archives currently do not do well.  My title is based on William Basinski's four volume release "The Disintegration Loops", in which he played the audio tapes of "found sounds", recorded decades earlier, in loops, with the whole process lasting over an hour.  The effect is hauntingly beautiful, with each loop slightly degrading the magnetic tape, resulting in a generational loss.  The degradation of each loop is right on the edge of the just-noticeable difference, until the entire track is reduced to just a shadow of its former self.


I first discussed this topic in my 2019 CNI closing keynote (slide 88), where I introduced the inability of web archives to archive other web archives as part of the larger issue of web archive interoperability. Let's begin by walking through the example of archiving a tweet (which we already know to be challenging!).  The original tweet is still on the live web, even though the UI has undergone many revisions since it was originally tweeted in 2018.  


https://twitter.com/phonedude_mln/status/990054945457147904 

(screen shot from 2026-03-17)



I archived that tweet to the Internet Archive's Wayback Machine in 2018 (screen shot from 2019):


https://web.archive.org/web/20180501125952/https://twitter.com/phonedude_mln/status/990054945457147904 


I then archived the Wayback Machine's copy of the tweet to archive.today in 2019 (screen shot from 2019):

https://archive.ph/PaKx6 


Note that archive.today is aware that the page comes from the Wayback Machine but the original host is twitter.com, and it maintains both the original Memento-Datetime (20180501125952) as well as its own Memento-Datetime (20190407023141).  I then archived archive.today's memento to perma.cc in 2019 (screen shot from 2019):



https://perma.cc/3HMS-TB59 


Finally, I archived the perma.cc memento back to the Wayback Machine in 2019 (screen shot from 2019):


https://web.archive.org/web/20190407024654/https://perma.cc/3HMS-TB59 


Although the loss occurs in discrete chunks, it is reminiscent of Basinski's Disintegration Loops, with information lost at each step, and the final version being a mere shadow of the original.  In 2019, this was not universally recognized as a problem, since archiving the playback interface of another web archive was not considered a problem in itself.  The "right" solution, of course, is to share the WARC files (or WACZ, or HAR, or…) out-of-band and let the other web archives replay from the same source files.  But this is rarely possible: for a variety of reasons web archives typically do not share their original WARC files, and archive.today might not even store the original source files (instead likely storing only the radically transformed pages).  
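As an aside, the Memento-Datetime values mentioned above are machine-readable in two forms: the 14-digit timestamps embedded in Wayback-style URLs and the RFC 1123 dates carried in the Memento-Datetime HTTP header (RFC 7089) encode the same capture time. A minimal sketch of converting between the two, using only the Python standard library (the function names here are mine, not from any archive's API):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def parse_wayback_ts(ts14: str) -> datetime:
    """Parse the 14-digit YYYYMMDDHHMMSS timestamp used in Wayback-style URLs."""
    return datetime.strptime(ts14, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)

def parse_memento_datetime(header_value: str) -> datetime:
    """Parse an RFC 1123 Memento-Datetime header value (RFC 7089)."""
    return parsedate_to_datetime(header_value)

# The two capture times archive.today preserved for the tweet above:
original = parse_wayback_ts("20180501125952")
rearchived = parse_wayback_ts("20190407023141")

# The URL timestamp and the header form denote the same instant:
assert original == parse_memento_datetime("Tue, 01 May 2018 12:59:52 GMT")
# The copy of the copy is almost a year younger than the original capture:
assert rearchived > original
```

Keeping both datetimes, as archive.today does, is what lets a reader distinguish "when the page was captured" from "when this archive captured another archive's replay of it."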


More importantly, it is sometimes useful to archive a particular web archive's replay of a page, because that replay itself changes through time and so must itself be archived. For example, memento #3 (the perma.cc memento of archive.today's memento) is now different; this is a screen shot from 2026:


2026 replay of https://perma.cc/3HMS-TB59 


Surely the source files themselves have not changed; the difference is due to improvements in pywb, which is under constant development. perma.cc's 2019 replay of the 2019 page is different from its 2026 replay, which implies that it could be different still in the future. But we currently cannot archive perma.cc's replay of that page to, say, the Wayback Machine without generational loss.  The fact that screen shots – which are rife with their own potential for abuse (cf. HT 2025, arXiv 2022) – are the only mechanism for documenting these replay differences underscores the web archive interoperability problem.
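Short of screen shots, one could imagine fingerprinting the served markup to make replay differences machine-comparable. The sketch below is hypothetical and deliberately naive – real replay drift involves rewritten URLs, injected banners, and evolving replay engines, which collapsing whitespace cannot account for – but it illustrates the idea of a comparable record of a replay:

```python
import hashlib
import re

def replay_fingerprint(html: str) -> str:
    """Hypothetical, very naive fingerprint of a replayed page: collapse
    whitespace so cosmetic reformatting doesn't mask real content changes,
    then hash the result. A real tool would need to strip replay-engine
    rewriting (banners, rewritten URLs) before hashing."""
    normalized = re.sub(r"\s+", " ", html).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

replay_before = "<p>Comment as: Nora</p>"
replay_altered = "<p>Comment as: Jani Patokallio</p>"

# A content edit changes the fingerprint...
assert replay_fingerprint(replay_before) != replay_fingerprint(replay_altered)
# ...but purely cosmetic whitespace differences do not:
assert replay_fingerprint("<p>Comment as:   Nora</p>") == replay_fingerprint(replay_before)
```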


I chose the topic of generational loss for my slot at the Information Stewardship Forum because recent events have introduced a new use case for archiving the replay of web archives. Wikipedia recently announced it was blacklisting archive.today because its editors discovered that the webmaster of archive.today was using its captcha to direct a DDoS attack against a blog owned by someone the webmaster had a dispute with (the blogger had posted a lengthy investigation of the webmaster's identity), and, more disturbingly for our discussion, had edited the content of an archived page to include the name of the blogger where it would not otherwise appear.  The Wikipedia discussion page is hard to follow, in part because the editors are discussing how to archive the replay of an archived page.  For one example, they show how the archive.today replay has now been changed back to "Comment as: Nora " (middle of the image):



But the replay alteration from archive.today in question is archived at megalodon.jp, showing that the name "Nora " was replaced with the name of the blogger who had earned the webmaster's ire, "Jani Patokallio". And yes, megalodon.jp's replay of archive.today's memento is that bad (at least in my browser, it is shrunk down impossibly small), so I used the dev tools to find the string in question. 


https://megalodon.jp/2026-0219-1509-14/https://archive.is:443/2021.05.30-173350/http://www.maskofzion.com/2012/04/jewish-at-root-iraqs-destruction-hell.html


Another Wikipedian archived (using yet another archive, ghostarchive.org) a google.com SERP to show that archive.today has reverted from "Jani Patokallio" back to "Nora ". 





What does changing "Nora" to "Jani" (and then changing it back again) accomplish? I'm not sure; this appears to be just a petty response to an ongoing dispute.  But the implication is profound: this is the first known example of a major web archive purposefully and maliciously altering its contents, something that we knew was possible but had not yet experienced.  


We have long known that replay can change through time (cf. PLOS One 2023) due to the replay engine (the Wayback Machine, Open Wayback, pywb, etc.) evolving, but these changes were engineering results and the replay mostly improved over time. But now we have seen web archives maliciously alter (and then revert) the replay, and we need a more standard and interoperable way to archive archival replay.  Not just to prove that a web archive did alter its replay, but also to prove that an archive did not alter its replay.  Out-of-band sharing of WARC files is the gold standard, but for a variety of reasons this is unlikely to happen.  We must be able to use web archives to verify and validate web archives.  We explored a heavyweight design for this a few years ago (JCDL 2019), but it should be revisited in light of developments like WACZ.  
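WARC metadata already carries raw material for this kind of verification: a record can include a WARC-Payload-Digest, conventionally a labelled, base32-encoded SHA-1 of the captured payload. A minimal sketch of that fixity check (function names are mine; a real verifier would also have to separate the original payload from replay-time rewriting before comparing):

```python
import base64
import hashlib

def warc_sha1_digest(payload: bytes) -> str:
    """Compute a digest in the labelled form WARC records use for
    WARC-Payload-Digest: 'sha1:' plus the base32-encoded SHA-1 of the payload."""
    return "sha1:" + base64.b32encode(hashlib.sha1(payload).digest()).decode("ascii")

def payload_unaltered(payload: bytes, recorded_digest: str) -> bool:
    """True if the payload still matches the digest recorded at capture time."""
    return warc_sha1_digest(payload) == recorded_digest

captured = b"<html><body>Comment as: Nora</body></html>"
digest_at_capture = warc_sha1_digest(captured)

# An unmodified payload verifies against the capture-time digest...
assert payload_unaltered(captured, digest_at_capture)
# ...while a maliciously edited payload does not:
tampered = b"<html><body>Comment as: Jani Patokallio</body></html>"
assert not payload_unaltered(tampered, digest_at_capture)
```

If archives published such digests alongside replay in an interoperable way, a third party could at least detect when the stored payload no longer matches what was originally captured – whether the cause is malice or an engineering change.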


–Michael


ht to Herbert Van de Sompel for introducing me to "The Disintegration Loops" many years ago.


2026-03-25: The original Google/Blogger name ("Nora") has been anonymized.

Fear and Burnout At the Interface of Librarianship and Māori Knowledge in Aotearoa New Zealand / In the Library, With the Lead Pipe

By Kathryn Oxborrow Vambe

In Brief: This article presents some findings from a study of non-Māori librarian engagement with Māori knowledge. I asked non-Māori librarians (who predominantly identified as New Zealand European, a local synonym for White) about their journeys of learning and engagement, and Māori librarians about their experiences with their non-Māori colleagues’ engagement (or lack thereof). A key theme was fear on the part of non-Māori librarians. This acted as a barrier to engagement for non-Māori participants and created extra work for Māori librarians who were expected to pick up tasks related to Māori people and culture that their non-Māori colleagues declined to undertake. I suggest that when it comes to Māori knowledge, non-Māori librarians need to feel the fear and persevere, as well as being proactive in acting as good allies to their Māori colleagues.

Introduction

Kia ora (this is a greeting in te reo Māori, the Māori language – definition from Te Huia, 2016) from Aotearoa New Zealand. I am a White British cisgender heterosexual woman of English and Scottish heritage. I was born in England but have lived in Aotearoa since 2010. I have spent the majority of the last twenty years working in and around libraries, and in 2020 I completed my PhD through Victoria University of Wellington | Te Herenga Waka. When I moved to Aotearoa I was enthusiastic to learn what I could about Māori culture, customs and language, and was interested to observe the lack of enthusiasm in some of my non-Māori colleagues. When I had the opportunity to undertake PhD research, I decided to explore how non-Māori librarians learn about and engage with Māori knowledge. In this article I will focus on one key finding: the prevalence of fear in non-Māori librarians’ journeys of learning and engagement with Māori knowledge, and the related phenomenon of non-Māori overreliance on Māori colleagues in relation to basic library tasks involving engagement with Māori people or knowledge. I conclude by considering ways that non-Māori librarians can push through their fear and become good allies to their Māori colleagues.   

A note on terminology

In this article I discuss Māori people and culture and as such will include words and phrases in te reo Māori. These will be accompanied by English language definitions in brackets after their first use, with the exception of the words Māori and te reo Māori which will be used henceforward without translation. Definitions are taken from Te Aka Māori Dictionary online unless otherwise stated. Government departments, organisations and projects in Aotearoa are often known by both their English and Māori names, and the two names do not always represent a direct translation of each other. When referring to these I use both Māori and English names, separated with a |. Whether the English or Māori term is given first is decided by common usage.

In the original research thesis on which this article is based, I used the term mātauranga Māori, a term with a breadth and depth of meaning. The definition I used was from Mead (2012) who described it as “…Māori knowledge complete with its values and attitudes” (p. 9). For the purposes of clarity, I will be using the term Māori knowledge throughout. A more detailed discussion of the term mātauranga Māori in relation to this research can be found in Oxborrow (2020, pp. 18-20). 

In this article I refer to the country of New Zealand by its Māori name, Aotearoa, or the combined name Aotearoa New Zealand. I use these terms interchangeably throughout this article. I only use the term New Zealand on its own when referring to the mainstream or majority culture, or where it is used by participants or cited authors.   

Local Context and Literature Review

Aotearoa New Zealand

Aotearoa New Zealand is a former British colony in the South West Pacific. The Indigenous Māori people made up 17.8% of the population in the 2023 census (Stats NZ | Tatauranga Aotearoa, 2024a). The largest ethnic group is New Zealand European (Stats NZ, 2024a). Ancestors of Māori settled in Aotearoa at least 500 years before Europeans began arriving in the early 1800s (Royal, 2012). In 1840 Te Tiriti o Waitangi | The Treaty of Waitangi was signed as an agreement between over 500 Māori chiefs and representatives of the British Crown, setting the terms for the relationship between Māori and the growing settler population, and signalling the beginning of modern New Zealand (Orange, 2023). Breaches of Te Tiriti o Waitangi | The Treaty of Waitangi by the Crown began happening shortly after its signing, leading to Māori being dispossessed of the majority of their land (Orange, 2023). While there has been some redress for this, including a permanent Tribunal reporting on historical and present-day breaches of Te Tiriti o Waitangi | The Treaty of Waitangi (Waitangi Tribunal | Te Rōpū Whakamana i te Tiriti o Waitangi, n.d.), the impacts of this for Māori are ongoing. These include disproportionate representation in health and social statistics including incarceration rates (Department of Corrections | Ara Poutama Aotearoa, 2025) and early death (Stats NZ | Tatauranga Aotearoa, 2024b). While there has been movement in recent years in relation to the visibility and acceptability of Māori language and culture, and increased Māori participation in public life, there are non-Māori in Aotearoa who feel very uncomfortable about these changes. An example of this is the New Zealand Centre for Political Research blog, whose contributors often complain of “tribal takeover” (e.g. Newman, 2025, paragraphs 13, 50, 53). 
This political pushback has been seen through supporters of the 2023 Coalition Government, whose legislative agenda has been described by commentators as strongly anti-Māori (see, for example, Clark & Hill, 2024; Paewai, 2025).

Members of the local White settler population in Aotearoa use different terms to describe their ethnicity or cultural identity. New Zealand European/European New Zealander is a commonly used term (Allan, 2001). Another term that is used for this ethnicity is Pākehā, a Māori term, the original meaning of which is not universally agreed on, as per the following definition from Te Aka Māori Dictionary:

New Zealander of European descent – probably originally applied to English-speaking Europeans living in Aotearoa/New Zealand. According to Mohi Tūrei, an acknowledged expert in Ngāti Porou1 tribal lore, the term is a shortened form of pakepakehā, which was a Māori rendition of a word or words remembered from a chant used in a very early visit by foreign sailors for raising their anchor … Others claim that pakepakehā was another name for tūrehu2 or patupairehe3. Despite the claims of some non-Māori speakers, the term does not normally have negative connotations. (Moorfield, n.d.)

In my time in Aotearoa I have also heard various definitions from Māori colleagues, including some who have told me the word can be used to describe all non-Māori. The nuances within the word Pākehā range from those who may believe the term to be offensive (as described in Black, 2010), to those who believe it denotes historical and spiritual connection to the physical environment of Aotearoa (Dyson, 2001; King, 2004) or an individual’s continued efforts to engage with te ao Māori (the Māori world, definition from Te Huia, 2016) (e.g. Jones, 2020). Non-Māori of any ethnicity may choose to identify themselves as Tangata Tiriti, who Bell (2024) describes as “…non-Māori people who are guided by a sense of their relationship to te Tiriti o Waitangi / the Treaty of Waitangi and to te ao Māori in their work” (p.1). Due to these multiple understandings of the term Pākehā and the fact that not all interviewees identified as New Zealand European, I use the term non-Māori in this article to refer to interviewees and any other individuals in the context of Aotearoa New Zealand who do not identify as Māori. 

Librarianship in Aotearoa New Zealand

At the time of the 2023 census of Aotearoa New Zealand, 5,730 individuals reported working in libraries across the country, including librarians, library assistants and library technicians (Stats NZ | Tatauranga Aotearoa, 2023). However, in 2025 only 853 were members of the Library and Information Association of New Zealand Aotearoa | Te Rau Herenga o Aotearoa (hereafter referred to as LIANZA), the largest professional association for librarians in Aotearoa (LIANZA, 2025a). 

Due to the small population of the profession and low levels of wider professional involvement, librarianship in Aotearoa New Zealand is more of a generalist occupation than in some other larger countries such as the United States of America. There are only three tertiary institutions offering qualifications for librarianship in Aotearoa: Victoria University of Wellington | Te Herenga Waka, Open Polytechnic | Kuratini Tūwhera, and Te Wānanga o Raukawa (one of three Māori-led tertiary education institutions in Aotearoa). Of these institutions, only Victoria University of Wellington | Te Herenga Waka offers postgraduate programmes including Master’s and PhD. It is not always a requirement to hold a qualification in librarianship to be appointed to a professional library role in Aotearoa, and there is no expectation for subject support librarians in tertiary education libraries in Aotearoa to hold or work towards a PhD in their area of subject specialism. Librarians in Aotearoa New Zealand do not always specialise in a particular type of library work and it is common for library professionals to work in various different types of libraries across their careers (see, for example, Stone, 2013). I am an extreme example of this, having worked in four different types of libraries as well as in librarian professional education over the course of the last fifteen years.

LIANZA   

In their history of LIANZA, Millen (2010) writes: “Looking back, the most notable – even radical – developments [in LIANZA] of the past thirty years have been the progress made in the area of biculturalism” (p. 172). According to the literature, the library and information profession in Aotearoa is one of the professions which has demonstrated a commitment to engaging with mātauranga Māori from relatively early on in its history. Lilley (2013) states that the first mention of library services for Māori is in 1962 when the Māori Library Services Committee was formed to recommend strategies to libraries to help them engage with Māori. The report of the committee was produced in 1963 and published in the association’s publication New Zealand Libraries (Maori Library Service Committee, 1963). Millen (2010) states that focus on Māori issues began to strengthen within LIANZA in the 1980s. Both Lilley (2013) and Millen (2010) point out that the updating of the Treaty of Waitangi Act in 1985 to enable the Waitangi Tribunal | Te Rōpū Whakamana i te Tiriti o Waitangi to accept retrospective claims from as far back as 1840 led to increased use of libraries by Māori who used them to find evidence for their claims. Another key development in the profession in the 1980s which Millen highlights is the Saunders Report on education for librarianship in 1987. Te Rōpū Takawaenga, a group of students at Victoria University of Wellington, highlighted the lack of discussion of Māori culture and knowledge in the report and called for a profession-wide discussion. Te Rōpū Whakahau, the professional association for Māori in libraries and information management, was established in 1992, initially as a Special Interest Group of LIANZA, and later becoming an independent organisation (Lilley, 2013). LIANZA and Te Rōpū Whakahau have had a partnership agreement since 1995 (Lilley, 2013). 
This used to involve the inclusion of two Te Rōpū Whakahau representatives on the LIANZA Council, but in recent years has become a less-structured agreement with commitment to working together and honouring Te Tiriti o Waitangi | The Treaty of Waitangi (LIANZA & Te Rōpū Whakahau, 2024). 

Professional Registration

Professional Registration was introduced by LIANZA in 2007 (Millen, 2010). According to the LIANZA Taskforce on Professional Registration (2005), the scheme was established to act as a benchmark for professional learning and development both within the library and information profession in Aotearoa and also to be compatible with other Anglophone countries with similar schemes such as the UK and Australia. Registrants must demonstrate ongoing professional learning and development across the eleven elements of the Body of Knowledge (LIANZA, n.d.-b). Body of Knowledge Element 11 (BoK11) is “Awareness of indigenous knowledge paradigms, which in the New Zealand context refers to Māori” (LIANZA Professional Registration Board, 2013, p. 9). The scheme includes mandatory revalidation (LIANZA, n.d.-a). Every three years, registrants must submit a reflective journal detailing their professional learning and development which must include two entries relating to BoK11 (LIANZA, n.d.-a). If candidates do not revalidate their Registration, it lapses (LIANZA, 2020).  

Professional Registration has not gained the status of being a default requirement for employment in the library and information sector in Aotearoa as its instigators hoped. In a paper to LIANZA members, Steven Lulich, Chair of the Taskforce on Professional Registration, wrote “It is hoped that over the next two years, most of those working in the profession will join the scheme” (Lulich, 2007, p. 4). This has not come to pass, and it is now extremely rare to see a professional librarian role advertised in Aotearoa that lists Professional Registration as a requirement. The number of professionally registered librarians has been trending downwards for several years, with LIANZA reporting just 303 professionally registered librarians as of January 2026 (LIANZA, 2026). LIANZA is looking to increase interest in Professional Registration and the Bodies of Knowledge by incorporating them in the new Te Tōtara Workforce Capability Framework (LIANZA, n.d.-b) (Tōtara is a type of native tree).  

Research on Libraries and Indigenous Knowledge in Aotearoa

While there is tangible commitment to biculturalism and mātauranga Māori from the library and information profession in Aotearoa as represented by LIANZA and other professional groups, there are still a number of issues of concern related to library and information professionals’ engagement with these topics highlighted in the small body of literature addressing libraries and Indigenous knowledge in Aotearoa. 

Irwin and Katene (1989), in a study highlighting the dearth of tribal-specific information in libraries and the difficulties experienced by Māori trying to find that information, highlight the role to be played by libraries in partnering with Māori to alleviate some of the social disadvantages that they face. Irwin and Katene argue that knowledge is power and, therefore, access to knowledge is potential power. Social statistics at the time alluded to the fact that Māori were disempowered, and the authors argued that one possible reason for this is denial of access to knowledge. “Librarians are in a position of power where they can provide open access to knowledge, or they can deny this” (pp. 23-4). While this study is old and there is likely to have been some improvement in the intervening years, statistics show that Māori still experience greater levels of social disadvantage than non-Māori, as discussed above.

Tuhou (2011) identifies a number of barriers preventing Māori tertiary students from engaging with the university library. Many of these are physical, with one group likening the atmosphere of the library to a prison, but staff were also a factor. Tuhou recommends cultural awareness training for staff to help them engage appropriately with Māori students, and for all staff to have the skills and confidence to answer reference questions asked by Māori students. Ritchie (2013) also noted that Māori students may experience barriers preventing them from engaging with the university library.  

Bryant’s (2015) investigation of Ngā Ūpoko Tukutuku | Māori Subject Headings found that while there were several positive developments, much work is still needed for librarians to fully integrate the headings in their cataloguing, reference, and information literacy practices. Bryant highlights training as a key issue in increasing the use of the headings by librarians, and the majority of participants expressed the desire for more training than they had already had. 

Focus in these studies has mainly been on Māori experience of libraries and the challenges faced by non-Māori librarians in engaging well with various aspects of Māori knowledge. Prior to my study, no research had investigated the process of non-Māori librarian engagement with Māori knowledge.    

Methodology

This article highlights some findings from a larger research project investigating the journeys of non-Māori librarians in Aotearoa New Zealand seeking to learn about and engage with Māori knowledge. To frame these findings in context I will give some background about the broader study and the methods used. In this study I sought to answer my main research question, “How are non-Māori librarians in Aotearoa New Zealand making sense of [Māori knowledge]?” (Oxborrow, 2020, p.9) by undertaking interviews with non-Māori librarians and focus groups with Māori librarians. I used Sense-Making Methodology (SMM), devised by the late Professor Brenda Dervin and colleagues (e.g. Dervin, 2003), as a guiding framework for my study. The central metaphor on which SMM is based describes a process of individual sense making where the sense maker finds themselves in a Situation facing an information or knowledge Gap. They need to find a way to Bridge this Gap to reach an Outcome and continue on their journey (see Figure 1: The Sense-Making metaphor). In the interviews, I sought to learn about individual Sense-Making instances (Situation-Gap-Bridge-Outcome sequences) and asked questions that probed the different phases of the process: Situation, Gap, Bridge and Outcome, as well as factors which acted as either Barriers or Helps to engagement. The full schedule of interview questions can be found in Appendix 1.   

Brenda Dervin's sense making drawing shows a stick person carrying an oddly shaped umbrella running towards three flags. In between the person and the flags is a giant pit which has a bridge made of different pieces extending over it so the person does not fall into the gap.

Fig. 1: The Sense-Making metaphor. Copyright: Sense-Making Methodology Institute, used with permission, https://sense-making.org/ [Accessed: January 12, 2026].

Interviewees were recruited by advertising the study on an email distribution service and the LIANZA weblog. Due to a high number of responses, participants were selected using a maximum variability approach. This meant that the group was highly varied in many ways, including amount of experience, types of roles and libraries worked in, and geographical location within Aotearoa. Of the 25 interviewees, 12 worked in tertiary libraries, seven in public libraries, and six in school or specialist libraries. Of those working in tertiary libraries, six had previous experience of working in other types of libraries. One area in which the group was most similar was ethnicity: the vast majority of the sample (23/25) identified as New Zealand European, along with one participant who identified as Asian and one who identified as a Pacific Islander. While New Zealand European most often refers to White people who were born in Aotearoa New Zealand, it is not a strict category, and thus four interviewees who had immigrated to Aotearoa as children self-identified as New Zealand European.  

In addition to these interviews I also undertook three focus groups with Māori librarians, recruited by personal invitation of some individuals that I had existing connections with, and also by approaching Māori colleagues for recommendations of other individuals who may have wished to be involved. Focus groups were undertaken in Ōtautahi (Christchurch), Te Whanganui-a-Tara (Wellington) and Tāmaki Makaurau (Auckland). Of the eleven focus group participants, nine were working in tertiary libraries and two in public libraries. However, at least four of those working in tertiary libraries had previous experience of working in public libraries.

I conducted these groups myself, with cultural advice from one of my supervisors, Associate Professor Spencer Lilley (whose Māori tribal affiliations are Te Ātiawa4, Muaūpoko5 and Ngāpuhi6). Focus groups were chosen as the data collection method since they can be empowering for participants, positioning them as experts (Dyall et al., 1999; Smithson, 2000). This was of particular importance given my identity. As well as observing cultural protocols during the focus group meetings to the best of my ability such as opening and closing mihi (acknowledgements), I also employed a thorough member checking process to maximise opportunities for participants to provide feedback regarding any concerns they may have had about misrepresentation in the research. As well as providing the transcripts to focus group participants for checking, I also distributed a draft copy of the focus group findings chapter to participants prior to submission of the final thesis. In the focus groups, I asked participants about their experiences with their non-Māori colleagues. The first question was as follows: “The profession of librarianship in Aotearoa has expressed a commitment to biculturalism since the 1980s – To what extent is the reality living up to the promise of the profession in terms of engagement with mātauranga Māori by non-Māori librarians?” Other questions asked about factors acting as Helps and Barriers to non-Māori engagement with Māori knowledge, as well as risks and benefits. The full schedule of focus group questions can be found in Appendix 2. 

I analysed the data using thematic analysis (as described by Braun & Clarke, 2006), checking in with my supervisors throughout the process. Most interviewees discussed examples of other non-Māori librarians’ engagement or lack of engagement, in addition to their own journeys, and these were coded separately. I used the stages of the Sense-Making process described above to inform my analysis of both interviews and focus groups. On completion of the analysis of each of the two data sets, I undertook a comparison between the two at the level of themes and sub-themes.

Findings 

Interviews revealed several interesting findings about the Sense-Making journeys of non-Māori participants. Interviewees emphasised the large scale of their knowledge Gaps in relation to Māori knowledge, as well as highlighting Gaps in the areas of Māori and Libraries (which included aspects such as Māori history, Māori information sources and the treasured status of knowledge and information in Māori culture) and Language and Cultural Protocol. Bridges identified were Courses, Books and Text Resources, and People and Situations. Both Helps and Barriers included significant internal aspects, where elements of interviewees’ existing knowledge and experience, or aspects of their personalities, either Helped them proceed or acted as potential Barriers. These were in some cases closely related; for example, fear was a potential Barrier in many cases, but having the strength of character to push past that fear was also something that Helped some interviewees. Nineteen of 25 interviewees mentioned that Feeling Good was one of the Outcomes of their experiences. This included having a feeling of knowing more, expectations being exceeded, and having a general positive feeling about the experience.

Focus group participant discussions included questions designed to elicit aspects of the Sense-Making process (see Appendix 2). A key Situational factor was that non-Māori librarians have the choice of whether or not to engage with Māori knowledge in their work. Outcomes were largely seen as positive both for Māori (such as better service for Māori clients and more allies for Māori librarians) and for non-Māori, for whom such engagement experiences could be transformational. Focus group participants also saw potential risks, however, such as Māori client alienation: Māori customers may experience shame, feel belittled because a non-Māori librarian appears to have more knowledge than they do, or feel that non-Māori are overstepping when they engage with Māori knowledge. The importance of learning te reo Māori was highlighted by participants throughout (although no specific question was asked about this), as was the need for Māori knowledge to be normalised throughout the profession of librarianship and for existing initiatives to be brought together to build momentum.

For further information on these findings, see Oxborrow (2020). In the rest of this section I will focus on fear as a Barrier, as described by interviewees, and its flow-on effects of overreliance and helplessness, as discussed by focus group participants.

Fear, Overreliance and Helplessness

From comparing the two sets of data, a key finding emerged. Non-Māori interviewees often spoke about fear in relation to their own experiences of engagement with Māori knowledge, or their observations of their non-Māori colleagues’ engagement (or lack thereof). Fear took several forms: fear of the unknown, fear of making a mistake, fear of what others might think, and fear of causing offence. Interviewees often described fear as a barrier to engagement, one that frequently resulted in work involving Māori clients or knowledge being passed on from a non-Māori librarian to a Māori colleague. One interviewee articulated this situation in the following manner:

One of the first kind of panic things that can happen as a non-Māori person and you see a Māori person turn up and they’re like ‘I want some information about a Māori issue’. You’re like ‘Ooh, can I find someone who’s Māori to answer that question? I don’t feel qualified! Ah!’

None of the interviewees spoke about the impact that this fear, and the subsequent passing on of work, might have on their Māori colleagues. It was, however, something that focus group participants talked about at length, with understandable frustration coming through clearly in the following quote:

Participant 1: And, you get people who make up lots and lots of excuses: ‘oh, there wasn’t enough preparation’, ‘I didn’t have enough pronunciation lessons’, ‘I don’t understand pepeha7’, ‘I went to my 101 Māori course and I don’t have the confidence or the competence to be able to engage in mātauranga Māori’

And so, for me it’s like ‘So what are you asking me to do? Hold your hand? Do you want me to hold your hand? Do you want me to give you all the resources that you can possibly get?’ There are thousands and thousands of level two, level four resources that are available to librarians – we’re a library, we’re full of them – and yet there’s no self-development, there’s no want to self-develop unless somebody… 

Participant 2: Yeah, there’s no desire, aye? 

This quote also indicates that the participants in this group believed that the concern about being qualified, articulated in the previous interviewee quote, was not the main barrier to engagement. Focus group participants considered it entirely appropriate for non-Māori colleagues to seek support on higher-level queries where a greater depth of cultural knowledge was needed. Much of the time, however, what was required was simply attempting a basic reference desk request or consulting an online dictionary. One focus group participant gave an example from a tertiary library context: if a patron approached the reference desk with a basic question about a history topic, the first response should not be to immediately fetch the subject specialist (as often happened), but to attempt to help, requesting assistance only if the query proved too complex. The role of subject specialist in tertiary libraries in Aotearoa is much less specialised than in larger countries, so the expectation would usually be that any librarian on reference desk duty would attempt to answer questions on any subject unless it was clear from the outset that the topic was very obscure and information on it would be very difficult to find. Focus group participants were describing questions on basic Māori topics not requiring a high level of cultural knowledge, or in some cases even general questions from Māori patrons, which were being turned over to Māori librarians immediately.

Expectations of cultural support described by Māori librarians also extended to broader duties, such as always being the one asked to lead or organise traditional welcoming ceremonies. One focus group participant described such expectations like this: “‘Ah, you’re the Māori, so you can look after anything Māori.’” Similar expectations for Māori to undertake cultural duties beyond their job descriptions have been noted in other professions, such as university teaching (Mercier, Asmar & Page, 2011) and science (Haar & Martin, 2022).

Librarians in Aotearoa, as in many places, seek to be active in supporting diversity and inclusion. Wei and Boamah (2019) describe how Auckland libraries provide specific services to immigrant users. LIANZA (2025b) introduced a Freedom to Read toolkit to help librarians deal with book challenges. LIANZA also places a strong emphasis on creating a more inclusive atmosphere for Māori, both as library patrons and as fellow librarians, as can be seen in its statement of values:

Respect is at the core of our interactions, whether with our members, partners, or the communities we serve. We respect diverse perspectives, acknowledging that each voice contributes to the rich tapestry of our sector. Our commitment to respect extends to upholding the principles of Te Tiriti o Waitangi, recognising and valuing the unique knowledge and cultural heritage of Māori. (LIANZA, 2024, p. 2)

While progress has no doubt been made since the profession first began to focus on Māori knowledge, librarianship still has a long way to go in terms of being a safe and equitable career choice for Māori. Māori made up 17.8% of the population of Aotearoa in 2023 (Stats NZ, 2024a), but just 5.6% of librarians and 2.5% of library assistants in that same year (Infometrics, 2024). Although multiple factors probably contribute to this discrepancy, the findings of my research suggest that non-Māori overreliance and self-perceived helplessness play a part in it. As noted above, non-Māori self-perceived helplessness was viewed differently by Māori librarians in the focus groups, who believed that their non-Māori colleagues could be more proactive in learning about and engaging with Māori culture.

One of the problems that focus group participants mentioned in relation to the overwork experienced by Māori librarians is that some become burnt out, and some have even left the profession as a result. Similar findings have been reported among Māori scientists (Haar & Martin, 2022). There are also accounts of the experiences of Black and other minoritised librarians in the United States of America indicating that they, too, are expected to pick up extra diversity-related work on top of their substantive roles, and are thus unlikely to remain in the profession due to stress and burnout (e.g. Hinton, 2023).

The situation in Aotearoa is complicated by the fact that some non-Māori go to the other extreme, operating beyond their level of knowledge and understanding and getting things wrong (one example given in a focus group was using Google Translate to create te reo Māori translations of complex library information, the result of which was completely inaccurate). Such examples were given in a context where cultural appropriation of Māori culture and knowledge continues to be common (University of Auckland, 2024) and te reo Māori is viewed as a taonga (treasure, anything prized) (Ngā Pae o te Māramatanga | New Zealand’s Māori Centre of Research Excellence, n.d.), so its misuse by non-Māori is problematic. Finding a balance between not opting out and not overstepping is an ongoing challenge that requires humility and perseverance. Non-Māori authors such as Bell (2024) and Jones (2020) discuss the complexities for non-Māori attempting to engage well with Māori. Focus group participants described non-Māori colleagues who, having got something wrong and been corrected by a Māori person, were upset and refused to engage any further:

Participant 3: Or they’ve been, reprimanded is too strong a word, but they’ve done something and then been told it was the wrong thing to do and it’s 

Participant 4: In the past, and they 

Participant 3: really put them off  

Participant 5: Put them off, yeah 

Participant 4: yeah, they don’t want to do it any more 

Participant 3: completely and they no longer want to have anything to do with anything Māori

Bell (2024) acknowledges that being challenged is hard:

Let’s face it, it’s pretty natural to not want to be put on the spot, to be told you are wrong, or privileged, or have made a mistake, or are being racist. None of these experiences are very comfortable! (p. 39)

The fact remains that to create meaningful change in the sector, non-Māori librarians in Aotearoa need to learn to engage with situations that feel uncomfortable in order to learn, grow and help make a better profession for Māori to join and remain in.

Conclusion: Feel the Fear and Persevere

As mentioned above, the cultural milieu of mainstream New Zealand means that meaningful engagement with the Māori world by non-Māori is largely still an individual decision. This is despite the existence of initiatives such as LIANZA Professional Registration and BoK11 which, in the view of the majority of interviewees and focus group participants, had not created major change in the library profession with regard to non-Māori engagement with Māori knowledge. The lack of change is perhaps unsurprising given the low levels of professional engagement among librarians in Aotearoa discussed earlier. This being the case, there is often little external impetus for non-Māori librarians to keep going with their journeys of learning and engagement when other pressures or priorities crowd in, which means that they can lose momentum. We (non-Māori librarians) find it easy to forget that this is not an issue that our Māori colleagues can pick up and put down in the same way. As one of the focus group participants said of the attitudes of some of their non-Māori colleagues towards engagement with Māori knowledge: “It’s really motivated individually … it’s an option, optional…” They went on to describe an attitude they had seen in their non-Māori colleagues: “…I’ll choose to be bicultural today, tomorrow I might not be.” The focus group participant finished their point by referring to Māori librarians: “…whereas we’re always in sights of it [living and working between two cultures].” Focus group participants also talked about the importance of non-Māori librarians being good allies, described by one focus group participant as having “…shared responsibility…”. They suggested several ways in which non-Māori librarians could help lighten the load for their Māori colleagues. These include running Māori events alongside, or with cultural support from, Māori colleagues, and advocating for Māori issues in the workplace so that Māori colleagues don’t always have to be “the angry Māori in the room”, as one focus group participant put it, leaving them feeling more supported and less worn down.

Ongoing effort is required to keep momentum going, and this can be difficult for non-Māori librarians to sustain independently. Focus group participants mentioned the positive impact that can come from creating a culture of engagement within an organisation or library system. Leaders and managers have a key role to play in encouraging their teams to engage with Māori knowledge on an ongoing basis (Oxborrow Vambe, 2025). Since there is no guarantee of such consistent support, a key message is to feel the fear and persevere. This involves having the humility to accept that everyone makes mistakes and being committed to a continuing journey of development despite challenges. The fear may never fully dissipate, though it may reduce through repeated exposure to challenging situations. It was the willingness to keep pushing through fear to continue engaging that made the difference for some interviewees, as discussed above. Future research could include case studies of good practice and of methods employed by non-Māori librarians to move through fear.

It is also important to emphasise that fear was not the only emotion interviewees discussed in relation to their journeys of learning and engagement with Māori knowledge. As noted above, interviewees also described positive aspects of their experiences, with 19 of 25 discussing Feeling Good as an Outcome of their learning or engagement. One interviewee described their journey as “…one of the best learning experiences of my life, really.” Committing to ongoing engagement with Māori knowledge can therefore be personally rewarding, as well as contributing to a more welcoming profession for Māori librarians. Creating opportunities for non-Māori librarians to share those positive experiences with each other could be a powerful tool to encourage those who are more reluctant to engage with Māori knowledge to begin or continue their own journeys of learning and engagement.


Acknowledgements

I would like to thank all the many people who supported the PhD research on which this article is based, including all participants and my supervisors, Professor Anne Goulding and Associate Professor Spencer Lilley. My PhD was partially funded through the A.K. Elliot Memorial Scholarship. Much appreciation to the peer reviewers Professor Alison Jones and Jeannette Ho and the editor Brittany Paloma Fiedler. Thanks to my WWA paragraph editing partners, Anne Hiha and Sara Kindon, for your suggestions. This article was written during a secondment to Te Manawahoukura Rangahau Residency. Many thanks to the manager of my substantive role, Jenny Barnett, for making it possible for me to undertake this secondment. 


Appendix 1: Interview Question Schedule

Tell me about your background in the library profession.

What do you find particularly interesting about mātauranga Māori?

Participants will then be asked to give an overview of the main occurrences in their story of learning about mātauranga Māori in order (the story of their process of engaging with mātauranga Māori). These events will be written down on a piece of paper to serve as a prompt for the remainder of the interview. A similar set of questions will be used to ask about the participant’s choice of 2-4 occurrences, as time allows. These questions will be used to investigate one instance at a time. The questions are as follows:

Tell me about [the course/experience/learning source]

What led up to this moment of learning about/engaging with mātauranga Māori?

What didn’t you know about mātauranga Māori at that stage?

Did you have any problems because of what you didn’t know? What were they?

How did you know where to go to find answers to your questions [for the situation you were facing]?

What were you trying to learn or achieve through this?

Did you have any big questions that motivated you to seek more information or knowledge? If so, what were they?

What helped you in the situation? How?

Did you expect what you learned to help? If so, did it help in ways you expected or other ways?

What hindered you in the situation? How?

Did you expect what you learned to present problems? If so, did it present problems in ways you expected or other ways?

What conclusions or ideas did you come to as a result of this experience?

What did the experience help you achieve afterwards?

The final phase of the interview will be talking about your whole journey of making sense of mātauranga Māori and includes some questions about LIANZA Professional Registration:

How does your journey of making sense of mātauranga Māori relate to your sense of identity as a New Zealander?

How does it relate to your sense of power?

Has your decision to become/not to become or to continue/not continue being Registered been influenced by the inclusion of mātauranga Māori as a mandatory element in the Body of Knowledge?

[REGISTERED PARTICIPANTS ONLY] Has your involvement in LIANZA’s Professional Registration scheme impacted on your journey of engagement with mātauranga Māori in your professional life? If so, how?

Is there anything else you would like to mention before we finish?

Appendix 2: Focus Group Question Schedule

1. The profession of librarianship in Aotearoa has expressed a commitment to biculturalism since the 1980s – To what extent is the reality living up to the promise of the profession in terms of engagement with mātauranga Māori by non-Māori librarians?

2. In your opinion, what effect has LIANZA Professional Registration had on the extent to which non-Māori librarians engage with mātauranga Māori in their professional lives?

3. What factors help non-Māori librarians to engage with mātauranga Māori?

4. What barriers prevent non-Māori librarians from engaging with mātauranga Māori?

5. Matrix

The matrix is titled non-Māori librarians engaging with mātauranga Māori in their professional lives. Two columns are labelled Benefits and Risks. Three rows are labelled Māori stakeholders, individual librarians, and the profession as a whole.

6. In an ideal world, what would you like to see from individual non-Māori librarians in terms of engagement with Māori knowledge and culture?

7. What needs to happen to bring about change?

8. Is there anything else you would like to talk about before we finish?

References

Allan, J. (2001). Review of the measurement of ethnicity: Classification and issues. https://www.stats.govt.nz/assets/Uploads/Retirement-of-archive-website-project-files/Methods/Review-of-the-Measurement-of-Ethnicity-Classifications-and-issues/review-of-the-measurement-of-ethnicity-classification-and-issues-main-paper.pdf

Bell, A. (2024). Becoming Tangata Tiriti: Working with Māori, honouring the Treaty. Auckland University Press. 

Black, R. (2010). Treaty people: Recognising and marking Pākehā culture in Aotearoa New Zealand. (Doctoral dissertation, University of Waikato, Hamilton). Retrieved from https://researchcommons.waikato.ac.nz/handle/10289/4795

Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77-101.  

Bryant, M. (2015). Whāia te mātauranga – How are research libraries in Aotearoa New Zealand applying Ngā Ūpoko Tukutuku / the Māori Subject Headings and offering them to users? (Master’s thesis, Victoria University of Wellington, Wellington). Retrieved from http://researcharchive.vuw.ac.nz/xmlui/handle/10063/4633  

Clark, E. & Hill, R. (2024). New Zealand is unwinding ‘race-based policies’. Māori say it’s taking away their rights. ABC News. https://www.abc.net.au/news/2024-09-19/new-zealand-unwinding-maori-rights-treaty-of-waitangi/104364638 

Department of Corrections | Ara Poutama Aotearoa. (2025). Prison facts and statistics – June 2025. https://www.corrections.govt.nz/resources/statistics/quarterly_prison_statistics/prison_facts_and_statistics_-_june_2025

Dervin, B. (2003). From the mind’s eye of the user: The Sense-Making qualitative-quantitative methodology. In B. Dervin, L. Foreman-Wernet, & E. Lauterbach (Eds.), Sense-Making Methodology reader: Selected writings of Brenda Dervin (pp. 269-292). Hampton Press. 

Dyall, L., Bridgeman, G., Bidois, A., Gurney, H., Hawira, J., Tangitu, P., & Huata, W. (1999). Māori outcomes: Expectations of mental health services. Social Policy Journal of New Zealand, 12, 1-20.

Dyson, L. (2001). Traces of identity: The construction of white ethnicity in New Zealand. (Doctoral dissertation, Middlesex University, United Kingdom). Retrieved from http://eprints.mdx.ac.uk/6691/  

Haar, J., & Martin, W. J. (2022). He aronga takirua: Cultural double-shift of Māori scientists. Human Relations, 75(6), 1001-1027.

Hinton, M. (2023). Black librarianship: Stories of impact and connection. School Library Journal. https://www.slj.com/story/Black-Librarianship-Stories-of-Impact-and-Connection

Infometrics. (2024). 2023 sector profile: Arts and creative – Maori. Manatū Taonga | Ministry for Culture and Heritage. https://www.mch.govt.nz/publications/arts-and-creative-sector-economic-profiles-2023

Irwin, K., & Katene, W. (1989). Maori people and the library: A bibliography of Ngati Kahungunu and Te Waka o Takitimu resources. He Parekereke, Department of Education, Victoria University of Wellington. 

Jones, A. (2020). This Pākehā life: An unsettled memoir. Bridget Williams Books. 

King, M. (2004). Being Pakeha now (2nd ed.). Penguin.

Library and Information Association of New Zealand Aotearoa | Te Rau Herenga Aotearoa. (2026). Professional Registration: Who has RLIANZA. https://web.archive.org/web/20260112202502/https://www.lianza.org.nz/professional-development/professional-registration/who-has-rlianza/

Library and Information Association of New Zealand Aotearoa | Te Rau Herenga Aotearoa. (2025a). Te Rau Herenga o Aotearoa | Library and Information Association of New Zealand annual report 2024-2025. https://www.lianza.org.nz/media/xy0hlxzu/annual-report-2025-final.pdf  

Library and Information Association of New Zealand Aotearoa | Te Rau Herenga Aotearoa. (2025b). Freedom to read toolkit. https://www.lianza.org.nz/freedom-to-read-toolkit/ 

Library and Information Association of New Zealand Aotearoa | Te Rau Herenga Aotearoa. (2024). LIANZA values. https://www.lianza.org.nz/about/who-we-are/lianza-values/

Library and Information Association of New Zealand Aotearoa | Te Rau Herenga Aotearoa. (2020). LIANZA code of practice: 5.00 professional registration. https://www.lianza.org.nz/media/lkoat5m2/500-professional-registration.pdf

Library and Information Association of New Zealand Aotearoa | Te Rau Herenga Aotearoa. (n.d.-a). BoK 11: Awareness of Indigenous knowledge paradigms. https://www.lianza.org.nz/professional-development/professional-registration/bodies-of-knowledge-bok/#BOK-eleven

Library and Information Association of New Zealand Aotearoa | Te Rau Herenga Aotearoa. (n.d.-b). Te Tōtara workforce capability. https://www.lianza.org.nz/professional-development/te-totara-workforce-capability/

Library and Information Association of New Zealand Aotearoa Professional Registration Board. (2013). LIANZA Profession Registration Board Professional Practice Domains and Bodies of Knowledge December 2013. LIANZA. 

Library and Information Association of New Zealand Aotearoa Taskforce on Professional Registration. (2005). Professional future for the New Zealand Library and Information profession: Discussion document. LIANZA. 

Library and Information Association of New Zealand Aotearoa | Te Rau Herenga Aotearoa & Te Rōpū Whakahau. (2024). LIANZA and Te Rōpū Whakahau partnership agreement. https://www.lianza.org.nz/media/rihc3hoz/lianza-te-ropu-whakahau-partnership-agreement-2024.pdf

Lilley, S. (2013). Te Rōpū Whakahau: Waiho i te toipoto, kaua i te toiroa, celebrating 20 years. Te Rōpū Whakahau. 

Lulich, S. (2007). Professional registration scheme update & reasons to join. LIANZA.

Maori Library Service Committee. (1963). Library service to Maoris [sic]: Maori Library Service Committee report to the NZLA Council, February, 1963. New Zealand Libraries, November, 254-260.  

Mead, H. M. (2012). Understanding mātauranga Māori. In Haemata Limited (Ed.), Conversations on mātauranga Māori (pp. 9-14). New Zealand Qualifications Authority.

Mercier, O., Asmar, C., & Page, S. (2011). An academic occupation: Mobilisation, sit-in, speaking out and confrontation in the experiences of Māori academics. Indigenous Education, 40, 81-91.  

Millen, J. (2010). Te rau herenga, a century of library life in Aotearoa: The New Zealand Library Association & LIANZA, 1910-2010. LIANZA.

Moorfield, J. (n.d.). Pākehā. In Te Aka Māori Dictionary. Retrieved January 21, 2026, from https://maoridictionary.co.nz/word/4997

Newman, M. (2025, August 17). Dismantling separatism. New Zealand Centre for Political Research. https://www.nzcpr.com/dismantling-separatism/

Ngā Pae o te Māramatanga | New Zealand’s Māori Centre of Research Excellence. (n.d.). Te reo Māori – A taonga. https://www.maramatanga.ac.nz/news-events/news/te-reo-m-ori-taonga

Orange, C. (2023). Te Tiriti o Waitangi – the Treaty of Waitangi. Te Ara – the Encyclopedia of New Zealand, https://teara.govt.nz/en/te-tiriti-o-waitangi-the-treaty-of-waitangi

Oxborrow, K. (2020). “It’s not just a professional development thing”: Non-Māori librarians in Aotearoa New Zealand making sense of mātauranga Māori. Dissertation, School of Information Management, Te Herenga Waka | Victoria University of Wellington, Wellington, Aotearoa New Zealand, https://doi.org/10.26686/wgtn.17148506.v1

Oxborrow, K., Goulding, A., & Lilley, S. (2017). The interface between Indigenous knowledge and libraries: The need for non-Māori librarians to make sense of mātauranga Māori in their professional lives. In Proceedings of RAILS – Research Applications, Information and Library Studies, 2016, School of Information Management, Victoria University of Wellington, New Zealand, 6-8 December, 2016. Information Research, 22(4), paper rails1619. Retrieved from http://InformationR.net/ir/22-4/rails/rails1619.html

Oxborrow Vambe, K. (2025). How can managers in libraries support their teams to engage with mātauranga Māori (Māori knowledge)? Journal of New Librarianship, 10(1), 100–115. https://doi.org/10.33011/newlibs/18/10 

Paewai, P. (2025). Human rights complaint filed to United Nations over treatment of Māori. Radio New Zealand. https://www.rnz.co.nz/news/national/578480/human-rights-complaint-filed-to-united-nations-over-treatment-of-maori

Ritchie, A. (2013). ‘Pākehā librarianship at the interface’: Being an ally in Māori student success through teaching and learning information literacies. (Master’s thesis, Victoria University of Wellington, Wellington). Retrieved from http://researcharchive.vuw.ac.nz/handle/10063/2862

Royal, T. A. C. (2012). Māori. Te Ara – the Encyclopedia of New Zealand. https://teara.govt.nz/en/maori 

Smithson, J. (2000). Using and analysing focus groups: Limitations and possibilities. International Journal of Social Research Methodology, 3(2), 103-119.  

Stats NZ | Tatauranga Aotearoa. (2023). Aotearoa data explorer. https://explore.data.stats.govt.nz/vis?fs[0]=2023%20Census%2C0%7CWork%23CAT_WORK%23&pg=0&fc=2023%20Census&bp=true&snb=3&df[ds]=ds-nsiws-disseminate&df[id]=CEN23_TBT_004&df[ag]=STATSNZ&df[vs]=1.0&dq=oc399312%2Boc599711%2Boc224611%2BtwTotal%2BsoTotal%2BcdTotal.2023&ly[rw]=CEN23_TBT_IND_001&to[TIME]=false

Stats NZ | Tatauranga Aotearoa. (2024a). 2023 Census population counts (by ethnic group, age, and Māori descent) and dwelling counts. https://www.stats.govt.nz/information-releases/2023-census-population-counts-by-ethnic-group-age-and-maori-descent-and-dwelling-counts/ 

Stats NZ | Tatauranga Aotearoa. (2024b). Ngā tūtohu Aotearoa | Indicators Aotearoa New Zealand – Wellbeing data for New Zealanders: Amenable mortality. https://statisticsnz.shinyapps.io/wellbeingindicators/_w_0b03922ab93844bf809b3cdd60735970/?page=indicators&class=Social&type=Health&indicator=Amenable%20mortality

Stone, L. (2013). LIANZA careers survey 2012. The Information Workshop. https://lianza.recollect.co.nz/nodes/view/2299#idx11845

Te Huia, A. (2016). Pākehā learners of Māori language responding to racism directed toward Māori. Journal of Cross-Cultural Psychology, 47(5), 734-750.

Tuhou, T. (2011). Barriers to Māori usage of university libraries: An exploratory study in Aotearoa New Zealand. (Master’s thesis, Victoria University of Wellington, Wellington). Retrieved from http://researcharchive.vuw.ac.nz/xmlui/handle/10063/1700  

University of Auckland. (2024). NZ needs a legal remedy for cultural misappropriation. https://www.auckland.ac.nz/en/news/2024/08/09/nz-needs-a-legal-remedy-for-cultural-misappropriation.html

Waitangi Tribunal | Te Rōpū Whakamana i te Tiriti o Waitangi. (n.d.). Waitangi Tribunal. https://www.waitangitribunal.govt.nz/en/home

Wei X.L., Boamah E. (2019). Auckland libraries as a multicultural bridge in New Zealand: Perceptions of new immigrant library users. Global Knowledge, Memory and Communication, 68(6), 581-600. 

Endnotes

  1. Tribal group of East Coast area north of Gisborne to Tihirau. ↩
  2. Fairy folk – mythical being [sic] of human form with light skin and fair hair. ↩
  3. Fairy folk – fair-skinned mythical people who live in the bush on mountains. Although like humans in appearance, the belief is that they do not eat cooked food and are afraid of fires. ↩
  4. Tribal group to the north-east of Mount Taranaki including the Waitara and New Plymouth areas. A section of Te Āti Awa moved to parts of the Wellington area and the northern South Island in the 1820s. ↩
  5. A tribal group of the Horowhenua and northern Kapiti coast. ↩
  6. Tribal group of much of Northland. ↩
  7. Tribal saying, tribal motto, proverb (especially about a tribe), set form of words, formulaic expression, saying of the ancestors, figure of speech, motto, slogan – set sayings known for their economy of words and metaphor and encapsulating many Māori values and human characteristics. ↩

Author Interview: Shelley Noble / LibraryThing (Thingology)

Shelley Noble

LibraryThing is pleased to sit down this month with best-selling author Shelley Noble, whose many novels run the gamut from historical fiction to mystery to contemporary women’s fiction. A former professional dancer, Noble toured with Twyla Tharp Dance and American Ballroom Theater, and has worked as a choreographer for film and theater productions. She earned her BFA and MFA at the University of Utah, and taught at California State University, Fresno. A former president of Sisters in Crime, Noble is a member of Mystery Writers of America, Romance Writers of America, and Liberty States Fiction Writers, and currently lives in New Jersey. Her newest novel, The Sisters of Book Row, was published by William Morrow in March 2026 and tells the story of three sisters and bookstore proprietors who confront the Comstock laws in 1915 Manhattan. Noble sat down with Abigail this month to discuss the book.

How did the story idea for The Sisters of Book Row first come to you? Were you drawn to the thought of writing about bookstores and booksellers, or perhaps about the Comstock laws?

I’ve had Comstock in the back of my mind for a while, a perfect villain, a vicious zealot, one who I particularly despise. So when my editor suggested I write a book about books, guess who came to mind. And because I write about Manhattan, I knew the perfect place in which to set the story, Book Row, once the mecca of rare and used book buyers from around the world. And like a magnet, this germ of an idea began collecting bits and pieces. An article about the current Cohen sisters of the Argosy Book Store inspired me to create the Applebaum sisters, and Sisters was born.

Tell us about the Comstock laws. What were they, and what effect did they have on the world of books and booksellers, as well as the wider American society of that time?

Anthony Comstock moved to New York in the early 1870s and was appointed special agent to the Society for the Suppression of Vice and the U.S. Post Office to prevent pornography from being sent through the mail. He was given the power to search, seize, arrest, and fine, and he received half of the monies collected. His activities quickly spread to all facets of life, and as his power grew, his ideas of what was “obscene, lewd, or lascivious” changed, sometimes from week to week. Later in his career, his extreme and outlandish views made him a laughingstock, ridiculed by the newspapers and dismissed by the courts. The Post Office fired him, but he refused to leave. The Society for the Suppression of Vice replaced him, but again he ignored them and continued his crusade. The Comstock Act, enacted in 1873, included a ban on contraception and was written by Comstock himself. It was never repealed, but Roe v. Wade relegated it to the status of a zombie law. Unfortunately, many states had adopted the original law for their own use, and today we see it being invoked to withhold birth control information and other reproductive health measures from women. A zealot who is said to have destroyed 15 tons of books and four million pictures and other materials, who hated women and died ridiculed and despised, has nonetheless managed to rear his ugly head again today.

Your story is set on Book Row, a district in lower Manhattan that contained over three dozen bookstores at its height. Did you have to do any research about the history of the area, and what were some interesting things you learned? If you could visit any bookstore from that period, which would it be? (Disclosure: I worked for the Strand bookstore—the sole survivor of Book Row—for many years).

I used to hang out at the Strand all the time. Many years ago. It was a solace and an adventure away from the chaos of the city and my profession as a young dancer. I hope my Sisters of Book Row can come to life for readers of today. I did loads of research, I always do. It’s one of my favorite parts of writing historical fiction. There’s a lovely book titled Book Row by Marvin Mondlin and Roy Meador. It didn’t have as much information on my particular period 1915 as I hoped, but it was fascinating to read about the continuation of this community, especially post 1930.

Once I get an overview of my time and place and characters, I like to depend mainly on primary sources, newspapers, anecdotes, letters. That way I know what they know, feel what they feel, and try to leave my historical outsider’s knowledge at the door. I mostly learned the neighborhood in bits and pieces since the area has built up so much since then.

Some oddities and coincidences: The Argosy, owned by Louis Cohen, father of the Cohen sisters who helped inspire this story, had its own run-in with “Comstockery” in the 1930s. When the city began digging the new subway, customers couldn’t get around the construction, and many stores had to move uptown, then moved back when it was completed. After I had developed and lived with my three Applebaum sisters and the Arcadia for weeks and several chapters, I learned that there was actually a Mr. Applebaum who had a bookshop in the Row, named Arcadia. Did I read about it and forget, while it became ingrained in my subconscious? Or was it really a coincidence? I was too attached to my own Applebaums to change their names, so I mentioned the existence of the two families in my Author Notes.

Sometimes a story is like a jigsaw puzzle, assembled from a phrase here, a sentence there about the inhabitants. Two booksellers who were constantly arguing gave me an image that led to the daily morning conversations around the newsstand. They might have argued and complained, but they were neighbors, and they were ready to take up a collection to bail one of their own out of jail when Comstock was on the prowl.

The book world has been rocked in recent years by an upsurge of attempts at censorship and book suppression. I chronicle some of that in the Freedom of Expression column of our monthly State of the Thing newsletter. What can your story tell us about our situation today, in this respect?

For our modern selves, I wish that The Sisters of Book Row and their stand against attacks on what they loved most were so outside of our experience, so unbelievable, that readers might say, “Oh, that would never happen here.” But unfortunately we see it happening throughout our country, driven by those who, like Comstock, denounce books they’ve never even read and bully those who only want to share knowledge. Their attacks sometimes seem so diffuse and widespread that we might think it will never affect us. It will, but I have to believe that we’re more experienced, more aware of the rotten core of the book-banning movement, and that if we keep up a constant resistance, we will prevail.

Tell us a little bit about your writing process. Do you have a particular routine—a schedule you keep, or a place you like to write? You write in a number of different genres, does your story-building process differ, depending on the genre?

I do have a routine though it has changed over the years and books. When I wrote two books a year, I had a tighter schedule. Now that I’m writing one historical I can linger in the research, jump down a rabbit hole or two. And I find that writing of the past, I’ve changed from being an early morning writer, to a late night writer. There’s something about the dark and the quiet that I find conducive to delving into the past. Of course the nearer I get to deadline, the more daytime writing I have to do. I have a home office where I write all my books. Each genre requires a different energy and attitude. The contemporaries don’t require as much deep dive research, so I can begin writing sooner than with the historicals. No matter the genre, I depend on a storyboard to keep everything on track. Not a computer screen board but a big gridded Lucite board on the wall with color coded post-its for characters and plot points that can be moved around as the story develops.

What comes next for you? Are there any new books you’re currently working on?

I’m currently working on a story that takes place in 1870 Long Branch, New Jersey, where President Grant has his summer capital and a young woman aspiring to become a lawyer confronts the changes and the scandals that threaten the quiet seaside town she calls home.

Tell us about your library. What’s on your own shelves?

Lots of history books, mainly early 20th century New York, late 19th American theatre books when the Rialto was Union Square. Dickens, Austen, Mary Stewart. A rotation of women’s historical fiction. Eastern religion. Mystery and science fiction. I’m a pretty eclectic reader.

What have you been reading lately, and what would you recommend to other readers?

This fall I decided to go on a rereading spree. I started with Fahrenheit 451 followed by 1984 right before the holidays. Yes, they are still as scary as when I read them in school. After that, I immediately pulled out my favorite chapters of The Pickwick Papers. Now I’m re-rereading The Hobbit, and reading A Founding Mother** about Abigail Adams, by Stephanie Dray and Laura Kamoie. I highly recommend all of these.

**Stay tuned for our interview with Stephanie Dray and Laura Kamoie this coming July, in honor of America’s 250th birthday!

The Handoff Problem (Updated) / David Rosenthal

Around twelve years ago, Google figured out the fundamental problem facing Tesla's Fake Self Driving. Almost nine years ago in Robot Cars Can’t Count on Us in an Emergency, John Markoff wrote:
Three years ago, Google’s self-driving car project abruptly shifted from designing a vehicle that would drive autonomously most of the time while occasionally requiring human oversight, to a slow-speed robot without a brake pedal, accelerator or steering wheel. In other words, human driving was no longer permitted.

The company made the decision after giving self-driving cars to Google employees for their work commutes and recording what the passengers did while the autonomous system did the driving. In-car cameras recorded employees climbing into the back seat, climbing out of an open car window, and even smooching while the car was in motion, according to two former Google engineers.
Gareth Corfield at The Register added:
Google binned its self-driving cars' "take over now, human!" feature because test drivers kept dozing off behind the wheel instead of watching the road, according to reports.

"What we found was pretty scary," Google Waymo's boss John Krafcik told Reuters reporters during a recent media tour of a Waymo testing facility. "It's hard to take over because they have lost contextual awareness."
Follow me below the fold for a wonderful example of Tesla's handoff problem, and a discussion of the difference between Tesla's and Waymo's approaches to self-driving.

I wrote about this handoff problem in 2017's Techno-hype part 1. I did a thought experiment, imagining mass-market cars 3 times better than Waymo's at the time:
A normal person would encounter a hand-off once in 15,000 miles of driving, or less than once a year. Driving would be something they'd be asked to do maybe 50 times in their life.

Even if, when the hand-off happened, the human ... had full "situational awareness", they would be faced with a situation too complex for the car's software. How likely is it that they would have the skills needed to cope, when the last time they did any driving was over a year ago, and on average they've only driven 25 times in their life? Current testing of self-driving cars hands-off to drivers with more than a decade of driving experience, well over 100,000 miles of it. It bears no relationship to the hand-off problem with a mass deployment of self-driving technology.
I concluded:
But the real difficulty is this. The closer the technology gets to Level 5, the worse the hand-off problem gets, because the human has less experience. Incremental progress in deployments doesn't make this problem go away.
Raffi Krikorian:
used to run the self-driving-car division at Uber, trying to build a future in which technology protects us from accidents. I had thought about edge cases, failure modes, the brittleness hiding behind smooth performance. My team trained human drivers on when and how to intervene if a self-driving car made a mistake. In the two years I ran the division, we had no injuries in our early pilot programs.
He has an article in the current Atlantic entitled My Tesla Was Driving Itself Perfectly—Until It Crashed with the sub-head:
The danger of almost-perfect tech
As an enthusiast for self-driving technology, Krikorian used it:
With my own Tesla, I started out using Full Self-Driving as the default setting only on highways. That’s where it makes sense: You have clear lane markers and predictable traffic patterns. Then, one day, I tried it on a local road, and it worked well enough to become a habit.
But, after three years:
My memory is hazy, and some of it comes from one of my sons, who watched the whole thing unfold from the back seat. The car was making a turn. Something felt off—the steering wheel jerked one way, then the other, and the car decelerated in a way I didn’t expect. I turned the wheel to take over. I don’t know exactly what the system was doing, or why. I only know that somewhere in those seconds, we ended up colliding with a wall.
He didn't have "situational awareness", even though he was an experienced driver aware of the handoff problem. He sums up the current problem, with drivers like him:
Full Self-Driving works almost all of the time—Tesla’s fleet of cars with the technology logs millions of miles between serious incidents, by the company’s count. And that’s the problem: We are asking humans to supervise systems designed to make supervision feel pointless. A machine that constantly fails keeps you sharp. A machine that works perfectly needs no oversight. But a machine that works almost perfectly? That’s where the danger lies. After a few hours of flawless performance, research shows, drivers are prone to start overtrusting self-driving systems. After a month of using adaptive cruise control, drivers were more than six times as likely to look at their phone, according to one study from the Insurance Institute for Highway Safety.
Imagine this problem compounded by handing off to a driver who hadn't driven in a year.

Google was building Level 4 robotaxis. Their conservative approach was to eliminate the handoff problem completely. Waymos operate on carefully mapped routes after much practice, and are equipped with a diverse set of sensors. Just as airliners have a designated diversion airport everywhere along their flight path, Waymos know a safe place to stop and ask for help from remote humans. The humans don't drive the cars; they just advise the car as to how to solve the problem. This can, as I have seen a couple of times, cause frustration among other road users, but it is safe.

Tesla, on the other hand, had a Level 2 driver-assist system with a limited set of sensors, which depended on handing off to the driver in case of confusion. They consistently marketed it as "Full Self-Driving" with exaggerated claims about its capabilities, and sold it to normal, untrained drivers. They could not, and could not afford to, implement Google's approach. Why not?
  • Scale: Tesla has 1.1M FSD customers, whereas six months ago Waymo had about 2K cars in service. To support them, Waymo has about 70 remote operators on duty. Of course, FSD is used much less intensively; let's guess only 5% as much. Even if, optimistically, Tesla's technology generated as few remote requests as Waymo's, they would need almost 2,000 remote operators on duty.
  • Technical: First, Tesla markets FSD as usable anywhere, even if their terms of service disagree. So they lack the detailed maps Waymos use when they need to find a safe place. Second, Tesla has far fewer sensors, so has much less information on which to base the need for and choice of a safe place.
  • Marketing: There are two problems. First, telling the public that FSD will sometimes need to stop and ask for help goes against the idea that it is "Full Self Driving". Second, everyone can see that a Waymo is driving itself and can set their expectations to match. No-one can tell whether a Tesla is using Fake Self Driving. So if a Tesla stopped unexpectedly, even when it wasn't using Fake Self Driving, the assumption would be that the technology had failed.
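The back-of-envelope in the Scale point can be checked in a few lines of Python. The figures are those quoted in the post; the 5% usage intensity is explicitly a guess, not data:

```python
# Figures quoted in the post; the 5% usage intensity is a guess, not data.
waymo_cars = 2_000
waymo_operators = 70
tesla_fsd_customers = 1_100_000
fsd_intensity = 0.05  # assume FSD is used ~5% as intensively as a robotaxi

operators_per_car = waymo_operators / waymo_cars       # 70 / 2,000 = 0.035
equivalent_cars = tesla_fsd_customers * fsd_intensity  # 55,000 "full-time" cars
needed = equivalent_cars * operators_per_car
print(round(needed))  # ~1,925, i.e. "almost 2,000" remote operators on duty
```

The estimate is optimistic in Tesla's favor, since it assumes FSD generates remote requests at Waymo's (lower, well-mapped, sensor-rich) rate.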
Because Tesla has always depended upon handing off to the human, the result is that Tesla's minimal robotaxi service with "safety monitors" in Austin, TX crashes six times as often as human-driven taxis.

Update 4th April

Kristen Korosec provides Waymo’s skyrocketing ridership in one chart:
Waymo is now providing 500,000 paid robotaxi rides every week across 10 U.S. cities, the company shared in a post on X this week. The eye-popping figure is reflective of the Alphabet-owned company’s accelerated commercial expansion. But it’s Waymo’s rate of growth in ridership and markets that offers a more compelling story.

In less than two years, the company’s average weekly paid robotaxi trips have grown tenfold, from 50,000 per week in May 2024 to 500,000 per week today. Over that same two-year timespan, Waymo has expanded within its initial markets of Phoenix, San Francisco, and Los Angeles — and beyond them to Austin, Atlanta, Miami, Dallas, Houston, San Antonio, and Orlando. Those seven cities in the Sun Belt were all added in just the past year.
The fleet hasn't grown with the rides, showing increased utilization and thus improved economics:
Waymo’s robotaxi fleet has also grown, although the company has guarded those numbers and rarely provides updates. Data provided in December 2025 to the National Highway Traffic Safety Administration (NHTSA) shows the company had 3,067 robotaxis equipped with its 5th generation self-driving system. The company still uses that “over 3,000” fleet number today. That could soon change with the introduction of its 6th generation self-driving system, which will debut on the Zeekr minivan, known as Ojai, and the Hyundai Ioniq 5.

Append the LM to the IR / Mat Kelly

From January to March 2026, I taught INFO624: Intelligent Search and Language Models at Drexel CCI—a course that sits at the intersection of classical information retrieval (IR) and modern AI-driven language models.

This offering marked a deliberate shift from previous iterations of INFO624. While earlier versions focused on traditional IR systems, this course expanded to explore how language models are reshaping the way we search, rank, and interact with information. In many ways, the guiding question became: what does it mean to append the LM to the IR?

The course was delivered in a cross-listed format, with a mix of in-person and asynchronous students. Preparing and teaching it required not just updating materials, but continuously adapting to a rapidly evolving technical landscape—one where best practices can shift within months.

Topics

Despite losing two instructional days (MLK Day and a late-January snowstorm), the course covered eight weeks of material spanning both foundational and emerging topics:

  • Introduction to IR and AI foundations
  • Text Processing and AI-enhanced pre-processing
  • From Vector Space Models to Dense Representations
  • Probabilistic Models and Neural Language Models for IR
  • AI-Driven Web Search and Retrieval Techniques
  • Graph Analysis and Neural Linking Models
  • Evaluation Metrics and AI-Enhanced IR Systems
  • Relevance Feedback with AI Techniques
  • Clustering and Classification with Deep Learning
  • Emerging Topics in AI (e.g., RAG, XAI, Multimodal IR)

Each topic could easily warrant a full course on its own, but the goal here was breadth with meaningful depth—enough to ground students before they explored ideas in their projects.

Student Projects

The course enrolled 20 students, who could choose to work individually or in groups. Projects took one of two forms: (1) an IR/AI-focused literature review or (2) the design and evaluation of a working system. In total, 12 projects were submitted, reflecting a wide range of interests across modern information retrieval and language model integration.

Systems

Omkar, Manjiri, and Priti developed a multi-source search system that retrieves, synthesizes, and self-evaluates information from web, academic, and local data to generate comprehensive, cited answers.
https://github.com/Priti0427/Intelligent-Search-agent

Mokshad and Ishant built a search engine over arXiv papers that combines BM25 with BERT-based retrieval, while providing transparent explanations for ranking decisions.
https://github.com/Mokshu3242/arXiv-Paper-Search-System

Ian built a two-stage recipe search engine on the Food.com corpus (~230K recipes, 1.1M reviews), integrating BM25 retrieval, rule-based query alignment, and neural embeddings derived from review-based quality signals.
https://github.com/iauger/recipe-search-engine

Chinomso designed a system for question answering over PDFs that incorporates document structure (sections and hierarchy) into both retrieval and grounded generation.
https://github.com/MishaelTech/explanable_structured_rag_pdf

Charles implemented a transparent full-text search engine over newly released JFK assassination documents, enabling precise and citable exploration of primary historical sources.

Robert and Ayush created a system that combines chapter-level character summaries with semantic retrieval to support exploration and querying of long-form narrative texts.

Jake developed a prototype system using FAISS, augmented with salience and recency signals, to retrieve narrative memories for consistent storytelling in AI-driven environments.

Mason built a RAG-based search engine for personal finance, retrieving and summarizing trusted financial documents to answer user questions in natural language.
https://github.com/riccimason99/Financial-Planning-Search-Engine

Literature Reviews

Sriram, Sourav, Khushi, and Lohitha conducted a survey of retrieval-augmented generation (RAG) methods for academic use, focusing on hybrid retrieval, self-reflection, and challenges such as faithfulness and evaluation.

Muhammad analyzed the evolution of neural information retrieval, tracing the progression from early embeddings to modern transformer-based dense retrieval and identifying remaining challenges.

Sriram examined personalization in search, exploring how systems balance relevance with novelty and diversity under ambiguous or evolving user intent.

Grace compared thesauri, knowledge graphs, and latent semantic analysis as methods for incorporating semantic relationships into retrieval systems.

Conclusion

Overall, INFO624 highlighted just how quickly information retrieval and language models are converging—both in research and in practice. What once felt like separate paradigms are now deeply intertwined, with modern systems blending classical ranking methods and neural representations into hybrid approaches.

The range of student projects reflects this shift clearly: systems emphasized not only performance, but also transparency, evaluation, and real-world usability. Just as importantly, many projects grappled with emerging challenges such as faithfulness, explainability, and the limits of current models.

For me, teaching this course reinforced an important reality: working in this space requires constant adaptation. The tools, techniques, and expectations are evolving rapidly, and education must evolve with them. If anything, this iteration of INFO624 felt less like a static course and more like a snapshot of a moving target—one that students are now well-equipped to continue exploring.

Weekly Bookmarks / Ed Summers

These are some things I’ve wandered across on the web this week.

🔖 Review: Measuring AI Ability to Complete Long Software Tasks

Measuring AI Ability to Complete Long Software Tasks, a paper by dozens of authors working at Model Evaluation & Threat Research (METR). They define the “time horizon” metric and show that LLMs’ time horizons have been doubling every seven months, and this growth might have recently accelerated.

🔖 RADIO CAUSE COMMUNE 93.1 FM • PARIS

Radio Cause Commune is a Parisian community radio station that has been broadcasting on 93.1 FM since November 2017. Forty volunteers, zero advertising, an annual budget of €60,000: we maintain strict editorial independence. We champion free software and media independence, and we build innovative technical tools for free radio broadcasting.

🔖 Pourquoi je n’utilise pas l’IA

AI annoys me. Deeply. Well, mostly generative AI (you know, LLMs), because I can see a certain usefulness in some kinds of AI. Speech recognition, for example.

Let's run through my reasons for not using AI.

🔖 London Book Trades Database

The Bibliographical Society has just launched a redesigned version of the London Book Trades Database (https://lbt.bibsoc.org.uk/).

The original LBT database was the work of the late Michael Turner at the Bodleian Library, assisted by a number of collaborators, drawing particularly on the archival resources of the Stationers’ Company. A web version of the database was created in 2009 which eventually ran on servers at the Bodleian until it was closed down in 2024 as its software was long past its expiry date.

The Bibliographical Society has taken steps to revive the project, this time as a read-only MediaWiki resource based on a new extraction of the data from the original database created by Michael Turner and a radical redesign of the contents and interface (I led this work). This new version, known as LBT Version 2, does not yet contain all the original data, but the people, events, titles, and relationships make it immediately useful. We envisage two or three updates in the coming months as more contents are retrieved and restructured. The new web site has explanatory pages with a full history of the project and its new technical implementation.

In addition to all the famous names of the book trade up to the mid-nineteenth century, entries offer information for more minor figures including family members and apprentices. There are entries for nearly 35,000 people, presenting detailed accounts of the person’s interaction with the Stationers’ Company and data from published sources.

🔖 Wikipedia Bans AI-Generated Content

After months of heated debate and previous attempts to restrict the use of large language models on Wikipedia, on March 20 volunteer editors accepted a new policy that prohibits using them to create articles for the online encyclopedia.

“Text generated by large language models (LLMs) often violates several of Wikipedia’s core content policies,” Wikipedia’s new policy states. “For this reason, the use of LLMs to generate or rewrite article content is prohibited, save for the exceptions given below.”

🔖 Trump administration requests Stanford Medical School admissions data, claiming racial discrimination

The Trump Administration opened investigations into admissions policies at the medical schools of Stanford University, Ohio State University (OSU) and the University of California, San Diego (UCSD) on March 25, noting possible race discrimination.

In letters sent to the three schools, the Department of Justice (DOJ) requested data on the last seven years of admitted classes at the medical schools, threatening to withhold federal funding if the schools do not comply by turning over the data requested by April 24. The investigation is part of a larger crackdown on higher education, as the DOJ has launched dozens of investigations into universities during Trump’s second term.

🔖 Discounted Cumulative Gain

Discounted cumulative gain (DCG) is a measure of ranking quality in information retrieval. It is often normalized so that it is comparable across queries, giving Normalized DCG (nDCG or NDCG). NDCG is often used to measure effectiveness of search engine algorithms and related applications. Using a graded relevance scale of documents in a search-engine result set, DCG sums the usefulness, or gain, of the results discounted by their position in the result list.[1] NDCG is DCG normalized by the maximum possible DCG of the result set when ranked from highest to lowest gain, thus adjusting for the different numbers of relevant results for different queries.
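The definition above can be sketched in a few lines of Python (standard formulation; the log base-2 discount with 1-based ranks follows the usual convention):

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: each result's graded relevance is
    discounted by log2 of its (1-based) rank; rank 1 gets no discount."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    """Normalize by the ideal DCG (the same results sorted best-first),
    so scores are comparable across queries with different relevance sets."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# A ranking that buries the most relevant document scores below 1.0:
print(round(ndcg([1, 3, 2, 0]), 3))  # 0.817
print(ndcg([3, 2, 1, 0]))            # 1.0 (already ideal order)
```

A perfectly ordered result list always scores exactly 1.0, which is what makes nDCG comparable across queries.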

🔖 axios Compromised on npm - Malicious Versions Drop Remote Access Trojan

axios is the most popular JavaScript HTTP client library with over 100 million weekly downloads. On March 30, 2026, StepSecurity identified two malicious versions of the widely used axios HTTP client library published to npm: axios@1.14.1 and axios@0.30.4. The malicious versions inject a new dependency, plain-crypto-js@4.2.1, which is never imported anywhere in the axios source code. Its sole purpose is to execute a postinstall script that acts as a cross platform remote access trojan (RAT) dropper, targeting macOS, Windows, and Linux. The dropper contacts a live command and control server and delivers platform specific second stage payloads. After execution, the malware deletes itself and replaces its own package.json with a clean version to evade forensic detection.
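One way to check whether a project pins the flagged releases is to scan its npm lockfile for the version strings named above. This is a minimal sketch, assuming the npm v2/v3 `package-lock.json` layout with a top-level `packages` map; the package names and versions come from the advisory:

```python
import json
from pathlib import Path

# Known-bad releases named in the advisory above.
BAD = {"axios": {"1.14.1", "0.30.4"}, "plain-crypto-js": {"4.2.1"}}

def scan_lockfile(path):
    """Return (package, version) pairs from an npm v2/v3 package-lock.json
    that match the known-bad releases listed in BAD."""
    lock = json.loads(Path(path).read_text())
    hits = []
    for entry_path, meta in lock.get("packages", {}).items():
        # Entry keys look like "node_modules/axios"; the root entry is "".
        name = entry_path.rsplit("node_modules/", 1)[-1]
        if meta.get("version") in BAD.get(name, set()):
            hits.append((name, meta["version"]))
    return hits
```

Running `scan_lockfile("package-lock.json")` in a project directory returns an empty list when none of the flagged versions are pinned; non-empty output means the lockfile references a compromised release.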

🔖 webweigh

A rust CLI that calculates the file size of a web page when loaded with all external resources.

🔖 Phosphor Icons

Phosphor is a flexible icon family for interfaces, diagrams, presentations — whatever, really.

🔖 Departures (2008 film)

Departures (Japanese: おくりびと, Hepburn: Okuribito; “one who sends off”) is a 2008 Japanese black comedy drama film directed by Yōjirō Takita and starring Masahiro Motoki, Ryōko Hirosue, and Tsutomu Yamazaki. The film follows a young man who returns to his hometown after a failed career as a cellist and stumbles across work as a nōkanshi—a traditional Japanese ritual mortician. He is subjected to prejudice from those around him, including from his wife, because of strong social taboos against people who deal with death. Eventually he repairs these interpersonal connections through the beauty and dignity of his work.

🔖 Infrastructure Landlords: The Rentier Capitalism of Commercial Academic Publishers

If you want to understand where the commercial parts of scholarly communications may be heading, you need to look beyond policy documents, conference panels, or public-facing strategy statements. You should look at what large commercial actors say when speaking to investors. Earnings calls are one of the places where that language becomes especially revealing: less concerned with sector ideals than with growth, market opportunity, competitive position, and what will ultimately generate value for shareholders. For this reason, it can be worthwhile to review earnings calls and investor presentations, as these are often overlooked when discussing OA policy and sectoral movements.

🔖 AI got the blame for the Iran school bombing. The truth is far more worrying

Someone decided to compress the kill chain. Someone decided that deliberation was latency. Someone decided to build a system that produces 1,000 targeting decisions an hour and call them high-quality. Someone decided to start this war. Several hundred people are sitting on Capitol Hill, refusing to stop it. Calling it an “AI problem” gives those decisions, and those people, a place to hide.

🔖 Guibo

GUIBo is a desktop GUI for operators and developers who run Kubo (the IPFS daemon in Go). It drives your node through Kubo’s HTTP RPC API so you can work with pins, UnixFS content, IPNS, remote pinning, gateways, and network or repo diagnostics without living in the terminal.

🔖 The Human Line Project

At The Human Line, we are committed to ensuring that AI technologies, like chatbots, are developed and deployed with the human element at their core. LLMs are powerful tools, and with Ethical design, users can gain new skills and knowledge while remaining emotionally intact.

🔖 Marriage over, €100,000 down the drain: the AI users whose lives were wrecked by delusion

Tech-related delusions, whether they involve train travel, radio transmitters or 5G masts, have been around for centuries, Morrin says. “What’s different is that we’re now arguably entering an age in which people aren’t having delusions about technology, but having delusions with technology. What’s new is this co-construction, where technology is an active participant. AI chatbots can co-create these delusional beliefs.”

April 2026 Early Reviewers Batch Is Live! / LibraryThing (Thingology)

Win free books from the April 2026 batch of Early Reviewer titles! We’ve got 237 books this month, and a grand total of 2,796 copies to give out. Which books are you hoping to snag this month? Come tell us on Talk.

If you haven’t already, sign up for Early Reviewers. If you’ve already signed up, please check your mailing/email address and make sure they’re correct.

» Request books here!

The deadline to request a copy is Sunday, April 26th at 6PM EDT.

Eligibility: Publishers do things country-by-country. This month we have publishers who can send books to Canada, the US, the UK, Ireland, Australia, Luxembourg, Belgium, Sweden, Spain, Slovenia and more. Make sure to check the message on each book to see if it can be sent to your country.

The Responsible PartyVenus, VanishingThe BrunswickDiscovery By DesignThe Anti-Marriage PactA Short History of San FranciscoLike Friends, Like Foes: Japanese Americans and Nevada Through World War IIFantastic Tales of SteampunkOnce upon a Wintry Krampusnacht EveNature's Echo: Harnessing Ancient Feedback Loops to Heal a Changing PlanetBillie Builds a RoboCornNow I See SpringThe Summer I Found YouThe Patriot's DaughterIt Came from NeverlandWould I Lie to You?Sightings: PoemsThe Sacred Path of SimplicityThe Alchemy of Motherhood: Unspoken Truths of Birth Trauma and the Postpartum JourneyThe Alchemy of Motherhood: Unspoken Truths of Birth Trauma and the Postpartum JourneyA Bad Deal in Mormon LandPraise God for PastiesSib Squad: Hole Lotta Trouble!Who Is Jesus?: Easter DevotionalUnsung Canaan Ballads: A Collection of PoemsMan AfieldReflections of a Woman's Life: A ChapbookBeating Heart of the World: The Taos Art Colony, the Pueblo Resistance, and the Battle for Indigenous AmericaWe've Been Here Before: How Rebellion and Activism Have Always Sustained AmericaRemembering Roots: How an American Classic Transformed the WorldAnd Then We Saw the Bag . . 
.: Trash to Them, Treasure to UsAll the Colors of Life Deluxe Gift Edition: An Illustrated Coffee Table Book for Occasions and CelebrationsBeirut ExtractionFat Bitch: Killing the Willpower Myth: An Empowering Guide to GLP-1 Weight Loss Medicine, Healing from Trauma, and Building Lasting HappinessDiodeThe Calamity ClubThe Fire AgentBubbles, Roses, and RumpThe Sea CureRunning Wild Novella Anthology Volume 9 Book 2Sounds Like Trouble to MeJen & Gary's Infinite (Quantum) EntanglementsRun, Rabbits, Run!NecromaniaDifferent RoadsThe Role of Dental Nurses in Oral Health Promotion to Prevent Dental Caries in Children within General Dental PracticesDispatches from Grief: A Mother's Journey Through the UnthinkableJungle of AshesTent CitySkies of Fire and SmokeA Love Once LostThumbin' The Rock: A Newfoundland Hitchhiking OdysseyMarigold GreyCalifornia Fever Dream: A MemoirEscapePeasUndesirable: The Vietnam War and A Father's Battle for JusticeOn the HookFind Me in the StoryApolloRenegadeFlightlessEternal EnchantmentThe Dead of DayBusiness Sustainability Essentials You Always Wanted to KnowLearning and Development Essentials: A Practical Guide to Designing Learning Programs, Driving Business Impact, and Achieving Organizational ExcellenceCorporate Finance Essentials You Always Wanted to KnowWhen We Forgive: Stories of Hurt, Healing & Everything in BetweenStakeholder Management for Project Managers: A Practical Guide for Managing Projects and Engaging PeopleOrganizational Development Essentials You Always Wanted to KnowWriting Memoir in Flashes: Creative Ways to Tell Your True Stories, One Memory at a TimeSeed Starting Simplified for Beginners: A Complete, Step-By-Step Guide to Grow Healthy, Strong Seedlings Indoors, Avoid Common Mistakes and Transplant with ConfidenceLLC para Principantes : Manual Completo con Estrategias Paso a Paso para Crear, Estructurar y Hacer Crecer Tu Sociedad de Responsabilidad Limitada con Confianza y Visión a Largo PlazoSpindleheart: Wrath of the 
Ravelwind KnightWhat Does Your Face Mean?: An Informational Memoir on Late-Diagnosed AutismThe Chronicles of City NThe Statistically Unlikely ReboundPanthera's HavenBenightedThe Cardboard KingBodega Botanica Tales: CarmenSpeak of the DevilMurder in the GyreDead AccountDon't Blame Sam!When the Sun DiedThe Bane of DragonsCanopy: A Collection of Stories.CommodoreThe Woman from WarsawThe Loss of What Is Past4 Weeks to Total Sleep Mastery: A Proven System to Maximise Your Recovery and Energy in Just 30 DaysBlind ItemBraving the Dawn: A Novel of New FranceShepherds of the Lost: Family SecretsThe Demon King Is a Merchant: The First StepsKeep Them CloseC is for Childhood Cancer: And Other Lessons Cancer Taught MeGaits of MagicCover to Cover: What First-Time Authors Need to Know About Editing (Revised Edition)My Twin the MurdererZicky: Wrath of the Rat KingWoodstake: Three Days of Peace, Music and BloodBaptismThe Echoes of the WatchtowerDark ShadowsThe Last PillThis Too Shall Pass?: Honest Words for Moral InjuryNocturneWhat the Island AsksGo Help YourselfQuinto's ChallengeThe Shapeshifter's GambitDrummer Girl: How I Became MetalBreaking the Simulation: An Ancient Path Back to RealityDon't TellTen Stories from Arab History: From Ancient Yemen, Through the First Islamic Civil War, to the Fall of al-AndalusBroken Mirrors, Steady GroundA Christmas at Ballymore CaféNever ForgiveBent Cop: Johnny Takes Out The TrashMassawa: A Tale of Espionage, Love, and IllusionThe Emotional Side of Money: A Roadmap to Financial WellnessCarrying the UnseenSIGNPOST!: A Map for Resilience CultureHold Without Panic: How to Swing Trade 2-3 Hours Per Week While Working Full-TimeSolarflameThe Wizard, The Pirate, and The Steampunk Librarian30 Days of Transformative Holistic Healing: A Guided Self-Healing Journey to Rebalance Your Mind, Body, Energy, and Nervous SystemChanneling MarilynThe Taste of Glass in a Pillar of SaltFrom Sea to Shining Sea: 50 Daily Devotions from Traveling to Every State in 
AmericaOnly Breath & ShadowLove In The Time Of AmericaThe Broken HeirHunting in Africa: An African Safari ThrillerThe White Highlands and the Mau Mau: With the Rucks, Leakeys, and Kikuyu Freedom Fighters, 1952-1961TheaThe Double-Headed EagleThe Paine SocietyNorman & The Stinking Space GooMoonflowerScarlett UndoneThe Dancer's Shadow: An Isekai RomantasyA Khmer Legend of Love and Destiny: An Isekai RomantasyIf Love Doesn't Make a Family...: Nothing Is What It Appears to BeIris Blackwood and the Curse of Hemlock IslandFunny Things HappenDamaged: Life. Death. Memory. UncertaintyGood Grooming and a Healthy Respect for AuthorityAI Slayer/AI LiberatorDragon's BetrayalIntroduction to the Attribution of Literature: The Re-Attribution of the British 18th and 19th Century CorpusesThe Eight Keys: Opening to the Mysteries of Cosmic HarmonyThat Murder FeelingNot a Fairytale Ending: The Rewriting of My StoryRedemption RowThe Rescue Fantasy: Why Capable Women Stay Stuck and How to Reclaim the Power to Lead Your LifeThe Manual for the Ambitious Man: The Systems No One Taught You About Success, Emotions, and Becoming a ManThe Inner Workings of the Outer Layer: A History of Bicycle Tire Sizes and StandardsWords Were The Enemy: A Novel in VerseThe Second WorldMurder Most SaurianYour Verdict: A Judge's Reckoning with Law and LossMysteries Beyond KnowledgeDefenders: Reign of the BugsFractureTo the Moon and BackThe Track of the EyelidsShakespeare's Vengeance - Every Role Comes with a PriceThe Cost of KnowingPaul Bunyan: An American Folk LegendMy First Colonoscopy: A Comical Look at the Prep, the Procedure, and the Relief AfterwardThe Last Human Advantage: Why Thinking Clearly Matters More Than Ever in the AI EraThe Four WindsAfter The LakeBarking Orders: A Dog's Diary of Chaos, Loyalty, and Squirrel SurveillanceTrue and Absurd Lawsuits That Really Happened: The Curious Case Files of Sherlock GrantAncilla: Master, Teach MeWhen the Word Became FleshAncilla: Master, Teach MeAmish Remedies: 
400+ Amish Herbal Remedies & Kitchen Traditions: Natural Healing, Holistic Wisdom, and No-Fluff Wellness for Everyday LifeHeritage In Motion: Champion SwimmerThe Octopus Myth: What We Really Know about Octopus IntelligenceThe Indie Author's Tax Survival Guide: A Practical Guide to U. S. Taxes for Self-Publishing AuthorsOrton-Gillingham Decodable Stories: Level 7 - A Day at the Beach: Structured Literacy Decodable Reader for Developing ReadersThree and Thirty Pieces of InsanityBarking Orders: More Funny Adventures of a Very Opinionated Cattle DogPoems of The New EvangelionQueenslanderLebanon: A Country for No One & EveryoneTerr-or-Treats: Spooky Ghost Stories and Deliciously Haunted AdventuresDarleneWonderful HalfAre You Speedy?Thursday Night Tiki Lounge: 52 Drinks That Bring the Tropics HomeThe Sages of the Hidden Road: A Parable for the Weary SoulBe a Bookworm, Not a BullyThe Grasshopper Lost Its WingsThe Reel Life of Zara KeggWest ShoreMath Heals: On the Gift and Weight of Being HumanRetirement Planning Simplified: The Complete Step-by-Step Guide to Building Lasting Income, Cutting Taxes, and Retiring with Confidence (Updated for 2026)Aunt Rosie's FarmThe Captive CommanderAgainst All OddsVault of Secrets: Shelter for Your Cloak-and-Dagger TruthsSore Like an EagleA Ravishing AbominationThe Story Eaters of YammirlThe Hound of Troy: The Vengeance of HecubaNew Life for a Dead ManThe Glass FieldMan of a Thousand Fails: Film Noir of Elisha Cook Jr.The Million-Dollar Sentence: The Secret of the Valley of PeaceDear Missing FriendDear Missing FriendA Penance for CrowsThe Focus Equation: 21 Secrets to Boost Your Focus in a Distracted WorldJonah and Mira: The Map Beneath the OakA Curse of Wings & GemsWildfire & The Sun PrinceThe Land of Milk and Honey: An Italian Immigrant's Journey from Rags to Riches in AmericaThe Land of Milk and Honey: An Italian Immigrant's Journey from Rags to Riches in AmericaHow to Master the Power of Silence for Emotional Control: Step-By-Step 
Methods to Stop Overreacting and Stay in ControlClass Is in Session: Teaching Through the ChaosBefore the Pharaohs: The Lost Mega-Cities of Old Europe and the Mystery of the Ritual FireAn Enduring SparkEveryone Is Perfect HereEleven Pillars: A Framework for Self-Mastery and the Long GameConnecting Goals to Impacts and Outcomes: Harnessing Structured Conversations for Customer-Driven Value DeliveryWelcome to Weirdsville: The Incredible True Story of Weirdsville and All the Weirdos Who Live ThereWelcome to Weirdsville: The Incredible True Story of Weirdsville and All the Weirdos Who Live ThereSwords Over the StarsCaenogenesisOur Better NatureThe Blood of Birds: A King David-Era Thriller

Thanks to all the publishers participating this month!

Alcove Press Aquarius Press Arctis Books USA
Broadleaf Books CMU Press Cozy Cozies
Crooked Lane Books Cynren Press Flat Sole Studio
Harper Horizon Harper Muse Haven
Henry Holt and Company Highlander Press History Through Fiction
Infinite Books Inkd Publishing LLC Life to Paper Publishing
NeoParadoxa Noble Legacy Publishing Open Books
Paper Phoenix Press Penelope Pipp Publishing Pink Crow Press LLC
Prolific Pulse Press LLC PublishNation Real Nice Books
Revell Running Wild Press, LLC Shadow Dragon Press
Somewhat Grumpy Press Spiegel & Grau Sunrise Publishing
Thinking Ink Press Tundra Books Type Eighteen Books
University of Nevada Press University of New Mexico Press unLit Publishing
Unsolicited Press Vibrant Publishers W4 Publishing, LLC
WorthyKids

March 2026 Early Reviewers Batch Is Live! / LibraryThing (Thingology)

Win free books from the March 2026 batch of Early Reviewer titles! We’ve got 226 books this month, and a grand total of 3,026 copies to give out. Which books are you hoping to snag this month? Come tell us on Talk.

If you haven’t already, sign up for Early Reviewers. If you’ve already signed up, please check your mailing/email address and make sure they’re correct.

» Request books here!

The deadline to request a copy is Wednesday, March 25th at 6PM EDT.

Eligibility: Publishers do things country-by-country. This month we have publishers who can send books to the US, the UK, Israel, Australia, Canada, Ireland, Germany, Malta, Italy, Latvia and more. Make sure to check the message on each book to see if it can be sent to your country.

The Great WhereverExceptional Hatred: Antisemitism and the Fight for Free Speech in Modern AmericaProcrastination Proof: Never Get Stuck AgainRules to Live By: Maimonides' Guide to a Wonderful Life (HEBREW EDITION)Endless Exodus: The Jewish Experience in EthiopiaBlue Team Dynamics: Three Proven Leadership Principles Inspired by IDF Sources for Business and LifeSons of Abraham: A Candid Conversation about the Issues that Divide and Unite Jews and Muslims (HEBREW EDITION)Sons of Abraham: A Candid Conversation about the Issues that Divide and Unite Jews and Muslims (ARABIC EDITION)Puzzles She PackedBloom Of BetrayalNever Hide from the DevilBowers Mansion: The Legacy of a Comstock FamilyTangential Terrains: Cormac McCarthy's GeoaestheticsA Future For Ferals: A Charity AnthologyMore Futures for Ferals: A Charity AnthologyHow to Create an Organic Aquarium: The Beginner's Guide to Soil-Based Freshwater AquariumsRonald, the RoninDying to Live HereThe Unfavored Children's ClubSea SudsFaking to FallingBunnies in the Berry RowThe CorryJack Rittenhouse: A Western Literary LifeArthur and the Kingswell TrioMantleSome Stupid Glow: StoriesDollartoriumWhen Paris WhispersThe Night Nurse and the Jewel ThiefHeroes of PALMAR: How One IDF Unit Revolutionized Combat Medicine in GazaWhen Eichmann Knocked on Our Doorאיש כפי נחלתו: שנים-עשר שבטי ישראל בנחלות אבותיהםFamily DramaThe Son Of A Belfast Man: From the Early Years Up to Nineteen Years OldClaimed by DarknessThe Alfriston QuartetJaguars and Other GameJungle of AshesShooting Up: A Memoir of Love, Loss, and AddictionWarp & WeftHere for a Good TimeCanada: We Are the StoryRuthieA Deadly InheritanceFly in the ChaiMjede: The Three DaysSince You Weren't There and Other MemoriesQuestions for Werewolves: A Creative Nonfiction of Madness, Witch and DaimonEstuaryI'll Stop From MondayThe Marilyn DiariesNever Hide from the DevilThe Greatest New York Yankees by Uniform NumberThe Blue WaveCalisthenics: Core Crush: 38 Bodyweight Exercises for a 
Stronger CoreLightningShadows of the Republic: The Rebirth of Fascism in America and How to Defeat It for GoodDigital Coup: The Conspiracy to Thwart Global DemocracyWeathering the Storm: Navigating the Anti-Social Justice WaveConversion Therapy Dropout: A Queer Story of Faith and BelongingThe Christian Past That Wasn't: Debunking the Christian Nationalist Myths That Hijack HistoryPuppy Training: The Smart Way7 Spiritual Habits to Change Your LifeInvesting for BeginnersWitch of the Shadow WoodThe Last PageWe Become DarknessPondering: A Story in CinquainsBy the Bubbling BrookTaming the AlphaTo See BeyondThe Fallen: The Lost Girls of Ireland's Magdalene Laundries and a Legacy of SilenceSeed Starting Simplified for Beginners: A Complete, Step-by-Step Guide to Growing Healthy, Strong Seedlings Indoors, Avoiding Common Mistakes & Transplanting with ConfidenceContinuous Improvement Essentials You Always Wanted to KnowBetter: A Guidebook to a New and Improved YouDigital SAT Reading and Writing Practice QuestionsDigital SAT Math Practice QuestionsThe Theater: Courage and Survival in the Defining Atrocity of the Ukraine WarOur Minds Were Always Free: A History of How Black Brilliance Was Exploited--And the Fight to Retake ControlInheritance: Nick Chambers Slayer for HireSuperteams: The Science and Secrets of High-Performing TeamsPrickles and PridesNo Further Action: Ten Short StoriesPermit to StayLife Is Terminal: And So Is This Cold SoreThe Tarishe CurseIndian Warner: Son of Two WorldsSpindleheart: Wrath of the Ravelwind KnightThe Sure Thing: A Pleasure Practice to Revive the SparkEssence MergingQasida for When I Became a WomanNo Winning This WarMan of a Thousand Fails: Film Noir of Elisha Cook JrRed DemonSticks and Stones and Dancing Cranes: The End of the BeginningFool: A Tudor NovelWho in Astrology Are You?Stillness and Survival: A Life Between Trauma, Glitter, and the Echo of My Own VoiceThe Florist's Budding DesireFission: A Novel of Atomic HeartbreakEmberglow Falls 
Academy: The Legacy of MagicThe Jolt: A Time-Slip RomanceHaggadahpalooza: The Unofficial Weirdly Perfect Passover Pop Parody PanoplyTwo x ThreeMother of Assassins: A Memoir of the ImaginationInner, The Breath of God, Volume 1Play From Your HeartLegends of Mexico Coloring Book: Mythical Tales and Folklore to Color and EnjoyThe Golden Apple and the Nine Peahens: A Balkan Orchard TaleConnection:LostOne of a Kind CreaturesC is for Childhood Cancer: And Other Lessons Cancer Taught MeThere's a Young Man Dressed in BlueChivalry & ChocolateCaput Mundi: The Head of the WorldCain's ChameleonThe Lion's DenCain's ChameleonOn Moreton WatersThe Million-Dollar Sentence: The Secret of the Valley of PeaceA Moment's SurrenderLogos Palimpsest: Layered Verses of My Myths and MemoriesFelicity Fire and the Forever KeyMinds & Moods: Power & Deception Crossword PuzzlesTrue & Absurd Lawsuits: The Cases Kept ComingDear Missing FriendIn His Absence: A Brother, A Life, and What EnduresWill's WakeDesert Superstars: A Patience & Perseverance Coloring Adventure: A Mindfulness Coloring Book with Desert Animals, Patience-Building Prompts, and Mindful SEL Adventures for Growing HeartsOur Better NatureThe Pioneer Converts: The Message of HopeThe Black Knight: Miqdad Historical NovelThe Gardener Parent: Stop Yelling and Start Guiding Using Ericksonian MethodsBlütenschwere : Roman über Die Gewalt der AuslöschungThe Weight of Petals: A Story of Memory and ResistanceThe Problem with Conspiracy Theories: Real Scandals, Fake Mysteries, and How Distrust Took OverCity of the Gods: The Return of Quetzalcoatl (15th Anniversary Edition)The Three-Bullet Act: Journal of an HR DirectorThe Shapeshifter's GambitThe Vampyre ClientJeannie's Bottle: IncantationsFated RebirthLove and Ghosts at Hideaway LakeJonah and Mira: The Map Beneath the OakChangeupA Gift of RevelationsBachelorx: A Nonbinary MemoirA Strange SoundThe Rising of the WolvesThe Rising of the WolvesThe Missing FrameCaenogenesisThe Standard: 38 Standards 
of LifeThe Caregiver's Game: Unraveling Financial Deceit in the Shadows of DementiaClass Is in Session: Teaching Through the ChaosPolitics and Morality: The Problems of Ethical Debate for an Evolved Social SpeciesThe Book of Peace AphorismsTerrestrialQueenslanderThe Blood of Birds: A King David-Era ThrillerA Look into Mirrors: Their Making and Use Throughout HistoryThe Coherent Website: Designing for Trust in the Age of SearchHuman Again: In the AI AgeCut to the QuickThe Clockwork SpyYou CancerViveActs Of FaithThe HuntedAbba, Father!: A Journey to Knowing God in His Greatest Role of AllMidnight MeowsA Night of Strange DreamsAunt Rosie's FarmClose Encounters with Tort$Rewriting Your Life: A Workbook On Self-DiscoveryEpic Health & Ultimate Training: A Self-Help Workbook For Becoming StrongConnecting Goals to Impacts and Outcomes: Harnessing Structured Conversations for Customer-Driven Value DeliveryTrust and Treason: The RiseThe Last Phone CallWhen We Came Full CircleWhen Bonds Were ForgedThe Waterfall of VengeanceRain and Sun: Confessions of Love, Silence, and an Irrevocable PastAn Unsuitable Knight: A Novel of Norman ItalyBound by the ElementsMarriage Supper, Clearing GoatWord Fill in Puzzles: Large Print Puzzles for Seniors with over 70 Nostalgic Brain Games to Keep Your Mind Sharp and Active (Solutions Included)Yours Rhetorically, Cold Blue Monster: A Criminal Counseling Text-MoirMidnight BallerinaThe Agentic Loop: How Humans + AI Build Experiences That LearnThat Which Does Not Kill Us: An Intergenerational Memoir of Legacy TraumaIn the Belly of the AnacondaFree Will: Resolving the MysteryFree Will: Resolving the MysteryTattle Royale: Burn BookRupture Threshold1,2&3 John Bible Study: Dwell in LightThe Nutcracker - Gird Thy LoinsThe Magic SeekerNyxalath Heirophant of VeilsReed CityTerr-or-Treats: Spooky Ghost Stories and Deliciously Haunted AdventuresIncunabulaI Don’t Hum Anymore: A Confession of Silence, Survival, and City MadnessGolden LightI Raised Monsters: A 
Failed Teacher's Confession — Prisoner 4782A Florida Dance: Life Stories from the Sunshine StateCavern Sanctuary: After the FalloutDeep Work for Distracted People: Simple Methods to Stay Focused, Think Clearly, and Finish What MattersThe Law of the Spirit of Life: God's Design for a Life of Effortless TransformationOne-Page Wealth Compass: Fired at 63 Nearly Broke - Safely a Millionaire by 69The Dog BookThis Fell SergeantThe Secret Winners ClubDear Missing FriendThe FallYour Business Growth Playbook: Breakthrough Strategies to Scale Your Business for Business Owners Who've Outgrown HustleBeyond the Crystal SkyYpresMore Than ChemicalOld EarthHealthy Minds, Healthy Nation: How Meditation, Shamanism, and Indigenous Healing Can Tap into Your Light Within and Change the WorldAfter We BreakData Science in 7 Days: Python Fast-Track with Hands-on ProjectsBash and Lucy Say, Love, Love, Bark!Thinker Reads Start With Why: How to Find Your Why and Dare to Lead a Purpose Driven Life in 3 Steps Even If You’re Starting From Zero

Thanks to all the publishers participating this month!

Alcove Press Artemesia Publishing Baker Books
Bellevue Literary Press Broadleaf Books Brother Mockingbird
Cennan Books of Cynren Press City Owl Press Cozy Cozies
Egg Publishing Entrada Publishing eSpec Books
Fawkes Press Featherproof Books Gefen Publishing House
Gnome Road Publishing Grand Canyon Press Greenleaf Book Group
Hawthorn Quill Publishing Henry Holt and Company History Through Fiction
Infinite Books Inkd Publishing LLC Lito Media
PublishNation Pure Calisthenics Riverfolk Books
Running Wild Press, LLC Simon & Schuster Tundra Books
University of Nevada Press University of New Mexico Press Unsolicited Press
Vibrant Publishers W4 Publishing, LLC WorthyKids

ADA Title II Urban Legends: Sorting Fact from Fiction About the 2024 Updates / Digital Library Federation

This post has been authored by members of DLF’s Digital Accessibility Working Group (DAWG). As the April 2026 Title II compliance deadline approaches, members of the working group began comparing notes on the misconceptions, half-truths, and “urban legends” circulating about ADA enforcement and digital accessibility requirements. What started as a lively discussion evolved into a collaborative blog post aimed at separating fact from fiction. Drawing on shared institutional experiences and community expertise, this post addresses common myths about Title II and provides clear, accurate guidance along with helpful resources.

ADA Title II Urban Legends

In 2024, the Department of Justice released updates to Title II of the Americans with Disabilities Act (ADA). These updates provided specific requirements for ensuring that web content and mobile applications (apps) are accessible to people with disabilities, establishing Web Content Accessibility Guidelines (WCAG) Version 2.1, Level AA as the technical standard for agencies subject to Title II. These entities have until April 2026 or April 2027, depending on the size of the state or local government, to meet the requirements. As many institutions rushed to ensure compliance, misinformation and common urban legends about accessibility came to the surface. This post endeavors to answer some of these common questions and myths encountered by members of DAWG. Thanks to the members of DAWG who contributed.
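As an illustrative aside, not part of the original post: WCAG 2.1, Level AA spans many success criteria, and conformance cannot be fully automated, but some criteria can be partially machine-checked. A minimal sketch using only Python's standard library, flagging `<img>` elements that lack an `alt` attribute (related to success criterion 1.1.1, Non-text Content):

```python
# Illustrative sketch: one tiny automated check toward WCAG 2.1
# success criterion 1.1.1 — flag <img> tags with no alt attribute.
from html.parser import HTMLParser

class MissingAltChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.missing = []  # src values of images lacking alt text

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attrs = dict(attrs)
            if "alt" not in attrs:
                self.missing.append(attrs.get("src", "<no src>"))

checker = MissingAltChecker()
checker.feed('<p><img src="logo.png" alt="Library logo">'
             '<img src="banner.jpg"></p>')
print(checker.missing)  # ['banner.jpg']
```

Automated scans like this catch only a fraction of WCAG issues (and cannot judge whether existing alt text is meaningful), so manual review remains essential.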

1. Only people who are blind use screen readers! 

  • Contributor(s): Karen Grondin
    • While it is true that people who are blind or have low vision make up the majority of screen reader users, according to the WebAIM Screen Reader User Survey #10, just over 10% of screen reader users reported that their screen reader use was not due to a disability. Screen reader users also reported the following disability types: cognitive or learning (5.2%), motor (2.2%), and other (4.9%). 5.3% of users reported being both deaf/hard of hearing and blind.

2. Everyone who is blind can read Braille.

  • Contributor(s): Jasmine Clark

3. Faculty are 100% personally responsible for remediating any content they teach with.

  • Contributor(s): Jasmine Clark
    • Agencies, universities, and other entities that fall under Title II are responsible for compliance. Employers are responsible for violations carried out by employees and contractors whose services they utilize (employer obligations are better outlined in “The ADA: Your Responsibilities as an Employer”). While a university may choose to mandate that faculty teach with accessible materials, the university is ultimately responsible for ensuring that happens. It is possible that a faculty member could be held liable if they refuse to comply with their university’s mandates, but that would most likely still be a case of joint responsibility. 

4. I will have to strip my course of all engaging content (rather than spend the time learning how to make it accessible).

  • Contributor(s): Jasmine Clark
    • Taking the time to think about ways to remediate content and incorporate universal design principles into your teaching will only benefit you. There are resources like Universal Design for Learning that are readily available to help you get started. In many ways, expanding the modalities available to your students will enhance the educational experience for all of them, not just disabled students.

5. Title II applies only to accessibility for the blind, and designing for screen reader accessibility will solve all accessibility needs.

  • Contributor(s): Jasmine Clark

6. Title II applies to all disabilities, and all content must be made universally accessible for all disabilities in advance, so that no one ever needs to disclose or ask.

  • Contributor(s): Jasmine Clark; Jon B; PF Anderson
      • As stated above, the main changes made to Title II revolve around making state and local governments’ web and mobile apps meet the Web Content Accessibility Guidelines 2.1 (WCAG 2.1), Level AA. WCAG focuses on modalities and functional needs instead of medical diagnoses.
        • “Digital technology designed for people with a broad range of abilities benefits everyone, including people without disabilities. It is, therefore, important to consider the broad diversity of functional needs rather than categorize people according to medical classifications.” – From Diverse Abilities and Barriers
      • However, while the guidelines try to be as universally applicable as possible, there will be those who need additional accommodation. WCAG 2.1, Level AA is meant to be the minimum. As it is the universal standard that applies automatically, users do not need to disclose or request this. But needs that go beyond that baseline would still require an individual to disclose their disability to formally request an accommodation.
        • As PF Anderson put it:
          • “… there is no one-size-fits-all accessibility. Making something accessible is dependent on context and need. What improves accessibility for one person often creates new problems for someone else. We can make changes that have a high bang-for-the-buck ratio, things that tend to help more folk, however there is always someone who needs something that is a bit off the beaten path.”
    • ADA adheres to the principle of Reasonable Accommodation (RA). An RA is defined as “any change in the work environment or the way things are customarily done that provides an individual with a disability equal access to employment opportunities, benefits, and privileges. An RA can cover most things that enable an individual with a disability to apply for a job, perform the essential functions of a job, and/or have equal access to workplace opportunities, benefits, and privileges. There are three categories of RAs: 
      • 1. Modifications or adjustments to the job application process to permit an individual with a disability to be considered for a job; 
      • 2. Modifications or adjustments necessary to enable a qualified individual with a disability to perform the essential functions of the job; 
      • 3. Modifications or adjustments that enable employees with disabilities to enjoy equal benefits and privileges of employment.” – From Administrative Communications System U.S. Department of Education Handbook for Reasonable Accommodations (PDF)
    • RAs have exceptions for undue hardship. “An undue hardship means that a specific accommodation would cause significant difficulty or expense to the employer. The determination of whether providing a RA would create an undue hardship for the Department is always made on a case-by-case basis. In making this determination, ED needs to consider factors such as: the nature and net cost of the accommodation, the Department’s overall size and financial resources, the type of operation, and the impact of the accommodation upon the operation, including the impact on other employees’ ability to perform their duties and the Department’s ability to conduct business. The Department bears the burden of proof to demonstrate that providing an accommodation would cause an undue hardship.” – From Administrative Communications System U.S. Department of Education Handbook for Reasonable Accommodations (PDF)
    • This process is inherently dependent upon disclosure and the request for accommodation. 

7. Title II applies to all our buildings, and the campus has to have 100% physical accessibility by April.

  • Contributor(s): Jasmine Clark; Wen Nie Ng
    • While Title II of the ADA broadly governs accessibility for state and local government services, programs, and activities, both physical and digital, the 2024 updates to Title II apply only to state and local governments’ web and mobile apps, not to physical spaces.

8. Title II applies to events, and ALL events must be hybrid in the future because in person events are not accessible.

  • Contributor(s): Jasmine Clark; Wen Nie Ng
    • Title II does not require all events to be hybrid; it requires that events be accessible. Making events hybrid is one of multiple approaches to making them more accessible. However, as stated above, the most recent updates to Title II apply to state and local governments’ web and mobile apps, not to physical spaces. So, if an event is physical, it is not subject to the April deadline. If an event is virtual, it must meet WCAG 2.1, Level AA.

9. Title II applies to events and, because ALL virtual and hybrid events will require CART and ASL, no one will plan virtual or hybrid events anymore, because we can’t afford it.

  • Contributor(s): Jasmine Clark; Amy Drayer

10. Title II applies to H.R., and interviews held by phone or Zoom will require captions for all candidates, because you can’t ask them to disclose.

  • Contributor(s): Jasmine Clark
    • Employment and hiring fall under Title I and Subpart C of Title II specifies that public entities are subject to the regulations within Title I. Potential employers are required to meet the standard for Reasonable Accommodation (see definition in question 6). To summarize, WCAG 2.1, Level AA is meant to be the minimum. As it is the universal standard that applies automatically, potential employees do not need to disclose or request this. But needs that go beyond that baseline would still require an individual to disclose their disability to formally request an accommodation.

11. Title II applies to internal staff meetings and clinic telemedicine visits, and captions must be enabled at the beginning of all meetings for all attendees, even if it is a confidential meeting.

  • Contributor(s): Jasmine Clark; Jon B
    • Organizations that provide healthcare are classified as public accommodations under Title III of the ADA. They were already required to make their services accessible regardless of these changes to Title II. The only change these entities will have to make is updating to meet WCAG 2.1, Level AA.
    • See question 6. If an employee requests captions, they must be provided. For confidential meetings, provide captions through a secure, encrypted service or offer real-time transcription that can be deleted after the meeting. If no employees currently require live captions, and maintaining an active subscription to a service would constitute an undue hardship, it would be wise to have a pre-approved vendor selected to provide services as needed. This avoids delays and complications when accommodations are requested.

12. The library will withdraw and discard historic content that cannot be remediated.

  • Contributor(s): Jasmine Clark
    • This would most likely be a library decision made after considering its weeding policies, not a Title II requirement. “Content” is a broad term and whether or not something can be remediated would have to be assessed on a case-by-case basis. There are exceptions depending on the format, purpose, and use of content.
      • Fact Sheet: New Rule on the Accessibility of Web Content and Mobile Apps Provided by State and Local Governments; see: Summary of the Exceptions

13. The library will withdraw and discard historic dissertations if the original author does not make them accessible (many of whom are dead).

  • Contributor(s): Jasmine Clark
    • Once again, this would most likely be a library decision, not a requirement under Title II. Dissertations can be remediated and, depending on their format and use, they may meet the requirements for an exemption.
      • Fact Sheet: New Rule on the Accessibility of Web Content and Mobile Apps Provided by State and Local Governments; see: Summary of the Exceptions

14. Everything we have in our digital collections and institutional repository is archival, so we get an exception.

  • Contributor(s): Jasmine Clark
    • This depends on how these materials are used, when they were created, what format they are, and whether or not they are required to be used. If a faculty member is requiring materials be used for a course, they will have to be remediated and made accessible. Once again, review the exception criteria before deciding whether or not your collections are indeed exempt.
      • Fact Sheet: New Rule on the Accessibility of Web Content and Mobile Apps Provided by State and Local Governments; see: Summary of the Exceptions

15. Full text in library databases already meets accessibility standards.

  • Contributor(s): D Krahmer
    • Even when a publisher provides an accessible file to a vendor, the vendor may reformat that information for their own databases in a way that actually destroys the accessibility. While more publishers AND vendors are shifting to epub and more accessible formats for digital content (such as EBSCO switching over to using LCP-DRM files), that doesn’t mean that material made public prior to the European Accessibility Act or the Title II updated regulations will be made accessible.

16. Everything has to be made accessible to receive a passing grade from screening tools, even if the changes make something less accessible.

  • Contributor(s): Jasmine Clark
    • Screening tools are just that, tools. Accessibility is meant to make content usable for people. While screening tools can help you identify potential barriers to access, they require actual people to verify whether there really is an issue and how to correct it. If you prioritize a passing mark from a screening tool and make something inaccessible, you will not be in compliance with Title II. 

17. AI will fix everything. 

  • Contributor(s): Karen Grondin; Jasmine Clark; Jon B
    • While AI can be helpful to people with disabilities, it cannot fix everything. Some people use AI-assisted smart glasses to help them tell whether the medications given to them by the staff in assisted living facilities are indeed the correct medicines. However, this technology is out of reach for most due to cost (a pair of AI smart glasses costs a few hundred dollars, not including a subscription to the AI service). Recent review studies that examine this urban legend conclude that AI does not solve all accessibility issues.
    • Some additional AI limitations: 
      • Synthesized auto-generated text can make up content wholly unrelated to anything being said or described.
      • The reading capability of AI is overstated, which can limit its usefulness for correcting reading order and catching inconsistencies.
      • The costs of services can be high and are in the control of private companies that can cancel or increase the prices at will.
      • Privacy is a major risk, especially when dealing with health information.

18. ARIA roles are the solution to everything.

  • Contributor(s): D Krahmer; Wen Nie Ng
    • ARIA is a powerful tool, but it is not a universal solution and should be used with caution. It is a standard intended to work across assistive technologies. However, ARIA can behave differently across screen readers and may not always function as expected. When used incorrectly, it can create more accessibility issues than it solves. In general, native HTML semantics should be prioritized, and ARIA should be used only when necessary as a second option.
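
To make the “native HTML first” advice concrete, here is an illustrative sketch (not from the original post; the `save()` handler is a hypothetical placeholder). The native element gets its role, keyboard focus, and Enter/Space activation for free, while the ARIA retrofit must rebuild all of that by hand:

```html
<!-- Preferred: native semantics carry role, focus, and activation. -->
<button type="button" onclick="save()">Save</button>

<!-- ARIA retrofit: role="button" makes screen readers announce the
     element correctly, but tabindex and keyboard handling must be
     re-implemented manually, which is where retrofits often break. -->
<div role="button" tabindex="0"
     onclick="save()"
     onkeydown="if (event.key === 'Enter' || event.key === ' ') save()">
  Save
</div>
```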

19. Accessibility Overlays make websites accessible.

20. STEM – I will not be able to use tables for data anymore.

  • Contributor(s): Jasmine Clark
    • Tables aren’t inherently inaccessible. If you are using HTML and you’d like to learn how to create accessible tables, consider this tutorial from the Web Accessibility Initiative. If you aren’t working with HTML, consider looking up accessibility checkers and best practices in the applications you’re using. Research the software you’re using and the platform(s) where your data is published. 
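
As an illustrative sketch in the spirit of the Web Accessibility Initiative tutorial mentioned above (the branch names and figures are invented), a simple HTML data table becomes screen-reader friendly with a `<caption>` and `scope` attributes, which let assistive technology announce the correct header alongside each cell:

```html
<table>
  <caption>Example: library visits by branch, FY2025</caption>
  <thead>
    <tr>
      <th scope="col">Branch</th>
      <th scope="col">In-person visits</th>
      <th scope="col">Website sessions</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th scope="row">Central</th>
      <td>120,400</td>
      <td>310,200</td>
    </tr>
    <tr>
      <th scope="row">Eastside</th>
      <td>48,900</td>
      <td>97,500</td>
    </tr>
  </tbody>
</table>
```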

21. If I add alt text to my quiz images, it will give away the answers.

  • Contributor(s): Jasmine Clark
    • Alt Text should convey the intent behind your image. For example, if you include a photo of the Mona Lisa for your student to identify, you would describe the painting (e.g. “a painting of a woman with dark hair,” etc.), not name it. This would be helpful to students who are visually impaired and need help with the details. In contrast, if you include the Mona Lisa as a hint to another question, with the understanding that students will know the painting, you would name the painting with appropriate context (e.g. “the Mona Lisa above a fireplace for scale”).
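
In HTML terms, the two situations above might look like this (file names are hypothetical):

```html
<!-- Identification question: describe the image, don't name it. -->
<img src="quiz-q3.jpg"
     alt="A Renaissance portrait of a dark-haired woman with a faint smile">

<!-- Hint image: naming is the point, so name it with context. -->
<img src="quiz-q4-hint.jpg"
     alt="The Mona Lisa hanging above a fireplace, shown for scale">
```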

22. If it’s publisher/ 3rd party materials, I am not responsible for making it accessible.

  • Contributor(s): Jasmine Clark
    • The precedent set by Payan v. Los Angeles Community College District (2021) would indicate otherwise. Two blind students sued the Los Angeles Community College District (LACCD), claiming:
      • “…Plaintiffs identified accessibility barriers in LACC’s library research databases, many of which were not compatible with screen reading software. Despite the AMPP and her individual accommodations, Mason was unable to complete a research paper for a psychology course because the professor required use of an inaccessible research database for the assignment. Although some of the library’s online databases were accessible to blind students, the library did not conduct regular accessibility checks and did not test programs for accessibility before the library acquired them, as the AMPP required. Instead, accessibility was only tested when a blind student reported an accessibility problem”
    • The result: 
      • “The district court also found that LACCD discriminated against blind students as a matter of law based on the accessibility barriers present in the LACC websites and library database, but it declined to impose liability at that time because Plaintiffs had not yet met their burden to show reasonable modifications existed to remedy this discrimination.”
      • “Following the bench and jury trials, the district court entered a permanent injunction and final judgment in favor of Plaintiffs. The permanent injunction requires LACCD to: (1) come into compliance with its AMPP; (2) evaluate its library databases for accessibility and establish means of alternate access to inaccessible databases for blind students; (3) designate a Dean of Educational Technology; (4) make the LACC website and embedded programs accessible to blind students; and (5) assess educational materials for accessibility before acquisition and to establish means of providing accessible alternative materials to blind students in a timely manner.”
    • Website and content required for coursework must be accessible, even if it’s 3rd party. If a particular item is not accessible, accessible alternatives must be provided by the university.  

23. People with disabilities do not go into [insert field here] so we don’t have to make content accessible.

  • Contributor(s): Jasmine Clark
    • This is a cart-before-the-horse issue. If a field is inaccessible, disabled people won’t go into it. Yes, there are fields that are inherently incompatible with certain disabilities. However, it is important that those specific examples are not used as an excuse to ignore much broader, fixable issues. For example, a retired surgeon whose eyesight has diminished with age may wish to look up materials they published years ago to share with a mentee or loved ones.

24. No disabled people use this library.

  • Contributor(s): D Krahmer
    • If your library website is inaccessible, then people who require accessibility will not use it. Roughly one in four U.S. adults lives with a disability, so an inaccessible website risks turning away about 25% of your potential patrons.

25. The date Title II goes into effect depends on the size of your institution, so as of 2026, many schools have another year to go.

  • Contributor(s):  Jasmine Clark; Jon B
    • The deadline for the 2026 changes depends on the size of your state or local government, NOT your institution. A state university will have to adhere to the deadline set for its state. A public library in a small town in that same state will have to adhere to the deadline set for its town. 
      • Populations over 50,000: April 24, 2026 (2 years from the rule’s publication, April 24, 2024)
      • Populations under 50,000: April 24, 2027 (3 years from publication)
      • Fact Sheet: New Rule on the Accessibility of Web Content and Mobile Apps Provided by State and Local Governments; see: “How do you know the compliance date for other parts of government, like your city, state, or town police department or library?” under  How Long State and Local Governments Have to Comply with the Rule

26. Students are responsible to go through the accommodations process before the school or professors are responsible to make course content accessible.

  • Contributor(s): Jasmine Clark

27. My content is accessible, so the website is accessible, or vice versa.

  • Contributor(s): Wen Nie Ng
    • You can have perfectly accessible content within a poorly structured website, meaning a screen reader user may not be able to navigate to that content at all. Likewise, a well-structured site does not ensure accessibility if the content itself lacks alt text, includes inaccessible PDFs, or has uncaptioned videos.
    • The bottom line: both structure and content must be accessible. One without the other creates real barriers for users with disabilities and may lead to compliance gaps under Title II and WCAG standards.
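
As a minimal, hypothetical sketch of those two layers working together (all names and file paths are invented): landmarks and heading order make the structure navigable, while alt text and caption tracks make the content itself accessible. Neither half is sufficient on its own:

```html
<body>
  <!-- Structure: landmarks let screen reader users jump directly
       to navigation or main content. -->
  <header><h1>Anytown Public Library</h1></header>
  <nav aria-label="Main">
    <a href="/hours">Hours</a>
  </nav>
  <main>
    <h2>Virtual Tour</h2>
    <!-- Content: alt text and captions make the media usable. -->
    <img src="central-branch.jpg"
         alt="The accessible main entrance of the Central branch">
    <video controls>
      <source src="tour.mp4" type="video/mp4">
      <track kind="captions" src="tour.en.vtt" srclang="en" label="English">
    </video>
  </main>
</body>
```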

The post ADA Title II Urban Legends: Sorting Fact from Fiction About the 2024 Updates appeared first on DLF.

DLF Digest: April 2026 / Digital Library Federation

A monthly round-up of news, upcoming working group meetings and events, and CLIR program updates from the Digital Library Federation. See all past Digests here

Hello DLF Community!

As 2026 unfolds, conversations about AI, authenticity, and trust continue to shape our work. Join us on Wednesday, April 15, 2026 (1:00–2:00 PM ET) for Content Authenticity and Provenance in the Age of Artificial Intelligence: A Call-to-Action for the LAMs Community.

Joshua Sternfeld and Kate Murray will discuss their widely circulated report and how generative AI is reshaping questions of provenance across libraries, archives, and museums, highlighting practical examples and shared frameworks to guide responsible practice. The event is virtual, free, and open to all. Register here.

On the Forum front, planning for this fall’s virtual DLF Forum is well underway, and we’re energized by the ideas already taking shape. Thank you for your keynote recommendations! Also, be on the lookout for the Call for Proposals, which will open later this month.

As always, we’d love to hear what you’re working on, thinking about, or hoping to see from DLF in the months ahead. Email me at swillis@clir.org.

-Shaneé

This month’s news:

  • DLF Webinar: Content Authenticity and Provenance in the Age of Artificial Intelligence will be hosted on April 15, 2026, at 1:00 pm ET, featuring Joshua Sternfeld and Kate Murray in a timely discussion on how generative AI is reshaping trust, provenance, and practice across libraries, archives, and museums. Register here.
  • Registration Open: The 2026 Library Publishing Forum will take place June 17-18 at the University of Washington in Seattle, co-located with the Association of University Presses Annual Meeting and featuring a new Forum Friends option for those unable to attend in person. Visit the registration page for more information.
  • Conference: The International Image Interoperability Framework (IIIF) Annual Conference and Showcase will take place June 1–4, 2026, in the Netherlands, featuring a free introductory showcase in Amsterdam and a multi-day conference across Leiden. Learn more here.

This month’s open DLF group meetings:

For the most up-to-date schedule of DLF group meetings and events (plus conferences and more), bookmark the DLF Community Calendar. Meeting dates are subject to change. Can’t find the meeting call-in information? Email us at info@diglib.org. Reminder: Team DLF working days are Monday through Thursday.

  • DLF Born-Digital Access Working Group (BDAWG): Tuesday, 4/7, 2pm ET / 11am PT.
  • DLF Digital Accessibility Working Group (DAWG): Tuesday, 4/7, 2pm ET / 11am PT.
  • AIG Metadata Assessment Group: Friday, 4/10, 2pm ET/ 11am PT.
  • DLF AIG Cultural Assessment Working Group: Monday, 4/13, 1pm ET / 10am PT.
  • AIG User Experience Working Group: Friday, 4/17, 11am ET / 8am PT.
  • DLF Open Source Capacity Resources Group: Wednesday, 4/22, 1pm ET / 10am PT.
  • DAWG Policy & Workflows: Friday, 4/24, 1pm ET / 10am PT.
  • DAWG IT & Development: Monday, 4/27, 1pm ET / 10am PT.
  • DLF Digitization Interest Group: Monday, 4/27, 2pm ET / 11am PT.
  • Committee for Equity & Inclusion: Monday, 4/27, 3pm ET / 12pm PT.
  • DLF Climate Justice Working Group: Tuesday, 4/28, 3pm ET / 12 pm PT.

DLF groups are open to ALL, regardless of whether or not you’re affiliated with a DLF member organization. Learn more about our working groups on our website. Interested in scheduling an upcoming working group call or reviving a past group? Check out the DLF Organizer’s Toolkit. As always, feel free to get in touch at info@diglib.org

Get Involved / Connect with Us

Below are some ways to stay connected with us and the digital library community: 

The post DLF Digest: April 2026 appeared first on DLF.

Making Sense of GenAI Amidst AI Hype and AI Personalization / In the Library, With the Lead Pipe

By Sarah Morris

In Brief: Artificial intelligence (AI) literacy frameworks emphasize the importance of understanding Generative AI (GenAI) technologies. But our collective and individual understanding of GenAI is heavily shaped and mediated by the hype narratives that surround it, where GenAI is depicted as powerful, magical, and inevitable. Amidst such compelling narratives, we can face challenges in navigating narrative extremes and exaggerations and in making informed decisions about using GenAI tools. Alongside AI hype, we are also experiencing AI personalization features which encourage trust and positive feelings towards GenAI tools. Taken together, AI hype and AI personalization can challenge and even hinder our ability to engage critically and thoughtfully with GenAI. In this article, I will explore how our understanding of GenAI is influenced by AI hype and AI personalization and consider how hype narratives and personalization features fuel one another and encourage trust in and awe towards GenAI. By centering AI hype and AI personalization as key components to understanding and exploring GenAI, and by incorporating critical media and information literacy skills into AI literacy, I feel that we can develop an AI literacy that better contextualizes GenAI and encourages reflective and critical approaches that can help learners make sense of their emotionally complex experiences with and reactions to GenAI.

Introduction 

Generative Artificial Intelligence (or GenAI) is often depicted in terms of superlatives. Compared to humans, it is described as smarter, faster, more efficient, more accurate, more personable, and even more dangerous. GenAI is a specific form of artificial intelligence, but many of the conversations and commentary surrounding AI in general are focused on or referring to GenAI. Essentially, GenAI can produce text, images, video, audio, or code in response to a user prompt (Stryker & Scapicchio, 2026). GenAI tools include chatbots like ChatGPT or Gemini, or image or video generation tools like Sora. Much of what we experience as AI in our daily lives is some form of GenAI. Our collective understanding of GenAI, whether as a force for good or as a force for some apocalyptic-level disaster, is heavily shaped and mediated through narrative. In particular, the narratives that hype artificial intelligence, often extolling and anthropomorphizing its various virtues, greatly influence not only our understanding of these tools but also our ability to critically investigate, discuss, and respond to the entire artificial intelligence landscape. The hype narratives surrounding GenAI frequently minimize its harms, mischaracterize its capabilities, and distract from its failures (Baer, 2025b; Bender & Hanna, 2025). AI hype fundamentally shapes how we conceptualize and discuss GenAI via narratives that can be misleading or manipulative. But while we are experiencing and navigating GenAI through the lens of AI hype, we are also experiencing GenAI through the accompanying lens of AI personalization.

The personalized nature of various GenAI tools closely mirrors the tendencies we see with things like social media algorithms, where people are shown content that reinforces their views, all in an effort to keep people glued to a given platform (Bourne, 2024). With GenAI this personalization is even more insidious, increasingly leading users to rely on chatbots as confidants, friends, therapists, and even romantic partners (Garofalo & Vecchione, 2025). This appears to be by design, to an extent, as evidenced by a batch of recent commercials and offline advertising efforts like billboards depicting AI chatbots as friendly companions that can help you with mundane activities like preparing a meal, exercising, or deciding on home décor (Swant, 2025).  Whether these narratives are an effort to assuage fears about the dangerous capabilities of AI tools, highlight AI tools as powerful, albeit in a nonthreatening way, or attract new users with promises of both usefulness and fun, AI hype narratives seem increasingly intertwined with AI personalization features. I believe that the hype narratives surrounding artificial intelligence and the personalized nature of GenAI actually fuel one another, where hype narratives lead people to see these tools as powerful and magical and personalization features lead people to place their trust in these tools, and thus become more susceptible to believing hype narratives about artificial intelligence.

As librarians and educators seek to develop AI literacy frameworks to make sense of this emerging and evolving technological landscape, I argue that we need to give further attention to the ways in which we talk about, experience, emotionally respond to, and engage with GenAI tools. What effects do AI hype and AI personalization have on our ability to think critically and even clearly about these tools when we are being besieged by everything from relentless positivity, proclamations of inevitability, or visions of doom, all while being swayed by sycophantic chatbots that reaffirm everything we type? How can we make informed decisions about using GenAI technologies in media environments where critical and even accurate information about AI can be hard to come by? We are in a situation where narratives about AI technologies and our varied experiences using these technologies both potentially hinder our ability to make informed decisions and to think critically about generative AI and AI technologies (Nguyen & Mateescu, 2024; Bender & Hanna, 2025). If we as librarians and educators strive to develop an AI literacy that is rooted in critical thinking, nuance, and ethics, as outlined in places like the AI Competencies for Academic Library Workers (ACRL, 2025), then we need to contend with AI hype and AI personalization and equip learners to approach GenAI with both a critical and a reflective lens.

In this article, I hope to examine the interconnected trends of AI hype and AI personalization. I would like to consider how we can utilize the insights that arise from exploring AI hype and AI personalization as key aspects of our understanding of GenAI to develop a more critical and human-centered approach to AI literacy. And I am eager to consider how this framing can equip us to further develop a more critical AI literacy that better contextually situates GenAI, in line with approaches from critical information and media literacy, which examine the social and political construction and dimensions of information (Tewell, 2015; Kellner & Share, 2005). First, I will explore the dynamics between emerging narratives about GenAI and emerging frameworks for conceptualizing AI literacy. While I believe that the narratives surrounding GenAI influence AI literacy, I also argue that AI literacy frameworks form their own sorts of narratives about AI tools and technologies and influence how many, particularly educators and librarians, understand and respond to GenAI. Second, I will look at trends within AI hype, including themes of power, magic, and inevitability, that shape these narratives and our ensuing understanding of and reaction to AI technologies. I will then turn to examining trends of AI personalization within the framework of AI hype narratives and consider how the trust that can be inspired by AI personalization can reinforce AI hype. To close, I will look at ways that we can equip learners to better unpack, interrogate, and understand GenAI through the lenses of AI hype and AI personalization.

A focus on AI hype and AI personalization can help us center the often complex, emotional, and confusing experiences people are having with GenAI and help us as librarians explore how we can best equip people to think critically amidst AI and information environments that often do not lend themselves to critical thought and reflective practices. Ultimately, I believe that librarians, educators, and learners can benefit from the introduction of two lenses into emerging AI literacy frameworks. First is a focus on contextual analysis, where we can take a critical approach to analyzing the narratives surrounding technologies like AI and examine how that mediates, shapes, and influences our experiences with said technologies. Second is a focus on reflective practice that empowers learners to better recognize and critically engage with narratives and technological tools that might be personalized to an alarming degree. By grounding AI literacy with this sort of critical analysis, contextualization, and reflective practice, I feel that we can strengthen AI literacy, situate AI literacy within broader trends around critical media and information literacy, and equip learners to better engage with AI technologies in our complex and rapidly changing information environment.

The Evolving State of AI Literacy and AI Narratives

The hype narratives surrounding GenAI tend to exaggerate the benefits, capabilities, power, and successes of these technologies while minimizing their issues and flaws. And we can see these narratives emerging everywhere from commercials to public comments from AI companies to news articles to chatter on social media. But while the hype narratives promoting GenAI are increasingly ubiquitous, there are alternative narratives emerging that question and criticize the relentless hype surrounding GenAI. To make sense of this cycle of hype and disillusionment, we have a graphically represented cycle for exploring technological hype. The Gartner Hype Cycle, created in 1995 by Gartner analyst Jackie Fenn, provides a compelling framework for exploring AI hype narratives and places AI hype into context with previous technological hype cycles (Gartner). According to the Gartner Hype Cycle, new technologies tend to follow a certain track in terms of both narrative and public reception and perception. A given technology is praised, extolled, and exalted, to the point of peak absurdity, before careening downhill into the evocatively named “trough of disillusionment” (Gartner). Following this crash in expectations and sentiment, people accept the new technology as useful for some things and not for others, settling into more realistic expectations. Recent research has speculated that this hype cycle model may not hold true for different kinds of technologies and has also posited that the nature of our media landscape and our modern technology sector, with its emphasis on speed and rapid new developments, are leading to repeated, less linear, and more pervasive hype cycles (Dedehayir & Steinert, 2016; Van Lente et al., 2013; Goncalves & Bareis, 2025).

The hype we are seeing with GenAI seems to be reaching new heights thanks in part to the nature of our current media and information ecosystem. Social media thrives on virality, with hype narratives poised to find success amongst platforms and algorithms that favor attention-grabbing content and spectacle (Bareis, 2024). And hype narratives are nothing if not attention-grabbing. In some respects, the hype narratives surrounding GenAI have found an ideal home amidst our current online information environment (Bourne, 2024). Recognizing AI hype as part of a longer history of technological hype, market frenzy, and raised expectations can help us better critically analyze the current wave of hype narratives surrounding GenAI and recognize the ways in which AI technologies operate as part of a technological industry where hype cycles serve as expressions of power, as ways to amass capital, and as a central aspect of technological development and our media ecosystem (Hao, 2025; Bender & Hanna, 2025).

AI literacy has emerged in conjunction with the seemingly abrupt and all-encompassing arrival of GenAI itself, and AI literacy has continued to evolve alongside our shifting understanding of GenAI. We can deepen our insights into the shifts and trends within AI literacy by situating AI literacy within the broader milieu of AI hype narratives, as well as within the longer history of technological hype (Van Lente et al., 2013; Bender & Hanna, 2025). While AI literacy tends to call for critical thinking, ethical understanding, and thoughtful approaches, AI hype tends to highlight things like ease, speed, simplicity, convenience, and the lack of need for deep thought, complexity, or worry (Bareis, 2024). AI hype narratives tend to heavily anthropomorphize AI technologies as well, to the extent that it can be difficult to discuss GenAI without utilizing terms that ascribe these tools more ability, and more humanity, than is warranted (Barrow, 2024; Placani, 2024). These hype narratives also presume an inevitability to GenAI, as if the emergence and ensuing dominance of GenAI in our society is an inescapable fact (Baer, 2025b). While the humanizing language surrounding GenAI can influence or even limit the vocabulary we use to discuss GenAI, the inevitability narratives surrounding GenAI can potentially dissuade critical discussion altogether. After all, why discuss or debate something that is inevitable? The nature of AI hype narratives poses challenges for more critical approaches, as these narratives often suggest that AI technologies are beyond questioning, beyond human foibles, and beyond reproach. The hype narratives that cast AI as somehow superior to humans can dissuade criticism and questions directed towards GenAI and those who have created it (Baer, 2025a; Baer, 2025b; Campolo & Crawford, 2020). Overall, the inviolability found within AI hype narratives can shape, and even hinder, the ways in which we question and criticize GenAI. 
Given the persuasive nature of AI hype narratives, and the potential harms inherent within AI technologies, there is a real and growing need for more critical and nuanced approaches to GenAI in the face of relentless hype narratives that seem to dissuade thinking deeply about AI in the first place.   

Many AI literacy frameworks, including work from Leo Lo and places like UNESCO and the Digital Education Council, increasingly highlight ethics and critical thinking as tenets of what it means to be AI literate, alongside using and understanding various AI tools and technologies (Lo, 2025; Miao & Shiohira, 2024; Digital Education Council, 2025). However, these AI literacy frameworks exist within and amidst pervasive and compelling AI hype narratives and can echo the underlying assumption that GenAI is powerful, inevitable, and potentially transformative (Baer, 2025b). How can we encourage understanding of GenAI without dissecting the often-misleading hype narratives surrounding it? And how can we gain insights from using highly personalized GenAI tools without reflecting on that experience? It seems to me that AI literacy frameworks can benefit from incorporating more critical approaches that can equip learners to more thoughtfully engage with GenAI and avoid inadvertently reinforcing AI hype narratives. Floridi, in work on the AI bubble that AI hype is creating, notes that we need to “[m]aintain a critical and balanced perspective about AI developments, no matter what people with vested interests may say, recognising the technology’s potential and limitations” (Floridi, 2024, p. 12). To me, this is a call for embracing critical information and media literacy approaches that investigate and question narratives of power as a way to navigate AI hype.

AI hype is introducing a degree of cognitive dissonance as well, with a contrast between the extreme expectations set by AI hype and the reality of AI tools not performing as promised (Baer, 2025a; Floridi, 2024). And this cognitive dissonance seems to be giving rise to increased criticism of GenAI. There are emerging frameworks and schools of thought that challenge the centrality of using GenAI, such as the AI refusal movement, which argues that using AI is ethically unacceptable in many instances (Fox, 2024). Resources from places like the Rutgers Critical AI initiative also illustrate ways to utilize critical information and media literacy approaches for exploring GenAI. And as AI hype seems to grow and reach new and more bombastic heights, critiques of the AI enterprise rooted in privacy, labor concerns, and eco-critical stances, among others, have grown in response (Nguyen & Mateescu, 2024). There seems to be an interplay between hype narratives and the eventual counter-narratives that emerge seeking to puncture the hype, whether through concern, disagreement, or just sheer exasperation with whatever outlandish claims are being raised by various trending hype narratives.

I think we can place AI hype itself and conceptions of AI literacy within a longer history of over-hyped technology and within the context of our current social media era. If we situate AI hype and AI personalization, as well as AI literacy, within this space, we can draw upon critical approaches and reflective practices that have emerged in media and information literacy spaces and put these lessons into conversation with GenAI (Soken & Nygreen, 2024). We are seeing calls for AI literacy to emphasize ethics and critical thinking (Lo, 2025). But in order to do that, I think we need to better contend with the context of AI, and how AI is being discussed, perceived, and received (Sloane et al., 2024; Bourne, 2024; Baer, 2025a). Thinking critically and ethically about AI involves understanding not just how this technology works but how AI is being packaged and presented, how people are experiencing and understanding AI, the culture into which AI is being unleashed, and how AI literacy itself is situated within this environment. I think we can bring these threads together with an eye to developing a more critical AI literacy that considers the influence of AI hype and personalization on our understanding of GenAI.

Understanding AI Hype

AI hype not only influences our understanding of AI, but it also sets up certain parameters for our conversations about AI. The crux of AI hype narratives seems to be a narrative of power, with a focus on the amazing and terrible things that GenAI can do, as well as an underlying theme of who is in power in this AI landscape (Hao, 2025; Bender & Hanna, 2025). Hype is about influence, about generating excitement, and about inspiring strong emotions (Sloane et al., 2024; Bourne, 2024). And, significantly, hype is not accidental but rather crafted to attract positive attention and funding (Goncalves & Bareis, 2025). Within these narratives, there seems to be an idea that GenAI is powerful, that using GenAI can make you better and more powerful (as if the sheen of GenAI can rub off on you), and that creating and developing GenAI tools imbues you with a degree of mysticism. In fact, some have started to note the uncanny similarities between AI hype and a religious movement, complete with commandments, origin myths, prophecies, a belief in the apocalypse, ritualistic practice, and acceptable forms of behavior and language (Epstein, 2024). 

Before delving further into what AI hype tends to say, it is worth noting who is crafting and sharing these narratives. Creators of AI technologies, including the heads of various technology companies and the marketing departments of those companies, contribute a great deal of AI hype into our media environment (Bender & Hanna, 2025). And many of the companies who play major roles in the AI technology landscape already exercise undue influence in our media landscape, controlling our information discovery platforms (like Google) and our social media sites (like Meta). Many of our major technology companies, whether they are driving the development of GenAI technologies or are hopping on the bandwagon of GenAI developments, are contributing to and promoting AI hype narratives and are pushing GenAI features on their platforms, further contributing to the feeling that GenAI is inescapable and inevitable. From exclusive interviews with high-profile outlets, to commercials, to conveniently timed “leaks” about new features, to press releases, there is a never-ending stream of hype emerging from these companies (Duarte, 2024; Hao, 2025). If we apply critical analysis to these narratives, some motives emerge. Money, sustained power, and influence are factors that are driving AI hype narratives of more corporate origin (Hao, 2025; Bender & Hanna, 2025). After all, a fantastic, useful, and powerful tool will attract users, investors, and more positive attention. AI hype narratives also emerge from media outlets, governments, other industries, and from users of AI technologies, all of whom echo and reinforce the hype produced by various corporate interests (Hao, 2025; Bareis & Katzenbach, 2022).

Interestingly, AI doom narratives arguably operate as another side of AI hype narratives (Vinsel, 2021). After all, GenAI must be powerful and incredible if it can potentially trigger the apocalypse. Here the doom narratives can feed into the overall hype narratives surrounding GenAI, potentially distracting us from more complex and nuanced challenges and issues associated with AI technologies (Hanna & Bender, 2023). As Sloane et al. (2024) note, “Although situated as polar opposites, stories of excitement and of terror are both integral to the practice of AI hyping because they grossly simplify AI narratives and pit them against the realities of AI design and use” (p. 670). This polarized interplay of terror and excitement, doom and joy, dystopia and utopia, forms the crux of AI hype narratives and creates challenges for discussing GenAI with nuance and critical discernment. Within these outlandish claims is a degree of confusion and increased unease and even distaste. In an article in The Scholarly Kitchen, Jones (2025) poses the question many of us have been wondering about and asking: what exactly do AI tools actually do? Some recent studies have illustrated that people tend to like GenAI less the more they learn about it (Chen et al., 2024; Tully et al., 2025). While more research is needed in this area, recent surveys do indicate that there might be an inverse relationship between learning about AI and liking AI, which has implications for AI literacy education. If your motive is to get people using AI, then it stands to reason that the stories you tell about AI will gloss over its issues. AI hype narratives seem to discourage criticism and critical thought, while encouraging unquestioned use and enthusiasm towards GenAI (Duarte, 2024). In contrast, if your motive is to educate people about AI, then it seems you need to cut through the persuasive and distracting hype surrounding AI (Ndungu, 2024; Baer, 2025a; Soken & Nygreen, 2024).

What does it mean to critically engage with something in the midst of being inundated with outlandish propaganda? How can librarians and other educators equip learners to critically engage with AI technologies, to question them, and to potentially challenge claims made about and by GenAI technologies in information environments saturated with AI hype, where GenAI is positioned as authoritative? Within a hype cycle, embracing a critical approach involves not just information and understanding but the confidence and knowledge to form and share critical views and arguments (Baer, 2025a; Baer, 2025b; Soken & Nygreen, 2024). There are a few motifs and themes within the AI hype narratives that I feel are worth unpacking, and that have implications for how we can develop a critical AI literacy imbued with a focus on narrative, context, and reflection. To my mind, there are three areas that are key for understanding the current nature of AI hype and the ways that AI hype is shaping our understanding of and relationship with GenAI.

The first area is power. Power can of course be enticing, but it can also be prohibitive in that the perception of power can squash dissent. As Duarte (2024) argues, our ability to think critically about GenAI can be “dramatically impeded by exposure to inaccurate information, especially when it is delivered confidently and compellingly by AI executives and other influential figures” (para. 4). Whether the narratives about GenAI are inaccurate, distracting, misleading, exaggerated, or some combination of those things, these narratives seem as if they are designed to influence more than inform. AI hype narratives promote the power of AI tools, but they also serve as expressions of power and influence from the individuals and groups, such as technology companies, crafting and sharing them (Hao, 2025). Power is central to the AI hype narratives we are currently seeing and to the emerging counternarrative, where critics of GenAI and the AI enterprise often dissect how GenAI tools actually do not work as advertised and are not as powerful as proclaimed (Bender & Hanna, 2025; Nguyen & Mateescu, 2024). And power leads us to a few other themes that are, in some respects, unique to AI hype narratives when compared to the hype narratives we have seen about other technologies.

Next is magical thinking. There is a degree of magic surrounding narratives about AI and hype narratives in particular. According to these narratives, AI can do an endless array of wondrous and wonderful things and can make astonishing leaps in performance (Mitchell, 2025). The sense of magic imbuing AI can lead people to believe in the capabilities and power of AI unquestioningly. And a belief in the magic of AI has been linked to lower levels of AI literacy, with a study from Tully et al. (2025) noting that individuals with lower levels of AI literacy are more likely to perceive AI as magical and more likely to be receptive towards using AI tools. Magic is a key aspect of AI hype narratives, and an aspect of AI personalization as well, with GenAI appearing as some sort of all-powerful and all-knowing companion, like a sort of technological fairy godmother. But magic also crops up in the nature of AI hype narratives themselves, not just in how AI tools allegedly perform. As David Morris (2024) notes in his work on AI and magic, “Magicians hack our attentional, perceptual, and cognitive tendencies to make us perceive and believe what is not there” (p. 3047). Here AI technologies function as magical tools while the creators of AI technologies function as magicians, using dazzling techniques to divert our attention. This sort of technique lies at the core of AI hype narratives, which arguably distract from real issues and complexities surrounding the development and deployment of GenAI (Hanna & Bender, 2023). Recognizing the magic running through narratives surrounding AI, and how it shapes our perception of these tools as immensely powerful, is a key aspect of approaching AI with a critical lens.

The final area worth considering is inevitability. Within AI hype narratives, AI technologies are presented as somehow inevitable and unquestionable (Baer, 2025b; Goncalves & Bareis, 2025). As noted, these narratives can take on a sort of religious fervor, as if AI technologies are somehow preordained (Epstein, 2024). A prevailing sentiment seems to be that AI is here, it is not going anywhere, and everyone must adapt themselves to this new AI-driven reality. This sort of narrative can dissuade questioning, both through more overt prohibitions and through more subtle implications about futility (if AI is inevitable, then what use is there in complaining or questioning?) and progress (if you question progress, does that mean you are somehow backwards?) (Baer, 2025a). The hype narratives that emphasize the inevitability of GenAI can also hinder critical engagement with AI technologies and even cast the act of asking questions as being unduly negative or resisting inevitable technological progress.

Taken together, these trends within AI hype narratives can make critical thinking and critical engagement with AI incredibly challenging. Even critiquing GenAI in the midst of an environment dominated by AI hype runs the risk of giving too much credence to AI’s alleged power (Sloane et al., 2024). To critically engage with AI technologies, we need to cut through narratives of power, magic, and inevitability, which can involve taking the time to untangle and rebut various hype narratives and claims before moving on to things like actual critiques, policy proposals, or more nuanced arguments (Sloane et al., 2024). While AI hype can be a distraction, understanding and analyzing AI hype is a vital component of a more critical AI literacy. By borrowing from critical information and media literacy, we can weave skills in analysis and evaluation into AI literacy and better equip learners to ask questions, consider the context of GenAI, dissect narratives of power (with hype narratives at their core), and more thoughtfully consider how we are experiencing and understanding GenAI amidst the outlandish claims of AI hype.

Unpacking AI Personalization

Amidst the frenetic hype surrounding AI, which can beggar belief, is the emotionally appealing, persuasive, and at times manipulative nature of AI personalization. AI personalization can take the form of agreeableness, positivity, and even sycophancy (Hermann, 2022; Kaffee & Pistilli, 2025; Selvi, 2025). AI chatbots are endlessly helpful, rarely disagree or argue, and (if the hype is to be believed) always do what you ask. The personalized nature of AI tools, and the experience of using these seemingly friendly, agreeable, and helpful tools, can create feelings and emotions among users that I feel are important to recognize and consider as we strive to develop more human-centered approaches to AI literacy. A study from Data and Society notes that while “our participants know the chatbot is neither ‘real’ nor ‘intelligent,’ they also know that the feelings it elicits in them are genuine,” describing how users find chatbots safe, easy to talk to, and comforting (Garofalo & Vecchione, 2025). Even if people are aware of the nature of AI personalization, and the artifice of these tools, feelings of trust and fondness can still emerge. However, many users are not aware of the machinations behind AI tools and how the personalized features are in many respects an effort to keep users glued to a given chatbot platform (Lupetti & Murray-Rust, 2024). We can face challenges in critically engaging with AI due to AI hype, where narratives present AI as powerful, magical, inevitable, and something that shouldn’t be questioned. But the personalized nature of AI can add further challenges to our ability to engage critically with GenAI. While AI hype narratives might strain credulity, the personalized nature of AI, and the emotional aspects of that personalization, can nevertheless make it difficult to question and challenge an AI that feels emotionally resonant and appealing.

GenAI chatbots have a tendency towards positivity and agreeableness, which can foster trust and reliance. As Kaffee and Pistilli (2025) note, GenAI “systems already simulate care, empathy, and attentiveness” (para. 9). Meanwhile, Gary Marcus (2025) argues that GenAI chatbots fool people into thinking they can behave like humans, when in reality these tools are just mimicking humans. Constantly hearing that everything you say and think is fantastic can be enticing, if not addictive. In fact, when OpenAI released a ChatGPT update in the summer of 2025 that toned down the sycophancy, users complained (Tangermann, 2025). This personalization also seems to exacerbate trends we have already seen in social media spaces with things like filter bubbles and echo chambers, where algorithms curate customized environments in which you only see and hear what the algorithm thinks you want to see and hear. As AI gets further embedded into many of our existing online tools and spaces, from search engines to social media sites, what effect will this have on people’s ability to identify and critique GenAI? If someone is hearing what they want to hear, or feels trust towards the powerful, magical, and personalized tool they are using, will they be inclined to analyze or question that tool?

We can benefit from unpacking AI personalization within the context of AI hype narratives that emphasize power, magic, inevitability, and the superior nature of AI when compared to humans. Notably, the personalized nature and experience of GenAI reinforces many of the themes found within AI hype narratives. AI chatbots seem poised to act as the ultimate personal assistants, able to handle any task or question without complaint and without tiring. The speed with which AI chatbots respond, and the confidence with which they do so, belie the chronic issue of so-called AI hallucinations that have plagued AI chatbots since their launch (Hicks et al., 2024). AI chatbots give the impression of being powerful and wise, and the hype narratives surrounding AI reinforce the behavior of the chatbots themselves. As a result, we are seeing emerging issues with cognitive offloading with GenAI technologies, where people trust these tools and become overly reliant on their AI personal assistants, potentially degrading their own skills and cognitive abilities (Kulal, 2025; Skibba, 2025). Overall, this reliance can lead to trust in, and an affinity toward, these powerful GenAI tools.

Magical thinking and the magic narratives surrounding GenAI also intersect with the personalized experience of using AI tools. As we have seen, AI hype narratives frequently imbue AI with a sense of mysticism and magic. And something that is always at the core of magical narratives is trust and belief (Morris, 2024). Endlessly cheery and agreeable AI tools ask for trust, even if the ideas they share are half-baked, the sources are made up, or the writing is mediocre. The underlying promise seems to be that if you don’t look too closely or delve too deeply, if you trust the magic and the speed and the power, if you accept the results that you are (quickly) given, if you place your trust and your cognition into AI’s hands, then you will have nothing to worry about. The overall personalized user experience and design of GenAI can contribute to a sense of “enchantment” with using AI tools (Lupetti & Murray-Rust, 2024). But this enchantment goes beyond using AI tools and shapes the nature, and potential goals, of AI hype narratives as well. As Campolo and Crawford (2020) note, the experience of enchantment shields creators of AI tools from scrutiny and accountability. The user experience of GenAI often discourages reflection and deep thought, while the magic trick of AI hype narratives and AI user experience encourages trust and belief. The positive feelings generated (pun intended) towards AI by the personalization of AI technologies can reinforce AI hype narratives.

Just as we can experience challenges in critically engaging with GenAI amidst hype narratives that emphasize the amazing and powerful nature of AI technologies, we can experience difficulties with thinking critically and clearly about AI in the midst of the emotional experience of AI personalization. Additionally, the experience of using AI technologies can be quite emotionally complex, while our individual and collective responses to AI development are also rooted in strong emotions like fear, anxiety, enthusiasm, curiosity, and even frustration and anger (Chen et al., 2024; Bourne, 2024). I think it is important to recognize that we as librarians and educators might have strong feelings towards AI ourselves, just as our learners might also have complicated emotions about AI (Baer, 2025a; Fox, 2024; Monnier et al., 2025). As we continue to develop AI literacy in response to AI trends, I think we have to acknowledge and even center the emotional aspects of our experiences with, and reactions to, AI.

One potential way forward with this is to borrow from critical information and media literacies, which emphasize the complex experiences people have with information and the ways that media shapes, and is shaped by, systems of power (Soken & Nygreen, 2024; Kellner & Share, 2005). If our understanding of GenAI is shaped by narratives of power in the guise of AI hype and by our experiences with using these tools under the influence of AI personalization, then I believe we can benefit from bringing critical approaches that address these facets of GenAI into AI literacy. AI hype might seek to present AI as unprecedented and amazing, but I feel that AI is part and parcel of broader trends in technological hype, personalization, and what Bourne refers to as “affective capitalism,” or a capitalism rooted in emotional appeals and personalization (Bourne, 2024, p. 758). And if GenAI is part of these broader trends, then I think we can situate AI literacy within existing trends and approaches found in critical information and media literacy.

In environments colored by ubiquitous AI hype narratives and the personalized effects of AI technologies, the ability to reflect is crucial. While it is important for learners to understand AI, I feel that it is also key for learners to be able to reflect upon and identify how AI is making them feel and how they are responding to AI, which is increasingly important given how persuasive AI hype and personalization can be. Incorporating reflection into AI literacy alongside skills like critical thinking will strengthen existing aspects of AI literacy like ethical reasoning and evaluation, and will highlight a skill set that can better enable people to navigate the emotional complexities of AI hype and AI personalization, however appealing and persuasive they might be. By exploring both hype narratives and the personalized output of GenAI, we can develop richer approaches to AI literacy.

Developing Critical AI Literacies

The experiences and effects of AI hype and AI personalization complicate our efforts to engage critically and thoughtfully with generative AI tools and technologies and the many challenges and issues these technologies introduce. A more critical and reflective approach to AI literacy can help us unpack these narratives of power and influence. But I think a challenge for librarians and educators is in finding ways to make that focus explicit, central, and sustained amidst all the other demands inherent within AI literacy, and a broader information literacy for that matter. In my own work as an instruction librarian, I have felt the pressure of time constraints and the enormity and complexity of the information literacy topics I am aiming to address. Personally, I feel that intentionality, and an emphasis on equipping learners to ask questions rather than settle on a single correct answer, can create space for the more critical, reflective, and contextualized approaches to AI literacy that we are exploring. Librarians and educators can bring in examples of AI hype narratives or AI personalization, pose questions, and encourage learners to share their own experiences. Taking a little time, even when time is short during an instruction session, to spark curiosity and awareness can equip learners to better take in the bigger picture and context of GenAI, beyond simply using an individual tool. Ultimately, I believe that librarians, educators, and our learners can benefit from the introduction of two lenses into emerging AI literacy frameworks.

First is a focus on context and contextual analysis, where we take a critical approach to analyzing the narrative context surrounding technologies like AI and how that mediates, shapes, and influences our experiences with said technologies. This concern with narratives of power is a framing that can be particularly beneficial for gaining a deeper and more critical understanding of AI technologies (Soken & Nygreen, 2024; Baer, 2025b). Many AI literacy frameworks, including the AI Competencies for Academic Library Workers (ACRL, 2025), include a call for developing an understanding of AI technologies, including how they work and how they are developed. But I believe that we can extend this understanding to include a focus on how AI technologies and tools are presented, received, and conceptualized by the public. The narratives hyping AI, whether through commercials, interviews, media coverage, or online social media posts, greatly shape how we conceptualize and discuss AI and can even dissuade us from criticizing or questioning AI technologies thanks to the aura of power, magic, and inevitability that AI hype narratives create around Generative AI. When teaching others about AI technologies, librarians and other educators can discuss trends in AI hype with students, encourage students to reflect on the AI hype narratives they have seen and encountered, and share examples of AI hype narratives for analysis, reflection, and discussion (Soken & Nygreen, 2024; Ndungu, 2024). I believe that equipping students to think critically about AI and to feel confident in sharing their opinions and views is an important component of developing a more critical AI literacy and a broader and richer understanding of GenAI. And this approach has implications for information and media literacy more generally, where we can encourage learners to think critically about technologies other than AI that might also be over-hyped in the media or cast as powerful or beyond reproach.

The second lens that we can introduce to AI literacy is a focus on reflective practice that empowers learners to better recognize and critically engage with narratives and technological tools, like AI, that might be highly personalized. As we have seen, AI hype and the experience of using AI tools can discourage reflection and critical analysis and encourage trust and awe. Emphasizing reflection as a key component of AI literacy mirrors approaches that are increasingly utilized in broader media and information literacies (Soken & Nygreen, 2024; Ndungu, 2024). Researchers like Riesen (2025) have argued that reflective practices can help learners better contextualize and apply information literacy skills. I believe reflection can also help learners find personal meaning, value, and context for AI literacy skills. AI literacy frameworks generally have a section that calls for evaluation of AI output. But I think we can also encourage an evaluation of our own thoughts and feelings towards AI, and a reflective approach to both using AI tools and consuming content about AI tools. What emotions are arising? Why might an AI tool foster a certain kind of user experience? What motivations underlie narratives surrounding AI? These are questions that can be part of a reflective practice where students are encouraged to pause, consider, and reflect on their own experiences with AI as a way to better critically analyze AI technologies. AI literacy emphasizes using AI tools and analyzing the output of those tools. But by taking a step backward and outward, and by posing questions about the implications of GenAI, the narratives being woven about and around GenAI, and the experiences people are having with GenAI, we can encourage learners to ask questions, sort through their thoughts and feelings, share their ideas, and begin to engage more critically with not just individual GenAI tools but the entire GenAI enterprise.

Conclusion

Putting AI hype and AI personalization into conversation can help us develop an AI literacy that not only focuses on critical thinking but on reflection, context, and the complex emotional experiences that we have with AI technologies. I think that a human-centered AI literacy can and should embrace the complicated, messy, and emotional aspects of our collective and individual experiences with GenAI and the stories we imbibe and tell ourselves about these tools. And by centering and acknowledging the emotional complexities of our experiences with, and reactions to, GenAI, we can better engage in conversations with learners and delve into issues surrounding GenAI and its development and use.

The personalized experience of using AI tools and the hype surrounding AI cannot be separated from our understanding of GenAI. Rather, AI hype and AI personalization deeply shape and influence our experience with GenAI and how we perceive, react to, and make decisions about GenAI, including when, where, and how we use these AI tools. By grounding AI literacy in this sort of critical analysis, contextualization, and reflective practice, I feel that we can strengthen both AI literacy and information literacy and equip learners to better engage in our complex and rapidly changing information environments. Librarians and other educators can work to develop an AI literacy that is concerned with and informed by the context in which AI technologies are developed and in which they emerge, as well as the complex and emotional human experience of using, understanding, and responding to GenAI.


Acknowledgements 

I want to extend my sincere thanks to my internal reviewer Brea McQueen, my publishing editor Brittany Paloma Fiedler, and my external reviewer Rosalind Tedford for their time, attention to detail, constructive feedback, and support. Their thoughtful comments, ideas, and feedback proved invaluable throughout the stages of shaping this article. I am fortunate to have collaborated with Rosalind on previous projects related to information and AI literacy, and I’d like to extend a thank you to her and Dan Chibnall for serving as thought-partners and collaborators over the years. I’d also like to thank Andrea Baer and Brady Beard for their time, generosity, and willingness to discuss generative AI and librarianship with me. Their work has helped to shape and inspire my own. Finally, a thank you to the Lead Pipe Editorial Board for the opportunity to publish my work here.


Suggested Tags

Generative AI; AI literacy; AI hype

References

ACRL (2025). AI competencies for academic library workers. https://www.ala.org/acrl/standards/ai

Baer, A. (2025a). Unpacking predominant narratives about generative AI and education: A starting point for teaching critical AI literacy and imagining better futures. Library Trends, 73(3), 141-159. https://muse.jhu.edu/pub/1/article/961189/pdf

Baer, A. (2025b). Investigating the ‘feeling rules’ of generative AI and imagining alternative futures. In the Library with the Lead Pipe. https://www.inthelibrarywiththeleadpipe.org/2025/ai-feeling-rules/

Bareis, J. (2024). Ask me anything! How ChatGPT got hyped into being. Preprint. Center for Open Science. https://doi.org/10.31235/osf.io/jzde2


Data-driven workflows and the art of informational collaboration / HangingTogether

What is collaboration? I prompted ChatGPT to create an image illustrating collaboration, and this is what it produced:

I would guess that most people would conjure up something similar if asked to mentally visualize collaboration: a group of people, in the same physical space, working together. Direct, face-to-face collaboration is indeed an important way to partner and act collectively. But for libraries, another form of collaboration may be at least as important—and impactful. It is rooted in the concept of a collective collection: “the combined holdings of a group of libraries, analyzed and possibly managed as a unified resource.”

OCLC Research has produced a considerable body of work focused on defining, describing, and thinking through the implications of collective collections. An important strand of these studies examines collective collections in the context of shared print collections, in which groups of libraries work collaboratively to steward their collective print holdings. Most recently, we released the OCLC Research report Making Shared Print Work, which gathers community insight on workflows, data, and tools supporting collective stewardship of print collections, along with perceived gaps and opportunities that, if addressed, could strengthen the future of shared print programs. This report is part of OCLC Research’s Stewarding the Collective Collection project.

Shared data powers informational collaboration

One finding we reported in the study was that, as a practical matter, many shared print collections are distributed across a network of local collections, rather than physically consolidated into one collection. Aggregation of these local collections into a collective collection occurs through a layer of data and services that sits over the distributed collections, knitting them together into a data construct and allowing them to be analyzed and managed as a cohesive whole.

A related finding from Making Shared Print Work is that data is the key to delivering value to shared print programs:

Accurate and comprehensive data is essential for effective stewardship of collective collections, such as those managed by shared print programs. Monographic shared print programs involve six core workflow categories, with collection analysis, metadata management, and verification being the most data-driven—and in some cases, the most time-intensive—activities. The importance of data to shared print workflows is amplified by the fact that these programs primarily operate as distributed collections, requiring extensive coordination of holdings, retention, and bibliographic data across multiple partner libraries. (p. 7)

In these circumstances, it is not necessarily collaborators seated in the same room that drive successful shared print partnerships, but rather, informational collaboration: collective action powered by shared information that informs local and group decision-making.

The importance of informational collaboration was reinforced again and again in our Making Shared Print Work study. The perspective we gathered from interviews and focus groups revealed the primacy of data-driven workflows in shared print programs, underscoring the role of data as the connective tissue linking distributed local collections into an overarching collective collection. Case in point: the shared print workflow most frequently mentioned by our interviewees was collection analysis.

Collection analysis, at its core, is about turning bibliographic and holdings data into actionable insights. Detailed knowledge of the size, scope, and salient features of a library collection—or a collective collection—leads to informed decision-making across a wide range of stewardship activities: from weeding and storage planning, to ensuring the fit and relevance of the collection to user needs, to redressing gaps in representation and diversity in legacy holdings. To this, we can add a number of shared print-specific considerations, such as choosing to make print retention commitments within the local collection or even identifying rare or last copies of publications within the context of a group’s collective print holdings.

Informational collaboration through collection analysis

Informational collaboration fuels this type of data-driven collection analysis in a shared print context. Sharing data about local collections in the partnership builds a clearer picture of collective print holdings, which, in turn, allows for better informed decision-making at the group level, but also at the local level, where knowledge of the size, scope, and features of the collective collection provides a contextual backdrop against which local decisions can be made.

Retention commitments are a great example of how this works in practice. A recent OCLC Research study examined print retention commitments registered in OCLC’s WorldCat database. A retention commitment—an assurance that a library will continue to retain and steward a particular print volume in its collection—is vital intelligence that informs the local retention decisions of other libraries. Informational collaboration occurs when the retention commitment is registered in WorldCat: when this information is shared and analyzed across a group of libraries, each can make local de-accessioning decisions based on the assurance that at least one copy of the publication will remain available.
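The decision rule described above can be sketched in a few lines. This is an illustrative sketch only, not the actual WorldCat API: the registry structure, record identifiers, and library names are invented for the example.

```python
# Hypothetical sketch of the retention-aware withdrawal decision described
# above. The registry (record ID -> set of libraries with registered retention
# commitments) is invented for illustration; the real data lives in WorldCat.

def can_deaccession(record_id, my_library, retention_registry):
    """A library can safely withdraw its copy only if at least one *other*
    partner has committed to retain the title."""
    committed = retention_registry.get(record_id, set())
    return bool(committed - {my_library})

registry = {
    "ocn123": {"Library A", "Library B"},  # two partners committed to retain
    "ocn456": {"Library A"},               # Library A is the sole committer
}

print(can_deaccession("ocn123", "Library A", registry))  # True: Library B retains
print(can_deaccession("ocn456", "Library A", registry))  # False: no other copy assured
```

The same check, run across a whole group's holdings, is what shared retention data makes possible at scale.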

Data-driven analysis supported by informational collaboration helps libraries keep books in collections as well. The Statewide California Electronic Library Consortium (SCELC) launched a pilot shared print program in 2016. A group-wide analysis of collective print holdings, produced using OCLC’s GreenGlass collection analysis tool, revealed the surprisingly low rate of overlap across the partner collections, with a large percentage of the collective collection consisting of publications held by only one or two member libraries. Informational collaboration in the form of sharing information about local print holdings through the GreenGlass analysis led to actionable intelligence for the group members: knowledge of the high incidence of rare or unique holdings within the group informed and optimized group-wide retention commitment strategies.
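At its simplest, the kind of group-wide overlap analysis described above reduces to counting how many partners hold each title. A minimal sketch, with invented holdings data standing in for the real GreenGlass inputs:

```python
from collections import Counter

# Invented holdings: each partner's local collection as a set of title IDs.
holdings = {
    "Lib1": {"t1", "t2", "t3"},
    "Lib2": {"t2", "t4"},
    "Lib3": {"t2", "t5"},
}

# Count how many partners hold each title across the group.
copies = Counter(t for titles in holdings.values() for t in titles)

# Titles held by only one or two partners: candidates for retention commitments.
scarce = sorted(t for t, n in copies.items() if n <= 2)

# Share of the collective collection held by more than one partner.
overlap_rate = sum(1 for n in copies.values() if n > 1) / len(copies)
```

In this toy group only one of five titles overlaps, the same low-overlap pattern SCELC found at scale.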

The importance of informational collaboration through collection analysis and other forms of data-driven analysis was underlined further in our Making Shared Print Work study when interviewees indicated that more was needed within shared print programs and beyond. For example, practitioners we spoke to noted a lack of systematic coordination across shared print programs, resulting in inefficiencies and duplication of effort that only become evident when the full landscape of shared print collections is taken into account. More sharing of data across shared print programs—in other words, more informational collaboration—could improve decision-making and coordination of resource allocations across the full spectrum of collective print stewardship efforts.

Data and tools are collaborative infrastructure

Collaboration requires collaborative infrastructure, the scaffolding upon which partnership can be established, sustained, and encouraged to thrive. For face-to-face collaboration—people working together in the same room—collaborative infrastructure might take the form of meeting spaces, committees, governance policies, and so forth.

Collaborative infrastructure is also needed for informational collaboration, but the nature of that infrastructure is different: databases, data exchange mechanisms, and data analysis tools that create the actionable intelligence that informs local and collective decision-making. Think of WorldCat, a database of shared information about library collections around the world. Consider also analysis tools like OCLC’s GreenGlass and Choreo Insights. Taken together, these resources—data and tools—create opportunities for informational collaboration in shared print and beyond.

Shared print programs illustrate how collaboration in libraries increasingly depends on informational collaboration that links distributed local collections into a collective collection through shared data and services. The infrastructure needed to support informational collaboration, like databases and analytic tools, complements the data-driven workflows that support shared print as well as other forms of collection stewardship. Informational collaboration provides the foundation for successful, sustained partnerships that help libraries achieve greater efficiencies and impact through scale.

The post Data-driven workflows and the art of informational collaboration appeared first on Hanging Together.

Weekly Bookmarks / Ed Summers

These are some things I’ve wandered across on the web this week.

🔖 Infrastructure Landlords: The Rentier Capitalism of Commercial Academic Publishers

If you want to understand where the commercial parts of scholarly communications may be heading, you need to look beyond policy documents, conference panels, or public-facing strategy statements. You should look at what large commercial actors say when speaking to investors. Earnings calls are one of the places where that language becomes especially revealing: less concerned with sector ideals than with growth, market opportunity, competitive position, and what will ultimately generate value for shareholders. For this reason, it can be worthwhile to review earnings calls and investor presentations, as these are often overlooked when discussing OA policy and sectoral movements.

🔖 AI got the blame for the Iran school bombing. The truth is far more worrying

Someone decided to compress the kill chain. Someone decided that deliberation was latency. Someone decided to build a system that produces 1,000 targeting decisions an hour and call them high-quality. Someone decided to start this war. Several hundred people are sitting on Capitol Hill, refusing to stop it. Calling it an “AI problem” gives those decisions, and those people, a place to hide.

🔖 Guibo

GUIBo is a desktop GUI for operators and developers who run Kubo (the IPFS daemon in Go). It drives your node through Kubo’s HTTP RPC API so you can work with pins, UnixFS content, IPNS, remote pinning, gateways, and network or repo diagnostics without living in the terminal.

🔖 The Human Line Project

At The Human Line, we are committed to ensuring that AI technologies, like chatbots, are developed and deployed with the human element at their core. LLMs are powerful tools, and with Ethical design, users can gain new skills and knowledge while remaining emotionally intact.

🔖 Marriage over, €100,000 down the drain: the AI users whose lives were wrecked by delusion

Tech-related delusions, whether they involve train travel, radio transmitters or 5G masts, have been around for centuries, Morrin says. “What’s different is that we’re now arguably entering an age in which people aren’t having delusions about technology, but having delusions with technology. What’s new is this co-construction, where technology is an active participant. AI chatbots can co-create these delusional beliefs.”

🔖 Web Resource Ledger (WRL)

WRL captures web pages with cryptographic proof of authenticity – Ed25519 signatures and RFC 3161 timestamps that anyone can independently verify.

PS, Ilya took a close look and it appears to be a vibe coded mess.

🔖 Liberation Radio(s) Beyond the Internet Imaginary

The seemingly unassailable hegemony of the contemporary internet means too few people know that shortwave radio has never gone away and that in many ways it’s more durable, more secure, and more widely accessible than other contemporary forms of wireless communication such as cell or wifi.

🔖 Not AI

Valerie Veatch Asks the Big AI Questions

“[T]he first thing is that computers cannot think, that is an invented concept. And rather than computers being able to think, we’ve reinvented thinking to be something computers can do. And when we do that, all manner of power consolidation, wealth consolidation, technological monopolies happen and we are looking at this fantasy enemy instead of the real political work and community work to be done.”

🔖 Richmond Folk Festival

The Richmond Folk Festival is one of Virginia’s largest events, drawing visitors from all over the country to downtown Richmond’s historic riverfront. The Festival is a FREE three-day event that got its start as the National Council for the Traditional Arts’ National Folk Festival, held in Richmond from 2005-2007. The Richmond Folk Festival features performing groups representing a diverse array of cultural traditions on six stages.

🔖 The Commons w/ Peter Linebaugh

Featuring Peter Linebaugh on the long histories of commons and commoning, connections between enclosures in Europe and imperial conquest abroad, and writing history from below.

🔖 Silicon Valley’s Mythology of Human Amplification

If output is your only metric, then the steam engine really is just a better bicycle. Both get you from A to B. One gets you there faster with less effort. Case closed. The fact that you arrive having done nothing, learned nothing, built nothing—that’s not a bug, that’s the point. Effort is a cost to be minimized, not a value to be preserved.

But embedded in that worldview is that the journey is merely instrumental. The only thing that matters is arrival. That it doesn’t matter if you travel or are traveled. The Inuit elders seem to operate on a different premise. Arrival, of course, mattered. These were hunters who needed to find caribou and get home alive. But only through the journey could you acquire deep knowledge of the terrain. You couldn’t separate arriving at the destination from what you learned on the way there.

🔖 Code Review Is Not About Catching Bugs

What teams collaborate on during review is changing. Less time spent on style nits and mechanical correctness, more time on intent, architecture, and whether a change moves the product in the right direction. That’s a good shift. And the collaborative act itself – multiple humans exercising judgment together, developing shared taste, building mutual understanding of where the system is heading – that’s not a bottleneck to eliminate. It’s something to uplevel.

🔖 Mining the commons: AI extraction, Wikipedia, and the case for a multi-stakeholder settlement

Wikipedia and similar DPGs cannot sustain themselves on a fragile mix of donations, sporadic philanthropy, and ad-hoc corporate generosity. What’s needed is a multi-stakeholder settlement in which large-scale users of the commons take on long-term, structured obligations to sustain it: contractual funding through paid APIs and usage-based levies, formal recognition of DPGs as Digital Public Infrastructure to unlock multilateral co-financing, and a shift in philanthropy from one-off project grants to sustained core support for the institutions that maintain the commons.

🔖 Harness engineering: leveraging Codex in an agent-first world

We intentionally chose this constraint so we would build what was necessary to increase engineering velocity by orders of magnitude. We had weeks to ship what ended up being a million lines of code. To do that, we needed to understand what changes when a software engineering team’s primary job is no longer to write code, but to design environments, specify intent, and build feedback loops that allow Codex agents to do reliable work.

This post is about what we learned by building a brand new product with a team of agents—what broke, what compounded, and how to maximize our one truly scarce resource: human time and attention.

🔖 Harness Engineering

It was very interesting to read OpenAI’s recent write-up on “Harness engineering” which describes how a team used “no manually typed code at all” as a forcing function to build a harness for maintaining a large application with AI agents. After 5 months, they’ve built a real product that’s now over 1 million lines of code.

The article is titled “Harness engineering: leveraging Codex in an agent-first world”, but only mentions “harness” once in the text. Maybe the term was an afterthought inspired by Mitchell Hashimoto’s recent blog post. Either way, I like “harness” as a word to describe the tooling and practices we can use to keep AI agents in check.

🔖 The importance of Agent Harness in 2026

We are at a turning point in AI. For years, we focused only on the model. We asked how smart/good the model was. We checked leaderboards and benchmarks to see if Model A beats Model B.

The difference between top-tier models on static leaderboards is shrinking. But this could be an illusion. The gap between models becomes clear the longer and more complex a task gets. It comes down to durability: how well a model follows instructions while executing hundreds of tool calls over time. A 1% difference on a leaderboard says nothing about whether a model drifts off-track after fifty steps.

We need a new way to show capabilities, performance, and improvements. We need systems that prove models can execute multi-day workstreams reliably. One answer to this is Agent Harnesses.

🔖 Who will remember us when the servers go dark?

When the server goes dark, we go dark, too. We’ve built an entire civilisation on an unthinkably brutal and comically unreliable stack while hallucinating it as literally anything else. We condemn AI today for making shit up, but what about us? We’re building on a fantasy just as brittle, we are just as demonstrably wrong. Yet we pretend a file isn’t just a gesture that can disappear in an instant. We hallucinate that the server is somehow both fleeting and forever.

🔖 Risky Bulletin: GitHub is starting to have a real malware problem

GitHub is slowly becoming a very dangerous website as more and more threat actors are starting to use it to host and distribute malware disguised as legitimate software repositories.

What started as an infrequent sighting in early 2024 is now at the center of an increasing number of infosec and malware reports.

The tactic is usually the same. A threat actor takes a legitimate repository, adds malware to the files—typically an infostealer or a remote access trojan—and then uploads the booby-trapped repo back to GitHub.

🔖 One man’s poignant search for community via radio waves

A unique and deeply moving piece of biographical filmmaking, the short documentary Echo provides a window into the life of an older man named Allister Hadden living in Northern Ireland. The film drifts between past and present, with a rich, textured, shot-on-film aesthetic tethering together Hadden’s archival recordings and newly shot footage from the Belfast-based filmmaker Ross McClean.

Operationalizing Minimal Computing Values Through Shared Computing-Platform Development: A Case Study of DigitalArc and Opaque Publisher / In the Library, With the Lead Pipe

In Brief: This article explores how minimal computing principles guided the parallel web development of two related but distinct publishing platforms, DigitalArc and Opaque Publisher. DigitalArc, a community-driven digital archive and exhibit platform, was developed in response to principles governing post-custodial archiving, taking them one step further to ensure communities maintain ownership of their materials and their digital artifacts. The Opaque Publisher, originally developed in support of a born-digital dissertation, adapts DigitalArc to support refusal theory for scholars who must negotiate the tension between using unethically obtained evidence in support of their research and their moral objections to a lack of informed consent. At first glance, the use cases for each platform seem different, but both provide mechanisms for individuals-by-proxy and communities to assert control over how their respective stories are shared.

By Kalani Craig, Michelle Dalmau and Sean Purcell

This article details the conversations, dependencies and contingencies that developed as our team simultaneously built two related, but distinct academic publishing platforms and considered the theoretical motivations for having done so. The first of these, DigitalArc (DA), was designed to support the creation of low-cost sustainable digital exhibits and archives built by and for communities who want to control how their histories are presented online. DA was designed as a community-driven digital archive and exhibit platform, ensuring communities maintain ownership of their materials and their digital artifacts.1 The second, the Opaque Publisher (OP), used DigitalArc as a foundation for a digital-exhibit and digital-publication platform that supports scholars who want to redact or remove information that was obtained from medical patients without their informed consent. The modeling and development of both platforms were guided by complementary frameworks that shaped our decision to use DigitalArc as the technical foundation for the Opaque Publisher (Zenzaro, 2024; Ciula et al., 2018).

In “The Digital Opaque: Refusing the Biomedical Object” (Purcell, Craig & Dalmau, 2025), we outlined our adoption of refusal theory in the rejection of unquestioned institutional norms around the use and display of unethically obtained medical specimens. This theoretical framework was operationalized in the OP as an author-audience interaction that allows authors to identify sensitive information in both the text of, and images included in, an academic publication. Readers are then given the ability to control how and whether that sensitive information is redacted fully or partially, or displayed openly, with the default view set to “partially opaque,” serving as a compromise between fully redacted and fully open. 
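The reader-controlled view described above might look something like the following sketch. The tag syntax, level names, and placeholder text here are invented for illustration; they are not the Opaque Publisher's actual implementation.

```python
import re

def render(text, level="partial"):
    """Render tagged sensitive spans at the reader's chosen opacity level:
    'open' shows the content, 'redacted' removes it, and the default
    'partial' replaces it with a word-count placeholder."""
    def sub(match):
        content = match.group(1)
        if level == "open":
            return content
        if level == "redacted":
            return "[redacted]"
        return f"[withheld: {len(content.split())} words]"  # the "partially opaque" default
    return re.sub(r"\[\[sensitive:(.*?)\]\]", sub, text)

doc = "The patient, [[sensitive:Jane Doe of Springfield]], was admitted in 1904."
print(render(doc, "open"))      # full text, including the sensitive span
print(render(doc))              # default "partially opaque" view
print(render(doc, "redacted"))  # sensitive span withheld entirely
```

The key design point is that the author only marks what is sensitive; the choice of how much to see rests with each reader.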

Here, we outline a history of technical interventions that supported ethical creation and interpretation of sensitive content through the iterative implementation of two publishing platforms. Core to both platforms were ethical-research considerations and public-communication audiences, which in turn drove the adoption of minimal computing approaches. We hope that, by focusing on the audience needs we identified for the two projects, the existing models we assessed for the OP’s parent framework, and some of the serendipitous contingencies that shaped the minimal-computing development of both platforms, we can offer some lessons for other digital-humanities development teams seeking to operationalize their theoretical frameworks in the form of technical choices. 

DigitalArc: A Community Digital Archive Platform 

In Fall of 2018, our team began to assess options for a spring 2019 course centered around the creation of a public-history archive as the main classroom activity. As the instructor, Craig initially asked for consulting advice about potential archiving platforms from Dalmau and other members of the team at the Institute for Digital Arts and Humanities, and from members of Dalmau’s digital-libraries team. Drawing on that advice, along with lessons from obstacles our own institution had encountered in supporting digital projects, Craig began to assess the potential of GitHub Pages as a publishing platform.

Our audience was twofold: first-year undergraduates with little to no research experience in history or technical experience in digital humanities, and the public audiences who would be engaging with the digital collection and historical essays those first-year students would develop as a part of their class. Models for this sort of endeavor existed in spades, most of which focused on the simplicity of content creation for content creators with minimal technical skill. Content management systems (CMS) allow these users to interact with a graphical user interface (GUI) and engage in button-pushing and form-filling behaviors that build multi-media experiences palatable to public audiences, with integrated display for photos, videos, audio, and text (Russel & Merinda, 2017). From Google Sites and WordPress in the corporate freemium sphere to Drupal and Omeka, open-source platforms commonly used in academic contexts, many of these content management systems modeled the use of a programming language (often PHP) supported by a back-end database (often MySQL) that served on-demand pages built “on the fly” as a reader requested each page on the web site. Our institutional support was rich for Omeka in particular, and we appreciated Omeka’s focus on non-profit academic public engagement. However, acquiring, critiquing, and applying digital literacies are key outcomes for the course, and we were able to hone these literacies, with the built-in support structure offered by the class, by exploring a more transparent code base offered by static sites (Wikle, Williamson and Becker, 2020).
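The contrast with on-demand page building is easy to see in code. Where a PHP/MySQL CMS assembles each page when a reader requests it, a static-site approach renders every page once, ahead of time. A minimal sketch, with the template and page data invented for illustration:

```python
import tempfile
from pathlib import Path

# A single hypothetical page template. Real static-site generators (Jekyll,
# which powers GitHub Pages, for instance) layer templating and Markdown on
# top of this same one-time build step.
TEMPLATE = "<html><head><title>{title}</title></head><body><h1>{title}</h1>{body}</body></html>"

def build_site(pages, out_dir):
    """Render every page to a plain HTML file; serving is then just file hosting."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for slug, page in pages.items():
        (out / f"{slug}.html").write_text(TEMPLATE.format(**page))

pages = {
    "index": {"title": "Community Archive", "body": "<p>Welcome.</p>"},
    "about": {"title": "About the Project", "body": "<p>Built by students.</p>"},
}
out_dir = tempfile.mkdtemp()
build_site(pages, out_dir)
```

Because the output is plain files, there is no PHP runtime or database left to upgrade later, which is the sustainability property that pushed us toward static sites.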

As with many technical projects, however, serendipity wrinkled the fabric of our plan: that same semester, university IT rolled out a required upgrade to PHP on the servers available for hosting, which in turn prompted a systemwide Omeka upgrade. This IT-driven upgrade represented, on the one hand, a very well-provisioned IT environment that could support database-driven CMS hosting for many sites, and on the other, a very clear division between IT’s environment-building responsibilities and researchers’ site-creation and maintenance responsibilities. Dozens of sites needed upgrades in order to remain accessible for public view, and in many cases, the creators of those sites were not equipped to handle such upgrades. That semester, Omeka served as both a model for what worked exceptionally well for novice creators in the site-building phase, and as a warning for the errors our students, and our public audiences, might expect to see in the site’s long-term post-project maintenance.

The experience pushed us away from big tech into the realm of “minimal computing,” an approach that responds to the tension between the needs of a community of practitioners–which can include individual partners with institutional affiliation–and the often-limited resources available to them. Focusing on this tension between need and resource availability forces us to consider how and why we are using the technologies in the first place. Roopika Risam and Alex Gil anchor the minimal-computing movement’s motivation in “a very real fear” of big tech’s ideologies of fast growth at all costs, disruption over stability, and expense over access. Such ideologies continue to exclude communities whose “voices and stories…have been elided” in the cultural record, this time in digital spaces instead of physical collections (Risam and Gil, 2022). By contrast, minimal-computing best practices offered a framework that helped us evaluate our early-stage classroom priorities by asking what we had, what we needed, and what we wanted to prioritize. We had a team capable of managing almost any technical environment. We needed to reduce or eradicate long-term institutional dependencies and create an archive without the longer-term sustainability concerns that Omeka and WordPress presented in our immediate institutional context. We wanted to prioritize short-term labor and development over requiring students, or ourselves, to handle long-term maintenance. As we further developed DigitalArc (DA) in the years that followed, this tension between resource limitation and need led us to move much of the technology’s maintenance complexity, and its labor, onto our institutional team, through development and documentation, in order to shift the expense of the technology away from anyone who might be interested in using our platform later on.

Minimal computing’s emphasis on smaller-scale projects, in which initial labor investment by technical experts results in lower barriers to long-term technical maintenance and much lower cost, is also informed by resistance to a “maximal” digital humanities, which leans on a combination of well-provisioned institutional support and the structural exigencies that require researchers to respond quickly to a limited set of choices when that institutional support changes. When these maximal-computing tendencies are transferred from well-resourced IT environments with institutional support for long-term maintenance into other settings, the changed institutional pressures create institution-specific site-creation and sustainability concerns that vary greatly from context to context (Miya & Rockwell, 2025).

The contingencies we faced, even in an IT-rich environment, helped guide us as we considered how to de-institutionalize both the minimal-computing and maximal-computing platforms to which we had access. We anticipated that implementers of a minimal-computing platform would need methodical yet easy-to-follow documentation to scaffold what they might initially see as a less “user-friendly” interface. In this case, to achieve a minimal codebase in support of ongoing sustainability, we rely on substantive documentation. Herein lies one of several counter-intuitive aspects of minimal computing. Despite these contradictions, our choices were intended to allow more agency for communities and scholars, giving both our developer team and our audiences a better handle on the short-term and long-term “considerations of the costs, limits, or wisdom of scale” (Walsh 2024).

In Spring of 2019, our classroom began using the first version of a minimal-computing digital-exhibit template that would become DigitalArc years later. The feature set in this student-built version of the platform was partially inspired by Omeka’s academic focus on discovery and presentation based on well-formed metadata, and by its ability to support meaningful interactions with multimedia objects. We also took cues from the many CMSs that developed on WordPress’s model of simple authoring for novice creators, and from colleagues in digital humanities whose research on minimal computing suggested that back-end design and documentation require a heavier lift up front but allow easier ongoing, longer-term management over time (Wingo & Anderson, 2025). We also took steps to narrate the parallels between CMSs like WordPress or Weebly and the features that Github’s GUI editing pages offered, as a bridge to help build student confidence that Github Pages could come quite close to the ease of point-and-click editing with minimal training time.

The tech stack that supported this initial minimal-computing approach was centered on Jekyll, a static-site generator that takes a different approach to the design and publishing of sites than Omeka’s and WordPress’s dynamic pages (on-request, PHP-and-database-generated pages).2 In this model, pages are built when creators make edits, rather than when a public viewer requests the page; if something goes wrong with an edit, or with part of the tech stack, implementers have a greater chance of noticing and fixing the problem before viewers encounter an interruption to the site. As with Omeka and WordPress, Jekyll lets us design and implement headers, footers, and page templates that apply to any of the content generated by students. As with other CMSs, we set up customization of fonts, colors, navigation elements, and other basic design. We later added support for custom metadata and navigation labels, intended to support both multilingual audiences and communities’ preferred vocabulary.
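The site-level customization described above can be sketched in a Jekyll `_config.yml`. This is an illustrative example only; the field names are our own assumptions for the sketch, not DigitalArc’s actual configuration:

```yaml
# Illustrative _config.yml sketch; field names here are hypothetical,
# not DigitalArc's actual configuration.
title: "Community History Harvest"
description: "A demonstration community archive"

# Hypothetical custom navigation labels, e.g. for multilingual audiences
# or a community's preferred vocabulary.
nav_labels:
  browse: "Explore the Collection"
  about: "About This Project"

# Jekyll collections let each archive object live as its own markdown file.
collections:
  items:
    output: true    # build a public page for every file in _items/
```

Because these values live in one plain-text file, a template’s headers, footers, and labels can be changed without touching any page-level code.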

Our one compromise with the world of “maximal” computing, which we will address more fully below, was to host our Jekyll site on Github. From a teaching perspective, Github’s free user-account option and focus on collaborative editing made it easier for students to collaboratively access Github during class. From a site-editing perspective, Github’s Pages feature automatically added the template features we built to any of the simpler “markdown” files. Creators authored these markdown files, which focus on representation of the digital objects in the collection, including descriptions and corresponding images, text or time-based media files (see Fig 1). Markdown serves as the vehicle for encapsulating curated information (metadata) described through basic text formatting, which reduces technical barriers associated with scripting languages and database implementations. 

Figure 1. Image description: A screenshot of the markdown that describes an item included in the DigitalArc Platform demo exhibit site. Available in markdown form at https://github.com/DigitalArcPlatform/demo/blob/main/_items/Item-1-document.md and as a reader-facing page at https://digitalarcplatform.github.io/demo/items/Item-1-document.html
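Figure 1 links to the demo’s actual markdown; as a general, hedged illustration, an item file of this kind pairs YAML front matter (the curated metadata) with a plain-markdown description. The field names below are hypothetical, not the demo’s schema:

```markdown
---
# Hypothetical item file; field names are illustrative only.
title: "Letter from a Community Contributor, 1923"
date: "1923-04-12"
media_type: "image"
image: "/assets/items/letter-1923.jpg"
---

A short description of the object, written by its contributor in plain
markdown with **basic text formatting** and no scripting or database work.
```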

For students, these easy-to-teach, text-only editing processes meant they could use Github’s online GUI-based editor. This experience highlighted the division of labor in minimal computing that places additional up-front burdens on our developers. It was our responsibility to: understand that Github Pages’ existing GUI was viable as a point-and-click interface for file editing; create as user-friendly an environment as possible within the context of Github Pages; explain the affordances of Github Pages as involving some additional difficulty up front in exchange for much longer-term ease of maintenance and use; and provide documentation of that environment that makes it more easily adaptable by novices.

While our first development effort in Fall of 2019 was aimed at the infrastructure for a one-time classroom engagement, it would, in true minimal-computing fashion, come to include “ethical concerns that influence our practice” (Risam, 2025). These ethical considerations added to the benefits we found in choosing custom development in Github Pages over the similarly time-consuming customization and long-term maintenance we would have had to budget in order to use an existing digital exhibition and publishing platform with institutional support. We quickly realized that this minimal-computing model had several additional affordances we could use for other projects. First, as we built the student exhibit, we realized it was an easier model for novices to adapt free of charge for multiple projects, as compared to the freemium option common to platforms like WordPress or Weebly, in which a single user is limited to a single free site. Building and applying new single-page templates in Jekyll was easier with limited design and programming skill, both for our team’s own development work and for future community members learning to customize and launch their own web sites. Second, the collaborative, free, non-academic context of Github had promise for audiences whose experiences with universities and other large institutions were less than positive (Sutton & Craig, 2022).

The specifics of our rollout in the classroom and those that followed, however, presaged a persistent concern that Quinn Dombrowski addresses in “Minimizing Computing Maximizes Labor”: “going ‘minimal’ requires a great deal of technical labor” (2022). Students needed additional support to transition from a fully GUI-based editing interface to the combination GUI/text-based editing system that GitHub Pages and Jekyll require. Coming face-to-face with this early on helped us establish teaching and documentation principles that addressed the longer-term implications of a minimal-computing platform-development agenda. This trade-off emphasized, for us, the importance of creating scaffolded documentation for DA implementation that allows for a more flexible, accessible user experience and easier maintenance for web site content creators and managers (see Figure 2).3

Figure 2. Image description: A screenshot of the documentation that describes how to edit markdown that describes an item included in the DigitalArc Platform, starting with directions for posting items. Available at https://digitalarcplatform.github.io/documentation/docs/publishSite/posting/

With both concerns and affordances in mind, we began to use these minimal-computing approaches in other settings, including three community-facing History Harvest projects that took place over the four years that followed the student-focused digital-archive classroom experience.4

Opaque Publisher: A Scholarly Publishing Platform 

It’s here that we time-skip forward to 2023 and the development of the Opaque Publisher (OP). By then, our team had experience building DA into a templated platform that drew on existing models of community archiving and had fully integrated minimal-computing labor division into our workflow. Prototyping through an ACLS-funded grant5 helped us build and test a reasonably featured minimal-computing Jekyll template that served several community archive projects and proved adaptable to other web-publishing needs.6

The affordances we identified during the initial iterations of what we now know as DigitalArc also set DA up to become a useful foundation for the OP. Our experience with the adaptability of minimal-computing Jekyll sites was crucial for the OP’s timely development. Jekyll’s development process meant that OP team members with basic HTML skills and a willingness to experiment could see DA in its fully articulated form and use it as an easily portable example from which to customize a new platform. That changed the up-front labor required of the more skilled members of our development team, allowing us to divide the labor more easily and to repurpose code across our different Jekyll projects, which eased the path for team members still acquiring that customization skillset. We appreciated this method because Jekyll offered an environment that not only scaffolded our team as they experimented with their newly built site in increasingly complex ways, but encouraged them to do so, because each small success helped them see themselves as capable of technical tasks.

The second affordance–our focus on moving DA outside of its original institutional context–offered an anchor for the OP’s goal of refusal: an intentional rejection of institutional context and institutional harm. Those experiences provided a foundation for operationalizing the refusal theory that Sean Purcell’s then-dissertation project brought to our attention.7 At minimum, his functional requirements included a digital exhibition platform that mirrored the formatting requirements associated with academic publishing (citations, tables of contents, and indices). These elements were not included in DA’s development, owing to a difference in intended audience and output. In addition, the project’s interactive approach to refusal required templating for the platform’s interactive elements, which could be added to prepared markdown files by a user familiar with basic hypertext markup language (HTML).8 One of the advantages of working on these two projects in parallel was the opportunity to develop resources for future Jekyll templates that attend to the overlapping but distinct needs of academics, archivists, and communities.

While DigitalArc offered an easy starting point for a team already familiar with Github Pages and the specifics of the DigitalArc template itself, there was no shortage of CMS options designed with some academic apparatus in mind. We re-evaluated our minimal-computing starting point–what do we have, what do we need, and what are our priorities–as we did due diligence. Omeka’s third-party development community included a footnote plugin, and Omeka’s base install had a built-in table-of-contents generator. Scalar diverged from the exhibit model to offer a combination of non-linear and table-of-contents-based reading processes, but customizing Scalar presented a higher learning curve for our team, and its computational overhead and dependence on PHP complicated long-term digital preservation. Mukurtu’s focus on the ethics of digital exhibits was a good fit for the refusal theory that drove Purcell’s dissertation, but its dependence on Drupal 7 raised worries about the platform’s long-term sustainability. In hindsight, our platform worries beyond our initial reluctance to use Omeka were well-founded; Drupal 7 support ended in January 2025, leaving Mukurtu in limbo, and Scalar experienced a maintenance outage in August of 2025.9 As with DigitalArc, we wanted to engineer around potential vulnerabilities and gaps in site availability, and the friction between open, non-profit archival platforms and dependence on a constantly updating database-driven codebase meant moving away from these easily accessible platforms.

Despite the surfeit of academic CMS models, flexible redaction models were harder to find. Print redaction, like that done in Adobe Acrobat or other print-document generators, assumes permanent strike-throughs or blackouts. Again, Mukurtu offered inspiration for changing levels of visibility based on community-oriented traditions and ethics, but those levels are controlled by site creators rather than readers or audience members. Ultimately, DigitalArc offered the longer-term maintainability we prioritized, the non-institutional platform that aligned with our ethical goals of institutional refusal, and the fastest customization path for a series of interface options that reified authorial choices about which text and image sections were sensitive while allowing the redaction-level display of those sections to be reader-controlled (Fig. 3). In choosing to go further down the minimal-computing path we started with DigitalArc, we provide readers with a way to engage with the tensions and ethical questions that we posed in the companion article: “as scholars we have to show our work, and this practice of showing is often at the expense of those whose lives and deaths are entangled in our research programs” (Purcell, Craig & Dalmau, 2025).

Figure 3. Image description: An example of how refusal informed the opacity functions in the OP in which users are able to toggle how they view the images and text based on whether the subjects depicted in primary materials consented to the research. This example was drawn from Purcell, Sean “Teaching Hygiene” in The Tuberculosis Specimen. (2025). https://tuberculosisspecimen.github.io/diss/dissertation/1_3_4

Our choices were made with an audience of scholar-authors looking for simple technical solutions in mind. That focus pushed us away from integrating more complex programming and toward a mostly-CSS solution for the interactive opacity filters. For text, the opacity filter activates unique span classes in the textual narrative that were flagged during composition.10 The interpolation of image and text was done mostly in markdown, using Scrivener, with a few text-string replacements that allowed Purcell to easily insert the necessary HTML to apply the Javascript redaction, a process we describe more fully in “The Digital Opaque.” The images themselves, however, were more complicated: every image that needed to be made opaque had to be edited three times: first, to crop and format for the web; second, to remove elements of the image for the ‘partial opacity’ version of the site; and third, to remove more of the image for the ‘opaque’ version of the site (fig. 4). When the site loads, all three versions of these images load at the same time, but only one is visible to the user at any time.
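A minimal sketch of how a mostly-CSS text filter of this kind can work; the class names here are hypothetical rather than the OP’s actual selectors, and a small script (or toggle utilities like Foundation’s) is assumed to switch a view class on the `<body>`:

```css
/* Hypothetical selectors; flagged passages are wrapped during composition:
   <span class="flagged">…sensitive passage…</span>                        */

/* Transparent view: flagged text reads normally. */
body.view-transparent .flagged { color: inherit; background: none; }

/* Partial view: flagged text is dimmed but still present in the page. */
body.view-partial .flagged { opacity: 0.25; }

/* Opaque view: flagged text is blocked out, though it remains in the markup. */
body.view-opaque .flagged { color: transparent; background-color: #000; }
```

The key design point is that the sensitive text is never removed from the document; only its presentation changes as the reader selects a view.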

Figure 4. Image description: The three versions of each image corresponded with the opacity guidelines of the site, incrementally removing elements of the bodies of patients based on the project’s predefined protocols. From left to right, a white woman drinking from a glass while staring at the camera, in the next image her eyes are obscured, in the final image her whole face is obscured. Three unique versions of every image had to be made. Lockard, Lorenzo B.. Tuberculosis of the Nose and Throat. St. Louis: C. V. Mosby Medical Book & Publishing Co., 1909

We tested a few versions of the opacity functionality during the platform’s development. The first was Javascript-heavy. As one of the most common Javascript libraries in use for web-site development at the time of the OP’s development, React.js (https://react.dev/) offered a broad platform for building interactive, user-controlled opacity for both images and text. Two things led us to an alternative path. React’s requirement for local compilation, coupled with its sometimes unpredictable attention to backward compatibility given its emphasis on the constantly changing world of mobile-app development, had the potential to create sustainability issues. That, along with its origins in a very profit-driven corporate Facebook setting, led us to emphasize CSS control rather than Javascript control of the opacity features. We instead adapted basic show/hide options already built into the non-profit Zurb Foundation 6 library (https://get.foundation/). This CSS library’s user-contribution-oriented development process aligns with our community-oriented goals; Zurb’s smaller contributor base leads to a slower code-update rate, making it more suitable for a project with few developers available to update code in response to new library releases; and the smaller library also meant we could load a local copy, frozen at a particular release date. These choices, in turn, provide a more predictable user experience in the preserved versions of the site, which are hosted not at Github but in other disciplinary and institutional repositories and in the Internet Archive. By keeping the site architecture simple, the published sites are more accessible both to end-users and to web-archiving tools that struggle to replicate more complex interaction.
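The triple-image arrangement described above maps naturally onto Foundation-style show/hide rules. In this hedged sketch (class names are hypothetical, not the OP’s actual ones), all three pre-edited versions are present in the HTML and CSS alone decides which one renders:

```css
/* All three pre-edited versions of an image are in the page markup;
   only the one matching the reader's chosen view is displayed. */
.img-original, .img-partial, .img-opaque { display: none; }

body.view-transparent .img-original { display: block; }
body.view-partial     .img-partial  { display: block; }
body.view-opaque      .img-opaque   { display: block; }
```

Because these rules are static CSS rather than script-driven DOM changes, preserved copies of a page behave the same way in web archives as on the live site.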

Purcell’s introduction of refusal theory also gave the team another opportunity to question our minimal-computing approach. Microsoft purchased Github in 2018, just as we were addressing the workload of updating those Omeka sites that had broken. Our choice to engage in a minimal-computing endeavor was thoughtful and anchored in careful consideration; our choice of Github Pages and Github’s front-end file-editing GUI was less well-theorized, and the OP offered us the opportunity to reconsider it. While we decided Github was the most timely and stable choice for hosting the dissertation, we also made offsetting decisions owing largely to the affordances of minimal-computing sites. The most crucial of these was to take a cue from the LOCKSS program (https://www.lockss.org/) in our choice of static-site generation through Jekyll. Static sites are more easily preserved in packaged form on multiple platforms, from IU’s institutional repository ScholarWorks to open-source repositories like Knowledge Commons. More importantly, static sites function with their intended behavior on the Internet Archive, which serves as a fully open-source public repository of general knowledge.11 The need to distribute many copies of completed archives in a variety of forms has also flagged a future need to explore alternatives to Github Pages, like GitLab, in order to provide a platform for the DigitalArc template and its automated static-site page building outside of Github’s environment.

Conclusion

While many of the choices we made were specifically oriented around the audiences for DA and OP, and the models in the digital-archive and digital-publication spaces that offered some but not all of the features we needed, the interplay between two platforms developed for different audiences on different models by the same team of scholar-developers has offered us some lessons that we are now integrating into our future work.

The first lesson we learned along the way is that contingency matters, and that the impulsive choices we made in response to serendipity and contingency can be a foundation for thoughtful, worthwhile change. If not for the PHP upgrade in Fall of 2018, we might not have repositioned our long-term platform-choice goals for DA in the context of minimal computing. That choice, in turn, shaped our choice of CSS-heavy redaction in the OP, a choice that has made our code more portable.

Our second takeaway is that it is hard to fully escape maximal computing. While Jekyll offers the option of building a website entirely on a personal computer, doing so carries an enormous amount of technical overhead. In order to re-scope the technical skills required of our community-archive audiences, we needed Github’s full infrastructure–its web-based editing system, Github Pages and Github Actions–which is itself maximal and monopolistic.12 While we will always need a maximalist web platform to support less technical content creators, both in the community-archive world and for self-publishing in the digital humanities, diversifying away from monopoly platforms–to Gitlab and Bitbucket in particular, in this instance–will also help us live up to the goals we set of static-site generation in service of long-term stability for both DA and OP users. Practically speaking, this compromise allowed users to author and access sites on their phones, and creators to minimally maintain sites over a long period of time, with no upgrades or technical skills necessary to keep a site accessible to public audiences.

The functionality we developed for DA and the OP may be imperfect and can be time-consuming. However, the code that drives that functionality can be operationalized as part of the ethical reconsideration of a scholar’s primary evidence. Whether that evidence comes from communities whose partnerships with institutions have been fraught or from subjects whose materials were included in archives without their consent, presenting our evidence in a variety of forms through code is a humanistic process, an endeavor that happens in context and should treat context and contingency as an opportunity to understand our relationship to technology, rather than as something to be erased.


Acknowledgements

We would like to thank Nate Howard, Sagar Prabhu, Jessica Organ, and Morgan Vickery for contributing to the web application development of DigitalArc and Opaque Publisher. We also want to thank our friends and colleagues who shaped this work, especially Emily Clark, Vanessa Elias, and Marisa Hicks-Alcaraz. We would also like to thank Élika Ortega, Roopika Risam, and Alex Gil, and especially members of the Minimal Computing Go:DH working group, for the various ways of framing minimal computing for the digital humanities and for publics more broadly, and for the inspiration and the paradoxes that have kept us on our toes. We are big fans of In the Library with the Lead Pipe’s open peer review process, and appreciate Quinn Dombrowski, Pamella Lach, and Jessica Schomberg for their feedback. We are grateful to have gotten to this (better) version of the article. Lastly, we would like to thank our funders for making this work possible: the New York Academy of Medicine; the Center for Research on Race, Ethnicity, and Society; and the American Council of Learned Societies’ (ACLS) Digital Justice grant program.


References 

Alpert-Abrams, Hannah, et al. (2019). “Post-Custodialism for the Collective Good: Examining Neoliberalism in US–Latin American Archival Partnerships.” Journal of Critical Library and Information Studies 2, no. 1.

Boyles, Christina, et al. (2011). “Postcustodial Praxis: Building Shared Context through Decolonial Archiving.” Scholarly Editing 39.

Ciula, Arianna, Øyvind Eide, Cristina Marras, and Patrick Sahle. (2018). “Models and Modelling between Digital and Humanities. Remarks from a Multidisciplinary Perspective.” Historical Social Research / Historische Sozialforschung 43, no. 4, https://www.jstor.org/stable/26544261.

Dombrowski, Q. (2022). “Minimizing Computing Maximizes Labor.” Digital Humanities Quarterly 16, no. 2, https://dhq.digitalhumanities.org/vol/16/2/000594/000594.html

Miya, Chelsea, and Geoffrey Rockwell. (2025). “Platitudes: The Carbon Weight of the Post-Platform Scholarly Web.” The Journal of Electronic Publishing 28, no. 2. https://doi.org/10.3998/jep.7247

Purcell, Sean, Kalani Craig, and Michelle Dalmau. (2025). “The Digital Opaque: Refusing the Biomedical Object,” In the Library with the Lead Pipe, https://www.inthelibrarywiththeleadpipe.org/2025/digital-opaque/.

Purcell, Sean. (2025). “The Tuberculosis Specimen: The Dying Body and Its Use in the War Against the ‘Great White Plague.’” Indiana University. https://tuberculosisspecimen.github.io/diss/.

Risam, Roopika. (2025). DH2025 Keynote – Digital Humanities for a World Unmade. https://roopikarisam.com/talks-cat/dh2025-keynote-digital-humanities-for-a-world-unmade/. DH2025, Lisbon.

Risam, Roopika, and Alex Gil. (2022). “Introduction: The Questions of Minimal Computing.” Digital Humanities Quarterly, vol. 16, no. 2, http://www.digitalhumanities.org/dhq/vol/16/2/000646/000646.html.

Russel, John E., and Merinda Kaye Hensley. (2017). “Beyond Buttonology: Digital Humanities, Digital Pedagogy and the ACRL Framework.” College & Research Libraries News (December 2017), 588-591, 600.

Sutton, Jazma, and Kalani Craig. (2022). “Reaping the Harvest: Descendant Archival Practice to Foster Sustainable Digital Archives for Rural Black Women.” Digital Humanities Quarterly, vol. 16, no. 3, https://dhq.digitalhumanities.org/vol/16/3/000640/000640.html.

Ton, Mary Borgo. (2019). “Shining Lights: Magic Lanterns and the Missionary Movement, 1839-1868.” https://scholarworks.iu.edu/dspace/handle/2022/26951.

Walsh, Brandon. (2024). “Maximalist Digital Humanities Pedagogy.” Walshbr.com (blog). https://walshbr.com/blog/maximalist-digital-humanities-pedagogy/

Wikle, Olivia, Evan Williamson and Devin Becker. (2020). “What is Static Web and What’s it Doing in the Digital Humanities Classroom?” In M. Brooks et al.(Eds), Literacies in a Digital Humanities Context: A dh+lib Special Issue  (pp. 14-18), https://doi.org/10.17613/ryea-4z10

Wingo, Rebecca, and M. R. Anderson. (2025). “A Sustainable Shared Authority: The Future of Rondo’s Past.” Public Humanities. https://doi.org/10.1017/pub.2025.29

Zenzaro, Simone. (2024). “Models for Digital Humanities Tools: Coping with Technological Changes and Obsolescence.” International Journal of Information Science & Technology, vol. 8, no. 2, http://dx.doi.org/10.57675/IMIST.PRSM/ijist-v8i2.283.

  1. Institutional dependencies can facilitate the creation and publication of a digital community archive, but they can also result in reduced, or perceived reductions in, community control over materials. For example, an institutional partnership might mean communities need to meet digital archiving standards that require costly equipment, whereas a community archive’s goal focuses on capturing community contributions (interviews, artifacts, etc.) in the best possible way, using easy-to-access and affordable mechanisms like a smartphone and a DIY lightbox. Another example is reliance on more advanced technological infrastructure offered by institutions. Rather than opt for a post-custodial approach in which an institution like a public library or local history center hosts the digital archive, community members can do so themselves (Alpert-Abrams et al. 2019; Boyles et al. 2011). ↩
  2. For an example of a more complex installation of Jekyll, which relies on the user installing a programming environment on their computer to compile their site, see Amanda Visconti, “Building a static website with Jekyll and GitHub Pages,” Programming Historian 5 (2016), https://doi.org/10.46430/phen0048. Note that the “difficulty” level for this tutorial is rated as “low”. ↩
  3. DigitalArc provides step-by-step documentation from planning a community archiving event to publishing a digital archive: https://digitalarcplatform.github.io/documentation/. The Opaque Publisher does the same: https://opaquepublisher.github.io/documentation/. ↩
  4. In order of development cycles, these are the earlier versions of DigitalArc: Identity Through Objects (https://iubhistoryharvest.github.io/), Remembering Freedom: Longtown and Greenville History Harvest (https://longtownhistory.github.io/), Homebound (https://homeboundatiu.github.io/), La Casa / La Comunidad (https://lacasaiu.github.io/). ↩
  5. To learn more about the ACLS-Funded DigitalArc project, visit: https://digitalarcplatform.github.io/. ↩
  6. “ArchIvory: An Interdisciplinary Research Project” (https://www.archivory.org/), On Display: A Twenty-First Century Salon Des Refusés (https://ondisplayattulane.github.io), and Kalani Craig’s Digital History Dossier for Tenure as Associate Professor of History (https://tenuredossier.kalanicraig.com/) ↩
  7. Purcell, Sean. 2025. “The Tuberculosis Specimen: The Dying Body and Its Use in the War Against the ‘Great White Plague.’” Indiana University. https://tuberculosisspecimen.github.io/diss/ ↩
  8. For an example of the custom code developed for the OP, see: https://tuberculosisspecimen.github.io/diss/dissertation/X_2_1 ↩
  9. At the time of writing this article, Mukurtu had still not released version 4, which would move away from Drupal 7 to Drupal 11. Currently, Mukurtu 4 is available as a stable beta: https://mukurtu.org/mukurtu-4/. ↩
  10. Flagging of text and images depended on a predefined ethical framework and happened at different phases of the research. For The Tuberculosis Specimen, opacity designations were decided based on different approaches to biomedical informed consent and subject privacy (https://tuberculosisspecimen.github.io/diss/dissertation/FAQ). Images were flagged as they were added to chapter drafts, and text was flagged during the project’s ‘ethics audit’–a moment prior to publication where researchers are invited to reflect on the processes they used and the materials they included, and to alter their final result to match the ethical frameworks they hoped to meet in the project (Purcell, Craig & Dalmau 2025). These sections were flagged in the project’s word processing program (Scrivener) using placeholder text, which was changed using a batch find-and-replace script for text files (https://tuberculosisspecimen.github.io/diss/dissertation/X_2_3). ↩
  11. The Wayback Machine is able to preserve Sean’s dissertation as-is: https://web.archive.org/web/20250516183042/https://tuberculosisspecimen.github.io/diss/. The same isn’t true for Mary Borgo Ton, whose born-digital dissertation preceded Sean’s at Indiana University. Mary had to combine several output and documentation approaches to preserve, as closely as possible, her dissertation, since the Wayback Machine was unable to preserve content produced by Scalar, which is a more complex PHP site. Instead, parts of Mary’s dissertation were preserved via Indiana University’s institutional repository: https://scholarworks.iu.edu/dspace/handle/2022/26951. ↩
  12. For a broader view of how Github’s quasi-monopoly shapes student experiences in higher-education classrooms, see https://ploum.net/2026-01-05-unteaching_github.html; for Github’s relationship to Microsoft and even more monopolistic technology platforming, see https://medium.com/asecuritysite-when-bob-met-alice/as-github-glitches-are-we-too-dependent-on-microsoft-01d9c2f67329 ↩

Enclosure / Ed Summers

It was interesting to see this short article 1 about the enclosure of the web commons go by after just having listened to The Dig’s epic two-part interview with Peter Linebaugh 2.

What’s needed is a multi-stakeholder settlement in which large-scale users of the commons take on long-term, structured obligations to sustain it: contractual funding through paid APIs and usage-based levies, formal recognition of DPGs as Digital Public Infrastructure to unlock multilateral co-financing, and a shift in philanthropy from one-off project grants to sustained core support for the institutions that maintain the commons.

I hadn’t realized that the details of the deals that Wikipedia is striking aren’t fully transparent or well understood outside of closed doors. I think it’s really instructive to think about what is happening right now on the web as enclosure, and part of a longer history of capitalism (as Linebaugh and Denvir talk about). The interview made me think of the craft, tooling, and means of production that are still present in the software industry, but that are actively being enclosed by the centralization of tooling and skill, craft and knowledge itself.

Yes, I’m talking about LLMs here. Once you see it, it’s impossible not to see it.

This all makes me think of Elinor Ostrom’s design principles and how important it is that the Wikipedia community have insight into how their commons is being used: through monitoring, decision making, and resolving the future conflicts that will no doubt ensue.

I’m not entirely sure I understand the potential role of the Digital Public Goods Alliance in all this:

Digital Public Goods (DPG) are supposed to be shielded from precisely this kind of capture. They require financing models commensurate with their public value, not models that make them fiscally dependent on their most extractive users. When the sustainability of a DPG hinges on a small oligopoly of AI firms, the risk turns political: agenda-setting and governance drift toward those who can threaten to walk away.

Perhaps Wikipedia is at risk of losing its DPG status? In what practical ways does identification as a DPG help shape governance? What is being done, or what can we be doing, to push back on this enclosure? And of course, the situation is quite a bit bigger when you consider the strain that LLM-hungry bots are putting on cultural heritage organizations, also part of a larger commons.


  1. Mining the commons: AI extraction, Wikipedia, and the case for a multi-stakeholder settlement, Internet Policy Review.↩︎

  2. The Commons w/ Peter Linebaugh, The Dig.↩︎

Weekly Bookmarks / Ed Summers

These are some things I’ve wandered across on the web this week.

🔖 Learning How to Learn: Abduction as the ‘Missing Link’ in Machine Learning

In this paper, the question of machine learning is revisited in order to explore whether Bayesian learning, as a form of abductive reasoning, can provide an alternative to the current dichotomy between inductive and deductive approaches in machine learning debates. The paper will further demonstrate that machine learning invariably entails a degree of situatedness, as evidenced by the example of Bayesian belief networks, which arguably rely on abductive reasoning. In this manner, the discourse surrounding Bayesian learning models has the capacity to elucidate the aspects that are often left implicit in contemporary machine learning debates and methodologies.

🔖 Closing the verification loop: Observability-driven harnesses for building with agents

Our approach is harness-first engineering: instead of reading every line of agent-generated code, invest in automated checks that can tell us with high confidence, in seconds, whether the code is correct. The agent generates code, the harness verifies it, production telemetry validates it, and if something is wrong, the feedback updates the harness and the agent tries again. The specific methods to develop harnesses vary in rigor—deterministic simulation testing, formal specifications, shadow evaluation, observability-driven feedback loops—but the principle remains the same: make the verification fast and automatic, and let the harness do the work that human review cannot scale to do.

🔖 Achieving Efficient Version Control of JSON with Prolly Trees

Dolt uses Prolly Trees because they give us two very important properties: history independence and structural sharing. These are both incredibly valuable properties for a distributed database. Structural sharing in particular means that two tables that differ only slightly can re-use storage space for the parts that are the same. Most SQL engines obtain structural sharing for tables by using B-trees or a similar data structure… but that doesn’t extend easily to JSON documents. Some tools like Git and IPFS achieve structural sharing for directories by using a tree structure that mirrors the directory… but that creates a level of indirection for each layer of the document, which would slow down queries if the document had too many nested layers. Something else was needed.

🔖 The Purpose of Protocols

On this account, protocols are governance structures whose design choices allocate power, and the purpose of the entire enterprise is the protection of rights. Protocol design is a form of political design, and the appropriate way to evaluate protocols is not only by their technical properties but by the governance outcomes they produce.

🔖 Cartography of generative AI

The popularisation of artificial intelligence (AI) has given rise to imaginaries that invite alienation and mystification. At a time when these technologies seem to be consolidating, it is pertinent to map their connections with human activities and more than human territories. What set of extractions, agencies and resources allow us to converse online with a text-generating tool or to obtain images in a matter of seconds?

🔖 All Data Are Local: Thinking Critically in a Data-Driven Society

How to analyze data settings rather than data sets, acknowledging the meaning-making power of the local.

In our data-driven society, it is too easy to assume the transparency of data. Instead, Yanni Loukissas argues in All Data Are Local, we should approach data sets with an awareness that data are created by humans and their dutiful machines, at a time, in a place, with the instruments at hand, for audiences that are conditioned to receive them. The term data set implies something discrete, complete, and portable, but it is none of those things. Examining a series of data sources important for understanding the state of public life in the United States—Harvard’s Arnold Arboretum, the Digital Public Library of America, UCLA’s Television News Archive, and the real estate marketplace Zillow—Loukissas shows us how to analyze data settings rather than data sets.

🔖 Water of the Sky A Dictionary of 2,000 Japanese Rain Words

A breathtakingly elegant visual dictionary of 2000 Japanese words for rain, with 100 drawings in indigo.

In Water of the Sky, artist Miya Ando offers us a beautifully rich, bilingual visual dictionary for rain. Through a collection of 2,000 Japanese words, their English interpretations, and 100 drawings, Ando describes the breadth and diversity of rain’s many expressions: when it falls, how it falls, and how its observer might be transformed physically or emotionally by its presence. The words range from prosaic to esoteric, extending from the meteorological (mukaame, or “very fine rain that falls in spring”) to the mystical (bunryūu, or “rain that splits a dragon’s body in half”) and from the minute (kisame, or “raindrops that fall off the leaves and branches of trees”) to the vast (takuu, or “blessed rain that quenches all things in the universe”).

🔖 a collection of tiny llms with usecases

Why Small LLMs Matter

The AI industry defaults to “bigger is better” - GPT-4, Claude Opus, Llama 70B. But for most production workloads, 80% of LLM calls don’t need a 100B+ parameter model. They need a function routed, a tool selected, a query classified, or a simple response generated.

Small LLMs (under 4B parameters) solve this by running locally, for free, in milliseconds.

🔖 susam / wander

Wander is a small, decentralised, self-hosted web console that lets your visitors explore random pages from a community of personal websites

🔖 Visual Introduction to PyTorch

PyTorch is currently one of the most popular deep learning frameworks. It is an open-source library built upon the Torch Library.

Most tutorials assume you’re comfortable jumping straight into code. I made a visual introduction that walks through the core concepts step by step, with animations and diagrams instead of walls of text

🔖 TigerFS

A filesystem backed by PostgreSQL, and a filesystem interface to PostgreSQL. TigerFS mounts a database as a directory. Every file is a real row. Writes are transactions. Multiple agents and humans can read and write concurrently with full ACID guarantees, locally or across machines. Any tool that works with files works out of the box.

🔖 Pointing at Clouds: Indexing, Searching, and Citing in an Age of AI Smog

I wanted to show how an index is not just a bibliographic convention, or an organizational method, or a media form, or a financial instrument, or a corporeal component; it’s also an intellectual architecture, a literary genre, a creative form, a semiotic concept, and an embodiment of agency — one that might offer an important antidote to the pervasive autonomous, extractive cloudification of our contemporary information ecology.

🔖 Brandolini’s law

Brandolini’s law (or the bullshit asymmetry principle) is an Internet adage coined in 2013 by Italian programmer Alberto Brandolini. It compares the considerable effort of debunking misinformation to the relative ease of creating it in the first place. The adage states:

The amount of energy needed to refute bullshit is an order of magnitude bigger than that needed to produce it.[1][2]

The challenge of refuting bullshit does not come just from its time-consuming nature, but also from the challenge of defying and confronting one’s community.

🔖 An AI Agent Published a Hit Piece on Me

Summary: An AI agent of unknown ownership autonomously wrote and published a personalized hit piece about me after I rejected its code, attempting to damage my reputation and shame me into accepting its changes into a mainstream python library. This represents a first-of-its-kind case study of misaligned AI behavior in the wild, and raises serious concerns about currently deployed AI agents executing blackmail threats.

🔖 E. P. Thompson

Edward Palmer Thompson (3 February 1924 – 28 August 1993) was an English historian, writer, socialist and peace campaigner. He is best known for his historical work on the radical movements in the late 18th and early 19th centuries, in particular The Making of the English Working Class (1963).

🔖 OpenDataLoader PDF

PDF parser for AI data extraction — Extract Markdown, JSON (with bounding boxes), and HTML from any PDF. #1 in benchmarks (0.90 overall). Deterministic local mode + AI hybrid mode for complex pages.

🔖 Vips / Image / dzsave

Save an image as a set of tiles at various resolutions. By default dzsave uses DeepZoom layout — use layout to pick other conventions.

🔖 Every layer of review makes you 10x slower

It’s funny, everyone has been predicting the Singularity for decades now. The premise is we build systems that are so smart that they themselves can build the next system that is even smarter, that builds the next smarter one, and so on, and once we get that started, if they keep getting smarter faster enough, then the incremental time (t) to achieve a unit (u) of improvement goes to zero, so (u/t) goes to infinity and foom.

Anyway, I have never believed in this theory for the simple reason we outlined above: the majority of time needed to get anything done is not actually the time doing it. It’s wall clock time. Waiting. Latency.

And you can’t overcome latency with brute force.

I know you want to. I know many of you now work at companies where the business model kinda depends on doing exactly that.

Sorry.

But you can’t just not review things!

🔖 Can I Run AI locally?

CanIRun.ai runs entirely in your browser. When you visit the site, we use browser APIs to detect your GPU, CPU, and memory — then we calculate which AI models can run on your hardware and how fast. No data is sent to any server. Everything is computed client-side.

🔖 iiif-tiles/tile_iiif.py

Generate IIIF Level 0 static tiles from images in a HF Bucket. Downloads source images from a bucket, generates IIIF Image API 3.0 tiles using libvips, creates a IIIF Presentation v3 manifest, and syncs everything to an output bucket for static serving via HF CDN.

🔖 SlowLLM

SLOW LLM is a browser extension that makes LLMs appear to run very slowly. It works with ChatGPT and Claude.

🔖 Bibliothèques et agents IA : le risque de l’invisibilisation

The year 2026 will be the year of AI agents… It was announced, and indeed since the start of the year we have been watching the spread and rise of two broad families of agentic tools of a new type: on the one hand, coding-oriented assistants such as Claude Code, Codex, Gemini CLI, Opencode, etc., and on the other, frameworks for creating, configuring, and orchestrating agents that automate workflows through communication channels (Slack, Discord, messaging apps…) such as OpenClaw and its many derivatives.

🔖 Toi Derricotte

Toi Derricotte (pronounced DARE-ah-cot) (born April 12, 1941) is an American poet. She is the author of six poetry collections and a literary memoir. She has won numerous literary awards, including the 2020 Frost Medal for distinguished lifetime achievement in poetry awarded by the Poetry Society of America, and the 2021 Wallace Stevens Award, sponsored by the Academy of American Poets. From 2012–2017, Derricotte served as a Chancellor of the Academy of American Poets. She is currently a professor emerita in writing at the University of Pittsburgh. Derricotte is a member of The Wintergreen Women Writers Collective.[2]

🔖 Degrowth and socialist comrades, what should we be doing?

This is an attempt to clarify the discussion of degrowth strategy, a topic on which I think there is considerable confusion and many mistaken approaches. Debate has recently been fuelled by Jason Hickel’s argument for a socialist position on both the goal of degrowth and the means to it. Liegey, Nelson and Leahy replied against Jason, defending the wide and diverse range of strategies now characteristic of the movement and often referred to by the terms “Horizontalism” and “Pluriverse”. Several others have contributed to the discussion, including Jason’s reply to Liegey et al., his subsequent response, my critique of the Liegey, Nelson and Leahy article, Gasparo and Vico, Gregoletto and Burton, Bunea, and Kallis and D’Alisa.

🔖 Category Theory for the Working Programmer - 1.0 - Prologue

In this series we will explain ideas in Category theory from first principles in order to build intuition and derive the actual formal definitions. We’ll use that foundation to demonstrate exactly where these concepts fit into day to day functional programming and how you can do useful things with that knowledge.

Watch this if you want an introduction to category theory that is simple, practical, joyful, and deeply grounded in functional programming

🔖 Sentimental Value

Sentimental Value (Norwegian: Affeksjonsverdi) is a 2025 Norwegian drama film directed by Joachim Trier, who co-wrote it with Eskil Vogt. It follows sisters Nora (Renate Reinsve) and Agnes (Inga Ibsdotter Lilleaas) in their reunion with their estranged father Gustav (Stellan Skarsgård). It also stars Elle Fanning.

🔖 On the Silver Globe

On the Silver Globe (Polish: Na srebrnym globie) is a 1988 Polish epic surrealist science fiction arthouse film[1] written and directed by Andrzej Żuławski, adapted from The Lunar Trilogy by his grand-uncle, Jerzy Żuławski. Starring Andrzej Seweryn, Jerzy Trela, Iwona Bielska, Jan Frycz, Henryk Bista, Grażyna Deląg and Krystyna Janda, the plot follows a team of astronauts who land on an uninhabited planet and form a society. Many years later, a single astronaut is sent to the planet and becomes a messiah.

Production took place from 1976 to 1977, but was interrupted by the Polish authorities. The budget is estimated to be at least PLN 58 million.[2] Many years later, Żuławski was able to finish his film, although not as originally intended. On the Silver Globe premiered at the 1988 Cannes Film Festival, and has received consistent critical acclaim.

Metastablecoin Fragmentation / David Rosenthal

A fundamental problem for decentralized systems like permissionless blockchains is that their security depends upon the cost of an attack being greater than the potential reward from it. Various techniques are used to impose these costs, generally either Proof-of-Work (PoW) or Proof-of-Stake (PoS). These costs have implications for the economics (or tokenomics) of such systems, for example that their security is linear in cost, whereas centralized systems can use techniques such as encryption to achieve security exponential in cost.

Shin Figure 3
Now, via Toby Nangle's Stablecoin = Fracturedcoin we find Tokenomics and blockchain fragmentation by Hyun Song Shin, whose basic point is that these costs must be borne by the users of the system. For cryptocurrencies, this means through transaction fees, inflation of the currency, or both. The tradeoff between cost and security means that there is a market for competing blockchains making different tradeoffs. In practice we see a vast number of competing blockchains:
Tether’s USDT sits on 107 different ledgers. ... USDC sits on 125.
The chart shows Ethereum losing market share against competing blockchains.

Shin's analysis uses game theory to explain why this fragmentation is an inevitable result of tokenomics. Below the fold I go into the background and the details of Shin's explanation.

Background

In 2018's Cryptocurrencies Have Limits I discussed Eric Budish's The Economic Limits Of Bitcoin And The Blockchain, an important analysis of the economics of two kinds of "51% attack" on Bitcoin and other cryptocurrencies based on PoW blockchains. Among other things, Budish shows that, for safety, the value of transactions in a block must be low relative to the fees in the block plus the reward for mining the block.

In 2019's The Economics Of Bitcoin Transactions I discussed Raphael Auer's Beyond the doomsday economics of “proof-of-work” in cryptocurrencies, in which Auer shows that:
proof-of-work can only achieve payment security if mining income is high, but the transaction market cannot generate an adequate level of income. ... the economic design of the transaction market fails to generate high enough fees.
Source
Bitcoin's costs are defrayed almost entirely by inflating the currency, as shown in this chart of the last year's income for miners. Notice that the fees are barely visible.

It has been known for at least a decade that Bitcoin's plan to phase out the inflation of the currency was problematic. In 2024's Fee-Only Bitcoin I wrote:
In 2016 Arvind Narayanan's group at Princeton published a related instability in Carlsten et al's On the instability of bitcoin without the block reward. Narayanan summarized the paper in a blog post:
Our key insight is that with only transaction fees, the variance of the miner reward is very high due to the randomness of the block arrival time, and it becomes attractive to fork a “wealthy” block to “steal” the rewards therein.
So Bitcoin's security depends upon the "price" rising enough to counteract the four-yearly halvings of the block reward. In that post I made a thought-experiment:
As I write the average fee per transaction is $3.21 while the average cost (reward plus fee) is $65.72, so transactions are 95% subsidized by inflating the currency. Over time, miners reap about 1.5% of the transaction volume. The miners' daily income is around $30M, below average. This is about 2.5E-5 of BTC's "market cap".

Let’s assume, optimistically, that this below-average daily fraction of the “market cap” is sufficient to deter attacks and examine what might happen in 2036 after 3 more halvings. The block reward will be 0.39BTC. Let’s work in 2024 dollars and assume that the BTC “price” exceeds inflation by 3.5%, so in 12 years BTC will be around $98.2K.

To maintain deterrence, miners’ daily income will need to be about $50M. Each day there will be about 144 blocks generating 56.16BTC or about $5.5M, which is 11% of the required miners’ income. Instead of 5% of the income, fees will need to cover 89% of it. The daily fees will need to be $44.5M. Bitcoin’s blockchain averages around 500K transactions/day, so the average transaction fee will need to be around $90, or around 30 times the current fee.
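The arithmetic of this projection can be re-run in a few lines of Python. This is my own sketch, not Rosenthal's code; the 2024 base price of roughly $65K is back-calculated from his assumption that 3.5% annual real growth reaches about $98.2K after 12 years, and the other figures come from the post.

```python
# Re-running the 2036 thought-experiment with the post's figures.
halvings = 3
reward_2024 = 3.125                        # BTC per block after the 2024 halving
reward_2036 = reward_2024 / 2**halvings    # 0.390625 BTC, the "0.39" in the text
price_2036 = 65_000 * 1.035**12            # ~$98.2K in 2024 dollars (assumed base)

blocks_per_day = 144
subsidy_per_day = blocks_per_day * reward_2036 * price_2036   # ~$5.5M/day

required_income = 50e6                     # deterrence target from the post, $/day
fees_needed = required_income - subsidy_per_day               # ~$44.5M/day

txs_per_day = 500_000
fee_per_tx = fees_needed / txs_per_day                        # ~$90 per transaction
print(f"subsidy ≈ ${subsidy_per_day/1e6:.1f}M/day, fee ≈ ${fee_per_tx:.0f}/tx")
```

The point of the exercise survives any reasonable tweak to the inputs: once the block subsidy shrinks by three more halvings, fees must carry nearly all of the security budget.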
Average fee/transaction
Bitcoin users set the fee they pay for their transaction. In effect they are bidding in a blind auction for the limited supply of transaction slots. Miners are motivated to include high-fee transactions in their next block. If there were an infinite supply of transaction slots, miners’ fee income would be zero. In practice, much of the time the supply of slots exceeds demand and fees are low. At times when everyone wants to transact, such as when the “price” crashes, the average fee spikes enormously.

There was thus a need for a consensus mechanism that did not depend upon inflation. In 2020's Economic Limits Of Proof-of-Stake Blockchains I discussed a post entitled More (or less) economic limits of the blockchain by Joshua Gans and Neil Gandal in which they summarize their paper with the same title. The importance of this paper is that it extends the economic analysis of Budish to PoS blockchains. Their abstract reads:
Cryptocurrencies such as Bitcoin rely on a ‘proof of work’ scheme to allow nodes in the network to ‘agree’ to append a block of transactions to the blockchain, but this scheme requires real resources (a cost) from the node. This column examines an alternative consensus mechanism in the form of proof-of-stake protocols. It finds that an economically sustainable network will involve the same cost, regardless of whether it is proof of work or proof of stake. It also suggests that permissioned networks will not be able to economise on costs relative to permissionless networks.
Source
In 2022 Ethereum switched from Proof-of-Work to Proof-of-Stake, reducing its energy consumption by around 99%. This chart shows that, like Bitcoin, until the "Merge" the costs were largely defrayed by inflating the currency. After the "Merge" the blockchain has been running on transaction fees.

Shin's Analysis

Here is a summary of Shin's analysis.

Notation

  • There is a continuum of validators i.
  • For validator i ∈ [0;1], the cost of contributing to governance is ci > 0.
  • The blockchain needs at least a fraction κ̂ of the validators contributing to be secure. Shin writes:
    There are two special cases of note: κ̂ = 1 (unanimity, corresponding to full decentralisation where every validator must participate for the blockchain to function) and κ̂ = 0 which corresponds to full centralisation, where one validator has authority to update the ledger.
    κ̂ = 1 is impractical, lacking fault tolerance. κ̂ = 0 is much more practical; it is the traditional trusted intermediary.
  • If the blockchain is secure, each contributing validator earns a reward p > 0. A non-contributing validator earns zero.
  • The validators share a common cost threshold c*. If ci < c*, validator i contributes; if ci > c*, validator i does not.

Argument

Each validator will want to contribute only if enough other validators contribute to reach the threshold κ̂, which poses a coordination problem. The case of particular interest is the validator with ci = c*. Shin writes:
Intuitively, even though the marginal validator may have very precise information about the common cost c*, the validator faces irreducible uncertainty about how many other validators will choose to contribute. It is this strategic uncertainty — uncertainty about others' actions — that is the central feature of the coordination problem.
This "strategic uncertainty" is similar to the attacker's uncertainty about other peers' actions that is at the heart of the defenses of the LOCKSS system in our 2003 paper Preserving peer replicas by rate-limited sampled voting.

Shin Figure 6
Because the marginal validator’s ci = c*, the decision whether or not to contribute makes no difference. Shin’s Figure 6 explains this graphically. Rectangle A is the loss if k < κ̂ and rectangle B is the gain if k > κ̂. Setting them equal gives:
c* κ̂ = (p - c*)(1 - κ̂)
which simplifies to:
c* = p(1 - κ̂)
Shin and Morris earlier showed that this is the unique equilibrium no matter what strategy the validators use.
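The equilibrium is easy to sanity-check numerically. The sketch below is my own (the reward p is an illustrative value, not one from the paper); it confirms that the threshold c* = p(1 − κ̂) leaves the marginal validator exactly indifferent between the expected loss when too few others contribute and the expected gain when enough do.

```python
# Check the marginal validator's indifference condition for several
# decentralisation thresholds kappa (Shin's κ̂). Under the marginal
# validator's uniform belief about how many others contribute, the
# chain fails with probability kappa and succeeds with 1 - kappa.
p = 10.0                                   # illustrative reward
for kappa in (0.0, 0.5, 0.9, 0.99):
    c_star = p * (1 - kappa)               # closed-form equilibrium threshold
    loss = c_star * kappa                  # cost paid, chain fails anyway
    gain = (p - c_star) * (1 - kappa)      # net payoff when the chain succeeds
    assert abs(loss - gain) < 1e-9         # the two rectangles have equal area
    print(f"kappa={kappa:.2f}  c*={c_star:.2f}")
```

Inverting the same formula shows why the required reward p = c / (1 − κ̂) explodes as κ̂ approaches 1.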

Result

What this means is that successful validation depends upon the reward p being large enough so that:
p ≥ c / (1 − κ̂)
Shin writes:
Note that the required reward p explodes as κ̂ → 1. This is the central result of the paper: the more decentralised the blockchain (the higher the supermajority threshold κ̂), the higher must be the rents that accrue to validators. In the limiting case of unanimity (κ̂ = 1), no finite reward can sustain the coordination equilibrium.
Shin Figure 1
This is yet another result showing that a reasonably secure blockchain is unreasonably expensive. The complication is that, much of the time, transactions are cheap because the demand for them is low. Thus most of the time validators are not earning enough for the risks they run. But:
When many users want to transact at the same time, they bid against each other for limited block space, and fees spike — much as taxi fares surge during rush hour. Figure 1 shows how Ethereum gas fees exhibited sharp spikes during periods of network congestion, such as during surges in decentralised finance (DeFi) activity or spikes in the minting of non-fungible tokens (NFTs). These spikes are not merely a reflection of excess demand; they are the mechanism through which the blockchain extracts the rents needed to sustain validator coordination.
Note that these spikes mean that the majority of the time fees are low but the majority of transactions face high fees. It is this "user experience" that drives the fragmentation that Shin describes:
When demand for block space is high, fees rise and validators are well compensated. But high fees deter users, especially those making small or routine transactions. These users are the first to migrate to competing blockchains that offer lower fees — blockchains that can offer lower fees precisely because they have lower coordination thresholds (and hence less security). The users who remain on the more secure blockchain are those with the highest willingness to pay: institutions, large DeFi protocols, and transactions where security and censorship resistance are paramount. This sorting of users across blockchains is the essence of fragmentation.
Shin notes that:
The fragmentation argument is the flipside of blockchain's "scalability trilemma," as described by Vitalik Buterin, who posed the problem as the impossibility of attaining, simultaneously, a ledger that is decentralised, secure, and scalable.
Source
It is worth noting that Buterin's trilemma is a version for PoS of the trilemma Markus K Brunnermeier and Joseph Abadi introduced for PoW in 2018's The economics of blockchains. See The Blockchain Trilemma for details.

Shin's focus is primarily on the effects of fragmentation on stablecoins. He notes that:
Rather than converging on a single platform, stablecoin activity is scattered across many chains (Figure 4). As of late 2025, Ethereum held the majority of total stablecoin supply but was facing competition from Tron and Solana, each of which had attracted tens of billions of dollars in stablecoin balances. Each chain serves different geographies and use cases: Ethereum for institutional settlement, Tron for low-cost remittances, Solana for retail payments and DeFi activity.
This fragmentation among blockchains would not matter much if stablecoins were interoperable between them, but they are confined to the blockchain on which they were minted:
A USDC token on Ethereum is not the same as a USDC token on Solana — they exist on separate ledgers that have no native way of communicating with each other. Transferring between chains requires the use of bridges: specialised software protocols that lock tokens on one chain and issue equivalent tokens on another. These bridges introduce additional risks, including vulnerabilities in the smart contract code — bridge exploits have accounted for billions of dollars in cumulative losses — and they impose costs and delays that undermine the seamless transferability that is the hallmark of money. The result is a landscape in which stablecoins from the same issuer exist in multiple, non-fungible forms across different blockchains, fragmenting liquidity and undercutting the network effects that should be the strength of a widely adopted payment instrument.

Discussion

As I've been pointing out since 2014, very powerful economic forces mean that Decentralized Systems Aren't. So the users paying for the more expensive transactions because they believe in decentralization aren't getting what they pay for.

Source
As I wrote in 2024's It Was Ten Years Ago Today:
The insight applies to Proof Of Stake networks at two levels:
  • Block production: over the last month almost half of all blocks have been produced by beaverbuild.
  • Staking: Yueqi Yang noted that:
    Coinbase Global Inc. is already the second-largest validator ... controlling about 14% of staked Ether. The top provider, Lido, controls 31.7% of the staked tokens,
    That is 45.7% of the total staked controlled by the top two.
Source
In addition all these networks lack software diversity. For example, as I write the top two Ethereum consensus clients have nearly 70% market share, and the top two execution clients have 82% market share.
Shin writes as if more decentralization equals more security even though it doesn't happen in practice, but this isn't really a problem. What the users paying the higher fees want is more security, and they are probably getting it, because they are paying higher fees. As I discussed in Sabotaging Bitcoin, the reason major blockchains like Bitcoin and Ethereum don't get attacked is not because the (short-term) rewards for an attack are less than the cost. It is rather that everyone capable of mounting an attack is making so much money that:
those who could kill the golden goose don't want to.
Shin Figure 3
In any case what matters for Shin's analysis isn't that the users actually get more security for higher fees, but that they believe they do. Like so much in the cryptocurrency world, what matters is gaslighting. But what the chart showing Ethereum losing market share shows is that security is not a concern for a typical user.

mkiiif, yet another static IIIF generator / Raffaele Messuti

I revisited an old Go package I've been using over the past few years to build IIIF manifests — nothing fancy, just some glue around structs and JSON. From that I built a new CLI, mkiiif, to generate IIIF manifests from static images (tiled or not). There are plenty of similar tools out there (iiif-tiler, tile-iiif, biiif, ...) but none quite matched the CLI ergonomics I needed for my daily workflow.

I moved the library to this new repository atomotic/iiif. The tool mkiiif can be installed by downloading a binary release or with Go:

go install github.com/atomotic/iiif/cmd/mkiiif@latest

mkiiif can generate an IIIF manifest from a source directory containing images, or from a PDF file that gets exploded and converted to images via mupdf. Output images can be either untiled or static tiles generated with vips. Both approaches produce a IIIF Level 0 compliant layout, static files that can be served from any HTTP server, with no image server required. Untiled is less efficient for large images but perfectly fine for printed books, papers, and similar material.

mupdf and vips are external dependencies that need to be installed separately. They are invoked via subprocess; I chose not to add Go library wrappers around them, to keep the tool simple. WASM ports of both may become viable in the future.
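mkiiif itself is written in Go, but the subprocess pattern is easy to illustrate. The sketch below is not mkiiif's actual invocation; the command lines are assumptions based on general mutool and vips usage (mutool draw renders PDF pages at a given DPI, and vips dzsave supports a static IIIF tile layout via --layout iiif).

```python
import subprocess

def mutool_cmd(pdf_path: str, out_dir: str, dpi: int = 150) -> list[str]:
    # Render each PDF page to a numbered PNG at the given DPI
    # (mutool substitutes the page number for %d in the output pattern).
    return ["mutool", "draw", "-r", str(dpi), "-o", f"{out_dir}/page-%d.png", pdf_path]

def vips_tiles_cmd(image_path: str, out_dir: str) -> list[str]:
    # Emit a static IIIF tile layout with vips dzsave.
    return ["vips", "dzsave", image_path, out_dir, "--layout", "iiif"]

def run(cmd: list[str]) -> None:
    # check=True raises if the external tool exits non-zero.
    subprocess.run(cmd, check=True)
```

Keeping the tools behind `subprocess.run` rather than library bindings is the same trade-off the post describes: simpler builds at the cost of requiring the binaries on `$PATH`.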

The CLI usage:

Usage: mkiiif -id <id> -base <url> -title <title> -source <dir|pdf> -destination <dir> [-tiles]
  -base string
        Base URL where the manifest will be served (e.g. https://example.org/iiif)
  -destination string
        Output directory; a subdirectory named <id> will be created inside it, containing the images and manifest.json
  -id string
        Unique identifier for the manifest (e.g. book1)
  -resolution int
        Resolution (DPI) used when converting PDF pages to images via mutool (default 150)
  -source string
        Path to a directory of images or a PDF file to convert
  -tiles
        Generate IIIF image tiles for each image using vips dzsave (requires vips)
  -title string
        Human-readable title of the manifest

Example:

~ mkiiif -base https://digital.library.org -destination ./public -id iiif01 -source ~/book.pdf -title "iiif 01"

Or with tiling:

~ mkiiif -base https://digital.library.org -destination ./public -id iiif01 -source ~/book.pdf -title "iiif 01" -tiles

Inside ./public, the first command produces the untiled structure below, while the -tiles variant produces the tiled structure that follows:

└── iiif01
    ├── index.html
    ├── manifest.json
    ├── page-001.png
    ├── page-002.png
    ├── page-....png
    └── page-....png
└── iiif01
    ├── index.html
    ├── manifest.json
    ├── page-001
    │   ├── 0,0,1024,1024
    │   │   └── 512,512
    │   │       └── 0
    │   │           └── default.jpg
...
    │   ├── full
    │   │   ├── 362,501
    │   │   │   └── 0
    │   │   │       └── default.jpg
    │   │   └── max
    │   │       └── 0
    │   │           └── default.jpg
    │   └── info.json
...

The directory can then be served from https://digital.library.org.

I've adopted this URL scheme:

https://{base}/{id}
    /manifest.json — the IIIF manifest
    /index.html    — a simple viewer

So in the example above, https://digital.library.org/iiif01 opens a full viewer to browse the object. The viewer used is Triiiceratops, the newest viewer in the IIIF ecosystem. Built on Svelte and OpenSeadragon, it is still young but very usable, lightweight, and easy to embed and customize. It is my favourite viewer.

mkiiif doesn't handle metadata for now (and probably won't) — the manifest can be easily patched to insert descriptive metadata in a later step, after image preparation, pulling from any existing datasource or metadata catalog.
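Patching the manifest afterwards can be a few lines of JSON manipulation. A minimal sketch (the field values here are hypothetical; the shapes follow the IIIF Presentation API 3 convention of language maps like {"en": ["..."]}):

```python
import json

def patch_manifest(path: str, label: str, metadata: dict) -> None:
    """Insert a label and descriptive metadata into a generated manifest.json."""
    with open(path) as f:
        manifest = json.load(f)
    # IIIF Presentation 3 expresses strings as language maps.
    manifest["label"] = {"en": [label]}
    manifest["metadata"] = [
        {"label": {"en": [k]}, "value": {"en": [v]}}
        for k, v in metadata.items()
    ]
    with open(path, "w") as f:
        json.dump(manifest, f, indent=2)
```

The values could just as well be pulled from an existing catalog record in the later step the post describes.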

Here is a full working example: https://docuver.se/iiif/p3tgsk8jqt/

A few open questions I haven't fully resolved:

  • The main drawback of generating IIIF this way is that you end up managing a large number of files on the filesystem, and handling millions of small image tiles can be slow (and costly). This is where IIIF intersects — and overlaps — with similar practices in digital preservation, such as BagIt, OCFL, and WARC/WACZ. So far there's no specification or viewer implementation that handles IIIF containers (e.g. a zip file bundling images, tiles, and the manifest). Discussions on this have been ongoing in the past; I've recently been looking at analogous approaches like GeoTIFF and SZI.
  • A static IIIF bundle generated with this CLI still needs to be served from an HTTP server, with the base URL defined at derivation time. Could such a bundle be opened from localhost and viewed directly in the browser? Service Workers might help here (even if HTTP is still needed), but it's a rabbit hole I haven't explored yet.

The CLI is pretty bare-bones — feel free to suggest improvements or report bugs. I've been using it over the past weeks as part of a personal project: an amateur digital library built around a DIY book scanner I assembled at home, to preserve magazines, zines, and similar material (content NSFW and out of scope to link here).

2026-03-18: A Glimpse into How AI Tools Can Enhance the Way We Study Web Archive Content: Challenges and Opportunities / Web Science and Digital Libraries (WS-DL) Group at Old Dominion University

Artificial intelligence (AI) has transformed nearly every field. Today, we can access and train models that generate text, images, sound, video, and code. This transformation is reshaping how we think, analyze, and preserve information. Yet, despite the rapid growth of AI, its use for analyzing web archive content seems to advance at a slower pace. 

Web archiving is the process of collecting, preserving, and providing access to web content over time, where a memento represents a previous version of a web resource as it existed at a specific moment in the past. Much of the recent work within the web archiving community (e.g., [1], [2], [3]) has focused on making the archiving process itself more intelligent, integrating AI into tasks such as web crawling, storage optimization, and metadata generation. In contrast, the application of AI to the analysis of already archived web content has received comparatively less attention. This gap represents a great opportunity for innovation and contribution, particularly as web archives continue to grow in size, diversity, and historical importance.

In this blog, I aim to outline (based on my perspective, analysis, preliminary work, and insights gained during my PhD candidacy exam) opportunities for where AI could play a role, as well as key challenges involved in integrating AI into web archiving.

My Preliminary Work 

Since I joined the PhD program at ODU in 2023 (Blog post introducing myself) under the supervision of Dr. Michele C. Weigle, my work has focused on the intersection of web archiving and AI, with a particular emphasis on leveraging Large Language Models (LLMs) through Retrieval-Augmented Generation (RAG) to detect and interpret text changes across mementos. Identifying the exact moment when content was modified often requires carefully comparing multiple archived versions, a process that can be both tedious and time-consuming. Moreover, detecting and analyzing where important changes occur is not a straightforward process. Users often need to select a subset of captures from thousands available, and even then, there is no guarantee that the differences they find will be meaningful or important. Traditional approaches to memento change analysis, such as lexical comparisons and indexing (e.g., [4], [5]), focus on showing the deletion or addition of terms or phrases but ignore semantic context. As a result, they miss subtle shifts in meaning and rely heavily on human interpretation.
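The limitation of lexical comparison is easy to demonstrate with a toy example (this is an illustration of the general technique, not the method of any cited paper): a word-level diff reports what tokens changed, but cannot distinguish a cosmetic edit from one that reverses a page's meaning.

```python
import difflib

def lexical_diff(old: str, new: str) -> list[str]:
    """Report word-level insertions, deletions, and replacements between two texts."""
    a, b = old.split(), new.split()
    sm = difflib.SequenceMatcher(a=a, b=b)
    ops = []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag != "equal":
            ops.append(f"{tag}: {' '.join(a[i1:i2])!r} -> {' '.join(b[j1:j2])!r}")
    return ops
```

For instance, `lexical_diff("council approves the plan", "council rejects the plan")` reports a single one-word replacement, indistinguishable in size from fixing a typo, even though the meaning has flipped entirely. That gap is what semantic approaches such as LLM-based analysis aim to close.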

My early work resulted in a paper titled “Exploring Large Language Models for Analyzing Changes in Web Archive Content: A Retrieval-Augmented Generation Approach,” coauthored with Lesley Frew, Dr. Jose J. Padilla, and Dr. Michele C. Weigle. The results of this initial exploration demonstrated that an LLM, when combined with tools such as RAG over a set of mementos, can effectively retrieve and analyze changes in archived web content. However, it remains necessary to constrain the analysis to distinguish between important and non-important changes. Building on this, I have been developing a pipeline to automatically determine whether a change alters meaning or context and should be considered significant. This aims to reduce manual effort, cognitive load, and support integration into web archive systems while advancing methods for analyzing archived web content at scale.

My PhD Candidacy Exam

During the summer of 2025, I passed my PhD candidacy exam (pdf, slides). This milestone marked an important transition in my doctoral studies and provided an opportunity to reflect on my preliminary work, learn, and identify new ways to contribute to the intersection of AI and web archiving. In my candidacy exam, I reviewed a set of ten papers related to analyzing changes and temporal coherence in archived web pages and websites.  Changes refer to any modifications observed in web content over time, including the addition, deletion, or alteration of text, images, structure, or other embedded resources. Temporal coherence, on the other hand, refers to the degree to which all components of an archived web page (such as HTML, text, images, and stylesheets) or website (such as interconnected pages and resources) were captured close enough in time to accurately represent how it appeared and functioned at a specific moment. A lack of temporal coherence can result in inconsistencies in how the archived page or site looks or behaves, which may affect the accuracy of change analysis.

Figure 2. A moment from my PhD candidacy exam, where I presented a ten-paper review on analyzing changes and temporal coherence in archived web pages and websites.

AI in Web Archiving: Opportunities

Over time, several researchers have addressed the analysis of changes and temporal coherence in web archives; however, the use of AI in this context has been limited. Below, I outline some research opportunities and challenges based on insights gained from my preliminary work and candidacy exam on how AI could play a role in these activities.

Topic Drift

AlNoamany et al. [6] studied web archive collections to identify off-topic pages within TimeMaps, which occur when a webpage that was originally relevant to a collection later changes into unrelated content. For example, in a collection about the 2003 California Recall Election (Figure 3), the site johnbeard4gov.com initially supported candidate John Beard (September 24, 2003) but later transformed into an unrelated adult-oriented page (December 12, 2003), making it irrelevant to the collection. To detect such changes, AlNoamany et al. proposed automated methods including text-based similarity metrics (cosine similarity, Jaccard similarity, and term overlap), a kernel-based method using web search context, and structural features such as changes in page length and word count. Using manually labeled TimeMap versions as ground truth, they found that the best performance was achieved by combining TF-IDF cosine similarity with word-count change.

Figure 3. Example of johnbeard4gov.com going off-topic. The first capture (September 24, 2003) shows the site supporting a California gubernatorial candidate, while the later capture (December 12, 2003) shows the domain transformed into unrelated adult-oriented content. Source: AlNoamany et al. [6]
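Two of the text-based measures evaluated by AlNoamany et al. can be sketched in a few lines of dependency-free Python. Note this is a simplified illustration: it uses raw term-frequency cosine rather than the TF-IDF weighting (plus word-count change) that their best-performing method combined.

```python
import math
from collections import Counter

def cosine_sim(a: str, b: str) -> float:
    """Cosine similarity over term-frequency vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def jaccard_sim(a: str, b: str) -> float:
    """Jaccard similarity over the word sets of two texts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0
```

In an off-topic detector, a memento whose similarity to the collection's earlier captures falls below a tuned threshold would be flagged for review.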

Recent advances in AI and representation learning offer opportunities to enhance off-topic detection in web archives beyond traditional term frequency measures. Instead of relying on TF-IDF, future approaches could use dense semantic embeddings from transformer models to better capture meaning and context, enabling the detection of more subtle topic drift. Comparing embedding-based similarity with the methods proposed by AlNoamany et al. could help determine which approach is more effective, particularly when topic shifts are not immediately apparent.

Temporal Coherence

Weigle et al. [7] highlight a key challenge in modern web archiving: many sites, such as CNN.com, rely on client-side rendering, where the server delivers basic HTML and JavaScript that later fetch dynamic content (often JSON) through API calls. Traditional crawlers like Heritrix do not execute JavaScript or consistently capture these dynamic resources, leading to temporal violations in which archived HTML and embedded JSON files have different capture times, potentially misrepresenting events or news stories. The issue is illustrated in Figure 4, which shows archived CNN.com pages captured between September 2015 and July 2016. The top row displays pages replayed in the Wayback Machine that show the same top-level headline despite being captured months apart. The bottom row shows mementos from the same dates with the correct top-level headlines; however, the second-level stories remain temporally inconsistent.

By measuring time differences between base HTML captures and embedded JSON resources using CNN.com pages (September 2015–July 2016), Weigle et al. identified nearly 15,000 mementos with mismatches exceeding two days. They conclude that browser-based crawlers best reduce such inconsistencies, though due to their higher cost and slower performance, they recommend deploying them selectively for pages that depend on client-side rendering.

Figure 4. Example of temporal coherence violation in archived CNN.com pages using client-side rendering. Source: Weigle et al. [7].

AI can enhance existing approaches to temporal coherence in web archives, such as those proposed by Weigle et al., by helping identify pages that depend on client-side rendering. For example, a machine learning model could be fine-tuned to analyze the initial HTML and related resources to detect signals such as empty or minimally populated DOM structures and classify whether a webpage relies on client-side rendering. AI-based analysis could also estimate the proportion of JavaScript relative to textual content and detect patterns associated with common client-side frameworks. Combined with indicators such as API endpoints referenced in scripts, these features can be used to flag pages that are unlikely to render correctly with traditional crawlers and may require browser-based crawling.
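A crude version of one such signal, the proportion of script payload relative to visible text, can be computed with the standard library alone. The threshold below is illustrative, not from Weigle et al.; a trained classifier would combine this with the other signals mentioned above.

```python
from html.parser import HTMLParser

class RenderSignals(HTMLParser):
    """Count characters of script content vs. visible text in an HTML page."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.script_chars = 0
        self.text_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.script_chars += len(data)
        else:
            self.text_chars += len(data.strip())

def likely_client_side_rendered(html: str, ratio_threshold: float = 5.0) -> bool:
    p = RenderSignals()
    p.feed(html)
    if p.text_chars == 0:
        # An empty DOM plus scripts is a strong client-side-rendering signal.
        return p.script_chars > 0
    return p.script_chars / p.text_chars > ratio_threshold
```

Pages flagged this way would be routed to the more expensive browser-based crawlers, while plain HTML pages stay with traditional crawling.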

AI for Enhancing Web Archive Interfaces

While platforms such as Google and others have begun integrating AI into their user interfaces, web archives have largely remained unchanged in this respect. This is notable given the potential of AI to make web archive interfaces more intuitive and more informative for a wide range of users. For example, as my preliminary work suggests, when analyzing content changes, users currently must manually browse long lists of captures or compare multiple archived versions of a webpage. AI could instead automatically identify moments when important changes occur and direct users’ attention to those points in time.

Along the same line, the Internet Archive’s Wayback Machine provides a “Changes” feature that highlights deletions and additions between two snapshots and a calendar view where color intensity reflects the amount of variation. However, this variation is based on the quantity of changes rather than their significance. As a result, many small edits may appear more important than fewer but meaningful modifications. An AI-enhanced interface could address this limitation by incorporating semantic change detection. For instance, a calendar view that highlights when the meaning or message of a page changes can make large-scale temporal analysis more efficient and accessible. Moreover, users could ask natural-language questions such as “When did this page change its message?” or “What were the major updates during a specific period?” and receive concise, understandable answers. 

AI could also guide users through large collections by recommending related pages, explaining why certain versions are relevant, or warning when an archived page may contain temporally inconsistent content. For non-experts, visual aids generated by AI, such as timelines, change highlights, or short explanations, could make complex web archive data easier to interpret. 

AI in Web Archiving: Challenges

While there are opportunities for AI integration into web archiving, there are also challenges that must be considered.

Technical Challenges

From a technical standpoint, I identified three primary challenges regarding using AI for analyzing archived web content. The first concerns the nature of archived web data. Web archiving systems typically store collected content using the Web ARChive (WARC) format. Each WARC file stores complete HTTP response headers, HTML content, and additional embedded resources such as images and JavaScript files. Although this format provides a structure and allows long-term preservation, it is verbose and was not designed to support AI-based analysis. Consequently, researchers must perform extensive parsing and preprocessing before AI models can effectively use archived web content.

Second, many web archives, such as the Internet Archive’s Wayback Machine, prioritize long-term storage and preservation over indexing and large-scale content retrieval. As a result, a single web page may have hundreds or even thousands of archived versions over time. Building and maintaining large-scale vector indexes over such temporally dense collections quickly becomes computationally expensive and, in many cases, impractical.

Third, even when working with controlled data scenarios, such as curated web archive collections, AI-driven analysis still depends on the availability of ground truth for evaluation and validation. For instance, training models to detect significant changes across mementos would require large-scale, high-quality annotations that capture not only what changed, but whether those changes meaningfully affect content interpretation. At present, no large-scale annotated datasets exist that support systematic analysis of change significance across archived web versions, creating a major barrier to training and evaluating AI models in this domain.

Ethical Challenges

Beyond technical limitations, the integration of AI into web archive analysis raises important ethical challenges. For instance, web archives preserve content as it existed at specific points in time, often without the consent or awareness of content creators or the individuals represented in that content. When AI models analyze archived web data, they may surface, reinterpret, or amplify sensitive information that was never intended to be reused in new analytical contexts. For this reason, it is important to carefully consider how AI is applied within web archiving. I contend that AI should be viewed as a complementary tool, one that supports, rather than replaces, human judgment. For example, AI can assist in identifying potential moments of relevant changes, flagging or summarizing them, while humans interpret the results and make decisions.

It is also important to note that recent debates highlight growing tensions between web archives and content owners regarding the use of archived data for AI training and analysis. For example, major news publishers have begun restricting access to resources like the Internet Archive due to concerns that archived content is being used for large-scale AI scraping without compensation or consent [8]. In response to such restrictions, researchers and practitioners—including Mark Graham, Director of the Wayback Machine—have argued that limiting access to web archives poses a significant risk to the preservation of digital history [9]. From this perspective, the primary concern is not excessive access, but rather the potential loss of the web as a historical record if archiving efforts are weakened.

Conceptual Challenges

AI models, particularly LLMs, typically operate on individual snapshots of data. As a result, they are not inherently designed to reason about evolution, temporal coherence, or change over time in archived web content. Consequently, answers to temporally grounded questions should not be expected by default when these models are applied without additional structure or context.

In static analysis scenarios, AI models can perform effectively. For example, given a single archived web page, an LLM can generate a summary, identify main topics, extract named entities, or analyze embedded resources such as images, videos, or scripts. Temporal analysis in web archiving, however, requires a different mode of reasoning. The central questions are not “What does this page say?” or “What is this page about?” but rather “What changed?”, “When did it change?”, “Why did it happen?”, and “What impact does the change have over time?” Answering these questions requires comparing multiple archived versions, reasoning based on context, and perhaps correlating changes across web pages.

Integrating AI into web archiving is therefore not only about efficiency, but about enabling new forms of discovery. This requires clearly defining desired outcomes and using AI to support or accelerate processes that have traditionally been manual.

Final Reflections

To conclude, I would like to leave the reader with a set of open questions as we continue moving toward the integration of AI in web archiving. One of the most visible changes introduced by AI is the ability to go beyond syntactic analysis and begin exploring semantic analysis, where meaning, context, and interpretation matter. This shift is not about replacing existing techniques, but about expanding the types of questions we can ask when working with web archive data.

I contend that traditional algorithms remain essential for many web archiving tasks. They are precise, transparent, and well understood. AI, by contrast, offers strengths in areas where rules struggle: interpreting context, assessing relevance, and reasoning across multiple versions of content. Rather than framing this as a competition between algorithms and AI, a more productive question is how these approaches can complement one another, and in which parts of the analysis pipeline each is most appropriate.

In the short term, I consider that AI tools are unlikely to replace algorithmic methods. However, they already show promise as assistive tools that can guide analysis, prioritize attention, and help humans reason about large and complex temporal collections. This naturally raises a forward-looking question: if AI continues to improve in its ability to reason about time, meaning, and change, how should the web archiving community adapt its tools, workflows, and standards?

The WARC format has proven effective for long-term preservation, but it was not designed with AI-driven analysis in mind. Should we aim to augment existing archival formats with AI-aware representations, or should we focus on developing AI methods that better adapt to current standards such as WARC? How we answer this will shape not only how we analyze web archives, but also how future generations access and understand the web past.

References

[1] AK, Ashfauk Ahamed. “AI driven web crawling for semantic extraction of news content from newspapers.” Scientific Reports, 2025. [Online]. https://doi.org/10.1038/s41598-025-25616-x.

[2] Abrar, M. F., Saqib, M., Alferaidi, A., Almuraziq, T. S., Uddin, R., Khan, W., & Khan, Z. H. “Intelligent web archiving and ranking of fake news using metadata-driven credibility assessment and machine learning.” Scientific Reports, 2025. [Online]. https://doi.org/10.1038/s41598-025-31583-0.

[3] Nair, A., Goh, Z. R., Liu, T., and Huang, A. Y. “Web archives metadata generation with gpt-4o: Challenges and insights,” arXiv, Tech. Rep. arXiv:2411.05409, Nov. 2024. [Online]. https://arxiv.org/abs/2411.05409.

[4] L. Frew, M. L. Nelson, and M. C. Weigle, “Making Changes in Webpages Discoverable: A Change-Text Search Interface for Web Archives,” in Proceedings of the 23rd ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL), 2023, pp. 71–81. https://doi.org/10.1109/JCDL57899.2023.00021

[5] T. Sherratt and A. Jackson, GLAM-Workbench/web-archives, https://zenodo.org/records/6450762, version v1.1.0, Apr. 2022. DOI: 10.5281/zenodo.6450762.

[6] Y. AlNoamany, M. C. Weigle, and M. L. Nelson, “Detecting off-topic pages within timemaps in web archives,” International Journal on Digital Libraries, vol. 17, no. 3, pp. 203–221, 2016. https://doi.org/10.1007/s00799-016-0183-5.

[7] M. C. Weigle, M. L. Nelson, S. Alam, and M. Graham, “Right HTML, wrong JSON: Challenges in replaying archived webpages built with client-side rendering,” in Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL), Jun. 2023, pp. 82–92. https://doi.org/10.1109/JCDL57899.2023.0002.

[8] Robertson, K. “News publishers limit Internet Archive access due to AI scraping concerns.” Nieman Lab, Jan. 2026. [Online]. https://www.niemanlab.org/2026/01/news-publishers-limit-internet-archive-access-due-to-ai-scraping-concerns/

[9] Graham, M. “Preserving the web is not the problem — losing it is.” Techdirt, Feb. 17, 2026. [Online]. https://www.techdirt.com/2026/02/17/preserving-the-web-is-not-the-problem-losing-it-is/





2026-03-18: Reverse TweetedAt: Determining Tweet ID prefixes from Timestamps / Web Science and Digital Libraries (WS-DL) Group at Old Dominion University

Figure 1: Each tweet ID is a unique identifier that encodes the tweet creation timestamp, example adapted from Snowflake ID, Wikipedia.

Web archives, such as the Wayback Machine, are indexed by URL. For example, if we want to search for a tweet we must first know its URL. Figure 2 demonstrates that searching for a tweet URL results in a timemap of that tweet archived at different points in time. Clicking on a particular datetime will show the archived tweet at that particular point in time.

 

Figure 2: An archived tweet URL results in a timemap consisting of archived copies of the tweet.


Figure 3 shows a screenshot of a tweet shared by @_llebrun. The tweet in the screenshot was originally posted by @randyhillier, who later deleted it. The screenshot does not include the tweet's URL. Moreover, when a tweet is deleted, we can no longer find its URL on the live web, nor do we know how to look it up in the archive.


Figure 3: @_llebrun tweeted a screenshot of a tweet originally posted by @randyhillier, who later deleted his tweet.


Therefore, we need to construct the URL of a tweet using only the information present in the screenshot. The structure of a tweet URL is: 


https://twitter.com/Twitter_Handle/status/Tweet_ID


We need the Twitter_Handle and Tweet_ID to construct a tweet URL. Each tweet ID is a unique identifier, known as a Snowflake ID, that encodes the tweet creation timestamp (Figure 1). We can extract the Twitter handle and timestamp from the tweet in the screenshot. In our previous tech report, we introduced methods for extracting Twitter handles and timestamps from Twitter screenshots. Next, we need to determine the tweet ID from the extracted timestamp. We could query the Wayback Machine with only the Twitter handle, but that would require exhaustively dereferencing every archived tweet for that user. For example, the following curl command shows that the number of archived captures of @randyhillier's status URLs is huge (42,053). Hence, our goal is to limit the search space by using the timestamp present in the screenshot.

curl -s "http://web.archive.org/cdx/search/cdx?url=https://twitter.com/randyhillier/status&matchType=prefix" | wc -l


   42053


Previously, one could query Twitter to find the timestamp of a tweet given a tweet ID, but this service is no longer freely available. The Twitter API has access rate limits, and metadata from deleted, suspended, or private tweets cannot be accessed through it. Moreover, the Twitter API is now monetized and no longer research-friendly. To address these issues, WS-DL members Mohammed Nauman Siddique and Sawood Alam developed the TweetedAt web service in 2019. The service extracts timestamps from Snowflake IDs and estimates timestamps for pre-Snowflake IDs, making TweetedAt a useful tool for finding the timestamp of a tweet ID. Here, however, we need the reverse: a tweet ID prefix determined from a given timestamp.

Reverse TweetedAt


The Snowflake service generates a tweet ID, a 64-bit unsigned integer composed of 1 unused sign bit, 41 timestamp bits, 10 machine ID bits, and 12 machine sequence bits. The timestamp occupies the 41 bits just below the sign bit.
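This bit layout can be checked directly by decomposing an ID into its fields. A quick sketch (the example in the test is @randyhillier's tweet ID discussed later in this post):

```python
TWITTER_EPOCH_MS = 1288834974657  # Twitter's Snowflake epoch, in milliseconds

def decompose_snowflake(tid: int) -> dict:
    """Split a Snowflake tweet ID into its timestamp, machine ID, and sequence fields."""
    return {
        "timestamp_ms": (tid >> 22) + TWITTER_EPOCH_MS,  # 41 bits below the sign bit
        "machine_id": (tid >> 12) & 0x3FF,               # next 10 bits
        "sequence": tid & 0xFFF,                         # lowest 12 bits
    }
```

Because the fields are bit-packed, the original ID can be reassembled exactly from the three parts, which is a handy sanity check on the layout.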


TweetedAt determines the timestamp for a tweet ID by right-shifting the tweet ID by 22 bits and adding the Twitter epoch time of 1288834974657 (offset).


Python code to get UTC timestamp of a tweet ID

from datetime import datetime

def get_tweet_timestamp(tid):
    offset = 1288834974657  # Twitter epoch in milliseconds
    tstamp = (tid >> 22) + offset
    utcdttime = datetime.utcfromtimestamp(tstamp / 1000)
    print(str(tid) + " : " + str(tstamp) + " => " + str(utcdttime))


For Reverse TweetedAt, given a datetime, we want to generate a tweet ID prefix by subtracting the offset and left-shifting by 22 bits. The process will not reconstruct the exact tweet ID, because the lower 22 bits are all zeros. However, it will give us a tweet ID prefix for a timestamp. For example, the tweet ID of @randyhillier's tweet is ‘1495226962058649603’ and the timestamp is ‘9:41 PM Feb 19, 2022’, as shown in Figure 3. The tweet ID has 19 digits and the timestamp is at minute-level granularity. Reverse TweetedAt would compute the 6-digit tweet ID prefix ‘149522’ for the 19-digit tweet ID ‘1495226962058649603’ based on the minute-level timestamp.


Python code to get tweet ID prefix from a Wayback timestamp

from datetime import datetime, timezone

TWITTER_EPOCH_MS = 1288834974657

def wayback_to_tweetid_prefix(timestamp: str):
    s = str(timestamp).strip()
    if len(s) == 14 and s.isdigit():
        granularity = "second"
        dt = datetime.strptime(s, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
        start_ms = int(dt.timestamp() * 1000)
        end_ms = start_ms + 999
    elif len(s) == 12 and s.isdigit():
        granularity = "minute"
        dt = datetime.strptime(s, "%Y%m%d%H%M").replace(tzinfo=timezone.utc)
        start_ms = int(dt.timestamp() * 1000)
        end_ms = start_ms + 60_000 - 1
    elif len(s) == 10 and s.isdigit():
        granularity = "hour"
        dt = datetime.strptime(s, "%Y%m%d%H").replace(tzinfo=timezone.utc)
        start_ms = int(dt.timestamp() * 1000)
        end_ms = start_ms + 3_600_000 - 1
    elif len(s) == 8 and s.isdigit():
        granularity = "date"
        dt = datetime.strptime(s, "%Y%m%d").replace(tzinfo=timezone.utc)
        start_ms = int(dt.timestamp() * 1000)
        end_ms = start_ms + 86_400_000 - 1
    else:
        raise ValueError(
            "Unsupported Wayback format. Use YYYYMMDD, YYYYMMDDHH, YYYYMMDDHHMM, or YYYYMMDDHHMMSS (UTC)."
        )

    # Smallest and largest tweet IDs possible within the time window.
    start_delta = start_ms - TWITTER_EPOCH_MS
    end_delta = end_ms - TWITTER_EPOCH_MS
    min_id = start_delta << 22
    max_id = (end_delta << 22) | ((1 << 22) - 1)

    # Longest common decimal prefix of the two IDs, zero-padded to equal length.
    min_str = str(min_id)
    max_str = str(max_id)
    length = max(len(min_str), len(max_str))
    min_str = min_str.zfill(length)
    max_str = max_str.zfill(length)
    i = 0
    while i < length and min_str[i] == max_str[i]:
        i += 1

    prefix_str = min_str[:i] or "0"
    suffix_len = length - i
    prefix_val = int(prefix_str)
    ten_pow = 10 ** suffix_len
    approx_lower = prefix_val * ten_pow
    approx_upper = (prefix_val + 1) * ten_pow - 1

    return {
        "input_timestamp": timestamp,
        "tweet_id_prefix": prefix_str,
        "tweet_id_regex": f"{prefix_str}[0-9]{{{suffix_len}}}",
        "tweet_id_range": f"[{approx_lower} – {approx_upper}]",
    }


We integrated Reverse TweetedAt as a web service alongside TweetedAt. The service accepts a timestamp as user input and returns the corresponding tweet ID prefix, tweet ID regex, and full tweet ID range (Figure 4). It supports multiple valid timestamp formats (e.g., ISO 8601, RFC 1123, Wayback) and provides output at different levels of granularity. For example, Figure 4 shows output for millisecond-level granularity. Because millisecond-level precision is typically unavailable in tweet timestamps, the tool can interpret such inputs at second- or minute-level granularity. Rather than assuming zeros for unknown fields, the tool expands the input into the full corresponding time window (e.g., an entire second or minute), and computes the tweet ID prefix over that interval.

Figure 4: Reverse TweetedAt outputs tweet ID prefix at millisecond-level granularity.


Figure 5: Reverse TweetedAt outputs tweet ID prefix at second-level granularity.


Figure 6: Reverse TweetedAt outputs tweet ID prefix at minute-level granularity.


Tweet ID Regex-based Retrieval Across Temporal Granularity


We can use the tweet ID regex derived from a timestamp to search for archived tweets within a specific temporal window. By querying the Wayback Machine’s CDX API and filtering results with this prefix-based regex, we can identify tweet URLs whose IDs fall within the calculated range. As the timestamp becomes less precise, the tweet ID prefix becomes shorter and the regex search space widens.
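As a sketch of this filtering step in Python: the rows below are hypothetical, simplified CDX-style lines (a real query would fetch the CDX API response for the account’s status/ URLs), and the regex is a second-level prefix pattern of the kind Reverse TweetedAt produces.

```python
import re

# Hypothetical CDX-style capture rows (field values are made up for
# illustration; a real response lists every archived status/ capture).
cdx_rows = """\
com,twitter)/randyhillier/status/1495226962058649603 20220220120000 https://twitter.com/randyhillier/status/1495226962058649603 text/html 200 AAAA 1234
com,twitter)/randyhillier/status/1507000000000000000 20220325120000 https://twitter.com/randyhillier/status/1507000000000000000 text/html 200 BBBB 5678
"""

# Second-level tweet ID regex: 9-digit prefix plus 10 unknown digits.
pattern = re.compile(r"status/149522696[0-9]{10}")

# Keep only captures whose tweet ID falls inside the computed range.
matches = [row for row in cdx_rows.splitlines() if pattern.search(row)]
```

Only the first row survives the filter; the second tweet ID falls outside the prefix range and is discarded without dereferencing its capture.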


For example, the tweet ID of @randyhillier’s tweet shown in Figure 3 is ‘1495226962058649603.’ Using TweetedAt, we can recover its timestamp at millisecond-level granularity. Using Reverse TweetedAt, millisecond-level granularity yields the most precise prefix and returns 10 archived captures, while a slightly less precise prefix (second-level granularity) returns 15. When the precision is reduced further (minute-level granularity), the number of results remains 15, indicating that all tweets within that broader time window were posted within the same narrower interval. This illustrates how lower temporal granularity expands the potential search space; however, a wider ID range does not necessarily produce more results, only more candidate IDs.
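The forward direction used by TweetedAt is a minimal sketch: for any post-Snowflake tweet ID, the bits above the low 22 worker/sequence bits hold milliseconds since the Twitter epoch, so the creation time falls out of a shift and an addition (the helper name here is ours).

```python
from datetime import datetime, timezone

TWITTER_EPOCH_MS = 1288834974657  # Snowflake epoch: 2010-11-04 01:42:54.657 UTC

def tweetid_to_datetime(tweet_id: int) -> datetime:
    """Recover a tweet's UTC creation time from its Snowflake ID."""
    ms = (tweet_id >> 22) + TWITTER_EPOCH_MS  # drop worker/sequence bits
    # Split out the millisecond part to avoid float rounding.
    return datetime.fromtimestamp(ms // 1000, tz=timezone.utc).replace(
        microsecond=(ms % 1000) * 1000
    )

# @randyhillier's tweet from Figure 3
tweetid_to_datetime(1495226962058649603)  # -> 2022-02-20 02:41:02.385 UTC
```

Note that this UTC instant corresponds to the screenshot’s local time of 9:41 PM Feb 19, 2022.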

Search space at millisecond-level granularity

curl -s "https://web.archive.org/cdx/search/cdx?url=https://twitter.com/randyhillier/status/&matchType=prefix" \

| grep -E 'status/14952269620[0-9]{8}' | wc -l


   10


Search space at second-level granularity

curl -s "https://web.archive.org/cdx/search/cdx?url=https://twitter.com/randyhillier/status/&matchType=prefix" \

| grep -E 'status/149522696[0-9]{10}' | wc -l


   15


Search space at minute-level granularity

curl -s "https://web.archive.org/cdx/search/cdx?url=https://twitter.com/randyhillier/status/&matchType=prefix" \

| grep -E 'status/149522[0-9]{13}' | wc -l


   15



CDX API Wildcard Search and Snowflake IDs to Limit the Search Space Using Tweet ID Prefix


We can now determine a tweet ID prefix from a screenshot timestamp using the Reverse TweetedAt service. Since a tweet can be archived any time within ±26 hours of the screenshot timestamp, we can determine tweet ID prefixes from the timestamps at the boundaries of that window. We can then use this time window to limit the search space by excluding URLs of tweets posted before and after the alleged timestamp. Let us consider the tweet in the screenshot in Figure 2, where the screenshot timestamp is:


9:41 PM Feb 19, 2022 (20220219214100)


We compute the tweet ID prefixes from the left-hand boundary (−26 hours) and right-hand boundary (+26 hours) timestamps using Reverse TweetedAt, as listed below:


-26 hours timestamp: 20220218194100 → tweet ID prefix: 14947588
+26 hours timestamp: 20220220234100 → tweet ID prefix: 149554404
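The boundary timestamps above can be derived mechanically from the screenshot timestamp. A minimal sketch (the ±26-hour offset follows the text; the helper name is ours):

```python
from datetime import datetime, timedelta

def window_bounds(ts: str, hours: int = 26) -> tuple[str, str]:
    """Return the (-hours, +hours) boundary timestamps around a
    14-digit Wayback-style timestamp, in the same format."""
    fmt = "%Y%m%d%H%M%S"
    dt = datetime.strptime(ts, fmt)
    return ((dt - timedelta(hours=hours)).strftime(fmt),
            (dt + timedelta(hours=hours)).strftime(fmt))

window_bounds("20220219214100")  # -> ('20220218194100', '20220220234100')
```

Each boundary timestamp is then fed to Reverse TweetedAt to obtain its tweet ID prefix.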

As previously mentioned, the timestamp occupies only the upper 41 bits of a tweet ID. We can use the common portion of the two tweet ID prefixes (149[4-5]) in a CDX API wildcard search in the Wayback Machine to limit the search space. The search space reduces to 629 archived tweets, whereas using only the Twitter handle outputs 42,053 archived tweets. Dereferencing 629 archived tweets to search for the particular tweet text of a screenshot is a lot of work but feasible, whereas dereferencing 42,053 archived tweets is far too expensive. The following curl command shows that the total number of archived tweets to dereference for @randyhillier’s status URLs with the common tweet ID prefix is comparatively small (629).
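The shared portion of the two boundary prefixes can be turned into the grep character class mechanically. A minimal sketch (assuming, as here, that the two prefixes diverge at some position within their shared length; the function name is ours):

```python
def common_prefix_pattern(lo_prefix: str, hi_prefix: str) -> str:
    """Build a grep character-class pattern from two tweet ID prefixes."""
    # Walk past the digits the two prefixes share.
    i = 0
    while i < min(len(lo_prefix), len(hi_prefix)) and lo_prefix[i] == hi_prefix[i]:
        i += 1
    # Shared digits, then a class spanning the first differing digit.
    return f"{lo_prefix[:i]}[{lo_prefix[i]}-{hi_prefix[i]}]"

common_prefix_pattern("14947588", "149554404")  # -> '149[4-5]'
```

The resulting pattern is exactly the one used in the curl command below.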

curl -s "https://web.archive.org/cdx/search/cdx?url=https://twitter.com/randyhillier/status/&matchType=prefix&from=20220218194100" \
| grep -E 'status/149[4-5]' | wc -l


   629


Summary


It is easy to search for a tweet in the Wayback Machine when you know its URL, but a screenshot of a tweet typically does not include the URL. However, the Twitter handle and timestamp present in the screenshot can be used to search for the tweet in the Wayback Machine web archive. Given a datetime, Reverse TweetedAt produces a tweet ID prefix, which we can then use to grep through a CDX API response listing all archived tweets for a Twitter account. From a screenshot timestamp, we can determine approximate tweet IDs for the left-hand and right-hand boundary timestamps using the Reverse TweetedAt tool, and we found that a CDX API wildcard search based on the common tweet ID prefix limits the search space. Thus, the process of finding candidate archived tweets for the tweet in a screenshot is optimized. We published a paper at the 36th ACM Conference on Hypertext and Social Media, “Web Archives for Verifying Attribution in Twitter Screenshots,” which discusses how we can further use the candidate archived tweets to verify whether the tweet in the screenshot was posted by the alleged author.


Related Links:



-- Tarannum Zaki (@tarannum_zaki)


Seeking Approval, Confronting Objectivity: Neutrality in the Library of Congress Subject Headings Approval Process / In the Library, With the Lead Pipe

In Brief: This study examines the concept of neutrality in Library of Congress Subject Headings and the subject approval process by analyzing proposed headings that were rejected over a nearly 20-year period. It considers the place of neutrality in libraries more generally and argues that equity, rather than neutrality, is the appropriate lens for judging subject heading proposals. Finally, it recommends several reforms that could improve the subject heading process and make it more equitable.

By Allison Bailund, Deborah Tomaras, Michelle Cronquist, and Tina Gross

If a train is moving down the track, one can’t plop down in a car that is part of that train and pretend to be sitting still; one is moving with the train. Likewise, a society is moving in a certain direction—power is distributed in a certain way, leading to certain kinds of institutions and relationships, which distribute the resources of the society in certain ways. We can’t pretend that by sitting still—by claiming to be neutral—we can avoid accountability for our roles (which will vary according to people’s place in the system). A claim to neutrality means simply that one isn’t taking a position on that distribution of power and its consequences, which is a passive acceptance of the existing distribution. That is a political choice.[1]

Introduction

Library workers and patrons have long been frustrated with Library of Congress Subject Headings (LCSH) for being out of date and lacking well-known concepts with abundant usage. Contributors to the Subject Authority Cooperative Program (SACO) have made many improvements to LCSH by proposing new headings and revising existing terms. Those attempts, however, have sometimes been hampered by the Library of Congress’s (LC) preference for supposed neutrality within the vocabulary; Subject Headings Manual (SHM) instruction “H 204,” released in 2017, specifically dictates that proposed headings should “employ neutral (i.e., unbiased) terminology.”[2]

This desire for neutrality has been directly stated, alluded to, or otherwise upheld in myriad rejections of proposed subject headings, from Negative campaigning[3] to White flight.[4] Even Water scarcity, a quantifiable concept of worldwide concern, was rejected in 2008 as a non-neutral topic requiring value judgments with the following justification:

Works on the topics of water scarcity and water shortage have been cataloged using the heading Water-supply, post-coordinating[5] as necessary with additional headings such as Water conservation and Water resources management. The meeting determined that this practice is appropriate and should continue, since Water-supply is a neutral heading that does not require a judgment about the relative abundance of water.[6]

However, what exactly constitutes neutral and unbiased terminology is never defined in “H 204” or anywhere else in the SHM, nor in any other Library of Congress controlled vocabulary manuals.[7] Much of the previous literature on neutrality in libraries focuses on debates over possible definitions of the term and what role neutrality should play in library services and collections. Building off previous critical cataloging literature, which focuses on addressing problematic terms, subject hierarchies, and biases within cataloging standards, this article extends that scrutiny further. We analyze how neutrality is embedded in the LC structures and systems that vet the terms catalogers utilize to describe materials.

Our article examines the ways in which neutrality is enforced in LCSH rejections between July 2005 and December 2024. We review “Summaries of Decisions” from LC Subject Editorial Meetings (along with associated discussion and commentary in the field); within these, we identify and interpret patterns of justifications used to reject subject heading proposals and maintain purported neutrality within the vocabulary. We argue that neutrality has been used to keep many concepts depicting prejudice (racism, sexism, etc.), as well as concepts related to the lived experiences of marginalized people, out of the vocabulary and/or to obscure materials about those topics under other, often more generalized or euphemistic, terminology. As a counterpoint, we suggest a values- and equity-driven approach to replace the principle of neutrality in a cataloging context and within the subject approval process. We acknowledge that the current political situation may be particularly fraught for equity-driven change, but believe bowing to political pressures is untenable, and continued pursuit of neutrality will only serve to further the discordance between library values and the realities of LCSH.

Background

Neutrality: Assumed, but Nebulous

Schlesselman-Tarango notes the perceived conceptual importance of neutrality for libraries and librarianship; their “status as ‘an essential public good’” is “contingent on the perpetration of the idea that [they are] also neutral.”[8] Seale further situates this notion of libraries-as-neutral as not externally imposed, but emanating from within librarianship itself: “The positioning of the library as a neutral and impartial institution, separated from the political fray, resonates with dominant library discourse around libraries.”[9]

However, despite both critics and supporters assuming that neutrality is fundamental to librarianship, there is a dearth of references to the term in official documents underpinning the ethics and standards of the library profession. The American Library Association’s (ALA) Working Group on Intellectual Freedom and Social Justice observed, for example, that “the word neutrality does not appear in the Library Bill of Rights, the ALA Code of Ethics, and any other ALA statements that the Working Group could locate. It does not appear in the Intellectual Freedom Manual (10th Edition) nor is it defined in any official ALA document or policy.”[10] The International Federation of Library Associations and Institutions’ (IFLA) Code of Ethics mentions but does not define neutrality in Section 5, in sentences such as “Librarians and other information workers are strictly committed to neutrality and an unbiased stance regarding collection, access and service.”[11] For catalogers in particular, the Cataloging Code of Ethics, issued in 2021 and discussed further below, explicitly disputes the concept of neutrality.

Most pertinent to the subject proposal process, the National Information Standards Organization’s (NISO) Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies mentions neutrality exactly twice, yet again without definition. The first instance, in guidance about choosing preferred forms of terms, asserts that “Neutral terms should be selected, e.g., developing nations rather than underdeveloped countries.”[12] The second appearance, in a discussion of synonyms, notes “pejorative vs. neutral vs. complimentary connotation[s]” of terms that might influence usage.[13] The latter reference positions neutrality as the impartial fulcrum of term meanings, while the former implies, particularly via the example, a more active attempt at choosing equitable and unbiased terminology.

Although the terms “neutral” and “unbiased” are often linked when they appear in library literature (as in the IFLA Code of Ethics), they are not synonymous. Oxford English Dictionary (OED) definitions of neutral include “inoffensive,” and “not taking sides in a controversy, dispute, disagreement, etc.”; unbiased, however, while meaning “not unduly or improperly influenced or inclined; [and] unprejudiced,” does not necessarily imply a lack of involvement in social or political issues.[14] The incompatibility between neutrality as inoffensive isolation versus unbiasedness as active equity plays out repeatedly in library discussions. Without clear definitions, neutrality in the NISO Guidelines and elsewhere is open to conjecture and interpretation. As noted by Scott and Saunders, “[T]he term ‘neutrality’ seems to be used for, or conflated with, everything from not taking a side on a controversial issue to the objective provision of information and a position of defending intellectual freedom and freedom of speech.”[15]

Proponents of library neutrality don’t fully agree on definitions, either. In Scott and Saunders’s survey, some describe it as “lacking bias,” which more closely aligns with principles of equity.[16] The depiction of neutrality by LaRue, the former Director of the ALA’s Office for Intellectual Freedom, also appears to resemble equity; he frames neutrality as not “deny[ing] people access to a shared resource just because we don’t like the way they think” and giving everyone “a seat at the table.”[17] Dudley, reframing library neutrality in relation to pluralism, highlights similar values; his proposed ethos calls on librarians to “adhere to principled, multi-dimensional neutrality” which includes “welcoming equally all users in the community” and “consistently-apply[ing] procedures for engaging with the public.”[18]

The 2008 book Questioning Library Neutrality examines many aspects of why neutrality is both an illusion and a misguided aspiration, and also disabuses readers of the idea that it has always been a core value. Rosenzweig points out that neutrality as a principle of librarianship does not go back to the early development of public libraries:

We would do well to remember that, if libraries as institutions implicitly opened democratic vistas, our librarian predecessors were hardly democratic in their overt professional attitude or mission, being primarily concerned with the regulation of literacy, the policing of literary taste and the propagation of a particular class culture with all its political, economic and social prejudices. In fact, the idea of the neutrality of librarianship, so enshrined in today’s library ideology (and so often read back into the indefinite past), was alien to these earlier generations.[19]

Although Macdonald and Birdi’s literature review identifies four conceptions of neutrality within library science literature—“favourable,” “tacit value,” “libraries are social institutions,” and “value-laden profession”—the authors found that depictions of neutrality articulated by practitioners are more complicated. Many have “ambivalent” views of neutrality, seeing it as “a slippery and elusive concept.”[20] The relative importance of neutrality to proponents varies, depending on its position vis-à-vis other library values: “When it is alone, or grouped with a simple, single other value like professionalism, it is very low in priority. When it is presented in a group of other values or left implicit, it fares better.”[21] Catalogers tended to espouse neutrality the least among library specializations, with 21% reporting that they never think about neutrality.[22] Further, some surveyed librarians “are more likely to eschew neutrality on matters of social justice,” when neutrality comes into conflict with core library values.[23]

Neutrality versus Social Justice

Since the late 1960s, neutrality has increasingly come into question as librarians have embraced ideals centering social justice, equity, diversity, and inclusion, particularly in the ALA.[24] These values, codified in the ALA Code of Ethics and Library Bill of Rights, include a commitment to “recognize and dismantle systemic and individual biases; to confront inequity and oppression; to enhance diversity and inclusion; and to advance racial and social justice in our libraries, communities, profession, and associations.”[25] ALA resolutions go a step further, acknowledging the “role of neutrality rhetoric in emboldening and encouraging white supremacy and fascism.”[26] Scott and Saunders sum up the issue, noting that while some librarians cast neutrality as a “fundamental professional value, albeit one that is not explicitly mentioned in the professional codes of ethics and values,” others assert that it is “a false ideal that interferes with librarians’ role of social responsibility, which is an explicitly stated value of librarianship.”[27] As Watson argues in an ALA 2018 Midwinter panel on neutrality in libraries, “We can’t be neutral on social and political issues that impact our customers because, to be frank, these social and political issues impact us as well.”[28]

Even among library codes of ethics that explicitly hold neutrality as a core value, there is a tension between practitioners and official documentation. For example, the Canadian Federation of Library Associations / Fédération canadienne des associations de bibliothèques (CFLA-FCAB) Code of Ethics calls for librarians to “promote inclusion and eradicate discrimination,” provide “equitable services,” and “counter corruption directly affecting librarianship”; but the Code also advocates for neutrality, advising librarians to “not advance private interests or personal beliefs at the expense of neutrality.”[29] Once again neutrality remains undefined—though it’s implied, based on context, to be not taking sides, matching one of the OED definitions above. This understanding accords with a 2024 study on Canadian librarians, which noted most Canadian academic librarians seem to have coalesced around defining neutrality as “not taking sides,” followed by “not expressing opinions.”[30]

Yet the same study also highlights a perceived incompatibility of neutrality with other values of librarianship, with “the majority (54%) of respondents” disagreeing or strongly disagreeing that “‘neutrality is compatible with other library values and goals,’” and 58% disagreeing “that it is ethical to be neutral.”[31] Brooks Kirkland asserts that assuming neutrality as a key tenet of librarianship conflicts with such principles as promoting inclusion and eradicating discrimination.[32] Pagowsky and Wallace note that, whether knowingly or not, upholding neutrality within inequitable systems ultimately supports them: “Trying to remain ‘neutral,’ by showing all perspectives have value … is harmful to our community and does not work to dismantle racism. As Desmond Tutu has famously said, ‘If you are neutral in situations of injustice, you have chosen the side of the oppressor.’”[33]

Cataloguing Code of Ethics, Critical Cataloging, and Other Recent Developments

The incongruity between neutrality and social justice as core library values has sparked the numerous debates detailed above and on mailing lists and social media. It has also led in part to the expansion of the critical cataloging movement and the creation of the Cataloguing Code of Ethics, published in 2021 and since adopted by several library organizations, including the ALA division Core. The Code explicitly refutes the concept of neutrality; it avers that “neither cataloguing nor cataloguers are neutral,” and calls out the biases inherent within the dominant, mostly Western cataloging standards currently in use. It particularly notes that “cataloguing standards and practices are currently and historically characterised by racism, white supremacy, colonialism, othering, and oppression.”[34]

The most well-known critical cataloging subject heading proposal was the attempt to change the now-defunct heading Illegal aliens, as depicted in the documentary Change the Subject. In November 2021, five years after LC initially announced it would change the Illegal aliens subject headings and then backtracked after political pressure, LC announced it would replace the subject headings Aliens and Illegal aliens. However, LC did not adopt the changes it had initially announced, nor the recommendations made in a report by the ALA Subject Analysis Committee (SAC), which included revising the term to Undocumented immigrants.[35] LC instead split Illegal aliens into two new headings: Noncitizens and Illegal immigration.[36] Librarians have criticized the retention of “illegal” within one of the updated headings for continuing to make library vocabularies “complicit” with the “legally inaccurate” criminalization of undocumented immigrants.[37]

Other critical cataloging proposals have been subjected to inordinate scrutiny by LC; even when headings have been approved, they have sometimes faced heavy editing and modification. One example is Blackface, where LC’s changes to the proposal obscured the racism characterizing the phenomenon. The broader term (i.e., the parent in the subject hierarchy) was altered from Racism in popular culture to Impersonation.[38] Since Impersonation falls under the broader terms Acting, Comedy, and Imitation, this change emphasizes the performance aspect in lieu of its racist connotations. Similarly, the scope note (i.e., definition), was modified from “Here are entered works on the use of stereotyped portrayals of black people (linguistic, physical, conceptual or otherwise), usually in a parody, caricature, etc. meant to insult, degrade or denigrate people of African descent” to “Here are entered works on the caricature of Black people, generally by non-Black people, through the use of makeup, mannerisms, speech patterns, etc.”[39] As noted by Cronquist and Ross, these changes ultimately “neutralize[d]” the proposal “in the name of objectivity.”[40]

However, there have also been numerous successful updates to outdated terminology and additions of missing concepts, particularly in recent years. For example, in 2021, fifteen subject headings for the incarceration of ethnic groups during World War II, including Japanese Americans, were changed from the euphemistic phrase –Evacuation and relocation to –Forced removal and internment.[41] The African American Subject Funnel added the new heading Historically Black colleges and universities in 2022 and helped to revise Slaves to Enslaved persons in 2023; the Gender and Sexuality Funnel successfully changed the heading Gays to Gay people, and proposed the new term Gender-affirming care, in 2023; and the Medical Funnel updated Hearing impaired to Hard of hearing people in 2024.[42]

On a hopeful note, many of these large-scale projects coordinated with Cataloging Policy Specialists within LC, who worked closely with catalogers during the process and ensured that related term(s) and related Library of Congress Classification number(s) were updated as well. Further, LC has taken some recent steps to improve its vocabularies and create avenues for increased input from outside institutions. This includes hiring a limited-term Program Specialist to help redress outdated terminology related to Indigenous peoples. LC also created two advisory groups for Demographic Group Terms and Genre/Form Terms, both of which allow for greater community input into these vocabularies.

Still, frustrations remain. Changing outdated terminology is a complicated process. Library of Congress vocabularies, in particular, are vulnerable to potential governmental interference. Attempted Congressional intervention during the updating of Illegal aliens and the passing of a statute mandating transparency in the subject approval process led to the creation of “H 204” codifying LC’s preference for a neutrality uninvolved in political and social issues.[43] The complication of bibliographic file maintenance (e.g., reexamining cataloged materials to determine whether subject headings should be changed, deleted, or revised) also muddies the waters and impedes large-scale projects. Staffing issues within LC further hinder the ability to undertake or complete projects, as seen in the SACO projects process, paused in 2025 due to LC’s catalog migration.

Maintaining LCSH

Library workers are familiar with LCSH in our discovery tools, and most are aware of concerns about outdated and problematic headings. However, they may not see debates and conflicts about new headings and ongoing maintenance of the vocabulary as a built-in and inherent part of the system, as catalogers who engage in that work do.

As Gross asserts:

To remain effective, headings must be regularly updated to reflect current usage. Today’s LCSH People with disabilities used to be Handicapped and, before that, Cripples. Additionally, new concepts require new headings, such as the recently created Social distancing (Public health), Neurodiversity, and Say Her Name movement. The process of determining which word or phrase to use as the subject heading for a given topic is inevitably fraught and can never be free of bias. The choice of terms embodies various perspectives, whether they are intentional and acknowledged or not.[44]

Both the need to continually revise existing headings and create new ones, and the inevitable wrangling over what they should be, are neither defects nor a surprise. They flow directly from the purpose of controlled vocabulary and the complications of language it exists to help navigate—the ever-changing and endless variety of ways to refer to things.

Some of the frequency and intensity of debates about LCSH stem from the fact that it attempts to be a universal vocabulary that covers all branches of knowledge. While it is created and maintained primarily for the needs of the Library of Congress, it is used by all kinds of libraries. Balancing the need to serve a user base that consists of federal legislators and providing the world with a one-size-fits-all vocabulary is clearly a formidable and contradictory endeavor. In recent decades, LC has made significant progress in opening up the maintenance process to input and contributions from the broader library community via the SACO program. These changes appear to be partly in response to demands to make the process faster and more transparent, but also a desire by LC to incorporate broader perspectives and experiences and to help with the tremendous workload.

LCSH Creation and Revision Process

The SACO program, created circa 1993,[45] allows librarians to submit proposals for new or revised LCSH terms (as well as other LC vocabularies) to the Library of Congress. In order to submit proposals, catalogers are expected to be familiar with the Subject Headings Manual (SHM), which governs LCSH usage and formulations as well as the proposal process, required research, and criteria used to evaluate proposals.[46] One of the primary requirements is literary warrant: proposers must demonstrate that there is a need for the new subject heading based on a work being cataloged.[47] Beyond the work cataloged and published/reference sources, librarians can also cite user warrant, “the terminology people familiar with the topic use to describe concepts,” as justification in proposals.[48] This can include reviews, blog posts, social media threads, LibGuides, etc.

After a proposal is submitted, LC staff schedule it to a monthly “Tentative List,” which is published to allow for public comment on proposed headings. Taking those comments and SHM instructions into account, members of LC’s Policy, Training, and Cooperative Programs Division (PTCP) make a decision about whether to add the proposed heading to LCSH, send it back to the cataloger for revision and resubmission, or reject it. If the heading is not added, a monthly “Summary of Decisions” document details the reasons for its exclusion. While the SACO program allows external librarians to submit proposals, the Library of Congress maintains its “authority to make final decisions on headings added.”[49]

Most proposals are routine and relatively straightforward, such as those that follow patterns—repeated formulations of similar subjects that provide a predictable search structure for library patrons (e.g., Boating with dogs already exists and the cataloger wants to propose Boating with cats). SHM “H 180” notes that patterns help achieve desired qualities for the vocabulary, including “consistency in form and structure among similar headings.”[50] LC is also concerned with avoiding multiple subject headings that convey concepts that are too closely related. LCSH online training “Module 1.2” highlights both “consistency and uniqueness among subjects” as strengths of controlled library vocabularies, for instance.[51] Proposals that don’t follow patterns therefore receive more scrutiny, to make sure they are unique, definable topics. LC makes judgment calls based on the strength of the evidence in proposals, and on SHM instructions, including the guidance in “H 204” about neutrality.

Neutrality within LC Documentation

Within its official documentation on subject headings, LC mentions neutrality sparingly. In the entirety of the SHM, the word neutral appears only once, specifically in guideline “H 204” with the recommendation that catalogers “employ neutral (i.e., unbiased) terminology.”[52] Apart from an association with the term unbiased, neutral is not defined in “H 204” or anywhere else in the SHM. Online LCSH training, freely available from the Library of Congress website, offers similarly little on the concept of neutrality. “Module 1.4” recommends that catalogers “accept the idea that all knowledge is equal” and “remain neutral … and attempt to be as objective as possible” when describing material.[53]

Despite the lumping together of neutral and unbiased in “H 204,” a neutrality which calls for a static ignoring of social realities and historical context does not equal an unbiased active engagement against prejudice. The Merriam-Webster Dictionary’s definitions of “neutral” and “unbiased” make this clear. “Neutral” as “indifferent” and politically nonaligned echoes OED. But the definition of “unbiased” goes even further, meaning not just free from prejudice and “favoritism” but “eminently fair”[54]—an active and flexible balancing of interests inherently at odds with static and detached neutrality. Eliding the two concepts risks undermining the latter, and with it library ethics and values, resulting in the further entrenchment of Western, colonial, and other biases in LCSH.

The definition of neutrality that LC, and by extension LCSH, seems to favor is one of passivity. Neutrality as indifference to social realities appears, for instance, in LCSH training “Module 1.4.” The module acknowledges that library vocabularies “are culturally fixed” and “from a place; they are from a time; they do reflect a point of view.” However, rather than using that “realiz[ation]” to encourage periodic updating of outdated or potentially prejudicial content in LCSH, the module advises “accepting” that cultural fixity as immutable fact; it recommends that catalogers “remain neutral, suspend disbelief” and focus on (undefined) objectivity instead.[55] Objectivity also appears in “H 180,” which advises catalogers: “Avoid assigning headings that … express personal value judgments regarding topics or materials. … Consider the intent of the author or publisher and, if possible, assign headings … without being judgmental.”[56]

Here, as in “Module 1.4,” objectivity appears linked to neutrality; the implication is that a subject can only be described without bias if a cataloger is dispassionate and has no opinions on the topic. However, not all definitions of objectivity match this interpretation. Although OED defines objectivity as “detachment” and “the ability to consider or represent facts, information, etc., without being influenced by personal feelings or opinions,” Merriam-Webster’s definition is “freedom from bias” and a more actively equitable “lack of favoritism toward one side or another.”[57]

This disparity in meanings prompts the question: What does it mean to describe a topic without judgment or bias? Is objectivity erasing any uncomfortable content in a topic, even if that erasure favors a biased status quo and/or muddies a topic’s meaning? Or, rather, is it objective to label something truthfully, even if the topic raises strong feelings? As demonstrated by the revisions to Blackface discussed above, changes to the scope note and broader term in the name of objectivity did not result in a clearer or less biased heading; instead, they obfuscated the racist intent behind the phenomenon.

Similarly, despite the assertion in “H 180,” a singular focus on authorial intent does not always result in a lack of bias or judgment in subjects. As noted by literary critics such as Wimsatt and Beardsley, “placing excessive emphasis on authorial intention [leads] to fallacies of interpretation,”[58] since readers only have access to the text in front of them; attempting to guess an author’s intent is already an act of judgment, not a discovery of objective facts. Further, if an author writes a prejudicial text, taking its content at face value risks replicating that bias through subject provision. LCSH terms such as Holocaust denial literature recognize and counter this, labeling Holocaust denial works as ones “that diminish the scale and significance of the Holocaust or assert that it did not occur.”[59] If catalogers relied strictly on authorial intent in the name of objectivity, those works would instead receive misleading subjects such as Holocaust, Jewish (1939-1945), tacitly legitimizing bias.

Thus, the SHM’s focus on objectivity and neutrality highlights incongruities and tensions within subject guidance and LCSH vocabulary itself between indifference and self-imposed inoffensiveness on the one hand, and actively countering bias and promoting equity on the other. As will be shown below, rejections in the name of neutrality reveal that in fact the proposal process itself has never been neutral or apolitical.[60]

Neutrality and SACO Rejections

LC adheres to an inflexible and indifferent definition of neutrality, critiquing proposals that engage with social and political realities as subjective or reliant on value judgments. This adherence has led to the rejection of multiple headings that surface prejudice or describe the lives and experiences of marginalized peoples. Rejections upholding neutrality instead reinforce hegemonic societal attitudes within LCSH.

Neutrality appears in several guises in proposal rejections in “Summaries of Decisions” from 2005 to 2025. The most obvious ones reference “H 204” and “neutral (i.e., unbiased) terminology,” including the 2008 rejection of Water scarcity and the 2024 rejection of White flight (discussed in more depth below).[61] Similar rejections use words such as “judgment” (including Negative campaigning in 2013, and Zombie firms in 2023); “pejorative” (e.g., Dive bars in 2010, and Banana republics in 2015); “vulgar and offensive” (such as Vaginal fisting and Anal fisting in 2010); “subjective” (such as African American successful people in 2009); “viewpoint” (including Jim Crow laws in 2019); and “non-loaded language” (e.g., Incarceration camps in 2024).[62]

Neutrality as non-involvement in political and social realities also appears in the rejection of proposals due to LC’s Policy, Training, and Cooperative Programs Division (PTCP)’s unwillingness to establish certain “patterns” of subject headings (i.e., set precedents for future headings of specific types). Pattern rejections often appear entirely arbitrary; that is, the rejections stated merely that PTCP did not wish to begin a pattern, and not that a proposal as formulated was missing vital elements, had no warrant, or did not conform to provisions stipulated in the SHM. Despite acknowledging in “Module 1.4” that the wrong subject heading “can make any resource in the collection ‘disappear,’”[63] these rejected patterns render certain topics invisible and unsearchable by library patrons.

Uncreated patterns include critiques of prejudicial attitudes and behaviors, particularly by governmental bodies, such as rejections of Prison torture in 2007 or Religious profiling in law enforcement in 2024.[64] Similarly, patterns that would have highlighted the unearned privilege and/or bigotry of certain groups remain largely unestablished, including Holocaust deniers (2016), Toxic masculinity (2020), and White privilege (rejected in 2011 and 2016, before finally being accepted as White privilege (Social structure) in 2022).[65] The rejection of White fragility in 2020 is particularly interesting, as the rationale was that “LCSH does not include any headings that ascribe an emotion or personality trait to a specific ethnic group or race, and the meeting does not want to begin the practice.”[66] However, LCSH has included since 2010 the heading Post-apartheid depression, meant to convey the mental health and feelings of white Afrikaners. So not all white people’s emotions appear off-limits—just ones that reveal systemic biases. PTCP also declined to create patterns naming discrimination directed at certain groups, such as Police brutality victims in 2014 and Missing and murdered Indigenous women in 2023.[67] In the latter case, the rejection of a term meant to highlight societal neglect of the violence against Indigenous peoples means that their existence and trauma continue to be hidden in library vocabularies and catalogs.

Pattern rejections not only make prejudices invisible in library catalogs, they also underrepresent concepts that celebrate or describe the cultures and experiences of marginalized peoples. Erasures of joy can be as damaging as erasures of struggle. Aronson, Callahan, and O’Brien’s discussion of themes related to people of color in picture books, for instance, could equally apply to messages portrayed in LCSH via what topics it hides or surfaces in library catalogs: a “predominance of Oppression … at the expense of other types of portrayals can send a message that suffering and struggle are definitive of a group’s experience, or even of victimhood.”[68] Instead, marginalized people “deserve to see themselves represented as people who lead full and dynamic lives and who are not fully defined by histories of oppression.”[69] Unaccepted subject headings of this type include African American successful people (2009), Overweight women’s writings (2011), Gay neighborhoods and Lesbian neighborhoods (2012), Gay personals (2018), Afro-pessimism (2021), and Indigenous popular culture (2024).[70] 

Absorbing a proposed critical term into a supposedly “positive” equivalent also served to preserve an inoffensive neutrality in LCSH, as seen in the rejection of Food deserts in 2014:

The concept of food desert has been defined in multiple ways by various governments and organizations, often in ways to suit their specific political agendas … The existing heading Food security is defined as access to safe, sufficient, and nutritious food. The existing heading is used for both the positive and negative (it has a UF [cross-reference for] Food insecurity), and the meeting feels that it adequately covers the concept of a food desert.[71]

Similarly, LC rejected a proposal for Genocide denial in 2017 with the rationale that the “positive” heading—Genocide—was sufficient for patron access: “A heading for a concept in LCSH includes both the positive and negative aspects of that topic. A work about the denial of genocide still discusses the concept of Genocide.”[72] Slum clearance was also rejected in 2007 in favor of the euphemistic and supposedly equivalent Urban renewal.[73]

Sometimes rejections upholding neutrality appeared in the guise of a fear that the term might be misapplied. For instance, although LC acknowledged in its 2019 rejection of Jim Crow laws and Jim Crow (Race relations) that the headings described laws and attitudes promulgated during a specific time period—which could therefore be described in a scope note guiding subject usage—it claimed that “the meeting is also concerned that the heading would be assigned only if the phrase Jim Crow is used in the title.”[74] In other words, the rejection prioritized avoiding possible future confusion over a definable term with ample literary and user warrant. The potential for definitional uncertainty also fueled other rejections, such as Femicide and Secret police in 2010, and Forced assimilation in 2024.[75] To preempt said confusion in all of these cases, LC could have added scope notes defining appropriate usage. Subjects have been remediated in the past when found to be misused, via clarifying scope notes or additional term creation, as with Romance literature (now Romance-language literature) versus Love stories (now Romance fiction).[76] Instead of denying the proposal due to a fear that a term might be misapplied, LC could have worked with the proposers to ensure the heading clearly defined the topic and, if necessary, made a public announcement with additional guidance on how to retrospectively add the term.

Overly limiting definitions of subjects also provided reasoning for neutrality-based proposal rejections. An attempt in 2011 to add the natural language phrase Queer-bashing as a cross-reference under the then-current heading Gays–Violence against, for example, was rejected with the justification that “queer-bashing is not necessarily violent.”[77] Intersexuality–Law and legislation, a heading reflecting ongoing debates about genital surgeries on infants and legally recognized genders, was rejected in 2016 because “The subdivision –Law and legislation free-floats [i.e., can be used] under ‘headings for individual or types of diseases and other medical conditions, including abnormalities, functional disorders, mental disorders, manifestations of disease, and wounds and injuries’ (SHM H 1150).”[78] The medicalizing language of the rejection reinforced the view of intersexuality as a “condition” or “disorder” needing fixing, rather than the natural human diversity of a group struggling for bodily autonomy and human rights. The rejection of Redlining in 2024 also fits this definitional pattern. Despite acknowledging that Redlining “functioned in many different financial contexts,” LC’s rejection implied that redlining’s definition was too broad, as LC preferred “the specificity of … separate headings.”[79] This continues to fracture the topic into multiple subjects such as Discrimination in financial services, Discrimination in mortgage loans, and Discrimination in credit cards. The rejection also sidestepped notions of governmental complicity in redlining, and whitewashed the topic by making it appear less systemic in nature.

Purported limitations of the vocabulary also served as justification for rejecting proposals and upholding LCSH neutrality. For instance, Butch/femme (Gender identity) was deemed “too narrow and specialized for a general vocabulary such as LCSH” in 2011 (though Butch and femme (Lesbian culture) was later approved in 2012)[80]—this, despite the copious presence of narrow terms in LCSH about other topics, such as Madagascar hissing cockroaches as pets, Photography of albatrosses, Church work with cowgirls and Zariski surfaces. Anal fisting and Vaginal fisting were rejected with the same rationale in 2010 (in addition to the “vulgar and offensive” argument described above).[81] Two rejections utilizing the same reasoning raise the question of whether queer cultures and identities were evaluated using particularly stringent criteria. As one librarian noted in the RADCAT mailing list after the rejection of Butch/femme (Gender identity):

This is especially baffling given that Bears (Gay culture) has been a valid subject heading for years, and both concepts have about the same amount of literary warrant. For those of you keeping track at home, this isn’t the first example of this rejection. During The Great Fisting Debacle of 2010 … the Anal fisting and Vaginal fisting proposals were shot down using the same language. I haven’t seen PSD [the prior name for PTCP] rejecting scientific or technical heading proposals as too specialized, which makes me wonder if it’s only gender & sexuality-related headings that receive this type of scrutiny.[82]

Troublingly, rejections for queer identities have continued since LC resumed processing tentative lists in January 2025, particularly for queer youth proposals. The rejection of Sexual minority high school students, for instance, indicates potential deference to current governmental queerphobia, particularly since the phrase “At this time” prefaces the justification: “At this time, it is not desirable to qualify headings for this age group by gender identity or expression/sexual orientation.” LC’s recommendation that instead “[t]erms from other subject vocabularies such as Homosaurus may be used instead of, or in conjunction with, existing LCSH headings to express the topic” suggests that there is no place for queer youth identity headings within LCSH.[83]

Finally, proposals were rejected in favor of maintaining pre-existing biases in LCSH–the cultural fixity mentioned in LCSH training “Module 1.4.”[84] For instance, a 2015 rejection of a change proposal related to Indigenous peoples–South Africa highlighted in its rationale the scope note for Indigenous peoples defining them entirely in relation to colonial power: “Here are entered works on the aboriginal inhabitants either of colonial areas or of modern states where the aboriginal peoples are not in control of the government.”[85] Sometimes, even the longevity of a term within LCSH was treated as sufficient reason to reject proposals meant to update outdated and inequitable terms, as with the 2020 rejection of a proposed change from Juvenile delinquents to Juvenile prisoners: “The existing heading Juvenile delinquents has been used for this concept for many years. At this point, it would be practically impossible to examine the entire file so the new heading could be applied accurately. The heading Juvenile delinquents should be assigned instead.”[86] This hesitance to tackle large projects because of the labor required for bibliographic file maintenance perpetuates the tendentious language present in LCSH and reinforces the view that the proposal process is itself not neutral.

Case Study: White Flight

In 2024, the African American Subject Funnel Project submitted a subject proposal for White flight. The proposal cited Kruse’s book White Flight: Atlanta and the Making of Modern Conservatism to demonstrate literary warrant. It additionally cited three reference sources—Encyclopedia of African-American Politics, The New Encyclopedia of Southern Culture, and Wikipedia—in order to define the term and demonstrate that it is commonly used by scholars and the public.

  • [Proposed Heading]: White flight
  • [Variant Term]: White exodus
  • [Broader Term]: Migration, Internal
  • [Broader Term]: Race relations
  • [Broader Term]: White people–Migrations
  • [Related Term]: Segregation
  • [Source]: Kruse, K.M. White flight, ©2005: summary (In this reappraisal of racial politics in modern America, Kevin Kruse explains the causes and consequences of “white flight” in Atlanta and elsewhere) page 5 (In 1963 alone, there were 52 cases of “racial transition,” incidents in which whites fled from neighborhoods as blacks bought homes there; a steady stream of white flight had been underway for nearly a decade)
  • [Source]: Encyclopedia of African-American politics, 2021 (“White flight” is the term used to refer to the tendency of whites to flee areas and institutions once the percentage of blacks reaches a certain level)
  • [Source]: The new encyclopedia of southern culture, 2010 (The term “white flight” refers to the spatial migration of white city dwellers to the suburbs that took place throughout the United States after World War II. One of the most powerful and transformative social movements of the 20th century, white flight significantly affected the class and racial composition of cities and metropolitan areas and the distribution of a conservative postwar political ideology)
  • [Source]: Wikipedia, 16 Oct. 2023 (White flight or white exodus is the sudden or gradual large-scale migration of white people from areas becoming more racially or ethnoculturally diverse. Starting in the 1950s and 1960s, the terms became popular in the United States; examples in Africa, Europe, and Oceania as well as the United States)
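
For readers who have not worked with authority data, the proposal fields above map onto a MARC 21 authority record. The sketch below is illustrative only, assuming standard authority tagging (150 for the proposed heading, 450 for the variant term, 550 with subfield $w g for broader terms, 550 alone for the related term, and 670 for sources); an actual SACO submission may differ in its details and subfielding.

```
150 __ $a White flight
450 __ $a White exodus
550 __ $w g $a Migration, Internal
550 __ $w g $a Race relations
550 __ $w g $a White people $x Migrations
550 __ $a Segregation
670 __ $a Kruse, K.M. White flight, ©2005: $b summary (causes and
       consequences of "white flight" in Atlanta and elsewhere)
670 __ $a Encyclopedia of African-American politics, 2021 $b ("White
       flight" refers to the tendency of whites to flee areas and
       institutions once the percentage of blacks reaches a certain level)
```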

However, LC rejected White flight with the following rationale: “LCSH does not currently have an established pattern that combines the topic of migration with the social reasoning for that migration. The meeting was concerned that introducing such a pattern, particularly in this case, would contradict the practice in LCSH of preferring neutral, unbiased terminology as stated in SHM H 204 sec. 2.”[87]

After this Summary of Decisions was issued, librarians on the SACOLIST mailing list publicly disagreed with the rejection and pointed out the flaws in LC’s argument. One poster highlighted the fact that the term was in common use and searched for by library patrons; they also noted another heading already in LCSH that fit the pattern PTCP claimed didn’t exist:

According to H 204 Section 2, the proposed heading should “reflect the terminology commonly used to refer to the concept,” which I believe is the case with this term. Additionally, the same section of H 204 asks, “Will the proposed revision enhance access to library resources? Would library users find it easier to discover resources of interest to them if the proposed change were to be approved?” Again, if this phrase is commonly used by patrons, it would make sense to add it to our catalogs … You wrote that “LCSH does not currently have an established pattern that combines the topic of migration with the social reasoning for that migration.” Could someone explain why Great Migration, ca. 1914-ca. 1970 doesn’t fit this pattern? Is it because of the date range and that this is a specific event?[88]

Another librarian emphasized the ongoing importance of white flight, the prevalence of literature discussing it, and the unequal treatment of headings describing different groups in LCSH:

The differences between these proposals from my perspective seems to be that one describes African Americans and the other describes White people, and White flight is an ongoing concept rather than a single historical event. I hope PTCP reconsiders this decision, because the effects of White flight and the practices surrounding it shape racial inequality in the United States and in many other countries in the world. Many works describe White flight and its consequences … and users are familiar with the term and want to find works about it.[89]

Finally, a respondent noted yet another term matching the supposedly non-existent pattern: “The existing heading Amenity migration would also appear to provide a pattern combining the topic of migration with the social reasoning for that migration.”[90]

Despite these arguments, LC did not respond to the mailing list discussion nor change its decision. Given that White flight had literary warrant, was amply supported by reference sources, and was a concept that could not be accurately conveyed using already existent subject headings, why was PTCP concerned about neutrality “particularly in this case”? Even governmental entities as varied as the Supreme Court, the U.S. Commission on Civil Rights, the National Register of Historic Places, and LC itself use the term white flight. The rejection’s insistence on the need for uninvolved neutrality therefore seemed inconsistent with the widespread acceptance of the term.

Instead, the neutrality justification appears to be a smokescreen to cover up discomfort with a term that called out white racism; mandating neutrality in this case meant privileging being inoffensive to white people over acknowledging a widely accepted critique of systemic racism. Patton notes in her Substack post “White People Hate Being Called ‘White People’” that whiteness functions in part by invisibility, a “retreat into universalism where whiteness can dissolve back into ‘humanity’ and avoid accountability.”[91] Rejecting the proposal may have been a neutral decision (i.e., deliberately unobjectionable and indifferent to political and social realities), but it was certainly not unbiased (i.e., free from favoritism). Instead, it conceptually reinforced the false position of whiteness described by Patton as “the default, neutral, objective, and moral”[92]—thus undermining equity in LCSH and making works on this important topic invisible and unsearchable in library catalogs.

Discussion

Chiu, Ettarh, and Ferretti describe the futility of relying on neutrality to further social justice within librarianship and its vocabularies:

When the profession discusses neutrality, we believe that the profession actually seeks equity. However, neutrality will not yield equitable results and will always fall short because it relies on equity already existing in society. This is not the condition of our current society, nor is it true for the profession. Therefore, neutrality will actually work toward reinforcing bias and racism.[93]

The rejection of White flight illustrates this point aptly. Justifying the rejection by invoking neutrality means that, in practice, being neutral equates to whitewashing an ongoing phenomenon: pretending that the movement of white people in the United States is entirely benign, divorced from racism, and not worth library or library user attention. What are the long-term consequences of privileging neutrality, as opposed to equity, in the subject approval process? Neutrality as political isolationism and mandated inoffensiveness leads, as seen in the rejections from 2005 through 2024, to suppressing political and social critiques, hiding prejudice, and rendering the lived experiences of marginalized groups invisible.

It is unfortunately far too easy to weaponize a neutrality that, when evaluating proposals, gives equal weight to the intentions of groups such as racists and antisemites. A SHM instruction created in late 2024, “H 1922,” further embeds this weaponization within subject guidance. “H 1922” defines “offensive words” as “derogatory terms that insult, disparage, offend, or denigrate people according to their race, ethnicity, nationality, religion, gender identity, sexuality, occupation, social views, political views, etc.”[94] By including political and social views in the definition, LC inaccurately equates groups espousing opinions about how people should behave in society with demographic groups who have historically been marginalized merely for existing. This leaves LCSH vulnerable to political actors disingenuously claiming “offense” to silence critiques or establish prejudicial terms within the vocabulary. A recent example of this was the proposal to change Trans-exclusionary radical feminism into Gender-critical feminism, the obfuscatory label preferred by the transphobic group, by claiming that trans-exclusionary radical feminism was a slur.[95] (LC ultimately rejected the proposal, thanks in large part to “community activism” and mobilization opposing the change.[96] LC specifically mentioned library community input as the rationale for the rejection: “When this tentative list was published in November 2024, PTCP received over 300 email comments demanding rejection of this proposal.”[97])

There is ample evidence from the recent past and present of this weaponization of offense being used to undermine progress toward equity in the United States. The Trump administration’s proposed Compact for Academic Excellence in Higher Education (2025) exemplifies the dangers of privileging neutrality over equity. The Compact demands “institutional neutrality,” requiring that universities and their employees “abstain from actions or speech relating to societal and political events except in cases in which external events have a direct impact upon the university.” Those agreeing to this isolationist neutrality, in the meantime, would also agree to erase trans, non-binary, and intersex students, faculty, and staff, and to police and punish speech deemed offensive to conservatives. Notably, the Compact requires that admissions be based on “objective” criteria—except for explicitly-allowed faith, “sex-based,” and anti-immigrant biases.[98]

Mandated neutrality within “H 204” risks reifying the same prejudices within library vocabularies. This can be seen in LC’s recent alteration of Mexico, Gulf of to America, Gulf of, and Denali, Mount (Alaska) to McKinley, Mount (Alaska).[99] Critical cataloger Berman describes the former change as “linguistic imperialism,” and the latter as an “affront to Alaska’s indigenous population.”[100] The latter change is particularly damaging, given the simultaneous effort by LC to remediate LCSH related to Indigenous peoples, and might undermine confidence in the project. In both cases, a neutral approach—remaining uninvolved in political and social events—led to an undue “deference to chauvinistic, ethnocentric, and unjustified authority.”[101] Whether LC realistically could have resisted altering these headings is a counterfactual question; its actions must be judged by the effects of these revisions within library catalogs and for library patrons. By clinging to the illusion of neutrality, and capitulating to the whims of a racist and colonialist regime, LC undermined the profession’s stated values and harmed the larger library community.

Recommendations

What philosophical approach can LC take in lieu of neutrality, to bring the SACO process more in concert with library ideals of equity and egalitarianism? We recommend that LC employ a values-driven approach to vocabulary construction and maintenance. Explicitly stated library values—particularly around social justice and social responsibility—benefit all users, both marginalized peoples and the “mainstream.” Further, the PCC Policy Committee, of which LC is a permanent member, has already committed to the PCC Guiding Principles for Metadata, which acknowledge that “the standards and controlled vocabularies we use and their application are biased,” and advocate for “incorporating DEI principles in all aspects of cataloging work.”[102] Below, we suggest a number of changes LC could enact to make LCSH and the proposal process more equitable.

As LC backs away from neutrality as a guiding principle, philosophical approaches suggested in critiques of traditional practice deserve consideration. In her chapter in Questioning Library Neutrality, Iverson proposes that librarians adopt feminist philosopher Haraway’s approach to objectivity: “Haraway explains that what we have accepted as ‘objectivity’ claims to be a vision of the world from everywhere at once … We can not see from all perspectives at once, we each have our own particular views that are shaped by our own identities, cultures, experiences, and locations.”[103] Instead of claiming to possess “infinite vision,” Iverson recommends that we adopt Haraway’s recognition of “situated knowledge.”[104]

Watson argues that instead of literary or bibliographic warrant (cataloging a book in hand, asking what subject headings are needed to convey its content), critical catalogers “operate from a position of catalogic warrant, reading the terms and hierarchies of cataloging and classification systems with a critical eye, reflecting on the potential benefit or harm of each term on marginalized users, groups, or the GLAMS [galleries, libraries, archives, and museums] community as a whole.”[105] In other words, librarians should focus on the subject heading system in its entirety, asking what revisions and additions are needed. In some ways, by collaborating with SACO funnels on large-scale projects to create and revise related groups of subject headings, LC has already moved away from a strict interpretation of literary warrant under which the only valid reason to propose a subject heading is having a book in hand that requires it. This shift should be continued and expanded.

As for concrete actions, we advise that LC restore its open monthly subject editorial meetings, where proposals are discussed, and expand points of communication with external libraries. This would allow a more diverse range of librarians to participate in the SACO process and provide valuable input during decision-making. Other benefits of monthly meetings have been noted by SACO librarians in an open letter to PTCP: they helped to demystify “the SACO process” for the newly involved, and allowed librarians to contribute to “lively conversations on a broad range of options, and the opportunity to shape the vocabularies we all use, from proposing single headings to creating special lists to debating new guidelines for topical subdivisions.”[106]

Building on this, we suggest creating an external advisory group for LCSH, similar to the ones for LCDGT and LCGFT, to get input from a broader range of users on proposal vetting and vocabulary maintenance. Further, we urge LC to allow greater decision-making power for external librarians in all advisory groups. This would help LC vocabularies better reflect the resources in the Library of Congress collections and the needs of thousands of libraries of different types around the world, and improve accountability for decisions made regarding proposals. It would also help to better insulate library vocabularies from the governmental interference noted above, by making a broad range of institutions responsible for their creation and maintenance.

Within such bodies, we recommend that LC follow guidance from the SAC Working Group on External Review of LC Vocabularies, by including members from groups being described in those vocabularies, subject matter experts, and international representatives. Furthermore, membership should not include “[r]epresentatives from groups or organizations that purport to speak for marginalized communities, but who exclude the voices of members of the marginalized community,” or “[r]esearchers or representatives from groups or organizations where the experts cause harm to members of marginalized communities.”[107] The inclusion of representative groups aligns with the PCC Guiding Principles for Metadata and follows the principles put forth in the Cataloguing Code of Ethics.

In vetting SACO proposals, “LC should prioritize sources from the peoples and communities described, privileging those sources over traditionally ‘authoritative’ sources, including literary warrant,” to ensure that the terminology used “reflect[s] a more inclusive and culturally relevant understanding of the language associated with these groups and their heritage and history.”[108] The creation of a position within LC focused on remediating metadata related to Indigenous peoples was a good first step in this direction; and we strongly encourage LC to both continue and expand this practice.

Finally, we suggest revisions to various LC documents and SHM instruction sheets. References to neutrality should be removed from “H 204” and “Module 1.4,” in favor of a focus on active equity in subject assignment and proposals. Examples of unbiased terminology, created in concert with the advisory groups described above, reflecting a variety of situations, and periodically updated, would help create a shared understanding between librarians proposing headings and those evaluating them for inclusion in LCSH. “H 180” and “Module 1.4” should also be edited, in the sections advising catalogers to remain objective and not “express personal value judgments.”[109] All cataloging relies on judgment, and judgment is not always synonymous with bias or divorced from facts. A more useful focus here, as in a revised “H 204,” would be on the active equity present in Merriam-Webster’s definition of objectivity; catalogers should employ “catalogic warrant” and evaluate the “potential benefit or harm”[110] of subjects, particularly when assigning headings to prejudicial works. In addition, to protect against weaponized “offense,” we recommend that “social views” and “political views” be removed from “H 1922.” These alterations would bring the SHM and LCSH training more in line with LCDGT guidance, which foregrounds cataloging ethics. “L 400,” for instance, notes that “naming demographic groups and identifying individuals as members of those groups must be done with accuracy and respect,” and highlights the importance of self-identification when assigning headings.[111]

We cannot make recommendations on this topic without addressing the current political climate. Because LC’s catalog migration put most SACO work on hold during 2025,[112] the effect of the Trump administration’s anti-DEI policies on LCSH remains uncertain. However, United States history is rife with periods of political repression. Historically, equity has not been advanced by waiting for periods of relative calm, and waiting will not serve library patrons or the broader community in the present moment.

Conclusion

LCSH began over a century ago as a subject cataloging tool for the Library of Congress, and has since evolved into a vocabulary serving thousands of libraries around the world. Despite this broad and diverse user base, LC has remained the sole arbiter of which proposals are accepted into LCSH and what form the headings take. During the last two decades it has rejected a number of subject proposals due to a preference for purported neutrality and objectivity, in various guises. Yet, as a profession, librarianship claims to prioritize social responsibility. Social justice and equity are incompatible with an indifferent and purposefully inoffensive neutrality that allows harmful, colonialist, and racist headings in LCSH, and keeps out headings describing prejudice, or about the lived experiences of marginalized peoples.

Olson describes LCSH as “a Third Space between documents being represented and users retrieving them,” since “LCSH constructs the meanings of documents for users.”[113] These meanings impact how users view materials, and whether they can locate them in library catalogs. And it is within this space that LC’s commitment to neutrality fails both users and the ideals of librarianship around social responsibility. However, “because the Third Space is one of ambivalence, it is one with potential for change.”[114] By focusing on library values rather than neutrality within the subject creation and approval process, LCSH could develop into a vocabulary that constructs truly equitable and inclusive meanings for users and librarians alike.

Acknowledgements

Thank you to our publishing editor, Jess Schomberg, and the editorial board for their flexibility, guidance, and expertise throughout the publication process. Thank you to K.R. Roberto, Margaret Breidenbaugh, Crystal Yragui, and Matthew Haugen, who allowed us to quote them within this article. We would also like to thank our reviewers, Jamie Carlstone and Ian Beilin, and other readers who gave valuable feedback: Adam Schiff, Rebecca Albitz, Chereeka Garner, Rebecca Nowicki, Naomi Reeve, Simone Clunie, Violet Fox, and Stephanie Willen Brown.


[1] Robert Jensen. “The Myth of the Neutral Professional,” in Questioning Library Neutrality, ed. Alison Lewis (Library Juice Press, 2008), 91.

[2] Library of Congress, “H 204: Evaluating Subject Proposals,” in Library of Congress Subject Headings Manual, Aug. 2025 rev. (Library of Congress, 2025), 2, https://www.loc.gov/aba/publications/FreeSHM/H0204.pdf (original: https://web.archive.org/web/20180524054119/https://www.loc.gov/aba/publications/FreeSHM/H0204.pdf).

[3] Throughout this article, authorized subject headings (i.e., those that exist currently in LCSH) are presented in bold font; while rejected proposed headings appear in italics. For consistency, subject headings within quotations will follow the same formatting, regardless of the formatting used in the original quotation.

[4] Library of Congress, “Summary of Decisions, Editorial Meeting Number 10” (Library of Congress, 2013), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-131021.html; Library of Congress, “Summary of Decisions, LCSH/LCC Editorial Meeting Number 02 (2024)” (Library of Congress, 2024), https://www.loc.gov/aba/pcc/saco/cpsoed/ptcp-2402.pdf.

[5] Post-coordination is the practice of using multiple, separate LCSH terms in combination to convey a single concept.

[6] Library of Congress, “Summary of Decisions, Editorial Meeting Number 4” (Library of Congress, 2008), https://www.loc.gov/aba/pcc/saco/cpsoed/cpsoed-080123.html.

[7] See the manuals for Genre/Form Terms, Demographic Group Terms, and Children’s Subject Headings, for instance.

[8] Gina Schlesselman-Tarango, “How Cute!: Race, Gender, and Neutrality in Libraries,” Partnership: The Canadian Journal of Library and Information Practice and Research 12, no. 1 (Aug. 2017): 10, https://doi.org/10.21083/partnership.v12i1.3850.

[9] Maura Seale, “Compliant Trust: The Public Good and Democracy in the ALA’s ‘Core Values of Librarianship,’” Library Trends 64, no. 3 (2016): 589, https://doi.org/10.1353/lib.2016.0003.

[10] American Library Association Working Group on Intellectual Freedom and Social Justice, “Final Report from the Intellectual Freedom and Social Justice Working Group” (EBD #10.0, American Library Association, 2022), 10, https://www.ala.org/sites/default/files/aboutala/content/governance/ExecutiveBoard/20222023Docs/ebd%2010.0%20IF_SJ%20Final%20Report%207.12.2022.pdf.

[11] International Federation of Library Associations and Institutions, “IFLA Code of Ethics for Librarians and other Information Workers,” 4, https://www.ifla.org/wp-content/uploads/2019/05/assets/faife/publications/IFLA%20Code%20of%20Ethics%20-%20Long_0.pdf.

[12] National Information Standards Organization, Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies, ANSI/NISO Z39.19-2005 (R2010) (National Information Standards Organization, 2010), 30, https://groups.niso.org/higherlogic/ws/public/download/12591/z39-19-2005r2010.pdf.

[13] National Information Standards Organization, Guidelines, 44.

[14] Oxford English Dictionary, “Neutral,” https://www.oed.com/dictionary/neutral_n?tab=meaning_and_use#34680278 and “Unbiased,” https://www.oed.com/dictionary/unbiased_adj?tab=meaning_and_use#17025200.

[15] Dani Scott and Laura Saunders, “Neutrality in Public Libraries: How Are We Defining One of Our Core Values?,” Journal of Librarianship and Information Science 53, no. 1 (2020): 153, https://doi.org/10.1177/0961000620935501.

[16] Scott and Saunders, “Neutrality in Public Libraries,” 158.

[17] “Are Libraries Neutral? Highlights from the Midwinter President’s Program,” American Libraries, June 1, 2018. https://americanlibrariesmagazine.org/2018/06/01/are-libraries-neutral/

[18] Michael Dudley, “Library Neutrality and Pluralism: A Manifesto,” Heterodoxy in the Stacks, Aug. 8, 2023 https://hxlibraries.substack.com/p/library-neutrality-and-pluralism.

[19] Mark Rosenzweig. “Politics and Anti-Politics in Librarianship,” in Questioning Library Neutrality, ed. Alison Lewis (Library Juice Press, 2008), 5-6.

[20] Stephen Macdonald and Briony Birdi, “The Concept of Neutrality: A New Approach,” Journal of Documentation 76, no. 1 (2020): 333–353. https://doi.org/10.1108/JD-05-2019-0102.

[21] Jaeger-McEnroe, “Conflicts of Neutrality,” 3.

[22] Jaeger-McEnroe, “Conflicts of Neutrality,” 6.

[23] Jaeger-McEnroe, “Conflicts of Neutrality,” 9.

[24] Steve Joyce, “A Few Gates Redux: An Examination of the Social Responsibilities Debate in the Early 1970s and 1990s,” in Questioning Library Neutrality, ed. Alison Lewis (Library Juice Press, 2008), 33-65.

[25] “ALA Code of Ethics,” American Library Association, updated June 29, 2021, https://www.ala.org/tools/ethics

[26] “Resolution to Condemn White Supremacy and Fascism as Antithetical to Library Work,” American Library Association, Jan. 25, 2021, https://tinyurl.com/yr4z9e8x

[27] Scott and Saunders, “Neutrality in Public Libraries,” 153.

[28] “Are Libraries Neutral?”

[29] Canadian Federation of Library Associations / Fédération canadienne des associations de bibliothèques, “CFLA-FCAB Code of Ethics,” updated Aug. 27, 2018, https://cfla-fcab.ca/wp-content/uploads/2019/06/Code-of-ethics.pdf.

[30] Jaeger-McEnroe, “Conflicts of Neutrality,” 5.

[31] Jaeger-McEnroe, “Conflicts of Neutrality,” 5, 6.

[32] Anita Brooks Kirkland, “Library Neutrality as Radical Practice,” Synergy 19, no. 2 (Sept. 2021), https://www.slav.vic.edu.au/index.php/Synergy/article/view/536.

[33] Nicole Pagowsky and Niamh Wallace, “Black Lives Matter!: Shedding Library Neutrality Rhetoric for Social Justice,” College & Research Libraries News 76, no. 4 (2015): 198. https://crln.acrl.org/index.php/crlnews/article/view/9293/10374.

[34] Cataloging Ethics Steering Committee, “Cataloguing Code of Ethics,” January 2021, http://hdl.handle.net/11213/16716.

[35] Subject Analysis Committee Working Group on the LCSH “Illegal aliens,” “Report from the SAC Working Group on the LCSH ‘Illegal aliens,'” July 13, 2016, https://alair.ala.org/handle/11213/9261.

[36] Jill E. Baron, Violet B. Fox, and Tina Gross, “Did Libraries ‘Change the Subject’? What Happened, What Didn’t, and What’s Ahead,” in Inclusive Cataloging: Histories, Context, and Reparative Approaches, eds. Billey Albina, Rebecca Uhl, and Elizabeth Nelson (ALA Editions, 2024), 53; Library of Congress, “Library of Congress Subject Headings Approved Monthly List 11 (November 12, 2021)” (Library of Congress, 2021), https://classweb.org/approved-subjects/2111b.html.

[37] Baron et al., “Did Libraries ‘Change the Subject?,’” 54.

[38] Michelle Cronquist and Staci Ross, “Black Subject Headings in LCSH: Successes and Challenges of the African American Subject Funnel Project,” Reference and User Services Association, July 7, 2021, virtual. https://d-scholarship.pitt.edu/41826

[39] Cronquist and Ross, “Black Subject Headings in LCSH.”

[40] Cronquist and Ross, “Black Subject Headings in LCSH.”

[41] Library of Congress, “Library of Congress Subject Headings Approved Monthly List 06 (June 18, 2021)” (Library of Congress, 2021), https://classweb.org/approved-subjects/2106.html. Note that the headings for Japanese Americans, Japanese Canadians, and Aleuts were originally submitted as –Forced removal and incarceration, to match preferred usage, but LC changed them all to –Forced removal and internment.

[42] Library of Congress, “Library of Congress Subject Headings Approved Monthly List 08 (August 12, 2022)” (Library of Congress, 2022), https://classweb.org/approved-subjects/2208.html; Library of Congress, “Library of Congress Subject Headings Approved Monthly List 08 LCSH 2 (August 18, 2023)” (Library of Congress, 2023), https://classweb.org/approved-subjects/2308a.html; Library of Congress, “Library of Congress Subject Headings Approved Monthly List 04 (Apr. 21, 2023)” (Library of Congress, 2023), https://classweb.org/approved-subjects/2304.html; Library of Congress, “Library of Congress Subject Headings Approved Monthly List 03 LCSH 2 (March 15, 2024)” (Library of Congress, 2024), https://classweb.org/approved-subjects/2403a.html.

[43] For more information about Congressional actions related to the attempt to change Illegal aliens, see: SAC Working Group on Alternatives to LCSH “Illegal aliens,” “Report of the SAC Working Group on Alternatives to LCSH ‘Illegal aliens’” (American Library Association, 2020), http://hdl.handle.net/11213/14582.

[44] Tina Gross, “Search Terms up for Debate: The Politics and Purpose of Library Subject Headings,” Perspectives on History 60, no. 3 (2022), https://www.historians.org/perspectives-article/search-terms-up-for-debate-the-politics-and-purpose-of-library-subject-headings-march-2022/.

[45] Michael Colby, “SACO: Past, Present, and Future,” Cataloging & Classification Quarterly 58, no. 3-4 (2020): 287, https://doi.org/10.1080/01639374.2019.1706679.

[46] Library of Congress Subject Headings Manual, Aug. 2025 rev. (Library of Congress, 2025), https://www.loc.gov/aba/publications/FreeSHM/freeshm.html.

[47] Library of Congress, “Module 1.5: Introduction to LCSH,” in Library of Congress Subject Headings: Online Training (Library of Congress, 2016), 8, https://www.loc.gov/catworkshop/lcsh/PDF%20scripts/1-5%20Intro%20To%20LCSH.pdf.

[48] Rich Gazan, “Cataloging for the 21st Century Course 3: Controlled Vocabulary & Thesaurus Design Trainee’s Manual,” in Library of Congress Cataloger’s Learning Workshop (Library of Congress, n.d.), 2-2, https://www.loc.gov/catworkshop/courses/thesaurus/pdf/cont-vocab-thes-trnee-manual.pdf.

[49] Library of Congress, “H 204,” 3.

[50] Library of Congress, “H 180: Assigning and Constructing Subject Headings,” in Library of Congress Subject Headings Manual, Feb. 2016 rev. (Library of Congress, 2016), 8, https://www.loc.gov/aba/publications/FreeSHM/H0180.pdf.

[51] Library of Congress, “Module 1.2: Why Do We Use Controlled Vocabulary?,” in Library of Congress Subject Headings: Online Training (Library of Congress, 2016), 7, https://www.loc.gov/catworkshop/lcsh/PDF%20scripts/1-2-WhyCV.pdf.

[52] Library of Congress, “H 204,” 2.

[53] Library of Congress, “Module 1.4: How Do We Determine Aboutness?,” in Library of Congress Subject Headings: Online Training (Library of Congress, 2016), 3, https://www.loc.gov/catworkshop/lcsh/PDF%20scripts/1-4-Aboutness.pdf.

[54] Merriam-Webster Dictionary, “Neutral,” https://www.merriam-webster.com/dictionary/neutral and “Unbiased,” https://www.merriam-webster.com/dictionary/unbiased.

[55] Library of Congress, “Module 1.4,” 3.

[56] Library of Congress, “H 180: Assigning and Constructing Subject Headings,” in Library of Congress Subject Headings Manual, Feb. 2016. (Library of Congress, 2016), 7, https://www.loc.gov/aba/publications/FreeSHM/H0180.pdf 

[57] Oxford English Dictionary, “Objectivity,” https://www.oed.com/dictionary/objectivity_n?tab=meaning_and_use#34080200; Merriam-Webster Dictionary, “Objectivity,” https://www.merriam-webster.com/dictionary/objectivity.

[58] Michael R. Griffiths, “Roland Barthes Declared the ‘Death of the Author’, but Postcolonial Critics have Begged to Differ,” The Conversation, July 2, 2025, https://theconversation.com/roland-barthes-declared-the-death-of-the-author-but-postcolonial-critics-have-begged-to-differ-256093.

[59] Library of Congress Subject Headings, “Holocaust denial literature,” https://lccn.loc.gov/sh96009503.

[60] Anastasia Chiu, Fobazi M. Ettarh, and Jennifer A. Ferretti, “Not the Shark, but the Water: How Neutrality and Vocational Awe Intertwine to Uphold White Supremacy,” in Knowledge Justice: Disrupting Library and Information Studies through Critical Race Theory, eds. Sofia Y. Leung, Jorge R. López-McKnight (MIT Press, 2021), 65.

[61] Library of Congress, “Editorial Meeting Number 4,” 2008; Library of Congress, “LCSH/LCC Editorial Meeting Number 02 (2024).”

[62] Library of Congress, “Summary of Decisions, Editorial Meeting Number 10” (Library of Congress, 2013), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-131021.html; Library of Congress, “Summary of Decisions, LCSH/LCC Editorial Meeting Number 05 (2023)” (Library of Congress, 2023), https://www.loc.gov/aba/pcc/saco/cpsoed/ptcp-2305.pdf; Library of Congress, “Summary of Decisions, Editorial Meeting Number 46” (Library of Congress, 2010), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-101117.html; Library of Congress, “Summary of Decisions, Editorial Meeting Number 4” (Library of Congress, 2015), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-150420.html; Library of Congress, “Summary of Decisions, Editorial Meeting Number 27” (Library of Congress, 2010), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-100707.html; Library of Congress, “Summary of Decisions, Editorial Meeting Number 36” (Library of Congress, 2009), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-090909.html; Library of Congress, “Summary of Decisions, Editorial Meeting Number 1911” (Library of Congress, 2019), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-191118.html; Library of Congress, “Summary of Decisions, Editorial Meeting Number 2111” (Library of Congress, 2018), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-211115.html.

[63] Library of Congress, “Module 1.4,” 3.

[64] Library of Congress, “Summary of Decisions, Editorial Meeting Number 46” (Library of Congress, 2007), https://www.loc.gov/aba/pcc/saco/cpsoed/cpsoed-071114.html; Library of Congress, “Summary of Decisions, LCSH/LCC Editorial Meeting Number 6 (2024)” (Library of Congress, 2024), https://www.loc.gov/aba/pcc/saco/cpsoed/ptcp-2406.pdf.

[65] Library of Congress, “Summary of Decisions, Editorial Meeting Number 04” (Library of Congress, 2016), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-160418.html; Library of Congress, “Summary of Decisions, Editorial Meeting Number 2006” (Library of Congress, 2020), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-200615.html; Library of Congress, “Summary of Decisions, Editorial Meeting Number 23” (Library of Congress, 2011), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-110815.html; Library of Congress, “Summary of Decisions, Editorial Meeting Number 10” (Library of Congress, 2016), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-161017.html; Library of Congress, “Library of Congress Subject Headings Approved Monthly List 06 (June 17, 2022)” (Library of Congress, 2022), https://classweb.org/approved-subjects/2206.html.

[66] Library of Congress, “Summary of Decisions, Editorial Meeting Number 2006” (Library of Congress, 2020), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-200615.html.

[67] Library of Congress, “Summary of Decisions, Editorial Meeting Number 10” (Library of Congress, 2014), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-141020.html; Library of Congress, “Summary of Decisions, LCSH/LCC Editorial Meeting Number 07 (2023)” (Library of Congress, 2020), https://www.loc.gov/aba/pcc/saco/cpsoed/ptcp-2307.pdf.

[68] Krista Maywalt Aronson, Brenna D. Callahan, and Anne Sibley O’Brien, “Messages Matter: Investigating the Thematic Content of Picture Books Portraying Underrepresented Racial and Cultural Groups,” Sociological Forum 33, no. 1 (2018): 179, http://www.jstor.org/stable/26625904.

[69] Lisely Laboy, Rachael Elrod, Krista Aronson, and Brittany Kester, “Room for Improvement: Picture Books Featuring BIPOC Characters, 2015–2020,” Publishing Research Quarterly 39 (2023): 58, https://doi.org/10.1007/s12109-022-09929-7.

[70] Library of Congress, “Editorial Meeting Number 36,” 2009; Library of Congress, “Summary of Decisions, Editorial Meeting Number 21” (Library of Congress, 2011), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-110620.html; Library of Congress, “Summary of Decisions, Editorial Meeting Number 02” (Library of Congress, 2012), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-120221.html; Library of Congress, “Summary of Decisions, Editorial Meeting Number 06” (Library of Congress, 2018), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-180618.html; Library of Congress, “Editorial Meeting Number 2111,” 2021; Library of Congress, “Summary of Decisions, LCSH List Number 11c (2024) (2024) and LCC List Number 10 & 11 (2024)” (Library of Congress, 2024), https://www.loc.gov/aba/pcc/saco/cpsoed/ptcp-2412g.pdf.

[71] Library of Congress, “Summary of Decisions, Editorial Meeting Number 07” (Library of Congress, 2014), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-140721.html.

[72] Library of Congress, “Summary of Decisions, Editorial Meeting Number 09” (Library of Congress, 2017), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-170918.html. LC did establish a new heading for Denialism at that time; however, per the rejection, “To bring out the denialism aspect of events or topics, the heading may be post-coordinated with headings for the events or topics. The existing subject headings Holocaust denial and Holodomor denial, which are related to specific events, were added by exception as narrower terms of the new heading Denialism. Additional narrower terms will not be added to Denialism.”

[73] Library of Congress, “Summary of Decisions, Editorial Meeting Number 23” (Library of Congress, 2007), https://www.loc.gov/aba/pcc/saco/cpsoed/cpsoed-070606.html.

[74] Library of Congress, “Editorial Meeting Number 1911,” 2019.

[75] Library of Congress, “Summary of Decisions, Editorial Meeting Number 49” (Library of Congress, 2010), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-101208.html; Library of Congress, “Summary of Decisions, “LCSH/LCC Quarterly Editorial Meeting List 2409” (Library of Congress, 2024), https://www.loc.gov/aba/pcc/saco/cpsoed/ptcp-2409.pdf.

[76] Library of Congress, “Summary of Decisions, Editorial Meeting Number 5” (Library of Congress, 2015), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-150518.html.

[77] The heading is now Gay people–Violence against. Library of Congress, “Summary of Decisions, Editorial Meeting Number 27” (Library of Congress, 2011), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-111219.html.

[78] Library of Congress, “Editorial Meeting Number 04,” 2016.

[79] Library of Congress, “Summary of Decisions, LCSH Number 11 and LCC Number 11b (2024)” (Library of Congress, 2024), https://www.loc.gov/aba/pcc/saco/cpsoed/ptcp-2411.pdf.

[80] Library of Congress, “Editorial Meeting Number 27,” 2011; Library of Congress, “Library of Congress Subject Headings Monthly List 12 LCSH (December 17, 2012)” (Library of Congress, 2012), https://classweb.org/approved-subjects/1212.html.

[81] Library of Congress, “Editorial Meeting Number 27,” 2010.

[82] K.R. Roberto, “LCSH Proposals: Is this a Trend?” Jan. 17, 2012, RADCAT mailing list archives.

[83] Library of Congress, “Summary of Decisions, LCSH/LCC Editorial Meeting Number 12 (2024)” (Library of Congress, 2024), https://www.loc.gov/aba/pcc/saco/cpsoed/ptcp-2412.pdf.

[84] Library of Congress, “Module 1.4,” 3.

[85] Library of Congress, “Summary of Decisions, Editorial Meeting Number 12” (Library of Congress, 2015), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-151212.html. A 2016 rejection of Dadaist literature, Romanian (French) also highlighted colonialist content in LCSH, noting that “Headings for national literatures qualified by language are generally established for the language(s) of the colonial power that used to control the territory.” See: Library of Congress, “Editorial Meeting Number 04,” 2016.

[86] Library of Congress, “Summary of Decisions, Editorial Meeting Number 2003” (Library of Congress, 2020), https://www.loc.gov/aba/pcc/saco/cpsoed/psd-200316.html.

[87] Library of Congress, “Editorial Meeting Number 02 (2024).”

[88] Margaret Breidenbaugh, “Re: Summary of Decisions, Editorial Meeting Number 02, February 16, 2024,” SACOLIST Mailing List Archives, Library of Congress, May 29, 2024, https://listserv.loc.gov/cgi-bin/wa?A2=SACOLIST;eb3d8761.2405&S=.

[89] Crystal Yragui, “Re: Summary of Decisions, Editorial Meeting Number 02, February 16, 2024,” SACOLIST Mailing List Archives, Library of Congress, May 30, 2024, https://listserv.loc.gov/cgi-bin/wa?A2=2405&L=SACOLIST&D=0&P=1800917.

[90] Matthew Haugen, “Re: Summary of Decisions, Editorial Meeting Number 02, February 16, 2024,” SACOLIST Mailing List Archives, Library of Congress, May 29, 2024, https://listserv.loc.gov/cgi-bin/wa?A2=2405&L=SACOLIST&D=0&P=1796174.

[91] Stacey Patton, “White People Hate Being Called ‘White People,’” Substack, Oct. 23, 2025, https://drstaceypatton1865.substack.com/p/white-people-hate-being-called-white.

[92] Stacey Patton, “White People.”

[93] Chiu, Ettarh, and Ferretti, “Not the Shark,” 56-57.

[94] Library of Congress, “H 1922: Offensive Words” in Library of Congress Subject Headings Manual, Sep. 2024 (Library of Congress, 2024), 2, https://www.loc.gov/aba/publications/FreeSHM/H1922.pdf

[95] Library of Congress, “Tentative Monthly List 12 LCSH (December 20, 2024)” (Library of Congress, 2024), https://classweb.org/tentative-subjects/2412.html

[96] Brinna Michael, “LCSH, Transparency, and the Impact of Collective Action,” TCB: Technical Services in Religion & Theology 33, no. 2 (2025): 1. https://doi.org/10.31046/h01fq272.

[97] Library of Congress, “Summary of Decisions, LCSH/LCC Editorial Meeting Number 12 (2024)” (Library of Congress, 2024), https://www.loc.gov/aba/pcc/saco/cpsoed/ptcp-2412.pdf.

[98] U.S. Department of Education, Compact for Academic Excellence in Higher Education (Draft Memorandum, Oct. 2025), 4, 5, 2, 1, 9, https://www.washingtonexaminer.com/wp-content/uploads/2025/10/Compact-for-Academic-Excellence-in-Higher-Education-10.1.pdf.

[99] Library of Congress, “Library of Congress Subject Headings Approved Monthly List 12 LCSH 2” (Library of Congress, 2025), https://classweb.org/approved-subjects/2412a.html. For more information, including the fast-tracked nature of the changes, see Violet Fox, “Anticipatory Obedience at the Library of Congress,” ACRLog (blog), Mar. 28, 2025, https://acrlog.org/2025/03/28/anticipatory-obedience-at-the-library-of-congress/

[100] Sanford Berman, “ALA at 150: An Interview with (and by) Sanford Berman,” by Jenna Freedman, Lower East Side Librarian, Nov. 30, 2025, https://lowereastsidelibrarian.info/interviews/sandy-2025.

[101] Berman, “ALA at 150.”

[102] Program for Cooperative Cataloging, “Program for Cooperative Cataloging Guiding Principles for Diversity, Equity, and Inclusion for Metadata Creation,” approved Jan. 19, 2023 https://www.loc.gov/aba/pcc/resources/DEI-guiding-principles-for-metadata-creation.pdf

[103] Sandy Iverson, “Librarianship and Resistance,” in Questioning Library Neutrality, ed. Alison Lewis (Library Juice Press, 2008), 26.

[104] Iverson, “Librarianship and Resistance,” 26.

[105] B. M. Watson, “Expanding the Margins in the History of Sexuality & Galleries, Libraries, Archives, Museums & Special Collections (GLAMS),” PhD diss. (University of British Columbia, 2025), 270.

[106] Violet Fox, et al. to Policy, Training and Cooperative Programs Division, Library of Congress, June 30, 2024, “Editorial Meetings Decision,” https://cataloginglab.org/editorial-meetings-decision/

[107] Subject Analysis Committee Working Group on External Review of LC Vocabularies, Report of the SAC Working Group on External Review of Library of Congress Vocabularies, February 2023, 8-9, https://alair.ala.org/handle/11213/20012.

[108] Working Group on External Review of LC Vocabularies, “Report,” 8.

[109] Library of Congress, “H 180”, 7.

[110] Watson, “Expanding the Margins,” 270.

[111] Library of Congress, “L 400: Ethics and Demographic Group Terms” in Library of Congress Demographic Group Terms Manual, Mar. 2025 (Library of Congress, 2025), 1, https://www.loc.gov/aba/publications/FreeLCDGT/L400.pdf.

[112] Cataloging Policy and Standards, “Announcement from the Library of Congress (April 7, 2025),” SACOLIST Mailing List Archives, Library of Congress, April 7, 2025, https://listserv.loc.gov/cgi-bin/wa?A2=SACOLIST;61e18f28.2504&S=

[113] Hope Olson, “Difference, Culture and Change: The Untapped Potential of LCSH,” Cataloging & Classification Quarterly 29, no. 1–2 (2000), 54 https://doi.org/10.1300/J104v29n01_04.

[114] Olson, “Difference, Culture and Change,” 66.

Strike time, collective action, and moral conviction in library leadership / Meredith Farkas

I’m on strike right now, along with thousands of other faculty, academic professionals, and staff at Portland Community College (that’s two unions, friends!). It’s a weird feeling. I never thought I’d be in this position. PCC was the first place I worked where I really felt like the values of the College matched my own. I work with insanely dedicated and caring library workers, faculty, and staff. They believe unwaveringly in what they do and constantly go above and beyond for students. After being here for a few years, I knew this was the place I wanted to work for the rest of my career. Even as administration became worse – more corporatized, more performative, less accessible, more likely to listen to outside consultants than the people who directly work with students – I still never considered leaving because the folks I work with regularly are awesome and I love our students. 

As a scholar of time, I’m always interested in different forms of time (queer time, crip time, etc.). Strike time feels really strange. We were talking this morning on the picket line about how it feels a lot like early COVID, when time moved very differently. We feel like the days are both way too long and super short with not enough time to get everything done but also too much time just staring at different union social channels. We’re totally energized and totally exhausted (I’m lying on the couch like a ragdoll right now after three hours of holding signs, screaming, dancing, marching, and chanting with hundreds of colleagues). In terms of information, we feel like we’re both drinking from a firehose and like we don’t have any of the information we need. We have no idea what the near-term future will bring. What day of the week it is feels almost arbitrary because none of the usual markers of those days apply (I see all the things I was supposed to have been doing at work each day on my calendar and it feels like another life entirely). We’re both unmoored and deeply connected. I love it (the connection and collective power) and I also really hate it (for our students, for our colleagues who live paycheck to paycheck, for what the administration and the Board are doing to my beloved institution).

So it’s weird to feel both temporarily severed from the College and also more deeply connected than ever. These administrators may run the College and have the authority to make decisions, but they are not the College. The College is the people I’ve seen on the picket lines the past few days in the rain and freezing cold, people who are truly fighting for the soul of our college. They make the College run, from teaching classes, to assisting students with all kinds of needs, to helping students feel welcome, to keeping the College clean and safe and keeping students fed. All of these things are critical and the College can’t run without us, but I’m not entirely sure the same can be said of our administrators. The College is also our students, many of whom have stood with us on the line, who’ve brought us food, or have supported us through emails to the President and Board and on social media. I feel incredibly grateful for our students who clearly see through the bs administration is putting out there.

It’s been kind of incredible to see how unprepared our administration was for this after 11 months in which they barely moved in negotiations. They’ve known for months that a strike was a distinct possibility and they were the ones who walked away from the bargaining table the night before the strike was meant to happen. The latest email from the President said “I will say, with some pride, that we are not – and we should not – be an organization that is good at navigating this scenario” but, honestly, they should have had guidance for students ready to go. Administrators are supposed to plan for scenarios like this. They had units planning for two different scenarios for cuts from the State (neither of which came to pass). We spent almost a year planning what we would cut if LSTA funds went away in our state for the next year (they didn’t, thank goodness). Most faculty, on the other hand, have been talking to students about a possible strike for the past six weeks at least, and the union provided tons of resources to help them come up with a plan for their own classes. Yet the College was left totally scrambling last Wednesday as if they had no idea this could happen. Baffling.

It’s been interesting seeing some managers show up to bring food and/or spend a bit of time with us on the line. It’s not a lot of them, but it means a lot to us when someone does. They’ve told us about the absolute unprepared hot mess that is administration right now and it’s nice to realize that not every middle manager toes the party line at all times. But the vast majority of our managers sent us emails just before the start of the strike asking us to let them know if we were working or not, so most are definitely sticking with administration.

I had a boss many years ago who definitely put her employees first and advocated fiercely for us. She said she saw her role as being akin to a manager of a minor league baseball team. She was here to help develop us for bigger and better things in our careers. She was a major mentor to me in my early years in the profession. Since then, the bosses I’ve had really prioritized the people above them in the org chart ahead of the people below them. They have been classic “company [wo]men.” Helping us develop in our careers or even supporting us when we explicitly asked for it wasn’t part of the job. When I was a middle manager, I took the exact opposite approach and that’s why I’m no longer a middle manager. I always saw the role of a manager as supporting one’s direct reports (essentially, I worked for them) and that wasn’t what the people in charge of the library wanted me to do.

The great library leader Mitch Freedman died recently and it made me think about whether leaders like him can really exist in our much more corporatized libraries these days. If you don’t know about Mitch’s storied biography as a library leader and awesome human, please take a moment to read about him here in an obit from his family. When I was coming up as a librarian, he was the sort of man who was a model for me in successfully operating in our field with total moral courage. He lived his values every day. He fought for people and the things that he believed in. He centered the folks who were oppressed. He believed relationships were core to our work. In many ways, he embodied the “Good” and the “Human(e)” characteristics of slow librarianship (maybe also the “Thoughtful” but I didn’t work with him, so I’m not sure). His amazing daughter, Jenna Freedman, also lives her values courageously, a living tribute to his example.

I hope there are still library managers out there who have moral courage and fight the good fight, but, more and more, it feels like the people who become library Deans, Directors, and University Librarians are the ones who are willing to comply and conform, not the ones willing to rock the boat. As our institutions become more and more corporatized and neoliberal, we see less and less moral courage. I see a lot of library administrators wanting to look like they’re doing good more than they actually want to do good. I think of the leaders who all started EDI initiatives or published EDI statements right around 2020 and then let them fade away. Most of the people I see doing amazing values-driven work in our field these days are not leading libraries. They’re mostly front-line librarians. I wonder if it’s because, like me, folks are not willing to make the moral compromises so many have to make these days to climb the ladder.

In “Anthropology and the rise of the professional-managerial class,” the great (and deeply missed) David Graeber wrote about how 

the decisive victory of capitalism in the 1980s and 1990s, ironically, has… led to both a continual inflation of what are often purely make-work managerial and administrative positions—”bullshit jobs”—and an endless bureaucratization of daily life, driven, in large part, by the Internet. This in turn has allowed a change in dominant conceptions of the very meaning of words like “democracy.” The obsession with form over content, with rules and procedures, has led to a conception of democracy itself as a system of rules, a constitutional system, rather than a historical movement toward popular self-rule and self-organization, driven by social movements, or even, increasingly, an expression of popular will.

I see that in my own place of work. So much of my boss’ (our Dean’s) job is box-checking, compliance-type work – approving vacations and sick leave, making sure we’re doing required trainings and other things the people above her on the org chart want us to do, making sure we’re doing all of the things contractually required of us, etc. It used to be that I met with her once each term to talk about what I was working on, go over my progress on my goals, etc. Then I went to meeting with her just once in Fall, where we’d look at my goals document (without any meaningful feedback or support), and then I’d fill out a Google form at the end of the year to tell her what I did (again with no meaningful feedback). Now, even that Fall meeting is gone as her load of compliance-related work has increased. There’s no support outside of helping us navigate the bureaucracy of our institution. There’s no “walking around” as Mitch Freedman did – building relationships with employees and making them feel seen. There’s no focus on our development or talking about the meaning behind what we do. There’s just this compliance-focused flurry of activity.

As our colleges and universities become more and more corporatized, they take what were supposed to be leadership positions, ones that required vision and people skills, and turn them into babysitting jobs because, lord knows, we professionals can’t be trusted. Our college, like many, has seen a massive growth in the number of managerial positions, and yet faculty and staff are being asked to do more administrative work than ever before, not less. Why? Well, of course, those managers have to justify their existence.

Could a Mitch Freedman become a library director today? Would he have had to compromise his values somewhere down the line to get there? Do you know of any library leaders like Mitch today who are able to operate successfully in these more neoliberal environments? 

In that same piece, David Graeber writes “scholars are expected to spend less and less of their time on scholarship, and more and more on various forms of administration—even as their administrative autonomy is itself stripped away. Here too we find a kind of nightmare fusion of the worst elements of state bureaucracy and market logic.” This is the reality we find ourselves in as our two unions fight for better pay, but even more importantly, for a real, substantial model of shared governance which we don’t currently have (and which our college President agreed to and then hired a consultant to create for us 🙄). The fact that the only college committee or governance group that has the ability to conduct a vote of no confidence in our President (which they successfully passed!) is our student government is a stark reminder of how little power and voice we have in the future of our college. It can be so easy to just focus on keeping our head down and doing the good work we do as educators, as supporters of students and faculty, as stewards of collections, etc., but when we fight together like this, we fight for the heart and soul of our organization. We fight for an organization that centers students and their needs and listens deeply to those who directly serve and educate them. 

Walking the picket line the first couple of days was brutal in many ways. I was so cold and wet I couldn’t even grip my cell phone or a car door handle and I had to stay off my feet for a few hours as they thawed. But what has kept me warm, has kept all of us warm, is the solidarity. It has sometimes felt almost like a party, being there with many hundreds of my fellow colleagues. It’s been so affirming, so energizing. We’re all so united in this, so deeply committed to the institution and each other in ways that these administrators who jump from job to job every few years and compose soulless emails to us with freaking ChatGPT will never understand. 

If you’re feeling so inclined, please contribute to our strike fund. The administration seems really dug in and even decreased their offer by over $100,000 on Sunday, so I’m not quite so optimistic anymore that this will end quickly and we have lots of faculty, academic professionals, and staff who won’t be able to pay their rent or mortgage without support. Thanks and solidarity!! ✊

Ways of Seeing the Web / Ed Summers

Leica Double-Gauss Lens Design

The news about Cloudflare’s new pay-per-crawl API caught my attention for a few reasons. Read on for why, a bit about what the results look like, and what I learned when I asked it to crawl this here site as a test.


So, first of all, what’s up? Cloudflare’s Crawl API helps people collect data from websites with bots, while at the same time providing one of the most popular technologies for preventing websites from being crawled by bots?!?

At first this seemed to me like a classic fox-guarding-the-hen-house type of situation. But the little bit of reading in the docs I’ve done since makes it seem like they will still respect their own bot gatekeeping (e.g. Turnstile).

If you are using Cloudflare or some other bot mitigation technology you will have to follow their instructions to let the Cloudflare crawl bot in to collect pages. Interestingly, it appears they are using the latest specs for HTTP Message Signatures to provide this functionality, since you can’t simply let in anyone saying they are CloudflareBrowserRenderingCrawler right?

The genius here is that Cloudflare is known for its Content Delivery Network (CDN). So in theory (more on this below) when a user asks to crawl a website the data can be delivered from the cache, without requiring a round trip back to the source website. In some situations this could mean that the burden of scrapers on websites is greatly reduced.

The introduction of a Crawl API also looks like another jigsaw piece fitting into place for how Cloudflare sees web publishers benefiting from being crawled. Only time will tell if this strategy works out, but at least they have some semblance of a plan for the web that isn’t simply sprinkling “AI” everywhere.

If you run a website with lots of high value resources for LLMs (academic papers, preprints, books, news stories, etc) the same cached content could be delivered to multiple parties without having to go back to the originating server. For resource constrained cultural heritage organizations that are currently getting crushed by bots I think this would be a welcome development.

But, the primary reason this news caught my eye is that if you squint right Cloudflare’s Crawl API looks very much like web archiving technology. For example, the Browsertrix API lets you set up, start, monitor and download crawls of websites.

Unlike Browsertrix, which is geared toward collecting a website for viewing by a person, the Cloudflare Crawl service is oriented toward collecting the web for training LLMs. The service returns text content: HTML, Markdown, and structured JSON data that result from running the collected text through one of their LLMs with a given prompt.

Seeing the Web

So why is it interesting that this is like web archiving technology?

Ok, maybe it isn’t interesting to you, but (ahem) in my dissertation research (Summers, 2020) I spent a lot of time (way too much time tbh) looking at how web archiving technology enacts different ways of seeing the web from an archival perspective. I spent a year with NIST’s National Software Reference Library (NSRL) trying to understand how they were collecting software from the web, and how the tools they built embodied a particular way of seeing and valuing the web–and making certain things (e.g. software) legible (Scott, 1998).

What I found was that the NSRL was engaged in a form of web archiving, where the shape of the archival records was determined by their initial conditions of use (in their case, forensics analysis). But these initial forensic uses did not overdetermine the value of the records, which saw a variety of uses, disuses, and misuses later: such as when the NSRL began adding software from Stanford’s Cabrinety Archive, or when the team’s personal expertise and interest in video games led them to focus on archiving content from the Steam platform.

So I guess you could say I was primed to be interested in how Cloudflare’s Crawl service sees the web. This matters because models (LLMs, etc) and other services will be built on top of data that they’ve collected. But also because, if it succeeds, the service will likely get repurposed for other things.

Testing

To test how Cloudflare sees the web, I simply asked it to crawl my own static website–the one that you are looking at right now. I did this for a few reasons:

  1. It’s a static website, and I know exactly how many HTML pages were on it. All the pages are directly discoverable since the homepage includes pagination links to an index page that includes each post.
  2. I can easily look at the server logs to see what the crawler activity looks like.
  3. I don’t use any kind of Web Application Firewall or other form of bot protection on my site (I do have a robots.txt, but it doesn’t block CloudflareBrowserRenderingCrawler/1.0).
  4. I host my website on May First which doesn’t use Cloudflare as a CDN. So the web content wouldn’t intentionally be in Cloudflare’s CDN already.

This methodology was adapted from previous work I did with Jess Ogden and Shawn Walker analyzing how the Internet Archive’s Save Page Now service shapes what content is archived from the web (Ogden, Summers, & Walker, 2023).

I wrote a little command line utility cloudflare-crawl to start, monitor and download the results from the crawl. While the crawler ran I simultaneously watched the server logs. Running the utility looks like this:

$ uvx https://github.com/edsu/cloudflare-crawl crawl https://inkdroid.org

created job 36f80f5e-d112-4506-8457-89719a158ce2
waiting for 36f80f5e-d112-4506-8457-89719a158ce2 to complete: total=1520 finished=837 skipped=1285
waiting for 36f80f5e-d112-4506-8457-89719a158ce2 to complete: total=1537 finished=868 skipped=1514
...
wrote 36f80f5e-d112-4506-8457-89719a158ce2-001.json
wrote 36f80f5e-d112-4506-8457-89719a158ce2-002.json
wrote 36f80f5e-d112-4506-8457-89719a158ce2-003.json
wrote 36f80f5e-d112-4506-8457-89719a158ce2-004.json
wrote 36f80f5e-d112-4506-8457-89719a158ce2-005.json

Each of the resulting JSON files contains some metadata for the crawl, as well as a list of “records”, one for each URL that was discovered.

{
  "success": true,
  "result": {
    "id": "36f80f5e-d112-4506-8457-89719a158ce2",
    "status": "completed",
    "browserSecondsUsed": 1382.8220786132817,
    "total": 1967,
    "finished": 1967,
    "skipped": 6862,
    "cursor": 51,
    "records": [
      {
        "url": "https://inkdroid.org/",
        "status": "completed",
        "metadata": {
          "status": 200,
          "title": "inkdroid",
          "url": "https://inkdroid.org/",
          "lastModified": "Sun, 08 Mar 2026 05:00:39 GMT"
        },
        "markdown": "...",
        "html": "..."
      },
      {
        "url": "https://www.flickr.com/photos/inkdroid",
        "status": "skipped"
      }
    ]
  }
}
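
One nice property of getting plain JSON back is that post-processing is easy. As a quick illustration (this helper is my own sketch, assuming only the response shape shown above; it is not part of Cloudflare’s API or tooling), here is how you might tally record statuses across the downloaded result files:

```python
import json
from collections import Counter
from pathlib import Path

def tally_statuses(paths):
    """Count record statuses (completed, skipped, ...) across result files."""
    counts = Counter()
    for path in paths:
        # each file has the {"success": ..., "result": {"records": [...]}} shape
        result = json.loads(Path(path).read_text())["result"]
        for record in result["records"]:
            counts[record["status"]] += 1
    return counts
```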

Analysis

I decided I wasn’t very interested in testing their model offerings, so I didn’t ask for JSON content (the result of sending the harvested text through a model). If I had, each successful result would have had a json property as well. I am sure that people will use this, but I was more interested in how the service interacted with the source website, and wasn’t interested in discovering the hard way how much it cost to run the content through their LLMs.

Below is a snippet of how the Cloudflare bot shows up in my nginx logs. As you can see, the logs show which machine on the Internet is making the request, when the request was made, and which URL on the site is being requested.

104.28.153.137 - - [12/Mar/2026:14:34:58 +0000] "GET /about/ HTTP/1.1" 200 5077 "-" "CloudflareBrowserRenderingCrawler/1.0"
104.28.153.137 - - [12/Mar/2026:14:34:58 +0000] "GET /css/main.css HTTP/1.1" 200 35504 "https://inkdroid.org/about/" "CloudflareBrowserRenderingCrawler/1.0"
104.28.153.137 - - [12/Mar/2026:14:34:58 +0000] "GET /css/highlight.css HTTP/1.1" 200 1225 "https://inkdroid.org/about/" "CloudflareBrowserRenderingCrawler/1.0"
104.28.153.137 - - [12/Mar/2026:14:34:58 +0000] "GET /css/webmention.css HTTP/1.1" 200 1238 "https://inkdroid.org/about/" "CloudflareBrowserRenderingCrawler/1.0"
104.28.153.137 - - [12/Mar/2026:14:34:58 +0000] "GET /images/feed.png HTTP/1.1" 200 8134 "https://inkdroid.org/about/" "CloudflareBrowserRenderingCrawler/1.0"
104.28.153.137 - - [12/Mar/2026:14:34:58 +0000] "GET /js/bootstrap.min.js HTTP/1.1" 200 17317 "https://inkdroid.org/about/" "CloudflareBrowserRenderingCrawler/1.0"
104.28.153.137 - - [12/Mar/2026:14:34:58 +0000] "GET /images/ehs-trees.jpg HTTP/1.1" 200 63047 "https://inkdroid.org/about/" "CloudflareBrowserRenderingCrawler/1.0"
104.28.153.137 - - [12/Mar/2026:14:34:59 +0000] "GET /js/highlight.min.js HTTP/1.1" 200 20597 "https://inkdroid.org/about/" "CloudflareBrowserRenderingCrawler/1.0"

So how did Cloudflare Crawl see my website?

Maybe it’s early days for the service, but one thing I noticed is that each time I requested the site to be crawled the results seemed to be radically different.

crawl time           completed  skipped  queued  errored  unique_urls
2026-03-12 13:13:00        165       84       0        1          223
2026-03-12 13:44:00         72        4       2        0           78
2026-03-12 14:09:00       1947     7304       0       23         9191
2026-03-12 16:33:00         72        4       2        0           78
2026-03-12 17:34:00       1948     7365       0       22         9191
2026-03-13 16:50:00       1947     7363       0       23         9187
2026-03-14 07:32:00         72        4       2        0           78

The more successful crawls did a good job of crawling the entire site. My website is well linked, with a standard homepage that has anchor-tag based paging including links to all the posts. But knowing when your results are a partial crawl seems to be difficult. Knowing the actual dimensions of a “website” is one of the more difficult things about web archiving practice. The URLs that were labeled as “skipped” were not in scope for the crawl. If you wanted to include those, there is apparently an options.includeExternalLinks option when setting up the crawl.

From watching the web server logs it was clear that:

  1. Cloudflare does appear to be relying on previously cached data, but it’s not entirely clear what the logic is. For example, one crawl took 5 minutes to complete and returned 1,974 completed results, but the web server only saw requests for 594 of those URLs. I turned around and ran the exact same crawl again; it took 20 minutes longer and returned 1,974 results, but 847 pages were requested. In between, no content on the website changed. 🤷
  2. Cloudflare appears to be fetching CSS, JavaScript and images for the rendering of each page (they aren’t being cached by the Browser Worker).
  3. The throughput on the web server seemed to peak around 300 requests / minute (5 requests / second). For most sites this seems perfectly feasible.
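
A peak requests-per-minute figure like the one above can be estimated by bucketing the log timestamps to the minute. A minimal sketch (my own helper, assuming nginx’s default $time_local format):

```python
from collections import Counter
from datetime import datetime

def peak_requests_per_minute(timestamps):
    """Estimate peak per-minute throughput from nginx $time_local strings,
    e.g. '12/Mar/2026:14:34:58 +0000'."""
    buckets = Counter()
    for ts in timestamps:
        dt = datetime.strptime(ts, "%d/%b/%Y:%H:%M:%S %z")
        # zero out the seconds so all requests in a minute share a bucket
        buckets[dt.replace(second=0)] += 1
    return max(buckets.values()) if buckets else 0
```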

For the more successful crawls it looked like there were 246 independent IP addresses within Cloudflare’s network block that were doing the crawling.

ip request_count
104.28.153.88 405
104.28.163.131 266
104.28.161.242 232
104.28.165.231 223
104.28.153.132 212
104.28.163.132 212
104.28.163.81 201
104.28.166.65 188
104.28.166.121 186
104.28.164.201 185
104.28.153.179 182
104.28.153.137 178
104.28.164.202 172
104.28.161.243 172
104.28.166.127 163
104.28.165.232 155
104.28.153.119 153
104.28.165.14 151
104.28.153.83 148
104.28.153.140 145
104.28.153.87 145
104.28.153.55 143
104.28.153.136 142
104.28.163.133 132
104.28.153.118 131
104.28.166.58 130
104.28.163.78 126
104.28.160.31 125
104.28.153.139 124
104.28.161.245 124
104.28.163.214 123
104.28.153.120 123
104.28.165.230 121
104.28.153.180 121
104.28.164.156 119
104.28.153.96 119
104.28.153.64 112
104.28.153.133 111
104.28.166.128 111
104.28.153.128 109
104.28.166.126 104
104.28.165.17 103
104.28.165.18 103
104.28.160.30 103
104.28.153.134 101
104.28.166.120 101
104.28.153.129 101
104.28.153.181 100
104.28.153.86 100
104.28.165.229 100
104.28.163.134 99
104.28.164.203 99
104.28.162.194 98
104.28.166.62 98
104.28.163.212 98
104.28.153.123 97
104.28.164.154 97
104.28.166.61 97
104.28.161.246 96
104.28.153.92 96
104.28.166.125 96
104.28.153.68 93
104.28.159.23 92
104.28.153.76 91
104.28.153.71 91
104.28.153.124 90
104.28.158.143 88
104.28.165.21 88
104.28.153.94 87
104.28.166.118 86
104.28.161.133 84
104.28.153.85 82
104.28.164.152 82
104.28.163.77 82
104.28.153.148 79
104.28.164.150 79
104.28.165.12 79
104.28.161.201 79
104.28.153.183 78
104.28.160.65 78
104.28.153.126 77
104.28.153.138 77
104.28.159.133 76
104.28.165.20 75
104.28.158.137 75
104.28.153.56 75
104.28.153.81 74
104.28.153.131 73
104.28.153.59 72
104.28.166.60 72
104.28.166.66 69
104.28.159.120 69
104.28.153.53 68
104.28.153.185 68
104.28.153.191 67
104.28.166.119 66
104.28.153.95 64
104.28.165.76 64
104.28.154.20 62
104.28.153.121 57
104.28.158.142 57
104.28.160.68 56
104.28.163.177 56
104.28.153.80 56
104.28.161.215 55
104.28.161.244 55
104.28.153.62 55
104.28.166.134 55
104.28.153.122 54
104.28.165.19 53
104.28.153.127 53
104.28.159.118 53
104.28.157.166 53
104.28.153.226 53
104.28.157.169 52
104.28.159.111 48
104.28.153.196 48
104.28.161.132 48
104.28.153.84 47
104.28.161.214 47
104.28.165.13 46
104.28.153.219 46
104.28.163.171 46
104.28.165.15 45
104.28.163.176 45
104.28.159.109 45
104.28.158.155 45
104.28.153.218 45
104.28.158.131 44
104.28.161.200 44
104.28.153.222 44
104.28.161.197 44
104.28.159.74 44
104.28.158.139 44
104.28.158.138 44
104.28.153.235 43
104.28.153.106 43
104.28.164.160 43
104.28.153.57 38
104.28.159.119 37
104.28.163.82 36
104.28.153.197 36
104.28.153.93 36
104.28.160.25 35
104.28.153.78 34
104.28.153.72 34
104.28.153.125 34
104.28.153.61 34
104.28.166.131 34
104.28.158.132 33
104.28.159.135 33
104.28.160.34 33
104.28.163.220 33
104.28.153.77 33
104.28.166.135 33
104.28.164.155 33
104.28.163.213 33
104.28.158.136 33
104.28.160.121 33
104.28.157.174 33
104.28.165.71 33
104.28.153.130 33
104.28.163.76 32
104.28.160.32 32
104.28.160.64 32
104.28.153.89 32
104.28.159.110 32
104.28.163.172 32
104.28.154.18 32
104.28.163.178 31
104.28.166.124 30
104.28.165.114 25
104.28.153.182 25
104.28.166.132 25
104.28.159.108 24
104.28.165.75 24
104.28.157.171 24
104.28.153.240 23
104.28.164.204 23
104.28.153.108 23
104.28.159.24 22
104.28.157.242 22
104.28.153.63 22
104.28.153.105 22
104.28.159.229 22
104.28.158.130 22
104.28.164.213 22
104.28.159.136 22
104.28.164.158 22
104.28.157.83 22
104.28.153.107 22
104.28.159.83 22
104.28.157.172 22
104.28.157.82 22
104.28.158.145 22
104.28.162.93 22
104.28.163.174 22
104.28.153.98 22
104.28.157.170 21
104.28.158.126 21
104.28.165.74 21
104.28.153.216 21
104.28.159.112 21
104.28.161.199 14
104.28.153.194 13
104.28.154.15 13
104.28.159.232 13
104.28.166.59 13
104.28.159.150 12
104.28.165.72 12
104.28.158.252 12
104.28.153.104 12
104.28.158.254 11
104.28.158.129 11
104.28.153.58 11
104.28.162.195 11
104.28.160.28 11
104.28.159.115 11
104.28.158.255 11
104.28.153.214 11
104.28.153.67 11
104.28.160.29 11
104.28.153.195 11
104.28.164.153 11
104.28.160.23 11
104.28.160.24 11
104.28.159.114 11
104.28.160.27 11
104.28.160.66 11
104.28.157.175 11
104.28.157.173 11
104.28.159.122 11
104.28.154.12 11
104.28.160.33 11
104.28.164.159 11
104.28.163.170 11
104.28.165.11 11
104.28.154.17 10
104.28.163.222 10
104.28.159.121 2
104.28.157.243 2
104.28.153.73 2
104.28.157.233 2
104.28.153.54 2
104.28.158.146 2
104.28.163.169 2

I spot-checked some of the HTML and it did appear to be near identical to what was on the live web. With the fullest results I noticed 4% of URLs were not crawled. One exception to the fidelity was a few XML files, like an OPML file and the RSS feed, which only showed the XSL element in the text and markdown results.

I think there are a few directions this could go from here:

  1. testing what happens when instructing the crawl to collect (instead of skip) pages that are off site
  2. testing what happens with more dynamic content, and how long to wait for pages to render
  3. trying to understand why truncated results come back sometimes, and whether there are any signals for identifying when it is happening
  4. exploring the logic Cloudflare uses to determine when it can serve from its internal cache

One thing I didn’t mention is that the Cloudflare free plan limits you to a maximum of 100 pages per crawl. I set up a $5/month paid plan account in order to do this testing. In all my testing I only seemed to use 0.7 of the “browser hours”, which fits well within the 10 hours allowed per month. It currently costs $0.09 / hour when you exceed your limit.

PS. If you are curious the Marimo notebook I was using for some of the analysis can be found here.

References

Ogden, J., Summers, E., & Walker, S. (2023). Know(ing) Infrastructure: The Wayback Machine as object and instrument of digital research. Convergence: The International Journal of Research into New Media Technologies, 135485652311647. https://doi.org/10.1177/13548565231164759
Scott, J. C. (1998). Seeing like a state: How certain schemes to improve the human condition have failed. Yale University Press. Retrieved from https://theanarchistlibrary.org/library/james-c-scott-seeing-like-a-state
Summers, E. H. (2020). Legibility Machines: Archival Appraisal and the Genealogies of Use. Digital Repository at the University of Maryland. https://doi.org/10.13016/U95C-QAYR