Planet Code4Lib

Hugh Rundle: Aligning open education programs with academic reward and recognition

2024-11-15T00:00:00+00:00

Today a new open book, Open Education down undOER: Australasian case studies was launched at the OE Global conference in Brisbane. One of the chapters is by Steven Chang and me, describing some work and thinking we've done this year to reconfigure our open education program to make it more attractive to prospective authors. I'm pretty excited to share this work. I think we have found an approach that will help us to attract and incentivise better and more open textbook proposals and – perhaps more importantly – result in better textbooks and better teaching.

Like open access research, discussion of open education and open educational resources (OER) usually focusses on the social justice aspects and licensing technicalities. OER is promoted as good for students because they don't have to pay for textbooks, and good for academics because the license allows them to "remix" and adjust the books. This is all true, but the reality for most academics is that they feel overwhelmed by workload and if they care about education they care most about the effectiveness of the resources for helping students understand the concepts they want to teach. If it's free to read that's nice, and if it's "free as in speech" then they're not really sure why that matters. It's not that academics don't care about social justice or saving students money. But the primary goal for teaching-focussed educators is achieving educational outcomes, not being ideologically pure.

We've spent a couple of years grappling with how to make OER publishing attractive, and after a lot of consultation and listening, realised that focussing on the benefits to students was where we were going wrong. Students don't decide which textbook is assigned, they just have to use whatever their subject coordinator decides. Once we start thinking about the benefits of open education to educators, a whole bunch of opportunities appear. Openly-licensed teaching and learning resources are more flexible and adaptable than commercially-licensed texts, and educators can customise them to the curriculum rather than forcing the curriculum to align with a given text. Without commercial pressures, authors are free to publish something niche, or experimental. And OERs are a tangible object educators can point to when it comes time to demonstrate their contribution to education beyond student grades or the highly-flawed student satisfaction surveys. Crucially, we've seen that publishing open textbooks changes educators – they think about their work differently and have a more expansive view of education, impact, and collaboration. We would argue that they become better educators.

The key, in essence, is to stop talking about open resources and start talking about open education. More specifically: what is the educator's biggest teaching challenge, and can we use an open education approach to help them resolve that challenge?

This results in higher quality teaching, the OERs are still free to read, and we're aligning the way we describe the benefits with how teaching-focussed academics are rewarded and recognised by their institutions and peers.

We're still learning, but today in the spirit of open we're sharing our thinking and what we've learned, along with a rubric we'll be using to assess future proposals for our OE program. I'm really looking forward to seeing how it works out.

John Mark Ockerbloom: “The art of self-tormenting is an ancient one”

2024-11-14T14:49:30+00:00

Dorothy L. Sayers’ Omnibus of Crime was a selection of the then-still-new Book of the Month Club in 1929. It includes stories of detection, mystery and horror spanning over 2000 years, and opens with an extended essay by Sayers on the development and art of those genres that’s often been reprinted. Rights issues made this edition’s selections differ a bit from a differently-titled British edition, but both versions will definitely be public domain in the US in 48 days. #PublicDomainDayCountdown

HangingTogether: Transforming the library into a research support hub

2024-11-14T12:52:00+00:00

This post is part of a growing series on the Library Beyond the Library.

Libraries are increasingly engaged in partnerships with other units across campus to contribute to institutional needs in research support. In many cases, this means the establishment of new operational structures that extend beyond internal library hierarchies and allow libraries—and their partners—to synergistically support institutional priorities.

The Montana State University Research Alliance centralizes multiple research support units within the library, fostering collaboration and streamlining research support for the entire campus. Launched in 2023, this initiative eliminates the need for researchers to navigate a complex network of disparate resources. The shared space encourages deeper collaboration between units, enhances referral processes, and creates new connections.

The Research Alliance has increased awareness of library services and solidified the library’s role as a central hub for research support at MSU.

On 2 October 2024, the OCLC Research Library Partnership (RLP) hosted a webinar featuring Research Alliance partners. In this webinar, the partners offered candid reflections on the challenges and rewards of this innovative model after one year. The webinar recording offers details about the five stakeholder units’ perspectives on the creation of the Research Alliance. Their shared experiences may be of value to institutions considering a similar effort.

^{View the webinar Partnering across the research enterprise to create robust research support at Montana State University}

The Research Alliance

The MSU Research Alliance unites five campus units under a single roof to streamline and enhance research support:

Office of Research Development
Center for Faculty Excellence
Undergraduate Research
Research Cyberinfrastructure
Research Optimization, Analytics, and Data Services (ROADS) within the MSU Library

These units collectively support faculty, researchers, and students through the full research life cycle. Services include proposal development, data management and visualization, publication assistance, data sharing, and research analytics.

Professional staff from each unit are now co-located on the third floor of the MSU Library in a flexible space designed for consultations, workshops, and events. This co-location and unified branding increase visibility and create a seamless, integrated support system, reducing the need for users to navigate previously dispersed services.

_{Space configuration of the multi-unit Research Alliance on the 3rd floor of the Montana State University Renne Library}

The recent webinar featured perspectives from each of the Alliance’s stakeholder groups, with presentations by:

Jason Clark, Head of ROADS, MSU Library
Coltran Hophan-Nichols, Director of Systems and Research Computing
Nicole Motzer, Director, Office of Research Development
Chatanika “Nika” Stoop, Assistant Director, Center for Faculty Excellence
Anna Tuttle, Director, Undergraduate Scholars Program

Origin story

The idea for the Research Alliance originated nearly a decade ago with former University Librarian Kenning Arlitsch, who envisioned the library as a central hub for research support.

_{Photo by Tim Mossholder on Unsplash}

Realizing this vision required years of patient effort—socializing the idea, identifying partners, building relationships, securing buy-in from those partners, and eventually gaining institutional support and funding from the provost. This effort exemplifies what we call “social interoperability,” the building and maintaining of collaborative relationships across units to enable project success and user adoption.

OCLC Research Library Partnership affiliates have, as a benefit of membership, the opportunity to consult directly with RLP program officers, and I met with Jason Clark and others several times in 2023 to provide an external perspective on their effort. Jason noted in an earlier blog post that “Conversations with Rebecca crystallized our thinking about the Research Alliance partnership and helped us clearly define the MSU Library’s role. Moreover, her [social interoperability] workshop with the Alliance partners helped ground our work together, got us thinking about shared services and projects, and set us on our current path to a successful opening of the Research Alliance. . . .”

To make space for the Research Alliance, the library reallocated a student study area, appointing library faculty and staff as organizational leads to establish the Research Alliance’s home. This move raised valid concerns about potential impacts on library space and autonomy. However, the library’s leadership saw it as a bold step to cement its role as the physical and strategic center of research support.

Why the creation of a research hub matters

At most research institutions, research support services are scattered across campus, making it harder for students and researchers to find and use what they need. The Research Alliance addresses this by centralizing support.

MSU’s initiative also aligns with its strategic priorities as an R1 (very high research activity) institution. The university aims to strengthen its scholarly reputation and increase research impact. It also seeks to deliver high-impact teaching and is committed to providing undergraduates with early research opportunities.

Benefits, challenges, and lessons learned

The webinar presenters reflected openly on their first year of operations, offering insights that other institutions may leverage.

Benefits

Increased visibility and awareness of services: Co-location has increased awareness and use of campus services by researchers and students.
Centralized research support: Researchers no longer have to navigate multiple units spread across campus to find the research support services they need. Staff can make real-time connections with experts nearby, helping researchers take advantage of all the resources available.
Enhanced collaboration: Co-location has fostered informal relationship-building, helping staff gain deeper knowledge of each unit’s expertise. This leads to more informed referrals and improved service for faculty and students.
Competitive advantage: Improved convenience and access to research support services strengthen MSU’s competitiveness in a global research landscape.
Increased visibility for the library: The library is now physically and operationally positioned at the center of the research support activities at MSU.

Challenges

Space constraints: MSU’s expanding campus creates ongoing space challenges. While the multipurpose design of the Research Alliance space is inviting, events and meetings sometimes disrupt staff and users.
Leadership churn: Key leaders left MSU during the implementation of the Research Alliance, creating a temporary leadership gap that resulted in uncertainty and slowed progress.
Decision-making structure: While institutional hierarchies and reporting lines remain unchanged, the Alliance still lacks a formal decision-making framework for its confederated units. Work is underway to develop a rubric for equitable and effective decision-making.

Lessons learned

Articulating shared goals: While each unit has its own mission and goals, members of the Alliance have found that they need an apparatus for coordinating and asserting their shared vision and goals. They have facilitated this work through a strategic planning retreat and regular meetings.
Intentionality: Intentional engagement is key to maximizing the Alliance’s benefits. Not all team members can be co-located due to limited library space, which means some individuals are separated from their main unit. Regardless of where their desk permanently resides, it’s important to stay connected with their home unit, in addition to connecting with others in the Alliance space. For example, while Coltran Hophan-Nichols’s principal office remains off-campus, he holds weekly office hours and workshops at the Alliance and remains in the space afterward to work. This practice strengthens cross-unit relationships and fosters collaboration. Fun and informal events, like a Great British Baking Show bakeoff, have further strengthened both professional and social ties.

Synergies

Proximity has helped Research Alliance units collaborate in new ways that benefit each unit and the campus as a whole.

For instance, the Library’s ROADS unit and the Office of Research Development jointly created a partnership database to capture, track, and promote faculty engagement. This tool helps connect new faculty with potential collaborators across campus.
The Office of Research Development and the Center for Faculty Excellence have also co-hosted two grant writing bootcamps. Their close proximity and deeper awareness of campus services now streamline referrals, helping researchers access the resources and expertise needed for competitive proposals.

Situating the library as a hub for research support

By hosting the Research Alliance, the MSU Library has strategically positioned itself at the center of research support—both physically and in terms of its perceived role on campus.

Jason Clark highlighted this shift, noting, “The visibility, the interactions, the people moving into our space who haven’t really been a part of library visits–this is an understated benefit. The movement of administrators, new faculty, and established faculty into the library . . . is novel and important. These are stakeholders actually moving into our space.”

This shift is significant because many faculty may not have physically visited the library for years, particularly if they no longer use physical collections. As a result, they may not have perceived the library as a valuable contributor and stakeholder in research. Today, faculty come to the library not just for collections, but for research support—physically embodying the evolving, largely virtual role libraries play in research.

The Research Alliance exemplifies what we call “the library beyond the library,” a concept describing how library expertise and capacities are being combined with those from other campus units, often through transformative new collaborative structures. These partnerships extend the library’s role beyond traditional collections management, communicating a more complex value proposition to stakeholders who may be unfamiliar with how library skills align with institutional priorities. The Research Alliance visibly demonstrates and embodies the continued value and central role of the library in supporting institutional research.

Looking ahead

This initiative has improved research support at Montana State, breaking down silos and helping researchers connect with services that can boost their productivity. The team’s optimism is truly inspiring. As one member said, “I love us”—a powerful affirmation of the success of this cross-unit partnership.

What’s most exciting to me is how this effort has demonstrated the library’s value proposition to a host of campus stakeholders. By physically positioning the library as the hub of research support, the library powerfully asserts its central role in supporting institutional priorities. This has enhanced the library’s visibility among campus stakeholders, aligning its value proposition more closely with institutional priorities.

The post Transforming the library into a research support hub appeared first on Hanging Together.

Terry Reese: MarcEdit 7.7 Update

2024-11-13T19:29:58+00:00

The update work for MarcEdit 7.7 has been completed and posted.

MarcEdit 7.7 has been posted to the website. Additionally, on the download page, you’ll find access to the MarcEdit 7 VPAT. Information about the update:

Major Changes:

New Code Signing Certificate
Task List Control flow options
Compiled against .NET 8
Koha Plugin Authentication Changes
Plug-in restructure work
logfile, autosave, etc. cleanup
bug fixes

Download links can be found at: https://marcedit.reeset.net/downloads/

The VPAT can be found here: https://marcedit.reeset.net/software/marcedit_7_vpat.docx

And finally, I posted a video going over the changes here:

Lucidworks: Gen AI Implementation Costs Skyrocket: Navigating the AI Landscape in Manufacturing

2024-11-13T19:25:55+00:00

AI costs in manufacturing surged 14-fold, prompting strategic adoption. Discover insights from Lucidworks' report on trends and challenges.

The post Gen AI Implementation Costs Skyrocket: Navigating the AI Landscape in Manufacturing appeared first on Lucidworks.

Open Knowledge Foundation: Panel: The Tech We Want is Sustainable for People and the Planet

2024-11-13T16:48:32+00:00

The Tech We Want Summit took place between 17 and 18 October 2024 – in total, 43 speakers from 23 countries interacted with 700+ registered people about new practical ways to build software that is useful, simple, long-lasting, and focused on solving people’s real problems.

In this series of posts, OKFN brings you the documentation of each session, making the content generated during these two intense days of reflection and joint work accessible and open.

Above is the video and below is a summary of the topics discussed in:

[Panel 3] The Tech We Want is Sustainable for People and the Planet

17 October 2024 – 12:30 UTC

Eco, green, or simply sustainable technologies have several implicit meanings: long life, affordable maintenance, skilled people, resource-friendly, economical to use, renewable, regenerative, etc. In this panel, thinkers, practitioners and promoters of different aspects of software sustainability will discuss if and how it is possible to achieve a development model for people and the planet. Is there a way out of the disaster versus greenwashing narratives?

Christoph Becker – Professor, University of Toronto, author of ‘Insolvent: How to Reorient Computing for Just Sustainability‘
Maxwell Beganim – Director, Open Knowledge Ghana, and Co-lead of the Open Goes COP Coalition
Shweata Hegde – Developer at #semanticClimate and Young India Fellow at Ashoka University
Fieke Jansen – Co-principal Investigator, Critical Infrastructure Lab
Valmik Patel – Data Scientist, Patrick J. McGovern Foundation
Paz Peña – Independent consultant and activist, author of ‘Technologies for a Burning Planet‘
Lucas Pretti – Communications & Advocacy Director, OKFN [moderator]

Summary

This panel explored the multifaceted issue of achieving sustainability in technology. The lively discussion touched on several critical issues:

The inherent violence of the internet infrastructure: Fieke Jansen emphasised that the current internet infrastructure is built on a foundation of exploitation and violence, and called for a shift from the technocapitalist mindset of Silicon Valley to a more reparative and redistributive approach.
Energy efficiency vs. true sustainability: Christoph Becker explained that simply improving energy efficiency is not enough. He argued for sustainable technology that is simple, repairable and supports community and local economies – essentially the “bicycle” of the tech world.
The role of government and policy: Several contributions highlighted the central role of government intervention in regulating and guiding sustainable technology development. From enforcing end-of-life optimisation for hardware to investing in renewable technologies, the role of the state is indispensable.
Open source and the commons: The panel highlighted the importance of open source projects as a means of reducing waste, fostering community-driven innovation, and creating sustainable, collaborative technology solutions.
Indigenous knowledge and pluralism: Paz Peña and Lucas Pretti made a compelling case for integrating indigenous knowledge systems and plural perspectives into technology design and policy-making. Indigenous lands, which make up 6% of the Earth’s surface but contain 85% of its biodiversity, offer crucial lessons in sustainable living.
Technodiversity and anti-monopoly measures: Speakers called for a move away from tech monopolies towards the promotion of technodiversity. Supporting a variety of smaller, community-driven projects can provide more resilient and contextually appropriate technological solutions.
Practical tools and youth engagement: Shweata Hegde shared insights from the #semanticClimate project, which transforms static reports into dynamic, machine-readable formats, making climate knowledge more accessible and actionable. She also highlighted the importance of engaging youth in these initiatives to build a sustainable community for conservation and innovation.

Ultimately, the panel converged on the urgent need for systemic change – moving away from the unsustainable practices of the current technocapitalist framework towards a more equitable, just and sustainable technological future. This will require collective activism, policy intervention and a fundamental rethinking of what ‘development’ means in the context of a finite planet.

John Mark Ockerbloom: I yam what I yam, kinda

2024-11-13T16:26:10+00:00

Thimble Theatre was a 10-year-old comic with waning readership when its lead character Castor Oyl hired a wisecracking sailor to crew a ship he’d bought. Popeye left after their ocean voyage ended, but audience appeal brought him back after a few weeks away. He wasn’t fully developed in 1929, lacking spinach and not yet Olive Oyl’s beau, but would soon be star of the strip and of an ongoing multimedia franchise. In 49 days, his earliest adventures will be public domain. #PublicDomainDayCountdown

Mita Williams: The City As Classroom vs. The City As Advertising Platform

2024-11-13T02:19:17+00:00

On May 2nd, 2016 I had the pleasure of speaking to York University Libraries as part of their Library Futures Series. This is what I said all those years ago.

HangingTogether: Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 12 November 2024

2024-11-12T21:40:24+00:00

The following post is one in a regular series on issues of Inclusion, Diversity, Equity, and Accessibility, compiled by a team of OCLC contributors.

2024 International Conference of Indigenous Archives Library and Museums

In the United States, November is celebrated as Native American Heritage Month. The International Conference of Indigenous Archives Library and Museums is traditionally held in November, hosted by the Association of Tribal Archives, Libraries and Museums (ATALM). This week, conference goers will meet in Palm Springs, California for a conference that incorporates workshops, tours and cultural events alongside conference sessions and awards.

^{US Department of Education, CC BY 2.0, via Wikimedia Commons}

I have been fortunate to attend the ATALM conference in the past and have learned so much at this very rich gathering. I look forward to hearing about the outcomes of sessions that intersect with library and archival practice. In addition to the session that I referenced in the last IDEAs post, I’ll mention a few other tantalizing sessions I’m interested in: Library of Congress Indigenous Headings Project and Community Engagement; Revising Metadata Standards: Library of Congress Task Group Listening Session; Allies in a Shared Vision: State Library Support for Tribal Libraries; Indigenizing Archival Training: Reflecting on New Models for Training/Archival Principles; and Strengthening Library Liaison Relationships + Elevating Indigenous Knowledge Systems. Contributed by Merrilee Proffitt.

IDEAs in Library Resources & Technical Services special issue

The October 2024 (Volume 68, Number 4) issue of Library Resources & Technical Services is devoted entirely to inclusion, diversity, equity, and access. As the editors Rachel E. Scott (Associate Dean for Information Assets at Illinois State University, OCLC symbol: IAI) and Michael Fernandez (Head of Technical Services at Boston University Libraries, OCLC symbol: BOS) write, these issues have preoccupied librarians for some time and get to the core of all aspects of library work and the philosophy that undergirds libraries themselves: “As is often the case, the impacts may not be immediately evident within technical services work, but there are numerous avenues for technical services workers to foreground principles of IDEA and adapt a mindset for advocacy.” Reflective of this preoccupation, the editors did not even need to call for proposals because IDEA – however one may refer to or express the notion – has become a library priority. In this LRTS issue, you will find useful and timely pieces on book challenges, textbook affordability, inclusivity, the treatment of name changes, discoverability of LGBTQ+ resources, and a host of other topics.

For over sixty years, LRTS was the official journal of ALA’s Association for Library Collections and Technical Services (ALCTS). In 2020, it became the publication of ALA’s newly formed Core: Leadership, Infrastructure, Futures division. Since then, LRTS has moved from a subscription model to open access, which is itself a manifestation of IDEA practices. In the wake of the 2024 election in the United States, ALA President Cindy Hohl has noted that ALA will continue to defend library values: “We know that many of our members are concerned that the election results portend attacks on libraries, library workers, and readers. Whatever happens, ALA will stand up for all Americans’ freedom to read—and we will need everyone who loves libraries to stand with us.” Inclusion, diversity, equity, and access will remain central to what libraries are and do. Contributed by Jay Weitz.

Neuroinclusive program provides future librarians with tools to succeed

Johanna M. Jacobsen Kiciman and Alaina C. Bull continue their discussion of their redeveloped learning employment program for MLIS students at the University of Washington-Tacoma (OCLC Symbol: WAU). In “Apprenticeships, MLIS Students, and Neurodiversity: Centering the Humanity of Student Workers, Part 2” (College & Research Libraries News, Volume 85, Number 10, November 2024), the authors discuss how they incorporated reflective practices into the program like finishing meetings by asking, “What do you need from us to make what you’re working on go smoothly?” These reflective practices are neuroinclusive because they teach the students self-reflection and advocating for their own needs. The authors note how their program prepares future librarians to deal with burnout, creating future professional success: “We deeply believe that other institutions should be doing this kind of work. We deeply believe that what you are doing in this sort of apprenticeship program is creating healthy colleagues who you will work with in the future, and who will impact your own well-being in a job environment.” The article is a continuation of Part 1, which appeared in the October 2024 issue of College & Research Libraries News and was covered in the 15 October edition of Advancing IDEAs.

This article is a wonderful demonstration of how neuroinclusivity benefits everyone. The library has more motivated and skilled student workers, the supervisors understand better how to help the students be more successful, and the library profession gains new members who better understand how to prevent burnout. Most importantly, the student workers felt included and empowered. One student said of the positive impact of the program, “Working with [Alaina and Johanna] was the first time where I felt that my neurodivergence wasn’t just tolerated, but actually understood and even embraced. Instead of masking constantly, I was able to work more comfortably (and, as a result, better).” Contributed by Kate James.

The post Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 12 November 2024 appeared first on Hanging Together.

John Mark Ockerbloom: The original “Middletown” study in the public domain

2024-11-12T18:47:29+00:00

In the 1920s, Robert and Helen Lynd conducted a study of white residents of Muncie, Indiana. Their 1929 book reporting on the social dynamics of the town’s “working class” and “business class” became a surprise best-seller, inspired many followup studies, analyses, and retrospectives, and did much to shape how many educated Americans viewed “middle America”. The Lynds’ Middletown: A Study in Contemporary American Culture joins the public domain in 50 days. #PublicDomainDayCountdown

Eric Hellman: Running away from home

2024-11-12T16:22:54+00:00

(I'm blogging my journey to the 2024 New York Marathon.)

For a long time, it's been a goal of mine to live and work someplace where the language is something other than English. I've studied French in school and I've studied a bit of Mandarin and Japanese. And Swedish. But I'd never had the opportunity to live in another language, to get comfortable enough to have casual conversations and say the things I want to say.

Two years ago (2022) my Aunt Siv planned an 80th birthday celebration for herself, inviting the whole family to join her for a party in Lappland (northern Sweden). Coming out of two long pandemic years, we were eager to go and travel. There was still a lot of uncertainty about Covid, and with the invasion of Ukraine adding to the feeling that the trip might or might not happen, we booked refundable tickets for a vacation in Sweden.

Swedish was my first language! My parents both grew up in Sweden, but met and married in Ohio. My mom's teenaged sister Siv came over to help my mom with the baby (me) so there was a lot of Swedish in the house. When I started going to nursery school I quickly learned English, and began refusing to speak Swedish. By the time I got to kindergarten, I had completely forgotten all of my Swedish language. But traces remained. After college I decided I should learn Swedish and I took a class in Stockholm. Learning Swedish was completely different from learning French in school, because I could hear in my head if it was right. After one day of class, I could speak 2 sentences of perfect Swedish. I confidently went into a shop, used my 2 perfect sentences, and got into deep trouble because I had no clue what the answers meant. I had a good accent without much trying. This has been very helpful, because when Swedes hear a foreigner try to speak Swedish, they immediately switch to English, making it rather difficult for the foreigner to learn. Not me. Swedish people are amazed that I seem to be able to speak good Swedish.

I wanted to improve my Swedish, so I wanted a little longer in Sweden than the rest of the family, and our planning took its final shape when my wife said "Eric, you should just stay! For years you've been saying you want to live somewhere in another language, and now the internet lets you work from wherever you want!" So all of a sudden I was going to spend four weeks in Stockholm on my own without much of a plan. I was scared. How would I meet people? Sure, I could sit in my AirBnB and work as a digital nomad, but what would be the point?

Running was one of the answers. There was a half-marathon to run, RUNmaröloppet, that would take me out to an island in Stockholm's archipelago. I had identified a running club, Mikkeller Running Club Stockholm, that seemed sociable, as they meet at a bar on Tuesdays and have beers afterward. Both of these turned out to be awesome. And so I started running away from home.

Running with a group is universal and local at the same time. No matter where you run you can have the same conversations with whoever's running next to you. "Are you training for a race?" "My legs are so stiff." "I'm recovering from an IT-band strain." "My name is Eric, have we run together before?" But every route you run is different in its own beautiful way, and the group helps newcomers (and often the regulars!) to avoid getting lost. By the end of the run, the group has shared an indelible experience and there aren't strangers anymore.

RUNmaröloppet was a blast. You have to take a boat to the island. The course is quite technical in places and is also the most beautiful race I've ever run. I did it again this year, and finished 5th in my age group, despite a lingering knee injury that forced me to use walk-run again. Full disclosure: I also finished DFL (Dead F-in Last) out of 282 runners, and was never so happy with a finish.

Mikkeller Running Club Stockholm meets every Tuesday on the lively urban island of Södermalm. Good people, good beer, 5K, 7K and longer routes. The 5K is at a "cozy" pace and welcomes runners of all paces. (Linguistic note: back home we call it "sexy" pace. Maybe this has deep sociological meaning. Or maybe it's the conversion from km to mi.)

In Stockholm I discovered this thing called ParkRun. These people have taken "running away from home" to extremes. ParkRun started somewhere in England and has spread around the world like a pandemic. They have special t-shirts to commemorate milestones such as a runner's 100th ParkRun. I've now run the ParkRun in Stockholm's Haga Park 6 times. It's a timed 5K run. At every run there are people from all over the world - last week I met a couple from Sheffield who had hopped off their cruise ship and took a taxi to the ParkRun so they could add Sweden to their list of ParkRun countries. Some of them even try to run ParkRun places starting with every letter of the alphabet! I love how crazy runners can be.

My Stockholm 2022 sojourn was topped off by a 10K race around Södermalm called "Midnattsloppet". Midnattsloppet is sort of a night-time EuroPop Bay-to-Breakers. 22,000 runners in the 10K, another 17K in the 5K. There was a musical act every kilometer to fire up the runners but only two water stations on that pretty warm night. At the top of the first big hill, there was a choir of ~20 blonde women singing “Waterloo” which I thought a poor choice given the pre-ABBA history of Waterloo. The faster waves of runners got “We are the Champions”. At the start, runners were prompted to sing a song which apparently is the anthem of the Hammarby Football Club, written by a guy who must have been the guitarist for a Swedish Spinal Tap. Apparently he caused a scandal by wearing a "69" T-shirt on Swedish television and sadly died at a young age. On Midnattsloppet night you can walk into any bar in Stockholm in a shirt dripping with sweat and the bouncer will say "Good Jobb!". (I verified this.)

I now have a pair of ruby red New Balance 1080 version 12s. (NOT v13!) My running gait is such that there's a flat wear spot where my feet click together. There's no place like home. There's no place like home.


Eric Hellman: All the streets in Montclair

2024-11-12T16:22:10+00:00

(I'm blogging my journey to the 2024 New York Marathon.)

At the end of 2020, Strava told me I had run 1362 miles over 12 months. "I hope I never do that again!" I told a running friend. It seemed appropriate that my very last running song from shuffle was Fountains of Wayne's "Stacy's Mom"; founding member Adam Schlesinger had died of Covid. For months of that pandemic year, there wasn't much to do except work on my computer and run. It was boring, but at the same time I loved it. In retrospect, the parks needn't have closed (or later in the year, required masks while running through). Remember how we veered around other people just enjoying fresh air? In that year, running was one thing that made sense. But never have I celebrated the new year as joyously as I did on the eve of 2021. Vaccines were on the way, the guy who suggested drinking bleach was heading to Florida, and I had a map of Montclair to fill in.

I've called Montclair, the New Jersey town where I live, "a running resort". It has beautiful parks, long, flat tree-lined streets without much traffic, short steep streets for hill work, well maintained tracks, a wonderful running store, and at least 3 running clubs. During the pandemic, everyone seemed to be out running. Even my wife, who for many years would tell me "I don't understand how you can run so much", started running so much. At Christmas our son gave us both street maps of Montclair to put on the fridge so we could record our running wanderings.

So, come 2021 the three of us said goodbye to the boring routine of running favorite routes. Montclair has 363 streets, and a couple of named alleys, so we could have done a street a day for a year if we had wanted to. But it was more fun to construct routes that crossed off several streets at a time. While I was at it, I could make Strava art or spell words. Most of my running masterpieces were ex post cursus pareidolia. Occasionally I spelled out words. Here's "love" (in memory of a running friend's partner).

Starting on New Year's Day with the Resolution run up "Snake Hill", I methodically crossed off streets. I passed Yogi Berra's "Fork in the Road". I finished the complete set of Montclair streets on June 13.

The neighboring town of Glen Ridge came quickly on July 18, as I had done well over half on the way to Montclair streets. Near me, Glen Ridge is only 2 and a half blocks wide! I took a peek at the Frank Lloyd Wright house on a street I'd not been on before!

With five months left I started on Bloomfield, the next town east. Bloomfield is cut in half by the Garden State Parkway, the source of the "which exit?" joke about New Jersey, and I focused on the half near to me. I got to know Clark's Pond. My streets running helped me set my half marathon PR, in the lovely town of Corning, New York.

I know of other streets running completists - it seems there's even an app to help you do it. Author Laura Carney wrote about it in her book "My Father's List". My friend Chris has continued to add towns and cities to his list and has only 9 streets left to finish ALL OF ESSEX COUNTY. Update: He finished! and was written up by nj.com!

To finish the year I spelled out 2021.

2021: 1,268.3 miles, 223 hours 36 minutes, 40,653 ft vertical. I ran to 1,700 different songs. Last running song of the year (on shuffle): Joy Division's "No Love Lost":

Wishing that this day won't last
To never see you show your age
To watch until the beauty fades


Eric Hellman: Running Song of the Day

2024-11-12T16:21:38+00:00

(I'm blogging my journey to the 2024 New York Marathon.)

Steve Jobs gave me back my music. Thanks Steve!

I got my first iPod a bit more than 20 years ago. It was a 3rd generation iPod, the first version with an all-touch control. I loved that I could play my Bruce, my Courtney, my Heads and my Alanis at an appropriate volume without bothering any of my classical-music-only family. Looking back on it, there was a period of about five years when I didn't regularly listen to music. I had stopped commuting to work by car, and though commuting was no fun, it had kept me in touch with my music. No wonder those 5 years were such a difficult period of my life!

Today, my running and my music are entwined. My latest (and last 😢) iPod already has some retro cred. It's a 6th generation iPod Nano. I listen to my music on 90% of my runs and 90% of my listening is on my runs. I use shuffle mode so that over the course of a year of running, I'll listen to 2/3 of my ~2500 song library. In 2023, I listened to 1,723 songs. That's a lot of running!

Yes, I keep track. I have a system to maintain a 150 song playlist for running. I periodically replace all the songs I've heard in the most recent 2 months (unless I've listened to the song fewer than 5 times - you need at least that many plays to become acquainted with a song!). This is one of the ways I channel certain of my quirkier programmerish tendencies so that I project as a relatively normal person. Or at least I try.
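The rotation rule described above could be sketched in Python roughly as follows. This is an illustration only, not the actual system: the `Song` fields, the function names, the 60-day reading of "2 months", and the choice to pick replacements at random from songs not already on the playlist are all assumptions.

```python
import random
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

PLAYLIST_SIZE = 150                  # playlist size given in the post
RECENT_WINDOW = timedelta(days=60)   # assumption: "2 months" read as 60 days
MIN_PLAYS = 5                        # songs with fewer plays stay on the list


@dataclass
class Song:
    # Hypothetical record; the field names are illustrative only.
    title: str
    play_count: int = 0
    last_played: Optional[date] = None


def should_rotate_out(song: Song, today: date) -> bool:
    """True if the song was heard recently AND has at least MIN_PLAYS plays."""
    if song.last_played is None:
        return False
    heard_recently = today - song.last_played <= RECENT_WINDOW
    return heard_recently and song.play_count >= MIN_PLAYS


def refresh_playlist(playlist: List[Song], library: List[Song], today: date,
                     rng: Optional[random.Random] = None) -> List[Song]:
    """Drop recently heard, well-known songs and top the list back up."""
    rng = rng or random.Random()
    kept = [s for s in playlist if not should_rotate_out(s, today)]
    # Assumption: replacements are drawn at random from library songs
    # that were not on the playlist before the refresh.
    on_playlist = {s.title for s in playlist}
    candidates = [s for s in library if s.title not in on_playlist]
    needed = max(0, min(PLAYLIST_SIZE - len(kept), len(candidates)))
    return kept + rng.sample(candidates, needed)
```

Calling `refresh_playlist(playlist, library, date.today())` returns a new list and leaves the inputs unchanged. A song heard recently but played fewer than five times stays on the list, which matches the "become acquainted" exception.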

Last November, I decided to do something new (for me). I made a running playlist! Carefully selected to have the right cadence and to inspire the run! It was ordered to have particular songs play at appropriate points of the Ashenfelter 8K on Thanksgiving morning. It started with "Born to Run" and ended with either "Save it for Later", "Breathless" or "It's The End Of The World As We Know It", depending on my finishing time. It worked OK. I finished with Exene. I had never run with a playlist before.

Last year, I started to extract a line from the music I had listened to during my run to use as the Strava title for the run. Through September 3, I would choose a line from a Springsteen song (he had to take a health timeout after that). For my New Year's resolution, I promised to credit the song and the artist in my run descriptions as well.

I find now that with many songs, they remind me of the place where I was running when I listened to them. And running in certain places now reminds me of particular songs. I'm training the neural network in my head. I prefer to think of it as creating a web of connections, invisible strings, you might say, that enrich my experience of life. In other words, I'm creating art. And if you follow my Strava, the connections you make to my runs and my songs become part of this little collective art project. Thanks!


Eric Hellman: We'll run 'til we drop

2024-11-12T16:20:46+00:00

(I'm blogging my journey to the 2024 New York Marathon.)

It wasn't the 10 seconds that made me into a runner.

I started running races again 20 years ago, in 2004. It was a 10K sponsored by my town's YMCA. I had run an occasional race in grad school to join my housemates; and I continued to run a couple of miles pretty regularly to add some exercise to my mostly sitting-at-a-computer lifestyle. I gradually added 10Ks - the local "turkey-trot" because the course went almost by my house - and then a "cherry-blossom" run, through beautiful Branch Brook Park. But I was not yet a real runner - tennis was my main sport.

In 2016, things changed. My wife was traveling a lot for work, and one son was away at college, and I found myself needing more social interaction. I saw that my local Y was offering a training program for their annual 10K, and I thought I would try it out. I had never trained for a race, ever. The closest thing to training I had ever done was the soccer team in high school. But there was a HUGE sacrifice involved - the class started at 8AM on Saturdays, and I was notorious for sleeping past noon on Saturdays! Surprise, surprise, I loved it. It was fun to have people to run with. I'm on the silent side, and it was a pleasure to be with people who were comfortable with the somewhat taciturn real me.

I trained really hard with that group. I did longer runs than I'd ever done, and it felt great. So by race day, I felt sure that I would smash my PR (not counting the races in my 20's!). I was counting on cutting a couple of minutes off my time. And I did it! But only by a measly 10 seconds. I was so disappointed.

But somehow I had become a runner! It was running with a group that made me a runner. I began to seek out running groups and became somewhat of a running social butterfly.

Fast-forward to five weeks ago, when I was doing a 10-miler with a group of running friends (A 10 miler for me, they were doing longer runs in training for a marathon). I had told them of my decision to do New York this fall, and they were soooo supportive. I signed up for a half marathon to be held on April 27th - many of my friends were training for the associated full marathon. The last 2 miles were really rough for me (maybe because my shoes were newish??) and I staggered home. That afternoon I could hardly walk and I realized I had strained my right knee. Running was suddenly excruciatingly painful.

By the next day I could get down the stairs and walk with a limp, but running was impossible. The next weekend, I was able to do a slow jog with some pain, so I decided to stick to walking, which was mostly pain-free. I saw a PT who advised me to build up slowly and get plenty of rest. It was working until the next weekend, when I was hurrying to catch a train and unthinkingly took a double step in Penn Station and re-sprained the knee. It was worse than before and I had only 3 weeks until the half marathon!

The past three weeks have been the hardest thing I've had to deal with in my running "career". I've had a calf strain, IT-band strains, back strains, sore quads, inter-tarsal neuromas and COVID get in the way of running, but this was the worst. Because of my impatience.

Run-walk (and my running buddies) were what saved me. I slowly worked my way from 2 miles at a 0.05-to-0.25 mile run-to-walk ratio up to 4 miles at 0.2-to-0.05 mile run-to-walk, with 2 days of rest between each session. I started my half marathon with a plan to run 2 minutes and walk 30 seconds until the knee told me to stop the running bits. I was hoping for a 3 hour half.

The knee never complained (the rest of the body complained, but I'm used to that!!) I finished with the very respectable time of 2:31:28, faster than 2 of my previous 11 half marathons. One of my friends took a video of me staggering over the finish.

I'm very sure I don't look like that in real life.

Here's our group picture, marathoners and half-marathoners. Together, we're real runners.

After this weekend, my biggest half marathon challenge to date, I have more confidence than ever that I'll be able to do the New York Marathon in November - in one piece - with Team Amref.

We're gonna get to that place where we really wanna go and we'll walk in the sun

Jim Thorpe Half Marathon 2024 results.

My half on Strava.


Eric Hellman: Thank you, New York City

2024-11-12T16:18:46+00:00

fresh off the bus

It was 11:15AM in the pink D corral of the fifth wave, and surrounding me were runners of all shapes and sizes, from around the world, all of us waiting for our race to start in 15 minutes. We had waited through the morning (five hours for me) as our faster friends drifted away excitedly and cannons sounded the starts of earlier waves. There was a determined silence as each of us thought ahead to our 2024 New York City Marathon.

A few meters to my right I saw a woman wearing a large pink button proclaiming her status as a "Birthday Girl". Her shirt had the name "HEATHER" across the front. I shouted "HAPPY BIRTHDAY HEATHER!", and she turned to look at me, a bit startled. I walked over and we chatted a bit. She was from the UK, and was running New York to celebrate turning 50. I told her she was going to have fun, and that the crowd would be calling to her the whole way. "Really?" she said. "Hey, this is New York", I reassured her. "You don't have to know someone 10 years before you can talk to them on a first name basis!"

Then, over to the side of the corral, I saw another woman, wearing a BIRTHDAY GIRL shirt. "Heather, you must go over and wish her happy birthday!" Heather hesitated, but I said "Aw come on!" and led her through the crowd to the other birthday girl. The two marathon twins hugged, and everything felt right with the world. I looked around and the crowd seemed a bit anxious waiting. I shouted "Hey everyone! We have two birthday girls running with us! Let's sing Happy Birthday!"

And so I led a happy chorus of more than a thousand runners in a joyful rendition of "Happy Birthday". Miraculous. My whole day was like that. From start to end, the crowd was shouting my name. They got riled up when I acknowledged them, sometimes chanting "ERIC, ERIC, ERIC" as I gave them high fives.

I had decided to run the 2024 New York City Marathon about ten months earlier. A friend heard me talk about running and suggested that I get a fundraising entry through the charity he was involved with. At that point I had just run my 11th Half Marathon but never a marathon. A marathon seemed an unnecessary stretch for me and my creaky legs. But I decided in an instant. Two days later I told a running friend, Janell, and a few others about my decision. I knew I couldn't back out after that.

still looking good at mile 9

The first 10 miles of the race flew by as I ran at a pace that was faster than I expected (I was doing a 3:1 run:walk). Axel and Karen were there rooting for me at mile 9 with my Fleet Feet friends and then again around mile 12. The crowd on 1st Avenue at mile 16 made me forget that I had never raced that far. More running friends were waiting at mile 18 where it really helped. At mile 21 my 3:1 cycle became 2:1, and at mile 23 it was 1:1. On Fifth Avenue it seemed like everyone I knew was there cheering me on. The bearded prophet with "The End is Near" on a sign could have been a hallucination. Coming out of the Bronx I had switched to my running playlist, and in the Park I started "singing" the lyrics out loud: "It's the End of the World As We Know It!". I wasn't feeling that fine and I switched to 100% brisk walk.

Re-entering the park for the last half mile, I was determined to finish it running. BIG MISTAKE! I cramped up immediately and could barely stagger on. But after a few minutes, my legs consented to a sloooow walk and finally relented on a brisk finish. Then a second miracle occurred. I knew I had friends who were volunteering at the finish line, but to see and hug them all was a blessing I had not expected. And to get the medal from my friend Janell!

Back of the medal with braille text "TCS New York City Marathon"

Thank you to everyone who donated to my fundraiser for Amref Health Africa. Thank you to Karen and Axel for getting me home with my cramping legs. Thank you to the coaches, runners and PTs who helped me get through the training. Thank you to all the spectators and to the volunteers who got me from the start to the finish, and thank you to the zombies that trudged with me for the long long long walk out of the park.

Strava: All my friends are in New York


John Mark Ockerbloom: A Farewell to Arms

2024-11-11T15:57:32+00:00

“If you look at Hemingway’s prose and the writing he did about war, it was as radical in its time as anything we have seen since,” wrote critic Gail Caldwell, quoted by Thomas Putnam in 2006 in a piece about Hemingway’s wartime experience and writing. But the prose did not come as simply as it may look. For A Farewell to Arms, his best-known war novel, Hemingway wrote at least 47 versions of the ending. The version he published in 1929 becomes public domain in 51 days. #PublicDomainDayCountdown

Digital Library Federation: Announcing Incoming NDSA Coordinating Committee Members for 2025-2027

2024-11-11T14:00:43+00:00

Please join me in welcoming the three newly elected Coordinating Committee members: Kari May, Margo Padilla, and Sylvia Umana. Their terms begin January 1, 2025 and run through December 31, 2027.

Kari May

Kari is a full-time digital preservationist for the University of Pittsburgh Library System. She became one of the university’s NDSA representatives and a member of the Excellence (Innovation) Awards Working Group (EAWG) in 2019. In 2023, Kari became a Co-Chair for the EAWG and has sought to increase transparency and ensure equity and inclusion in all aspects of EAWG processes by initiating new activities and encouraging more standardization in completing and documenting the awards cycle. Kari has also been a member of the NDSA DigiPres Planning Committee (PC) for 2022 and the 2023 Storage Survey Working Group and is currently a member of the Events Strategy Working Group. Her work with other professional organizations includes Co-Chair of the 2025 BPE Program Committee, member of DLF PC 2020-2024, member of LD4 PC 2022, Digital Preservation Coalition Digital Preservation Awards guest Judge 2022, and member of SAA Collection Management Steering Committee 2023-2025.

Kari feels that digital stewardship challenges continue to expand and require professionals to provide creative solutions supported by limited resources. Working with the Coordinating Committee would offer an opportunity to encourage valuable connections throughout the field of digital stewardship and offer strategies to foster collaboration to maximize benefits for all.

Margo Padilla

Margo Padilla is the Digital Preservation Librarian at New York University where she unifies strategies and processes across the Division of Libraries to facilitate the preservation of digital resources. Prior to NYU, she was the Digital Archivist at the New-York Historical Society where she led the development of infrastructure for collecting, preserving, and providing access to born-digital collections. Margo recently served as a member of the National Best Practices for Archival Accessioning Working Group born-digital accessioning and digital preservation subgroup, and previously participated in Collective Responsibility: National Forum on Labor Practices for Grant-Funded Digital Positions.

Margo received her MLIS with a concentration in Management, Digitization, and Preservation of Cultural Heritage and Records from San José State University and her undergraduate degree from the University of California, Berkeley. Margo is interested in furthering the conversation on reliance on contingent labor in cultural heritage organizations, as well as advancing digital preservation best practices that can be realistically implemented by differently resourced institutions. She brings active engagement to committee work and believes the value of NDSA membership is derived from the collective dedication of the digital preservation community, as exemplified by the Interest and Working Groups.

Sylvia Umana

Sylvia is a dedicated Digital Collections Librarian at the Namibia University of Science and Technology Library with a deep passion for her role in preserving and managing digital assets. She earned a Master’s degree in Library and Information Science from the University of Namibia in 2020, where her research focused on the digital preservation of institutional repositories. In her role as Digital Collections Librarian, Sylvia has worked on various digitization projects, including collaborations with the National Archives of Namibia and the Desert Research Foundation of Namibia. She is committed to advancing her knowledge of active digital preservation and continues to explore new approaches with the aim of implementing them in her organization.

With a strong commitment to safeguarding digital collections for future generations, she is eager to expand her expertise and contribute to the evolving field of digital preservation and information management especially in developing countries such as Namibia. Her enthusiasm for learning and her attention to detail drive her mission to ensure the longevity and accessibility of valuable digital resources.

We are also grateful to all of the very talented, qualified candidates who participated in this election.

We are indebted to our outgoing Coordinating Committee members, Stacey Erdman, Jenny Mitcham, and Hannah Wang, for their dedicated and thoughtful leadership, service, and contributions. To sustain a vibrant, robust community of practice, we rely on and deeply value the contributions of all members, including those who took part in voting.

~ Shira Peltzman, Vice Chair

On behalf of the NDSA Coordinating Committee

The post Announcing Incoming NDSA Coordinating Committee Members for 2025-2027 appeared first on DLF.

John Mark Ockerbloom: The debut of a long career in mystery and romantic suspense

2024-11-11T01:08:33+00:00

Mignon G. Eberhart published more than 50 books over a 60-year career, and was named a Grand Master by the Mystery Writers of America in 1971. Her first novel, The Patient in Room 18, introduces nurse Sarah Keate and her detective boyfriend Lance O’Leary, as they puzzle out what’s behind mysterious deaths in a hospital ward. A 2023 Time magazine panel named it one of the 100 best mystery and thriller books of all time. It joins the public domain in 52 days. #PublicDomainDayCountdown

John Mark Ockerbloom: Not eliminating the impossible

2024-11-09T23:48:04+00:00

By 1929, Arthur Conan Doyle had retired Sherlock Holmes, and his stories had more fantastical elements than Holmes would have put up with. The title story of The Maracot Deep and Other Stories involves encounters with supernatural beings in Atlantis. “The Disintegration Machine”, another story in the collection, and his last featuring Professor Challenger, deals with an invention not unlike Star Trek’s later transporter. The book joins the public domain in 53 days. #PublicDomainDayCountdown

Ed Summers: Love is

2024-11-09T08:00:00+00:00

Love is

John Mark Ockerbloom: Ain’t these tears in these eyes tellin’ you?

2024-11-08T20:46:10+00:00

Warner Brothers’ full-color 1929 musical film On With the Show featured Ethel Waters singing “Am I Blue?”, a song so pervasive that it was also in 3 other films that year. Singers that have since covered this standard include Billie Holiday, Eddie Cochran, Ray Charles, Cher, Bette Midler, and Linda Ronstadt. It’s also been in later films like To Have and Have Not, Funny Lady, and The Cotton Club. The song and the movie it debuted in join the public domain in 54 days. #PublicDomainDayCountdown

David Rosenthal: Nvidia vs. Intel

2024-11-08T16:21:27+00:00

NV1-based Diamond Edge
Swaaye, CC-By-SA 3.0

Today Nvidia replaced Intel in the Dow Jones Industrial Average with a market cap of about $3.6T, about the same as Apple, as against Intel's market cap about 33 times less.

That is a long way from Curtis Priem's kitchen table, a $2.5M A-round from Sutter Hill and Sequoia, and the NV1.

Ed Summers: Hope in the Dark

2024-11-08T08:00:00+00:00

Hope in the Dark

“Cause and effect assume history marches forward, but history is not an army. It is a crab scuttling sideways, a drip of soft water wearing away a stone, an earthquake breaking centuries of tension. Sometimes one person inspires a movement, or her words do decades later; sometimes a few passionate people change the world; sometimes they start a mass movement and millions do; sometimes those millions are stirred by the same outrage or the same ideal, and the change comes upon us like a change of weather. All that these transformations have in common is that they begin in hope. To hope is to gamble. It’s to bet on the future, on your desires, on the possibility that an open heart and uncertainty is better than gloom and safety. To hope is dangerous, and yet it is the opposite of fear, for to live is to risk.”

John Mark Ockerbloom: All singing! All dancing!

2024-11-07T22:38:32+00:00

In 1929, just two years after The Jazz Singer introduced synchronized sound to theaters nationwide, The Broadway Melody was released as a full-length movie musical with synchronized sound nearly throughout. One sequence was even in Technicolor.

The movie won the first best-picture Oscar awarded to a sound film. Despite its fame and technical innovation, we won’t see it in its full glory when it joins the public domain in 55 days: the Technicolor version is now lost. #PublicDomainDayCountdown

HangingTogether: Nieuw OCLC Research-rapport over Open Access gelanceerd

2024-11-07T18:41:32+00:00

Ons onderzoeksrapport over Verbetering van de vindbaarheid van Open Access voor gebruikers van academische bibliotheken is recent gepubliceerd. Het is een onderzoek naar strategieën om wetenschappelijke, peer-review open access (OA)-publicaties beter vindbaar te maken voor bibliotheekgebruikers. De bevindingen zijn gebaseerd op onderzoek door zeven academische instellingen binnen Nederland. We hebben bibliotheekpersoneel geïnterviewd over hun inspanningen rondom de vindbaarheid van OA en bibliotheekgebruikers bevraagd over hun ervaringen met OA. De synthese van deze bevindingen biedt nieuwe inzichten in de mogelijkheden om OA-vindbaarheid te verbeteren.

Van OA-beschikbaarheid naar vindbaarheid: de kloof overbruggen

Vanaf het begin hebben we het OA-onderzoek opgezet en uitgevoerd in samenwerking met twee Nederlandse academische bibliotheekconsortia—UKB (Universiteitsbibliotheken en Nationale Bibliotheek) en SHB (Samenwerkingsverband Hogeschoolbibliotheken). Zij spelen een belangrijke rol naar volledige OA van Nederlandse wetenschappelijke publicaties. Juist omdat zij vooropliepen bij de overgang naar OA en veel investeerden in OA-publicaties, wilden zij de vindbaarheid van OA-publicaties beoordelen. Daarnaast wilden zij de opkomende kloof tussen OA-beschikbaarheid en vindbaarheid aanpakken.

Deze kloof kwam voor het eerst aan het licht door bevindingen uit de OCLC Global Council-enquête van 2018-2019 over open content in bibliotheken wereldwijd. De resultaten lieten een disbalans zien binnen de investeringen van academische bibliotheken: er werd meer moeite gestoken in eerder gesloten content te openen dan in het promoten van de vindbaarheid van open content. Toch gaven de meeste respondenten aan dat dit voor hen net zo belangrijk was. Ook opmerkelijk was de bijna unanieme mening dat OCLC een ondersteunende rol speelde in het vindbaar maken van open content van bibliotheken. Een bevestiging van het belang van de rol van OCLC in het open access-ecosysteem.

Een reeks kennisdelingsconsultaties van de Nederlandse academische bibliotheekgemeenschap in 2021 bevestigde deze kloof. Ook kwam hieruit het belang naar voren om de rol van OA in het zoekgedrag van gebruikers beter te begrijpen. Als gevolg hiervan besloten UKB, SHB en OCLC een onderzoek uit te voeren om te onderzoeken hoe de verwachtingen en gedragingen van academische studenten, docenten, onderzoekers en hoogleraren bibliotheken kunnen helpen bij hun inspanningen met betrekking tot verbetering van OA-vindbaarheid. Tot zover het begin van het Open Access Discovery-project.

Het ontstaan van het OA-ontdekkingslandschap en de rol die bibliotheken hierin spelen

Het geïnterviewde bibliotheekpersoneel beschreef de opkomst van een complex landschap om OA-publicaties vindbaar te maken. Nieuwe spelers wilden hun terrein afbakenen, terwijl bibliothecarissen deden wat zij het beste vonden, maar OA-publicaties pasten niet in hun traditionele processen. Er waren geen richtlijnen, best practices of benchmarks voor het toevoegen van OA-publicaties aan hun collecties en het integreren ervan in gebruikersworkflows. Nationale samenwerkingen en nieuwe processen zijn in eerste instantie opgezet om metadata te creëren en bloot te leggen voor institutioneel gepubliceerde OA-publicaties. Echter had het bibliotheekpersoneel te maken met uitdagingen op het gebied van ontsluiten van publicaties en de kwaliteit van metadata.

De geïnterviewden waren niet overtuigd dat hun inspanningen verschil maakten voor hun gebruikers, maar ons rapport laat zien dat dit wel zo is.

Hoewel zij terecht geloofden dat de bibliotheek niet de eerste plaats was waar gebruikers zochten, stond de zoekpagina van de bibliotheek in de top drie van meest gezochte systemen. De antwoorden van gebruikers schetsen een enigszins verwarrend beeld van de rol die OA speelt in hun ontdekkingstocht. Respondenten vonden OA-publicaties niet erg makkelijk te vinden of toegankelijk. Ook gaf bijna de helft aan niet veel te weten over OA. De meeste respondenten vertrouwden echter wel op OA-alternatieven wanneer zij moeilijk toegang kregen tot de volledige tekst. Hoewel OA niet hun eerste keuze was, beïnvloedde de groeiende hoeveelheid OA-publicaties hun zoek-, toegangs- en gebruiksprocessen. Deze bevindingen leidden tot de volgende ontdekking in het rapport:

“De voorlichtings- en instructieactiviteiten van het bibliotheekpersoneel waren voornamelijk gericht op het vergroten van het bewustzijn van gebruikers over OA-publicaties. Gebruikers hadden extra instructie nodig over het ontdekken, evalueren en gebruiken van deze nieuwe publicaties.”

Introducing the report to the Dutch library community

With pleasure and pride, Ixchel Faniel and I presented the final report, with its findings and key conclusions, to UKB and SHB representatives at the OCLC Contactdag on 8 October 2024 in Amersfoort, the Netherlands. The Contactdag is an annual gathering of Dutch and Flemish professionals from academic and public libraries who are interested in industry developments and in OCLC’s strategic direction and product development. It is also a place where they share practices and innovative project results.

When introducing the OA discovery report, I shared the key takeaway for the Dutch library community as follows:

“If you are wondering whether your library’s investment in OA discoverability is worthwhile, the answer is a resounding YES!”

The report’s cover, a photo of a Dutch polder landscape, is a nod to the Dutch setting of our research. It is also a metaphor for the hard work needed to make OA publications discoverable. A polder is created by digging ditches and building dams and dikes to drain areas of low-lying land. As I told the audience, just as with the polder: “there is still much work to do. OA is still unexplored terrain that needs to be surveyed and cultivated. We cannot afford to sit back and watch!”

Next steps: collaborating smarter

During the afternoon session of the OCLC Contactdag, participants discussed the findings, challenges, opportunities, and next steps in breakout groups. Many recognized the dilemmas around OA discovery as presented in the report. They were also interested in using the findings to develop strategies for improving the discoverability of OA.

A recurring theme was the need for collaboration. Participants discussed the potential benefits of collaborating on selecting OA titles by subject area and on raising users’ awareness of OA resources. They wanted to share experiences with exposing institutional metadata, collaborate on collecting metadata, and partner with OCLC to improve metadata quality. They also talked about greater engagement, on campus and nationally, with recent Diamond OA publishing initiatives to advocate for discovery metadata that works well for both library workflows and user needs. These ideas illustrate the need for collaboration among stakeholders, from OA publishing through to discoverability. They also align well with the closing words of our report:

“To truly improve the discoverability of OA publications, everyone involved must consider the needs of others within the chain.”

Read the report for more information on bridging the gap between the availability and the discoverability of OA publications. https://oc.lc/oa-discovery

The post Nieuw OCLC Research-rapport over Open Access gelanceerd appeared first on Hanging Together.

John Mark Ockerbloom: A writer of pessimism and grace

2024-11-06T19:57:29+00:00

William Golding called the bipolar Catholic author Graham Greene “the ultimate chronicler of twentieth-century man’s consciousness and anxiety”. Both Greene’s thrillers and his more serious novels are suffused with concerns of politics and religion, flawed institutions, characters who betray others and their own consciences, and grace and redemption in unexpected places.

His first novel, The Man Within, was published in 1929. It joins the public domain in 56 days. #PublicDomainDayCountdown

LibraryThing (Thingology): Author Interview: Andrea Jo DeWerd

2024-11-06T18:16:06+00:00

LibraryThing is pleased to sit down this month with author Andrea Jo DeWerd, who, in addition to her career in publishing and as an independent book marketer, recently saw her debut novel, What We Sacrifice for Magic, released by Alcove Press. DeWerd worked for more than a decade in the marketing and publicity departments of a number of Big 5 publishers, including Crown, Random House, Simon & Schuster, and most recently, the Harvest imprint of HarperCollins. In 2022 she launched her own marketing and publishing consulting agency, the future of agency LLC. Her authorial debut, published in late September, is a fantastical coming-of-age story following three generations of Minnesota witches during the 1960s. DeWerd sat down with Abigail to answer some questions about this new book.

How did the idea for What We Sacrifice for Magic first come to you, and how did the story develop? Did your heroine Elisabeth come first? Was it always a multi-generational family story in your mind, always a witchy tale?

I was trying to write a very different book about the American Dream, and my own family’s experience with it. My grandfather’s family were Dutch immigrants in Minnesota. My great-grandfather and his cousin operated several feed mills and fish hatcheries. The next generation, my grandfather and his brothers, all became doctors. I was fascinated by this story, and by what happens after the American Dream is achieved—what happens to the next generation? But it was too close to home for me to write in the years after my grandfather passed away.

What We Sacrifice for Magic grew out of the question: what were the women doing while the men were building their empire? I started to imagine a world in which the men ostensibly held the power, but beneath the surface, it was really the women pulling the strings; a world in which the women could be running a full-on witchcraft operation out of the side door of the kitchen while the men were off fighting their wars and building their supposed influence.

Elisabeth’s voice came to me first. I started to hear her voice, and the first thing I knew about her was that she was ruled by water. From there, I explored how she would’ve come to be that way, who would’ve taught her about her power, and Magda, her grandmother, her teacher, emerged pretty quickly.

Your book addresses themes of familial history, obligation and conflict, and the individual’s struggle to both belong to and be independent of the family circle. How does the witchy element in your story add to or complicate those themes? How different would your story be if the Watry-Ridder women weren’t witches?

In many books with magic, the magic acts as the deus ex machina that lifts the characters out of their unfortunate situations. Magic breaks oppressive forces in many ways. For Elisabeth, magic is what is holding her back, her burden. Aside from that magical burden, Elisabeth would still need her coming-of-age journey. I believe that even without magic, Elisabeth would’ve always felt separate from her family. She needed to learn who she is on her own, away from the reputation of her family and the name she was born to.

Without magic, this story becomes a much more familiar one. Anyone who has ever dealt with the pressures of a family business knows what it feels like to be torn between wanting to forge your own path and getting pulled back into the family responsibility. Adult children who take care of their aging parents know that tug-of-war as well. I think we all feel family pressure in some way or another in our lives, and beneath the magic, that is what I wanted to explore in this book.

What We Sacrifice for Magic is set in your own home state of Minnesota, and opens in 1968. What significance do the setting and time period have to your story?

The setting came to me first. Elisabeth, ruled by water, was always going to be from a small lakeside town in Minnesota. The town of Friedrich was inspired by my own beloved Spicer, Minnesota, where my family has had a cabin on Green Lake since 1938. The lake felt so integral to this story and this community that the Watry-Ridder family serves.

More so, this family had to come from a place that was rural enough for them to fly under the radar, a pastoral community that just accepted their local eccentrics, and even came to depend on them. I was also fascinated by the sort of gossip that happens in a small town. In a close-knit community, it’s impossible to walk down the street without everybody knowing everything about you, who you’re dating, etc. I wanted to see Elisabeth and her younger sister, Mary, engage with that gossip, and it certainly shapes them as they’re growing up in Friedrich with the sometimes unwanted attention.

More broadly, 1968 was a time when many young women were starting to have more choices in their education and the opportunity for careers outside of the home, in large part due to contraception. Those choices were not available to Elisabeth; she is stuck in this small town, tied to her community, as she watches her high school classmates going off to their next chapters.

What influence has your career in publishing and book marketing had on your storytelling? Have you been inspired by any of the authors whose books you have promoted?

I started writing this book when I was working full-time as a book marketer at Random House. I had been a creative writing minor in college, but I wasn’t really writing in my first 8 years in New York while I was in grad school and volunteering and focused on other things. I was inspired to start writing again in earnest when I would be in meetings with these amazing authors like Catherine Banner and Emma Cline, who were both a few years younger than me. I thought if they found time to do it, why couldn’t I? On the flip side, I was working with Helen Simonson at the time, who said that she didn’t really get to start writing until her kids were grown and out of the house, and I thought, “I’m single, I don’t have kids, what am I waiting for?”

I was also greatly inspired by Laura Lynne Jackson’s books The Light Between Us and Signs. Her first-person account of how close we are to the spirits on the other side very much influenced my own personal spiritual beliefs, some of which are woven into Elisabeth’s outlook and her experiences with her guide from the other side, Great-Grandma Dorothy, and the energy healing work that the family does.

Tell us about your writing process. Do you have a particular place you prefer to write, a specific way of mapping out your story? Did you know from the beginning what the conclusion would be?

I wrote at least 50% of this book long-hand in a journal. I write in the morning in bed before the rest of the world comes crashing in, i.e. before I look at my phone or email. My phone stays in the kitchen until after I’m done writing for the day. Once I got further into the story, though, I switched to drafting on my laptop when I was really building momentum.

I don’t believe you have to write every day. I have a day job! I write maybe a few days a week, and this book came together 100 words at a time. I would write a single paragraph in the morning before hopping in the shower and heading into Random House. My writing group talks often about setting realistic goals because the minute you set a lofty goal and miss that first day of “write every day,” it makes it that much harder to get back on track.

I barely outlined this book. This was very much a discovery writing project, but when I got into revision, I reverse-outlined what had happened so far in the book so that I could confidently write my way through to the end. I didn’t know the exact ending of the book until I was about ⅓ of the way through. I remember emailing my writing group one day to say, “I think I just wrote the last line of my book.”

For revision, the book Dreyer’s English by friend and former Random House colleague Benjamin Dreyer was essential to me. It was very helpful to read books like his as I was enmeshed in the revision process.

What can we look forward to next from you? Do you have other writing projects in the offing?

I am working on something completely different next! I am finishing a first draft this fall of my second novel, a contemporary Christmas rom-com set in southern Minnesota. There’s Christmas cookies, a local hottie, and a girl home from the big city. I’m approaching this book a little differently—starting with an outline!

Tell us about your library. What’s on your own shelves?

I am very much a mood reader and I read just about every genre out there. I love sci fi and fantasy or romance for a quick vacation read. I try to keep up with the new, big literary novels. I have my section of craft books, like Big Magic and Bird by Bird. I have sections of series that I’m hoping to finish one day, like Outlander. I’m always reading our clients’ books for work. I have a celebrity chef’s memoir and a performance and productivity expert to read next for work. But truthfully, my shelves are full of books I haven’t read that have come with me from job to job. I have classics, I have the hot releases dating back to 2010, I have signed copies of books I’ve worked on, like Educated and Born a Crime. I also have an amazing cookbook collection from my time working in lifestyle books, lots of Mark Bittman and Jacques Pépin and Dominique Ansel.

What have you been reading lately, and what would you recommend to other readers?

I just finished the new Louise Erdrich novel, The Mighty Red. She’s my favorite author and as a contemporary Minnesotan author, she has had a huge impact on me as a reader and a writer. I think Erdrich most accurately captures contemporary women—and the myriad ways the world disappoints us—like no one else I’ve ever read. I make a point to buy the new books by Louise Erdrich and William Kent Krueger, another Minnesotan author, in hardcover from indie bookstores when I’m back in MN. If you haven’t read Louise Erdrich before, one of my favorite books is The Round House. I recommend that book to everyone.

Jodi Schneider: Information Quality Lab at the 2024 iSchool Research Showcase

2024-11-06T16:24:11+00:00

While I’m in Cambridge, today members of my Information Quality Lab present a talk and 9 posters as part of the iSchool Research Showcase 2024, noon to 4:30 PM in the Illini Union. View posters from 12 to 1; during the break between presentation sessions 2-2:45; and 4-4:30 PM.

TALK by Dr. Heng Zheng, based on our forthcoming JCDL 2024 paper:
Addressing Unreliability Propagation in Scientific Digital Libraries
Heng Zheng, Yuanxi Fu, M. Janina Sarol, Ishita Sarraf, Jodi Schneider

POSTERS
Addressing Biomedical Information Overload: Identifying Missing Study Designs to Design Multi-Tagger 2.0
Puranjani Das, Jodi Schneider

Assessing the Quality of Pathotic Arguments
Dexter Williams

Cognitive and Behavioral Approaches to Disinformation Inoculation through a Hidden Object Game
Emily Wegrzyn

Distinguishing Retracted Publications from Retraction Notices in Crossref Data
Luyang Si, Malik Oyewale Salami, Jodi Schneider

Harmonizing Data: Discovering “The Girl From Ipanema”
John Rutherford, Liliana Giusti Serra, Jodi Schneider

“I Lost My Job to AI” — Social Movement Emergence?
Ted Ledford, Jodi Schneider

Recognizing People, Organizations, and Locations Mentioned in the News
Xioran Zhou, Heng Zheng, Jodi Schneider

Representation of Socio-technical Elements in Non-English Audio-visual Media
Puranjani Das, Travis Wagner

What People Say Versus What People Do: Developing a Methodology to Assess Conceptual Heterogeneity in a Scientific Corpus
Yuanxi Fu, Jodi Schneider

Open Knowledge Foundation: Panel: The Tech We Want is Built and Maintained with Care

2024-11-06T11:04:00+00:00

In this series of posts, OKFN brings you the documentation of each session, making the content generated during these two intense days of reflection and joint work accessible and open.

Above is the video and below is a summary of the topics discussed in:

[Panel 2] The Tech We Want is Built and Maintained with Care

17 October 2024 – 11:30 UTC

Digital technologies need people to care for them and keep them alive. In a time of obsession with innovation and disruption, in this panel we will shine a light on the invisible but essential work of maintenance.

Sara Petti – International Network Lead, OKFN [moderator]
Mathieu Jacomy – Assistant Professor, Aalborg University Tantlab, and co-creator of Gephi and Hyphe
Allison Pike – Co-founder, Infield
Katharina Meyer – Director, Digital Infrastructure Insights Fund

Summary

This panel sheds light on the often invisible, essential work of maintaining digital infrastructure, particularly open source software. The speakers argue passionately that the maintenance of software systems, like the ongoing care of a garden, is crucial to the sustainability of digital ecosystems. They highlight the systemic problems that maintainers face, such as burnout, lack of recognition and inadequate funding, and call for a radical shift in how this work is valued and supported.

Emphasising the ethical and social consequences of neglect, and the urgent need for a supportive community and adequate funding, the panellists argue for a culture of shared responsibility and visibility. They urge both corporations and open source communities to recognise this work, to create supportive structures, and to recognise that maintenance is as critical as innovation. The discussion is a clarion call to action, emphasising that we must prioritise care and sustainability in our digital world.

Richard Wallis: BIBFRAME Dilemmas for Libraries: Challenges and Opportunities

2024-11-06T08:28:02+00:00

I recently attended the 2024 BIBFRAME Workshop in Europe (BFWE), hosted by the National Library of Finland in Helsinki. It was an excellent conference in a great city!

Having attended several BFWEs over the years, it’s gratifying to witness the continued progress toward making BIBFRAME the de facto standard for linked data in bibliographic metadata. BIBFRAME was developed and is maintained by the Library of Congress to eventually replace the flat record-based metadata format utilised by the vast majority of libraries – MARC (a standard in use since 1968).

This year, Sally McCallum from the Library of Congress shared significant updates about their transition to becoming a BIBFRAME-native organisation. In August 2024, they began a pilot with 15 cataloguers inputting records directly into BIBFRAME, marking the start of the next stage of a long journey. This process not only involved adopting a new system but also retraining a large number of staff—a significant challenge but a major step forward.

Several other organisations, including the Share Community, OCLC, Ex Libris, and FOLIO LSP, also presented their advancements in linked bibliographic metadata and BIBFRAME. While the progress is encouraging, there are some dilemmas, not really addressed in the conference, that libraries face as they consider adopting BIBFRAME, and I’d like to explore those here.

#1: Should linked data only be limited to bibliographic resources?

One of the key benefits of linked data is its ability to connect and relate resources across different domains, not just within traditional library systems. However, many libraries aiming to leverage linked data are primarily focused on bibliographic resources, especially as current BIBFRAME-enabled cataloguing solutions are often seen only as replacements for MARC-based systems.

The challenge arises when libraries want to integrate other types of resources—such as archival collections, historical documents, or art-related information—that don’t neatly fit into the BIBFRAME model. BIBFRAME excels at describing bibliographic resources, but it struggles with the nuances of these other resource types. There are initiatives to extend BIBFRAME to handle arts materials etc., but they are still very [bibliographic] library system focused.

Dilemma: Should a library implement a linked data solution solely for bibliographic resources (essentially as a MARC replacement), or should they adopt a broader linked data strategy that integrates all types of resources across the organisation?

My thought: If a [linked data enabled] replacement for a current library system is all you are looking for, that’s fine. However, if that is all, you need to examine the benefits that would accrue from such a significant move and investment. If your ambition is to present a linked aggregated view of all your resources to your users, a BIBFRAME replacement library system probably will not be flexible enough.

#2: How to bridge the gap between the library world and the wider web?

One of the widely-touted benefits of BIBFRAME is the ability to share library data more openly across the web. In theory, other libraries, research institutions, and even the broader public could link to a library’s BIBFRAME data. For the library community, BIBFRAME offers a comprehensive linked data vocabulary that facilitates data sharing.

However, outside of the library world, the web at large, driven by the search engines, is largely adopting Schema.org as the preferred vocabulary for sharing data. Libraries have long been seen as silos, with their data mostly confined to standalone search interfaces and complex data formats such as MARC.

BIBFRAME, while a step forward, doesn’t fully resolve this issue. Yes, it makes data more open and linked, but it still speaks primarily to the library community. If libraries want their data to enrich the wider web, they may need to also incorporate Schema.org alongside BIBFRAME to ensure comprehension and therefore visibility of their resources.

Dilemma: Should libraries focus exclusively on sharing data within the library and research community using BIBFRAME, or should they also aim to make their data more accessible to the general web audience by enriching their data with Schema.org terms?

My thought: Whatever specialist online discovery routes our users may take, they and we are also users of the wider web in general. To make best use of our resources we need our potential users to be guided to those resources. Guided from where they are, which is often not within a library interface or specialist site. To be visible beyond library focused sites, our resources need to be also described using the de facto vocabulary for the rest of the web – Schema.org.
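To make the idea of describing a resource with both vocabularies concrete, here is a minimal sketch in Python that builds one JSON-LD node typed as both a BIBFRAME Work and a Schema.org Book. It is an illustration only, not how any particular library system does it: the function name `describe_work`, the `example.org` URI, and the title and author values are all hypothetical, and a real mapping between the two vocabularies would be considerably richer.

```python
import json

# Published namespace IRIs for the two vocabularies.
BF = "http://id.loc.gov/ontologies/bibframe/"
SCHEMA = "https://schema.org/"


def describe_work(uri: str, title: str, author: str) -> dict:
    """Return a JSON-LD node typed as both bf:Work and schema:Book.

    Hypothetical helper for illustration; the property selection is
    deliberately tiny.
    """
    return {
        "@context": {"bf": BF, "schema": SCHEMA},
        "@id": uri,
        "@type": ["bf:Work", "schema:Book"],
        # BIBFRAME models a title as its own resource with a main title.
        "bf:title": {"@type": "bf:Title", "bf:mainTitle": title},
        # Schema.org uses plain name and author properties.
        "schema:name": title,
        "schema:author": {"@type": "schema:Person", "schema:name": author},
    }


# Placeholder values, not real catalogue data.
record = describe_work(
    "https://example.org/works/1", "An Example Title", "A. N. Author"
)
print(json.dumps(record, indent=2))
```

The same node can then be read by BIBFRAME-aware library tools through the `bf:` terms and by general web consumers through the `schema:` terms, which is the dual visibility this dilemma is about.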

#3: The costs and challenges of transitioning to BIBFRAME

Transitioning to BIBFRAME can involve significant upheaval for a library, especially for those still reliant on MARC-based systems. Replacing these systems often comes with substantial costs, retraining efforts, and disruptions to daily operations.

Many libraries may question whether the perceived benefits of linked data and BIBFRAME—such as improved data sharing and discoverability—are worth the investment. For smaller institutions, the costs of a full-scale BIBFRAME implementation may seem prohibitive, especially when the advantages are not always immediately tangible.

Dilemma: Should libraries undertake a full-scale, costly transition to BIBFRAME and linked data, or is there a way to adopt linked data principles more gradually, without completely overhauling existing systems?

My thought: My many years working with libraries have taught me that any significant change in systems and/or practices often results in far greater investment in time, people, and money than was initially envisaged. Part of the reason for this is the integrated nature of traditional library systems. Swapping out one system for another, say to change cataloguing practices, will often result in changes to circulation and acquisition processes, for example. All this happens whilst the library needs to continue its business as usual. Equally, is retraining of staff a necessary first step to adopting linked data, or could/should it be a more evolutionary process?

My recent work, in partnership with metaphacts, for the National Library Board Singapore has demonstrated that it is possible to make significant beneficial moves into linked data, without replacing established systems and processes or disrupting business as usual. A route others may want to consider.

In addition to attending the BFWE conference, I had the privilege of delivering a presentation titled “Building a Semantic Knowledge Graph at National Library Board Singapore” [slides, video]. This project represents a two-year effort to develop and deliver a linked data management system based on both BIBFRAME and Schema.org, powered by metaphactory. What makes this initiative unique is that it integrates data from various systems across the library without requiring a complete systems replacement.

Conclusion

Since its launch 18 months ago, this system has continued to evolve, delivering linked data services back into the library. The approach has allowed the library to realise many of the benefits of linked data without the disruption of replacing its core systems. These benefits include cross-system entity aggregation & reconciliation, navigational widgets for non-linked systems, and an open linked data knowledge graph interface. Besides delivering the benefits of linked data to library curators, the immense knowledge graph, built across data sources and unified using Schema.org data modelling, opens up opportunities to publish rich cross-domain data to the general public. To learn more about our work with NLB, have a look at this metaphacts blog post.

For those grappling with any of the dilemmas I’ve outlined here or interested in exploring linked data further, feel free to reach out—I’d be happy to help facilitate a discussion.

(Note: This post is also featured as a guest post on the metaphacts blog)

John Mark Ockerbloom: A woman who made her mark on the map

2024-11-05T14:53:23+00:00

Emma Willard had remarkable persistence. She founded the first higher education institution for women in America, and appealed tirelessly for its support in multiple states. She wrote textbooks for it that include groundbreaking work in history and graphic design.

Alma Lutz’s 1929 biography of Willard, joining the public domain in 57 days, is titled Emma Willard, Daughter of Democracy. May all American daughters and other children of democracy vote to defend it today. #PublicDomainDayCountdown

John Mark Ockerbloom: “You know it too well already…”

2024-11-04T17:27:09+00:00

“I listen to Mussolini’s gentle voice talking to me of friendship, while my ears still ring with the death threats…”

French Prix Goncourt laureate Maurice Bedel wrote in the 1920s and 30s of the appeal and threat of fascism, and the people seduced by it in Italy and Germany. Parts of his book Fascisme An VII appeared in English translation in the November 1929 Atlantic as “A Frenchman Looks at Fascism”. It joins the public domain in both Europe and America in 58 days. #PublicDomainDayCountdown

Lucidworks: OpenAI’s ChatGPT Adds Web Search: A Q&A on What It Means for Enterprise Search

2024-11-04T16:57:22+00:00

ChatGPT's web search capabilities are exciting but enterprise search requires an even more robust solution.

The post OpenAI’s ChatGPT Adds Web Search: A Q&A on What It Means for Enterprise Search appeared first on Lucidworks.

LibraryThing (Thingology): SantaThing 2024: Bookish Secret Santa!

2024-11-04T16:43:09+00:00

It’s the most wonderful time of the year: the Eighteenth Annual SantaThing is here at last!

This year we’re continuing to focus on indie bookstores. You can still order Kindle ebooks, we have Kenny’s and Blackwell’s for international orders, and also stores local to Australia, New Zealand, and Ireland.
» SIGN UP FOR SANTATHING NOW!

What is SantaThing?

SantaThing is “Secret Santa” for LibraryThing members.

How it Works

You pay $15–$50 and pick your favorite bookseller. We match you with a participant, and you play Santa by selecting books for them. Another Santa does the same for you, in secret. LibraryThing does the ordering, and you get the joy of giving AND receiving books!

Even if you don’t want to be a Santa, you can help by suggesting books for others. Click on an existing SantaThing profile to leave a suggestion.

Every year, LibraryThing members give generously to each other through SantaThing. If you’d like to donate an entry, or want to participate, but it’s just not in the budget this year, be sure to check out our Donations Thread here, run once again by our fantastic volunteer coordinator, mellymel1713278.

Important Dates

Sign-ups close MONDAY, November 25th at 12pm EST. By the next day, we’ll notify you via profile comment who your Santee is, and you can start picking books.

You’ll then have a little more than a week to pick your books, until THURSDAY, December 5th at 12pm EST (16:00 GMT). As soon as the picking ends, the ordering begins, and we’ll get all the books out to you as soon as we can.

» Go sign up to become a Secret Santa now!

Supporting Indie Bookstores

To support indie bookstores we’re teaming up with independent bookstores from around the country to deliver your SantaThing picks, including BookPeople in Austin, TX, Longfellow Books in Portland, ME, and Powell’s Books in Portland, OR.

And to continue previous years’ success, we’re bringing back the following foreign retail partners: Readings for our Australian participants, Time Out Books for the Kiwi participants, and Kennys for our Irish friends.

And since Book Depository has closed, this year we’re offering international deliveries through Kennys and Blackwell’s.

Kindle options are available to all members, regardless of location. To receive Kindle ebooks, your Kindle must be registered on Amazon.com (not .co.uk, .ca, etc.). See more information about all the stores.

Shipping

Some of our booksellers are able to offer free shipping, and some are not. Depending on your bookseller of choice, you may receive $6 less in books, to cover shipping costs. You can find details about shipping costs and holiday ordering deadlines for each of our booksellers here on the SantaThing Help page.
» Go sign up now!

Questions? Comments?

This is our EIGHTEENTH year of SantaThing. See the SantaThing Help page for further details and FAQ.
Feel free to ask your questions over on this Talk topic, or you can contact Kate directly at kate@librarything.com.
Happy SantaThinging!

Open Knowledge Foundation: Open Knowledge Achieves US Charitable Organisations Equivalency Status

2024-11-04T15:31:14+00:00

We’re thrilled to announce that the Open Knowledge Foundation (OKFN) has achieved NGOsource Equivalency Determination (ED) certification, formally establishing our recognition as equivalent to a US public charity. This status represents a major milestone for OKFN and opens new avenues for partnerships and support from US-based donors and foundations.

What is NGOsource Equivalency Determination (ED)?

The Equivalency Determination process, administered by NGOsource, evaluates nonprofit organisations outside the United States to confirm their operations are in accordance with the guidelines that US tax authorities require for public charitable organisations. By meeting NGOsource’s rigorous criteria, Open Knowledge demonstrates its commitment to transparency, accountability, and impact on a global scale. This designation means that foundations and individuals in the United States can now make tax-deductible grants and donations to OKFN with fewer restrictions, knowing that their contributions are directed toward a recognised, vetted nonprofit organisation.

What This Means for Our Work and Communities

As a certified organisation, OKFN can now access new grant opportunities and accept tax-deductible donations from US-based donors. This expanded support base will enable us to continue our work on a global scale, advancing open knowledge, promoting transparency, and advocating for and building accessible, digital tools that serve the public interest worldwide. With US charity recognition, OKFN is now even better positioned to partner with organisations, donors, and advocates who share our vision of a world open by design, where all knowledge is accessible to all.

Renata Avila, CEO of Open Knowledge, shared her appreciation for the recognition: “Receiving NGOsource’s Equivalency Determination isn’t just an acknowledgement of OKFN’s work; it’s a profound opportunity to expand our mission. With US charity recognition, we can continue to nurture and grow a network of leaders and communities in every region of the world. We will also continue to innovate legal, technical and accessibility tools for citizens and governments to unlock the potential of open knowledge, data and digital technologies that can be applied to their work and transform lives. This will lead to open knowledge, data and technologies that are open, participatory, accountable and sustainable for a better world and empowered communities everywhere.”

This achievement is directly in line with OKFN’s vision and strengthens our ability to advocate for and implement open knowledge initiatives worldwide. You can find this and other relevant information about OKFN’s institutional functions on our Governance page.

We’re excited about the opportunities this opens up and grateful for the continued support of our community. Thank you for being part of our journey towards a fair, sustainable and open future.

John Mark Ockerbloom: “He himself is so much bigger than his books”

2024-11-03T13:31:39+00:00

It’s the last day of Diwali, the Hindu festival of lights that’s also celebrated by various other traditions in India, and in the Indian diaspora.

Among the Indian diaspora’s cultural ambassadors was Newbery medalist Dhan Gopal Mukerji. His 1929 books include Hindu Fables for Little Children, illustrated by Kurt Wiese, introducing tales he grew up with in India to a wide variety of readers. John Neihardt reviewed it when it came out. It goes public domain in 59 days. #PublicDomainDayCountdown

Ed Summers: Cut-ups and LLMs

2024-11-03T07:00:00+00:00

If language is a virus what are LLMs?

I’ve had this kinda random notion about Large Language Models (LLMs) and the Cut-up technique rumbling around in my brain for the past year. Unless you’ve been living in a cave I’m guessing you already know about LLMs. You probably already know about Cut-ups too, but just in case, here is how Burroughs and Gysin describe this creativity tool (Burroughs & Gysin, 1982, p. 34):

Writing is fifty years behind painting. I propose to apply the painters’ techniques to writing; things as simple and immediate as collage or montage. Cut right through the pages of any book or newsprint… lengthwise, for example, and shuffle the columns of text. Put them together at hazard and read the newly constituted message. Do it for yourself. Use any system which suggests itself to you. Take your own words or the words said to be “the very own words” of anyone else living or dead. You’ll soon see that words don’t belong to anyone. Words have a vitality of their own and you or anybody else can make them gush into action.

p. 34 of The Third Mind

p. 35 of The Third Mind

Burroughs famously used this technique in his Nova Trilogy (and elsewhere) to mix together the works of other authors (Shakespeare, Rimbaud, Kerouac, Genet, Kafka, Eliot, Conrad, …). It has since been widely used as a creativity tool, apparently by musicians like David Bowie, Kurt Cobain and Thom Yorke. The purpose of this hack isn’t simply to come up with new ideas, but to dismantle discursive systems of control:

The Burroughs machine, systematic and repetitive, simultaneously disconnecting and reconnecting—it disconnects the concept of reality that has been imposed on us and the plugs normally dissociated zones into the same sector–eventually escapes from the control of its manipulator; it does so in that it makes it possible to lay down a foundation of an unlimited number of books that end by reproducing themselves. (Burroughs & Gysin, 1982, p. 17)

So what do LLMs and Cut-ups have to do with each other?

One superficial way of thinking about LLMs is as the cut-up machine, par excellence. LLMs are built by taking a massive amount of content from the Web, chopping it up into words (tokens), and then creating a neural network that represents the likelihood of one token following another. This allows new text to be generated word by word given an initial sequence, or prompt. Similar to the cut-up, it’s no longer possible to attribute LLM generated text to a particular author or authors. The very idea of authorship and attribution is completely dissolved in the model.
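As a toy illustration of "the likelihood of one token following another", here is a minimal Python sketch that counts which word follows which in a tiny corpus and then samples from those counts. It is a bigram counter, not a neural network, and the corpus, function names and random seed are made up for the example; real LLMs condition on far more context than the previous word.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every token, which tokens follow it in the training text."""
    tokens = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, prompt, length, rng):
    """Extend the prompt one token at a time, sampling by observed frequency."""
    out = prompt.split()
    for _ in range(length):
        candidates = follows.get(out[-1])
        if not candidates:  # the last token was never followed by anything
            break
        words = list(candidates)
        weights = [candidates[w] for w in words]
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return " ".join(out)


corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigrams(corpus)
print(generate(model, "the cat", 6, random.Random(0)))
```

Every pair of adjacent words in the output already occurs in the corpus, which is the sense in which such a model reproduces likely text rather than breaking associations.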

However, the big difference with LLMs is that they are optimized for predicting the next likely word, given an initial sequence of words. An LLM is ultimately a statistical representation of likely text. The Cut-up, on the other hand, is specifically designed to break the typical associations of words, but without totally obscuring where those words came from.
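For contrast, a cut-up can be sketched in a few lines of Python. This is only one possible reading of the technique: it cuts texts into fixed-length runs of words and shuffles them, and the function name, fragment length and source labels are assumptions for illustration. Each fragment keeps the label of the text it was cut from, so the sources stay traceable.

```python
import random


def cut_up(sources, fragment_len, rng):
    """Cut each source into fragments of `fragment_len` words and shuffle them.

    `sources` maps a label (author, title, ...) to its text. Every fragment
    keeps its label, so the rearranged text can still be traced to its origin.
    """
    fragments = []
    for label, text in sources.items():
        words = text.split()
        for i in range(0, len(words), fragment_len):
            fragments.append((label, " ".join(words[i:i + fragment_len])))
    rng.shuffle(fragments)
    return fragments


# Both snippets are taken from the Burroughs and Gysin passage quoted above.
sources = {
    "A": "words have a vitality of their own",
    "B": "put them together at hazard and read the newly constituted message",
}
pieces = cut_up(sources, 3, random.Random(1))
print(" / ".join(text for _, text in pieces))
print([label for label, _ in pieces])
```

Nothing here is weighted by likelihood: the order is left to chance, which is the opposite design goal from a next-word predictor.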

LLMs discipline communication, and routinize language in an attempt to simulate meaningful text. Cut-ups intentionally break word associations in order to reveal non-obvious, possibly absurd, latent meanings in given texts. Comparing LLMs to Cut-ups unmasks the LLM as a normalization tool for language control, and the Cut-up as a tool for wresting back control, for peeking inside the discursive machinery of language.

I was reminded of this today when I ran across a lovely short paper by Max Kreminski entitled Computational Poetry is Lost Poetry which he presented at the Halfway to the Future conference (open-access Proceedings) recently.

In this paper he draws a comparison between Found Poetry, where poetry is discovered in everyday use of language, and LLMs “whose central purpose is to arrange units of language, without fully understanding them, in combinations that can later be found to be poetry”. He calls this LLM generated text “Lost Poetry”. Of course not everyone using LLMs is trying to write poetry, or even think creatively, so this analogy doesn’t totally work for all LLM use cases. But he goes on to make some insightful observations about the flaws of generative AI:

I argue that machines are often usefully creative because they fail to see things completely as humans do: their oversights and inabilities lead them to mix human-like with non-human-like creative decisions in unanticipated ways, and thereby to supply human creators with ideas that they otherwise never would have considered. Somewhat counter-intuitively, then, I suggest that a dogged pursuit of perfect overlap between human and machine understanding of aesthetic domains may in fact inhibit the usefulness of machines as generators of unexpected inputs to the human creative ecosystem.

The flaws that we see in these generative systems are what make them useful, and the quest to build bigger and bigger models that better model “reality” is at cross purposes with their use in creative endeavors. It’s the glitches that provide value. He goes on to say:

… the design of novel computationally creative systems could be guided in part by a deliberate choice of what to make invisible to the machine. By selectively limiting the machine’s capacity to take certain facets of human aesthetic perception into account, we can produce different kinds of losers that can help to break us out of familiar patterns toward new techniques of expressive communication.

This is a provocative idea I think, that it’s the limitations that we build (intentionally or not) into computational systems that make them legible to us humans. These limitations help distinguish one tool from another. Just as Cut-ups engineer for the unexpected, and transgress predictable narrative structures, these LLM generated “losers” have more potential for creative thought because they are errors. Maybe this is one move too far, but there seem to be some parallels here to seamful design, where “strategic revelation of complexity, error, or backgrounded tasks” provides value instead of distraction (Inman & Ribes, 2019).

But perhaps Kreminski, as a chief scientist at a generative AI company, is trying very hard to find value in these statistical models that ultimately drive out and exploit creativity. They do this by disciplining language, normalizing our words to fit the types of words found on the World Wide Web at a particular point in time. I wish him well in his efforts to make these models smaller, more quirky, and more useful for actual artists–instead of larger and smoother for people who don’t want to employ human artists anymore.

I do wonder, what would it look like if LLMs worked more like Cut-ups, where we got unfamiliar juxtapositions and the sources weren’t completely obfuscated/concealed?

References

Burroughs, W. S., & Gysin, B. (1982). The third mind (First paperbound edition). New York: Seaver Books.

Inman, S., & Ribes, D. (2019). Beautiful Seams. In CHI Conference on Human Factors in Computing Systems proceedings. Glasgow, Scotland. https://doi.org/10.1145/3290605.3300508

Cynthia Ng: Alternatives for Blogs on WordPress.com

2024-11-02T22:28:48+00:00

Many articles recommending alternatives are about moving websites to other Content Management Systems (CMS). A simple blog, though, does not need that. This article is for people who, like me, just blog, though I do include some alternatives for WordPress if your goal is simply to move off WP.com. I’ve been …

Lucidworks: How Zero Results Are Killing Ecommerce Conversions

2024-11-02T12:21:29+00:00

Learn how to power the product discovery experience with semantic vector search to eliminate false zero results and accelerate the path to purchase.

The post How Zero Results Are Killing Ecommerce Conversions first appeared on Lucidworks.

John Mark Ockerbloom: A Room of One’s Own, for all

2024-11-02T11:53:41+00:00

“A woman must have money and a room of her own if she is to write fiction.”

Virginia Woolf’s classic 1929 essay on feminism and creative work has inspired numerous analyses (like this one), adaptations (like this one), and projects (like this one).

Copyright is one way writers get money, but it often enriches publishers and estates more than it helps creators. We begin this year’s #PublicDomainDayCountdown anticipating Woolf’s A Room of One’s Own arriving in the US public domain in 60 days.

John Mark Ockerbloom: The remainder of the Roaring 20s about to join the public domain

2024-11-02T00:45:24+00:00

Just two months from now, much of the world will celebrate another Public Domain Day, welcoming a year’s worth of works into the public domain. Many countries that have had life+70 years copyright terms for a while will get works by authors who died in 1954. Those fortunate enough to still have life+50 years terms will get works by authors who died in 1974. The rules in the United States are more complicated, but we’ll have nearly all our remaining copyrights from 1929 expire. That means that, for us, essentially all of the publication history of the “roaring 20s” will be public domain when the new year arrives.¹ That’s a wide sweep of culture available for everyone to enjoy, share, build on, and reuse.
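The arithmetic behind these dates can be checked with a small Python sketch. The function name is made up for the example, and the 95-year figure for US publications from 1929 is not stated in this post; it is the term that applies to works published in that year. The 100-year term for sound recordings comes from the footnote to this post.

```python
def public_domain_year(start_year, term_years):
    """Return the year whose January 1 brings a work into the public domain.

    Terms run to the end of a calendar year, so a work becomes free on
    January 1 of the year after the term ends.
    """
    return start_year + term_years + 1


print(public_domain_year(1954, 70))   # life+70, author died in 1954 -> 2025
print(public_domain_year(1974, 50))   # life+50, author died in 1974 -> 2025
print(public_domain_year(1929, 95))   # US publication from 1929 -> 2025
print(public_domain_year(1924, 100))  # US sound recording from 1924 -> 2025
```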

The Twenties encompass the start of national women’s suffrage, the rise of the Jazz Age and the Harlem Renaissance, and the dawn of “talking” motion pictures, and extend to the “Black Tuesday” stock market crash and the beginning of the Great Depression. The Twenties had political upheaval to match the cultural and economic upheaval, including civil war in Ireland and many other places around the world, the birth of fascism in Europe, and the revival and decline of the Ku Klux Klan as waves of anti-immigrant and racist sentiment washed over much of America. But the decade also saw widespread international efforts to try to end war generally among nations. While the 1928 pact that many nations signed on to has often been viewed as a failure for not preventing World War II, it set a precedent for later international cooperation and peacekeeping efforts that can be credited with more success.

As I have in past years, I’ll be featuring a Public Domain Day countdown in the days leading up to New Year’s Day 2025, each day featuring an interesting work that will be joining the public domain then. You can follow it on this blog, or using RSS readers or social media that can connect with this blog. That includes Mastodon and other “fediverse” sites that connect with Mastodon using the ActivityPub protocol. I’ll also boost or link to the daily posts from my Mastodon account. (Most of the posts will have 500 characters or fewer, the size of a typical Mastodon post; a few may be longer.) You might also be able to follow my boosts and links from Bluesky (since my account is hooked up to Bridgy Fed), as well as possibly from Threads if they’ve enabled following Mastodon accounts. (That was on their roadmap for 2024, but I don’t know if it’s working yet.) My posts will include the hashtag #PublicDomainDayCountdown. I’ll be focusing on works joining the US public domain that are of interest to me, but you’re also welcome to post about works of interest to you joining the public domain where you are, and use the same hashtag if you like.

Right now for me, and for many others I’ve talked to, it’s hard to think much beyond next Tuesday. But I hope these posts help us anticipate some good things coming in the future, built on the knowledge and creativity of the past. May we all see and help bring about a better future in the days to come!

¹ The rules in the US are different for unpublished works, and for sound recordings that aren’t part of motion pictures. (I told you US copyright law was complicated.) But this January 1, along with publications from 1929, we will be welcoming sound recordings released in 1924 (which have a 100-year term) into the public domain, as well as many unpublished works by people who died in 1954. For lots more details and special cases, see Cornell University Library’s public domain table.

LibraryThing (Thingology): November 2024 Early Reviewers Batch Is Live!

2024-11-01T18:31:37+00:00

Win free books from the November 2024 batch of Early Reviewer titles! We’ve got 209 books this month, and a grand total of 4,102 copies to give out. Which books are you hoping to snag this month? Come tell us on Talk.

If you haven’t already, sign up for Early Reviewers. If you’ve already signed up, please check that your mailing and email addresses are correct.

» Request books here!

The deadline to request a copy is Monday, November 25th at 6PM EST.

Eligibility: Publishers do things country-by-country. This month we have publishers who can send books to the US, the UK, Canada, Australia, Ireland, Netherlands, New Zealand, Germany, Italy, Spain and more. Make sure to check the message on each book to see if it can be sent to your country.

Thanks to all the publishers participating this month!

Alcove Press	Arctis Books USA	Baker Books
Before Someday Publishing	Bethany House	Broadleaf Books
CarTech Books	Census Press	City Owl Press
Crooked Lane Books	Entrada Publishing	eSpec Books
Gefen Publishing House	IngramSpark	Islandport Press
Lerner Publishing Group	The New Press	Prosper Press
PublishNation	Purple Diamond Press, Inc	Purple Moon Publishing
Revell	Riverfolk Books	Rootstock Publishing
Running Wild Press, LLC	Simon & Schuster	Somewhat Grumpy Press
Stone Bridge Press	Tundra Books	Twisted Road Publications
Unsolicited Press	Vibrant Publishers	Wise Media Group
Yorkshire Publishing

Raffaele Messuti: The digital objects of the SBN catalogue

2024-11-01T00:00:00+00:00

I recently discovered that APIs for the SBN catalogue are available, although I don't know how long ago they were released. It's a topic I have looked into in the past, more out of personal curiosity than professional need, since I believe strongly in the value of open data and metadata in the cultural heritage sector. Years ago I had identified some unofficial APIs used by the catalogue's mobile applications; they still work today, albeit with limited functionality. Those APIs continue to attract interest from researchers and developers, who contact me for further details that I am unfortunately unable to provide.

What follows is my analysis of these new official SBN catalogue APIs, and how I used them for a specific case study: obtaining the list of documents for which a digital resource is available. The whole SBN catalogue holds more than 20 million documents. The subset I am interested in, the digitised documents, is just under a million (938,000+). Obtaining the list of documents that have a digital object seemed like a good experiment for exploring the catalogue at random and discovering some noteworthy content (serendipity!).

I ran into some peculiarities in the data modelling, and the lack of detailed, complete documentation meant I had to proceed by trial and intuition. I do not mean to criticise or belittle the work done by ICCU; on the contrary, I think it is an important result, and I hope that more public discussion of these tools and data interfaces can help improve them and encourage their use.

I should point out, however, that I have recently developed a different, less orthodox view of how cultural heritage data should be distributed. I wrote about it in Beyond HTTP APIs: the case for database dumps in Cultural Heritage, arguing that we should prefer complete, self-contained, ready-to-use exports over APIs.

Quickstart for using the APIs

The APIs can be reached from this portal: https://api.iccu.sbn.it/devportal/apis. Access is not public or anonymous: to use them you need to register an account and then create OAuth2 keys, which are used to generate a token that must be included in every call.
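A minimal Python sketch of what "including the token in every call" might look like, assuming the token is sent as a standard OAuth2 bearer header (the usual convention, but not confirmed here for this portal). The helper name and the placeholder token are hypothetical, the URL is the search endpoint used later in this post, and the request is only built, never sent.

```python
import urllib.request


def authorized_request(url, access_token):
    """Build (but do not send) a GET request carrying an OAuth2 bearer token."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )


# "YOUR_ACCESS_TOKEN" is a placeholder for a token generated in the developer portal.
req = authorized_request("https://api.iccu.sbn.it/sbn/1.0.0/search", "YOUR_ACCESS_TOKEN")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```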

The software product used here is WSO2 API Manager and, from what I could gather, it directly exposes Solr APIs (read-only, of course). There are several APIs, divided by service and presented graphically as a sort of periodic table. It is not immediately clear what they refer to, and the terminology is aimed at people who already know the SBN service ecosystem. I have no idea what CA (Cataloghi Storici) or IC (ICFE Services) are, and I guessed that AB refers to the Anagrafe Biblioteche (the library registry). What I am interested in, though, is SB, SBN Integrato.

Each API naturally has its own kinds of calls and responses. Ready-made SDKs are provided in Java and JavaScript. For my work I preferred to start writing a library in Go: you can find it at https://github.com/atomotic/iccu. It is not a complete SDK; it is still a very spartan module, and I may complete it over time.

Harvesting the documents with a digital object

To explore the catalogue I immediately gave up on the idea of interpreting the API responses in real time: I decided to save all the data locally and parse it afterwards. I saved the responses in a very simple SQLite database: a doc field of type json that stores the raw JSON returned by the API, and a bid column populated automatically from UNIMARC field 003:

CREATE TABLE sbn (
        bid TEXT GENERATED ALWAYS AS (json_extract(doc, '$.unimarc.fields[1].003')) VIRTUAL,
        doc json
);

CREATE INDEX bid_idx on sbn(bid);
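To see the generated column at work, here is a small Python sketch using the standard sqlite3 module with the schema above. The sample record is hypothetical and reduced to the bare shape the JSON path needs; real responses are much larger. Generated columns require SQLite 3.31 or later with JSON support.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE sbn (
    bid TEXT GENERATED ALWAYS AS (json_extract(doc, '$.unimarc.fields[1].003')) VIRTUAL,
    doc json
);
CREATE INDEX bid_idx ON sbn(bid);
"""

# Hypothetical, heavily reduced record: only the shape needed by the JSON path.
record = {
    "unimarc": {
        "fields": [
            {"001": "placeholder"},
            {"003": "http://id.sbn.it/bid/RAV0302299"},
        ]
    }
}

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Only `doc` is written; SQLite derives `bid` from it.
conn.execute("INSERT INTO sbn (doc) VALUES (?)", (json.dumps(record),))
row = conn.execute(
    "SELECT bid FROM sbn WHERE bid = ?", ("http://id.sbn.it/bid/RAV0302299",)
).fetchone()
print(row[0])
```

Because the column is derived from the stored JSON, the raw response stays untouched while lookups by identifier can still use the index.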

The API call is the following; the relevant arguments are presenza_digitale=Y and detail=full (otherwise you get a minimal object that lacks the full UNIMARC).

GET https://api.iccu.sbn.it/sbn/1.0.0/search
	format=json
	detail=full
	page-size=500
	presenza_digitale=Y
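The same call can be composed programmatically. A minimal Python sketch that only builds the URL from the four parameters shown above; the function name is made up for the example, and no parameter for moving between pages is included because none is given here.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.iccu.sbn.it/sbn/1.0.0/search"


def search_url(page_size=500):
    """Compose the search URL with the parameters listed above."""
    params = {
        "format": "json",
        "detail": "full",
        "page-size": page_size,
        "presenza_digitale": "Y",
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(search_url())
```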

This is a complete example of the response for a single record, RAV0302299 (http://id.sbn.it/bid/RAV0302299).

I used a fairly large page size, 500 documents per response. Increasing the number of documents returned reduces the number of HTTP calls and can speed up harvesting enormously; the problem is that documents often contain encoding errors, and the JSON returned is then invalid. When you hit one, you lose the content of every document in that page: it happened in my analysis too, and I neither investigated further nor implemented more robust parsing. I lost a few thousand documents, which is an acceptable margin of error.
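The trade-off described above can be sketched in Python: each page is decoded as a whole, so one invalid page costs every record in it. This is an illustration, not the harvesting code linked from this post; the "docs" key is a hypothetical name for the list of records inside a page.

```python
import json


def parse_pages(raw_pages):
    """Decode each page body; keep the records of valid pages, count the rest.

    `raw_pages` is an iterable of response bodies (strings). The "docs" key
    is a hypothetical name for the list of records inside a page.
    """
    records, failed_pages = [], 0
    for body in raw_pages:
        try:
            page = json.loads(body)
        except json.JSONDecodeError:
            failed_pages += 1  # the whole page (up to page-size records) is lost
            continue
        records.extend(page.get("docs", []))
    return records, failed_pages


pages = [
    '{"docs": [{"id": 1}, {"id": 2}]}',
    '{"docs": [{"id": 3}, {"title": "trunca',  # invalid JSON
]
records, failed = parse_pages(pages)
print(len(records), failed)
```

With a smaller page size each failure would discard fewer records, at the cost of more HTTP calls.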

The result is a database of 936,500 rows, weighing 4.7 GB. I will not distribute this database publicly (the licence for these data is not clear to me), but if anyone is interested I am happy to share it.

As with scraping, the rules of good conduct still apply when using an API: limit the aggressiveness and rate of your calls, and always identify yourself in the User-Agent of your HTTP requests (even though these APIs use a token, so I presume the origin of every activity is always traceable).
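Both habits fit in a few lines of Python. This is a sketch under assumptions: the class and function names, the one-second interval and the User-Agent string are all made up for the example, and the request is built but not sent.

```python
import time
import urllib.request

USER_AGENT = "sbn-harvest-example/0.1 (mailto:you@example.org)"  # hypothetical identity


class Throttle:
    """Enforce a minimum delay between consecutive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


def polite_request(url, throttle):
    """Wait for the throttle, then build a request that identifies the client."""
    throttle.wait()
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


throttle = Throttle(min_interval=1.0)  # at most one call per second
req = polite_request("https://api.iccu.sbn.it/sbn/1.0.0/search", throttle)
print(req.get_header("User-agent"))
```

The clock and sleep functions are injectable so the throttle can be exercised in tests without actually waiting.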

The code used for harvesting is available here: https://github.com/atomotic/iccu/cmd/sbn-metadata-fetch

Analysing and exploring the metadata

I naively thought that SQL queries on the JSON field of the SQLite database would be enough to explore these data. Unfortunately, the lack of a schema and the way some data are modelled make it hard to do everything in SQL, and I had to write methods on the Go object to implement some of the logic.

I am not interested in ALL the available metadata, only in a small subset: what I need are the links to the digitised objects rather than the metadata. By transforming the source metadata I wanted to obtain simplified objects like the following (data such as authors are deliberately omitted).

{
  "bid": "IT\\ICCU\\VIAE\\007373",
  "id": "http://id.sbn.it/bid/VIAE007373",
  "idmanus": "",
  "title": "Risposta apologetica, e critica alle osservazioni, ed alla lettera del molto reverendo padre Cantova della Compagnia di Gesu, stampate in Milano l'anno 1752. Contro a chi ha ultimamente difesa la necessita dell'amor di Dio nel sagramento della penitenza",
  "iiif": [
    "https://jmms.iccu.sbn.it/jmms/metadata/UW01alpnX18_/b2FpOmJuY2YuZmlyZW56ZS5zYm4uaXQ6MjE6RkkwMDk4Ok1hZ2xpYWJlY2hpOlZJQUUwMDczNzM_/manifest.json"
  ],
  "link": [
    "http://books.google.com/books?vid=IBSC:SC000005684",
    "http://books.google.com/books?vid=IBSC:SC000008356",
    "http://teca.bncf.firenze.sbn.it/ImageViewer/servlet/ImageViewer?idr=BNCF0003334533"
  ],
  "type": "Testo",
  "material": [
    "Libro antico"
  ],
  "thumbnails": [
    "https://jmms.iccu.sbn.it/jmms/resource/ad/first/UW01alpnX18_/b2FpOmJuY2YuZmlyZW56ZS5zYm4uaXQ6MjE6RkkwMDk4Ok1hZ2xpYWJlY2hpOlZJQUUwMDczNzM_"
  ],
  "start_date": 1753,
  "end_date": 1753
}

The script https://github.com/atomotic/iccu/cmd/sbn-metadata-transform extracts the data from the SQLite database and generates a file in JSON Lines format (~500 MB). This export is then ready to be loaded into other tools better suited to data analysis, such as Solr or DuckDB.

I preferred DuckDB, and this is how I loaded the data:
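JSON Lines simply means one JSON object per line. A minimal Python sketch of writing such a file follows; it is not the Go script linked above, the function name is made up, and the second record is hypothetical.

```python
import io
import json


def write_jsonl(records, out):
    """Write one compact JSON object per line (the JSON Lines convention)."""
    count = 0
    for record in records:
        out.write(json.dumps(record, ensure_ascii=False) + "\n")
        count += 1
    return count


records = [
    {"bid": "IT\\ICCU\\VIAE\\007373", "type": "Testo", "start_date": 1753},
    {"bid": "hypothetical-second-record", "type": "Risorsa grafica", "start_date": None},
]
buffer = io.StringIO()  # stands in for open("sbn.jsonl", "w", encoding="utf-8")
written = write_jsonl(records, buffer)
lines = buffer.getvalue().splitlines()
print(written, json.loads(lines[0])["bid"])
```

Because every line is an independent document, tools can stream or load the file without parsing it as a single large array.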

~ duckdb sbn.duckdb "create table digital as select * from read_json_auto('sbn.jsonl');"

~ duckdb sbn.duckdb

D .schema
CREATE TABLE digital(bid VARCHAR, id VARCHAR, idmanus VARCHAR, title VARCHAR, iiif VARCHAR[], link VARCHAR[], "type" VARCHAR, material VARCHAR[], thumbnails VARCHAR[], start_date BIGINT, end_date BIGINT);

I exported the DuckDB database to Parquet format; it can be downloaded from https://atomotic.github.io/data/sbn.digital.parquet (93 MB).

The Parquet file can be used directly in the DuckDB shell in the browser, without installing anything. You only need to create a table (example):

CREATE TABLE digital AS FROM 'https://atomotic.github.io/data/sbn.digital.parquet';

Some demonstration queries:

Number of documents grouped by type

D SELECT
    type,
    COUNT(*) AS count
  FROM digital
  GROUP BY type order by count desc;
┌───────────────────────────────────┬────────┐
│               type                │ count  │
│              varchar              │ int64  │
├───────────────────────────────────┼────────┤
│ Testo                             │ 506962 │
│ Registrazione sonora musicale     │ 310053 │
│ Risorsa grafica                   │  53829 │
│ Musica manoscritta                │  20721 │
│ Testo manoscritto                 │  19221 │
│ Musica a stampa                   │  11565 │
│ Registrazione sonora non musicale │   7180 │
│ Risorsa cartografica a stampa     │   4483 │
│ Risorsa elettronica               │   1965 │
│ Risorsa cartografica manoscritta  │    406 │
│ Risorsa da proiettare o video     │     72 │
│ Oggetto tridimensionale           │     29 │
│ Risorsa multimediale              │     14 │
├───────────────────────────────────┴────────┤
│ 13 rows                          2 columns │
└────────────────────────────────────────────┘

Number of IIIF manifests

D SELECT COUNT(*) as manifest
    FROM (
        SELECT DISTINCT unnest(iiif)
        FROM digital
    );
┌──────────┐
│ manifest │
│  int64   │
├──────────┤
│   341324 │
└──────────┘

Number of external links

D SELECT COUNT(*) as link
    FROM (
        SELECT DISTINCT unnest(link)
        FROM digital
    );
┌─────────┐
│  link   │
│  int64  │
├─────────┤
│ 1045225 │
└─────────┘

Origin of the external links

For the external links I wanted to extract the server host and then group by it, so as to identify where the links come from. I used trurl to parse the URLs; it also reported several parsing errors, which I ignored as marginal:

~ duckdb --list sbn.duckdb "SELECT DISTINCT TRIM(unnest(link)) AS unique_links FROM digital;" \
    | trurl -f - --get "{host}" --accept-space > urls.txt

The urls.txt file contains the unsorted list of hosts. sort, uniq and wc would be enough to do the counting, but there is topfew (by the well-known Tim Bray!), which is much more efficient. Google Books, the Istituto Centrale dei Beni Sonori, and the BNCF Teca are the predominant sources.

~ topfew -n 30 urls.txt

363190 books.google.com
312041 opac2.icbsa.it
134072 teca.bncf.firenze.sbn.it
58043 www.internetculturale.it
46714 books.google.it
12614 www.braidense.it
8558 www.bibliotecamusica.it
6290 www.widejef.com
6091 www.bdl.servizirl.it
5020 archive.org
4284 www.14-18.it
4276 corago.unibo.it
3772 www.google.it
3574 sbn.comune.eboli.sa.it
3562 www.cmarchiviodigitale.com
3177 digiteca.bsmc.it
3103 www.polodigitalenapoli.it
2602 www.aggiornamentisociali.it
2330 hdl.handle.net
2304 www.proquest.com
2280 atena.beic.it
1879 www.fondazionecircoloartistico.it
1698 badigit.comune.bologna.it
1546 doi.org
1431 digital.fondazionecarisbo.it
1431 5.175.50.107
1311 www.omeka.unito.it
1274 www.byterfly.eu
1196 www.repubblicaromana-1849.it
1164 turismo.comune.sanginesio.mc.it
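The same host count can be sketched with the Python standard library. This is an illustrative stand-in for the trurl and topfew pipeline above, not a replacement for it; the function name is made up, and the sample links are the ones from the simplified record shown earlier plus one deliberately malformed entry.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_hosts(links, n):
    """Return the n most frequent hostnames among the given URLs."""
    counts = Counter()
    for link in links:
        try:
            host = urlsplit(link.strip()).hostname
        except ValueError:  # malformed URL, e.g. an invalid bracketed host
            host = None
        if host:
            counts[host] += 1
    return counts.most_common(n)


links = [
    "http://books.google.com/books?vid=IBSC:SC000005684",
    "http://books.google.com/books?vid=IBSC:SC000008356",
    " http://teca.bncf.firenze.sbn.it/ImageViewer/servlet/ImageViewer?idr=BNCF0003334533 ",
    "not a url",
]
for host, count in top_hosts(links, 2):
    print(count, host)
```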

The hosts include some oddities: many IP addresses, and also several files linked from Google Drive (linking catalogue objects from a file storage service seems like a terrible idea to me).

~ grep drive.google urls.txt | wc -l
467

Worse still, there are also several links to Facebook. At the same time, I am surprised that there are no links to Wikisource or Wikimedia Commons (but I intend to investigate further).

Problems encountered

The problems I encountered are not technical problems with the APIs; they concern the modelling of the metadata:

The structure is not uniform. There is a unimarc object that is a JSON representation of the UNIMARC XML (not the most convenient thing to parse, but fine), while a number of accessory fields sit outside that object (the IIIF manifests, for example), and other data duplicate information already contained in the UNIMARC. I suspect they are there to make access easier. I think it is in any case normal for a long-lived database like SBN to have to add accessory fields as the need arises.
Some values are incomplete: for example, the IIIF manifests give only the path, and the host is always missing. With some heuristics I managed to work it out, but values should always be complete. In other cases I noticed that some fields contain multiple values separated by a delimiter character: this is the case for external links, which are sometimes separated by " | ".
Location of the digital object. As I understand it, there are two kinds: IIIF manifests, which are also displayed with a viewer directly in the web catalogue, and links to external pages (a record can have both manifests and links). The manifests are reported in fields at the top level of the object: there are dig_cover, dig_manifest, dig_preview and dig_preview_URL, and the redundancy is not always clear to me. The external links, by contrast, are reported inside the unimarc object, in 899.u or elsewhere.

Some vocabularies use single letters (for example in the type and material fields). These vocabularies are poorly documented; in such cases it would be better to use a (resolvable!) URI that leads to a documentation page. Example:

One-character code for the document type: a=Testo b=Testo manoscritto c=Musica a stampa d=Musica manoscritta e=Risorsa cartografica a stampa f=Risorsa cartografica manoscritta g=Risorsa da proiettare o video i=Registrazione sonora non musicale j=Registrazione sonora musicale k=Risorsa grafica l=Risorsa elettronica r=Oggetto tridimensionale m=Risorsa multimediale
---
One-character code for the material type: v=Audiovisivi c=Cartografia g=Grafica A=Libro antico N=Libro moderno M=Musica

A schema is missing: this is the biggest problem. I had to proceed by trial and heuristics to parse the responses, and I am sure I have not identified every possible case or source of error. Metadata absolutely need schemas, against which validation and constraints can be applied. Several technologies are available, of varying complexity: JSON Schema, Avro, Protobuf. I think a good JSON Schema would be enough to start with. There are also some newer things like Pkl or CUE, so far never used for metadata serialisation, which I find interesting and which the digital library world could start evaluating.
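Several of the points above (packed link values, single-letter vocabularies, no schema) can be illustrated with a short Python sketch. It is a hand-rolled stand-in for a real JSON Schema, checks only the simplified record shown earlier, and every function and constant name is made up for the example; the code table is copied from the document-type vocabulary above.

```python
TYPE_CODES = {
    "a": "Testo", "b": "Testo manoscritto", "c": "Musica a stampa",
    "d": "Musica manoscritta", "e": "Risorsa cartografica a stampa",
    "f": "Risorsa cartografica manoscritta", "g": "Risorsa da proiettare o video",
    "i": "Registrazione sonora non musicale", "j": "Registrazione sonora musicale",
    "k": "Risorsa grafica", "l": "Risorsa elettronica",
    "r": "Oggetto tridimensionale", "m": "Risorsa multimediale",
}

# Expected Python type of each field in the simplified record (an assumption).
RECORD_FIELDS = {"bid": str, "id": str, "title": str, "iiif": list, "link": list, "type": str}


def split_links(raw):
    """Split a link field that may pack several URLs separated by ' | '."""
    return [part.strip() for part in raw.split(" | ") if part.strip()]


def validate(record):
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for name, expected in RECORD_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            problems.append(f"{name}: expected {expected.__name__}")
    kind = record.get("type")
    if isinstance(kind, str) and kind not in TYPE_CODES.values():
        problems.append(f"type: unknown value {kind!r}")
    return problems


record = {
    "bid": "IT\\ICCU\\VIAE\\007373",
    "id": "http://id.sbn.it/bid/VIAE007373",
    "title": "Hypothetical title",
    "iiif": [],
    "link": split_links(
        "http://books.google.com/books?vid=IBSC:SC000005684 | "
        "http://books.google.com/books?vid=IBSC:SC000008356"
    ),
    "type": TYPE_CODES["a"],
}
print(record["link"])
print(validate(record))
print(validate({"bid": 1, "type": "x"}))
```

A published schema would make checks like these declarative and shareable instead of being rediscovered by each consumer of the API.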

Conclusions

Data-modelling problems aside, the technological infrastructure of this API product seems to me to work very well. I would like to know whether usage statistics exist, or real examples of integration in external catalogues or portals. I also think that the Wikidata world, where several integrations with the SBN catalogue already exist, could benefit from these APIs, making several existing processes faster and more automatic.

Digital Library Federation: DLF Digest: November 2024

2024-10-31T13:30:26+00:00

A monthly round-up of news, upcoming working group meetings and events, and CLIR program updates from the Digital Library Federation. See all past Digests here.

Hello! We hope those who attended the DLF Virtual Forum enjoyed the panels. November brings Election Day, Thanksgiving, and a round of working group meetings.

— Team DLF

This month’s news:

Now available from the Digital Library Pedagogy Group: #DLFteach Toolkit Volume 4: Critical Digital Literacies, an open-access resource designed to support both information professionals and educators.
Now available from the Cultural Assessment Working Group: The Inclusive Metadata Toolkit serves as a centralized guide to the range of inclusive metadata tools and resources currently available to equip practitioners to implement inclusive metadata practices in their day-to-day work.
Register: 2024 IIIF Online Meeting, November 12-14.
Climate Action Webinar #3: Combatting Climate Anxiety Through Data: December 5, 2024, 3:30 pm – 5:00 pm ET. Learn how curating scientific data orients GLAMR institutions in the public conversation and can help combat climate anxiety through action.
Closures: CLIR and DLF offices will be closed for Thanksgiving 11/25 – 11/29.

This month’s open DLF group meetings:

For the most up-to-date schedule of DLF group meetings and events (plus NDSA meetings, conferences, and more), bookmark the DLF Community Calendar. Meeting dates are subject to change. Can’t find meeting call-in information? Email us at info@diglib.org. Reminder: Team DLF working days are Monday through Thursday.

DLF Born-Digital Access Working Group (BDAWG) Monthly Meeting: Tuesday, 11/05, 2 pm ET/11 am PT
DLF Digital Accessibility Working Group: Wednesday, 11/06, 2 pm ET/11 am PT
DLF AIG Metadata Working Group: Thursday, 11/07, 1:15 pm ET/10:15 am PT
DLF AIG Cultural Assessment Working Group: Monday, 11/11, 2 pm ET/11 am PT
DLF AIG Cost Assessment Working Group: Monday, 11/11, 3 pm ET/12:00 pm PT
DLF AIG User Experience Working Group: Friday, 11/15, 11 am ET/8 am PT
DLF Committee for Equity & Inclusion: Monday, 11/18, 3 pm ET/12:00 pm PT
DLF Digital Accessibility Working Group – IT Subgroup (DAWG-IT): Monday, 11/25, 1:15 pm ET/10:15 am PT
DLF Climate Justice Working Group: Wednesday, 11/27, 12:00 pm ET/9 am PT
DLF Digital Accessibility Policy & Workflows Subgroup: Friday, 11/29, 1:00 pm ET/10 am PT

DLF groups are open to ALL, regardless of whether or not you’re affiliated with a DLF member organization. Learn more about our working groups on our website. Interested in scheduling an upcoming working group call or reviving a past group? Check out the DLF Organizer’s Toolkit. As always, feel free to get in touch at info@diglib.org.

Get Involved / Connect with Us

Below are some ways to stay connected with us and the digital library community:

Subscribe to the DLF Forum newsletter.
Join, start, or revive a working group and browse their work on the DLF Wiki.
Subscribe to our community listserv, DLF-Announce.
Bookmark our Community Calendar.
Learn more about becoming a DLF member organization.
Follow us on Instagram, Facebook, LinkedIn, and YouTube.

Contact us at info@diglib.org.

The post DLF Digest: November 2024 appeared first on DLF.

David Rosenthal: 1.5C Here We Come

2024-10-30T20:57:19+00:00

John Timmer's With four more years like 2023, carbon emissions will blow past 1.5° limit is based on the United Nations Environment Programme's Emissions Gap Report 2024. The "emissions gap" is:

the difference between where we're heading and where we'd need to be to achieve the goals set out in the Paris Agreement. It makes for some pretty grim reading. Given last year's greenhouse gas emissions, we can afford fewer than four similar years before we would exceed the total emissions compatible with limiting the planet's warming to 1.5° C above pre-industrial conditions.
...
The report ascribes this situation to two distinct emissions gaps: between the goals of the Paris Agreement and what countries have pledged to do and between their pledges and the policies they've actually put in place.

Back in 2021 in my TTI/Vanguard talk I examined one of these gaps, the one between the crypto-bros' energy consumption:

The leading source for estimating Bitcoin's electricity consumption is the Cambridge Bitcoin Energy Consumption Index, whose current central estimate is 117TWh/year.

Adjusting Christian Stoll et al's 2018 estimate of Bitcoin's carbon footprint to the current CBECI estimate gives a range of about 50.4 to 125.7 MtCO2/yr for Bitcoin's opex emissions, or between Portugal and Myanmar.

and their rhetoric:

Cryptocurrencies assume that society is committed to this waste of energy and hardware forever. Their response is frantic greenwashing, such as claiming that because Bitcoin mining allows an obsolete, uncompetitive coal-burning plant near St. Louis to continue burning coal it is somehow good for the environment.

But, they argue, mining can use renewable energy. First, at present it doesn't. For example, Luxxfolio implemented their commitment to 100% renewable energy by buying 15 megawatts of coal-fired power from the Navajo Nation!.

Second, even if it were true that cryptocurrencies ran on renewable power, the idea that it is OK for speculation to waste vast amounts of renewable power assumes that doing so doesn't compete with more socially valuable uses for renewables, or indeed for power in general.

Note that the current CBECI estimate shows that Bitcoin's energy consumption has increased 43% since 2021, a 12.7%/yr increase.
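The annualized figure can be checked with a short calculation. This is a minimal sketch, assuming the 43% increase is compounded over the three years from 2021 to 2024; the interval and the function name `annualized_growth` are assumptions made for illustration, not something the CBECI publishes.

```python
def annualized_growth(total_increase: float, years: float) -> float:
    """Compound annual growth rate implied by a total fractional increase.

    total_increase: overall growth as a fraction (0.43 means +43%).
    years: number of years over which the growth occurred (assumed here).
    """
    return (1.0 + total_increase) ** (1.0 / years) - 1.0


# Assumption: 2021 to 2024 is treated as exactly three years.
rate = annualized_growth(0.43, 3)
print(f"{rate:.1%} per year")
```

Under that assumption the result rounds to the 12.7%/yr quoted above; a different interval would give a different annual rate.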

Follow me below the fold for more details of the frantic greenwashing, not just from the crypto-bros but from the giants of the tech industry that aims to ensure that:

Following existing policies out to the turn of the century would leave us facing over 3° C of warming.

Luxxfolio wasn't an exception. The latest example of Bitcoin greenwashing comes from Hunterbrook Media:

TeraWulf Inc. (NASDAQ: $WULF) brands itself as a “zero-carbon Bitcoin miner” — and claims its commitment to renewable energy will help it land AI data center contracts. But the New York Power Authority, which supplies 45% of the facility’s energy, told Hunterbrook Media: “None of the power that NYPA provides the firm can be claimed as renewable power.”

The rest of TeraWulf’s power is sourced from the New York grid, which is less than half zero-carbon, according to the New York Independent System Operator, the organization responsible for managing the state’s wholesale electric marketplace.

The only way TeraWulf can legally substantiate its zero-carbon claims is by purchasing renewable energy credits (RECs), according to New York and federal regulators, but a TeraWulf spokesperson confirmed that the company has not done so. “Without the REC, there is no legal claim to the renewable attributes of electricity,” a spokesperson for the New York State Energy Research and Development Authority confirmed in an email to Hunterbrook.

These lies were just the start; Hunterbrook documents lies about most aspects of TeraWulf's business. Note TeraWulf's pivot to AI. In Bitcoin Miners Take Divergent Paths Six Months After Revenue ‘Halving’, David Pan explains that TeraWulf is part of a trend:

Six months after rewards for validating transactions on the Bitcoin network were reduced by half, crypto mining companies are choosing between two divergent paths to remain viable.

Public miners including MARA Holdings, Riot Platforms and CleanSpark are keeping the Bitcoin they produce with the expectation that the digital asset will rise in value. At the same time, an increasing number of companies are spending more on developing data centers that power artificial intelligence applications.

It isn't just the crypto-bros who are apparently lying about using renewables. Back in July Adele Peters revealed that Amazon says it hit a goal of 100% clean power. Employees say it’s more like 22%:

Today, Amazon announced that it hit its 100% renewable electricity goal seven years early. But a group of Amazon employees argues that the company’s math is misleading.

A report from the group, Amazon Employees for Climate Justice, argues that only 22% of the company’s data centers in the U.S. actually run on clean power. The employees looked at where each data center was located and the mix of power on the regional grids—how much was coming from coal, gas, or oil versus solar or wind.

Amazon, like many other companies, buys renewable energy credits (RECs) for a certain amount of clean power that’s produced by a solar plant or wind farm. In theory, RECs are supposed to push new renewable energy to get built. In reality, that doesn’t always happen. The employee research found that 68% of Amazon’s RECs are unbundled, meaning that they didn’t fund new renewable infrastructure, but gave credit for renewables that already existed or were already going to be built.

And in August Amy Castor and David Gerard posted How to fix AI’s ghastly power consumption? Fake the numbers!:

Big tech uses a stupendous amount of power, so it generates a stupendous amount of CO2. The numbers are not looking so great, especially with the ever-increasing power use of AI.

So the large techs want to fiddle how the numbers are calculated!

Companies already have a vast gap between “market-calculated” CO2 and actual real-world CO2 production. The scam works a lot like carbon credits. Companies cancel out power used on the coal/gas-heavy grid in northern Virginia by buying renewable energy credits for solar energy in Nevada.

So in 2023, Facebook listed just 273 tonnes of “net” CO2 and claimed it had hit “net zero” — but it actually generated 3.9 million tonnes.

In practice, RECs don’t drive new clean energy or any drop in emissions — they only exist for greenwashing.

It gets worse. Large techs are already the largest buyers of RECs. So they’re lobbying the Greenhouse Gas Protocol organization to let them report even more ludicrously unrealistic numbers.

RECs currently have to be on the same continent at the same time of day. Amazon and Facebook propose a completely free system with no geographical constraints. They could offset coal power in Virginia with wind power from Norway or India.

This will make RECs work even more like the carbon credit market — where companies can claim hypothetical “avoided” CO2 against actual, real-world CO2.

In Data center emissions probably 662% higher than big tech claims. Can it keep up the ruse? Isabel O'Brien reinforced the message:

Amazon is the largest emitter of the big five tech companies by a mile – the emissions of the second-largest emitter, Apple, were less than half of Amazon’s in 2022. However, Amazon has been kept out of the calculation above because its differing business model makes it difficult to isolate data center-specific emissions figures for the company.

As energy demands for these data centers grow, many are worried that carbon emissions will, too. The International Energy Agency stated that data centers already accounted for 1% to 1.5% of global electricity consumption in 2022 – and that was before the AI boom began with ChatGPT’s launch at the end of that year.

AI is far more energy-intensive on data centers than typical cloud-based applications. According to Goldman Sachs, a ChatGPT query needs nearly 10 times as much electricity to process as a Google search, and data center power demand will grow 160% by 2030. Goldman competitor Morgan Stanley’s research has made similar findings, projecting data center emissions globally to accumulate to 2.5bn metric tons of CO2 equivalent by 2030.

In the meantime, all five tech companies have claimed carbon neutrality, though Google dropped the label last year as it stepped up its carbon accounting standards. Amazon is the most recent company to do so, claiming in July that it met its goal seven years early, and that it had implemented a gross emissions cut of 3%.

Because the tech giants are funnelling vast amounts of cash to Nvidia for hardware to train AIs to, for example, tell people to eat at Angus Steakhouse, or put glue on pizza, convince them that black people's IQ is inferior to whites', hallucinate patients' responses to doctors, persuade teens to commit suicide, and so on, they will need lots of power. The smart miners have figured out that their access to lots of power is worth more to the AI bubble than the Bitcoin it could mine, especially since the halvening. The market has figured this out too:

while the shares of the majority of the companies have underperformed Bitcoin’s more than 60% rally this year with future mining revenue constrained, traders appear to be voting which strategy will succeed, with those embracing AI posing the largest gains.

MARA and Riot, two of the largest publicly traded Bitcoin miners and both “hodlers,” have seen their shares slump 20% and 36%, respectively, this year.

On the other hand:

Northern Data AG is examining a possible sale of its crypto mining business to free up funds for expanding its artificial-intelligence operations.

The Frankfurt-listed company, whose main shareholder is stablecoin issuer Tether Holdings Ltd., would use proceeds from the sale of Peak Mining to focus on its AI solutions unit, it said in a statement Monday. Shares of Northern Data jumped as much as 12% on the news, and were up 9.8% as of 12:06 p.m. in Frankfurt.

The big tech companies are desperate for power:

They are continuing to burn coal at plants that were due to shut down in Montana, Omaha (Google & Facebook), Utah, Georgia and Wisconsin:

“This is very quickly becoming an issue of, don’t get left behind locking down the power you need, and you can figure out the climate issues later,” said Aaron Zubaty, CEO of California-based Eolian, a major developer of clean energy projects. “Ability to find power right now will determine the winners and losers in the AI arms race. It has left us with a map bleeding with places where the retirement of fossil plants are being delayed.”

Morgan Stanley estimates that:

The datacenter industry is set to emit 2.5 billion tonnes of greenhouse gas (GHG) emissions worldwide between now and the end of the decade, three times more than if generative AI had not been developed.

S&P Global Commodity Insights:

noted that only 54 gigawatts of the US coal industry is projected to be powered off by 2030 – down 40 percent from a prediction made in July last year. The total number of coal plants retired by 2050 is still expected to be roughly the same, but the pace of retirement from now to the end of the decade will be significantly slower compared to last year's estimates.
...
Coal plants can credit their new lease on life to the datacenter industry, which is expanding and upgrading existing bit barns as well as building new facilities. The age of AI requires lots of energy – Google search powered by AI alone is expected to use ten times the power of a more traditional information request, according to the International Energy Agency's (IEA) January report.

Microsoft signed a 20-year contract to restart Three Mile Island:

Constellation Energy shut down the Unit 1 reactor in 2019 — not the one that melted down in 1979, the other one — because it wasn’t economical. Inflation Reduction Act tax breaks made it viable again, so Constellation went looking for a customer. Microsoft has signed up for 835 megawatts for the next 20 years.
...
Other mothballed nuclear reactors want to restart for data centers, including Palisades in Michigan and Duane Arnold in Iowa. These both shut down because renewables and natural gas were cheaper — but the data centers need feeding.

TMI Unit 1 should be back online in 2028, going into the strained local grid — so when the AI bubble pops, the clean-ish power will still be there.

Google and Amazon have signed deals for Small Modular Reactors (SMRs), and so has Oracle, but:

Google has signed a deal with California startup Kairos Power for six or seven small modular reactors. The first is due in 2030 and the rest by 2035, for a total of 500 megawatts.

Amazon has also done three deals to fund SMR development.
...
Only three experimental SMRs exist in the entire world — in Russia, China, and Japan. The Russian and Chinese reactors claim to be in “commercial operation” — though with their intermittent and occasional hours and disconcertingly low load factors, they certainly look experimental.

Like general AI, SMRs are a technology that exists in the fabulous future. SMR advocates will talk all day about the potential of SMRs and gloss over the issues — particularly that SMRs are not yet economically viable.

Kairos doesn’t have an SMR. They have permission to start a non-powered tech demo site in 2027. Will they have an approved and economically viable design by 2030?

Of course, the nuclear options won't add CO2 to the atmosphere, but they won't come on line until after we've breached 1.5C. The result is the rapidly increasing "emissions gap" of the large tech companies. But the problem is even worse than it appears. In my EE380 talk I discussed the carbon emissions from Bitcoin's hardware:

Bitcoin's growing e-waste problem by Alex de Vries and Christian Stoll concludes that:

Bitcoin's annual e-waste generation adds up to 30.7 metric kilotons as of May 2021. This level is comparable to the small IT equipment waste produced by a country such as the Netherlands.

That's an average of one whole MacBook Air of e-waste per "economically meaningful" transaction.

Why does Bitcoin generate so much e-waste?:

The reason for this extraordinary waste is that the profitability of mining depends on the energy consumed per hash, and the rapid development of mining ASICs means that they rapidly become uncompetitive. de Vries and Stoll estimate that the average service life is less than 16 months. This mountain of e-waste contains embedded carbon emissions from its manufacture, transport and disposal. These graphs show that for Facebook and Google data centers, capex emissions are at least as great as the opex emissions.

Lindsay Clark's GenAI's dirty secret: It's set to create a mountainous increase in e-waste points out that AI has the same problem:

Computational boffins' research claims GenAI is set to create nearly 1,000 times more e-waste than exists currently by 2030, unless the tech industry employs mitigating strategies.

The study, which looks at the rate AI servers are being introduced to datacenters, claims that a realistic scenario indicates potential for rapid growth of e-waste from 2.6 kilotons each year in 2023 to between 400 kilotons and 2.5 million tons each year in 2030, when no waste reduction measures are considered.
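The quoted figures can be cross-checked with a few lines of arithmetic. This is a minimal sketch, assuming that "2.5 million tons" corresponds to 2,500 kilotons and that the growth runs over the seven years from 2023 to 2030; the helper names are illustrative and do not come from the study.

```python
def growth_multiple(start_kt: float, end_kt: float) -> float:
    """How many times larger the end value is than the start value."""
    return end_kt / start_kt


def implied_annual_growth(start_kt: float, end_kt: float, years: int) -> float:
    """Constant yearly growth rate that takes start_kt to end_kt in `years` years."""
    return (end_kt / start_kt) ** (1.0 / years) - 1.0


START_2023_KT = 2.6        # kilotons per year in 2023, as quoted
LOW_2030_KT = 400.0        # lower end of the quoted 2030 range
HIGH_2030_KT = 2500.0      # assumption: 2.5 million tons expressed in kilotons
YEARS = 2030 - 2023

for label, end in (("low", LOW_2030_KT), ("high", HIGH_2030_KT)):
    multiple = growth_multiple(START_2023_KT, end)
    yearly = implied_annual_growth(START_2023_KT, end, YEARS)
    print(f"{label}: {multiple:.0f}x overall, {yearly:.0%} per year")
```

The high end of the range works out to a little under a thousandfold increase, which is consistent with the "nearly 1,000 times" headline claim.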

Assuming that the tech giants eventually succeed in generating profits from their massive investments in AI data centers, it is likely that the economic life of Nvidia's hardware is longer than that of Bitmain's mining rigs. But the investment is much bigger, so it is likely that the capex emissions from AI data centers add greatly to the overall climate impact of AI. Even if they never make profits, the capex emissions from the current build-out will still be in the atmosphere.

Interestingly, the mainstream media has started to pay attention. Back in June the Washington Post's Evan Halper and Caroline O'Donovan's AI is exhausting the power grid. Tech firms are seeking a miracle solution reported on the latest shiny object:

So near the river’s banks in central Washington, Microsoft is betting on an effort to generate power from atomic fusion — the collision of atoms that powers the sun — a breakthrough that has eluded scientists for the past century. Physicists predict it will elude Microsoft, too.

The tech giant and its partners say they expect to harness fusion by 2028, an audacious claim that bolsters their promises to transition to green energy but distracts from current reality.

Even if they could "harness fusion by 2028", it would be too late to avoid 1.5C. But no-one has yet built a fusion reactor with a positive power output, so the 2028 claim is obvious BS. Pay attention to their actions not words:

In fact, the voracious electricity consumption of artificial intelligence is driving an expansion of fossil fuel use — including delaying the retirement of some coal-fired plants.
...
The data-center-driven resurgence in fossil fuel power contrasts starkly with the sustainability commitments of tech giants Microsoft, Google, Amazon and Meta, all of which say they will erase their emissions entirely as soon as 2030. The companies are the most prominent players in a constellation of more than 2,700 data centers nationwide, many of them run by more obscure firms that rent out computing power to the tech giants.

“They are starting to think like cement and chemical plants. The ones who have approached us are agnostic as to where the power is coming from,” said Ganesh Sakshi, chief financial officer of Mountain V Oil & Gas, which provides natural gas to industrial customers in Eastern states.

And this month the New York Times' David Gelles' The A.I. Power Grab reported that Nvidia was also pushing the "AI will solve the climate" fantasy:

Nvidia’s chips are incredibly power-hungry. As the company rolls out new products, analysts have taken to measuring the amount of electricity needed to power them in terms of cities, or even countries.

There are already more than 5,000 data centers in the U.S., and the industry is expected to grow nearly 10 percent annually. Goldman Sachs estimates that A.I. will drive a 160 percent increase in data center power demand by 2030.

Dion Harris, Nvidia’s head of data center product marketing, acknowledged that A.I. was creating a huge spike in power usage. But he said that over time, that demand would be offset as A.I. made other industries more efficient.

“There is sort of a myopic view on the data center,” he said, “but not really an understanding that a lot of those technologies are going to be the main way that we’re going to innovate our way to a net-zero future.”

Apart from continuing to burn fossil fuels as fast as they can and signing deals that won't make a difference until after the world has committed to 1.5C, what are the tech giants doing? Just like the crypto-bros, they are greenwashing, and spinning ludicrous futures to prevent current action. Here, for example, is Eric Schmidt:

Eric Schmidt, the former chief executive of Google, recently said that the artificial intelligence boom was too powerful, and had too much potential, to let concerns about climate change get in the way.

Schmidt, somewhat fatalistically, said that “we’re not going to hit the climate goals anyway,” and argued that rather than focus on reducing emissions, “I’d rather bet on A.I. solving the problem.”

Schmidt at Sun

Full disclosure: I reported to Schmidt at Sun Microsystems, and my opinion of him is less negative than that of most of my then peer engineers. But I would not expect him to sacrifice immediate profits for the health of the planet. He is right that “we’re not going to hit the climate goals anyway”, but that is partly his fault. Even assuming that he's right and AI is capable of magically "solving the problem", the magic solution won't be in place until long after 2027, which is when, at the current rate, we will pass 1.5C. And everything that the tech giants are doing right now is moving the 1.5C date closer.

HangingTogether: Celebrating Halloween with Gothic fiction in WorldCat

2024-10-30T20:52:00+00:00

We love Halloween at OCLC. Some of us decorate our cubicles. Some of us dress in costume. All of us rejoice in the amazing resources represented in WorldCat that are often read at this time of year. In this post I share with you, my fellow bibliophiles and Gothic fiction fans, a few of my favorite resources available in WorldCat—hopefully at a library near you!

The OCLC cubicle of Kate James; photo courtesy of the author

Tales of the Grotesque and Arabesque

This two-volume collection of short stories contains “The Fall of the House of Usher.” Told by an unnamed narrator, this story describes a seemingly haunted house that splits in half after all the members of the Usher family die. The story is an exemplar of Gothic fiction and has been adapted multiple times for film and television. The 2023 limited series The Fall of the House of Usher, created by Mike Flanagan, is actually a loose adaptation of multiple Poe stories including “The Fall of the House of Usher,” “The Tell-Tale Heart,” and “The Black Cat.” Tales of the Grotesque and Arabesque also includes several lesser-known Poe stories such as “The Duc de L’Omelette.” Poe is best known for writing horror, but “The Duc de L’Omelette” is humorous. After dying from eating an ortolan, the Duc goes to hell and plays cards with Baal-Zebub, Prince of the Fly.

750 copies of this 1840 publication of Tales of the Grotesque and Arabesque were printed. During the print run, the type for pages 213 and 219 of volume 2 loosened, causing variations such as some copies having page 213 numbered as 231. Member libraries holding copies of this book include the Newberry Library, National Library of Scotland, and University of Sydney. Harvard University holds a copy inscribed by Poe on the front endleaf: “For Miss Anna and Miss Bessie Pedder, from their most sincere friend, The Author.”

Frankenstein

The title page of volume 1 of the 1818 publication of Frankenstein, courtesy of the Library of Congress, Rare Book and Special Collections Division

It's well known that this novel is the result of a competition among Mary Shelley, Percy Bysshe Shelley, John Polidori, and Lord Byron. However, it is less known among today’s readers that the first edition, published in 1818, lacked any statement of authorship. The preface was written by Mary’s husband, Percy, and the novel was dedicated to her father, the writer and philosopher William Godwin. Some critics speculated that Percy Bysshe Shelley was the author, and others speculated that the author was a woman. While anonymous novels were not rare in this time period, the British Critic’s harsh review of Frankenstein reveals a contempt for female authorship that Shelley would have anticipated: “The writer of it is, we understand, a female; this is an aggravation of that which is the prevailing fault of the novel; but if our authoress can forget the gentleness of her sex, it is no reason why we should; and we shall therefore dismiss the novel without further comment.” (For more information on anonymous authorship of this time, see the University of Minnesota Press blog.)

Many subsequent editions and adaptations as motion pictures, plays, musicals and comic books dispute the British Critic’s review. That journal ceased publication in 1843, but in what is probably the most recent publication of the novel, Dover Publications published Frankenstein in August 2024. The novel appeals to horror fans with its reanimated monster, but it has broad appeal for any reader who has ever felt like they don’t belong. Member libraries holding copies of the 1818 edition include the British Library and Smith College. The Library of Congress owns a copy and has digitized it for anyone who wants to read it freely online.

Varney the Vampire, or, The Feast of Blood

This horror story, generally attributed to James Malcolm Rymer and Thomas Peckett Prest, was first published as a penny dreadful between 1845 and 1847. It was published as a book in 1847, but sadly I could not find any records in WorldCat for the 1847 print edition. (Catalogers, if you are reading this and your library has a copy of this edition, please contribute your record to WorldCat!) We do have a record for a reprint of the 1847 edition with new prefatory matter, which I have linked above. This is not a classic like Dracula or Frankenstein. In fact, it is more like the 19th-century version of the low-budget horror movie. The 1847 book ran to 232 chapters: 847 pages with two columns of text on each page. This is because the author was paid by the typeset line. The protagonist is the vampire Sir Francis Varney, and he is the first vampire described in fiction as having sharpened teeth. Perhaps Bram Stoker was inspired by Varney in his description of Dracula.

Whether you celebrate Halloween by Trick or Treating, watching a scary movie, reading a good novel, or attending a costume party, may you have a Happy Halloween!

The post Celebrating Halloween with Gothic fiction in WorldCat appeared first on Hanging Together.

Lucidworks: You Don’t Need Gen AI for Every Part of AI Search—Here’s Why

2024-10-30T17:23:38+00:00

Generative AI is grabbing headlines, but it’s not the only powerhouse in AI search. Discover why technologies like machine learning and NLP might be better suited for boosting your digital experience—efficiently and cost-effectively.

The post You Don’t Need Gen AI for Every Part of AI Search—Here’s Why appeared first on Lucidworks.

Digital Library Federation: Now Available: Inclusive Metadata Toolkit from the Cultural Assessment Working Group

2024-10-30T17:00:52+00:00

From the Cultural Assessment Working Group:

The DLF Cultural Assessment Working Group (CAWG) is excited to announce the publication of the Inclusive Metadata Toolkit! This toolkit serves as a centralized guide to the range of inclusive metadata tools and resources currently out there, in order to equip practitioners to implement inclusive metadata practices in their day-to-day work.

The toolkit consists of two components:

The Inclusive Metadata Toolkit guide document, which provides context for the listed tools and resources in order to make them easier to use and navigate
The complete Inclusive Metadata Toolkit Resource Directory, which serves as a sortable and filterable directory of inclusive metadata tools and resources to help you wherever your institution is at

We hope the Inclusive Metadata Toolkit Resource Directory can continue to change and grow, providing a living directory as more inclusive metadata tools and resources are created and published over time. Additional resources can be suggested through the Inclusive Metadata Toolkit Suggested Resource & Feedback Form. General feedback or questions are also welcome.

The post Now Available: Inclusive Metadata Toolkit from the Cultural Assessment Working Group appeared first on DLF.

Open Knowledge Foundation: Mapping Openness in Europe: A Regional Meeting with Open Knowledge Foundation

2024-10-30T10:18:01+00:00

On 10 October 2024 the regional call for Europe for the Open Knowledge Network was held online. The discussion was facilitated by Esther Plomp, the Regional Coordinator for Europe.

Objectives and context of the meeting

The Europe regional call aimed to understand whether it would be helpful to map the connections between existing OKFN network and chapter members, based on a pilot map created by Esther. Discussions led to the conclusion that, instead of mapping the network members, it would be more helpful to map the projects on which they are working.

Mapping

Esther kicked off the call by presenting the pilot map for the European region with a heavy focus on the OKFN and its individual members. She also shared alternative mappings that have been made available by others in the Open landscape:

Existing map on scientific disciplines and open science practices by Access 2 Perspectives
Research institute map by Cassandra Gould van Praag
Open Scholarly communications map by ASAPBio
Mapping the open movement by Open Future
South African DH & CSS Stakeholder Map Project by SaDiLaR

As well as overviews related to Open Knowledge such as:

Open Science Capacity Building Index by UNESCO
Organisations in data governance.pdf by the Datasphere Initiative
EU Common Data by Our Common Data Space
National stakeholders for open/federated infrastructure by the European Commission
Open Infrastructure Funding Dashboard by Invest in Open Infrastructure

After a quick review of the pilot map and the available resources, our discussion went in a different direction: it would be more helpful to find synergies between the activities in the different topic areas than to map individual members. This would put more focus on concrete activities geared towards international cooperation. A good way forward would be the working groups that were discussed at the OKFN Gathering in Katowice, such as the ‘Open Knowledge Festival 2025’, a ‘toolkit for regional advocacy’, and the ‘Mentorship programme’ for the network. This led to a discussion of the Project Repository as a good example of how the ongoing activities in the network are already mapped.

Project Repository

The Project Repository is an overview of open projects set up by Network Members, together with other projects promoting openness developed by organisations that are close allies of the Network. During the Europe regional call it became clear that it would be helpful to continue building on the Project Repository and make it easier for Network Members to find, understand and replicate projects. Ultimately, the Project Repository can support outreach and capacity building: when projects have similar goals, the teams can take action together.

Increase awareness

The Project Repository may increase awareness of projects and help people work together on similar goals. The way that the Project Repository is currently set up may not yet facilitate this collaboration, as it is currently underused by the Network members. If the Repository is not widely used amongst network members, it is highly unlikely that all their projects are currently listed. The overview may also become outdated if there are no clear mechanisms to update existing projects. To avoid information overload and make it easier to get started, it may be more helpful to list fewer projects that are still active and focus on their progress or replicability for others. There are many benefits to copying existing projects instead of reinventing the wheel. One benefit is that it is easier to get funding for projects if you have a proof of concept. The Project Repository can be especially helpful here.

More information about how successful projects can be replicated

Based on the current structure of the Project Repository, it is difficult to determine which projects would be easier for others to build on. For example, right now both Network projects and external projects are listed in the same colours.

It would be helpful if the Project Repository focused more on projects that could be replicated in other regions. For this, different information is needed than is currently available in the Project Repository. For example:

What are the requirements to successfully implement the project?
What are the success factors?
How can you get started with a small prototype version of the project?
How can the existing project support other teams that want to replicate the project?
User stories: how can this project be useful for certain audiences?
How does this project link with other projects in the Project Repository? Which Network members were involved?
What is the level of a project? Is it part of a larger (international) initiative?

This may require some restructuring to make it clearer which projects are active and which ones are easier to replicate. Additionally, it may be easier to find relevant projects when they can be filtered by the problems that they aim to address. The possibility of using the Project Repository as an incubator space was also raised.

Share success stories

One example highlighted by the Switzerland chapter was the Prototype Fund, where Switzerland learned from the German experience. Both teams will co-present their collaboration to inspire others in the next OKFN network call!

Next Steps

To improve awareness of the Project Repository, the next Open Knowledge Foundation Network call on the 26th of November will focus on this topic.

To move us into action, Esther will kick off the working group on the Mentoring Programme. In the future it will be important to track the progress of these different working groups: what is moving forward and where are we moving?

Ed Summers: Support

2024-10-30T07:00:00+00:00

I like Molly White’s idea that the web isn’t just a place for big corporations. It’s a place where we can try new things, and support others that are doing work that helps and inspires us.

The early web was marked by a lot of idealism, which has turned out to have been way off the mark given the degree to which we are exploited online. But not all the web is this way. We have more autonomy and agency than we think. We are able to experiment in ways that big tech can’t. We can wire things together in ways that they won’t, and talk about things that they won’t, and focus on our communities and co-ops and unions in ways that they can’t and won’t. And we can choose to support the people we see building the web this way.

It’s a simple idea, but it is worth watching the whole talk to fully understand the point she is making.

With that in mind, I thought it would be kind of fun to add a page to my website listing the people and projects I choose to financially support who work on the web. You can find it here.

I was wondering, is there a way to mark up the HTML to indicate that I support these projects?

The primary purpose is to communicate this list to other people, so there’s not really a strong use case for marking it up so it could be understood by software. But I do like how tools like StreetPass can discover people in the Fediverse as you browse the web.

I suppose there’s probably some way to cobble something together with schema.org or Microformats, but perhaps something like a well-known URI for discovery would be helpful?
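As a rough sketch of what the well-known URI idea might look like: a site could publish a small JSON document listing the projects it supports, and a browsing tool could fetch and read it. Everything here is an assumption made up for illustration, including the path `/.well-known/supports.json`, the field names `owner` and `supports`, and the example.org URLs. No such well-known URI is registered, and this is not an existing standard.

```python
import json
from dataclasses import dataclass, asdict
from typing import List

# Hypothetical path: not a registered well-known URI, purely illustrative.
WELL_KNOWN_PATH = "/.well-known/supports.json"


@dataclass
class SupportedProject:
    """One person or project that the site owner supports (hypothetical shape)."""
    name: str
    url: str


def build_support_document(owner_url: str, projects: List[SupportedProject]) -> str:
    """Serialize the support list as JSON, ready to be served at WELL_KNOWN_PATH."""
    doc = {
        "owner": owner_url,
        "supports": [asdict(p) for p in projects],
    }
    return json.dumps(doc, indent=2, sort_keys=True)


def parse_support_document(text: str) -> List[str]:
    """Return the URLs of supported projects, as a discovery tool might read them."""
    doc = json.loads(text)
    return [entry["url"] for entry in doc.get("supports", [])]


if __name__ == "__main__":
    # Placeholder data only; these are not real supported projects.
    document = build_support_document(
        "https://example.org/",
        [
            SupportedProject("Example Newsletter", "https://newsletter.example.org/"),
            SupportedProject("Example Archive", "https://archive.example.org/"),
        ],
    )
    print(WELL_KNOWN_PATH)
    print(document)
    print(parse_support_document(document))
```

A plain JSON file keeps the publishing side trivial (it is just a static file), while a tool in the spirit of StreetPass would only need to request one predictable path per site. Whether that is better than schema.org or Microformats markup embedded in the page itself is an open question.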

Planet Code4Lib

Hugh Rundle: Aligning open education programs with academic reward and recognition

John Mark Ockerbloom: “The art of self-tormenting is an ancient one”

HangingTogether: Transforming the library into a research support hub

The Research Alliance

Origin story

Why the creation of a research hub matters

Benefits, challenges, and lessons learned

Benefits

Challenges

Lessons learned

Synergies

Situating the library as a hub for research support

Looking ahead

Terry Reese: MarcEdit 7.7 Update

Lucidworks: Gen AI Implementation Costs Skyrocket: Navigating the AI Landscape in Manufacturing

Open Knowledge Foundation: Panel: The Tech We Want is Sustainable for People and the Planet

[Panel 3] The Tech We Want is Sustainable for People and the Planet

Summary

Read More

John Mark Ockerbloom: I yam what I yam, kinda

Mita Williams: The City As Classroom vs. The City As Advertising Platform

HangingTogether: Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 12 November 2024

2024 International Conference of Indigenous Archives Library and Museums

IDEAs in Library Resources & Technical Services special issue

Neuroinclusive program provides future librarians with tools to succeed

John Mark Ockerbloom: The original “Middletown” study in the public domain

Eric Hellman: Running away from home

Eric Hellman: All the streets in Montclair

Eric Hellman: Running Song of the Day

Eric Hellman: We'll run 'til we drop

Eric Hellman: Thank you, New York City

John Mark Ockerbloom: A Farewell to Arms

Digital Library Federation: Announcing Incoming NDSA Coordinating Committee Members for 2025-2027

John Mark Ockerbloom: The debut of a long career in mystery and romantic suspense

John Mark Ockerbloom: Not eliminating the impossible

Ed Summers: Love is

John Mark Ockerbloom: Ain’t these tears in these eyes tellin’ you?

David Rosenthal: Nvidia vs. Intel

Ed Summers: Hope in the Dark

John Mark Ockerbloom: All singing! All dancing!

HangingTogether: New OCLC Research report on Open Access launched

From OA availability to discoverability: bridging the gap

Introducing the report to the Dutch library community

Next steps: collaborating smarter

John Mark Ockerbloom: A writer of pessimism and grace

LibraryThing (Thingology): Author Interview: Andrea Jo DeWerd

Jodi Schneider: Information Quality Lab at the 2024 iSchool Research Showcase

Open Knowledge Foundation: Panel: The Tech We Want is Built and Maintained with Care

[Panel 2] The Tech We Want is Built and Maintained with Care

Summary

Read More

Richard Wallis: BIBFRAME Dilemmas for Libraries: Challenges and Opportunities

#1: Should linked data only be limited to bibliographic resources?

#2: How to bridge the gap between the library world and the wider web?

#3: The costs and challenges of transitioning to BIBFRAME

Conclusion

John Mark Ockerbloom: A woman who made her mark on the map

John Mark Ockerbloom: “You know it too well already…”

Lucidworks: OpenAI’s ChatGPT Adds Web Search: A Q&A on What It Means for Enterprise Search

LibraryThing (Thingology): SantaThing 2024: Bookish Secret Santa!

What is SantaThing?

How it Works

Important Dates

Supporting Indie Bookstores

Shipping

Questions? Comments?

Open Knowledge Foundation: Open Knowledge Achieves US Charitable Organisations Equivalency Status

What is NGOsource Equivalency Determination (ED)?

What This Means for Our Work and Communities

John Mark Ockerbloom: “He himself is so much bigger than his books”

Ed Summers: Cut-ups and LLMs

References

Cynthia Ng: Alternatives for Blogs on WordPress.com

Lucidworks: How Zero Results Are Killing Ecommerce Conversions

John Mark Ockerbloom: A Room of One’s Own, for all

John Mark Ockerbloom: The remainder of the Roaring 20s about to join the public domain

LibraryThing (Thingology): November 2024 Early Reviewers Batch Is Live!

Raffaele Messuti: The digital objects of the SBN catalogue

Quickstart for using the APIs